Developing custom GRASS programs
As the majority of the RHESSys group is using OS X, this article targets that OS. However, most of the content should apply to any operating system.
Get a compiler
Development tools are not installed by default on OS X. You can install the dev tools from any OS X installation CD. For the most recent version, go to Apple's Mac Dev Center and download the latest version of Xcode. You will be required to create a free account before you can download the tools.
Get the GRASS source
In order to write your own custom GRASS programs, you will need the GRASS header files and libraries. Even if you already have GRASS on your computer, these may not be installed. Additionally, the actual GRASS source code is an excellent resource for learning to write new GRASS modules, and should be downloaded even if your installation already has the necessary headers and libraries.
First, check to see if your GRASS install has the necessary headers and libraries.
- Open a terminal window
- cd to the folder containing your GRASS executable
- $> ls <GRASS app name>/Contents/MacOS/include/grass
- If you see several files ending in .h, one of which is gis.h, then you have the GRASS headers.
- $> ls <GRASS app name>/Contents/MacOS/lib
- You should see a long list of files ending in .dylib. Make sure that one of them is libgrass_gis.dylib.
- If your GRASS install has both the .h and .dylib files, then you already have the necessary files and do not need to download the source.
Next, get the GRASS source. To download it, go to http://grass.itc.it/download/index.php. It is best to download the RC (release candidate) code, as the svn code may have issues compiling. If you did not find the necessary .h and .dylib files, you will need to compile GRASS. Unfortunately, GRASS has a long list of dependencies, and building it can be rather difficult.
An explanation of building software and all its dependencies is beyond the scope of this tutorial. However, learning to build the software yourself can be a useful skill. For example, GRASS can support NetCDF, a format we intend to use more in the future. However, most pre-built GRASS installations do not include this functionality. By building GRASS yourself, you can have access to useful tools before they become mainstream.
Examining the GRASS source
GRASS is very different from a program like MATLAB or MS Word. Where those programs are single monolithic applications, GRASS is actually a collection of very small programs. This makes reading through the source code and expanding GRASS functionality very easy. Every individual GRASS command, such as g.region, d.mon, or r.mapcalc, is actually a small self-contained program. These smaller programs all interact with each other by accessing the GRASS map data stored in your project's LOCATION folder. One can think of this as similar to using RHESSys, where tools such as g2w, cf, and rhessys all work together to form the larger RHESSys package.
In your terminal window, browse to where you decompressed the source tarball and look around. Most of the files in the top level of the source are either documentation or files for building GRASS. The source is in several folders: display contains the code for all the tools that start with d., general contains the source for all the tools that start with g., raster contains the r. tools, and so on. Let's take a look at one of these tools.
From the top of the GRASS source folder, type:
$> cd raster/r.clump
$> ls
The files of interest here are:
- main.c
- clump.c
- local_proto.h
- Makefile
The files ending in .c are C language source files. They are roughly equivalent to a .m file in MATLAB. These are what contain the actual data processing aspect of the program. main.c is a special file. This is where the main() function of your program is, and the first
code to be run when executing the program. The file is not required to be called main.c, but by convention it is. If you are looking at
a module and do not see a file main.c, use the command
$> ls *.c | xargs grep -H main
And look for a file that contains a line:
int main(int argc, char *argv[])
For example, if you look at the r.circle folder, there is no main.c file. Running the above command will show you that the main() function is actually located in the dist.c file.
Next is the .h file. It plays the same role as gis.h and the other .h files we were looking for at the beginning of the tutorial. .h files are called headers. Their job is to tell a .c file what functionality is available outside of itself. Unlike MATLAB, where all .m files share the same space, each .c file is completely cut off from the others. When we start writing GRASS modules, the gis.h file will tell our program what GIS functionality is available.
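As a rough illustration of what a header contributes, the sketch below puts a declaration and its definition in a single file. In a real module the declaration would live in a header such as local_proto.h and the definition in a separate .c file. The names count_cells and print_cell_count are hypothetical and are not part of GRASS.

```c
#include <stdio.h>

/* Declaration (prototype). In a real module this line would sit in a
 * header file, so that every .c file including the header knows the
 * function exists. count_cells is a hypothetical name, not a GRASS call. */
long count_cells(int nrows, int ncols);

/* A caller only needs the declaration above in order to compile. */
void print_cell_count(int nrows, int ncols)
{
    printf("%ld cells\n", count_cells(nrows, ncols));
}

/* Definition. In a real module this would sit in its own .c file. */
long count_cells(int nrows, int ncols)
{
    return (long)nrows * ncols;
}
```

If the declaration were removed, print_cell_count would have no way of knowing what count_cells takes or returns, which is the gap a header fills between separate .c files.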
Makefile describes the steps necessary to convert our .c and .h files into an actual program. While Makefiles can be very complex, ours will be basic.
Take a minute to look through the code of a few GRASS modules you use often and get a feel for what they look like.
The example program
Thankfully, there is a simple program that demonstrates the basics of reading and writing GRASS raster maps. We will go through
this program line by line and see what is happening. Go back to the root of the GRASS source tree, then
$> cd doc/raster/r.example
The r.example program copies the contents of a GRASS raster map, then writes that data back out as a new map. Open up main.c in a text editor.
Line 1-15
1 /****************************************************************************
2 *
3 * MODULE: r.example
4 * AUTHOR(S): Markus Neteler - neteler itc.it
5 * with hints from: Glynn Clements - glynn gclements.plus.com
6 * PURPOSE: Just copies a raster map, preserving the raster map type
7 * Intended to explain GRASS raster programming
8 *
9 * COPYRIGHT: (C) 2002,2005 by the GRASS Development Team
10 *
11 * This program is free software under the GNU General Public
12 * License (>=v2). Read the file COPYING that comes with GRASS
13 * for details.
14 *
15 *****************************************************************************/
This first part is a comment, similar to using % in MATLAB or # in a bash shell script. C encases comments in a slash and star, such as
/* Contents of comment here.... */
Note that comments do not nest: the first closing symbol ends the comment, so inserting multiple closing comment symbols causes problems. For example
/* Contents of comment here... */ <random C code> */
will produce an error.
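The same limitation appears when you try to comment out a stretch of code that already contains comments. One common workaround, shown here as a minimal sketch, is to switch the code off with the preprocessor instead. The function scale_value is hypothetical and exists only for illustration.

```c
/* scale_value is a hypothetical function used only for illustration. */
int scale_value(int value)
{
    int result = value * 2;   /* doubling is the active behavior */

#if 0
    /* Everything between #if 0 and #endif is skipped by the compiler,
     * even though it contains its own comments. */
    result = value * 3;       /* an alternative that is switched off */
#endif

    return result;
}
```

Changing the 0 to 1 switches the disabled lines back on without touching any comment markers.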
Line 18-22
18 #include <stdio.h>
19 #include <stdlib.h>
20 #include <string.h>
21 #include <grass/gis.h>
22 #include <grass/glocale.h>
This is a list of header files that the program needs. These lines must come before any code that uses what they declare, so by convention they go at the top of the file. stdio.h, stdlib.h, and string.h are all part of the standard C library and define various utility functions that the program will use. gis.h and glocale.h are header files from GRASS. gis.h tells your program how to access the GRASS location data, and glocale.h provides the _() macro used to mark text for translation.
Line 58
58 int main(int argc, char *argv[])
Every C program has a main function. This is where the program starts. You need not worry about argc and argv; the GRASS library will handle those values for you.
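For the curious, argc is the number of words on the command line and argv holds the words themselves, with the program name in argv[0]. That is the value r.example later passes to G_gisinit(). The sketch below is plain C with no GRASS calls. count_user_args is a hypothetical helper that a main() function could call as count_user_args(argc, argv).

```c
#include <stddef.h>

/* Counts the command-line words that follow the program name.
 * argv[0] is the program name, so user arguments start at index 1. */
int count_user_args(int argc, char *argv[])
{
    int count = 0;
    int i;

    for (i = 1; i < argc; i++) {
        if (argv[i] != NULL)
            count++;
    }
    return count;
}
```

For a command line such as "r.example input=a output=b", argc is 3 and the helper reports 2 user arguments.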
Line 61-78
61 struct Cell_head cellhd; /* it stores region information,
62 and header information of rasters */
63 char *name; /* input raster name */
64 char *result; /* output raster name */
65 char *mapset; /* mapset name */
66 void *inrast; /* input buffer */
67 unsigned char *outrast; /* output buffer */
68 int nrows, ncols;
69 int row, col;
70 int infd, outfd; /* file descriptor */
71 int verbose;
72 RASTER_MAP_TYPE data_type; /* type of the map (CELL/DCELL/...) */
73 struct History history; /* holds meta-data (title, comments,..) */
74
75 struct GModule *module; /* GRASS module for parsing arguments */
76
77 struct Option *input, *output; /* options */
78 struct Flag *flag1; /* flags */
The programmer chose to declare all variables here at the start of the program. Note that it is not necessary to declare all variables together at the start of a function. Since C99, variables may be declared anywhere, and more recent style guides recommend declaring your variables closer to where they are actually used. Also note that each variable is prefixed with a type, such as struct, char, or int. This behavior is very different from R or MATLAB. In R or MATLAB, you can simply type
R> my_array <- c(1,1,2,3,5,8)
and R will decide that my_array should be a vector and get on with it. C requires you to explicitly give each variable a type. For example, the following code
int a_number;
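a_number = "hi";
will be rejected by the compiler, or at the very least flagged with a type warning, depending on the compiler and its settings.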
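When a value really does arrive as text, as command-line arguments do, it has to be converted explicitly. The following is a minimal sketch using the standard library function strtol. parse_int is a hypothetical helper, not a GRASS function.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Converts text to an int. Returns 1 on success and stores the value
 * in *out; returns 0 if the text is not a whole number that fits. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return 0;

    errno = 0;
    value = strtol(text, &end, 10);

    if (errno != 0 || *end != '\0')      /* overflow or trailing junk */
        return 0;
    if (value < INT_MIN || value > INT_MAX)
        return 0;

    *out = (int)value;
    return 1;
}
```

With this helper, the text "42" becomes the integer 42, while "hi" is reported as a failure instead of silently ending up in an int variable.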
Line 80
80 G_gisinit(argv[0]); /* reads grass env, stores program name to G_program_name() */
This function connects your program to the GRASS location database. It must be the first GRASS function called. Note that the function name starts with 'G_'. Functions from the GRASS GIS library follow this pattern.
Line 83-85
83 module = G_define_module();
84 module->keywords = _("raster, keyword2, keyword3");
85 module->description = _("My first raster module");
More GRASS setup. These lines give GRASS information about your program. The keywords and description are optional, but useful.
Options and flags
Arguments to a GRASS program come in two types, options and flags. Flags are a '-' followed by
a letter. For example, the following command uses -p as a flag.
GRASS> g.region -p
Flags can be thought of as switches: they are either true or false. Options are arguments that can take a larger number of values. In the r.in.ascii command, input and output are options.
GRASS> r.in.ascii input=my_ascii_file.txt output=new_raster
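The difference between the two shapes can be sketched in plain C. The classifier below only illustrates how a flag and an option look on the command line. It is not how the GRASS parser is implemented, and the names arg_kind and classify_arg are hypothetical.

```c
#include <ctype.h>
#include <string.h>

enum arg_kind { ARG_FLAG, ARG_OPTION, ARG_OTHER };

/* Decides what shape a single command-line word has:
 *   "-p"         -> ARG_FLAG   (a dash followed by exactly one letter)
 *   "input=file" -> ARG_OPTION (a non-empty key, an '=', then a value)
 * Anything else is reported as ARG_OTHER. */
enum arg_kind classify_arg(const char *arg)
{
    const char *eq;

    if (arg == NULL)
        return ARG_OTHER;

    if (arg[0] == '-') {
        if (isalpha((unsigned char)arg[1]) && arg[2] == '\0')
            return ARG_FLAG;
        return ARG_OTHER;
    }

    eq = strchr(arg, '=');
    if (eq != NULL && eq != arg)         /* the key must not be empty */
        return ARG_OPTION;

    return ARG_OTHER;
}
```

Here "-p" is classified as a flag and "input=my_ascii_file.txt" as an option. In a real module you never write this logic yourself, because the GRASS library does the parsing for you.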
To create a flag corresponding to '-f', use this code:
struct Flag *my_flag;
my_flag = G_define_flag();
my_flag->key = 'f';
my_flag->description = "Nobody knows what this flag does";
key is the letter you want to use. Note that this can only be a single letter. description is the text that will be used in the automatically generated '--help' output of your program.
To create an option such as opt=value, use this code:
struct Option *my_option;
my_option = G_define_option();
my_option->key = "opt"; /* key is a string, so it needs double quotes */
my_option->type = TYPE_STRING; /* Note this can be TYPE_STRING, TYPE_DOUBLE, or TYPE_INTEGER */
my_option->required = NO; /* May also be set to YES. If required is set to YES and the user does not
specify the option, your program will exit immediately */
my_option->description = "Useless option";
Line 98-104
98 if (G_parser(argc, argv))
99 exit(EXIT_FAILURE);
100
101 /* stores options and flags to variables */
102 name = input->answer;
103 result = output->answer;
104 verbose = (!flag1->answer);
Once you have defined all your arguments, lines 98-99 will fill the Option and Flag structures you created with values taken from the command line. Lines 101-104 copy those values for future use. The r.example program skirts the data type issues mentioned in the Options and flags section above by using the G_define_standard_option() function; however, I think it is easier and more consistent to do it the generic way.