Difference between revisions of "Developing custom GRASS programs"

From Rhessys
Jump to navigationJump to search
 
(9 intermediate revisions by the same user not shown)
Line 17: Line 17:
 
# Open a terminal window
 
# Open a terminal window
 
# cd to the folder containing your GRASS executable
 
# cd to the folder containing your GRASS executable
# $> '''ls <GRASS app name>/Contents/MacOS/include/grass'''
+
# $> '''ls <GRASS app path>/Contents/MacOS/include/grass'''
 
# If you see several files ending in .h, one of which is gis.h, then you have the GRASS headers.
 
# If you see several files ending in .h, one of which is gis.h, then you have the GRASS headers.
# $> '''ls <GRASS app name>/Contents/MacOS/lib'''
+
# $> '''ls <GRASS app path>/Contents/MacOS/lib'''
 
# You should see a long list of files ending in .dylib.  Make sure that one of them is libgrass_gis.dylib.
 
# You should see a long list of files ending in .dylib.  Make sure that one of them is libgrass_gis.dylib.
 
# If your GRASS install had both the .h and .dylib files, then you already have the necessary files and do not need to download the source.  
 
# If your GRASS install had both the .h and .dylib files, then you already have the necessary files and do not need to download the source.  
  
  
Next, getting the GRASS source.  To download the GRASS source, go to http://grass.itc.it/download/index.php.  It is best to download
+
If you do not have these files, or still want to download GRASS directly from source to have access to the example code, keep reading.  To download the GRASS source, go to http://grass.itc.it/download/index.php.  It is best to download
 
the RC code, as svn code may have issues compiling.  If you did not find the necessary .h and .dylib files, it will be necessary to
 
the RC code, as svn code may have issues compiling.  If you did not find the necessary .h and .dylib files, it will be necessary to
 
compile GRASS.  Unfortunately, GRASS has a long list of dependencies, and building can be rather difficult.   
 
compile GRASS.  Unfortunately, GRASS has a long list of dependencies, and building can be rather difficult.   
  
An explanation of building software and all it's dependancies is beyond the scope of this tutorial.  However, learning to build the software yourself can be a useful skill.  For example, GRASS can support NetCDF, a format we intend to use more in the future.  However, most pre-built GRASS installations do not use this functionality.  By building GRASS yourself, you can have access to useful tools before they become mainstream.
+
An explanation of building software and all it's dependancies is beyond the scope of this tutorial.  However, learning to build the software yourself can be a useful skill.  For example, GRASS can support NetCDF, a format we intend to use more in the future.  However, most pre-built GRASS installations do not use this functionality.  By building GRASS yourself, you gain access to useful tools before they become mainstream.
  
 
===Examining the GRASS source===
 
===Examining the GRASS source===
Line 297: Line 297:
 
Note the use of the atoi() and atof() functions. GRASS stores the option result as an array of characters, which is different than how C
 
Note the use of the atoi() and atof() functions. GRASS stores the option result as an array of characters, which is different than how C
 
represents a number. These two functions convert the text version of the numbers into the actual representation.
 
represents a number. These two functions convert the text version of the numbers into the actual representation.
 +
 +
 +
====Line 106-126 - Opening a raster map====
 +
Much of this code is standard for opening up a new map.  Look back to the section on variable
 +
declarations for the types of various variables like data_type, cellhd, etc.  In the r.example program,
 +
the variable <code>name</code> was read in as a command line option, and is used as the name
 +
of the raster map to open.  If all goes correctly, the result is that <code>infd</code> has been set
 +
to point to the newly open raster map.
 +
<code>
 +
106    /* returns NULL if the map was not found in any mapset,
 +
107      * mapset name otherwise */
 +
108    mapset = G_find_cell2(name, "");
 +
109    if (mapset == NULL)
 +
110    G_fatal_error(_("Raster map <%s> not found"), name);
 +
111
 +
112    if (G_legal_filename(result) < 0)
 +
113    G_fatal_error(_("<%s> is an illegal file name"), result);
 +
114
 +
115
 +
116    /* determine the inputmap type (CELL/FCELL/DCELL) */
 +
117    data_type = G_raster_map_type(name, mapset);
 +
118
 +
119    /* G_open_cell_old - returns file destriptor (>0) */
 +
120    if ((infd = G_open_cell_old(name, mapset)) < 0)
 +
121    G_fatal_error(_("Unable to open raster map <%s>"), name);
 +
122
 +
123
 +
124    /* controlling, if we can open input raster */
 +
125    if (G_get_cellhd(name, mapset, &cellhd) < 0)
 +
126    G_fatal_error(_("Unable to read file header of <%s>"), name);
 +
</code>
 +
 +
 +
====Line 130-140 -Allocating Buffers====
 +
<code>
 +
130    /* Allocate input buffer */
 +
131    inrast = G_allocate_raster_buf(data_type);
 +
132
 +
133    /* Allocate output buffer, use input map data_type */
 +
134    nrows = G_window_rows();
 +
135    ncols = G_window_cols();
 +
136    outrast = G_allocate_raster_buf(data_type);
 +
137
 +
138    /* controlling, if we can write the raster */
 +
139    if ((outfd = G_open_raster_new(result, data_type)) < 0)
 +
140    G_fatal_error(_("Unable to create raster map <%s>"), result);
 +
</code>
 +
 +
====Line 151-154 - Reading Raster Data====
 +
<code>
 +
151    /* read input map */
 +
152    if (G_get_raster_row(infd, inrast, row, data_type) < 0)
 +
153        G_fatal_error(_("Unable to read raster map <%s> row %d"), name,
 +
154              row);
 +
</code>
  
 
====Line 156-176 - Process Data====
 
====Line 156-176 - Process Data====
Line 333: Line 388:
 
  important to process ALL raster types, as this greatly increases the usability of
 
  important to process ALL raster types, as this greatly increases the usability of
 
  your software.
 
  your software.
 
====Line 106-126 - Opening a raster map====
 
Much of this code is standard for opening up a new map.  Look back to the section on variable
 
declarations for the types of various variables like data_type, cellhd, etc.  In the r.example program,
 
the variable <code>name</code> was read in as a command line option, and is used as the name
 
of the raster map to open.  If all goes correctly, the result is that <code>infd</code> has been set
 
to point to the newly open raster map.
 
<code>
 
106    /* returns NULL if the map was not found in any mapset,
 
107      * mapset name otherwise */
 
108    mapset = G_find_cell2(name, "");
 
109    if (mapset == NULL)
 
110    G_fatal_error(_("Raster map <%s> not found"), name);
 
111
 
112    if (G_legal_filename(result) < 0)
 
113    G_fatal_error(_("<%s> is an illegal file name"), result);
 
114
 
115
 
116    /* determine the inputmap type (CELL/FCELL/DCELL) */
 
117    data_type = G_raster_map_type(name, mapset);
 
118
 
119    /* G_open_cell_old - returns file destriptor (>0) */
 
120    if ((infd = G_open_cell_old(name, mapset)) < 0)
 
121    G_fatal_error(_("Unable to open raster map <%s>"), name);
 
122
 
123
 
124    /* controlling, if we can open input raster */
 
125    if (G_get_cellhd(name, mapset, &cellhd) < 0)
 
126    G_fatal_error(_("Unable to read file header of <%s>"), name);
 
</code>
 
  
 
====Line 183-198 - Memory cleanup and exiting====
 
====Line 183-198 - Memory cleanup and exiting====
Line 394: Line 419:
 
Unlike the R and MATLAB languages, C is not designed to be user friendly, and really enjoys allowing you to blow your own
 
Unlike the R and MATLAB languages, C is not designed to be user friendly, and really enjoys allowing you to blow your own
 
foot off. If you are serious about writing your own custom GRASS programs, a basic understanding of the C language will
 
foot off. If you are serious about writing your own custom GRASS programs, a basic understanding of the C language will
save you hours of headaches latter on.  The [http://www.faqs.org/docs/learnc/index.html Learning GNU C] tutorial is a
+
save you hours of headaches later on.  The [http://www.faqs.org/docs/learnc/index.html Learning GNU C] tutorial is a
 
decent place to start learning.
 
decent place to start learning.
 +
 +
===GRASS Program Project Template===
 +
 +
====The code====
 +
 +
====The makefile====
 +
For more information on customizing your own makefiles, see [http://www.gnu.org/software/make/manual/make.html GNU make]

Latest revision as of 14:45, 4 January 2010

As the majority of the RHESSys group is using OS X, this article targets that OS. However, most of the content should apply to any operating system.

Get a compiler

Development tools are not installed by default on OS X. You can install the dev tools from any OS X installation CD. For the most recent version, go Apple's Mac Dev Center and download the latest version of XCode. You will be required to create a free account before you can download the tools.

Get the GRASS source

In order to write your own custom GRASS programs, you will need the GRASS header files and libraries. Even if you already have GRASS on your computer, these may not be installed. Additionally, the actual GRASS source code is an excellent resource for learning to write new GRASS modules, and should be downloaded even if your installation already has the necessary headers and libraries.

First, check to see if your GRASS install has the necessary headers and libraries.

  1. Open a terminal window
  2. cd to the folder containing your GRASS executable
  3. $> ls <GRASS app path>/Contents/MacOS/include/grass
  4. If you see several files ending in .h, one of which is gis.h, then you have the GRASS headers.
  5. $> ls <GRASS app path>/Contents/MacOS/lib
  6. You should see a long list of files ending in .dylib. Make sure that one of them is libgrass_gis.dylib.
  7. If your GRASS install had both the .h and .dylib files, then you already have the necessary files and do not need to download the source.


If you do not have these files, or still want to download GRASS directly from source to have access to the example code, keep reading. To download the GRASS source, go to http://grass.itc.it/download/index.php. It is best to download the RC code, as svn code may have issues compiling. If you did not find the necessary .h and .dylib files, it will be necessary to compile GRASS. Unfortunately, GRASS has a long list of dependencies, and building can be rather difficult.

An explanation of building software and all it's dependancies is beyond the scope of this tutorial. However, learning to build the software yourself can be a useful skill. For example, GRASS can support NetCDF, a format we intend to use more in the future. However, most pre-built GRASS installations do not use this functionality. By building GRASS yourself, you gain access to useful tools before they become mainstream.

Examining the GRASS source

GRASS is very different than a program like MATLAB or MSWord. Where those programs are a monolithic single application, GRASS is actually a collection of very small programs. This makes reading through the source code and expanding GRASS functionality very easy. Every individual GRASS command, such as g.region, d.mon, r.mapcalc, etc. is actually a small self contained program. These smaller programs all interact with each other by accessing the GRASS map data stored in your projects LOCATION folder. One can think of this as similar to using RHESSys, where tools such as g2w, cf, and rhessys all work together to form the larger RHESSys package.

In your terminal window, browse to where you decompressed the source tarball and look around. Most of the files in the top level of the source are either documentation, or for building GRASS. The source is in several folders. display contains the code for all the tools that start with d., general contains the source for all the tools that start with g., raster for the r. tools, and so on. Lets take a look at one of these tools.

From the top of the GRASS source folder, type:

$> cd raster/r.clump
$> ls

The files of interest here are:

  • main.c
  • clump.c
  • local_proto.h
  • Makefile

The files ending in .c are C language source files. They are roughly equivalent to a .m file in MATLAB. These are what contain the actual data processing aspect of the program. main.c is a special file. This is where the main() function of your program is, and the first code to be run when executing the program. The file is not required to be called main.c, but by convention it is. If you are looking at a module and do not see a file main.c, use the command

$> ls *.c | xargs grep -H main

And look for a file that contains a line:

int main(int argc, char *argv[])

For example, if you look at the r.circle folder, there is no main.c file. Running the above command will show you that the main() function is actually located in the dist.c file.

Next is the .h file. These play the same role as the gis.h and other .h files we were looking for at the beginning of the tutorial. .h files are called headers. Their job is to tell a .c what functionality is available outside of itself. Unlike MATLAB, where each .m file is sharing the same space, each .c file is completely cut off from each other. When we start writing GRASS modules, the gis.h file will tell our program what gis functionallity is available.

Makefile describes the steps necessary to convert our .c and .h files into an actual program. While Makefiles can be very complex, ours will be basic.

Take a minute to look through the code of a few GRASS modules you use often and get a feel for what they look like.

The example program

Thankfully, there is a simple program that demonstrates the basics of reading and writing GRASS raster maps. We will go through this program line by line and see what is happening. Go back to the root of the GRASS source tree, then

$> cd doc/raster/r.example

The r.example program copies the contents of a GRASS raster map, then writes that data back out is a new map. Open up main.c in a text editor.

Line 1-15 - Opening Comments

 1 /****************************************************************************
 2  *
 3  * MODULE:       r.example
 4  * AUTHOR(S):    Markus Neteler - neteler itc.it
 5  *               with hints from: Glynn Clements - glynn gclements.plus.com
 6  * PURPOSE:      Just copies a raster map, preserving the raster map type
 7  *               Intended to explain GRASS raster programming
 8  *
 9  * COPYRIGHT:    (C) 2002,2005 by the GRASS Development Team
10  *
11  *               This program is free software under the GNU General Public
12  *               License (>=v2). Read the file COPYING that comes with GRASS
13  *               for details.
14  *
15  *****************************************************************************/

This first part is a comment, similar to using % in MATLAB, or # in a bash shell script. C encases comments in a slash and star such as

/* Contents of comment here.... */

Note that this may cause problems if you insert multiple closing comment symbols. For example

/* Comments of comment here... */  <random C code> */

will produce an error.

Line 18-22 - Header Files

18 #include <stdio.h>
19 #include <stdlib.h>
20 #include <string.h>
21 #include <grass/gis.h>
22 #include <grass/glocale.h>


This is a list of header files that the program needs. These lines MUST come before everything else. stdio.h, stdlib.h, and string.h are all part of the standard C language library, and define various utility functions that the program will use. gis.h and glocale.h are header files from GRASS. They tell your program how to access the GRASS location data.

This differs from MATLAB. In MATLAB, all variables and functions are stored globally. That means that if I have a variable ALPHA in my workspace, when a script references ALPHA, it will see the existing one in the workspace. In C, each file is it's own isolated namespace, so you must use headers to explicitly tell each file what functions and variables it has access to.


Line 31-51 - Function Declarations and Definitions

31 /*
32  * function definitions 
33  */
34 
35 CELL c_calc(CELL x)
36 {
37     /* we do nothing exciting here */
38     return x;
39 }
40 
41 FCELL f_calc(FCELL x)
42 {
43     /* we do nothing exciting here */
44     return x;
45 }
46 
47 DCELL d_calc(DCELL x)
48 {
49     /* we do nothing exciting here */
50     return x;
51 }

The example program creates three functions for processing data. These particular functions simply return the initial value, resulting in r.example copying the input raster map to a new raster. If your program performs a per cell operation, then this can be a good way to organize the program. However for any raster operation that requires access to the value of multiple cells, using a per-cell function will not work.

Line 58 - The main() function

18 int main(int argc, char *argv[])

Every C program has a main function. This is where the program starts. You need not worry about argc and argv, the GRASS library will handle those values for you.

Line 61-78 - Variable Declarations

61     struct Cell_head cellhd;    /* it stores region information,
62                    and header information of rasters */
63     char *name;         /* input raster name */
64     char *result;       /* output raster name */
65     char *mapset;       /* mapset name */
66     void *inrast;       /* input buffer */
67     unsigned char *outrast; /* output buffer */
68     int nrows, ncols;
69     int row, col;
70     int infd, outfd;        /* file descriptor */
71     int verbose;
72     RASTER_MAP_TYPE data_type;  /* type of the map (CELL/DCELL/...) */
73     struct History history; /* holds meta-data (title, comments,..) */
74 
75     struct GModule *module; /* GRASS module for parsing arguments */
76 
77     struct Option *input, *output;  /* options */
78     struct Flag *flag1;     /* flags */

The programmer chose to declare all variables here at the start of the program. Note that it is not necessary to declare all variables together at the start of a function. Variables may be declared anywhere, and more recent style guides recommend declaring your variables closer to where they are actually used. Also note that each variable is prefixed with a type, such as struct, char, or int. This behavior is very different than R or MATLAB. In R or MATLAB, you can simply type

R> my_array <- c(1,1,2,3,5,8)

and R will decide that my_array should be a vector and get on with it. C requires you to explicitly give each variable a type. For example, the following code

int a_number;
a_number = "hi"

will fail to compile, since "hi" is not a integer.

Line 80 - GRASS Initialization

80     G_gisinit(argv[0]);     /* reads grass env, stores program name to G_program_name() */

This function connects your program to the GRASS location database. This line must be the first function called. Note that the function name starts with 'G_'. All GRASS functions will follow this pattern.

Line 83-85 - Module Initialization

83     module = G_define_module();
84     module->keywords = _("raster, keyword2, keyword3");
85     module->description = _("My first raster module");

More mandatory GRASS setup. This will tell give GRASS information about your program. Information is optional, but useful.

Options and flags

Arguments to a GRASS program come in two types, options and flags. Flags are a '-' followed by a letter. For example, the following command uses -p as a flag.

GRASS> g.region -p

Flags can be though of as switches, they are either true or false. Options are arguments that can take a larger number of values. In the r.in.ascii command, input and output are options.

GRASS> r.in.ascii input=my_ascii_file.txt output=new_raster

To create a flag corresponding to '-f', use this code:

Flag *my_flag; 
my_flag = G_define_flag();
my_flag->key = 'f';
my_flag->description = "Nobody knows what this flag does"

key is the letter you want to use. Note that this can only be a single letter. Description is the text that will be used in the automatically generated '--help' option of your program.

To create an option such as opt=value, use this code:

Option *my_option;
my_option = G_define_option();
my_option->key = 'opt';
my_option->type = TYPE_STRING; /* Note this can be TYPE_STRING, TYPE_FLOAT, or TYPE_INTEGER */
my_option->required = NO; /* May also be set to YES.  If requires is set to YES and the user does not
                                    specify the option, your program will exit immediately */
my_option->description = "Useless option"

The statically typed nature of C complicates options a little. The big difference between an option and a flag is the 'type' field. Here you must tell GRASS what kind of answer you are expecting. TYPE_INTEGER specifies an integer value, TYPE_FLOAT a decimal value, and TYPE_STRING is a string of characters, for example a file name. This type will be important as we try and pick out the value the field was set to.

Line 98-104

98     if (G_parser(argc, argv))
99         exit(EXIT_FAILURE);
100  
101     /* stores options and flags to variables */
102     name = input->answer;
103     result = output->answer;
104     verbose = (!flag1->answer);

Once you have defined all your arguments, these two lines will fill the Option and Flag structures you created with values taken from the command line. Line 101-104 copies those values for future use. The r.example program skirts the data type issues mentioned in the above Options and flags section by using the G_define_standard_option() function, however I think it is easier and more consistent to do it the generic way.

Getting the values of options and flags

Now that we've run the G_parser() function, the Flag and Option structures previously created will have information for us. Getting flats is straightforward

int my_flag_value;
my_flag_value = my_flag->answer;

If my_flag was set, then my_flag_value will be non-zero. Else my_flag will equal 0.

The statically typed nature of C makes extracting option values a little more complicated. This example shows how to get the value of each of the three types of options.

int my_integer_value;
float my_float_value;
char* my_string_value; /* char* is how the C language denotes a string, there is no native string type */
/* Set default values for the arguments in case they were not set at the command line */ my_integer_value = 0; my_flaot_value = 0.0; my_string_value = "";
/* TYPE_INTEGER */ if (integer_opt->answer != NULL) my_integer_value = atoi(integer_opt->answer);
/* TYPE_FLOAT */ if (float_opt->answer != NULL) my_float_value = atof(float_opt->answer);
/* TYPE_STRING */ if (string_opt->answer != NULL) my_string_value = string_opt->answer;

Note the use of the atoi() and atof() functions. GRASS stores the option result as an array of characters, which is different than how C represents a number. These two functions convert the text version of the numbers into the actual representation.


Line 106-126 - Opening a raster map

Much of this code is standard for opening up a new map. Look back to the section on variable declarations for the types of various variables like data_type, cellhd, etc. In the r.example program, the variable name was read in as a command line option, and is used as the name of the raster map to open. If all goes correctly, the result is that infd has been set to point to the newly open raster map.

106     /* returns NULL if the map was not found in any mapset, 
107      * mapset name otherwise */
108     mapset = G_find_cell2(name, "");
109     if (mapset == NULL)
110     G_fatal_error(_("Raster map <%s> not found"), name);
111 
112     if (G_legal_filename(result) < 0)
113     G_fatal_error(_("<%s> is an illegal file name"), result);
114 
115 
116     /* determine the inputmap type (CELL/FCELL/DCELL) */
117     data_type = G_raster_map_type(name, mapset);
118 
119     /* G_open_cell_old - returns file destriptor (>0) */
120     if ((infd = G_open_cell_old(name, mapset)) < 0)
121     G_fatal_error(_("Unable to open raster map <%s>"), name);
122 
123 
124     /* controlling, if we can open input raster */
125     if (G_get_cellhd(name, mapset, &cellhd) < 0)
126     G_fatal_error(_("Unable to read file header of <%s>"), name);


Line 130-140 -Allocating Buffers

130     /* Allocate input buffer */
131     inrast = G_allocate_raster_buf(data_type);
132 
133     /* Allocate output buffer, use input map data_type */
134     nrows = G_window_rows();
135     ncols = G_window_cols();
136     outrast = G_allocate_raster_buf(data_type);
137 
138     /* controlling, if we can write the raster */
139     if ((outfd = G_open_raster_new(result, data_type)) < 0)
140     G_fatal_error(_("Unable to create raster map <%s>"), result);

Line 151-154 - Reading Raster Data

151     /* read input map */
152     if (G_get_raster_row(infd, inrast, row, data_type) < 0)
153         G_fatal_error(_("Unable to read raster map <%s> row %d"), name,
154               row);

Line 156-176 - Process Data

156     /* process the data */
157     for (col = 0; col < ncols; col++) {
158         /* use different function for each data type */
159         switch (data_type) {
160         case CELL_TYPE:
161         c = ((CELL *) inrast)[col];
162         c = c_calc(c);  /* calculate */
163         ((CELL *) outrast)[col] = c;
164         break;
165         case FCELL_TYPE:
166         f = ((FCELL *) inrast)[col];
167         f = f_calc(f);  /* calculate */
168         ((FCELL *) outrast)[col] = f;
169         break;
170         case DCELL_TYPE:
171         d = ((DCELL *) inrast)[col];
172         d = d_calc(d);  /* calculate */
173         ((DCELL *) outrast)[col] = d;
174         break;
175         }
176     }

The example program iterates over the row, calling a function on each cell of data. Note the use of a switch statement on the data_type. GRASS raster maps may be of three types, CELL, FCLL, and DCELL. CELL refers to integer values, FCELL refers to single precision decimal values, and DCELL refers to double precision decimal values. This example program does the right thing, and makes no assumptions about the type of raster map to expect, and instead calls a different function depending on the type of the raster map. If you know that your map will only be integer data, for example a vegetation or soil map, then it is only necessary to process CELL data. However if you do this, make sure you check the cell type of the input raster map, and print an error message.

Ever wonder why you need to multiply K map values by 1000 in RHESSys sometimes, 
or occasionally perform other odd multiplications?  This is because the programs 
that read those raster maps only parse out CELL data. By multiplying by 1000, they 
emulate decimal data. This is a bug, and we are in the process of adding proper raster
functionality to the RHESSys tool chain. When writing your own modules, it is very
important to process ALL raster types, as this greatly increases the usability of
your software.

Line 183-198 - Memory cleanup and exiting

Nothing should break if you don't clean up before your program exits, however over time your system will run out of memory and the only solution will be to reboot the computer. Cleaning up after yourself is just good practice. Any variable that was created with G_allocate_raster_buf() must be called with G_free(). Any variable created with G_open_raster_new should be sent to G_close_cell. The G_*_history calls store the work you did in the GRASS history manager for use in writing automated scripts.

183     /* memory cleanup */ 
184     G_free(inrast);
185     G_free(outrast);
186 
187     /* closing raster maps */
188     G_close_cell(infd);
189     G_close_cell(outfd);
190 
191     /* add command line incantation to history file */
192     G_short_history(result, "raster", &history);
193     G_command_history(&history);
194     G_write_history(result, &history);
195 
196 
197     exit(EXIT_SUCCESS);
198 }


The C language

Unlike the R and MATLAB languages, C is not designed to be user friendly, and really enjoys allowing you to blow your own foot off. If you are serious about writing your own custom GRASS programs, a basic understanding of the C language will save you hours of headaches later on. The Learning GNU C tutorial is a decent place to start learning.

GRASS Program Project Template

The code

The makefile

For more information on customizing your own makefiles, see GNU make