Difference between revisions of "Preparing input data sets"

Latest revision as of 22:50, 11 July 2012

The GRASS GIS program will be used to process and store your set of spatial data layers associated with a project. The latest GRASS build can be downloaded at http://grass.itc.it.

Set the geographic region

A region refers to a geographic area with some defined boundaries, based on a specific map coordinate system and map projection. In GRASS, each region also has associated with it the specific east-west and north-south resolutions of its smallest units (rectangular units called "cells").

The region's boundaries are given as the northernmost, southernmost, easternmost, and westernmost points that define its extent. The north and south boundaries are commonly called northings, while the east and west boundaries are called eastings.

You need to set the geographic region for your personal mapset. For now, this will be the same as the DEM. Set the region with:
grass> g.region rast=<dem>

To find out the settings of the current region:
grass> g.region –p

This will tell you the projection, zone, datum, ellipsoid, N/S/E/W coordinates, resolution, and number of rows and columns of your DEM.

Start a display monitor and view a map

To look at the DEM, start a monitor with the GRASS command:
grass> d.mon start=x0

This will start a graphics display monitor called x0. You can have up to 10 monitors open at once (each one must be started with grass> d.mon start=x_); each must be assigned a different number, i.e. x0, x1, x2, etc...; when you have more than one monitor open, select one for use with the command:
grass> d.mon select=x2

To display the raster map layer, type the GRASS command:
grass> d.rast map=hjadem

To clear the monitor, type the GRASS command:
grass> d.erase

Defining a watershed boundary

Rather than working with the full extent of the DEM (which probably contains areas outside of the watershed you are interested in) you can define the boundaries of a landscape that drain to a specific point (or outlet). You will use the GRASS watershed basin creation program (r.water.outlet) to define the boundaries for a sub-watershed within the HJ Andrews basin. First however, r.water.outlet requires a drainage direction map that must be created with the GRASS program r.watershed.

To create the drainage direction map, type the GRASS command:
grass> r.watershed elevation=hjadem drainage=hja_drain

This traces the flow through the elevation model and creates a new raster map called drain in your personal mapset.

GRASS recognizes geographic coordinates as the Easting and Northing of each point across the landscape. To define the watershed boundaries associated with a particular outlet, you will need to identify the Easting (E) and Northing (N) of a point on the stream network. For this exercise, you can experiment with picking any point you choose on the stream network, regardless if it is a gaged stream or not. The purpose of this exercise is simply to acquaint you with how to create a sub-watershed.

Generally, you will want to work with a watershed for which you have some measured information, such as streamflow, soil moisture, or LAI, so you have observed data to compare model results to. This is most often observed streamflow data, which is widely and readily available from many gaged streams. Therefore, you attempt to delineate the modeled watershed to match the real watershed draining to that gage. To identify the outlet point for a gaged stream, it is helpful to overlay a map of stream gages on the stream network if available. To do this, you will need to import the stream and gage maps provided with the tutorial data set into GRASS. At the beginning of this tutorial, you should have copied the ascii files hjagages.asc and hjastreams.asc into your grassdata directory.

Importing files into grass

To import ascii raster files into GRASS, direct input to the ascii map, give the resultant map an output name, and set null values to 0 (read working in GRASS and UNIX below):
grass> r.in.ascii input=/full_path_name/hjagages.asc output=hjagages nv=0
grass> r.in.ascii input=/full_path_name/hjastreams.asc output=hjastreams nv=0

(GRASS supports many formats. For information on importing different formats, i.e. ARC-ASCII-GRID with r.in.arc, ESRI/E00 with m.in.e00, see the GRASS data import help webpage).

Display the maps, overlaying the gages on the streams, with:
grass> d.rast map=hjastreams
grass> d.rast -o map=hjagages

A map of the stream network overlain by the stream gages should be displayed in your monitor. Note the gages are very small points across the watershed and can be hard to see without zooming in, which will be discussed shortly.

This map illustrates gage locations throughout the HJA.

Identify an outlet point To identify location information on either the stream or gage maps, type the GRASS command
grass> d.what.rast

Move the cursor over the display; use the left mouse button to click on a gage point. The E and N and map category value of the point you clicked will be displayed on your terminal window. Right click to end the session.

If you are having trouble placing your mouse right on top of a gage point, it may be helpful to zoom in with the GRASS command grass> d.zoom

Follow the on screen instructions using the mouse to zoom in on a gage or group of gages. Then use grass> d.what.rast to query the points.

NOTE: You want to find the E and N coordinates for an outlet that falls on the stream network in order to delineate the boundary of all area draining to that point. However, the gages may not fall exactly on the stream network, so you may need to adjust where you choose your E and N coordinates so the point you choose falls on the stream network as close to the gage as possible. d.zoom resets the region to the area you are zoomed in on. Any additional functions you perform after zooming in would only process the area you are zoomed in on. As you may be zoomed in very tightly on a stream segment and the full area of the watershed you want to capture may not be displayed, you should reset your region with:
grass> g.region rast=hjadem

To redisplay the map at the full extent of the hjadem region, you could use:
grass> d.redraw

Or to clear your monitor, use:
grass> d.erase

Create a watershed

Once you have the location of an outlet (the E and N coordinates), you can generate the sub-basin from the drainage direction map created with r.watershed and the E and N coordinates you isolated with r.what.rast.

G> r.water.outlet drainage=drain basin=new_basin_name easting=xxx northing=xxx

This will generate a watershed basin map (new_basin_name) from the drainage direction map (drain) and put it in your personal mapset. This map will have a value of 1 everywhere inside the boundary, and a value of 0 everywhere that falls outside of new_basin_name (ie. the rest of the DEM).

Display this map:
G> d.rast map=new_basin_name

NOTE: Remember, the set of coordinates you use represents the outlet point of the watershed, and the watershed basin is all area upstream of that point that drains to that point. Therefore, if the point you choose is on a hill slope, the resulting map will only reflect the small sliver of land uphill that drains to that point. This may take several tries, so don't get frustrated!

Using the overlay option with a new map generated from an existing map
  
When you create a new map that represents a portion of the extent of an 
existing map, you may need to run the GRASS command r.support in order to set all null 
values to transparent.  For example, you would use r.support if you wanted to overlay  
new_basin_name on top of the DEM so that all areas outside of the new sub- basin were 
transparent, allowing the DEM underneath to show through.  You may find the GRASS
command r.support helpful as you create new maps. r.support can only be run interactively.  
To use it for the purpose of setting null values to transparent, accept the default (no) for 
the first six questions, but answer yes to the last question ‘Do you want to delete null file 
for map_name?’  Answer yes.

Preparing spatial input data sets

You have a DEM (hjadem) and have now created a basin (new_basin_name) map. However, RHESSys requires additional spatial data to form a complete landscape representation and establish connectivity between spatial units. One of the unique features of RHESSys is its hierarchical landscape representation. RHESSys partitions the landscape into a hierarchical spatial structure, where each level of the spatial hierarchy fully covers the spatial extent of the landscape. For example, stratum (vegetation) are processed within each patch, patches are contained within hillslopes and zones, which are units contained within the full basin. Each level of the hierarchy is defined as a particular object type with a set of storage (state) and flux variables, and an associated set of model parameters (default files).

Object type and associated processes:

Basin - defines the drainage basin, full extent of location being modeled.
Hillslope - defines areas which drain to a single point or stream reach.
Zone - denotes areas of similar climate.
Patch - soil moisture processes and carbon and nitrogen cycling.
Stratum - vertical layers within a patch (ie. vegetation)

From the DEM you'll need to derive the following maps:

Slope
Aspect
Basin
Hillslope
Patch
Saturated soil hydraulic conductivity at the surface (Ksat0)
Decay of hydraulic conductivity with depth (m)
Stream network
Roads

Create horizon maps

RHESSys expects horizon values as sin(horizon angle). You will need to compute two horizon maps, one for east, and one for west. In GRASS, you would do this with the commands

g> r.horizon -d elevin=<DEM> direction=0 horizon=east
g> r.horizon -d elevin=<DEM> direction=180 horizon=west

This will produce two maps, east_0, and west_0. These maps are the horizon in degrees though, so you will need to use mapcalc to create new maps with the sin of these values.

g> r.mapcalc 'east_horizon = sin(east_0)'
g> r.mapcalc 'west_horizon = sin(west_0)'

These two maps will be referenced by your template file.

Create slope and aspect maps

To generate slope and aspect layers for the new_basin_name map from the DEM, give the the input elevation map (el), and output names for both the slope and aspect maps. For this exercise use the following:
grass> r.slope.aspect el=hjadem slope=slope aspect=aspect

This will create 2 new maps in your personal mapset, slope and aspect. You can name these maps whatever you want, but by convention, they are generally called slope and aspect. As you created these for the full DEM, you may want to use demslope, etc…. If you were still zoomed in and did not reset your region to the full extent of the DEM then you may want to give your slope and aspect maps an identifying extension.

Verify the maps are there with:
grass> g.list type=rast mapset=your_userID will list maps only in your personal mapset directory.

Create hillslope and stream maps

You will use a GRASS watershed basin analysis program to generate a set of maps used to delineate key spatial layers in RHESSys. RHESSys uses different spatial objects to model specific hydro-ecological processes. These objects form a hierarchical representation of the landscapes key objects. The hillslope and stream maps are necessary as they represent key landscape objects. The GRASS r.watershed program creates maps of watershed basins, hillslopes, flow accumulation, drainage direction, and stream segments within a DEM. You only want to create these maps for the new_basin_name portion of the DEM.

For this step, zoom in on the new_basin_name map fairly close, however, leave at least one pixel width of space around the watershed. As d.zoom resets the region, it allows you to work with only the portion of the map displayed as you create additional maps in this step.
grass> d.zoom

Hillslopes are defined as land area draining either side of a stream reach in each sub-basin. Hillslopes are created by breaking up the watershed into sub basins associated with a particular stream reach. To generate a hillslope map and a stream map, you will use the r.watershed program (r.watershed was used to create the hjastreams map you viewed previously). This program can be run interactively or non-interactively. To run non-interactively (command line):
grass> r.watershed el=hjadem t=100 ac=acc dr=drain ba=b100 ha=h100 stream=str100

The proceeding was an abbreviation of:
grass> r.watershed elevation=hjadem threshold=100 accumulation=acc drainage=drain basin=b100 half.basin=h100 stream=str100

Where threshold sets the aggregation value, elevation is the input map, and the rest are output maps.

This will put several new maps in your PERMANENT mapset (you can check with g.list). You can name the output maps whatever you choose, this is simply the convention established by previous RHESSys users (h100 indicates hillslopes at a threshold of 100 cells, etc...). Threshold represents the minimum size of a sub-basin in cells, or overland flow units. Try experimenting with different threshold values (i.e. try additional thresholds at 50 and 200), then display the hillslope maps (in different monitors) to see the difference.

Setting a mask

In order to work with data that falls only inside of this new sub-basin area rather than working with the full extent of the DEM or region you currently have set (i.e. as a result of zooming), you must set a mask. Setting a mask 'ignores' those areas that fall outside of the designated mask area on all subsequent operations, blocking them from analysis. This program can only be run interactively.

Type the GRASS command:
grass> r.mask

Choose option 2 'Identify a new mask'

You will be prompted to enter the name of a raster map layer
grass> new_basin_name

You will be shown a listing of this map's categories, and asked to assign a value of "1" or "0" to each map category. Areas assigned category value "1" will become part of the mask's surface, while areas assigned category value "0" will become "no data" areas in the MASK file.

Hit enter to move your cursor to the second category and type 1 in the space (type 1 next to all non-zero categories), hit <Esc><Enter> as instructed at the bottom of the page. You will be returned to the option page where you will see listed at the top of the screen that the current mask is new_basin_name, enter to return to the GRASS prompt.

Now if you display the DEM you will only see the area of the DEM that falls inside the new_basin_name basin area.
grass> d.rast map=hjadem

Now reset the region to the geographic boundaries of new_basin_name with the g.region zoom option (sets the current region settings to the smallest region encompassing all non-zero data in the named raster map layer): grass> g.region raster=new_basin_name zoom=new_basin_name
grass> d.redraw

You should now have a much closer view of the DEM for new_basin_name.

Zones

Zones denote areas of similar climate. Zones store climate variables such as radiation, temperature etc... Each zone is linked to an input climate station. Precipitation and temperature data, for example, from this station are modified based on zone elevation, slope and aspect relative to the climate station. Zone processing also generates climate data not available from climate station (i.e. zones will estimate radiation fluxes if they are not available). Numerous strategies exist to partition areas of similar climate. Elevation bands in a mountainous area, for example, are likely to denote similarity. Distribution of climate stations can also be used to define zone partitioning, where each zone defines the area associated with a particular climate station. For these exercises, you will not be creating a zone map, as meteorological data from 1 climate station is sufficient for this small watershed. In this instance, it is best to use the same map for zones as you use for hillslopes.

Ksat0, m, and roads

You will now use the GRASS raster map layer data calculator to create additional maps required to run RHESSys. The maps you will create in this exercise are the basic maps required to run RHESSys. However, this GRASS command can also be used to perform many arithmetic functions involving one or several existing map layers to create new map layers. See the GRASS website for a list of arithmetic operators.

The GRASS command r.mapcalc is typed in the form:
grass> r.mapcalc 'result = expression'

where result is the name of the new map you are creating and expression is the arithmetic being performed. Use r.mapcalc to generate three additional maps needed by RHESSys. The m and Ksat0 maps define spatial patterns of two soil hydrologic parameters. For this exercise we will assume these parameters do not vary spatially and use the same value for the entire watershed.

Type the following GRASS commands to create maps for Ksat0, m, and roads: (the ' are deliberate characters)
grass> r.mapcalc 'K = new_basin_name * 2'
grass> r.mapcalc 'm = new_basin_name * 0.12'
grass> r.mapcalc 'zero = 0' (zero will be used as the road map as there are no roads in this watershed)

When you created new_basin_name, it was assigned a value of 1. The multiplication operator (*) was used to create a new map called K (Ksat0-saturated soil hydraulic conductivity at the surface) that covers the extent of this sub-basin, which will have a value of 2, and a new map called m (decay of hydraulic conductivity with depth) with a value of 0.12. You also created a road map equal to zero to indicate that there are no roads in this dataset. K and m should be initialized based on values for the soil type in your area of study (see website for soil type values).

Patches

The final map required for a basic RHESSys run is the patch map. Patches represent the smallest resolution spatial unit and define areas of similar soil moisture and land cover characteristics. Vertical soil moisture processing and soil biogeochemistry are modeled within each unit defined as a patch. Lateral transport of material and water occurs between patches, so patch definition must reflect drainage organization of the watershed. You have considerable flexibility in defining a patch structure; patches can be strictly grid-based (i.e. based on 30m DEM), or of arbitrary shape (reflecting the patterns of relevant variability within the landscape, i.e. wetness index, vegetation cover, and stream/road networks).

However, when defining a patch structure attention should be given to the landscape size due to the associated processing time. Most of the processing in RHESSYS – for patch and stratum processes – is done at the patch spatial level. These processes must be performed for each individual patch. Therefore, the more patches there are, the more processing that must be done. Choosing a smaller resolution spatial unit will increase the number of patches, therefore, increase total processing time for a simulation.

For this exercise, you will define the patch structure for new_map_name based on the 30 meter DEM. The GRASS command r.clump re-categorizes data in a raster map layer by grouping cells that form physically discrete areas into unique categories.
grass> r.clump input=hjadem output=new_map_name.cl

By user defined convention, when a map is clumped the extension .cl is added to the map name to identify that the r.clump function has be done.

To look at the patch structure:
grass> d.rast map=new_map_name.cl

Stratum

Strata define vertical, aspatial layers within the patch. Processes such as photosynthesis and transpiration are modeled at the stratum level. Strata usually have the same spatial structure as patches. So the patch (new_map_name.cl) map will also be used to define the stratum object.

You have now created all the maps necessary for RHESSys to define the landscape representation. For now, however, you can quit out of GRASS (G> quit) and work on the additional timeseries data sets required to run RHESSys.

Create maps of x and y (or easting and northing) coordinates

The cxy.c program is included in the RHESSys download in the /trunk/util directory. If you have not previously downloaded it, from a terminal window you can acquire it by going into the /trunk/util directory, then type svn update. Once you download it, simply type make to compile the program, and put the executable with your other program files.

This program creates raster maps of x values and y values based on the currently selected GRASS region.

Start GRASS

From within GRASS, type the following command:

cxy -v xmap=x_coords ymap=y_coords

(or you can use the BASH script cxy.sh to run the program. this is an example to run the cxy program with sample argument values for the names of the xmap and ymap. You can edit this script for the output map names of your choosing, then simply execute the script)

Developing timeseries data sets

Climate RHESSys requires daily climate inputs for precipitation, minimum and maximum air temperature. These climate inputs are linked to zones in the landscape representation by a base station ID affiliated with that zone. A single base station will typically serve multiple (or all) zones within the landscape. Each base station is described by a base station file and stored in the clim directory.

The tutorial data set included a base station file (w8_base) and 3 climate files (w8_daily.rain, w8_daily.tmin, w8_daily.tmax) for a small watershed (w8) in the HJA that you should have copied into your clim directory. Have a look at the base station file structure.

From the UNIX prompt:
unix> more w8_base

Take note of the following content:

Line 1: 101 base_station_id. This ID will be read by zones in the landscape to link it to the base station climate file.

Line 4: 485 z_coordinate. Elevation (in meters) of meterological station where climate data was collected.

Line 11: ../clim/w8_daily daily_climate_prefix. Tells the base station what daily precipitation and temperature files to read, and the directory they are in.

NOTE: if you were developing your own climate input files, you would edit this base station file to reflect your meteorological station elevation and daily climate prefix.

Each of the climate input files (precipitation, maximum temperature and minimum temperature) is contained in a separate file. RHESSys assumes that the climate input files associated with a base station are all named with the same prefix (i.e. w8_daily = daily climate inputs for w8). A filename extension specifying the kind of climate variable (ie. precipitation, maximum or minimum temperature) contained in each file is attached to the end of the prefix. The prefix is then given in the base station description file, directing it to read in all climate files with that prefix.

There are 3 required climate input files - precipitaiton, minimum and maximum daily temperature. (There are also a number of optional climate inputs. If optional climate sequences are not available/used, a process model within RHESSys will provide estimates of these variables. See the RHESSys website for more information.) RHESSys requires the climate files to be in a particular format:

Review the structure of these climate input files. The first line of each input time series file must give the starting date of the time series (Year Month Day Hour). Following the starting date, daily time series values are listed sequentially.

unix> more w8_daily.rain (daily precipitation in meters) These are long files, to end viewing a file, type U> q. unix> more w8_daily.tmax (daily maximum temperature °C) unix> more w8_daily.tmin (daily minimum temperature °C)

Developing your own daily climate files may require some preprocessing in order to format the data so it can be read by RHESSys. Be sure to check for missing data and make sure data is in the correct units. The RHESSys website describes a process for dealing with missing data and formatting it for RHESSys. It is also a good idea to inspect a graph of your input time series data visually to check for abnormal spikes or unrealistic values such as negative precipitation events. You may need to use a statistical computation program such as R, S-Plus, SPSS, or Excel to fill in missing data.

A climate data file should only consist of the start date at the top, followed by 1 value for each day. When naming the files you can use any name you want for the beginning of the prefix (just make sure all three have the same prefix). However, by convention, the prefix should include the time period of the data (i.e. daily, to distinguish these from hourly time series inputs, an optional input type in RHESSys) followed by the extensions: .rain, .tmin, .tmax

NOTE: Depending on the program you use to save your data files, it may attach an additional file extension to the end of the file (such as .txt). When you bring the files into your UNIX clim directory, make sure to remove any file extensions. Also, if you develop the files on a PC and transfer them to UNIX, you may need to run the converter dos2unix to convert from DOS text format to UNIX format - which will remove the ^M characters from the file. If you open one of the files with vi, you can see if the characters are present.

Streamflow

You will need observed streamflow data for calibration. As with climate, when creating your own streamflow files some preprocessing may be necessary to check for missing or abnormal data, and for formatting. When used in the calibration procedure (discussed in Module III) format for streamflow files is the same as for climate files. The Observed streamflow has been provided for w8 (obs.wy79_80dw8) for the exercises you will do in the next module.

RHESSys outputs streamflow in millimeters per day (streamflow normalized by basin area), so the observed streamflow must also be converted to millimeters per day. (RHESSys also aggregates output at a monthly timestep in the event you only have observed monthly streamflow.)

For example, the USGS uses cubic-feet per second (CFS) as their unit of measurement, so you would need to convert from CFS to mm/day. To convert from mean daily CFS to mm per day, you need to know the area of the watershed draining to the gage you have streamflow (Q) data for in square feet.

(If necessary, to convert area in hectares (ha) to square feet (sq.ft.), multiply ha by 107639.104169.)

To convert mean CFS to mm per day:

1. divide mean CFS by the basin area in sq.ft = Q ft/sec

2. multiply Q ft/sec by 86400 = Q ft/day

3. multiply Q ft/day by 304.8 = Q mm/day

@@ Line 144: / Line 144: @@
 for east, and one for west. In GRASS, you would do this with the commands
-'''g> r.horizon -d elevin=<DEM> direction=0 horizon=horizon'''<br>
+'''g> r.horizon -d elevin=<DEM> direction=0 horizon=east'''<br>
-'''g> r.horizon -d elevin=<DEM> direction=180 horizon=horizon'''
+'''g> r.horizon -d elevin=<DEM> direction=180 horizon=west'''
-This will produce two maps, horizon_0, and horizon_180, where 0 is east and 180 is west. These maps are
+This will produce two maps, east_0, and west_0. These maps are
 the horizon in degrees though, so you will need to use mapcalc to create new maps with the sin of these
 values.
-'''g> r.mapcalc 'east_horizon = sin(horizon_0)' '''<br>
+'''g> r.mapcalc 'east_horizon = sin(east_0)' '''<br>
-'''g> r.mapcalc 'west_horizon = sin(horizon_180)' '''
+'''g> r.mapcalc 'west_horizon = sin(west_0)' '''
 These two maps will be referenced by your template file.
@@ Line 256: / Line 256: @@
 You have now created all the maps necessary for RHESSys to define the landscape representation.  For now, however, you can quit out of GRASS (G> quit) and work on the additional timeseries data sets required to run RHESSys.
+==Create maps of x and y (or easting and northing) coordinates==
+The cxy.c program is included in the RHESSys download in the /trunk/util directory.  If you have not previously downloaded it, from a terminal window you can acquire it by going into the /trunk/util directory, then type svn update.  Once you download it, simply type make to compile the program, and put the executable with your other program files.
-===Carbon Stores===
+This program creates raster maps of x values and y values based on the currently selected GRASS region.
+Start GRASS
+From within GRASS, type the following command:
+cxy -v xmap=x_coords ymap=y_coords
+(or you can use the BASH script cxy.sh to run the program.  this is an example to run the cxy program with sample argument values for the names of the xmap and ymap. You can edit this script for the output map names of your choosing, then simply execute the script)
 ==Developing timeseries data sets==