Difference between revisions of "Calibrating and running RHESSys"

Latest revision as of 14:41, 28 October 2009

A number of models are embedded in RHESSys that use mathematical representations of the key controls on ecosystem processes. While these relatively simple models are unable to capture every process in a forested ecosystem, they can help explain key mechanisms and responses. RHESSys uses many parameters to describe typical soil, vegetation, and land use characteristics. These parameters, stored in the default files, have values that stay constant through a model run. Literature-based estimates have been used to compile values for common vegetation and soil types. If your site requires a vegetation or soil type for which a default file has not been developed, you will need to develop your own default files by researching the relevant literature (see special module Developing soil and vegetation parameters).

In an ideal process-based model, all parameters would represent directly measurable quantities. While efforts have been made in the development of RHESSys to limit the number of calibrated parameters, several hydrologic parameters commonly require calibration. These parameters reflect drainage efficiency and storage capacity of soils. While soil depth and hydraulic conductivity can be measured for a block of soil, the effective value of these parameters at spatial scales typically used in a model cannot be directly measured. Further, values for these parameters integrate the impact of soil heterogeneity and include impacts of macropore and preferential flowpath distributions. Calibration of these parameters is usually based on measures of fit between observed and modeled hydrographs. Consequently, although initial values for these parameters vary spatially, calibration applies a watershed-wide scaling factor.

The four independent parameters that are typically calibrated in RHESSys are the decay of hydraulic conductivity with depth (m), saturated soil hydraulic conductivity at the surface – Ksat0 (K), and two groundwater parameters which control the proportion of infiltrated water that bypasses soil (via macropores and fractures) to a deeper groundwater table (gw1), and the rate of lateral flow from a hillslope scale groundwater table (modeled as a linear reservoir) to the stream channel (gw2). This module will take you through the process of calibration to establish a set of parameters to use in a basic RHESSys simulation. However, prior to calibrating or running model simulations for predictive purposes, values for the state variables in the worldfile must be initialized.

Model spin-up

One way to initialize the model is to 'spin it up'. This is a preliminary RHESSys simulation you will run to establish values for the state variables in the worldfile. The 'spin-up' period in RHESSys is the time of adjustment it takes for the model to reach a state of equilibrium in vegetation and soil carbon (C) and nitrogen (N) stores. The period of time required for the model to reach equilibrium will vary depending on the characteristics of the landscape you are spinning up (e.g. type of climate, vegetation and soil). A spin-up period of several hundred years is often necessary, due to the slow development of soil organic matter pools, which may have turnover times on the order of decades. Vegetation is an important spatially and temporally dynamic component in the vegetation-soil-water relationship. Therefore, when spinning up a worldfile, the model should be run as a dynamic system to allow vegetation to adapt, evolve and respond to the seasonal and interannual cycles of climate, water redistribution in the system, and soil organic matter pools.

RHESSys can output the current value of all state variables at any particular point of a simulation, thus generating a new worldfile. This is done with the 'output_current_state' option in the TEC file. For example, at the end of a 100-year spin-up simulation, a new worldfile can be output, reflecting estimates of C and N values (as well as all other state variables) attained after 100 years of processing. The resultant worldfile can then be used as the input worldfile (starting point) for subsequent RHESSys simulations.

Spin-up strategies

To obtain a reasonable hydrologic parameter set (m, K, gw1, gw2) to employ in the spin-up period, it may be advantageous to test a range of possible parameter sets (see calibration procedure) with the vegetation in static mode (see special module Hydrologic modeling with prescribed vegetation), where the vegetation does not change and soil decomposition does not occur. Running the model in static mode allows you to focus on the hydrologic component of the system without the dynamic influence that vegetation has on the system. As this parameter set would only reasonably model the hydrologic component of the system, it is therefore a temporary parameter set to be employed only in the spin-up period. The spin-up itself must be run in dynamic mode to simulate carbon and nitrogen cycling processes more fully. Initial calibration in static mode is simply designed to give you a reasonable starting point for the hydrologic parameters used in the spin-up simulation. In the event you do not have vegetation estimates (LAI) to initialize and run the model in static mode, the following standard set of hydrologic parameters can be used for the spin-up period: m=1, K=10, gw1=0, gw2=0.

Cold start - Soil, litter, and vegetation worldfile state variables are initialized in the template at very low or zero values and the model is run until a statistical equilibrium is achieved.

Warm start - Reduces model spin up time. Soil, litter, and vegetation worldfile state variables are initialized in the template from literature values or estimates from a previous simulation in a similar biogeographic region.

Depending on the spatial heterogeneity of your patch structure, or the size of your landscape, a spin-up period of several hundred years can take a substantial amount of processing time. Therefore, it might be easier to do a model spin-up using a single hillslope or patch dataset isolated from your watershed. Once this spin-up is complete, use the resulting worldfile state variable values to initialize vegetation, litter, and soil state variables in the template, and then generate a new worldfile for the full landscape.

RHESSys requires climate data to run a simulation; however, most climate data records don't extend beyond 50 years or so. Since spinning up a worldfile can often take several hundred years, you would need a much longer climate data set. One way to approach this is to run a spin-up simulation for the length of time for which you have climate data, output a new worldfile with the 'output_current_state' TEC event, and then restart the spin-up simulation with the new worldfile. You would repeat this process until soil N and C stores have stabilized (i.e. do not show any long-term upward or downward trends). Another approach would be to create extended climate files by repeating the climate data.

Spin-up process

The following exercise is intended to acquaint you with the process of spinning-up a worldfile. You will use the RHESSys User Interface to run a 1-year spin-up simulation (using the worldfile and flowtable you created in Module II), outputting a new "spun-up" worldfile and result files.

Due to the time involved to spin-up the worldfile to equilibrium for the site used in these exercises, a spun-up worldfile has been provided for you (world.w8) to use in later simulations. This worldfile has been spun-up for 450+ years and reflects RHESSys estimates for an old-growth homogeneous conifer forest.

Spin-up tecfile

In your tecfiles directory, create a TEC file that includes the 'output_current_state' event that will tell RHESSys to output the conditions for all state variables at a given time.

At the Unix prompt type the following command to start vi (or use another text editor):

unix> vi tec.spinup (insert the following text (i) and write it to the new file :wq)

1978 1 1 1 print_daily_on
1978 1 1 2 print_daily_growth_on
1978 12 31 24 output_current_state

• To run the script manually:

U> nice nohup ./script_file_name &

nice = Invokes a command with an altered scheduling priority. This option should be used when running simulations on multi-processor machines that are shared by multiple users.
nohup = Runs a command even if the session is disconnected or the user logs out.
./ = Looks for the file in the directory you are in and runs the commands in script_file_name.
& = Runs the job in the background and returns the prompt straight away, allowing you to run other programs while waiting for that one to finish.

You will not receive a message when a process is finished; however, you can check on the status of any processes you have running with the U> ps -u | grep userID command. The ps command will also tell you the process ID (PID). In the event you want to cancel a job, use the U> kill PID command.

Spin-up simulation results - maintenance

Move to the worldfiles directory. Because you gave RHESSys the TEC event 'output_current_state', a new worldfile will have been generated in the worldfiles directory after the simulation completes the day specified in the TEC file. It will have the name of the original worldfile you used in the simulation followed by the date the new worldfile was generated and the extension .state - i.e. world.tutorial.Y1978M12D31H24.state. Because of the long, cumbersome name given to the new worldfile, you will probably want to rename it (U> mv command). Either overwrite the original non-spun-up worldfile (world.tutorial) by giving the new worldfile that same name, or add an extension to this new worldfile to reflect the number of years the worldfile has been spun-up (i.e. world.tutorial.su1) and retain the original. Keep in mind that a worldfile can be a very large file requiring considerable disk space; therefore, keeping multiple worldfiles may cause disk space issues.

Move to the out directory. In the TEC file, you requested daily and daily_growth output, and in the interface, you requested the command line options for basin and growth output. In combination, these two options produced basin.daily and grow_basin.daily ascii output files in the out directory. Every simulation will construct an output file for each temporal level of output that RHESSys can aggregate results for: hourly, daily, monthly, and yearly. However, you only requested that daily and daily_growth output be written for this simulation, so only the daily and daily_growth output files will contain results. After running many simulations, the out directory can really fill up. It may be useful to delete files you no longer need, and once you have results from many simulations, you may want to organize these in sub-directories within the out directory.

You may want to clean up the out directory by deleting the empty hourly, monthly, and yearly files with the Unix command:

unix> rm *hourly (etc. for monthly and yearly)

The * is a placeholder for any number of characters, allowing you to perform a command on any file ending in hourly. (Use the -f option with the remove command to avoid confirmation questions, i.e. U> rm -f *hourly)

(Using the -l option with the Unix ls command will tell you the size of all files in a directory, i.e. U> ls -l. This can be very useful for managing the out directory, as you can see which files contain data. This command will also show you file permissions, ownership, and the modification date.)

Determining model stabilization

To determine if the model has stabilized and the spin-up process is now complete, RHESSys output for plant and soil C and N from the spin-up simulation should be evaluated. Output for these variables is contained in the grow_basin.daily file. It is also useful to look at patterns for LAI, streamflow, and saturation deficit, which are contained in the basin.daily file. You will use the statistical and graphics program R to look at these results.

Move to the out directory (if you are not in the out directory, you will need to give R complete path names). In addition to the output files generated from the spin-up exercise, you should have three output files in this directory (w8.su75_grow_basin.daily, w8.su75_basin.daily, and w8.su225_grow_basin.daily) that you copied from the tutorial data in the Getting Started portion of the tutorial. As the output files generated in the spin-up exercise only reflect 1 year, output files from a 450-year spin-up have been provided for you for illustrative purposes.

Start R simply by typing R at the Unix prompt.

unix> R

Some information on R will appear along with a standard > prompt.

To determine if the model has stabilized, you are interested in looking at each of the following output variables from the grow_basin.daily files: plantc, plantn, soilc, soiln. First look at output after 75 years of processing, then after 225 years of processing in order to see how these variables have changed and moved toward stabilization.

You may also want to look at streamflow, LAI, and soil moisture (computed by subtracting the variable unsat_stor from sat_def) from the w8.su75_basin.daily output file (for a complete list of the variables contained in each of the output files see the RHESSys website).

• First you must read an ascii output file into R as a table, then plot each of the columns of interest using the following commands:

R> SuGrow75 = read.table("w8.su75_grow_basin.daily", header=TRUE)
R> SuGrow225 = read.table("w8.su225_grow_basin.daily", header=TRUE)

This will create two R tables called SuGrow75 and SuGrow225 that contain all of the output variables from the RHESSys files. You could look at a table simply by typing the table name at the R prompt; however, these files contain data for each day over multiple years for 19 different fields, so they are very long. Instead of looking at a long list of daily values, it may be more useful to simply see what type of output is contained in a file by listing the variable headings.

To see the variable heading names for an output file, use the command R> names(SuGrow75)

R only displays data for a certain width across the monitor, so the variable headings will continue to wrap to the end of the list. The numbers on the left side simply provide an index of the variable for the first entry displayed on a line.

Now plot a timeseries graph for one of the variables (plantc, plantn, soilc, soiln) from each of the tables to see how its trend has changed:

R> plot(SuGrow75$soiln)
R> plot(SuGrow225$soiln)

A new window should have opened and displayed a timeseries graph of soil nitrogen after 75 years, then after 225 years. You would use the same method to plot the other variables and examine the trends.

The model may be considered stable when the plant and soil C and N variables are not trending up or down, not including small seasonal and annual fluctuations. The range of fluctuation should be relatively small, perhaps a magnitude on the order of 5% (take note of the y scale on the graph). If you have measurements or estimates for each of these variables, you should check model results against those values.
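The stability criterion above can also be checked numerically. The following is a minimal sketch, not part of RHESSys: it assumes the output table has a year column alongside the variable of interest, and the function name is_stable, the 10-year window, and the 5% tolerance are illustrative choices. It averages the variable by year (to remove seasonal fluctuation) and tests whether the range of the most recent annual means is small relative to their mean.

```r
# Sketch: judge spin-up stability from annual means of a state variable.
# 'window' (years examined) and 'tol' (relative range allowed) are assumptions.
is_stable = function(year, value, window = 10, tol = 0.05) {
  annual = tapply(value, year, mean)     # one mean per simulation year
  recent = tail(annual, window)          # keep only the last 'window' years
  spread = (max(recent) - min(recent)) / mean(recent)
  spread < tol                           # TRUE when the range is within tol
}

# Toy data (not RHESSys output): a pool rising toward a plateau over 40 years,
# with a small seasonal cycle superimposed
yr    = rep(1:40, each = 365)
soiln = 100 * (1 - exp(-yr / 5)) + sin(2 * pi * (seq_along(yr) %% 365) / 365)

is_stable(yr, soiln)                       # all 40 years: last decade is flat
is_stable(yr[yr <= 10], soiln[yr <= 10])   # first decade only: still climbing
```

With real output you might call it as is_stable(SuGrow225$year, SuGrow225$soiln), provided those column names match your file; a visual check of the plots is still worthwhile.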

You can plot the variable soiln from both the 75-year and 225-year spin-ups on the same graph and see how soiln has changed over time with the additional years of processing. Type the first command as one continuous line (ylim sets the y axis limits, so that soiln from both files is visible):

R> plot(SuGrow75$soiln,ylim=c(min(SuGrow75$soiln), max(SuGrow225$soiln)),col="red")
R> lines(SuGrow225$soiln,col="blue")

To quit an R session:
R> q()

Model Calibration

Model calibration consists of modifying values of model input parameters in an attempt to match field conditions within some acceptable criteria. Observed streamflow data is available and easily obtained for many watersheds. Determining reasonable values for the calibrated parameters (m, K, gw1 and gw2 in this calibration) in RHESSys is done by measuring the correspondence of modeled streamflow to observed streamflow for goodness of fit. There may be many hydrologic parameter sets that acceptably reproduce observed ecosystem behavior. Equifinality refers to an observation that different initial conditions (combinations of parameter values) may generate similar, or equivalent, output from a model. The interactions between the components of such a complex system cannot be considered independently, and so different parameter combinations may arrive at the same end result. Testing a large number of parameter sets across a wide range of possible parameter space helps to reduce uncertainty. Methods (such as GLUE) exist to assess the behaviour of acceptable parameter sets, however, these will not be discussed here.

There are different methods of generating and sampling from the possible parameter space and calculating uncertainty. RHESSys generally employs the Monte Carlo method - a statistical sampling technique used to generate random parameter values from probability distributions, and the Nash-Sutcliffe efficiency metric - which measures the correspondence of modeled streamflow to observed streamflow for goodness of fit.

The RHESSys calibration interface helps to automate the procedure involved in running the model over randomly selected parameter values within a delimited range of parameters (using the Monte Carlo approach) and computing objective function values for each parameter set (using the Nash-Sutcliffe efficiency metric). It should be emphasized, however, that this is not a fully automated calibration procedure that results in optimized parameter sets. It is up to the user to view the results of the calibration, and choose desirable parameter set(s) based on values of objective function(s).

Calibration strategies

Calibration Parameters: Calibration in RHESSys is usually focused on streamflow. However, if you have other observed data (i.e. daily nitrate, monthly psn, daily snowpack, etc...) it is possible to calibrate using these variables. It should be noted that m, K, gw1 and gw2 are hydrologic parameters; however, since water often exerts a strong control on biogeochemical cycling, there may be situations where calibration of these parameters using secondary variables (i.e. PSN) is appropriate. Standard RHESSys calibration involves varying m and K to achieve the ‘best’ correspondence between observed and modeled daily streamflow patterns - where best is usually defined as a high value of the Nash-Sutcliffe efficiency measure and a close match between observed and modeled total streamflow over the calibration period. Some users follow a more hierarchical calibration approach where first total annual streamflow volumes are matched and then further adjustments to m and K (within ranges that produce reasonable total streamflow volumes) are made.

A deeper groundwater model can also be included that utilizes the two additional parameters gw1 and gw2. If you choose to use this additional groundwater model, then gw1 and gw2 must also be calibrated. Since bypass flow (gw1) can have a strong impact on ET (because this water becomes unavailable to plants), the following calibration procedure is recommended when using the groundwater model. The user first calibrates annual streamflow volumes by varying gw1 and gw2 and subsequently calibrates using daily streamflow by adjusting m, K, and gw2 while holding gw1 constant.

Objective function: Historically, we have used Nash-Sutcliffe efficiency as the objective function. However, this measure tends to emphasize correspondence between peak flows. If low flow is an issue, you might want to run two separate calibrations; one where high flow periods are used and a second where low flow periods are used. Further, taking the log of the streamflow values before computing the objective function can reduce peak flow dominance. In the calibration procedures, there are different objective functions that can be used to quantify the degree of correspondence between observed and modeled streamflow. It is up to you to base your selection of parameter sets on the performance described by one or more of the objective functions.
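For reference, the Nash-Sutcliffe efficiency and its log-transformed variant can be written in a few lines of R. This is a sketch of the standard definition, not necessarily the exact code used by the calibration interface; the function names nse and log_nse and the small constant eps (added to avoid taking the log of zero flow) are assumptions.

```r
# Nash-Sutcliffe efficiency: 1 minus the ratio of the squared model error
# to the squared deviation of the observations from their own mean.
nse = function(obs, sim) {
  1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)
}

# Log-transformed variant, which reduces the dominance of peak flows.
# 'eps' is an assumed small constant guarding against log(0).
log_nse = function(obs, sim, eps = 1e-6) {
  nse(log(obs + eps), log(sim + eps))
}

obs = c(1, 2, 3, 4, 5)          # toy observed flows (mm/day)
nse(obs, obs)                   # a perfect match gives 1
nse(obs, rep(mean(obs), 5))     # predicting the observed mean gives 0
```

An efficiency of 1 indicates a perfect fit, 0 means the model does no better than the mean of the observations, and negative values mean it does worse.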

Temporal resolution: RHESSys can output modeled streamflow at a daily, monthly, or yearly timestep. The calibration interface also gives you the option of aggregating streamflow by a given number of days. This allows you to calibrate at different intervals (i.e. weekly, over 3 days, etc...). You may want to employ this strategy when the model is reasonably capturing total streamflow over a period of time (monthly or yearly) but unable to match the daily pattern.

At minimum, you should calibrate the model for at least one year to account for seasonal differences that occur annually. The finer the temporal resolution you calibrate at, the shorter the time period you can generally calibrate over. For example, daily calibration over 1 year, monthly calibration over 5 years, or annual calibration over 10 years.
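Aggregating streamflow over a fixed number of days can also be done by hand in R when comparing modeled and observed series. The sketch below is illustrative only (aggregate_days is a hypothetical helper, not a RHESSys function); it sums daily flow into consecutive blocks of n days, and the final block is shorter if the record length is not a multiple of n.

```r
# Sum daily flow into consecutive blocks of 'n' days (e.g. n = 7 for weekly).
aggregate_days = function(flow, n) {
  block = (seq_along(flow) - 1) %/% n    # block index per day: 0,0,...,1,1,...
  as.vector(tapply(flow, block, sum))    # one total per block, in time order
}

daily = c(1, 2, 3, 4, 5, 6, 7)   # toy daily flows (mm)
aggregate_days(daily, 3)         # blocks are 1+2+3, 4+5+6, and the leftover 7
```

Apply the same aggregation to both the observed and the modeled series before computing an objective function, so that the two are compared at the same interval.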

Calibration time period: When choosing a time period to calibrate on, you should consider the history of the watershed you are modeling. You want your modeled landscape to resemble the actual watershed as closely as possible, so that streamflow predictions from the model are the result of the same conditions under which observed streamflow happened. For example, if you use a period in the observed streamflow record when the watershed was in a state reflecting mature vegetation, free of disturbance (fire, harvesting, roads), you would compare it with predicted streamflow from an undisturbed, mature model. This is perhaps the easiest scenario to calibrate on, as watershed conditions will only be subjected to seasonal changes rather than additional outside factors such as disturbance. You may also want to consider seasonality in choosing the calibration period. It may be more appropriate to work with a water year (divides the wet-weather season from one year to the next; Oct. 1 – Sept. 30 is widely used in the US) than a calendar year, to keep peak and low flow periods intact.
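To group daily records by water year rather than calendar year, a label can be derived from the year and month columns. This is a small sketch assuming the Oct. 1 to Sept. 30 convention mentioned above, with each water year named for the calendar year in which it ends; water_year is a hypothetical helper name.

```r
# Water year labelled by the calendar year in which it ends:
# Oct. 1, 1979 through Sept. 30, 1980 is water year 1980.
water_year = function(year, month) {
  ifelse(month >= 10, year + 1, year)
}

# Sept. 1979 belongs to WY1979; Oct. 1979 and Sept. 1980 belong to WY1980
water_year(c(1979, 1979, 1980), c(9, 10, 9))
```

The resulting label can be used with tapply or subset to total or extract flow for a single water year.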

RHESSys is driven by climate inputs. Precipitation and temperature data from the base station are modified based on zone elevation, slope and aspect relative to the climate station. Zone processing also generates climate data not available from the climate station (i.e. zones will estimate radiation fluxes if they are not available). Therefore, it is useful to calibrate across a range of climate periods, i.e. a warm, average, and cool year, to address the ability of the model to respond to the climatic extremes of the environment being modeled.

Antecedent soil moisture should be allowed to stabilize by running the model for approximately 1-2 years before calculating efficiency or using the model results (i.e. if you are calibrating daily over 1 year, your total simulation will consist of at least 2 years: 1 year for soil moisture to stabilize followed by 1 year of calibration). You would begin printing results (with a TEC event) in the second year.

Parameter ranges: Values for each of the hydrologic parameters will generally fall within the following ranges:

Parameter   Range
m           0.01 - 20
Ksat0       1 - 150
gw1         0.001 - 0.3
gw2         0.01 - 0.9

In order to generate an adequate sample across the full range for each parameter, values from .001 - .01, values from .01 - 1, and values over 1 should be generated separately. The m and K parameters are multipliers on the initial values set for m and K (values assigned to the m and K maps).
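One way to picture the separate sampling of sub-ranges is the sketch below. It is an illustration under stated assumptions, not the calibration interface's own procedure: sample_param is a hypothetical helper, the break points 0.01 and 1 come from the text above, and uniform sampling within each sub-range is an assumption.

```r
# Draw 'n_each' uniform random values from every sub-range of [lo, hi]
# created by the break points, so small values are not under-sampled.
sample_param = function(lo, hi, n_each, breaks = c(0.01, 1)) {
  edges = c(lo, breaks[breaks > lo & breaks < hi], hi)   # sub-range limits
  unlist(lapply(seq_len(length(edges) - 1), function(i) {
    runif(n_each, edges[i], edges[i + 1])
  }))
}

set.seed(1)                           # reproducible draws
m   = sample_param(0.01, 20, 5)       # sub-ranges 0.01-1 and 1-20
gw1 = sample_param(0.001, 0.3, 5)     # sub-ranges 0.001-0.01 and 0.01-0.3
```

Sampling the whole range 0.01 - 20 in a single uniform draw would place about 95% of the values above 1, which is why the sub-ranges are handled separately.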

Calibrating is an iterative process that requires refinement as you go along. Start by testing sets across the full range of possible values. The range tested should be progressively restricted as a pattern emerges indicating the reasonable range for each parameter for the watershed being modeled.

Calibration process

You will set up the calibration for a water year of average precipitation and temperature. The watershed used in these exercises has never been harvested and is typical of an old growth, homogeneous Douglas fir forest; therefore, the model reflects the same conditions.

You should have copied the ascii text file obs.dw8_wy63_01 (reflects daily w8 observed streamflow from Oct. 1, 1963 through Sept. 30, 2001) into your obs directory. By convention, all observed files should begin with the prefix obs. It is also useful to indicate the time period of the observed file in the name, for documentation purposes.

The observed streamflow file is a list of total daily streamflow values in mm (normalized by basin area). Each entry includes the date and streamflow in mm for each day in the record. Creation of the observed streamflow file usually requires some preprocessing to get it into the necessary format. Use a statistical program such as R or Excel to check the observed streamflow file for inconsistencies and convert the data into the necessary format. Streamflow must be in millimeters per day. The file should have columns for the year, month, day and streamflow (mm), with the appropriate headings at the top, followed by the entry for that day, for example:

year month day w8_obs_flow(mm)
1963 10 1 0.091460768

Save this data as a text file (if necessary, make sure to remove the .txt extension and convert from DOS to UNIX format with dos2unix).
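Gauge records are often reported as discharge rather than as a depth over the basin. As a minimal sketch of the unit conversion implied above, assuming mean daily discharge in cubic meters per second and basin area in square kilometers (the function name to_mm_per_day and the example numbers are illustrative, not tutorial data):

```r
# Convert mean daily discharge (m^3/s) to area-normalized depth (mm/day):
# volume per day = Q * 86400 s; depth in m = volume / area; mm = m * 1000.
to_mm_per_day = function(q_m3s, area_km2) {
  q_m3s * 86400 / (area_km2 * 1e6) * 1000
}

to_mm_per_day(1, 86.4)   # 1 m^3/s from an 86.4 km^2 basin is 1 mm/day

# A converted table could then be saved in the obs format with, for example:
# write.table(obs_table, "obs.example", row.names = FALSE, quote = FALSE)
```

If your record uses other units (for example cubic feet per second or square miles), convert to these units first or adjust the constants accordingly.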

The following step is only necessary after you have run multiple calibrations for a project. It has already been done for the tutorial data, so you do not need to execute this command for these exercises; it is included for future reference. When you run calibrations, you will generate multiple calibration result data files (multiple eff files, each with n results, e.g. 25). However, you will want to have all of the results in one file when you analyze them. You can bind multiple files together in R with the rbind command, for example:

R> NewFile = rbind(read.table("eff.tut1", header=TRUE), read.table("eff.tut2", header=TRUE))

To view a summary for each of the variables in the result table (effwy, the eff.w8wy result file once read into R) and see the min and max for each:

R> summary(effwy)

You can now sort the results to determine which parameter sets produced the highest efficiencies, which produced streamflow closest to observed, etc...

R> effwy.ord = order(effwy[,"eff"])

R> effwy.sort = effwy[effwy.ord,]

R> effwy.sort

This will return a permutation that rearranges the table by efficiency, in ascending order. The screen will scroll through all the results, so if this is a long file you may not want to display the entire record. You may only want to look at results with the highest efficiencies.

Print just the parameter sets that produced an efficiency over 0.70:

R> effwy.best = subset(effwy.sort, eff > 0.7)

R> effwy.best

Use a scatter plot to display the results of the RHESSys result file eff.w8wy (called effwy once read into R) in order to determine the range of values producing the best efficiencies for each parameter. To plot the efficiency results for each parameter:

R> plot(effwy$m, effwy$eff)

The scatter plot pattern illustrates the range of values that produce the best efficiencies for each parameter. Generally, values producing above .70 efficiency can be used as a guide to restricting the range for refining the calibration. Also plot K, gw1 and gw2.

Repeat this process with eff.w8sum to see how the parameter range was restricted and used to determine the 'best' parameter set range produced by these runs based on both a high efficiency and correspondence between modeled and observed streamflow totals.

Quit out of R:
R> q()

Basic Run

In order to get a feel for how changing the hydrologic parameter values affects modeled streamflow prediction, run RHESSys to produce output (for the same time period used in calibration) and plot modeled and observed streamflow (follow the RHESSys User Interface instructions to create basin_daily output). Choose a parameter set (or two) from the calibration result data file (the eff file) that appears to have low correspondence to observed streamflow (output from some ‘high’ performing parameters has already been provided for you).

To run RHESSys and generate output, start the RHESSys User Interface from within the scripts directory:

Enter the necessary information, using the m, K, gw1 and gw2 parameter set you chose from the calibration result data file (eff file). You can only run one parameter set at a time.

Use the tecfile you created in module II named tec.tutorial, which tells RHESSys to start printing results on Oct. 1, 1979.

Be sure to start the simulation 1 year before the date you want results to begin being printed (to allow antecedent soil moisture to stabilize), and end the simulation 1 day after the last day you want output for (you want output from 10/1/79 through 9/30/80).

Choose daily basin output, and make sure RHESSys is in dynamic mode.

It is useful to give the output file an identifying name, for example, including the parameter values in the prefix (i.e. m1k10) so you know what parameters were used to produce that result file.

The optimization metric tends to focus on peak streamflow correspondence; therefore, it is useful to actually look at hydrograph correspondence to determine which parameter set produces streamflow that most closely matches observed streamflow (a file of observed streamflow for the calibration water year has been provided – obs.wy79_80dw8). When choosing an ‘optimal’ parameter set from a number of sets that produce similar statistical results, you should look at how well modeled streamflow response matches observed peak flows, low flows, and recessions. Output from three of the ‘best’ parameter sets determined from previous calibration procedures has been provided for you (m1k16_basin.daily, m1k19_basin.daily, and m1k23_basin.daily).

On the same graph, plot modeled streamflow from the output you generated with modeled streamflow from one of the output files provided for you to see streamflow response to different hydrologic parameter sets. Then add observed streamflow to the hydrograph and assess how well the modeled streamflow corresponds with observed streamflow.

First, read in your new result file, one of the provided output files, and the observed streamflow file into R with the read.table command:

R> par_set = read.table("result_basin.daily", header=TRUE)

R> m1k23 = read.table("m1k23_basin.daily", header=TRUE)

R> obs = read.table("obs.wy79_80dw8", header=TRUE)

You must generate a file of dates in order to plot this as a time series:

R> dates.wy7980 = as.Date(paste(m1k23$year, m1k23$month, m1k23$day, sep="-"))

(you can look at the new file by typing dates.wy7980 at the prompt)

Now, graph modeled streamflow from both output files with observed streamflow:

R> plot(dates.wy7980, par_set$totalstreamflow, type="l", col="green")

R> lines(dates.wy7980, m1k23$totalstreamflow, type="l", col="red")

R> lines(dates.wy7980, obs$obs, type="l", col="blue")

You may want to plot streamflow from the three output files provided, and analyze the hydrographs to determine which parameter set produces streamflow that most closely matches observed streamflow. This is how you would choose the parameter set you would use for your final simulation and to generate the temporal and spatial output you are interested in for your research project.

Generating patch output

You already have daily basin output for your chosen parameter set from a previous simulation. Another run is needed to generate output that can be visualized spatially. For this run you will generate patch output.

There are typically many patches in a landscape representation. Printing out data for every variable for every patch on a daily basis over the course of a year would create very large files. Therefore, it is best to determine a particular time period you want to look at results for, either 1 day or 1 month. For example, if you want to look at the spatial distribution of spring and summer soil moisture, you may want to print patch output just for the months of April and August. Average soil moisture deficit is one of the output variables in the patch_monthly output file.

Create a new tecfile in the tecfiles directory that prints monthly output for April and August only. You will use it in combination with the command line option for patch output:

1980 4 1 1 print_monthly_on
1980 5 1 1 print_monthly_off
1980 8 1 1 print_monthly_on
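
If you prefer not to open an editor, one way to write these three lines from the Unix prompt is a here-document. This is only a sketch: the file name tec.patch_apr_aug is hypothetical (use any name you like), and the commands assume you are in the directory that contains tecfiles.

```shell
# Hypothetical file name; run from the directory that contains tecfiles/.
mkdir -p tecfiles
cat > tecfiles/tec.patch_apr_aug <<'EOF'
1980 4 1 1 print_monthly_on
1980 5 1 1 print_monthly_off
1980 8 1 1 print_monthly_on
EOF

# Show the file to confirm its contents.
cat tecfiles/tec.patch_apr_aug
```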

Start the RHESSys User Interface.

o Use the final chosen parameter set (one of the ‘best’ parameter sets provided for you): m = 1.272, K = 23.377, gw1 = 0.175, gw2 = 0.394

o Use the same start date, but change the simulation end date to 1980 9 1 1 to end the simulation 1 day after the end of August, as you want output to end after the month of August. If you do not end the simulation there, patch results will also be printed for the month of September.

o Instead of basin output, this time you will choose patch output.

You will view the patch output spatially in the next module.

Important Note: when interested in output for different spatial scales (i.e. basin and patch output) from the same run, keep in mind the time period you want to print results over and the resulting data that will be written to the output file. For example, if you chose both basin and patch output as command line options in the same run, using the following tecfile for that run:

1979 10 1 1 print_daily_on
1979 10 1 2 print_monthly_on

you would end up with the basin.daily and basin.monthly output you probably wanted, but you would also end up with very large patch output files covering every patch in the basin for every day and month of the run. Instead, make two runs: one for the time period for which you want basin output, and one for the time period for which you want patch output. The exception is when you are interested in output from just one patch over every day of the run: you could then choose both basin and patch output in the same run, using the same tecfile, and the resulting patch files would be manageable.
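
A quick row count before a run shows why the two cases differ so much. The sketch below uses a hypothetical helper, estimate_rows, and a made-up count of 1000 patches; substitute the number of patches in your own worldfile.

```shell
# estimate_rows PATCHES PERIODS: rows written if every patch prints once per period
estimate_rows() {
    echo $(( $1 * $2 ))
}

# Illustrative only: 1000 patches is a made-up count, not the tutorial basin's.
estimate_rows 1000 366   # daily patch output over water year 1979/80 (366 days)
estimate_rows 1000 2     # monthly patch output for April and August only
```

Each row also carries every patch output variable, so the difference in file size is of the same order as the difference in row counts.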

Contents

Model spin-up

Spin-up simulation results - maintenance

Determining model stabilization

Model Calibration

Calibration strategies

Calibration process

Basic Run

Generating patch output

@@ Line 1: / Line 1: @@
-Module III - Calibrating and Running RHESSys
 A number of models are embedded in RHESSys that use mathematical representations of the key controls on ecosystem processes.  While these relatively simple models are unable to capture every process in a forested ecosystem, they can help explain key mechanisms and responses. RHESSys uses many parameters to describe typical soil, vegetation, and land use characteristics.  These parameters, stored in the default files, have values that stay constant through a model run.  Literature based estimates have been used to compile values for common vegetation and soil types.  If your site requires a vegetation or soil type for which a default file has not been developed, you will need to develop your own default files by researching the relevant literature (see special module Developing soil and vegetation parameters).
@@ Line 7: / Line 5: @@
 The four independent parameters that are typically calibrated in RHESSys are the decay of hydraulic conductivity with depth (m), saturated soil hydraulic conductivity at the surface – Ksat0 (K), and two groundwater parameters which control the proportion of infiltrated water that bypasses soil (via macropores and fractures) to a deeper groundwater table (gw1), and the rate of lateral flow from a hillslope scale groundwater table (modeled as a linear reservoir) to the stream channel (gw2).  This module will take you through the process of calibration to establish a set of parameters to use in a basic RHESSys simulation.  However, prior to calibrating or running model simulations for predictive purposes, values for the state variables in the worldfile must be initialized.
-Model Spin-up
+==Model spin-up==
 One way to initialize the model is to 'spin it up'.  This is a preliminary RHESSys simulation you will run to establish values for the state variables in the worldfile.  The 'spin-up' period in RHESSys is the time of adjustment it takes for the model to reach a state of equilibrium in vegetation and soil carbon (C) and nitrogen (N) stores.  The period of time required for the model to reach equilibrium will vary depending on the characteristics of the landscape you are spinning up (i.e. type of climate, vegetation and soil).  A spin-up period of several hundred years is often necessary, due to the slow development of soil organic matter pools, which may have turnover times on the order of decades.  Vegetation is an important spatial and temporal dynamic component in the vegetation-soil-water relationship.  Therefore, when spinning-up a worldfile, the model should be run as a dynamic system to allow vegetation to adapt, evolve and respond to the seasonal and interannual cycles of climate, water redistribution in the system, and soil organic matter pools.
 RHESSys has the ability to output the current value of all state variables at any particular point of a simulation, thus, generating a new worldfile.  For example, at the end of a 100-year spin-up simulation, a new worldfile can be output, reflecting estimates of C and N values (as well as all other state variables) attained after 100 years of processing.  The resultant worldfile produced by the spin-up process can then be used as the input worldfile (starting point) for subsequent RHESSys simulations.  By using the 'output_current_state' option in the TEC file a new worldfile can be output at the end of the spin-up period.
@@ Line 31: / Line 29: @@
 In your tecfiles directory, create a TEC file that includes the 'output_current_state' event that will tell RHESSys to output the conditions for all state variables at a given time.
-•	      At the Unix prompt type the following command to start vi (or use another text editor):
+At the Unix prompt type the following command to start vi (or use another text editor):
-	U> vi tec.spinup (insert the following text (i) and write it to the new file :wq)
-1 1 1 print_daily_on
-1 1 2 print_daily_growth_on
-12 31 24 output_current_state
-Using the RHESSys User Interface
-The RHESSys User Interface depends on the required directory structure established at the very beginning of this tutorial.  All of the necessary interface files (for both the User Interface and the Calibration Interface) must be contained in the scripts directory.  The header in the worldfile (which RHESSys will read in the simulation) has also been set up to look for the default and climate files based on this directory structure.  This exercise will take you through entering the necessary information in the User Interface to create a script and run a RHESSys simulation.
-Move to your scripts directory. You should have copied the following files into the scripts directory in the Getting Started portion of this tutorial:
-		rhessys.ini, UserInterface.jar
-•	From within the scripts directory, start the RHESSys User Interface at the Unix prompt with the following command:
-	U> java -jar UserInterface.jar
-	This will bring up the interface window (if the window does not appear, you may need to
-set the display environment first with the following Unix command:
-	U> setenv DISPLAY name_of_computer:0.0 )
-User Interface exercise instructions by page (for more information on each of the interface fields, please see the RHESSys website).  Many of the fields require you to click on an input box and navigate to the directory and file requested.  You will navigate to the directory and file relative to the directory you are currently in, the scripts directory.  For example, when the worldfile is requested, you would navigate back one directory (out of the scripts directory), into the worldfiles directory, and choose the appropriate worldfile file name.
-For fields that ask for the path/directory/file name, an entry box is provided that requires you to click on it and navigate to the appropriate directory and file name.  In the ENTRY column in the table below, the directory and file name you need to navigate to is listed.
-For now, leave all fields blank on a screen that you do not see listed in the table below.  For information on the use of these fields, see the RHESSys website.  This exercise addresses the required and most commonly used fields.
-USER INTERFACE
-SCREEN
-and TITLE	FIELD	ENTRY
-User Information	complete path and file name to the RHESSys default ini file	scripts/rhessys.ini (you are already in the scripts directory – simply choose rhessys.ini)
-Command line page 1	Path and directory to RHESSys files	Your home directory, where all your RHESSys files are stored (g710_xx)
-	Complete path and file name of RHESSys executable	/data/tague/bin/rhessys5.10.7
-(click on the box, then you can just type this is the file field )
-rhessys5.10.7 is the version of the model, and it is stored in the  /data/tague/bin directory
-	TEC file	tecfiles/tec.spinup (you created in the prior step)
-	Worldfile name	worldfiles/world.w8  or worldfile/world.tutorial
-	Flowtable name	flowtables/flow.w8 or flowtables/flow.tutorial
-	prefix ID (string prefix to output names)	type in out/tutorial.su1
-(see text below)
-For prefix ID enter out/tutorial.su1. This will direct all your result output files to be written to the out directory, with the prefix name tutorial.su1. You can give the results any output prefix you want.  For spin-ups, it may be useful to include the abbreviation suX (spun-up/number of years) as part of the output prefix name to identify the number of years the output file has been spun-up. This is useful if you are continually repeating a 50 year simulation to reach a total of 500 years for example.
-Command line page 2	Command line options
-Leave other entries blank	Choose basin output and RHESSys dynamic (grow) mode
-SCREEN
-and TITLE	FIELD	ENTRY
-Command line page 3
-	Sensitivity multipliers (m&K and optional soil depth)	Box 1 = 1.272    Box 2 = 23.377
-	Groundwater
-Leave other entries blank	Box 1 = 0.175    Box 2 = 0.394
-RHESSys date and ini save	Simulation start date	 year=1978, month=1, day=1, hour=1
-(type the year in the box and choose the month, day, and hour from the drop down boxes
-	Simulation end date	1980, 1, 1, 1    (reminder: in the tec.spinup file you request output for Dec. 31, 1978; see text below)
-	Save values to the RHESSys ini file?	Yes
-Important note: the simulation end date must occur at least one day after the last day you want results to print out.  RHESSys executes a set of routines at the beginning of the day, a set which execute hourly, and another set at the end of the day.  However, most processes are performed at a daily time step. So if you give RHESSys an end date of 1959 1 1 1, simulation will cease on the first hour of Jan. 1, 1959 (i.e. 1:00 am).  Since 1 1 1 stops after the first hour, processing does not cover a full day of routines, therefore, results could not be calculated and printed out for that day.  The last full day for which processes would have been performed would be the previous day, ending at the last hour of Dec. 31, 1958.  Therefore, the final day for which processes could be calculated and results could be printed out would be 1958 12 31 24 (midnight on Dec. 31, 1958).
-RHESSys command line	Shows the command line constructed by your entries	Look over the script to make sure everything is correct, either go back to make changes or continue
-Mode selection	Option 1 – remain logged in and allow the operation to proceed	Choose option 2.  Option 2 allows you to run the simulation in the background, so you can log out and review the results later.  If you choose option 1 and run the script now, the interface will minimize, then maximize once the simulation has finished, notifying you of completion.  You will not be able to log out during this process.
-	Option 2 – create a script to be run by hand at the user’s convenience
+'''unix> vi tec.spinup''' (insert the following text (i) and write it to the new file :wq)
-Start simulation confirmation	Start simulation	Press start simulation – if you chose option 2, it will not start the simulation, it will write a script for you to run later
-The next screen you see will tell you a script has been written and the name of the script file (i.e. rhessys_sh-1719156843).   Use the U> ls  command to see the new file in your scripts directory.   Use the U> more file_name command to look at how the script is written.
-If it is a short simulation and you plan to be at the computer until it finishes, you may want to use option 1 – remain logged in and allow the operation to proceed.  However, simulations often take some time, possibly hours or days, especially long spin up’s or multiple calibration scripts.  In this case, you should have the interface create the script for you to run manually.  Then you can run the script in the ‘background’, enabling you to log out of the computer and view your results later, or continue to use the computer for other work.  It is also preferable, and considerate, when using ‘shared’ computers such as in a computer lab, so others can also utilize the machines while your process continues to run in the background.
+1 1 1 print_daily_on<br>
+1 1 2 print_daily_growth_on<br>
+12 31 24 output_current_state<br>
 •	      To run the script manually:
@@ Line 121: / Line 41: @@
 U> nice nohup ./script_file_name &
-o	nice = Invokes a command with an altered scheduling priority.  This option should be used when running simulations on multi-processor machines that are shared by multiple users.
+nice = Invokes a command with an altered scheduling priority.  This option should be used when running simulations on multi-processor machines that are shared by multiple users.
-o	nohup = Runs a command even if the session is disconnected or the user logs out.
+nohup = Runs a command even if the session is disconnected or the user logs out.
-o	./  = Reads the file in the directory you are in and runs commands from the script_file_name
+./  = Reads the file in the directory you are in and runs commands from the script_file_name
-o	The & runs the job in the background and returns the prompt straight away, allowing you do run other programs while waiting for that one to finish.
+The & runs the job in the background and returns the prompt straight away, allowing you do run other programs while waiting for that one to finish.
 You will not receive a message when a process is finished, however, you can check on the status of any processes you have running with the U> ps –u | grep userID command.  The ps command will also tell you the process ID (PID).  In the event you want to cancel a job, use the U> kill PID command.
-Spin-up simulation results - maintenance
+==Spin-up simulation results - maintenance==
 Move to the worldfiles directory.  Because you gave RHESSys the TEC event  'output_current_state', a new worldfile will have been generated in the worldfiles directory after the simulation completes the day specified in the TEC file.  It will have the name of the original worldfile you used in the simulation followed by the date the new worldifle was generated and the extension  .state  - i.e. world.tutorial.Y1958M12D31H24.state.  Because of the long, cumbersome name given to the new worldfile, you will probably want to rename it (U> mv command).  Either overwrite the original non-spun-up worldfile (world.tutorial) by giving the new worldfile that same name, or add an extension to this new worldfile to reflect the number of years the worldfile has been spun-up (i.e. world.tutorial.su1) and retain the original.  Keep in mind that a worldfile can be a very large file requiring considerable disc space; therefore, multiple worldfiles may cause disk space issues.
 Move to the out directory.  In the TEC file, you requested daily and daily_growth output, and in the interface, you requested the command line options for basin and growth output.  In combination, these two options produced basin.daily and grow_basin.daily ascii output files in the out directory.  Every simulation will construct an output file for each temporal level of output that RHESSys can aggregate result for: hourly, daily, monthly, and yearly.  However, you only requested that daily and daily_growth output be written for this simulation, so only the daily and daily_growth output files would contain results.  After running many simulations, the out directory can really fill up.  It may be useful to delete useless files, and once you have results from many simulations, you may want to organize these in sub-directories within the out directory.
-•	     You may want to clean up the out directory by deleting the empty hourly, monthly, and
+ You may want to clean up the out directory by deleting the empty hourly, monthly, and
 yearly files with the Unix command:
-U> rm *hourly    (etc. for monthly and yearly)
+'''unix> rm *hourly'''    (etc. for monthly and yearly)
 The * is a placeholder for any number of characters, allowing you to perform a command
 on any file ending in hourly.
@@ Line 147: / Line 67: @@
-Determining model stabilization
+==Determining model stabilization==
 To determine if the model has stabilized and the spin-up process is now complete, RHESSys output for plant and soil C and N from the spin-up simulation should be evaluated.  Output for these variables is contained in the basin_daily_growth file.  It is also useful to look at patterns for LAI, streamflow, and saturation deficit, which are contained in the basin_daily file.  You will use the statistical and graphics program R to look at these results.
 Move to the out directory (if you are not in the out directory, you will need to give R complete path names).  In addition to the output files generated from the spin-up exercise, you should have three output files in this directory (w8.su75_grow_basin.daily, w8.su75_basin.daily, and w8.su225_grow_basin.daily) that you copied from the tutorial data in the Getting Started portion of the tutorial.  As the output files generated in the spin-up exercise only reflect 1 year, output files from a 450-year spin-up have been provided for you for illustrative purposes.
-•	Start R simply by typing R at the Unix prompt.
+Start R simply by typing R at the Unix prompt.
-	U> R
+'''unix> R'''
-	Some information on R will appear along with a standard > prompt.
+Some information on R will appear along with a standard > prompt.
 To determine if the model has stabilized, you are interested in looking at each of the following output variables from the grow_basin.daily files: plantc, plantn, soilc, soiln.  First look at output after 75 years of processing, then after 225 years of processing in order to see how these variables have changed and moved toward stabilization.
@@ Line 162: / Line 82: @@
 •	First you must read an ascii output file into R as a table, then plot each of the columns of interest using the following commands:
-R> SuGrow75 = read.table("w8su75_grow_basin.daily",
+'''R> SuGrow75 = read.table("w8su75_grow_basin.daily", header=TRUE)'''
-   header=TRUE)
+'''R> SuGrow225 = read.table("w8su225_grow_basin.daily", header=TRUE)'''
-R> SuGrow225 = read.table("w8su225_grow_basin.daily",
-   header=TRUE)
 This will create two R tables called SuGrow75 and SuGrow225 that contain all of the output varialbles from the RHESSys files.  You could look at a table simply by typing the table name at the R prompt, however, these files contain data for each day over multiple years for 19 different fields, so they are very long.  Instead of looking at a long list of daily values, it may be more useful to simply see what type of output is contained in a file by listing the variable headings.
-•	      To see the variable heading names for an output file, use the command
+To see the variable heading names for an output file, use the command
-R> names(SuGrow75)
+'''R> names(SuGrow75)'''
-	R only displays data for a certain width across the monitor, so the variable headings will
+R only displays data for a certain width across the monitor, so the variable headings will continue to wrap to the end of the list.  The numbers on the left side simply provide an index of the variable for the first entry displayed on a line.
-continue to wrap to the end of the list.  The numbers on the left side simply provide an
-index of the variable for the first entry displayed on a line.
-•	      Now plot a timeseries graph for one of the variables (plantc, plantn, soilc,
+Now plot a timeseries graph for one of the variables (plantc, plantn, soilc, soiln) from each of the tables to see how its trend has changed:
-soiln) from each of the tables to see how its trend has changed:
-R> plot(SuGrow75$soiln)
+'''R> plot(SuGrow75$soiln)'''<br>
-	R> plot(SuGrow225$soiln)
+'''R> plot(SuGrow225$soiln)'''
 A new window should have opened and displayed a timeseries graph of soil nitrogen after 75 years, then after 225 years.  You would use the same method to plot the other variables and examine the trends.
@@ Line 186: / Line 101: @@
 The model may be considered stable when the plant and soil C and N variables are not trending up or down, not including small seasonal and annual fluctuations.  The range of fluctuation should be relatively small, perhaps a magnitude on the order of 5% (take note of the y scale on the graph).  If you have measurements or estimates for each of these variables, you should check model results against those values.
-•	      You can plot the variable soiln from both the 75 year and 225 year spin-up’s on the same
+You can plot the variable soiln from both the 75 year and 225 year spin-up’s on the same graph and see how soiln has changed over time with the additional years of processing.   There are no spaces in the first command, type it as one continuous line (ylim sets the y axis, so soiln from both files is visible):
-graph and see how soiln has changed over time with the additional years of processing.   There are no spaces in the first command, type it as one continuous line (ylim sets the y axis, so soiln from both files is visible):
+'''R> plot(SuGrow75$soiln,ylim=c(min(SuGrow75$soiln), max(SuGrow225$soiln)),col="red")'''
-R> plot(SuGrow75$soiln,ylim=c(min(SuGrow75$soiln),
+'''R> lines(SuGrow225$soiln,col="blue")'''
-max(SuGrow225$soiln)),col="red")
-R> lines(SuGrow225$soiln,col="blue")
-•	      To quit an R session:
+To quit an R session:<br>
-R> q()
+'''R> q()'''
-Model Calibration
+==Model Calibration==
 Model calibration consists of modifying values of model input parameters in an attempt to match field conditions within some acceptable criteria.  Observed streamflow data is available and easily obtained for many watersheds.   Determining reasonable values for the calibrated parameters (m, K, gw1 and gw2 in this calibration) in RHESSys is done by measuring the correspondence of modeled streamflow to observed streamflow for goodness of fit.
 There may be many hydrologic parameter sets that acceptably reproduce observed ecosystem behavior.  Equifinality refers to an observation that different initial conditions (combinations of parameter values) may generate similar, or equivalent, output from a model.  The interactions between the components of such a complex system cannot be considered independently, and so different parameter combinations may arrive at the same end result.  Testing a large number of parameter sets across a wide range of possible parameter space helps to reduce uncertainty.  Methods (such as GLUE) exist to assess the behaviour of acceptable parameter sets, however, these will not be discussed here.
@@ Line 205: / Line 118: @@
 The RHESSys calibration interface helps to automate the procedure involved in running the model over randomly selected parameter values within a delimited range of parameters (using the Monte Carlo approach) and computing objective function values for each parameter set (using the Nash-Sutcliffe efficiency metric).  It should be emphasized, however, that this is not a fully automated calibration procedure that results in optimized parameter sets.  It is up to the user to view the results of the calibration, and choose desirable parameter set(s) based on values of objective function(s).
-Calibration strategies
+==Calibration strategies==
 Calibration Parameters: Calibration in RHESSys is usually focused on streamflow.  However, if you have other observed data (i.e. daily nitrate, monthly psn, daily snowpack, etc...) it is possible to calibrate using these variables.  It should be noted that m, K, gw1 and gw2 are hydrologic parameters; however, since water often exerts a strong control on biogeochemical cycling, there may be situations where calibration of these parameters using secondary variables (i.e. PSN) is appropriate.
-Standard RHESSys calibration involves varying m and K to achieve the ‘best’ correspondence between observed and modeled daily streamflow patterns - where best is usually defined as a high value of the Nash-Sutcliffe Efficiency measure and a close match between observed and modeled total streamflow over the calibration period.
+Standard RHESSys calibration involves varying m and K to achieve the ‘best’ correspondence between observed and modeled daily streamflow patterns - where best is usually defined as a high value of the [http://en.wikipedia.org/wiki/Nash–Sutcliffe_model_efficiency_coefficient Nash-Sutcliffe efficiency] measure and a close match between observed and modeled total streamflow over the calibration period.
 Some users follow a more hierarchical calibration approach where first total annual streamflow volumes are matched and then further adjustments to m and K  (within ranges that produce reasonable total streamflow volumes) are made.
@@ Line 236: / Line 149: @@
 Calibrating is an iterative process that requires refinement as you go along.  Start by testing sets across the full range of possible values.  The range tested should be progressively restricted as a pattern emerges indicating the reasonable range for each parameter for the watershed being modeled.
-Calibration process
+==Calibration process==
 You will set up the calibration for a water year of average precipitation and temperature.  The watershed used in these exercises has never been harvested and is typical of an old growth, homogeneous Douglas fir forest; therefore, the model reflects the same conditions.
@@ Line 248: / Line 162: @@
 Save this data as a text file (if necessary, make sure to remove the .txt extension and covert from DOS to UNIX format with dos2unix).
-Using the RHESSys Calibration Interface
-The following exercise is intended to acquaint you with the process of calibration using the RHESSys Calibration Interface.  The Calibration Interface is also dependant on the established directory structure and must be run from the scripts directory.  You will go through the Calibration Interface (similar to the User Interface) to set up a calibration script.  However, due to the time involved in running multiple simulation to calibrate the model, calibration results have been provided for you (eff.w8dwy and eff.w8sum) to analyze the performance of some parameter sets tested (which you should have copied into your out/cal directory), so it is unnecessary to actually run the calibration script for this exercise.
-Move to your scripts directory. You should have copied the following files into the scripts directory in the Getting Started portion of this tutorial:
-Calibrate.jar   TimeStep.class  computeEff.class Scale.class   	 Log.class	  xm.class
-•	From within the scripts directory, start the RHESSys Calibration Interface at the Unix prompt with the following command:
-	U> java -jar Calibrate.jar
-	This will bring up the interface window.
-Calibration Interface exercise instructions by page.  For fields that ask for the path/directory/file name, an entry box is provided that requires you to click on it and navigate to the appropriate directory and file name.  In the ENTRY column in the table below, the directory and file name you need to navigate to are listed.
-For now, leave all fields blank on a screen that you do not see listed in the table below.  For information on the use of these fields, see the RHESSys website.  This exercise addresses the required and most commonly used fields.
-CALIBRATION INTERFACE
-SCREEN
-and TITLE	FIELD	ENTRY
-User Information	complete path and file name to the RHESSys default ini file	scripts/rhessys.ini
-Calibration Parameters	Calibrate with respect to:	check the box for
-m, K, gw1 and gw2
-Min and Max Values	Parameter	Minimum	Maximum
-	m	1	20
-	K	10	150
-	gw1	0.1	0.31
-	gw2	0.1	0.9
-Design Scenario	Name of results data file	out/cal/eff.tut1
-	Name of command log file	out/cal/cmd.tut1
-	Name of obs data file	obs/obs.dw8_wy63_01
-	Number of Monte Carlo runs	25
-	Output timestep and spatial scale	daily and basin
-	Calibration start date	1979 10 1 1
-	Calibration end date	1980 10 1 1
-	Simulation start date	1978 10 1 1
-	Simulation end date	1980 10 1 1
-	Calibration metrics	choose Nash-Sutcliffe Efficiency
-Name of results data file: This will be a file of the parameters tested and the measurements of performance for each set.  Enter the full path and directory where the calibration results file should be written and what you want the name of the calibration result file to be called.  By convention, when calibrating the efficiency between modeled and observed streamflow, the result file should have the prefix eff.  You will run many calibrations and each eff file should have a unique name.
-Name of command log file: creates a file documenting the input for the calibration runs.  Use the prefix cmd and the extension name should match the eff file.
-Number of Monte-Carlo runs: enter 25.  This will generate 25 parameter sets within the ranges you set and create a file of 25 separate runs.  When choosing the number of sets to test, keep in mind processing time.  For example, if each 	run spans 2 years (1 year to stabilize soil moisture and 1 year of calibration) and it takes 1/2 hour of processing time to complete 1 year of simulation, it will take 25 hours to complete 25 Monte-Carlo simulations.  If you wanted to initially test 100 parameter sets and you generate all 100 in the same file, it would take over 4 days to finish running and for you to view all your eff results. You will need to run a large number of simulations, so it may be beneficial to run several smaller sets at a time (i.e. 4 runs containing 25 simulations each).  Remember that this is an iterative process, start with a wide range and narrow it as the optimal range becomes apparent.
-Calibration start date: 1979 10 1 1 (you are calibrating daily on water year 79/80)
-Calibration end date: 1980 10 1 1 (1 day after the end of water year 79/80, which runs from Oct. 1, 1979, through Sept. 30, 1980).
-Command line page 1	Path and directory to RHESSys files	Your home directory, where all your RHESSys files are stored (g710_xx)
-	Complete path and file name of RHESSys executable	/data/tague/bin/rhessys5.10.7
-	TEC file (will be written by cal interface based on cal start/ed dates entered and command line options chosen)	/tecfiles
-	Worldfile name	/worldfiles/world.w8
-	Flowtable name	/flowtables/flow.w8
-	prefix ID (string prefix to output names)	Type in /out/cal.1
-Prefix ID: if you are running more than 1 calibration script at a time, you must give each separate script a unique output name (i.e. cal.1, cal.2, cal.3, etc…).  When two calibration scripts are running at the same time, you do not want them to write to the same file, creating erroneous results that would producing inaccurate statistics.
-Command line page 2	Command line options
-Leave other entries blank	Choose basin output and RHESSys dynamic (grow) mode
-Command line page 3	You do not need to use any of the options on this page for these simulations	Leave all fields blank
-RHESSys
- ini save	Save values to the RHESSys ini file?	Yes
-RHESSys command line
-	Shows the command line constructed by your entries	Look over the script to make sure everything is correct, either go back to make changes or continue
-Variable selection	Choose variable from basin output	Choose totalstreamflow
-Mode selection	Option 1 – remain logged in and allow the operation to proceed
-	Option 2 – create a script to be run by hand at the user’s convenience	Choose option 2
-Start calibration	Start simulation	Press start simulation
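The processing-time estimate given for the number of Monte-Carlo runs can be reproduced with a few lines of R.  This is only a sketch: the variable names are hypothetical, and the 1/2 hour per simulated year is the example figure used in this tutorial, not a measured benchmark for your machine.

```r
# Assumed example values from the tutorial text (not measured timings)
years_per_run      <- 2      # 1 spin-up year + 1 calibration year
hours_per_sim_year <- 0.5    # processing time for 1 simulated year
n_runs             <- c(25, 100)

# Total wall-clock time for each batch size, in hours and in days
total_hours <- n_runs * years_per_run * hours_per_sim_year
total_days  <- total_hours / 24

print(data.frame(n_runs, total_hours, total_days))
```

Substituting your own timing for hours_per_sim_year gives a quick way to decide how many parameter sets to place in a single calibration script.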
-Analyzing calibration results
-The Calibration interface will create a result file listing the parameter sets tested, the Nash-Sutcliffe efficiency, and statistical information about the behaviour of modeled streamflow produced by each parameter set with regard to observed streamflow.  The calibration results data file entries are as follows:
-The parameters tested: m, K, gw1, gw2
-Efficiency (eff)
-Total modeled flow (mod_flow)
-Total observed flow (obs_flow)
-Minimum modeled flow (mod_min)
-Maximum modeled flow (mod_max)
-Minimum observed flow (obs_min)
-Maximum observed flow (obs_max)
-Mean squared error (mse)
-Modeled variance (mod_var)
-Observed variance (obs_var)
-Modeled mean (mod_mean)
-Observed mean (obs_mean)
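For orientation, the Nash-Sutcliffe efficiency is conventionally defined as 1 minus the ratio of the sum of squared errors to the sum of squared deviations of the observations from their mean.  The sketch below assumes this standard definition; nse() is a hypothetical helper written for illustration, not the calibration interface's own code, and the interface may compute mse and obs_var with slightly different conventions.

```r
# Standard Nash-Sutcliffe efficiency (assumed definition, illustrative helper)
nse <- function(obs, mod) {
  1 - sum((obs - mod)^2) / sum((obs - mean(obs))^2)
}

# Small synthetic example (made-up numbers, not tutorial data)
obs <- c(1, 2, 3, 4)

# A perfect model reproduces the observations exactly
perfect <- nse(obs, obs)

# A model that always predicts the observed mean
baseline <- nse(obs, rep(mean(obs), length(obs)))
```

Under this definition an efficiency of 1 indicates a perfect match, and an efficiency of 0 means the model performs no better than using the observed mean as the prediction.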
-A command log file will also be created containing the calibration input information.  For documentation purposes, it is useful to give the result data file (eff) and the command log file (cmd) the same extension (e.g. eff.tut1 and command.tut1).  This will give you a record of the RHESSys version and input files that were used to create a particular result file.  Both the eff and cmd files should have been written to the out/cal directory.
-Move to the out/cal directory. You should have 2 calibration result data files, eff.w8wy and eff.w8sum, that were provided with the tutorial data.  Eff.w8wy reflects parameters tested over a considerable range of possibilities, and over a full water year.  Eff.w8sum reflects parameters tested over a restricted range based on the best results of eff.w8wy, and is tested only over the summer period in an effort to focus on low streamflow.
-Parameters producing efficiencies closer to 1 and total modeled streamflow close to total observed streamflow are more desirable.  Use R to analyze the results of eff.w8wy.
-•	      Start R
-U> R
-•	      Read in the result data ascii file as a table (the file should be in the out/cal directory, so
-            you must either point the command to the file by giving the path and file name, or be in
-            that directory) :
-	R> effwy = read.table("eff.w8wy", header=TRUE)		(creates table called effwy)
-R> names(effwy)   (lists the variable headings)
-******************************************************************************
+ This step will only be necessary after you run multiple calibrations for a project.  However, this step has already been done
-This step will only be necessary after you run multiple calibrations for a project.  However, this step has already been done for the tutorial data, so you do not need to execute this command for these exercises.  This is for future reference.  When you run calibrations, you will generate multiple calibration result data files (multiple eff files, each with n, i.e. 25, number of results).  However, you will want to have all of the results in one file when you analyze them.  You can bind multiple files together in R with the rbind command, for example:
+ for the  tutorial data, so you do not need to execute this command for these exercises.  This is for future reference.
+  When you run calibrations, you will generate multiple calibration result data files (multiple eff files, each with n, i.e. 25,
+ number of results).  However, you will want to have all of the results in one file when you analyze them.  You can bind
+ multiple files together in R with the rbind command, for example:
+ R> NewFile = rbind(read.table("eff.tut1", header=TRUE), read.table("eff.tut2", header=TRUE))
-     	R> NewFile = rbind(read.table("eff.tut1",
-   header=TRUE), read.table("eff.tut2",
-   header=TRUE))
-******************************************************************************
-•	      To view a summary for each of the variables and see the min and max for each:
+To view a summary for each of the variables and see the min and max for each:
-	R> summary(effwy)
+'''R> summary(effwy)'''
-•	You can now sort the results to determine which parameter sets produced the highest efficiencies, which produced streamflow closest to observed, etc...
+You can now sort the results to determine which parameter sets produced the highest efficiencies, which produced streamflow closest to observed, etc...
-R> effwy.ord = order(effwy[,"eff"])
+'''R> effwy.ord = order(effwy[,"eff"])'''
-R> effwy.sort = effwy[effwy.ord,]
+'''R> effwy.sort = effwy[effwy.ord,]'''
-R> effwy.sort
+'''R> effwy.sort'''
 This will return a permutation that rearranges the table by efficiency, in ascending order.  The screen will scroll through all the results, so if this is a long file you may not want to display the entire record.  You may only want to look at results with the highest efficiencies.
@@ Line 396: / Line 188: @@
-•	      Print just the parameter sets that produced an efficiency over 0.70:
+Print just the parameter sets that produced an efficiency over 0.70:
-R> effwy.best = subset(effwy.sort, eff > 0.7)
+'''R> effwy.best = subset(effwy.sort, eff > 0.7)'''
-R> effwy.best
+'''R> effwy.best'''
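Because a good parameter set should have both a high efficiency and a modeled flow total close to the observed total, the two criteria can be combined in one filter.  The sketch below uses a small made-up table with the same column names as the result file; the values, the flow_ratio column, and the 10% tolerance are assumptions chosen only for illustration.

```r
# Synthetic stand-in for a calibration result table (values are invented)
effwy <- data.frame(
  m        = c(0.5, 1.0, 2.0, 4.0),
  K        = c(10, 23, 50, 100),
  gw1      = c(0.1, 0.2, 0.3, 0.4),
  gw2      = c(0.05, 0.1, 0.2, 0.3),
  eff      = c(0.45, 0.82, 0.74, 0.71),
  mod_flow = c(900, 1010, 1300, 980),
  obs_flow = c(1000, 1000, 1000, 1000)
)

# Ratio of total modeled to total observed flow (1 = totals match)
effwy$flow_ratio <- effwy$mod_flow / effwy$obs_flow

# Keep sets with eff > 0.7 and flow totals within 10% of observed
keep <- subset(effwy, eff > 0.7 & abs(flow_ratio - 1) < 0.1)

# Sort the survivors from highest to lowest efficiency
keep <- keep[order(-keep$eff), ]
print(keep)
```

With real results, replace the data.frame() call with the table read from your eff file and adjust the thresholds to suit your basin.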
-•	Use a scatter plot to display the results of the RHESSys result file eff.w8wy (called effwy after read into R) in order to determine the range of values producing the best efficiencies for each parameter.  To plot the efficiency results for each parameter:
+Use a scatter plot to display the results of the RHESSys result file eff.w8wy (called effwy after read into R) in order to determine the range of values producing the best efficiencies for each parameter.  To plot the efficiency results for each parameter:
-	R> plot(effwy$m, effwy$eff)
+'''R> plot(effwy$m, effwy$eff)'''
-	The scatter plot pattern illustrates the range of values that produce the best efficiencies
-for each parameter.  Generally, values producing above .70 efficiency can be used as a
-guide to restricting the range for refining the calibration.
-	Also plot K, gw1 and gw2.
-•	Repeat this process with eff.w8sum to see how the parameter range was restricted and used to determine the 'best' parameter set range produced by these runs based on both a high efficiency and correspondence between modeled and observed streamflow totals.
+The scatter plot pattern illustrates the range of values that produce the best efficiencies for each parameter.  Generally, values producing above .70 efficiency can be used as a guide to restricting the range for refining the calibration. Also plot K, gw1 and gw2.
-•	      Quit out of R:
+Repeat this process with eff.w8sum to see how the parameter range was restricted and used to determine the 'best' parameter set range produced by these runs based on both a high efficiency and correspondence between modeled and observed streamflow totals.
-R> q()
+Quit out of R:<br>
+'''R> q()'''
-Basic Run
+==Basic Run==
 In order to get a feel for how changing the hydrologic parameter values affects modeled streamflow prediction, run RHESSys to produce output (for the same time period used in calibration) and plot modeled and observed streamflow (follow the RHESSys User Interface instructions to create basin_daily output).  Choose a parameter set (or two) from the calibration result data file (the eff file) that appears to have low correspondence to observed streamflow (output from some ‘high’ performing parameters has already been provided for you).
-•	To run RHESSys and generate output, start the RHESSys User Interface from within the scripts directory:
+To run RHESSys and generate output, start the RHESSys User Interface from within the scripts directory:
-U> java -jar UserInterface.jar
+Enter the necessary information, using the  m, K, gw1 and gw2 parameter set you chose from the calibration result data file (eff file).  You can only run one parameter set at a time.
-o	Enter the necessary information, using the  m, K, gw1 and gw2 parameter set you chose from the calibration result data file (eff file).  You can only run one parameter set at a time.
-o	Use the tecfile you created in module II named tec.tutorial, which tells RHESSys to start printing results on Oct. 1, 1979.
+Use the tecfile you created in module II named tec.tutorial, which tells RHESSys to start printing results on Oct. 1, 1979.
-o	Be sure to start the simulation 1 year before the date you want results to begin being printed (to allow antecedent soil moisture to stabilize), and end the simulation 1 day after the last day you want output for (you want output from 10/1/79 through 9/30/80).
+Be sure to start the simulation 1 year before the date you want results to begin being printed (to allow antecedent soil moisture to stabilize), and end the simulation 1 day after the last day you want output for (you want output from 10/1/79 through 9/30/80).
-o	Choose daily basin output, and make sure RHESSys is in dynamic mode.
+Choose daily basin output, and make sure RHESSys is in dynamic mode.
-o	It is useful to give the output file an identifying name, for example, including the parameter values in the prefix (i.e. m1k10) so you know what parameters were used to produce that result file.
+It is useful to give the output file an identifying name, for example, including the parameter values in the prefix (i.e. m1k10) so you know what parameters were used to produce that result file.
 The optimization metric tends to focus on peak streamflow correspondence; therefore, it is useful to actually look at hydrograph correspondence to determine which parameter set produces streamflow that most closely matches observed streamflow (a file of observed streamflow for the calibration water year has been provided – obs.wy79_80dw8).  When choosing an ‘optimal’ parameter set from a number of sets that produce similar statistical results, you should look at how well modeled streamflow response matches observed peak flows, low flows, and recessions.
@@ Line 439: / Line 225: @@
 On the same graph, plot modeled streamflow from the output you generated with modeled streamflow from one of the output files provided for you to see streamflow response to different hydrologic parameter sets.  Then add observed streamflow to the hydrograph and assess how well the modeled streamflow corresponds with observed streamflow.
-•	      First, read in your new result file, one of the provided output files, and the observed
+First, read in your new result file, one of the provided output files, and the observed
 streamflow file into R with the read.table command:
-R> par_set = read.table("result_basin.daily",
+'''R> par_set = read.table("result_basin.daily", header=TRUE)'''
-   header=TRUE)
-R> m1k23 = read.table("m1k23_basin.daily",
+'''R> m1k23 = read.table("m1k23_basin.daily", header=TRUE)'''
-   header=TRUE)
-R> obs = read.table("obs.wy79_80dw8", header=TRUE)
+'''R> obs = read.table("obs.wy79_80dw8", header=TRUE)'''
-•	      You must generate a file of dates in order to plot this as a timeseries:
+You must generate a file of dates in order to plot this as a time series:
-	R> dates.wy7980 = as.Date(paste(m1k23$year,
+'''R> dates.wy7980 = as.Date(paste(m1k23$year, m1k23$month, m1k23$day, sep="-"))'''
-   m1k23$month, m1k23$day, sep="-"))
 (you can look at the new file by typing dates.wy7980 at the prompt)
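The date construction can be tried on its own before applying it to real output.  The sketch below uses a hypothetical three-row table with year, month, and day columns (made-up rows standing in for basin_daily output) and checks that the resulting vector spans the expected water year.

```r
# Hypothetical stand-in for the year/month/day columns of basin daily output
basin <- data.frame(
  year  = c(1979, 1979, 1980),
  month = c(10, 10, 9),
  day   = c(1, 2, 30)
)

# Paste the columns into "year-month-day" strings and convert to Date objects
dates <- as.Date(paste(basin$year, basin$month, basin$day, sep = "-"))

# Number of days between the first and last date
span_days <- as.numeric(dates[3] - dates[1])
print(dates)
```

Here span_days covers Oct. 1, 1979 through Sept. 30, 1980; a mismatch between the length of the date vector and the number of output rows usually means the simulation and print dates were not set as intended.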
@@ Line 459: / Line 242: @@
-•	Now, graph modeled streamflow from both output files with observed streamflow:
+Now, graph modeled streamflow from both output files with observed streamflow:
+'''R> plot(dates.wy7980, par_set$totalstreamflow, type="l", col="green")'''
-R> plot(dates.wy7980, par_set$totalstreamflow,
+'''R> lines(dates.wy7980, m1k23$totalstreamflow, type="l", col="red")'''
-   type="l", col="green")
-R> lines(dates.wy7980, m1k23$totalstreamflow,
+'''R> lines(dates.wy7980, obs$obs, type="l", col="blue")'''
-   type="l", col="red")
-R> lines(dates.wy7980, obs$obs,
-   type="l", col="blue")
+You may want to plot streamflow from the three output files provided, and analyze the hydrographs to determine which parameter set produces streamflow that most closely matches observed streamflow.  This is how you would choose the parameter set you would use for your final simulation and to generate the temporal and spatial output you are interested in for your research project.
-You may want to plot streamflow from the three output files provided, and analyze the hydrographs to determine which parameter set produces streamflow that most closely matches observed streamflow.  This is how you would choose the parameter set you would use for your final simulation and to generate the temporal and spatial output you are interested in for your research project.
+==Generating patch output==
-Generating patch output
 You already have daily basin output for your chosen parameter set from a previous simulation.  Another run is needed to generate output that can be visualized spatially.  For this run you will generate patch output.
 There are typically many patches in a landscape representation.  Printing out data for every variable for every patch on a daily basis over the course of a year would create very large files.  Therefore, it is best to determine a particular time period you want to look at results for, either 1 day or 1 month.  For example, if you want to look at the spatial distribution of spring and summer soil moisture, you may want to print patch output just for the months of April and August.  Average soil moisture deficit is one of the output variables in the patch_monthly output file.
-•	Create a new tecfile in the tecfiles directory that will print monthly output just for April and August, that you will use in combination with the command line option for patch output:
+Create a new tecfile in the tecfiles directory that will print monthly output just for April and August, that you will use in combination with the command line option for patch output:
-4 1 1 print_monthly_on
+4 1 1 print_monthly_on<br>
-5 1 1 print_monthly_off
+5 1 1 print_monthly_off<br>
 8 1 1 print_monthly_on
-•	Start the RHESSys User Interface.
+Start the RHESSys User Interface.
 o	Use the final chosen parameter set (one of the ‘best’ parameter sets provided for you):
@@ Line 498: / Line 279: @@
 Important Note: when interested in output for different spatial scales (e.g. basin and patch output) from the same run, keep in mind the time period you want to print results over and the resulting data that will be written to the output file.  For example, if you chose both basin and patch output as command line options in the same run, using the following tecfile for that run:
-10 1 1 print_daily_on
+10 1 1 print_daily_on<br>
 10 1 2 print_monthly_on
 you would end up with the basin.daily and basin.monthly output you probably wanted, but you would also end up with very large patch output files for every patch in the basin over every day and month of the run.  Instead, it is necessary to make two runs, one for the time period you want basin output, one for the time period you want patch output.
-Exception: if you were interested in output from just one patch over every day of the run, you could choose both basin and patch output in the same run, using the same tecfile.  	Output from these patch files would be manageable.
+Exception: if you were interested in output from just one patch over every day of the run, you could choose both basin and patch output in the same run, using the same tecfile. Output from these patch files would be manageable.