.. vim: syntax=rst .. include:: meta.rest 4. Streamflow Nudging Data Assimilation ======================================= This chapter describes streamflow nudging and data assimilation in Version 5.0 and beyond of WRF-Hydro. Streamflow nudging was introduced in v1.1 of the National Water Model (NWM). The community WRF-Hydro model source code and the NWM source code have merged as of Version 5.0 of WRF-Hydro. See Appendix :ref:`A14 ` for more information on the NWM. 4.1 Streamflow Nudging Data Assimilation Overview ------------------------------------------------- For the National Water Model (NWM), a simple nudging data assimilation (DA) scheme has been developed to correct modeled stream flows to (or towards) observed values. The capability is only currently supported under the NWM configuration, but could be extended to NCAR reach-based routing, and potentially other kinds of routing, in the future. Specifically, the nudging capability introduces an interface for stream discharge observations to be applied to the Muskingum-Cunge streamflow routing solution. .. _section-4.2: 4.2 Nudging Formulation ----------------------- There are several motivations for performing DA. For the NWM analysis and assimilation cycle, the motivation is to improve model simulation and forecast initial conditions. Nudging is a simple and computationally inexpensive method of data assimilation where an observed state is inserted into the model with some uncertainty. When the observed value is inserted into the model without uncertainty, the method is referred to as “direct insertion”. Nudging works well locally to observations, both in space and time. Away from observations, in space and time, the method has limited success. For example, our application applies nudging data assimilation on a channel network with the advantage that the corrections are propagated downstream with the network flow. However, if no spatial or temporal smoothing of the corrections are included with the nudging method, upstream errors soon propagate past observed points when in the forecast (away from the observations, into the future). Various assumptions can be made to smooth the nudge (or correction) in space and/or time but these are highly parameterized and require tuning. In the NWM we have avoided spatial smoothing and have opted to pursue a very limited temporal-interpolation approach. The basic nudging equation solves `e_j`, the nudge, `e`, on a spatial element `j`, .. _equation-4.1: .. rst-class:: center .. math:: e_{j} = \frac{\sum_{n=1}^{N_j} q_{n}*w_{n}^{2}(j,t)*(Q_{n} - {\widehat{Q}}_{n})}{\sum_{n=1}^{N_j} w_{n}^{2}(j,t)} \qquad (4.1) The numerator is the sum, over the `N_j` observations affecting element `j`, of the product of each observations' quality coefficient, `q_n`, the model error, :math:`Q_{n} - {\widehat{Q}}_{n}`, and the squared weights. The weights is where most of the action happens. The weights determine how the nudge is interpolated in both space and time (`j,t`). The weights term `w_n(j,t)` in the above equation is solved for observation `n` as a function of both space, `j`, and time, `t`. It is expressed as the product of separate spatial and temporal weight terms: .. _equation-4.2: .. rst-class:: center .. math:: w_{n}(j,t) = w_{n_{s}}(j)\ *\ w_{n_{t}}(t,j) \qquad (4.2) The temporal weight term takes the following piecewise form in our application: .. _equation-4.3: .. rst-class:: center .. math:: w_{n_t}(t,j) = \begin{cases} 10^{10} * (1/10)^{\frac{\left| t-\widehat{t} \right|}{tau_j / 10}} &:\text{if} \ \left| t-\widehat{t} \right| \leq tau_j \\ e^{-a_j*(t-\widehat{t})} &:\text{if} \ \left| t-\widehat{t} \right| \gt tau_j \end{cases} \qquad (4.3) The spatial weight term is of the following form: .. _equation-4.4: .. rst-class:: center .. math:: w_{n_s} = \begin{cases} \frac{R_{n}^2 - d_{jn}^{2}}{R_{n}^2 + d_{jn}^2} &: \text{if} R_n > d_{jn} \\ 0 &: \text{otherwise} \end{cases} \qquad (4.4) The parameters specified in version 1.2 of the NWM (equivalent to this WRF-Hydro version 5) are: .. rst-class:: center | *tau = 15 minutes* | *a = 120 minutes* | *R = 0.25 meters* for all gages (all `j`) in CONUS (the parameter files are discussed below). The very short `R` means that nudging is applied locally, only to the reach where the observation is associated. There is currently no spatial smoothing. This is partly because spatial smoothing is assumed to be computationally intensive and has not been completely implemented. The `tau = 15` means that within 15 minutes of an observation we are heavily weighting the observation and `a = 120` means that nudging to the observation relaxes with e-folding time of two hours moving away from the observation. The Muskingum-Cunge equation in :ref:`Section 3.6.2 ` has the form: .. _equation-4.5: .. rst-class:: center .. math:: Q_{d}^{c} = C1{\,Q}_{u}^{p} + C2\,Q_{u}^{c} + {C3\,Q}_{d}^{p} + \left( \frac{q_{l}\,dt}{D} \right) \qquad (4.5) In v1.0 of the NWM, nudging was applied into this equation in the following manner .. _equation-4.6: .. rst-class:: center .. math:: Q_{d}^{c} = C1{\,Q}_{u}^{p} + C2\,Q_{u}^{c} + {C3\,(Q}_{d}^{p} + N_{d}^{p}) + \left( \frac{q_{l}\,dt}{D} \right) \qquad (4.6) where the discharge solution (`Q`) at the current time (`c`) at the downstream (`d`) reach was solved by applying the nudge from the previous timestep (`N_{d}^{p}`) to adjust the discharge of downstream reach at the previous (`p`) time. Experiments indicated undesirable side effects of introducing a discontinuity (the previous nudge) in discharge between the upstream (`u`) link and the downstream (`d`) link in this solution. With v1.2 of the NWM (equivalent to v5.0 WRF-Hydro), the equation was modified to include the nudge in the upstream terms of the solution as well, at both the previous and current times: .. _equation-4.7: .. rst-class:: center .. math:: Q_{d}^{c} = C1{(Q}_{u}^{p} + N_{d}^{p}) + C2{(Q}_{u}^{c} + N_{d}^{p}) + {C3(Q}_{d}^{p} + N_{d}^{p}) + \left( \frac{q_{l}\,dt}{D} \right) \qquad (4.7) This is the form of the equation currently used for nudging which aims to reduce the discontinuity in space between the upstream and downstream reaches. Experiments revealed that this formulation, significantly reduced the difference between modeled and observed discharge and hence the nudging magnitudes (over long time series of assimilated observations). Note that the nudge is only applied to the upstream reach during the solution of the downstream reach and is not applied in the output values of the upstream reach. This change in the nudging formulation also promotes the previous downstream nudge to a prognostic variable. Whereas `Q_{d}^{p} + N_{d}^{p}` was simply the previous downstream streamflow value after the nudge (already a prognostic model variable), adding the previous downstream nudge to the upstream solutions requires having the previous nudge available. Therefore, the previous downstream nudge values gets written to the nudgingLastObs files (described in :ref:`Section 4.3 `), which are the restart files for the nudging routine. There are a variety of experimental nudging options and features in the nudging section of the :file:`hydro.namelist` which are incomplete or unused at this time. There are also nudging features used in a limited capacity by the NWM which are not described here. As development of these options evolves, they will be documented in future versions of WRF-Hydro. .. _section-4.3: 4.3 Nudging Workflow -------------------- Figure 4.1 provides an overview of the nuding workflow at the file level. Descriptions are provided for each of the files shown in the diagram. .. figure:: media/nudging-workflow.svg :align: center :scale: 125% **Figure 4.1:** The nudging workflow at the file level. 4.3.1 Input Files ~~~~~~~~~~~~~~~~~ .. rubric:: Discharge observations and :file:`nudgingTimeSliceObs/` : Discharge observations from the real world enter the WRF-Hydro system through the :file:`nugingTimeSliceObs/` directory. The individual observation files used for streamflow nudging are stored in this directory, each with the the following naming convention :file:`YYYY-mm-dd_HH:MM:SS.RRmin.usgsTimeSlice.ncdf`. The first part of the filename, ``YYYY-mm-dd_HH:MM:SS``, identifies the center of the “slice of time” in which observations are found (from year down to second). ``RR`` indicates the resolution, or total width of the time slice. Currently this resolution is a **hard-coded** parameter in the model. It is set to 15 minutes as this is the most common reporting frequency for USGS gages in the USA. The ``usgsTimeSlice`` part of the filename is fixed and is legacy of the fact that these files were originally designed for USGS observations. However, any discharge observations can be placed into this format. The format of a an example timeslice observation file is given by an :program:`ncdump -h` in :ref:`Appendix A14 `. Of the three-dimensional variables, two are character lengths and only the ``stationIdInd`` dimension is variable (unlimited). The ``stationIdInd`` variable has dimensions of the number of individual stream gages contained in the file by the fixed width of the strings (``stationIdStrLen=15``). The variable metadata describes the contents. The ``stationId`` variable is the “USGS station identifier of length 15”. While the character type of the variable and the length of 15 are both fixed, identifiers are not limited to those used by the USGS. Custom identifiers can be used and are described later in this section when the gages variable in the :file:`Route_Link.nc` file is described. The variable ``discharge_quality`` is simply a multiplier. This value is stored as a short integer for space concerns and only takes values from zero to one hundred. Internally in the model, this variable is scaled by 100 and used as a floating point variable between zero and one. The ``queryTime`` variable is not used by the model and is optional. It may be useful in situations when the files are updated in real-time. Similarly, the metadata field ``fileUpdateTimeUTC`` can be useful but is not required by the model. The remaining two metadata fields are both required by the model: ``sliceCenterTimeUTC`` and ``sliceTimeResolutionMinutes`` ensure that the file and the model are operating under the same time location and resolution assumptions. An example of generating timeslice files from USGS observations using the R language is given in the help for the rwrfhydro function :command:`WriteNcTimeSlice`. .. rubric:: :file:`hydro.namelist`: When WRF-Hydro is compiled with nudging on, the :file:`hydro.namelist` file is required to contain ``&NUDGING_nlist``. The nudging namelist is found at the bottom of the :file:`hydro.namelist` file either in the :file:`Run/` directory after compilation or in the :file:`template/HYDRO/` directory. The namelist governs the run-time options to the nudging code. These run-time options are detailed in :ref:`Section 4.5 ` below and in :ref:`Appendix A5 `. .. rubric:: :file:`Route_Link.nc`: Collocation of streamflow gages and reach elements is achieved by the gages field in the :file:`Route_Link.nc` file (see Sections :ref:`3.6 ` and :ref:`5.4 `). Each reach element may have a single gage identified with it as specific by a fixed-width 15 character string in the gages field. A blank entry indicates no gage is present on the reach. The gages field in :file:`Route_Link.nc` tells the nudging module where to apply the observed values to the stream network. Gages which appear in the observation files but not in the :file:`Route_Link.nc` file do not cause a problem, they are simply skipped and their identifiers collected and printed to the file :file:`NWIS_gages_not_in_RLAndParams.txt` file, described later. The number of non-blank routelink gages must match the number of gages supplied in the nudging parameters file, described next. .. rubric:: :file:`nudgingParams.nc`: Appendix :ref:`A14 ` shows the structure of the :file:`nudgingParams.nc` file for a small domain. Some of the parameters in the file are explained in detail in Section :ref:`4.2 ` and some are placeholders for capabilities which have not been developed. 4.3.2 Output Files ~~~~~~~~~~~~~~~~~~ When the code is compiled with the nudging compile-time option on (see next section), four types of output files contain nudging information. Some files are different than when compiled without nudging while other files are unique outputs for the nuding option. .. rubric:: :file:`YYYYmmddHHMM.CHRTOUT_DOMAIN1`: The nudging code affects the values written to the “CHRTOUT” files. If valid observations are available, the (modeled) streamflow variable is changed by the assimilated observations. When the model is compiled to enable nudging, the variable ``nudge`` also appears in the file. The nudge value is calculated as in Section :ref:`4.2 `. .. rubric:: :file:`nudgingLastObs.YYYY-mm-dd_HH:MM:SS.nc`: These files are unique to the nudging compile and are the restart files for the nudging component of the model. A restart file is not required, nudging can be run from a cold start. This file can contain multiple variables, only the ``nudge`` variable is described in this documentation. .. rubric:: :file:`NWIS_gages_not_in_RLAndParams.txt`: These files are unique to nudging and report the unique gages found in the observation time slice files which were not present in the :file:`Route_Link.nc` file. These are gages which may have the opportunity to be assimilated (provided they could be located in the domain). There is only one such file per run, written at the end of the run. .. rubric:: Standard output and :file:`hydro_diag.*` files: The nudging routines write various messages. The ultimate destination of these can be compiler dependent. The nudging code aims to preface all its messages with ``Ndg:`` and all its warnings with ``Ndg: WARNING:``. 4.4 Nudging compile-time option ------------------------------- The nuding capability is only available when the code is compiled with the environment variable set: ``WRF_HYDRO_NUDGING=1`` .. _section-4.5: 4.5 Nudging run-time options ---------------------------- :ref:`Appendix A5 ` presents an annotated :file:`hydro.namelist` file. There are two Fortran namelists contained within that file. The nudging run-time options are contained the ``NUDGING_nlist`` which is the second namelist in the document. Only some run time options listed in the namelist are documented at this time. +-----------------------------------+---------------------------------------+ | **Documented/Supported Options** | **Undocumented/Unsupported Options** | +===================================+=======================================+ | ``timeSlicePath`` | ``nLastObs`` | | | | | ``nudgingParamFile`` | ``persistBias`` | | | | | ``nudgingLastObsFile`` | ``biasWindowBeforeT0`` | | | | | ``readTimesliceParallel`` | ``maxAgePairsBiasPersist`` | | | | | ``temporalPersistence`` | ``minNumPairsBiasPersist`` | | | | | | ``invDistTimeWeightBias`` | | | | | | ``noConstInterfBias`` | +-----------------------------------+---------------------------------------+ Details on the meaning and usage of the options are given in :ref:`Appendix A5 `, in both the comments which are part of the namelist itself and by the blue annotations added to the namelists. The supported options are fairly straight forward in their usage. It is worth noting that the specfication of the :file:`nudgingLastObsFile` does not behave the same way as the specification of the LSM or hydro restart files. The unsupported nudging options have to do with mostly experimental methods for forecast bias correction which have been investigated.