Hierarchical continuous-time Markov chain models for threshold exceedance
For many water quality indicators (WQIs), thresholds have been defined that separate the measurement space of the indicator into two states, one of which, the exceedance or violation state, has undesirable consequences. Observations are often made at unevenly spaced intervals, are usually uncoordinated with the timing of state changes, and are usually made asynchronously at multiple locations. These typical observation protocols have hindered estimation of violation-state properties. To address this problem, six hierarchical two-state continuous-time Markov chain (CTMC) models were developed and tested. These allow estimation of duration, frequency, and limiting probabilities from asynchronous, uncoordinated, and unevenly spaced observations. Three of the models were developed for single Markov processes but can be modified to handle multiple processes; the other three were developed for multiple processes. Two of the models were homogeneous; the other four were non-homogeneous with sinusoidally varying components. Model parameters were estimated with Bayesian MCMC methods.
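As an illustration of why a two-state CTMC can be estimated from unevenly spaced observations, the following minimal sketch gives the standard closed-form quantities for a homogeneous two-state chain. It is not the hierarchical model described above. The names `a` (rate of entering the violation state) and `b` (rate of leaving it) and all function names are assumptions introduced here for illustration only.

```python
import math


def two_state_summary(a, b):
    """Closed-form properties of a homogeneous two-state CTMC.

    a: transition rate from the compliant state (0) to the violation state (1)
    b: transition rate from the violation state (1) back to state 0
    Both names are hypothetical; they are not taken from the source.
    """
    if a <= 0 or b <= 0:
        raise ValueError("rates must be positive")
    return {
        "mean_violation_duration": 1.0 / b,        # mean sojourn time in state 1
        "violation_probability": a / (a + b),      # limiting probability of state 1
        "violation_frequency": a * b / (a + b),    # long-run violation events per unit time
    }


def transition_probs(a, b, dt):
    """2x2 transition probability matrix over an elapsed time dt >= 0."""
    total = a + b
    decay = math.exp(-total * dt)
    p01 = (a / total) * (1.0 - decay)
    p10 = (b / total) * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def log_likelihood(a, b, obs_times, obs_states):
    """Log-likelihood of states (0 or 1) seen at arbitrary increasing times.

    Conditions on the first observation. Because transition_probs accepts
    any gap dt, unevenly spaced observations need no special handling.
    """
    ll = 0.0
    for i in range(1, len(obs_times)):
        p = transition_probs(a, b, obs_times[i] - obs_times[i - 1])
        ll += math.log(p[obs_states[i - 1]][obs_states[i]])
    return ll
```

A likelihood of this form could be combined with priors on `a` and `b` in an MCMC sampler; the non-homogeneous and multiple-chain models would need additional structure that is not sketched here.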
In each of three experiments, processes were simulated at high-frequency time steps. Asynchronous, infrequent, uncoordinated, and unevenly spaced observations of these processes were then extracted using protocols specified with varying observation period length, quasi-regular observation interval, and violation-state observation probability. Models were estimated from the simulated observations and compared on nominal parameter value recovery, predictive performance, and frequency and duration distribution error. Effects of process and observation protocol characteristics on recovery, prediction performance, and distribution estimation error were measured.
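The simulate-then-observe design can be sketched as follows. This is a simplified illustration under stated assumptions, not the experimental code: it draws exact exponential holding times rather than stepping at a high-frequency time step, it jitters a regular grid to imitate quasi-regular observation, and it omits the violation-state observation probability. All function and parameter names are hypothetical.

```python
import random
from bisect import bisect_right


def simulate_ctmc(a, b, horizon, rng, start_state=0):
    """Simulate a two-state CTMC on [0, horizon).

    a: rate of leaving state 0, b: rate of leaving state 1 (assumed names).
    Returns (jump_times, states); states[i] holds from jump_times[i]
    until the next jump time.
    """
    times, states = [0.0], [start_state]
    t, s = 0.0, start_state
    while True:
        t += rng.expovariate(a if s == 0 else b)  # exponential holding time
        if t >= horizon:
            break
        s = 1 - s
        times.append(t)
        states.append(s)
    return times, states


def quasi_regular_times(horizon, interval, jitter, rng):
    """Observation times near a regular grid, perturbed by uniform noise.

    jitter is a fraction of the interval; keeping it below 0.5 guarantees
    that the returned times are strictly increasing.
    """
    obs, k = [], 1
    while True:
        t = (k + rng.uniform(-jitter, jitter)) * interval
        if t >= horizon:
            break
        obs.append(t)
        k += 1
    return obs


def observe(jump_times, states, obs_times):
    """State of the simulated path at each observation time."""
    return [states[bisect_right(jump_times, t) - 1] for t in obs_times]
```

The observation times are generated independently of the jump times, which is what makes the observations uncoordinated with the state changes.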
In the first experiment, simulated observations of single-chain two-state CTMCs were made and modeled. First, the choice of prior distribution was evaluated. Uniform and Gamma priors were found to be roughly equivalent in terms of performance, and both were found to perform substantially better than a Jeffreys prior. Next, recovery, prediction, and distribution estimation error were evaluated. Duration, frequency, and violation-state probability were overestimated. Lower distribution estimation error was associated with longer observation periods and more observations. Lower prediction and distribution estimation errors were associated with more non-homogeneous processes.
In the second experiment, observation and modeling of multiple correlated WQI processes were simulated by mimicking WQIs with dual correlated two-state continuous-time Markov chains. Estimates were made both jointly and individually, using the homogeneous model from the first experiment modified for multiple chains. Duration, frequency, and long-term violation-state probability were overestimated. Joint and individual estimates produced nearly equal results. Positively correlated processes and processes with relatively low transition rates were more accurately predicted. Several observation characteristics were related to better prediction: greater event observation intensity, greater quasi-regular observation intensity, and longer observation period.
In the third experiment, two methods were compared for estimating threshold exceedance frequency and duration properties. One method was adapted from the Partial Duration Series (PDS) method popular in flood frequency analysis. The second method was based on three multiple-chain CTMC models. Simulations of WQI time series were generated using a sinusoidal model with autocorrelated errors adapted from the literature. Duration and violation-state probability were overestimated. Frequency was underestimated. A multiple-chain homogeneous CTMC model produced lower-error estimates of frequency and duration than did the PDS method. Results were mixed for the two non-homogeneous multiple-chain CTMC models. For all CTMC models considered, more positively correlated processes were easier to predict. Higher observation rates were found to improve predictive performance. The multiple-chain CTMC models were shown to be extensible to allow for prediction of frequency and duration properties from watershed characteristics or to allow these properties to vary with time.
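For orientation, the kind of quantity a partial-duration-style analysis works with can be sketched as below: contiguous runs of a regularly sampled series above a threshold are treated as exceedance events, and their count and lengths give empirical frequency and duration. This is a generic illustration, not the adaptation of the PDS method used in the experiment; the function names and the fixed sampling step `dt` are assumptions.

```python
def exceedance_durations(values, threshold, dt=1.0):
    """Durations of contiguous runs where values exceed the threshold.

    values: regularly sampled series; dt: sampling step (assumed constant).
    A run that is still open at the end of the series is counted as observed.
    """
    durations, run = [], 0
    for v in values:
        if v > threshold:
            run += 1
        elif run:
            durations.append(run * dt)
            run = 0
    if run:
        durations.append(run * dt)
    return durations


def exceedance_frequency(values, threshold, dt=1.0):
    """Number of exceedance events per unit time over the sampled record."""
    n_events = len(exceedance_durations(values, threshold, dt))
    return n_events / (len(values) * dt)
```

Applied to sparse, unevenly spaced observations, run-based counts of this kind can miss short events between samples, which is one reason a model-based estimate may behave differently.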
The bias in recovery of nominal parameter values seen in all three experiments appeared to be related to the observation protocol characteristics and not to the models or estimation method. The sources of this bias were not fully investigated.
0388: Hydrologic sciences
0790: Systems science