Keywords:

  • daily weather generator;
  • stochastic model;
  • evolutionary algorithms;
  • Markov chain model;
  • optimization

Abstract

  1. Top of page
  2. Abstract
  3. 1. Introduction
  4. 2. Theoretical considerations
  5. 3. Model application
  6. 4. Results and discussion
  7. 5. Conclusion
  8. Abbreviations
  9. References

Stochastic Multi-states First-order Markov Chain (SMFOMC) models have been used to describe the occurrence of daily rainfall. This paper describes the optimization of SMFOMC parameters through the generation of synthetic daily rainfall sequences. The three SMFOMC parameters were the number of states (NS), the preserved proportion in the last state (PPL) and the state divider (SD). Multi-objective differential evolution (MODE) was used to find the Pareto-optimal line (POL) of two conflicting objectives: (1) minimization of total monthly absolute total relative error (TMATRE), and (2) minimization of NS. Three probability distribution functions (PDFs) for generating daily rainfall amounts in the last Markov Chain state were compared: (1) the shifted exponential distribution (SE), (2) the exponential distribution (E), and (3) the two-parameter gamma distribution (G-2). The optimal SMFOMC parameters were applied to generate the daily rainfall sequences of 44 rainfall stations located in five regions of Thailand. Reliability of the optimal SMFOMC parameters for each PDF was measured by TMATRE and the coefficient of determination (R2). Performance of the PDFs was analysed by a ranking method. Results showed that the three PDFs mostly fitted the synthetic daily rainfall sequences well. However, the highest error was found for the monthly average minimum daily rainfall values. Of the three PDFs, the SE demonstrated the lowest performance, while G-2 performed the best. Copyright © 2011 Royal Meteorological Society


1. Introduction

Synthetic daily rainfall sequences are often used as an important input for mathematical simulation in hydrology, agriculture and water resources models. In univariate daily rainfall models, missing data can be generated using statistical parameters that describe the hydrological behaviour within the sequence itself. For this purpose, the multiplicative autoregressive integrated moving average (ARIMA) and Thomas–Fiering models have been extensively applied (Delleur and Kavvas, 1978; Sharma, 1985; Vogel, 1988; Mujumdar and Kumar, 1990; Schreider et al., 1997; Toth et al., 1999; Ahmad et al., 2001; Taewichit and Chittaladakorn, 2007; Amini et al., 2009). However, their difficulty, complexity and requirement for a large number of statistical parameters are considered to be their limitations. One simplified approach is the Stochastic First-Order Markov Chain (SFOMC) model, which describes the probability of rainfall occurrence on a given day using transition probability matrices (TPMs). The SFOMC model has been applied to study the occurrence of daily rainfall (Gabriel and Neumann, 1962; Moon et al., 2006) and to construct rainfall-runoff synthesizing models (Kottegoda et al., 2000). The main concept behind the SFOMC is the use of conditional probability to describe the occurrence and non-occurrence of rainfall (Gabriel and Neumann, 1962). The SFOMC was initially developed and applied as a two-state (wet-dry) model (Gabriel and Neumann, 1962; Todorovic and Woolhiser, 1975; Haddada et al., 2000) for generating wet-dry events. A further development of the SFOMC couples the two-state model with probability distribution functions (PDFs) (Tsakeris, 1988) to estimate daily rainfall.
In addition, the SFOMC model has been successfully modified by increasing the number of states (NS) in the wet state (Khanal and Hamrick, 1974; Srikanthan and McMahon, 1984; Hutchinson, 1990; Aksoy, 2003) and using PDFs in the last state. This model is popularly known as the Stochastic Multi-States First-Order Markov Chain (SMFOMC). However, some of the difficulties noted in using the SMFOMC (Haan, 1977) were: (1) determining NS, (2) determining the intervals of the variable under study to associate with each state, and (3) assigning a number to the magnitude of an event once the state is determined. Most research studies of SMFOMC still use trial and error to overcome these limitations and to determine the optimal SMFOMC parameters for which the generated daily rainfall sequences are close to the historical sequences.

To synthesize daily rainfall for a single site (univariate model), the optimal SMFOMC parameters, namely the number of states (NS), the preserved proportion in the last state (PPL) and the state divider (SD), are determined in this study. Forty-four rainfall stations from five regions (Central, North, North-East, East, and South) in Thailand (Figure 1) were selected to apply and evaluate the optimal parameters of the SMFOMC model. A multi-objective algorithm, multi-objective differential evolution (MODE), was employed. Two conflicting objectives, minimizing NS and minimizing total monthly absolute total relative error (TMATRE), were set using statistical parameters of the generated daily rainfall sequences. The performance of three PDFs, the shifted exponential distribution (SE), the exponential distribution (E) and the two-parameter gamma distribution (G-2), was evaluated and compared.

Figure 1. Locations of the selected 44 rainfall stations in 5 regions of Thailand. This figure is available in colour online at wileyonlinelibrary.com/journal/met

2. Theoretical considerations

2.1. Stochastic multi-states first order Markov chain

SMFOMC has been applied in hydrology and water management for modelling processes (Kottegoda et al., 2000; Aksoy, 2003; Ochola and Kerkides, 2003). In its first-order form, it employs conditional probability to describe the process x(t) at the present time t using only the outcome at the previous time t − 1. A higher-order Markov Chain model, in which the order corresponds to the number of preceding days considered (Chapman, 1998), could also be formulated (Kulkarni et al., 2002). In its simplest form, SMFOMC may be considered a two-state model with a dry day (no rain) and a wet day. However, no discernible difference has been reported between the performance of first- and second-order models in synthesizing daily rainfall (Jimoh and Webster, 1996).

SMFOMC is defined by its transition probability matrices (TPMs) and frequency distributions of rainfall amounts (Haan et al., 1976), which can preserve most of the daily, monthly and annual characteristics (Srikanthan and McMahon, 2001). The TPMs play a significant role in estimating the present state j at time t using the probability pij(t) of moving from state i at time t − 1 to state j at time t, which is derived from the frequency of state changes from state i to state j. To obtain the frequency of daily rainfall for each state, a rainfall class limits table (RCLT) is constructed to classify rainfall data into successive states j = 1 to j = r. Each state has an upper-bound and a lower-bound rainfall amount. The interval for each state is mostly specified through manual trial and error based on the researcher's experience. In daily data generation, the amount of rainfall in an intermediate state j at time t (states j = 2 to j = r − 1) is estimated by linear interpolation: the lower bound of state j plus a uniform random number U ∈ (0, 1) multiplied by the difference between the upper and lower bounds of state j. U is generated with linear congruential generators, the most popular type of generator (Salas, 1992; Reddy, 1997). In addition, the inverse cumulative probability distribution function is used to estimate the rainfall amount in the last state r.
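The occurrence and amount steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the 0-based state numbering (state 0 is dry), the class limits in `bounds`, the toy `history` sequence and the placeholder sampler for the last state are all assumptions made for the example, and Python's built-in generator stands in for a linear congruential generator.

```python
import random

def transition_matrix(states, n_states):
    """Estimate p[i][j] from the frequency of moves i -> j (0-based states)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for prev, cur in zip(states, states[1:]):
        counts[prev][cur] += 1
    tpm = []
    for row in counts:
        total = sum(row)
        # Assumption: a state that never occurs gets a uniform row.
        tpm.append([c / total for c in row] if total else [1.0 / n_states] * n_states)
    return tpm

def next_state(tpm, state, u):
    """Pick the next state by comparing u with cumulative row probabilities."""
    cumulative = 0.0
    for j, p in enumerate(tpm[state]):
        cumulative += p
        if u < cumulative:
            return j
    return len(tpm[state]) - 1  # guard against floating-point rounding

def intermediate_amount(lower, upper, u):
    """Linear interpolation between the class limits of an intermediate state."""
    return lower + u * (upper - lower)

def generate(tpm, bounds, n_days, last_state_amount, rng, start=0):
    """Generate daily amounts: 0 for the dry state, interpolation for
    intermediate states, and a user-supplied inverse CDF for the last state."""
    state, series = start, []
    for _ in range(n_days):
        state = next_state(tpm, state, rng.random())
        lower, upper = bounds[state]
        if state == 0:
            series.append(0.0)
        elif upper is None:
            series.append(last_state_amount(rng.random()))
        else:
            series.append(intermediate_amount(lower, upper, rng.random()))
    return series

# Hypothetical class limits (mm); the last state is open-ended.
bounds = [(0.0, 0.0), (0.1, 5.0), (5.0, 20.0), (20.0, None)]
history = [0, 0, 1, 2, 1, 0, 0, 3, 2, 1, 0, 1, 1, 0]
tpm = transition_matrix(history, len(bounds))
series = generate(tpm, bounds, 50, lambda u: 20.0 + 10.0 * u, random.Random(1))
```

In the study a separate TPM is built for each of the 12 months; the sketch uses a single matrix only to keep the example short.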

2.2. Probability distributions

Various PDFs have been applied by researchers (Allen and Haan, 1975; Todorovic and Woolhiser, 1975; Suhaila and Jemain, 2007). However, the cumulative probability distribution function (CDF), which is the area under the PDF curve up to a given value, is popularly used. SE was first proposed by Allen and Haan (1975). Its CDF is given as:

  • equation image(1)

where F(x) is the CDF, λ is the difference between the average of all recorded historical daily rainfall amounts greater than or equal to Rfc−1 and Rfc−1 itself, Rfc−1 is the lower-bound rainfall amount of the last state, x is the rainfall amount in the last state and c is the state of the daily rainfall amount. The Maximum-Likelihood method (MLM) is used for estimating the parameters of E and G-2. Numerical methods, suggested by Rao and Hamed (2001), are applied to solve for the parameters of both distributions. E is a special case of the Gamma family; its distribution is obtained by setting β = 1 in Equation (2) and is expressed as Equation (3):

  • equation image(2)
  • equation image(3)

where α, β and ε are distribution parameters and Γ(β) is gamma function.

G-2 has been used extensively in Markov Chain models (Coe and Stern, 1982; Richardson and Wright, 1984; Duan et al., 1995; Kottegoda et al., 2000; Aksoy, 2003). Its PDF, Equation (4), is formulated by eliminating ε from Equation (2):

  • equation image(4)
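Equations (1) to (4) survive here only as image placeholders. As a hedged reconstruction, assuming the standard textbook forms that match the definitions given in the text (a shifted exponential with scale λ and shift Rfc−1, and a gamma family with scale α, shape β and location ε), they may be written as follows. These forms are assumptions and should be checked against the published equations.

```latex
\begin{align}
F(x) &= 1 - \exp\!\left(-\frac{x - Rf_{c-1}}{\lambda}\right), \qquad x \ge Rf_{c-1} \tag{1}\\
f(x) &= \frac{1}{\alpha\,\Gamma(\beta)}
        \left(\frac{x-\varepsilon}{\alpha}\right)^{\beta-1}
        \exp\!\left(-\frac{x-\varepsilon}{\alpha}\right) \tag{2}\\
f(x) &= \frac{1}{\alpha}\exp\!\left(-\frac{x-\varepsilon}{\alpha}\right) \tag{3}\\
f(x) &= \frac{1}{\alpha\,\Gamma(\beta)}
        \left(\frac{x}{\alpha}\right)^{\beta-1}
        \exp\!\left(-\frac{x}{\alpha}\right) \tag{4}
\end{align}
```

Under this assumed form of Equation (1), setting F(x) = U and solving for x gives the inverse CDF x = Rfc−1 − λ ln(1 − U), which is the kind of expression used to draw a rainfall amount in the last state from a uniform random number U.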

2.3. Optimization

2.3.1. Multi-objective optimization problems (MOPs)

MOPs deal with optimizing several conflicting objectives simultaneously. The objectives are considered together when deciding on the desirable solution. The solutions of MOPs comprise non-dominated solutions (NDSs), which are often expressed as the Pareto-optimal line (POL). The heuristic stochastic search techniques known as evolutionary algorithms (EAs) have been used intensively for solving MOPs (Coello et al., 2002) owing to their population-based nature, which allows the generation of several elements of the POL in a single run. EAs also provide a diversification mechanism to obtain a better solution. The aim of EAs in MOPs is to find a POL as close as possible to the true POL and to spread the solutions along the POL as much as possible. The POL comprises NDSs that have been sorted into front levels using a non-dominated sorting algorithm (NDSA). The NDSA, also known as simple modified naïve slow, was proposed by Deb (2001) and is applied in the present study. Once the POL has been obtained, the preferred solution can be chosen using the compromise programming (CP) technique with weighted importance values for each objective function (Zeleny, 1982; Romero and Rehman, 1989).
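A minimal Python sketch of the two ideas in this paragraph follows: front-by-front non-dominated sorting and a compromise programming choice along the resulting front. It is illustrative only. The sorting is a naive pairwise version and not necessarily Deb's exact procedure, the compromise metric (a weighted L_p distance to the ideal point after min-max scaling) is an assumed form, and the (TMATRE, NS) pairs are invented.

```python
def dominates(a, b):
    """True if a is no worse than b everywhere and better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_fronts(points):
    """Naive sorting: repeatedly peel off the current non-dominated set."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def compromise_choice(points, weights, p=2):
    """Index of the point closest to the ideal point in a weighted L_p sense."""
    n_obj = len(weights)
    best = [min(pt[k] for pt in points) for k in range(n_obj)]
    worst = [max(pt[k] for pt in points) for k in range(n_obj)]

    def distance(pt):
        total = 0.0
        for k in range(n_obj):
            span = worst[k] - best[k]
            scaled = (pt[k] - best[k]) / span if span else 0.0
            total += (weights[k] * scaled) ** p
        return total ** (1.0 / p)

    return min(range(len(points)), key=lambda i: distance(points[i]))

# Hypothetical (TMATRE, NS) pairs; both objectives are minimized.
solutions = [(400.0, 3), (250.0, 5), (260.0, 6), (120.0, 10), (100.0, 25), (130.0, 12)]
fronts = non_dominated_fronts(solutions)
pareto = [solutions[i] for i in fronts[0]]
pick_balanced = pareto[compromise_choice(pareto, weights=(0.5, 0.5))]
pick_error_only = pareto[compromise_choice(pareto, weights=(1.0, 0.0))]
```

With these invented points, equal weights select a mid-range number of states, while putting all the weight on the error objective selects the solution with the most states, which mirrors the trade-off the two objectives are meant to expose.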

2.3.2. Multi-objective differential evolution (MODE)

MODE is an advanced version of the differential evolution algorithm (DEA) (Storn and Price, 1997) for multi-objective optimization (Sun et al., 2005). The DEA begins by randomly generating a population of NP solution vectors of dimension D, the ‘target vectors’. To improve the solution vectors, a recombination process consisting of mutation and crossover is used to produce ‘trial vectors’. Solution values are swapped and changed according to a probability: crossover occurs when the random number drawn during trial vector generation is less than or equal to the crossover constant (CR ∈ (0, 1)). The NP trial vectors are generated dimension by dimension by randomly picking three distinct solution vectors and adding to the first vector the product of the weighted factor (F ∈ (0, 1)) and the difference of the remaining two vectors. A trial vector replaces its target vector if the objective value of the former is better than that of the latter.
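The recombination step can be illustrated with a short single-objective sketch in Python. It follows the common DE/rand/1 scheme with binomial crossover, which matches the description above but is not confirmed to be the authors' exact variant. Excluding the target vector from the three picks, clamping values to the search bounds, the test function `sphere` and the population settings are all assumptions of the example.

```python
import random

def trial_vector(population, i, F, CR, bounds, rng):
    """Build one trial vector for target index i (DE/rand/1 with binomial crossover)."""
    candidates = [k for k in range(len(population)) if k != i]
    a, b, c = (population[k] for k in rng.sample(candidates, 3))
    target = population[i]
    D = len(target)
    forced = rng.randrange(D)  # one dimension always comes from the mutant
    trial = []
    for k in range(D):
        if rng.random() <= CR or k == forced:
            value = a[k] + F * (b[k] - c[k])
            lo, hi = bounds[k]
            value = min(max(value, lo), hi)  # assumption: clamp to the bounds
        else:
            value = target[k]
        trial.append(value)
    return trial

def de_step(population, objective, F, CR, bounds, rng):
    """One generation: a trial replaces its target only if it is better."""
    new_population = []
    for i, target in enumerate(population):
        trial = trial_vector(population, i, F, CR, bounds, rng)
        new_population.append(trial if objective(trial) < objective(target) else target)
    return new_population

def sphere(x):
    """Hypothetical test objective: sum of squares, minimum at the origin."""
    return sum(v * v for v in x)

rng = random.Random(0)
bounds = [(-5.0, 5.0)] * 3
population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(30)]
initial_best = min(sphere(x) for x in population)
for _ in range(50):
    population = de_step(population, sphere, F=0.5, CR=0.95, bounds=bounds, rng=rng)
final_best = min(sphere(x) for x in population)
```

Because a target is only ever replaced by a better trial, the best objective value in the population cannot get worse from one generation to the next.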

The DEA is modified to become MODE (Xue, 2003; Reddy and Kumar, 2007; Li and Zhang, 2008). The population is initialized as in a normal DEA to generate NP target vectors, followed by the generation of NP new trial vectors ū in the recombination process. The two sets of vectors are combined and sorted using the proposed NDSA to rank the NDSs. The selection process is similar to that of the elitist non-dominated sorting GA (NSGAII): NP sorted solutions are picked directly from the solution fronts, j, and replace the old set of solutions. A mechanism borrowed from NSGAII, the crowded distance assignment (Deb, 2001), is applied to each solution in the non-dominated fronts and serves as the criterion for choosing among solutions when the last required solutions lie in the same front; solutions with a larger crowded distance are preferred. These steps complete one generation and are repeated generation by generation until the POL no longer changes (Table 1).

Table 1. Proposed MODE algorithm
Algorithm: MODE
Initialize population vectors to generate target vectors x̄ of size NP
For G = 1 to MaxG
    For i = 1 to NP
        Randomly select three distinct vectors and randomly select position j ∈ (1, D)
        For k = 1 to D
            Generate random number rand(k) ∈ (0, 1)
            If rand(k) < CR or k = D then generate trial vector ū at position j end if
            j = next position: If j > D then j = 1 end if
        Next k
    Next i
    Combine x̄ and ū to create new vectors r̄ of size NP × 2
    Repeat
        Perform non-dominated sorting on vectors r̄ using simple modified naïve slow sorting
    Until all NP × 2 vectors are sorted; store the number of fronts in NF
    Assign crowded distance to target vectors x̄ and trial vectors ū in r̄
    Set remaining required solutions (RRSs) = NP
    For j = 1 to NF
        If RRSs = 0 then exit for end if
        If RRSs ≥ the number of solutions in front j then
            Move all solutions from front j into the next generation G
            RRSs = RRSs − the number of solutions in front j
        Else
            For k = 1 to RRSs
                Compare crowded distance (CD) of solutions of r̄ in front j (the solution with the larger CD goes into the next generation G)
            Next k
            RRSs = 0
        end if
    Next j
Next G
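The selection part of the algorithm (filling NP slots front by front and using crowded distance to break ties in the last front) can be sketched in Python as below. This is an illustration in the style of NSGA-II, not the authors' code. The fronts are assumed to be already sorted and are supplied by hand, and the objective values in `points` are invented.

```python
def crowding_distance(front_points):
    """Crowded distance per point within one front; boundary points get infinity."""
    n = len(front_points)
    distance = [0.0] * n
    if n == 0:
        return distance
    for k in range(len(front_points[0])):
        order = sorted(range(n), key=lambda i: front_points[i][k])
        span = front_points[order[-1]][k] - front_points[order[0]][k]
        distance[order[0]] = distance[order[-1]] = float("inf")
        if span == 0:
            continue
        for pos in range(1, n - 1):
            gap = front_points[order[pos + 1]][k] - front_points[order[pos - 1]][k]
            distance[order[pos]] += gap / span
    return distance

def select_next_generation(points, fronts, NP):
    """Fill NP slots front by front; in the last, partially used front,
    keep the solutions with the larger crowded distance."""
    chosen = []
    for front in fronts:
        remaining = NP - len(chosen)
        if remaining == 0:
            break
        if len(front) <= remaining:
            chosen.extend(front)
        else:
            cd = crowding_distance([points[i] for i in front])
            ranked = sorted(range(len(front)), key=lambda q: cd[q], reverse=True)
            chosen.extend(front[q] for q in ranked[:remaining])
    return chosen

# Hypothetical objective values and pre-sorted fronts (indices into points).
points = [(1.0, 5.0), (2.0, 3.0), (3.0, 2.5), (5.0, 1.0), (4.0, 4.0), (6.0, 6.0)]
fronts = [[0, 1, 2, 3], [4], [5]]
keep_three = select_next_generation(points, fronts, NP=3)
keep_five = select_next_generation(points, fronts, NP=5)
```

Here the first front holds four solutions, so with NP = 3 the two boundary solutions and the less crowded interior solution are kept, which preserves the spread of the front.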

3. Model application

3.1. Study area and locations of selected rainfall stations

For this study, a 38-year (1971–2008) continuous record of daily rainfall occurrences at 44 stations distributed across 5 regions of Thailand (Figure 1) was used. The rainfall stations were selected on the basis of data continuity and a record length of more than 30 years. The daily rainfall data were obtained from the Royal Irrigation Department and the Meteorological Department of Thailand.

3.2. Model formulation for optimization of SMFOMC parameters

Twelve monthly statistical parameters were used to measure the adequacy and acceptability of the model: (1) monthly maximum spell length of wet days (MMaxSLWD); (2) monthly maximum spell length of dry days (MMaxSLDD); (3) monthly maximum daily rainfall (MMaxDR); (4) monthly minimum daily rainfall (MMinDR); (5) monthly sum of daily rainfall (MSDR); (6) monthly average daily rainfall (MADR); (7) monthly daily rainfall standard deviation (MDRStd); (8) monthly daily rainfall skewness (MDRSk); (9) monthly number of wet days (MNWD); (10) monthly number of dry days (MNDD); (11) monthly average number of wet days (MANWD), and (12) monthly average number of dry days (MANDD). To provide the optimal SMFOMC parameters that minimize the TMATRE and, by minimizing NS, reduce the large computation of TPMs for each rainfall station subject to the assigned parameter constraints, the multi-objective problem was formulated as:

  • equation image(5)
  • equation image(6)

where u is the number of U sequences; OPssp, m is the monthly statistical parameter with index ssp of historical daily rainfall for month m; EPssp, m is the monthly statistical parameter with index ssp of generated daily rainfall for month m; ssp is an index of the 12 statistical parameters ∈ [1, 12]; m is an index of the 12 months ∈ [1, 12], starting from January; and i is a state index ∈ [1, 10] that refers to SD. The three decision variables are SD, PPL and NS. PPL ∈ [0.01, 0.5] is the proportion preserved for the application of PDFs to generate the rainfall amount in the last state. The maximum PPL was fixed at 50%, so that at most 50% of the probability of rainfall occurrence is described by the probability distribution, while the remaining probability is described by TPMs. SD is a discrete decimal state divider taking the values 1.1, 1.2, …, 2.0. NSi is the number of states ∈ [Min(NSi), Max(NSi)]. Min(NSi) = 3 (two for the wet-dry model) and Max(NSi) is the maximum number of states for which the amount of rainfall in state two after construction of the RCLT is not lower than 0.1 mm (dry state) (Srikanthan and McMahon, 1982). This value was calculated from SD, PPL and the maximum daily rainfall of each rainfall station (Appendix A).

3.3. Model reliability

The 12 monthly statistical parameters of the generated and historical sequences were calculated on a monthly basis over the 38 years. The model reliability for optimal SMFOMC parameters was evaluated based on the TMATRE of the 12 monthly statistical parameters (Equation (5)). The TMATRE describes the absolute relative error between historical and generated data; a lower value indicates a more satisfactorily generated sequence.

3.4. Data input

Input data were: (1) the historical daily rainfall data sequence for each rainfall station, (2) U sequences ∈ [0,1], and, (3) maximum NS for each SD (Appendix A). Thirty U sequences were generated and compiled as a single file input to the proposed model. These U sequences were assumed to be several stochastic events and used for generating daily rainfall sequences during the optimization step. This assured that the obtained optimal SMFOMC parameters for each rainfall station would be reliable when the rainfall event is changed.

3.5. Model application

As illustrated in Figure 2, for each rainfall station, after compilation of input data, the modelling process was started with the calculation of required PDF parameters using MLM. Likewise, the 12 monthly statistical parameters of historical daily rainfall data were calculated. Other steps are described below.

Optimization process:

  • (a)
    Thirty U sequences were read from the input file. Sets of NS, SD and PPL were randomly generated as the vectors of decision variables. The optimal SMFOMC parameters were calculated only in cases of shifted exponential distribution, so as to compare the performance of the other two PDFs with this distribution later in the generation process.
  • (b)
    The TPMs of the 12 months were created using historical data. The RCLT was also constructed from the population vectors of decision variables. The SMFOMC could then be run using the 30 stochastically generated U sequences to obtain the vectors of the two objective functions, Equations (5) and (6). The parameters used in the MODE were pre-tested by observing the calculation time and the POL, and were updated until the optimal POL was observed. A population vector size of 30, a maximum of 100 generations, a weighted factor F of 0.5 and a crossover constant (CR) of 0.95 were found suitable for the MODE algorithm, since they provide low calculation time and a stable POL.
  • (c)
    After obtaining the POLs of all rainfall stations, the SMFOMC parameters (NS, SD and PPL) were acquired from compromise programming.

Generation process:

  • (d)
    For each rainfall station optimal parameters from optimization under w1s (weighted importance of first objective) were used to generate three replicates of synthesized daily rainfall sequences of each PDF. When the rainfall amount generation of the last state was enabled, three PDFs were called to generate rainfall amount with their inverse CDF. Reliability of the optimal SMFOMC parameters for each PDF was measured by TMATRE and coefficient of determination (R2). The relative performance of PDFs was then measured by a ranking method.

Performance of PDFs

  • (e)
    The Eigenvalue of each distribution was calculated by first averaging the values of monthly absolute relative error (MATRE) of the 44 rainfall stations for each statistical parameter using Equation (7). The average MATRE of each statistical parameter for the three PDFs was then normalized by the sum of the PDF average MATRE of the ith statistical parameter. The Eigenvalue of each PDF was finally calculated using Equation (8) with equal weighted importance. This was performed for one w1. For the other w1s, the procedure was repeated until the Eigenvalues of all PDFs under all w1s were obtained. A higher Eigenvalue indicates lower performance:
    • equation image(7)
    where Avg.MATREd, i is an average of MATRE of the ith statistical parameter of PDF d; st is station ID ∈ [1,44]; m is an index of 12 months ∈ [1,12]; OPi, st, m and EPi, st, m are monthly ith statistical parameters of rainfall station st of month m of historical and generated sequences respectively:
    • equation image(8)
    where Eigenvalued is an Eigenvalue of PDF d; Norm(Avg.MATREd, i) is a normalized average MATRE of PDF d of ith statistical parameter, wi is a weighted importance of ith statistical parameter.
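The ranking step described in item (e) can be sketched as follows: for each statistical parameter the average MATRE of each PDF is divided by the sum over the three PDFs, and the normalized values are combined with weights. This is an illustrative reading of the text and not the published Equations (7) and (8). The function name and the numbers in `avg_matre` are invented, and only two statistical parameters are used to keep the example small.

```python
def eigenvalues(avg_matre, weights=None):
    """Weighted sum of normalized average MATRE per PDF.

    avg_matre[pdf] is a list with one average MATRE per statistical parameter.
    """
    pdfs = list(avg_matre)
    n_stats = len(avg_matre[pdfs[0]])
    if weights is None:
        weights = [1.0 / n_stats] * n_stats  # equal weighted importance
    scores = {d: 0.0 for d in pdfs}
    for i in range(n_stats):
        column_sum = sum(avg_matre[d][i] for d in pdfs)
        for d in pdfs:
            # Normalize by the sum over PDFs for this statistical parameter.
            scores[d] += weights[i] * avg_matre[d][i] / column_sum
    return scores

# Hypothetical average MATRE values for two statistical parameters.
avg_matre = {"SE": [0.30, 0.12], "E": [0.30, 0.10], "G-2": [0.30, 0.08]}
scores = eigenvalues(avg_matre)
ranking = sorted(scores, key=scores.get)  # lowest Eigenvalue = best performance
```

With equal weights that sum to one, the Eigenvalues of the PDFs also sum to one, so each value can be read as a PDF's share of the total normalized error.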

Figure 2. SMFOMC's parameters optimization using MODE

4. Results and discussion

4.1. Optimal SMFOMC parameters

During simulation runs, various POLs were generated. In most of the events, the POLs did not change further after 60 simulation runs. Thus, the simulation was continuously performed until 100 runs and the optimal POL values were recorded. The results of discrete optimal POLs (Figure 3) clearly showed that the TMATRE decreased with increasing NS. The TMATRE was high with low NS, and it sharply decreased until the NS reached nearly 10. The NS values between 18 and 24 were not found in the POLs. After NS of 25, the TMATRE decreased slightly.

Figure 3. Pareto-optimal line of 44 selected rainfall stations: (a) Central, (b) East, (c) North, (d) Northeast and (e) South. NS = number of states

4.1.1. Variations in optimal SMFOMC parameters with w1s

The SMFOMC parameters were controlled by requiring the amount of rainfall at the upper limit of the second rainfall state (wet) to be larger than 0.1 mm (Figure 4). NS must be minimized according to the second objective function in order to reduce the size of the TPMs and the model simulation time. Rapidly rising trends in NS were noticed when w1s were higher than 0.5 (Figure 4(a)). The PPL (Figure 4(b)) remained stable when w1s were lower than 0.5, after which sharp decreases in PPL were noted. Variations in SD are illustrated in Figure 4(c). This parameter specifies the interval of rainfall amounts of the successive intermediate states. A maximum SD of 1.6 was noted for w1s in the range of 0.1–0.6 and a minimum SD of 1.1 was found at w1s higher than 0.7. The minimum values of SD visibly corresponded to lower TMATRE. As shown in Figure 4(d), TMATRE showed large disparity at w1s lower than 0.50, after which it became more stable. In summary, TMATRE decreased as w1s increased. As hypothesized, the conflicting characteristics of TMATRE and NS are evident in Figure 4(a) and (d). For further analysis in the generation process, the compromise set of solutions obtained with w1s of 0.50 onward was considered, as those w1s provided low and stable TMATRE.

Figure 4. Effects of weighted importance values of objective one (w1s) on: (a) the number of total states, (b) the preserved proportion in the largest state, (c) the state dividers and (d) TMATRE

4.1.2. Daily rainfall generation

In the generation process, three replicates of daily rainfall sequences were generated for each rainfall station, each with the same length as the historical daily rainfall sequence. Three PDFs were applied to generate the daily rainfall sequences in the last state corresponding to the same time and the same U. Probability distribution parameters were obtained from the 44 rainfall stations once all RCLTs for the w1s were constructed (Table 2). The generated sequences were summarized monthly, based on the 12 statistical parameters, and were then averaged over the three replicates. High TMATRE values (above 300) were found for w1s of 0.1–0.4. However, distinctly visible variations were observed in TMATRE for w1s in the range of 0.5–0.7 (Figure 5). In general, the compromise solution is presented with a w1 of 0.5, giving equal weight to both objectives. Moreover, the compromise solutions could also be taken as desirable solutions under the variation of w1s. The comparative analysis of w1s on the variation of *TMATRE (normalized sum of all TMATRE of 44 rainfall stations of each PDF under each w1 for 12 statistical parameters) is presented in Figure 6.

Figure 5. Variations in TMATRE of generated daily rainfall sequences with w1s. Series are shown for w1 = 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0

Figure 6. *TMATRE variation of 12 statistical parameters of all stations from three PDFs (SE, E, and G-2) with different w1s: (a) MMaxSLWD (days), (b) MMaxSLDD (days), (c) MMaxDR (mm), (d) MADR (mm), (e) MDRStd, (f) MDRSk, (g) MNWD (days), (h) MNDD (days), (i) MANWD (days), (j) MANDD (days), (k) MSDR (mm) and (l) MMinDR (mm). Means within each PDF with the same letter are not significantly different (p < 0.05) by Duncan Multiple Range Test. *TMATRE = [Normalized sum of all TMATRE of 44 rainfall stations of each statistical parameter under each w1/(44 rainfall stations × 12 months)]. Series are shown for w1 = 0.5, 0.6, 0.7, 0.8, 0.9 and 1.0

Table 2. Probability distribution parameters
Statistic value    Shifted exponential   Exponential                   Two-parameter gamma
                   λ                     α            ε                α            β                Cs(a)
Max                3.95                  124.5        279.4            14.4         954.3            0.39
Min                0.36                  2.6          60.8             0.004        26.1             0.01
Average ± STD      1.3 ± 0.7             35 ± 23.9    112.5 ± 38.6     2.4 ± 2.3    112.5 ± 132.2    0.23 ± 0.08
(a) Cs is the skewness coefficient of the observed historical daily rainfall in the last state.

4.1.3. Effect of criteria weights on statistical parameters of synthetic daily rainfall sequences

Based on the statistical test results, a model was considered to perform satisfactorily if the average of the parameters estimated from the replicates was close to the historical values. No significant differences in *TMATRE were found in the statistical parameters for the three PDFs (Figure 6) except in MMaxDR (Figure 6(c)), MMinDR (Figure 6(l)), MSDR (Figure 6(k)), MADR (Figure 6(d)), MDRStd (Figure 6(e)) and MDRSk (Figure 6(f)). Those differences in *TMATRE (p < 0.05), which varied with w1s, indicated that the w1s statistically affected the acceptability of those statistical parameters. The *TMATRE of all statistical parameters of generated data varied between 2 and 25% relative to the historical statistical parameters, except MMinDR, which showed distinctly large error in its estimation (Figure 6(l)). This may be due to the difficulty of estimating near-zero values of daily rainfall with this model. The daily rainfall generation in the last state using the three PDFs did not show any significant difference; indeed, the PDFs provided almost the same *TMATRE for all statistical parameters. Although it could not be firmly concluded which PDF performs the best, the Eigenvalues of the three PDFs under the w1s were later calculated and ranked to compare their performance. Over half of the 12 statistical parameters provided low *TMATRE for w1s of 0.8–1.0 (Figure 6(d–f, k and l)) compared to the others. Hence, for this range of w1s (0.8–1.0), the cumulative ratios of monthly statistical parameters of generated data to yearly historical data were plotted against the cumulative ratios of monthly statistical parameters of monthly historical data to yearly historical data for all PDFs to examine correlations (Figure 7). Good correlations (R2 > 0.9) were observed for most parameters, the exception being MMinDR, which was extremely overestimated. In addition, modest underestimates in MMaxSLWD, MMaxSLDD and MMaxDR also appeared.

Figure 7. Regression plots for the cumulative ratios of the monthly statistical parameters of generated data to yearly historical data versus the cumulative ratios of monthly statistical parameters of monthly historical data to yearly historical data for w1s in the range of 0.8–1.0, month by month (all plots obtained from three distributions): (a) MMaxSLWD (days; R2 = 0.943), (b) MMaxSLDD (days; R2 = 0.946), (c) MMaxDR (mm; R2 = 0.986), (d) MADR (mm; R2 = 0.996), (e) MDRStd (R2 = 0.990), (f) MDRSk (R2 = 0.980), (g) MNWD (days; R2 = 0.998), (h) MNDD (days; R2 = 0.999), (i) MANWD (days; R2 = 0.998), (j) MANDD (days; R2 = 0.999), (k) MSDR (mm; R2 = 0.994) and (l) MMinDR (mm; R2 = 0.272). The line in each panel is the regression line

4.1.4. Which set of SMFOMC parameters should be selected?

Most of the parameters considered in this study provided low *TMATRE, in the range of less than 0.1–0.3 times the historical statistical parameters (Figure 6(a–k)), and R2 > 0.9 (Figure 7(a–l)). This indicates that the model and optimal parameters could realistically describe the variation of historical daily rainfall data. An unacceptably high *TMATRE was obtained for MMinDR (Figure 6(l)), where *TMATRE was about 3–6 times for w1s of 0.8–1.0, 8–10 times for w1s of 0.6–0.7, and about 16 times for w1 of 0.5. This implies that if MMinDR is neglected, the compromise solution with w1 of 0.5 would generally be the desired solution. Otherwise, the compromise solutions with w1s of 0.8–1.0 would be appropriate.

4.2. Performance of PDFs

The performance of the three PDFs, as indicated by their Eigenvalues, is presented for different values of w1s (Table 3); the importance of the second objective (w2) decreases as w1 increases. The results are consistent for all w1s: the poorest performance was provided by the SE, and the G-2 showed the best performance.

Table 3. Performance ranking of three distributions under variation of w1s
Eigenvalues by criteria weight of the first objective (w1s)
PDF    1.0      0.9      0.8      0.7      0.6      0.5
SE     0.3416   0.3357   0.3345   0.3361   0.3360   0.3369
E      0.3391   0.3331   0.3334   0.3329   0.3331   0.3325
G-2    0.3192   0.3313   0.3321   0.3310   0.3308   0.3306

Relative differences between generated and historical sequences in the yearly sum of the 12 monthly statistical parameters over the 38 years, for G-2 with w1 of 0.8, were interpolated using inverse distance weighting (IDW) and are depicted in GIS maps (Figure 8). Tolerably small differences were noted (percentage difference ranged between 1 and 30% with an average of about 2%) for all statistical parameters except MMinDR, for which the generated yearly sum (17.8 ± 56 mm) overestimated the historical average (3.3 ± 6.4 mm). Statistical parameters of generated data tended to be underestimated for MMaxSLWD, MMaxSLDD, MMaxDR, MDRSk, MNWD and MANWD (Figure 8(a–c, g–i)), whilst the others were overestimated. The lower middle part of the Northeast region showed a larger deviation of MMinDR than the other regions (Figure 8(l)). This is because the historical amount of rainfall in the dry state was lower than 0.1 mm (e.g. an MMinDR of 0.067 mm at rainfall station ID 14033). The Northeast region also showed larger differences in MMaxSLWD and MMaxSLDD, between 5 and 30%, probably because this region receives lower and more inconsistent rainfall than the other regions. This could explain the larger deviations, as the model loses accuracy particularly at low rainfall depths. In the North, overestimation of the generated MSDR and MNWD was noted (14% max) (Figure 8(d)). In the Central region, the highest error was found in MMaxDR (20% max) (Figure 8(c)), with an average difference of about 80 mm year−1 in the yearly sum of MMaxDR. In the Southern region, the largest differences were observed at some rainfall stations for MDRStd and MDRSk (17% max) (Figure 8(f) and (g)). The model appears to apply well to the East region, with very small deviation from historical data for all statistical parameters.

Figure 8. Maps of relative differences between yearly statistical parameters of generated and historical daily rainfall (all plots are for the G-2 distribution with w1 of 0.8): (a) MMaxSLWD (days), (b) MMaxSLDD (days), (c) MMaxDR (mm), (d) MSDR (mm), (e) MADR (mm), (f) MDRStd, (g) MDRSk, (h) MNWD (days), (i) MANWD (days), (j) MNDD (days), (k) MANDD (days) and (l) MMinDR (mm). The relative difference is the difference between generated and historical data divided by the historical data.
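
The two calculations behind maps of this kind can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names are hypothetical, the sign convention (generated minus historical) is assumed from the caption, and the IDW power of 2 and the planar distance on coordinates are assumptions, since the text does not state the interpolation settings.

```python
import math


def relative_difference(generated, historical):
    """Relative difference: (generated - historical) / historical.

    The sign convention is an assumption; the caption only says the
    difference is divided by the historical value.
    """
    if historical == 0:
        raise ValueError("historical value must be non-zero")
    return (generated - historical) / historical


def idw(stations, values, query, power=2.0):
    """Inverse distance weighted estimate at `query`.

    stations : list of (x, y) station coordinates
    values   : value observed at each station (e.g. a relative difference)
    power    : distance exponent (2 is a common default, assumed here)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(stations, values):
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return value  # the query point coincides with a station
        weight = distance ** -power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

For instance, `relative_difference(17.8, 3.3)` evaluates the MMinDR figures quoted above, and `idw` can then spread station values of that kind over a grid of map points.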

5. Conclusion

Using a dataset of 38 years of daily rainfall from 44 rainfall stations located in five regions of Thailand, daily rainfall occurrence and amounts were modelled with SMFOMC. The optimal SMFOMC parameters were obtained by integrating MODE with SMFOMC and searching over specified intervals of the Markov Chain model parameters. Two conflicting objectives, minimization of TMATRE and minimization of NS, were considered with three decision variables: SD, PPL and NS. The proposed model reproduced the characteristics of the original daily rainfall with acceptable seasonality and accuracy. The model was examined in detail through its sensitivity to the weight of importance assigned to objective one (w1). Lower values of w1 adversely affected MMaxDR, MMinDR, MADR, MSDR, MDRStd and MDRSk, whereas higher values offered higher accuracy and acceptability. The model failed to describe MMinDR. This is a particular difficulty in stochastic rainfall simulation when the period of available historical data is short and extreme rainfall events are too rare to model (Régnière and St-Amant, 2007). Nevertheless, the optimal SMFOMC parameters obtained for w1 of 0.8–1.0 (Table 4) are recommended as appropriate solutions that could potentially be applied for Thailand. The study also assessed the performance of the PDFs applied in the research. The shifted exponential, exponential and two-parameter gamma distributions were all found to be generally adequate for describing daily rainfall amounts in the last state. Although no significant difference was found among the three PDFs, the performance ranking indicated a higher potential for the two-parameter gamma distribution than for the others. Moreover, TPMs remain useful for SMFOMC. For daily rainfall amounts below 60–70% of the maximum daily rainfall, the TPMs allowed the model to fit the historical data well, and the remaining amounts were generated by the PDFs (see the optimal PPL for w1 in the range 0.8–1.0 in Table 4, where the average PPL is in the range 0.3–0.4).
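
The division of labour between TPMs and PDFs described above can be illustrated with a short sketch. It is a simplified illustration under stated assumptions rather than the model as implemented in the study: the function and variable names are hypothetical, dry days are given zero rainfall, amounts in intermediate states are drawn uniformly within their class limits, and the last state uses a gamma-distributed excess above its lower limit. The transition probabilities and class limits are made-up values.

```python
import random


def generate_rainfall(tpm, class_limits, last_state_sampler, n_days, rng, start_state=0):
    """Generate daily rainfall (mm) from a multi-state first-order Markov chain.

    tpm[i][j]        probability of moving from state i today to state j tomorrow
    class_limits[k]  (lower, upper) rainfall limits in mm for intermediate state k
                     (entries for state 0 and the last state are not used here)
    last_state_sampler(rng)  returns an amount for the open-ended last state
    """
    n_states = len(tpm)
    last = n_states - 1
    state = start_state
    series = []
    for _ in range(n_days):
        # The TPM row of the current state decides the next state.
        state = rng.choices(range(n_states), weights=tpm[state])[0]
        if state == 0:
            amount = 0.0  # simplification: dry days carry no rainfall
        elif state == last:
            amount = last_state_sampler(rng)  # the PDF handles the upper tail
        else:
            lower, upper = class_limits[state]
            amount = rng.uniform(lower, upper)  # assumption: uniform within a class
        series.append(amount)
    return series


# Hypothetical three-state example: dry, 0.1-20 mm, and above 20 mm.
demo_tpm = [
    [0.70, 0.25, 0.05],
    [0.40, 0.45, 0.15],
    [0.30, 0.40, 0.30],
]
demo_limits = [(0.0, 0.1), (0.1, 20.0), (20.0, None)]


def demo_last_state(rng):
    # Assumed form: gamma-distributed excess above the last class's lower limit.
    return 20.0 + rng.gammavariate(2.0, 10.0)


demo_series = generate_rainfall(demo_tpm, demo_limits, demo_last_state, 365, random.Random(42))
```

In this arrangement a larger NS or a smaller PPL leaves less of the rainfall range to the last-state PDF, which is consistent with the trend of PPL against w1 in Table 4.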

Table 4. Average optimal number of states, preserved proportion in the last state, and state divider for each criteria weight of objective 1 (w1)

SMFOMC parameter: Number of states
w1      Minimum   Maximum   Average ± STD
0.10    3         3         3 ± 0.000
0.20    3         4         4 ± 0.387
0.30    4         5         4 ± 0.506
0.40    6         7         7 ± 0.477
0.50    9         13        10 ± 0.991
0.60    11        18        14 ± 1.719
0.70    13        36        20 ± 7.632
0.80    15        42        29 ± 8.080
0.90    28        49        37 ± 4.937
1.00    33        68        49 ± 8.342

SMFOMC parameter: Preserved proportion in the last state
w1      Minimum   Maximum   Average ± STD
0.10    0.4954    0.5000    0.499 ± 0.001
0.20    0.4770    0.5000    0.496 ± 0.005
0.30    0.4776    0.5000    0.497 ± 0.005
0.40    0.4824    0.5000    0.498 ± 0.003
0.50    0.4436    0.5000    0.493 ± 0.010
0.60    0.4174    0.4999    0.487 ± 0.020
0.70    0.3232    0.4999    0.459 ± 0.043
0.80    0.2498    0.4994    0.410 ± 0.066
0.90    0.1645    0.4956    0.361 ± 0.079
1.00    0.1676    0.4890    0.302 ± 0.081

SMFOMC parameter: State divider
w1      Minimum   Maximum   Average ± STD
0.10    1.4       1.6       1.5 ± 0.062
0.20    1.5       1.6       1.5 ± 0.039
0.30    1.5       1.6       1.5 ± 0.049
0.40    1.5       1.6       1.5 ± 0.042
0.50    1.5       1.6       1.5 ± 0.049
0.60    1.5       1.6       1.5 ± 0.039
0.70    1.1       1.6       1.4 ± 0.170
0.80    1.1       1.5       1.2 ± 0.155
0.90    1.1       1.2       1.1 ± 0.046
1.00    1.1       1.2       1.1 ± 0.029

Abbreviations

  • ARIMA = Multiplicative autoregressive integrated moving average

  • CDF = Cumulative probability distribution function

  • CP = Compromise programming

  • CR = Crossover constant

  • DEA = Differential evolution algorithm

  • EAs = Evolutionary algorithms

  • E = Exponential distribution

  • G-2 = Two-parameter gamma distribution

  • IDW = Inverse distance weighting

  • MATRE = Monthly absolute relative error

  • MADR = Monthly average daily rainfall

  • MANDD = Monthly average number of dry days

  • MANWD = Monthly average number of wet days

  • MDRSk = Monthly daily rainfall skewness

  • MDRStd = Monthly daily rainfall standard deviation

  • MMaxSLDD = Monthly maximum spell length of dry days

  • MMaxSLWD = Monthly maximum spell length of wet days

  • MMaxDR = Monthly maximum daily rainfall

  • MMinDR = Monthly minimum daily rainfall

  • MNDD = Monthly number of dry days

  • MNWD = Monthly number of wet days

  • MSDR = Monthly sum of daily rainfall

  • MODE = Multi-objective differential evolution

  • MOPs = Multi-objective optimization problems

  • MLM = Maximum-Likelihood method

  • NDSs = Non-dominated solutions

  • NDSA = Non-dominated sorting algorithms

  • NS = The number of states

  • PDFs = Probability distribution functions

  • POL = Pareto-optimal line

  • PPL = Preserved proportion in the last state

  • RCLT = Rainfall class limits table

  • SD = State divider

  • SE = Shifted exponential distribution

  • SFOMC = Stochastic First-Order Markov Chain

  • SMFOMC = Stochastic Multi-States First-Order Markov Chain

  • TMATRE = Total monthly absolute total relative error

  • *TMATRE = Normalized monthly absolute total relative error for each statistical parameter

  • TPMs = Transition probability matrices

Table 5. Geographic details of rainfall stations and the maximum number of states under varying state dividers

Columns 1.1 to 2.0 give the maximum number of states for that state divider.

Region      Station code  ID  Province          Latitude    Longitude    1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8  1.9  2.0
Central     04 361        1   Chainat           15°09′57″   100°11′32″   67   36   25   20   17   15   13   12   11   10
            19 351        2   Lop Buri          15°20′21″   101°22′30″   64   34   24   19   16   14   13   12   11   10
            31 022        3   Nonthaburi        13°54′38″   100°30′09″   70   37   26   21   18   15   14   13   12   11
            32 012        4   Pathumthani       14°01′05″   100°32′12″   69   37   26   21   17   15   14   13   12   11
            54 012        5   Saraburi          14°31′35″   100°54′51″   64   34   24   19   16   14   13   12   11   10
            56 012        6   Singburi          14°53′12″   100°24′29″   66   35   25   20   17   15   13   12   11   10
            60 013        7   Suphanburi        14°28′10″   100°07′14″   68   36   26   20   17   15   13   12   11   11
East        03 231        8   Chachoengsao      13°28′29″   101°37′44″   64   34   24   19   16   14   13   12   11   10
            06 121        9   Chanthaburi       12°47′23″   102°15′33″   70   37   26   21   18   15   14   13   12   11
            9160          10  Chonburi          13°12′04″   100°57′59″   68   36   26   20   17   15   14   12   11   11
            44 191        11  Prachinburi       14°10′37″   101°47′30″   70   37   26   21   18   15   14   13   12   11
            48 141        12  Rayong            12°55′41″   101°19′30″   64   34   24   19   16   14   13   12   11   10
            66 071        13  Trat              12°28′28″   102°28′52″   79   42   30   23   20   17   15   14   13   12
North       07 391        14  Chiangmai         18°47′21″   99°01′01″    63   34   24   19   16   14   13   11   11   10
            16 151        15  Lampang           18°08′09″   99°34′53″    71   38   27   21   18   16   14   13   12   11
            16 181        16  Lampang           18°48′12″   99°38′45″    64   34   24   19   16   14   13   12   11   10
            17 081        17  Lampang           17°53′16″   99°05′20″    67   36   25   20   17   15   13   12   11   11
            20 111        18  Maehongson        19°16′10″   97°56′55″    66   35   25   20   17   15   13   12   11   10
            28 111        19  Nan               18°34′05″   100°52′28″   72   38   27   21   18   16   14   13   12   11
            40 151        20  Phrae             18°08′44″   100°08′42″   68   36   26   20   17   15   13   12   11   11
            63 181        21  Tak               16°45′44″   98°45′14″    64   34   24   19   16   14   13   12   11   10
North East  05 200        22  Chaiyaphum        15°46′07″   101°49′03″   65   35   25   20   16   14   13   12   11   10
            21 012        23  Mahasarakham      16°21′58″   103°18′17″   68   36   26   20   17   15   14   12   11   11
            25 511        24  Nakhonratchasima  14°35′20″   101°50′30″   63   34   24   19   16   14   13   11   11   10
            25 550        25  Nakhonratchasima  14°50′47″   101°42′15″   63   34   24   19   16   14   13   11   11   10
            50 150        26  Sakonnakhon       17°13′43″   103°33′08″   67   36   25   20   17   15   13   12   11   11
            50 160        27  Sakonnakhon       17°14′51″   103°34′16″   68   36   26   20   17   15   13   12   11   11
            50 170        28  Sakonnakhon       17°13′02″   104°02′14″   70   37   26   21   18   15   14   13   12   11
            50 180        29  Sakonnakhon       17°12′57″   103°57′24″   70   37   26   21   18   15   14   13   12   11
            57 161        30  Sisaket           14°29′48″   104°03′29″   66   35   25   20   17   15   13   12   11   10
            62 120        31  Surin             14°48′48″   103°29′50″   64   34   24   19   16   14   13   12   11   10
            67 220        32  Ubonratchathani   15°14′17″   104°51′01″   67   36   25   20   17   15   13   12   11   11
            38 1201       33  Khonkaen          16°26′00″   102°50′00″   68   36   26   20   17   15   14   12   11   11
            38 1301       34  Khonkaen          16°20′00″   102°49′00″   68   36   26   20   17   15   14   12   11   11
            14 033        35  Khonkaen          15°48′52″   102°36′12″   69   37   26   21   17   15   14   13   12   11
            14 472        36  Khonkaen          15°57′00″   102°33′00″   67   36   25   20   17   15   13   12   11   11
South       10 111        37  Chumphon          10°37′18″   99°03′39″    72   38   27   21   18   16   14   13   12   11
            15 012        38  Krabi             8°03′15″    98°55′17″    67   36   25   20   17   15   13   12   11   11
            34 012        39  Phangnga          8°27′35″    98°31′54″    69   37   26   21   17   15   14   12   12   11
            43 013        40  Phuket            7°53′18″    98°23′14″    67   36   25   20   17   15   13   12   11   11
            45 181        41  Prachuapkirikhan  12°06′55″   99°44′20″    72   38   27   21   18   16   14   13   12   11
            46 013        42  Ranong            9°57′55″    98°08′12″    71   38   27   21   18   16   14   13   12   11
            58 221        43  Songkhla          6°37′59″    100°23′46″   67   36   25   20   17   15   13   12   11   11
            61 341        44  Suratthani        9°25′31″    99°09′44″    75   40   28   22   19   16   15   13   12   12

References

  • Ahmad S, Khan IH, Parida BP. 2001. Performance of stochastic approaches for forecasting river water quality. Water Resources 35(18): 4261–4266.
  • Aksoy H. 2003. Markov chain-based modeling techniques for stochastic generation of daily intermittent stream flows. Advances in Water Resources 26(6): 663–671.
  • Allen DM, Haan CT. 1975. Stochastic simulation of daily rainfall. Research Report No. 82, Water Resources Institute, University of Kentucky: Lexington, KY.
  • Amini A, Ali TM, Ghazali AHB, Huat BK. 2009. Adjustment of peak streamflows of a tropical river for urbanization. American Journal of Environmental Sciences 5(3): 285–294.
  • Chapman T. 1998. Stochastic modelling of daily rainfall: the impact of adjoining wet days on the distribution of rainfall amounts. Environmental Modelling and Software 13: 317–324.
  • Coe R, Stern RD. 1982. Fitting models to daily rainfall data. Journal of Applied Meteorology 21: 1024–1031.
  • Coello CAC, Veldhuizen DAV, Lamont GB. 2002. Evolutionary Algorithms for Solving Multi-Objective Problems. Kluwer Academic Publishers: New York, NY.
  • Deb K. 2001. Multi-Objective Optimization Using Evolutionary Algorithms. John-Wiley & Sons: Chichester.
  • Delleur JW, Kavvas ML. 1978. Stochastic models for monthly rainfall forecasting and synthetic generation. Journal of Applied Meteorology 17: 36.
  • Duan J, Sikka AK, Grant GE. 1995. A comparison of stochastic models for generating daily precipitation at the H.J. Andrews experimental forest. Northwest Science 69(4): 318–329.
  • Gabriel KR, Neumann J. 1962. A Markov chain model for daily rainfall occurrence at Tel Aviv. Quarterly Journal of the Royal Meteorological Society 88(375): 90–95.
  • Haan CT. 1977. Statistical Methods in Hydrology, 1st edn. The Iowa State Press: Iowa, IA.
  • Haan CT, Allen DM, Street JO. 1976. A Markov chain model of daily rainfall. Water Resources Research 12(3): 443–449.
  • Haddada B, Adanea A, Mesnard F, Sauvageot H. 2000. Modeling anomalous radar propagation using first-order two-state Markov chains. Atmospheric Research 52(4): 283–292.
  • Hutchinson MF. 1990. A point rainfall model based on a three-state continuous Markov occurrence process. Journal of Hydrology 114: 125–148.
  • Jimoh OD, Webster P. 1996. Optimum order of Markov chain for daily rainfall in Nigeria. Journal of Hydrology 185: 45–69.
  • Khanal NN, Hamrick RL. 1974. A stochastic model for daily rainfall data synthesis. Proceedings, Symposium on Statistical Hydrology, Vol. 1275. United States Department of Agriculture: Phoenix, AZ; 197–210.
  • Kottegoda NT, Natale L, Raiteri E. 2000. Statistical modelling of daily streamflows using rainfall input and curve number technique. Journal of Hydrology 234(3–4): 170–186.
  • Kulkarni MK, Kandalgaonkar SS, Tinmaker MIR, Nath A. 2002. Markov chain models for pre-monsoon season thunderstorms over Pune. International Journal of Climatology 22(11): 1415–1420.
  • Li H, Zhang Q. 2008. Multiobjective optimization problems with complicated pareto sets, MOEA/D and NSGA-II. IEEE Transactions on Evolutionary Computation 12(2): 284–302.
  • Moon SE, Ryoo SB, Kwon JG. 2006. A Markov chain model for daily precipitation occurrence in South Korea. International Journal of Climatology 14(9): 1009–1016.
  • Mujumdar PP, Kumar DN. 1990. Stochastic models of stream flow: some case studies. Journal of Hydrological Sciences 35(4): 395–410.
  • Ochola WO, Kerkides P. 2003. A Markov chain simulation model for predicting critical wet and dry spells in Kenya: analysing rainfall events in the Kano plains. Irrigation and Drainage 52(4): 327–342.
  • Rao AR, Hamed KH. 2001. Flood Frequency Analysis. CRC Press: New York, NY.
  • Reddy PJ. 1997. Stochastic Hydrology, 2nd edn. Laxmi Publications: New Delhi.
  • Reddy MJ, Kumar DN. 2007. Multi-objective differential evolution with application to reservoir system optimization. Journal of Computing in Civil Engineering, ASCE 21(2): 136–146.
  • Régnière J, St-Amant R. 2007. Stochastic simulation of daily air temperature and precipitation from monthly normals in North American north of Mexico. International Journal of Biometeorology 51: 415–430.
  • Richardson CW, Wright DA. 1984. WGEN: A Model for Generating Daily Weather Variables, ARS-8. United States Department of Agriculture, Agricultural Research Service: Springfield, VA. http://soilphysics.okstate.edu/software/cmls/WGEN.pdf (accessed 1 May 2009).
  • Romero C, Rehman T. 1989. Multiple Criteria Analysis for Agricultural Decisions. Elsevier Science Inc.: New York, NY.
  • Salas JD. 1992. Analysis and modeling of hydrologic time series. In Handbook of Hydrology, Maidment DR (ed.). McGraw-Hill: New York, NY; 19.1–19.72.
  • Schreider SY, Jakeman AJ, Dyer BG, Francis RI. 1997. Combined deterministic and self-adaptive stochastic algorithm for streamflow forecasting with application to catchments of the Upper Murray Basin, Australia. Journal of Environmental Modelling and Software 12(1): 93–104.
  • Sharma TC. 1985. Stochastic characteristics of rainfall-runoff processes in Zambia. Journal of Hydrological Sciences 30(4): 497–512.
  • Srikanthan R, McMahon TA. 1982. Stochastic generation of daily rainfall at twelve Australian stations. Agricultural Engineering Research Report No. 57/82, University of Melbourne: Melbourne.
  • Srikanthan R, McMahon TA. 1984. Synthesizing daily rainfall and evaporation data as input to water balance-crop growth models. Journal of Australian Institute of Agricultural Sciences 50: 51–54.
  • Srikanthan R, McMahon TA. 2001. Stochastic generation of annual, monthly and daily climate data: a review. Hydrology and Earth System Sciences 5(4): 653–670.
  • Storn R, Price K. 1997. Differential evolution-a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 11: 341–359.
  • Suhaila J, Jemain AA. 2007. Fitting daily rainfall amount in Peninsular Malaysia using several types of exponential distributions. Journal of Applied Sciences Research 3(10): 1027–1036.
  • Sun J, Zhang Q, Tsang E. 2005. DE/EDA: new evolutionary algorithm for global optimisation. Information Sciences 169: 249–262.
  • Taewichit C, Chittaladakorn S. 2007. Hydrologic time series data modeling using multiplicative ARIMA. Hydrological Science 11: 81–94.
  • Todorovic P, Woolhiser DA. 1975. A stochastic model of n-day precipitation. Journal of Applied Meteorology 14: 17–24.
  • Toth E, Montanari A, Brath A. 1999. Real-time flood forecasting via combined use of conceptual and stochastic models. Journal of Physics and Chemistry of the Earth, Part B 24(7): 793–798.
  • Tsakeris G. 1988. Stochastic modelling of rainfall occurrences in continuous time. Journal of Hydrological Sciences 33(5): 437–447.
  • Vogel RM. 1988. The value of stochastic stream flow models in overyear reservoir design applications. Journal of Water Resources Research 24(9): 1483–1490.
  • Xue F. 2003. Multi-objective differential evolution and its application to enterprise planning. In Proceedings, IEEE International Conference on Robotics and Automation (ICRA'03), Vol. 3. IEEE Press: Taipei; 3535–3541.
  • Zeleny M. 1982. Multiple Criteria Decision Making. McGraw-Hill: New York, NY.