
Regional Flood Frequency Analysis

It has long been recognized that many annual flood series are too short to allow reliable estimation of extreme events. The difficulties relate both to identifying the appropriate statistical distribution for describing the data and to estimating the parameters of the selected distribution. Distributions with three or more parameters provide high flexibility for fitting the data. The bias of quantile estimators from 3-parameter distributions is generally small and is usually ignored even for small samples, but the standard deviation of such estimates may be unacceptably large. On the other hand, 2-parameter distributions (EV1, LN) lead to reasonably small standard errors of estimates, but the estimates may be highly biased. Regionalization provides a means of coping with this problem by assisting in the identification of the shape of potential parent distributions, leaving only a measure of scale to be estimated from the at-site data. Although regionalization is generally recognized as a powerful means of improving flood estimates, research in regional flood frequency analysis is hampered by the unwillingness of researchers to deal with problems that cannot be treated with mathematical rigour. In fact, regional flood frequency analysis calls for assumptions, tests, and methods of a somewhat ad hoc nature. It is generally difficult to assess or compare the performance of regional estimation methods, because the degree to which the implied assumptions are valid is hard to measure or quantify in practice. This, however, should challenge rather than discourage hydrologists.

At present, the direct regression method and the index flood method are the most widely used regional flood frequency procedures. While the former is extensively used in the U.S., the latter appears to be attracting increasing interest among researchers. A fundamental assumption of the index flood method is that data at different sites in a region follow the same distribution except for scale. Cunnane [1988] concluded that the index flood method with a regional Wakeby distribution is the best available regional procedure, while Potter and Lettenmaier [1990] found that better results could be obtained with a regional GEV distribution. Index flood methods are generally found to perform better than the method recommended in Bulletin 17B by the U.S. Water Resources Council [Potter and Lettenmaier, 1990].
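The index flood assumption can be made concrete with a minimal Python sketch, in which a dimensionless regional growth curve is scaled by the at-site mean flood. Purely for illustration, it assumes an EV1 (Gumbel) growth curve fitted by the method of moments to an unweighted regional average coefficient of variation; the studies cited above used Wakeby and GEV distributions instead. The function names, the data layout, and the flood values are all hypothetical.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def regional_cv(sites):
    """Unweighted average of the at-site coefficients of variation.

    `sites` maps a site name to its annual maximum flood series (assumed layout).
    """
    return mean(stdev(series) / mean(series) for series in sites.values())


def gumbel_growth_factor(cv, return_period):
    """Dimensionless EV1 quantile q(T) for a population with mean 1 and the given CV."""
    alpha = cv * math.sqrt(6.0) / math.pi   # scale parameter from the standard deviation
    xi = 1.0 - EULER_GAMMA * alpha          # location parameter that gives a mean of 1
    non_exceedance = 1.0 - 1.0 / return_period
    return xi - alpha * math.log(-math.log(non_exceedance))


def index_flood_quantile(at_site_series, cv, return_period):
    """T-year flood = index flood (here the at-site mean) times the regional growth factor."""
    return mean(at_site_series) * gumbel_growth_factor(cv, return_period)


# Illustrative data only, not taken from any study.
sites = {
    "A": [120.0, 95.0, 180.0, 140.0, 110.0, 160.0],
    "B": [40.0, 55.0, 35.0, 70.0, 48.0, 52.0],
    "C": [900.0, 650.0, 1200.0, 800.0, 1000.0, 750.0],
}
cv = regional_cv(sites)
q100_at_a = index_flood_quantile(sites["A"], cv, return_period=100)
```

Only the index flood (the at-site mean) is estimated from the records at site A; the shape of the frequency curve comes entirely from the region.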

Regional flood frequency analysis involves two major steps: (1) grouping of sites into homogeneous regions, and (2) regional estimation of flood quantiles at the site of interest. The performance of any regional estimation method depends strongly on the grouping of sites into homogeneous regions. Geographically contiguous regions have long been used in hydrology, but have been criticized as arbitrary. In fact, geographical proximity does not guarantee hydrological similarity. During the last five to ten years, researchers have attempted to develop methods in which similarity between sites is defined in a multidimensional space of catchment-related characteristics or statistical characteristics.

A significant contribution to solving the delineation issue is the region-of-influence approach, developed by Burn [1990a,b] and Zrinji and Burn [1993, 1994]. This method dispenses completely with the classical notion of regions in that each site is allowed to have its own region. The site of interest is located at the centre of gravity in a space of relevant flood and/or catchment characteristics, each weighted according to its relevance. The method also involves the choice of a distance threshold [Burn, 1990a,b]; only sites whose distance to the target site (in the weighted attribute space) does not exceed this threshold are included in the region of influence. Zrinji and Burn [1993, 1994] replaced the somewhat subjective choice of threshold with a statistical test in which sites are successively added to the region until the hypothesis of complete homogeneity is rejected. An advantage of the region-of-influence method is that, in the estimation of a regional growth curve, each site can be weighted according to its proximity to the site of interest.
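The distance-threshold step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure of Burn or of Zrinji and Burn: the attributes are assumed to be standardized already, the distance is a weighted Euclidean distance, the threshold is assumed positive, and the linear decay used to turn distances into site weights is a hypothetical choice. All names and numbers are invented for the example.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two attribute vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def region_of_influence(target, candidates, weights, threshold):
    """Return {site: distance} for candidate sites within the threshold of the target."""
    region = {}
    for name, attributes in candidates.items():
        distance = weighted_distance(target, attributes, weights)
        if distance <= threshold:
            region[name] = distance
    return region


def proximity_weights(region, threshold):
    """Normalized site weights that decay linearly with distance (an assumed form)."""
    raw = {name: 1.0 - d / threshold for name, d in region.items()}
    total = sum(raw.values())
    if total == 0.0:  # every site sits exactly on the threshold: fall back to equal weights
        return {name: 1.0 / len(region) for name in region}
    return {name: value / total for name, value in raw.items()}


# Hypothetical standardized attributes (for example, log area and mean annual rainfall).
target = (0.0, 0.0)
candidates = {"A": (1.0, 0.0), "B": (0.0, 2.0), "C": (3.0, 4.0)}
region = region_of_influence(target, candidates, weights=(1.0, 1.0), threshold=2.5)
site_weights = proximity_weights(region, threshold=2.5)
```

Here site C lies at distance 5 from the target and is excluded, while A (distance 1) receives a larger weight than B (distance 2) when the regional growth curve is estimated.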

Cavadias [1990] proposed another method for delineating homogeneous regions based on canonical correlations. The approach is mathematically different from the region of influence method, but it is based on the same concept of neighborhood. Groups of sites with similar flood response were visually identified in a space of two hydrologic canonical variables and then subsequently interpreted in a space of two physiographical canonical variables. The flood and catchment variables must be carefully selected and weighted. An ungauged (or gauged) site can then be assigned to a region based on catchment characteristics alone. The problem with this method is that the pattern recognition is based on a subjective visual judgment, and that there is no guarantee that a pattern can be found. Canonical catchment variables were also used by Burn and Boorman [1993] for discriminating between clusters.

Nathan and McMahon [1990] used cluster analysis to group sites into homogeneous regions. Their work is notable for its emphasis on, and thorough discussion of, the selection and weighting of attribute variables, an issue that deserves much more investigation. In fact, the selection and weighting of variables is one of the problems for which no strict mathematical solution is available, but the use of common sense can lead to quite acceptable results.

Gabriele and Arnell [1991] used Monte Carlo simulation to examine the performance of the hierarchical approach to grouping sites, an idea advanced by Fiorentino et al. [1987]. The method stems from the recognition that the higher the order of a regional moment to be estimated, the greater the number of sites needed to produce an estimate with a given degree of reliability. Hence, the coefficient of skewness (or parameters related to it) should be estimated in a large region, while the coefficient of variation (or parameters related to it) should be estimated in sub-regions. The same method was also used by Ribeiro-Correa and Rousselle [1993]. Although the hierarchical approach is intuitively appealing, it is questionable whether it is feasible in practice, since not one but two sets of regions must be determined. Moreover, the use of regional probability weighted moments to some extent overcomes the difficulties of estimating higher order moments.

The delineation of homogeneous regions is closely related to the identification of the common regional distributions that apply within each region. A region can only be considered homogeneous if sufficient evidence can be established that data at different sites in the region are drawn from the same parent distribution (except for the scale parameter). L-moment ratio diagrams have become popular tools for regional distribution identification, and for testing for outlier stations. Hosking and Wallis [1993] developed several tests for use in regional studies. They gave guidelines for judging the degree of homogeneity of a group of sites, and for choosing and estimating a regional distribution. L-moment diagrams as a tool for identifying a regional distribution have been used in numerous other studies, including Chowdhury et al. [1989], Pilon and Adamowski [1992], Vogel and Fennessey [1993], and Vogel et al. [1993a,b]. An alternative test for homogeneity based on estimated dimensionless 10-year floods was developed by Lu and Stedinger [1992a]. Chowdhury et al. [1989] compared several goodness-of-fit tests for the regional GEV distribution and found that a new chi-square test based on the L-coefficient of variation and the L-coefficient of skewness outperformed other classical tests.
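The coordinates plotted in an L-moment ratio diagram can be computed from the unbiased sample probability weighted moments. The Python sketch below uses the standard relations between the first four PWMs and L-moments; the function name is hypothetical, and the series is assumed to be positive, non-constant, and at least four values long.

```python
def sample_l_moment_ratios(data):
    """Return (l1, L-CV, L-skewness, L-kurtosis) from unbiased sample PWMs."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")
    b0 = b1 = b2 = b3 = 0.0
    for j, value in enumerate(x):  # j = 0 for the smallest observation
        b0 += value
        b1 += value * j / (n - 1)
        b2 += value * j * (j - 1) / ((n - 1) * (n - 2))
        b3 += value * j * (j - 1) * (j - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    l1 = b0                                        # L-location (the mean)
    l2 = 2.0 * b1 - b0                             # L-scale
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    l4 = 20.0 * b3 - 30.0 * b2 + 12.0 * b1 - b0
    return l1, l2 / l1, l3 / l2, l4 / l2


# A symmetric, evenly spaced sample has zero L-skewness.
l1, l_cv, l_skew, l_kurt = sample_l_moment_ratios([1.0, 2.0, 3.0, 4.0, 5.0])
```

Plotting the (L-skewness, L-kurtosis) pair of every site in a candidate region against the theoretical curves of candidate distributions gives the diagram; sites that plot far from the regional cloud are candidates for the outlier tests mentioned above.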

Regional growth curves estimated from averaged probability weighted moments have been widely used in recent studies. When computing the average probability weighted moments, it is tempting to weight each site according to its record length, but Jin and Stedinger [1989] and Stedinger et al. [1993] warn against this approach, which tends to bias the regional growth curve by degrading the spatial smoothing of population values.
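The two weighting choices differ only in the weights given to the sites, as the following Python sketch shows. The function name and the numbers are hypothetical, and the at-site statistic could be any dimensionless quantity, such as a scaled probability weighted moment or an L-moment ratio.

```python
def regional_average(at_site_values, record_lengths=None):
    """Average an at-site statistic across sites.

    With record_lengths=None every site gets equal weight; otherwise each
    site is weighted by its record length.
    """
    if record_lengths is None:
        record_lengths = [1.0] * len(at_site_values)
    total = sum(record_lengths)
    return sum(n * v for n, v in zip(record_lengths, at_site_values)) / total


# Hypothetical at-site L-CV values and record lengths in years.
l_cv = [0.20, 0.40]
years = [10, 30]
equal_weight = regional_average(l_cv)
length_weight = regional_average(l_cv, years)
```

With these numbers the equal-weight average is about 0.30 and the record-length-weighted average is about 0.35, so the site with the longest record dominates the regional value, which is the effect the cited authors caution against.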

The comparison of regional frequency procedures is a problem that deserves more attention. Monte Carlo simulation has been used in several studies to compare different estimation methods. It is important to design simulation experiments so that they adequately account for the regional heterogeneity always encountered in practice. Some regional estimation methods are sensitive to heterogeneity, while others are more robust to violation of the basic assumptions. Hence, the inclusion of a heterogeneity measure is essential for a fair comparison. Correlations between the data at different sites should also be incorporated in the simulations if they are likely to prevail in reality [Hosking and Wallis, 1988].
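A simulation experiment of this kind might be organized as in the Python sketch below, which compares an at-site estimator with a simple index flood estimator at a chosen level of heterogeneity. Everything about the design is an assumption made for illustration: EV1 parents, method-of-moments fitting, heterogeneity expressed as an even spread of the true coefficients of variation, and independent sites. Intersite correlation, which the text above says should be included where relevant, is deliberately left out to keep the example short.

```python
import math
import random
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def gumbel_quantile(mu, cv, return_period):
    """T-year quantile of an EV1 distribution with mean mu and coefficient of variation cv."""
    alpha = mu * cv * math.sqrt(6.0) / math.pi
    xi = mu - EULER_GAMMA * alpha
    return xi - alpha * math.log(-math.log(1.0 - 1.0 / return_period))


def gumbel_sample(rng, mu, cv, n):
    """Draw n values by inverting the EV1 distribution function."""
    alpha = mu * cv * math.sqrt(6.0) / math.pi
    xi = mu - EULER_GAMMA * alpha
    draws = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        draws.append(xi - alpha * math.log(-math.log(u)))
    return draws


def compare_estimators(heterogeneity, n_sites=10, n_years=20, mean_cv=0.4,
                       return_period=100, replicates=200, seed=1):
    """Return (at-site RMSE, regional RMSE) of the T-year flood, relative to the truth."""
    rng = random.Random(seed)
    # True CVs spread evenly over mean_cv * (1 +/- heterogeneity); all true means are 1.
    true_cvs = [mean_cv * (1.0 + heterogeneity * (2.0 * i / (n_sites - 1) - 1.0))
                for i in range(n_sites)]
    at_site_sq, regional_sq = [], []
    for _ in range(replicates):
        samples = [gumbel_sample(rng, 1.0, cv, n_years) for cv in true_cvs]
        cvs = [stdev(s) / mean(s) for s in samples]
        pooled_cv = mean(cvs)  # regional shape information shared by all sites
        for s, cv_hat, cv_true in zip(samples, cvs, true_cvs):
            truth = gumbel_quantile(1.0, cv_true, return_period)
            at_site = gumbel_quantile(mean(s), cv_hat, return_period)
            regional = gumbel_quantile(mean(s), pooled_cv, return_period)
            at_site_sq.append((at_site / truth - 1.0) ** 2)
            regional_sq.append((regional / truth - 1.0) ** 2)
    return math.sqrt(mean(at_site_sq)), math.sqrt(mean(regional_sq))


homogeneous = compare_estimators(heterogeneity=0.0)
heterogeneous = compare_estimators(heterogeneity=0.5)
```

Running the comparison over a range of `heterogeneity` values, rather than only at zero, is what allows the robustness of a regional method to be judged; the specific numbers returned depend on every assumption listed above.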






U.S. National Report to IUGG, 1991-1994
Rev. Geophys. Vol. 33 Suppl., © 1995 American Geophysical Union