Applications of statistics in flood frequency analysis
Estimation of the probability of occurrence of future flood events at one or more locations across a river system is frequently required for the design of bridges, culverts, spillways, dams and other engineering works. This study investigates some of the statistical aspects of estimating the flood frequency distribution at a single site and on a regional basis. It is demonstrated that the generalized logistic (GL) distribution has many properties well suited to the modelling of flood frequency data. The GL distribution performs better than the other commonly recommended flood frequency distributions in terms of several key properties. Specifically, it can reproduce almost the same degree of skewness as is typically present in observed flood data. It appears to be more robust to the presence of extreme outliers in the upper tail of the distribution. It has a relatively simple mathematical form, so all the well-known methods of parameter estimation can be implemented easily. It is shown that, when the method of probability weighted moments (PWM) is used with the conventionally recommended plotting position, relocating the annual maximum flood series substantially affects the estimate of the shape parameter of the generalized extreme value (GEV) distribution. A location-invariant plotting position is introduced for use in estimating, by the method of PWM, the parameters of the GEV and GL distributions. Tests based on empirical distribution function (EDF) statistics are proposed to assess the goodness of fit of the flood frequency distributions. A modified EDF test is derived that gives greater emphasis to the upper tail of a distribution, which is more important for flood frequency prediction. Significance points are derived for the GEV and GL distributions when the parameters are estimated from the sample data by the method of PWM. The critical points are considerably smaller than those for the case in which the parameters of a distribution are assumed to be specified.
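The sensitivity of the PWM shape estimate to relocation can be illustrated with a minimal sketch. Several choices here are assumptions and are not taken from this study: the conventional plotting position is taken to be p_i = (i - 0.35)/n; the GEV shape k is obtained from the sample L-skewness by Hosking's widely used approximation (sign convention: k < 0 gives a heavy upper tail); and the unbiased PWM estimator is used only as a stand-in for a location-invariant alternative, since the plotting position introduced in this study is not reproduced here. The flood values are invented for illustration.

```python
import math

def sample_pwms(data, offset=None):
    """Return sample PWMs b0, b1, b2 of the data.

    offset=None  -> unbiased estimator (weights (i-1)...(i-r) / ((n-1)...(n-r)))
    offset=-0.35 -> plotting-position estimator with p_i = (i + offset) / n
    """
    x = sorted(data)
    n = len(x)
    pwms = []
    for r in range(3):
        total = 0.0
        for i, xi in enumerate(x, start=1):
            if offset is None:
                weight = 1.0
                for j in range(1, r + 1):
                    weight *= (i - j) / (n - j)
            else:
                weight = ((i + offset) / n) ** r
            total += weight * xi
        pwms.append(total / n)
    return pwms

def l_skewness(pwms):
    """L-skewness t3 = l3 / l2 computed from the first three PWMs."""
    b0, b1, b2 = pwms
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l3 / l2

def gev_shape(t3):
    """Hosking's approximation for the GEV shape parameter k."""
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    return 7.8590 * c + 2.9554 * c * c

def gl_shape(t3):
    """GL shape parameter in Hosking's parameterization: k = -t3."""
    return -t3

# Hypothetical annual maximum series (m^3/s), invented for illustration.
floods = [112.0, 95.0, 143.0, 210.0, 87.0, 130.0, 176.0, 99.0, 155.0, 320.0]
shifted = [q + 100.0 for q in floods]  # same series relocated by a constant

for label, offset in (("plotting position", -0.35), ("unbiased", None)):
    k_orig = gev_shape(l_skewness(sample_pwms(floods, offset)))
    k_shift = gev_shape(l_skewness(sample_pwms(shifted, offset)))
    print(f"{label}: k = {k_orig:.4f}, after relocation k = {k_shift:.4f}")
```

Under these assumptions, the plotting-position estimate of k changes when a constant is added to the series, whereas the unbiased estimate does not, because the unbiased weights average exactly 1/(r + 1) and the added constant cancels in l2 and l3.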
Approximate formulae for these tests, covering the whole range of the distribution, are also developed; they can be used for the regional assessment of GEV and GL models based on all the annual maximum series in a hydrological region simultaneously. In order to pool at-site flood data across a region into a single series for regional analysis, the effect of standardization by the at-site mean on the estimation of the regional shape parameter of the GEV distribution is examined. Our simulation study, based on various synthetic regions, reveals that standardization by the at-site mean underestimates the shape parameter of the GEV distribution by about 30% of its true value and also contributes to the separation of skewness between observed and simulated floods. A two-parameter standardization by the at-site estimates of the location and scale parameters is proposed. It does not distort the shape of the flood frequency data in the pooling process. It therefore offers a significantly improved estimate of the shape parameter, allows data with heterogeneous coefficients of variation to be pooled, and helps to explain the separation of skewness effect. Regions based on the flood statistics L-CV and USKEW are derived for Scotland and North England. Only about 50% of the basins could be correctly identified as belonging to these regions by a set of seven catchment characteristics. The alternative approach of grouping basins solely on the basis of physical properties is therefore preferable. Six physically homogeneous groups of basins are identified by Ward's multivariate clustering algorithm using the same seven characteristics. These regions have hydrological homogeneity in addition to their physical homogeneity. Dimensionless regional flood frequency curves are produced by fitting GEV and GL distributions for each region. The GEV regional growth curves imply a larger return period for a flood of given magnitude. When floods are described by the GL model, the corresponding return periods are considerably smaller.
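The difference between the two standardizations can be sketched as follows. This is an illustration under stated assumptions, not the procedure of this study: the at-site location and scale are approximated by the sample mean and the sample L-scale (l2) rather than by fitted GEV or GL parameters, and the two sites are hypothetical, with site B constructed as a linear transformation of site A so that both have the same shape but different coefficients of variation.

```python
def location_and_scale(data):
    """Sample mean and L-scale (l2 = 2*b1 - b0), used here as stand-ins
    for at-site estimates of location and scale (an assumption)."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    # Unbiased b1: weight (i - 1) / (n - 1) for the i-th smallest value.
    b1 = sum((i / (n - 1)) * xi for i, xi in enumerate(x)) / n
    return b0, 2.0 * b1 - b0

def standardize_by_mean(data):
    """One-parameter standardization: divide by the at-site mean."""
    mean = sum(data) / len(data)
    return [x / mean for x in data]

def standardize_two_parameter(data):
    """Two-parameter standardization: subtract location, divide by scale."""
    loc, scale = location_and_scale(data)
    return [(x - loc) / scale for x in data]

# Hypothetical annual maxima at site A (invented for illustration).
site_a = [112.0, 95.0, 143.0, 210.0, 87.0, 130.0, 176.0, 99.0, 155.0, 320.0]
# Site B: same shape, different location and scale, hence a different CV.
site_b = [50.0 + 0.4 * q for q in site_a]

mean_a, mean_b = standardize_by_mean(site_a), standardize_by_mean(site_b)
two_a, two_b = standardize_two_parameter(site_a), standardize_two_parameter(site_b)

# Pooled regional series built from the two-parameter standardized data.
pooled = sorted(two_a + two_b)

print("largest flood, mean standardization:", max(mean_a), max(mean_b))
print("largest flood, two-parameter standardization:", max(two_a), max(two_b))
```

In this constructed case the two-parameter standardization maps both sites onto the same dimensionless series, so pooling them does not alter the shape, whereas division by the mean leaves the sites with different spreads before pooling.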