Title:

New genetic programming methods for rainfall prediction and rainfall derivatives pricing

Rainfall derivatives are part of the broader family of weather derivatives, in which an underlying weather variable, in our case rainfall, determines the value of the derivative. These financial contracts are still in their infancy, having been traded on the Chicago Mercantile Exchange (CME) only since 2011. Such contracts are very useful for investors or trading firms who wish to hedge against the direct or indirect adverse effects of rainfall. The first crucial problem addressed in this thesis is the prediction of the level of rainfall. Two techniques are routinely used for this purpose. The first, and most commonly used, is a Markov chain extended with rainfall prediction. The second is the Poisson-cluster model. Both techniques have weaknesses in their predictive power on rainfall data. More specifically, a large number of the rainfall pathways obtained from these techniques are not representative of future rainfall levels. Additionally, the predictions are heavily influenced by prior information, so that predicted future rainfall levels tend towards the average of previously observed values. This motivates us to develop a new algorithm for the problem domain, based on Genetic Programming (GP), to improve the prediction of the underlying variable, rainfall. GP is capable of producing white-box (interpretable, as opposed to black-box) models, which allows us to probe the models produced. Moreover, we can capture nonlinear and unexpected patterns in the data without making any strict assumptions about the data. Daily rainfall data presents some difficulties for GP: the values are non-negative and discontinuous on the real time line, and the series is highly volatile with weak seasonality. This makes rainfall derivatives much more challenging to deal with than other weather contracts, such as those on temperature or wind.
However, GP does not perform well when it is applied directly to daily rainfall data. We thus propose a data transformation method that improves GP's predictive power. The transformation works by summing the daily rainfall amounts into accumulated amounts over a sliding window. To evaluate the performance, we compare the prediction accuracy obtained by GP against the approach currently most used in rainfall derivatives and against six other machine learning algorithms. They are compared on 42 different data sets collected from cities across the USA and Europe. We discover that GP is able to predict rainfall more accurately than the approach currently most used in the literature and comparably to the other machine learning methods. However, we find that the equations generated by GP are not able to take into account the volatility and the extreme wet and dry periods in the rainfall data. Thus, we propose decomposing the problem of rainfall prediction into 'sub-problems' for GP to solve. We decompose the rainfall time series into partitions, each representing a selected range of the total rainfall amounts, and each partition is modelled by a separate equation from GP. We use a Genetic Algorithm to assist with the partitioning of the data. We find that, through the decomposition of the data, we are able to predict the underlying data better than all machine learning benchmark methods. Moreover, GP is able to provide a better representation of the extreme periods in the rainfall time series. The natural progression is to price rainfall futures contracts from the rainfall predictions. Unlike other pricing domains in the trading market, there is no generally recognised pricing framework within the literature. Much of this is due to weather derivatives (including rainfall derivatives) existing in an incomplete market, where the existing and well-studied pricing methods cannot be directly applied.
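The sliding-window transformation mentioned above, which turns daily rainfall amounts into accumulated amounts, can be illustrated with a minimal sketch. The function name `accumulate_rainfall`, the window length of 3 days, and the sample values are assumptions chosen for illustration only; the abstract does not state the window length or implementation used in the thesis.

```python
from typing import List


def accumulate_rainfall(daily: List[float], window: int) -> List[float]:
    """Sum daily rainfall amounts over a sliding window of `window` days.

    Hypothetical helper: returns one accumulated amount per window
    position, so the output has len(daily) - window + 1 entries.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(daily) < window:
        return []

    # Start with the total of the first window, then slide one day at a
    # time: add the day entering the window, drop the day leaving it.
    total = sum(daily[:window])
    accumulated = [total]
    for i in range(window, len(daily)):
        total += daily[i] - daily[i - window]
        accumulated.append(total)
    return accumulated


# Illustrative (made-up) daily amounts in millimetres; many days are dry.
daily_mm = [0.0, 2.5, 0.0, 0.0, 10.0, 1.5, 0.0]
accumulated_mm = accumulate_rainfall(daily_mm, window=3)
print(accumulated_mm)
```

Accumulating over a window yields a smoother series with fewer zero entries than the raw daily series, which is one plausible reason such a transformation could make the data easier for GP to model.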
There are two well-known techniques for pricing: the first is indifference pricing and the second is arbitrage-free pricing. One of the requirements for pricing is knowing the level of risk or uncertainty that exists within the market, which allows for a contract price free of arbitrage. GP can be used to price derivatives, but the risk cannot be directly estimated from a single GP equation. To estimate the risk, we must instead calculate a density of proposed rainfall values, from which the most probable outcome can be calculated. We propose three methods to achieve the required results. The first samples many different equations and extrapolates a density from the best equation of each generation over multiple runs. The second builds on the first by considering contract-specific equations, rather than a single equation explaining all contracts, before extrapolating a density. The third uses GP to evolve a collection of stochastic equations for pricing rainfall derivatives. We find that GP is a suitable method for pricing and that the proposed methods are able to produce good pricing results. Our first and second methods are capable of pricing closer to the rainfall futures prices given by the CME. Moreover, we find that our third method reproduces the actual rainfall for the specified period of interest more accurately.
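The idea of building a density of proposed rainfall values from many equations, and reading off the most probable outcome, can be sketched as follows. This is only an illustration under stated assumptions: the lambda functions are hypothetical stand-ins for evolved GP equations, the input value and the 5 mm bin width are made up, and a simple histogram is used in place of whatever density estimator the thesis actually employs.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-ins for the best equations collected across
# generations and runs; real GP equations would be evolved, not written.
equations: List[Callable[[float], float]] = [
    lambda x: 0.9 * x,
    lambda x: x + 1.0,
    lambda x: 0.5 * x + 12.0,
    lambda x: 1.2 * x,
    lambda x: x + 3.0,
]


def empirical_density(samples: Sequence[float],
                      bin_width: float) -> List[Tuple[float, float]]:
    """Histogram density: (left edge of bin, share of samples in bin)."""
    counts = Counter(int(s // bin_width) for s in samples)
    n = len(samples)
    return sorted((b * bin_width, c / n) for b, c in counts.items())


def most_probable(density: Sequence[Tuple[float, float]]) -> float:
    """Left edge of the bin holding the largest share of samples."""
    return max(density, key=lambda pair: pair[1])[0]


# Assumed input: previous accumulated rainfall of 20 mm (made-up value).
predictions = [eq(20.0) for eq in equations]
density = empirical_density(predictions, bin_width=5.0)
print(density)
print(most_probable(density))
```

The spread of the resulting density gives a rough picture of the uncertainty across equations, which a single equation's point prediction cannot provide.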
