Title:

Numerical studies in soil distribution

This thesis makes an empirical test of the concept of Soil Series. In the last 50 years many soil surveys have been made in many parts of the world so that land use can be planned and land management improved. In any one survey the classes of soil recognized each span a roughly equal, suitably small range of variation. Such classes are most often termed "Soil Series". In general, Soil Series are considered to be the basic unit of the Natural or General Purpose classification of soils. Many soil scientists, notably Cline, suggest that soil surveyors are able to define Soil Series because soil profiles form modes in a frequency diagram. They suggest that the job of the surveyor is simply to identify these modes, and thus define the classes. The concept of Soil Series is tested here in its dual role, both as a profile unit and as a mapping unit. The test was made possible by recent developments in computer techniques and in mathematical methods, and by the revival of Adanson's views on taxonomy. A Euclidean geometric framework in multidimensional space is postulated, and is called PropertyGeographic space. In it, there are p properties by which a soil profile is characterized, and q geographic coordinates by which the soil profile is located on the ground. Provided soil properties can be coded on ordered scales, a soil profile can be defined by a point in (p+q)-dimensional space. The subspace of PropertyGeographic space in p dimensions is called Property space. If the traditional idea that a Soil Series is a cluster is correct, then groups of points that are dense relative to the whole should appear in Property space. For there to be dense clusters in Property space, a necessary condition is the presence in PropertyGeographic space of horizontal or gently sloping hyperplanes separated by relatively steep, preferably vertical, hyperplanes.
When the points are projected from PropertyGeographic space on to Property space, the horizontal or gently sloping hyperplanes appear as dense clusters separated by the relatively empty space formed by the projection of the relatively steep or vertical hyperplanes. The Property space of this model was first studied using data for 85 soil profiles, each characterized by 37 properties. These profiles had been chosen by stratified random sampling on 27 physiographically defined areas around Oxford. The profiles were first grouped intuitively in order to have some base for reference. Principal Component analysis was used to project the 85 points in 37 dimensions on to the two dimensions of the first two principal components, with the least possible loss of information. The comparison between the ordination by Principal Component analysis and the intuitive classification showed, as expected, that the soil profiles grouped intuitively into classes did tend to appear in a given partition of Property space, as seen from the graph of the first two components. However, it also showed little evidence of clustering of profiles in Property space. Clusters were then sought in all 37 dimensions, by testing the stability of the classifications produced by Similarity analysis under two standardizations of the soil properties. In the first, soil properties were standardized to the range zero to one, and in the second, to zero mean and unit variance. Theoretically these standardizations should yield similar classifications if profiles are well clustered in Property space. The dendrograms showed that two main groups of the Brown Earths remained stable in spite of the change in standardization. The subdivision of the Gleys, however, proved unstable. It was then concluded that since most soil properties could sensibly be given the attributes of order and interval, a Euclidean model was appropriate for the study of PropertyGeographic space.
Furthermore, the combination of ordination by Principal Component analysis and the stability of the classifications produced by Similarity analysis, in the context of a Euclidean model, would give the means to evaluate the cluster arrangement of soil profiles in Property space. The p-dimensional model was then extended to a study of the p+q dimensions of PropertyGeographic space. A transect a little over 3 km long, covering a variety of soils in north Oxfordshire, was sampled at 10 m intervals. At each sampling point a profile was characterized by 63 properties, and a Soil Series allocation was made by the local soil surveyor. Ordination by Principal Component analysis was again used to search visually for clusters, in the projection from the 63 dimensions of Property space on to the two dimensions of the graph of the first against the second principal component values. The points appeared as a single continuous cloud with three bulbous protruding arms, but with no clusters. On the ordination, evidence of clustering was also sought numerically by Orderneighbour statistics. The results showed no evidence of clustering. Clusters were then sought in the original 63 dimensions of Property space, by Mode analysis. Mode analysis showed that the points in Property space were distributed as a continuous cloud with three very weak peaks. Since the three poorly defined clusters did not correspond in number, and consequently in kind, to the Soil Series classification units defined by the soil surveyor, they were not interpreted as clusters formed at the level of Soil Series classification units. In the classifications produced by Similarity analysis, the two standardizations of property values yielded rather different groupings. These differences were explained by the lack of well-defined clusters.
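The two standardizations and the ordination on the first two principal components described above can be illustrated with a minimal sketch. It is not the procedure or code used in the thesis: the function names are hypothetical, the data are random placeholders with the same shape as the Oxford data set (85 profiles, 37 properties), and the principal components are obtained from a singular value decomposition, which is one common way to compute them. Similarity analysis itself is not reproduced here.

```python
import numpy as np


def range_standardize(X):
    """Rescale each property (column) to the range zero to one."""
    X = np.asarray(X, dtype=float)
    low = X.min(axis=0)
    span = X.max(axis=0) - low
    span[span == 0] = 1.0  # a constant property maps to zero everywhere
    return (X - low) / span


def zscore_standardize(X):
    """Rescale each property (column) to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # a constant property maps to zero everywhere
    return (X - X.mean(axis=0)) / sd


def first_two_components(X):
    """Return scores on the first two principal components and the
    fraction of total variance each one carries."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:2].T
    explained = s[:2] ** 2 / np.sum(s ** 2)
    return scores, explained


# Synthetic placeholder data (NOT the thesis data): 85 profiles, 37 properties.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(85, 37))

scores_range, explained_range = first_two_components(range_standardize(profiles))
scores_z, explained_z = first_two_components(zscore_standardize(profiles))
print(scores_range.shape, explained_range, explained_z)
```

Plotting the first column of the scores against the second gives the kind of ordination graph on which clusters were sought visually. Running the same grouping method on both standardized versions of the data, and comparing the groups obtained, corresponds to the stability check described in the text.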
The shape of the series of points in PropertyGeographic space was studied from its projection from 64 dimensions (p = 63, q = 1) on to the graphs of the first and second principal component values plotted against distance. Two kinds of soil variation were apparent: a short range variation occurring over a distance of a few tens of metres, and a relatively long range variation occurring over a distance of a few hundreds of metres. The short range variation was eliminated by smoothing the first two principal component values with a five point linear moving average. After smoothing, the trend appeared to be made up of a sequence of mainly sloping and horizontal lines, with lengths of 100 to 300 m, seldom separated by relatively steep lines. Correlograms were constructed for the first two principal component values, to elucidate the kind and size of the spatial relationship of soil properties. The correlograms showed, on average, a linearly decreasing configuration of size 240 m for the first, and 300 m for the second, principal component values. On this basis, 92 per cent and 86 per cent respectively of the total variation is explained. This suggests that the long range soil variation can be explained by a bilateral moving average scheme, 240 m long for the first component and 300 m long for the second. This is interpreted as a sequence of almost straight lines, curved at their joints, and on average as long as the corresponding moving average. Only 8 per cent of the variation of the first component and 14 per cent of that of the second are left to be explained by erratic variation. Abrupt changes between lines are included within the erratic variation.
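The link between a linearly decreasing correlogram and a moving average scheme can be illustrated with a minimal sketch. It rests on a standard result: an equally weighted moving average of m uncorrelated values has autocorrelation 1 - k/m at lag k, falling to zero at lag m. The function names are hypothetical, the series is synthetic white noise rather than the transect data, and equal weights are an assumption, since the weights used in the thesis are not stated here.

```python
import numpy as np


def moving_average(values, window=5):
    """Equally weighted moving average; the result is shorter than the
    input by window - 1 points because the ends are not padded."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def correlogram(values, max_lag):
    """Sample autocorrelation coefficients r_0 .. r_max_lag."""
    values = np.asarray(values, dtype=float)
    dev = values - values.mean()
    total = np.sum(dev * dev)
    n = len(dev)
    return np.array(
        [np.sum(dev[: n - k] * dev[k:]) / total for k in range(max_lag + 1)]
    )


SPACING_M = 10   # sampling interval along the transect, from the text
EXTENT = 24      # 240 m expressed in sampling intervals

# Synthetic placeholder series (NOT the thesis data): white noise averaged
# over EXTENT points, i.e. a moving average scheme 240 m long.
rng = np.random.default_rng(0)
noise = rng.normal(size=20000)
series = moving_average(noise, window=EXTENT)

r = correlogram(series, max_lag=30)
for k in (0, 6, 12, 18, 24):
    print(f"lag {k * SPACING_M:3d} m  r = {r[k]:+.2f}  linear = {1 - k / EXTENT:+.2f}")
```

With the default window of five, the same moving_average function gives a five point smoothing of the kind used to remove the short range variation before the trend was examined.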
The general picture that emerges from the results is that the shape of the points in PropertyGeographic space is a sequence of almost straight, sloping and horizontal, hyperplanes. These are sometimes separated by steep hyperplanes, but often there is just a change in slope between two adjoining hyperplanes. The lack of clusters in Property space is explained not only by the fact that steep hyperplanes do not always separate the sloping or horizontal hyperplanes, but also by the occurrence of relatively long hyperplanes at the same altitude as those steep hyperplanes that do. Thus, at least for the area studied, soil profiles are distributed in Property space as a single cloud of points, more or less evenly spread, without any strong clustering. Any soil surveyor who assumes clusters to be present, and who expects his work to reveal them, labours under a delusion. The soil surveyor in fact subdivides Property space into zones of approximately uniform density, though his choice of dividing lines may be guided by the positions of relatively steep hyperplanes, or by a change of slope between two adjoining hyperplanes, in PropertyGeographic space. There is, however, no guarantee that the values in Property space at which either steep hyperplanes or changes in slope between two adjoining hyperplanes occur are repeated in other areas. The surveyor's choice of division is still rather arbitrary, and hence differences of opinion and difficulties of grouping over large areas will be frequent. The use of a classification system on a continuously varying population of soil profiles imposes special characteristics on the classes thus produced. An important one is that a large proportion of the profiles belonging to a Soil Series classification unit are more similar to the profiles of other Series than to the profiles of their own Series.
