When I went from sampling shipments of coal, concentrate, potash and sulphur in the Port of Vancouver to sampling coal, concentrate and ore at Cominco’s operations in Canada and abroad, the concept of spatial dependence in sampling units and sample spaces started to grow on me. In March 1978, SGS had sent me a draft of Gy’s Unbiased Sampling from a Falling Stream of Particulate Matter, and asked my opinion on its content and language. I took the task seriously not only because SGS wanted to distribute Gy’s paper among selected clients but even more so because I was a member of the Canadian Advisory Committee to ISO Technical Committee 102 on iron ore. The objective of Gy’s experiment was to derive the optimum width and speed of the primary sampler as a function of the top size of the material in bulk. His experiment was technically brilliant but its symbols and terms were characteristically his own. I defined accuracy and precision my way, and Pierre sent me an autographed copy of his 1979 Sampling of Particulate Materials: Theory and Practice.
At Cominco I met many a geologist and metallurgist who struggled with spatial dependence between ordered sets of measured values, and who bought all kinds of textbooks for guidance. Autocorrelation is a somewhat dated term that implies a significant degree of associative dependence between measured values in ordered sets. In the Index of his 1979 textbook Gy does not refer to autocorrelation, associative dependence, or to degrees of freedom for that matter. In the Index of their 1976 Time Series Analysis, Box and Jenkins refer to autocorrelation function but not to associative dependence or degrees of freedom. They worked mostly with sets of measured values ordered in the sample space of time, and with covariances between measured values in ordered sets. Degrees of freedom were mentioned only when they discussed F-, t- and χ²-distributions in the text. What these authors did not do was apply Fisher’s F-test to two variances to verify whether they are statistically identical or differ significantly.
Any set of ordered data may be used to show how to apply Fisher’s F-test. For example, the variance of a data set that consists of the numbers 1, 2, 3, 4 and 5 equals var(x) = 2.50. The first variance term of the ordered set equals $\mathrm{var}_1(x) = \sum (x_i - x_{i+1})^2 / [2(n-1)] = 4/8 = 0.50$. The observed F-value of F = 2.50/0.50 = 5.00 exceeds the tabulated F-value of $F_{0.05;\,df;\,df(o)} = 3.84$ at 5% probability. Hence, the ordered set displays a significant degree of spatial dependence in its sample space. Set up a simple Excel spreadsheet template and derive this F-statistic. Use Excel’s FINV-function with p=0.01, df=4 and df(o)=8 to find out if the observed F-value is significant at 1% probability. This example shows why degrees of freedom play a key role in Fisher’s F-test.
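For readers who would rather check the arithmetic outside a spreadsheet, here is a minimal Python sketch of the same example. The function names are my own and purely illustrative. The tail probability uses a closed-form binomial identity that holds only when both degrees of freedom are even, as they are here with df=4 and df(o)=8; it stands in for Excel’s FINV-function and is not a general replacement for it.

```python
from math import comb


def variance(values):
    """Variance of the set, with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def first_variance_term(values):
    """First variance term of the ordered set: the sum of squared
    differences between consecutive values, divided by 2(n - 1)."""
    n = len(values)
    squared_diffs = sum((a - b) ** 2 for a, b in zip(values, values[1:]))
    return squared_diffs / (2 * (n - 1))


def f_upper_tail(f_value, df1, df2):
    """Probability of exceeding f_value under the F-distribution.

    Assumption: df1 and df2 are both even, so the incomplete beta
    function reduces to a finite binomial sum.
    """
    if df1 % 2 or df2 % 2:
        raise ValueError("this shortcut needs even degrees of freedom")
    a, b = df1 // 2, df2 // 2
    x = df1 * f_value / (df1 * f_value + df2)
    m = a + b - 1
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(a))


data = [1, 2, 3, 4, 5]
n = len(data)
df = n - 1          # degrees of freedom for the variance of the set
df_o = 2 * (n - 1)  # degrees of freedom for the first variance term

var_x = variance(data)               # 2.50
var1_x = first_variance_term(data)   # 0.50
f_observed = var_x / var1_x          # 5.00
p_value = f_upper_tail(f_observed, df, df_o)

print(f"var(x) = {var_x:.2f}, var1(x) = {var1_x:.2f}, F = {f_observed:.2f}")
print(f"df = {df}, df(o) = {df_o}, probability of exceeding F = {p_value:.4f}")
```

The probability comes out between 0.01 and 0.05, which agrees with the observed F-value exceeding the tabulated value of 3.84 at 5% probability and answers the question about 1% probability without a table lookup.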
Some scientists are taught to assume spatial dependence rather than verify it by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. It sounds convenient but makes for bad science. Others may not know how to derive a sampling variogram, a simple graph that shows where spatial dependence in a sample space or sampling unit dissipates into randomness. This is why I want to show how to derive and interpret sampling variograms. It may not make our world a cooler place but we can measure how hot is too hot in a scientifically sound manner.