Teaching junk statistics at UBC

The stage was set in 1964 to teach junk statistics at the University of British Columbia. It was the year Professor Dr Alastair J Sinclair took on his duty to teach earth sciences to UBC’s students. It was but a few years after Matheron dabbled at his own kind of unreal statistics and fumbled real variances. Just the same, Matheron’s junk statistics was hailed as new science on campus at the University of Kansas in June 1970. His tour de force at that time was to call on Brownian motion to infer the continuity of his famed stationary random function. UK’s campus was a fitting venue because that’s where Agterberg failed for the first time to derive the real variance of his distance-weighted average point grade. Here’s why it gives too rich an abundance of data in mineral exploration. As few as a pair of measured values, when determined in samples taken at positions with different coordinates in a finite sample space, gives an infinite set of Agterberg’s point grades, a zero voodoo variance, and not a single degree of freedom. How about that? Some kind of perpetual motion in mineral exploration!

Sinclair details in Applied Mineral Inventory Estimation how his “exciting and invigorating career” took off when he was exposed to Matheron’s ideas, and how he had had “the good fortune to work with Journel, Huijbregts and Deraisme.” Those were Matheron’s earliest students who took his musings for dogma, and who didn’t have a clue which variances were lost on Matheron’s watch. Sinclair’s list of folks he was “fortunate to have worked with at various times” reads like a Who’s Who in the geostatistical fraternity. He credits all of them with having contributed to his education. I’m all in favor of giving credit where credit is due. But to give credit to everybody who taught him junk statistics is over the top. Some geostatistocrats on Sinclair’s list now know each weighted average has its own variance. And the odd one might even know why! One cannot help but wonder how the cream of Matheron’s crop saw fit to make junk statistics look so good to Sinclair starting in 1972. So much so that Sinclair felt compelled to write his own textbook. Of course, all of that spelled bad news for UBC’s students.

When I met Sinclair at his UBC office in August 1992, I talked about real statistics. I showed how to count degrees of freedom for the set of nine holes in Figure 203 of David’s 1977 Geostatistical Ore Reserve Estimation. In Sinclair’s world, the concept of degrees of freedom breaks down in matters of spatial dependence. But it’s alive and well in my world. Sinclair did not see much of a difference between Matheron’s surreal geostatistics and Fisher’s real statistics. In fact, he knew as much about real statistics in August 1992 as he did in September 1989. That’s when CIM Bulletin entrusted Sinclair and David with the review of Precision Estimates for Ore Reserves. David blew a fuse because our paper was “without a single reference to 20 years of work in geostatistical ore reserve estimation.” And we didn’t even know we had written a geostatistical paper! So, we were baffled when Dr L R Fyffe, Editor, CIM’s Geology Division wrote on November 23, 1989, “Both reviewers recommend publication with major revisions.”

But big troubles were looming in the esoteric universe of those who infer, krige, smooth, and rig the rules of real statistics with reckless abandon. When I was working on Sampling and Weighing of Bulk Solids in the early 1980s, I studied David’s 1977 Geostatistical Ore Reserve Estimation. I found way too many symbols and far too few measured values. Brownian motion, too, played some kind of cameo role in this work of geostatistical fiction. The author confessed his work is “not for professional statisticians.” In fact, he even predicted, “…statisticians will find many unqualified statements…” What David didn’t predict was he would deny anything was wrong in surreal geostatistics.

So what were we to do? Spice our paper with symbols? Scrap measured values? Delete Fisher’s F-test for spatial dependence? Call David to the task? Ask him to put in plain words his “good test to find out whether one really understands geostatistics” on page 286 of his 1977 textbook? Or try to pacify CIM Bulletin’s keepers of Matheron’s tablets with a few tidbits of token stuff? So we huffed and puffed a lot and added but a few references to works of geostatistical scholars such as Dagbert, David, Journel and Huijbregts. Our marginally revised paper was rejected on February 7, 1990. My son completed his PhD in computing science. I resolved to raise a stink. I did it then. And I still do now! Sinclair is but one reason. Bre-X’s phantom gold resource is another!

On November 23, 1989, CIM Bulletin’s editor wrote “Both reviewers recommend publication with major revisions.” Sinclair started some charade of sorts on November 22, 1989, at 08:30AM. He welcomed those who attended my short course on Sampling Precious Metal Deposits: Metrology-A New Look. The venue was Room 330A at UBC’s Department of Geological Sciences. The course was sponsored by its Mineral Deposits Research Unit. Sinclair didn’t have time to listen and moved about a lot. In fact, he popped in and out of Room 330A like a Jack-in-the-Box. Sinclair didn’t ask any questions. Was it because the paper he rejected was part of my notes? Did he worry others might ask questions? Did he worry I would talk too much about real statistics and too little about Matheronian geostatistics?

Dr J A McDonald, Interim Director, Mineral Deposits Research Unit, on February 21, 1990, wrote, “We certainly were pleased with the response to your course and have elected to maintain the theme with a 5-day course to be held April 23-27, 1990, entitled Geostatistics for the Mining Industry, New Concepts, New Tools.” How about that for cruel and unusual punishment? Sinclair was in damage control mode. So much more has happened in our stand-off on real statistics ever since I met Sinclair in his UBC Office in August 1992. Much of it will stay untold for some time to come.

Dr Alastair J Sinclair, PEng, PGeo, has striking credentials. He is a former Member of the Discipline Committee of the Association of Professional Engineers and Geoscientists of British Columbia with its Code of Ethics to protect the public at large. He was CIM’s Distinguished Lecturer for 2000-2001. He taught a short course at the UBC Robson Square Campus, Vancouver, BC, on May 15-16, 2008. What he didn’t teach was that each distance-weighted average has its own variance. He didn’t teach how to verify spatial dependence by applying analysis of variance and how to count degrees of freedom. Neither did he teach how to derive unbiased confidence intervals and ranges for metal grades and contents of mineral inventories. Sadly, Sinclair is still teaching junk statistics!

Teaching junk science by consensus

The Centre de Géosciences/Géostatistique deserves praise for posting to its Online Library a treasure trove of writings. A great deal came from the seminal work of Professor Dr Georges Matheron (1930-2000). Most of it merits long overdue scrutiny and review. The problem is not so much that Matheron put a few spurious findings on paper but that his students took it for doctrine. The Online Library has made it easy to pinpoint what Matheron did wrong and when he did so.

Matheron derived the length-weighted average grade of a set of metal grades determined in core samples with variable lengths. He did so in his Rectificatif of January 13, 1955, to Formule des Minerais Connexes of November 25, 1954 (see Note Statistique No 1). What he didn’t derive was the variance of this length-weighted average grade. Neither did he show how to test for spatial dependence between grades of ordered core samples by applying analysis of variance. He didn’t report primary data sets because of his penchant for working with symbols rather than with real measured values.
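A minimal sketch of what a length-weighted average grade and its variance could look like in practice. It assumes the grades are independent measurements with a common variance estimated from the set itself, so that the variance of the weighted average is that variance times the sum of the squared weights. The grades, lengths, and function names are hypothetical and are not taken from Matheron's notes.

```python
from statistics import variance


def length_weights(lengths):
    """Convert core lengths into weights that sum to one."""
    total = sum(lengths)
    return [length / total for length in lengths]


def length_weighted_average(grades, lengths):
    """Length-weighted average grade of a set of core samples."""
    weights = length_weights(lengths)
    return sum(w * g for w, g in zip(weights, grades))


def variance_of_length_weighted_average(grades, lengths):
    """Variance of the weighted average, assuming independent grades
    with a common variance estimated from the set (n - 1 degrees of freedom)."""
    weights = length_weights(lengths)
    return variance(grades) * sum(w * w for w in weights)


# Hypothetical grades (e.g. g/t) and core lengths (m), for illustration only.
grades = [1.2, 0.8, 1.5, 1.1]
lengths = [1.0, 2.0, 1.0, 1.0]

average = length_weighted_average(grades, lengths)
var_average = variance_of_length_weighted_average(grades, lengths)
print(f"length-weighted average = {average:.3f}")
print(f"variance of the average = {var_average:.5f}")
```

With equal core lengths the weights all become 1/n and the result reduces to the familiar variance of the arithmetic mean, the variance of the set divided by n.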

Matheron concocted the honorific eponym krigeage in his 1960 Krigeage d’un Panneau Rectangulaire par sa Périphérie. In this Note géostatistique No 28, Matheron derived k*, his “estimateur”, and a precursor to kriged estimate or kriged estimator. In real statistics, Matheron’s k* is in fact the length-weighted average grade of a single block. In this case, too, he didn’t derive var(k*), the variance of his “estimateur”. Sadly, kriging became a curse of sorts for Professor D G Krige.

Matheron’s Stationary Random Function seemed not to have troubled those who were at the first geostatistics colloquium in the USA in 1970. Matheron even called on Brownian motion to infer by hook or by crook the continuity of his Riemann integral. He didn’t explain what Brownian motion and mineral deposits have in common. Matheron, unlike John von Neumann in 1941 and Anders Hald in 1952, never in his life worked with Riemann sums. On the contrary, he would rather infer spatial dependence than apply Fisher’s F-test to the variance of a set and the first variance term of the ordered set.

It is to Matheron’s credit that it was not him who lost variances of all weighted averages. It was Dr Frederik P Agterberg who failed to derive the variance of his distance-weighted average. He did derive the distance-weighted average grade of a set of five (5) points at positions with different coordinates but failed to derive the variance of this central value. What he didn’t point out was that as few as two such points define an infinite set of distance-weighted averages. He fumbled the variance of his central value for the first time in his 1970 colloquium paper and once again in his 1974 Geomathematics.
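The point about two measured values can be sketched as follows. The example assumes inverse-distance weights along a line, a hypothetical pair of grades, and a hypothetical common variance for each measured value. Every target position between the two points gives a different distance-weighted average, and each of those averages has its own variance, namely the common variance times the sum of the squared weights. None of the names or numbers below come from Agterberg's work.

```python
def distance_weights(positions, target, power=1.0):
    """Inverse-distance weights for one target position, normalized to sum to one."""
    distances = [abs(target - p) for p in positions]
    if 0.0 in distances:
        # Target coincides with a sample position: that sample gets all the weight.
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    inverse = [1.0 / d ** power for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_average(values, weights):
    """Weighted average of the measured values."""
    return sum(w * v for w, v in zip(weights, values))


def variance_of_weighted_average(weights, value_variance):
    """Variance of a weighted average of independent values with a common variance."""
    return value_variance * sum(w * w for w in weights)


# Hypothetical pair of measured grades at two different positions (m).
positions = [0.0, 10.0]
grades = [2.0, 4.0]
value_variance = 0.04  # assumed variance of each measured grade

# Any number of target positions fits between the two points.
for target in (2.5, 5.0, 7.5):
    w = distance_weights(positions, target)
    avg = weighted_average(grades, w)
    var = variance_of_weighted_average(w, value_variance)
    print(f"x = {target:4.1f}  average = {avg:.3f}  variance = {var:.4f}")
```

However many target positions are generated this way, they remain functions of the same two measured values, which is why they do not add degrees of freedom.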

Matheron’s length-weighted average grade was reborn as an honorific kriged estimate or estimator. But then Agterberg’s distance-weighted average grade was honored in the same way! And here’s the clincher! An infinite set of Agterberg’s zero-dimensional point grades fits along any borehole, and within any ore block, sampling unit or sample space. That’s why distance-weighted average point grades without variances became the heart and soul of geostatistics. Matheron’s seminal work merely set the stage for Agterberg’s giant step into the abyss of mineral reserve and resource estimation with confidence but without confidence intervals and ranges.

The above figure is a facsimile of Fig. 203 on page 286 of David’s 1977 Geostatistical Ore Reserve Estimation. It shows the infinite set of “estimated” values within B derived from the same set of nine (9) holes.

The more geostatistocrats tinkered with real statistics, the more flawed geostatistics grew. It’s a scientific fraud to derive confidence limits from pseudo kriging variances. It’s as silly to talk about confidence without limits as it is to infer spatial dependence within or between boreholes. To discount degrees of freedom would make no sense at all in real statistics. To count degrees of freedom makes no sense in geostatistics. That’s the very reason why geostatistics does not give unbiased confidence limits for metal contents and grades of mineral reserves or mineral resources.

Professor Dr Roussos Dimitrakopoulos is a catch of sorts for the Department of Mining, Metals and Materials Engineering at McGill University. I don’t know why! I told him in 1993 that weighted averages have variances because one-to-one correspondence between functions and variances is sine qua non in statistics. This basic rule is still beyond his grasp in 2008. All the same, he is Editor-in-Chief, Journal of Mathematical Geosciences. Agterberg, President, International Association for Mathematical Geosciences, left his fingerprints when he failed to derive the variance of his distance-weighted average point grade. Dimitrakopoulos talks about “gazillion types” of probabilistic models. What he doesn’t talk about is that the odds to select the least biased subset of some infinite set of kriged estimates are immeasurable. The problem is not so much he himself believes it but the world’s mining industry believes it. The more so because he does all of that with voodoo variances.

Mining engineers, mine geologists, resource analysts, and project managers were invited to a course on Applied Risk Assessment for Ore Reserves and Mine Planning at McGill University. The same course deals with Strategic Risk Quantification and Management for Ore Reserves and Mine Planning and with Conditional Simulation for the Mining Industry. That’s a lot of buzz for a bundle of bucks! Too bad that voodoo variances underpin all that risk assessment and quantification stuff! That’s why one should come with a buddy. For it’s more difficult to baffle a few birds of a feather than a single sitting duck. Dimitrakopoulos should explain why Agterberg’s distance-weighted average point grade aborted its variance during its rebirth as an honorific kriged estimate on Matheron’s watch.

Going gaga about confidence without limits

If truth be told, I didn’t really miss the 2000 Millennium celebrations of the Canadian Institute of Mining, Metallurgy, and Petroleum (CIM) and the Prospectors & Developers Association of Canada (PDAC). For the masters of ceremonies didn’t pine for my paper on Applied Statistics and the Bre-X fraud. Most CIM and PDAC members play the kriging game and talk about confidence without limits. Most scientists on this planet work with real statistics and real confidence limits. I work mostly with 95% confidence intervals (95% CI) and 95% confidence ranges (95% CR) for metal contents and grades of mined ores, mineral concentrates, mineral reserves and mineral resources. The world’s mining industry blathers about confidence without limits for mineral reserves and mineral resources. Yet it did accept confidence intervals and ranges for mined ores and mineral concentrates. So what’s all that talk about confidence? It should be about risk! The risks between trading partners seem to matter a lot more than the risks mining investors run. That’s the real story behind Bre-X!

I thought all along the Ontario Securities Commission (OSC) would lose confidence in all the mumbo jumbo that replaced the 1998 Interim Report of the Mining Standards Task Force. For I couldn’t find a single scrap of sound statistics in National Instrument 43-101 Standards of Disclosure for Mineral Projects. Here’s what happened on September 10, 2004. “We, the Canadian Securities Administrators (CSA), are publishing for a 90-day review comment period the following documents…” How about that! Did the CSA really plan to repeal and replace that National Instrument rubbish? Did the CSA want real statistics in its standards? Some of its objectives were to “correct errors”, and to “generally make the Current Mining Rule more user-friendly and practical.” Correct errors? Did CSA’s mining experts finally figure out how many variances went missing? Were variances of weighted averages about to make a comeback? Did OSC’s Chief Mining Consultant figure out who lost what and when? I did find the answers but nobody gave a hoot. So I stayed in the trenches and watched CSA’s rulers rule.

Patricia Dillon, CIM Guidelines Coordinator, met with the CSA in Edmonton on May 11, 2004. The objective of this formal annual meeting was to clarify the source of various documents and guidance that underpins Reporting Standards and Guidelines. I like that kind of stuff! The CIM Standing Committee on Reserve Definitions consists of a team of eleven ore reserve practitioners. Normand Champigny, the coauthor of A Study on Kriging Small Blocks and a leading activist of sorts against oversmoothing, brought all of his insights to CIM’s reserve definitions team. Champigny didn’t grasp the additive property of the variances of metal contents for blocks of in situ ore when he spoke on behalf of “five anonymous ore reserve practitioners in Canada and abroad.” He did so in Geostatistics: A Tool that Works (The Northern Miner, May 18, 1992) in response to my Geostatistics or Voodoo Science (The Northern Miner, April 20, 1992). It did work all too well at Bre-X’s Busang property! I wonder whether or not any of Champigny’s anonymous buddies in 1992 served on Dillon’s definitions body in 2000.

CIM Council on December 11, 2005 adopted CIM Definition Standards for Mineral Resources and Mineral Reserves. The term confidence played a prominent role in statements such as the level of confidence, a lower level of confidence, a high level of confidence, a higher level of confidence, the highest degree of confidence, insufficient confidence, the level of geoscientific confidence, different levels of geological confidence, and confident interpretation. Such is the verbose burden of confidence without limits. What happened with confidence intervals and ranges in ore reserve estimation? Who repealed 95% CIs and 95% CRs? Ten pages of mind-numbing text with rambling nuggets such as reasonable assumptions, acting reasonable, conceptional estimates, order of magnitude estimates, reasonable prospects, reasonably assume the continuity of mineralization, and reasonably assumed but not verified. Was Dillon’s waffling squad really thinking?

It was easier to meet my Member of Parliament and talk about geostatistical data analysis of shellfish counts along a coastline than it was to meet Deborah McCombe during her trip to Vancouver and talk about real statistics. She granted me one hour of her time on January 22, 2005. We met at the office of the BC Securities Commission in the presence of Dr Gregory J Gosson, BCSC’s Chief Mining Advisor. I talked about the lost variance of the distance-weighted average, and why it should not have gone missing when the distance-weighted average was reborn as an honorific kriged estimate. I used Clark’s hypothetical uranium data to show how to test for spatial dependence in her sample space, and when the distance-weighted average converges on the arithmetic mean and its variance on the Central Limit Theorem. I also showed what happens when ore was inferred between Bre-X’s salted holes, and what happens when interpolation positions kriged holes between salted holes. There were no questions either during our meeting or thereafter.

BCSC’s former Chief Mining Advisor and OSC’s former Chief Mining Consultant present a $350 workshop at the University of Alberta Campus on Saturday, May 3, 2008. Nowadays, Deborah McCombe is the executive vice-president, Scott Wilson Mining Group, and Greg Gosson is the technical director of geology and geostatistics, AMEX Mining and Metals Consulting Group. They will talk about matters ranging from Setting the regulatory scene to Case studies of what went wrong. What Gosson and McCombe will not talk about is when Agterberg fumbled the variance of the distance-weighted average, and why it is too late to reunite them. For the name of the exploration game is to look forward with confidence without limits rather than take a step back to figure out what is really wrong with geostatistics. Geostatistical data analysis of shellfish counts in samples taken along a shoreline at 1-km intervals may kill the kriging game. The Harper Government may well agree that mineral reserves and mineral resources in annual reports and populations of shellfish along Canada’s shorelines should all be reported with unbiased confidence intervals and ranges.

Confidence limits for mineral reserves and mineral resources

What keeps the world’s mining industry going is mineral exploration. To find and define mineral reserves and mineral resources is not just the name of the game but is itself a bit of a game. The trouble is statistically challenged qualified persons infer ore between holes before verifying spatial dependence either within holes or between holes. To infer ore between holes worked miracles when Bre-X drilled holes at a spacing of 50 m up to 200 m. When this geostatistical practice was applied at Bre-X’s Busang property, it didn’t spook the Ontario Securities Commission (OSC) until a few barren holes were twinned. But it really fooled Bre-X’s stakeholders, didn’t it?

Bre-X’s inferred phantom gold resource passed David’s famous pudding test with a parade of red flags flying. Yet, it was a cinch to prove nothing but barren rock between salted holes. Bre-X’s boss salter didn’t even know how to create spatial dependence between ordered sets of bogus gold grades. Not that it would have mattered. In 2008 qualified persons still do not know how to verify spatial dependence by applying analysis of variance to the variance of a set and the first variance term of the ordered set. But did they ever know how to infer phantom gold between Busang’s barren holes. The trouble with qualified persons is they don’t want to test for spatial dependence between measured values in ordered sets.
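The test for spatial dependence can be sketched as a simple variance ratio. The sketch assumes that the first variance term of the ordered set is the sum of squared differences between adjacent values divided by 2(n–1), that the variance of the set has n–1 degrees of freedom, and that the ordered set has 2(n–1) degrees of freedom. The data and function names are hypothetical. The observed ratio would then be compared with a tabulated F-value at the chosen probability level, which this sketch leaves to the reader.

```python
from statistics import variance


def first_variance_term(ordered_values):
    """First variance term of an ordered set: the sum of squared differences
    between adjacent values, divided by 2(n - 1). This formula is an assumption."""
    n = len(ordered_values)
    squared_differences = sum(
        (a - b) ** 2 for a, b in zip(ordered_values, ordered_values[1:])
    )
    return squared_differences / (2 * (n - 1))


def f_ratio_for_spatial_dependence(ordered_values):
    """Return the observed F-ratio and its two degrees of freedom.

    Numerator: variance of the set, with n - 1 degrees of freedom.
    Denominator: first variance term of the ordered set, with 2(n - 1)."""
    n = len(ordered_values)
    var_set = variance(ordered_values)
    var_first = first_variance_term(ordered_values)
    return var_set / var_first, n - 1, 2 * (n - 1)


# Hypothetical grades for nine samples in their spatial order.
grades = [1.0, 1.2, 1.5, 1.9, 2.4, 2.8, 3.1, 3.3, 3.4]

f_observed, df_set, df_ordered = f_ratio_for_spatial_dependence(grades)
print(f"F = {f_observed:.2f} with df = {df_set} and dfo = {df_ordered}")
# Compare f_observed with the tabulated F-value for (df_set, df_ordered)
# at the chosen probability level to decide whether the ordered set
# shows a significant degree of spatial dependence.
```

A ratio close to one suggests the order carries no information. A ratio well above the tabulated value suggests adjacent values are more alike than values picked at random from the set.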

There’s more to the Bre-X salting scam than swindled shareholders were told. The 1998 Interim Report on Setting New Standards doesn’t make an easy read. I was pleased because the Mining Standards Task Force (MSTF) talked about real statistics in ore reserve estimation where surreal geostatistics had ruled supreme since the 1990s. But MSTF’s best-laid plan for real statistics in ore reserve estimation came to naught because the odd geostat guru turned bold again and bounced back after the Bre-X fraud. What replaced the Mining Standards Task Force and its 1998 Interim Report was the CIM Standing Committee on Reserve Definitions and its National Instrument 43-101 Standards of Disclosure for Mineral Projects. CIM’s definitions were all about weasel words and window dressing with loads of twists and turns. Some sort of show but don’t tell. It sported as many sound sampling practices and proven statistical methods as the Philosopher’s Stone. And much of it was crafted by the most jaded geostatistical mind West of the Rocky Mountains.

To assume the continuity of mineralization between ordered sets of measured values was all the rage among geostatistically gifted ore reserve practitioners in the early 1990s! I myself like to infer because I grew up with sampling and statistics. To infer has the ring of true statistics in my little world. I know statistical inferences and degrees of freedom belong together like donuts and holes. And I even know why! To assume, krige, smooth and rig the rules of statistics was never an option in my work! What I did do is derive confidence limits for the weighted average grade of each hole. Matheron didn’t derive the variance of the length-weighted average for a set of core samples with variable lengths in 1954. So I derived it in 1994. I always verify spatial dependence between measured values in ordered sets. It doesn’t matter whether core samples vary in length, in density, or in both length and density. Statistics gives confidence limits for metal grades and contents as an intuitive measure for risk. And it gives confidence limits that take into account a significant degree of spatial dependence between measured values in ordered sets. This is why it’s so much fun to work with real statistics but kind of silly to put up with surreal geostatistics.
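A minimal sketch of how confidence limits follow once a weighted average grade and its variance are in hand. It assumes symmetric limits of the form average ± t·√variance, with the t-value read from a Student's t table for the applicable degrees of freedom. The average, variance, and function name below are hypothetical.

```python
from math import sqrt


def confidence_limits(average, variance_of_average, t_value):
    """Symmetric confidence limits for a weighted average grade.

    t_value is taken from a Student's t table for the chosen probability
    level and the degrees of freedom of the variance estimate."""
    half_width = t_value * sqrt(variance_of_average)
    return average - half_width, average + half_width, half_width


# Hypothetical weighted average grade and its variance for one hole.
average = 1.08
variance_of_average = 0.0233

# Two-sided 95% t-value for 8 degrees of freedom, from a standard t table.
t_95_df8 = 2.306

lower, upper, half_width = confidence_limits(average, variance_of_average, t_95_df8)
print(f"95% limits: {lower:.3f} to {upper:.3f} (±{half_width:.3f})")
print(f"relative half-width: {100 * half_width / average:.1f}%")
```

Expressing the half-width as a percentage of the average gives the kind of intuitive measure for risk that a grade without limits cannot give.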

Mines do not like confidence limits for mineral reserves in annual reports because it’s a promise of sorts to mining investors. When mines ship mineral concentrates or mined ores to other mines or to smelters, they like confidence limits for metal contents and grades as a measure for risk. So what gives? I told John Drury, Chair, CIM Ad Hoc Reserve Definitions Committee and OSC’s mining expert, that ISO Technical Committee 183 derived unbiased confidence limits for metal contents and grades of mineral concentrates and mined ores. Drury didn’t grasp why the very same method does give 95% confidence intervals and ranges for metal contents and grades of mineral reserves. I liked John Drury because he made time to listen to my story. The year was 1994 and Bre-X was busy at Busang!

Geostatistically engineered mineral reserves may bode well in annual reports but are bound to shrink when mined. Geostatisticians do not know how to derive unbiased confidence limits for metal contents and grades. Pollsters do report confidence limits for opinion surveys because they work with real variances. Geostatisticians work with pseudo kriging variances and do not get unbiased confidence limits. So what’s the matter with the average geostatistical mind? The quintessence is that Agterberg fumbled the variance of the distance-weighted average first in 1970 and again in 1974. It’s a tale of two fumbles so to speak. What baffles me is that not a single geostatistician has asked Agterberg to explain why his distance-weighted average point grade does not have a variance. When will the world’s mining industry ask that question? When will it be ready to replace Matheron’s statistical madness with sound sampling practices and proven statistical methods?

How to Get This Woman Interested in DSI Snake Sandwich Belt Conveyors

I am happy to submit a guest blog for Joe Dos Santos. This is to lighten things up from the technical stuff engineers like so well and frankly…I just don’t understand. Until now!

Spending most of my college life in public relations and marketing classes, engineers and engineering seemed a world away…even though I could walk to the engineering building on my campus in under five minutes. That’s actually strolling. Still, the dynamics of what they did, the math, the figures, the equations! It was way over my head. After all, it was all I could do to get through my remedial math courses much less pursue a higher level of math. Now don’t think I didn’t feel the pressure. My father, a Cornell graduate said he learned to love the thing that really challenged him. Yep….math! And of course, we can’t forget my brother the MIT graduate. Of course I felt the pressure. After all…they are both…you guessed it, Engineers!!!

Well after being with our family company, Dos Santos International, in a full time, official capacity for a year now, I’m going to let my dad and brother in on a little secret. I’m intrigued! I’ve finally found my niche in engineering! I can make it simple…one word! No need for equations or protractors! DIAMONDS!!! That’s right. Diamonds…you know the phrase…a girl’s best friend. Well this girl has finally found a lot of interest in the math behind this sparkling, much sought after wonder! See, it turns out, it takes engineering (math and science…UGH!) to get these dazzling beauties to a jeweler near you. I’m proud to say, that Dos Santos International is responsible for helping to bring these gems to the surface and through the separation process. See, we just completed two projects in Canada. These mines, along with others in the planning process, may soon propel Canada to the number one position in diamond production.

Now not only did that catch my eye because of what is being unearthed, but also because of the high-tech DSI Snake Sandwich Belt Conveyor…wait make that two DSI Snakes. Snap Lake incorporates two, each elevating at the building’s opposite ends… but there’s more. This project, from the beginning, was intended to be environmentally friendly as well. The area where these diamonds are being mined is frozen solid throughout much of the year. I mean the conditions are pretty harsh. Because of this, the project would have to be under cover. No, not secretive but enclosed and heated so people aren’t freezing to death. Sure I’d stand out in freezing temperatures for just a chip of that gorgeous rock, but there are others to consider. Still, they had to make sure they could contain this whole project in a small space. That’s where our Snake Conveyors came in. Because of our high angle capability and our system’s gentle yet firm hugging of the kimberlite (diamond ore) in the sandwich between the belts, we are able to bring these beauties up safely without using too much of Canada’s precious land. Because of our Snakes, the process building’s footprint is small, minimizing the environmental impact and reducing both capital and operating costs, especially the cost of heating. The project maintains the land around it without disturbing much of Mother Nature, while at the same time, unearthing one of nature’s most beautiful treasures. Conveyors…well they’re not the most beautiful sight to me but let’s face it. Diamonds look good on everyone…or anything!

For those of you who enjoy the math and understand the equations, please visit dosssantosintl.com for the logistics of these systems. We have a detailed list of our installations. For the rest of you, just remember DSI Snake Sandwich Belt Conveyors. They are now this girl’s best friend.

Conditional simulation for the mining industry

CIM eNews of March 2008 announced a seminar on Applied risk assessment for ore reserves and mining planning: Conditional simulation for the mining industry. This 2008 CIM, SME, AusIMM, and McGill Professional Development Seminar Series is based on a spurious variant of applied statistics. Conditional simulation with pseudo kriging variances makes no sense in applied statistics. What would make sense is to get rid of surreal geostatistics at each and every institution of higher learning on the face of this planet. I’m working hard to make that happen! The more so because I’ve heard some “geometallurgy” babble that may well set the stage for McGill’s geosciences to gobble up mineral process engineering.

I let my CIM Membership expire because of Applied Statistics and the Bre-X Fraud. This paper for the 2000 Millennium Celebrations was neither accepted nor rejected but simply ignored. I’ve been studying all sorts of CIM stuff since July 4, 2006, when I was made a CIM Life Member because of “many years of active participation and service.” Indeed, I’ve been active in various international committees since 1974. In my 1985 textbook on Sampling and Weighing of Bulk Solids I derive unbiased confidence limits for contents and grades of mineral concentrates. ISO Technical Committee 183 Copper, lead, zinc and nickel ores and concentrates, approved the method for contents and grades of ores and concentrates. This textbook was translated into Mandarin without authorization. Our software module on “Precision and Bias for Mass Measurement Techniques” became ISO/FDIS 12475 with the same title. Spreadsheet templates were set up with Lotus 1-2-3 but Big Blue acquired Lotus and outmoded our module with a swift 1-2-3 sleight of hand. I should have known! I did pick Beta over VHS. But never would I have picked surreal geostatistics over real statistics.

CIM eNews of March 2008 also reminded me of “The Properties of Variances”. For it was the title of a paper I wanted to present at a forum at McGill University on June 3-5, 1993. This forum was to honour Professor Dr Michel David (1945-2000) for his role in creating geostatistics. Professor Dr Roussos Dimitrakopoulos was the main brain behind that somewhat early forum called “Geostatistics for the next Century”. What I had derived in my paper was the variance of the distance-weighted average. I did not know in 1993 that Agterberg had fumbled this variance in 1970 and in 1974. Dimitrakopoulos did not miss it! And neither did David!

My presence might have marred David’s bash as much as my variance for Agterberg’s distance-weighted average had rattled its organizers. Here’s why! Our 1989 paper on “Precision Estimates for Ore Reserves” had irked David, one of several dedicated enforcers of geostatistics for CIM Bulletin. He wrote, “The authors present their own method for calculating ore reserves without a single reference to 20 years worth of work in geostatistical ore reserve estimation.” Whose method had David expected anyway? We had studied his 1977 textbook and determined that he did not derive unbiased confidence limits for grades and contents of ore reserves. What he did do was praise “the famous central limit theorem.” What David didn’t do in his 1977 textbook was derive the variance of the distance-weighted average. And what we didn’t know in 1989 was that Agterberg had fumbled this variance.

David was the first geostatistical scholar who tried to capture Matheron’s new science. He did so in his 1977 Geostatistical Ore Reserve Estimation. Following is a facsimile with the same caption as Figure 203 on page 286 of Chapter 10 The Practice of Kriging.

Fig. 203. Pattern showing all the points within B,
which are estimated from the same nine holes

On the same page, David proclaimed, “Writing all the necessary covariances for that system of equations might be a good test to find out whether one really understands geostatistics.” What he did not tell his readers was that counting degrees of freedom is a good test to find out whether one grasps real statistics. His set of sixteen (16) “estimated points within B” gives precisely zero degrees of freedom. By contrast, his set of nine (9) holes gives df=n–1=8 degrees of freedom, and the ordered set gives dfo=2(n–1)=16 degrees of freedom. Here are the clinchers! Each of David’s “estimated points within B” has its own variance because it is a function of the same set of nine (9) holes. David did not have to fumble the variance of each of his “estimated points within B” because Agterberg had done so in 1970 and again in 1974!

In Chapter 12 Ore Modelling, David claimed, “What we expect from the model is that the simulated values will have the same distribution as the real one, and also that the spatial correlation between values will be the same as the one estimated on the real values. There is an infinite set of simulated values which will have these properties.” David’s plan to make infinite sets smaller defies astronomical odds! This is why conditional simulation with pseudo kriging variances cannot possibly take into account “values observed at the sampled points.”

In a rare instant of perfect vision, David did confess, “The criticism of this model is obvious. The simulation is not reality. There is only one answer: The proof of the pudding is…!” Mining investors may wonder why Bre-X’s phantom gold resource and Hecla’s shrinking Grouse Creek gold reserve did pass David’s proverbial pudding test. Just the same, Dimitrakopoulos is ready for conditional simulation with pseudo kriging variances of subsets taken from infinite sets of honorific kriged estimates without variances and a touch of artificial intelligence.

It’s fitting to celebrate April Fool’s Day once per year. It saddens me that every day is still a fool’s day in ore reserve and resource estimation so many years after Matheron and Agterberg fumbled variances of central values. The world’s mining industry ought to set up an ISO Technical Committee for mineral reserve and resource estimation. It’s never too late to right a wrong!

Agterberg’s problems

Dr F P Agterberg, President, International Association for Mathematical Geology, has a few problems. The least of his problems is to change IAMG’s current name to International Association for Mathematical Geosciences. To bring the distance-weighted average and its central limit theorem back together again is just as pressing a problem as it is to count the degrees of freedom for a set and for the ordered set. So, I’ll try to put in a chronological context the cases of the missing variances and of the unwelcome degrees of freedom.

Agterberg talked about Autocorrelation Functions in Geology at the 1970 geostatistics colloquium in the USA. He had found some kind of “geologic prediction problem”, and drew a picture of it in Figure 1 of his paper. The same figure was reborn as “a typical kriging problem” in Figure 64 of his 1974 Geomathematics. As such, the same figure is published in the 1970 Colloquium Proceedings and in his 1974 Geomathematics. Why was a “geologic prediction problem” reborn as a “typical kriging problem”? I’ve studied the tortuous nomenclature of geostatistics and tried to figure out who lost what and when.

Agterberg's Problem

What both figures do have in common are symbols instead of Agterberg’s “known values” for the set of five irregularly spaced points. As luck would have it, the same function does apply to “a geologic prediction problem” and “a typical kriging problem.” Nomenclature has never been a strong suit in Matheron’s new science of geostatistics by symbols. David bragged in 1977, “It has been known for a long time that geostatisticians seem to have that capacity to change notations twice or more on the same page and still understand each other.” Not similarly blessed, I’m guided by context and ISO symbols and terms. What I have known for twenty years is that geostatisticians have never derived the central limit theorem for the central value of a set of measured values with variable weights, and have never counted degrees of freedom for sets of measured values or ordered sets of measured values.

Agterberg did not mention that his 1970 and 1974 functions are one and the same. The correct contextual description of Agterberg’s function is “the distance-weighted average” of a set of five (5) measured values determined in samples selected at positions with different coordinates. He did not mention that both functions converge on the arithmetic mean as “irregularly spaced known values” become equidistant to P0. In his textbook, he does refer to the central limit theorem in Chapter 6 Probability and Statistics and in Chapter 7 Frequency Distributions and Functions of Independent Random Variables. And he does refer to degrees of freedom in Chapter 6 Probability and Statistics and in Chapter 8 Statistical Dependence; Multiple Regression. He claimed in the second paragraph of Chapter 10 Stationary Random Variables and Kriging of his textbook, “The results can be used for interpolation and extrapolation.” What was Agterberg thinking?
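The convergence on the arithmetic mean can be checked with a few lines of Python. This is a minimal sketch only. The source does not give Agterberg's weights, so inverse-distance weights, the `power` parameter, the function name and the four data points are all assumptions made for illustration.

```python
from math import hypot

def distance_weighted_average(points, target, power=1.0):
    """Inverse-distance weighted average of measured values at a target position.

    points: list of (x, y, value) tuples; target: (x, y).
    Inverse-distance weights and `power` are illustrative assumptions;
    they are not taken from Agterberg's 1970 paper or 1974 textbook.
    """
    raw = []
    for x, y, value in points:
        d = hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with a sampled position
        raw.append((1.0 / d ** power, value))
    total = sum(w for w, _ in raw)
    return sum(w * v for w, v in raw) / total

# Hypothetical values at four positions, each 5 units away from P0 = (0, 0)
equidistant = [(3, 4, 10.0), (-3, 4, 20.0), (5, 0, 30.0), (0, -5, 40.0)]
arithmetic_mean = sum(v for _, _, v in equidistant) / len(equidistant)
dwa_equidistant = distance_weighted_average(equidistant, (0, 0))

# Move P0 so that distances differ: the weights are no longer constant
dwa_offset = distance_weighted_average(equidistant, (2, 2))

print(arithmetic_mean, dwa_equidistant, dwa_offset)
```

With all four positions equidistant to P0 the weights are constant and the distance-weighted average equals the arithmetic mean. Any other P0 gives a different weighted average from the same set of measured values.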

Clark’s 1979 Practical Geostatistics was the first textbook to work with hypothetical uranium concentrations. That gave some touch and feel of real data but the set didn’t display spatial dependence. The author did study real Fisherian statistics where it was born but got into hanging out with the wrong crowd, and into worrying whether or not “the Central Limit Theorem holds.” The good news is this theorem is bound to hold until the end of time! The bad news is IAMG’s President and his cronies on IAMG’s Council think it’s too late to bring central values and central limit theorems back together again.

Agterberg forgot to mention that his set of irregularly spaced points defines an infinite set of “predicted values” within this sample space and beyond it. He didn’t show how to test for spatial dependence by applying Fisher’s F-test to the variance of the set and the first variance term of the ordered set. He didn’t talk about a systematic walk to derive the variance of his ordered set. Agterberg knows his predicted value is a zero-dimensional point grade. If he were to agree that each zero-dimensional point grade does have its own variance, then he ought to revise his 1974 Geomathematics not only by deleting Chapter 10 Stationary Random Variables and Kriging but also by adding a chapter on precision and bias for mineral reserves and resources. That might be useful if the world’s mining industry were ever ready to set up and support an ISO Technical Committee for mineral reserve and resource estimation.

Agterberg’s 1974 Geomathematics, Mathematical Background and Geo-Science Applications, is a comprehensive textbook on the application of the queen of sciences in earth sciences. Agterberg covered much of the range of tools and techniques that mathematics provides in such rich abundance. This is why most of it will stand the test of time. In spite of that, Chapter 10 is bound to crumble under scrutiny because Agterberg’s geostatistical thinking was just as wrong as that of Matheron and his minions. Don’t take my word for it! Show Agterberg’s figure to a professor of mathematics. Ask her or him to explain whether or not each distance-weighted average has its own variance. And walk away if such a simple question about the Central Limit Theorem draws a blank!

Playing kriging games

When my son and I were working on Precision Estimates for Ore Reserves in the late 1980s, we had copies of David’s 1977 Geostatistical Ore Reserve Estimation and Clark’s 1979 Practical Geostatistics. We wanted to know how geostatisticians derive confidence limits for metal contents of ore reserves. The problem is they don’t! By contrast, ISO/TC183 did approve in 1993 a homologue of the same method to derive confidence limits for copper, lead and zinc contents of concentrate shipments.

One of the few geologists who still talked to me in those days gave me his copy of Journel and Huijbregts’ 1978 Mining Geostatistics. My reviews of the first three textbooks are posted on my website. I have offered to review more recent textbooks and study the latest innovations in geostatistical theory and practice. I’ve yet to receive a single copy! So, don’t let a textbook of a more recent vintage gather dust on your bookshelf. Mail it to me and I’ll post my review where it’s easy to find. By the way, I don’t sell anything on my website. I give away advice on sound sampling practices and proven statistical methods. For example, Precision Estimates for Ore Reserves was thrashed by enforcers of geostatistical dogma but praised by and published in Erzmetall 44, Oct, 1993. This paper and several others are posted on my website under Reviewed papers.

Visit and click “a wonderful kriging game of chance” on my Home page. Play this game with Clark’s hypothetical uranium data. Enter different coordinates and see what you get. Don’t enter the moon’s coordinates because it creates too much hypothetical uranium in space. And Clark’s distance-weighted average hypothetical uranium grades get too close to the arithmetic mean grade. Clark wondered whether or not the Central Limit Theorem holds. Fortuitously, Agterberg and Matheron had already eliminated that ubiquitous theorem behind sampling practice.

And don’t test for spatial dependence in Clark’s sample space by applying Fisher’s F-test to var(x), the variance of the set, and var1(x), the first variance term of the ordered set. I never walk to the beat of kriging drums. What I do is walk a systematic walk that covers the shortest possible distance between all coordinates and derive the first variance term of the ordered set. Given that the observed value of F=var(x)/var1(x)=4,480/2,161=2.07 does not exceed F0.05;4;8=3.84, it follows that Clark’s set of hypothetical uranium data does not display a significant degree of spatial dependence. Hence, the distance-weighted average hypothetical uranium concentration of 371 ppm is not necessarily an unbiased estimate.
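The F-test above can be sketched in Python. The variances 4,480 and 2,161 and the tabulated value F0.05;4;8=3.84 are the figures quoted in the text. The function names are hypothetical, and the formula for var1(x), the sum of squared successive differences divided by dfo=2(n–1), is my reading of the first variance term rather than a formula quoted from a source.

```python
def variance(values):
    """var(x): variance of the set, with df = n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def first_variance_term(ordered):
    """var1(x): assumed to be the sum of squared successive differences
    divided by dfo = 2(n - 1), the degrees of freedom of the ordered set."""
    n = len(ordered)
    squared = sum((a - b) ** 2 for a, b in zip(ordered, ordered[1:]))
    return squared / (2 * (n - 1))

def f_test(var_set, var_first, f_critical):
    """Return the observed F and whether it exceeds the tabulated F-value."""
    f_observed = var_set / var_first
    return f_observed, f_observed > f_critical

# Figures quoted in the text for Clark's hypothetical uranium data (n = 5)
f_observed, significant = f_test(4480.0, 2161.0, f_critical=3.84)
print(round(f_observed, 2), significant)  # 2.07 False
```

Given the measured values in the order of the systematic walk, `variance` and `first_variance_term` would supply the two inputs to `f_test`. The tabulated F-value is passed in by hand here so that the sketch needs nothing beyond the standard library.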

Surely, assuming spatial dependence beats the odds of finding no spatial dependence at all! And it makes counting pesky degrees of freedom unnecessary! Just assume, krige, smooth, rig the rules of real statistics, and be happy.

Creating geostatistics

Dr Frederik P Agterberg, President, International Association for Mathematical Geology, called Professor Dr Georges Matheron (1930-2000) the Founder of Spatial Statistics. He ranked him on a par with Sir Ronald A Fisher (1890-1962) and Professor Dr J W Tukey (1915-2000). Agterberg was wrong! Matheron was a self-made wizard of odd statistics. Here’s why!

Matheron derived in his 1954 Formule des Minerais Connexes (Note Statistique No 3) the length-weighted average grade of a set of measured grades of core samples with variable lengths but he did not derive the variance of the length-weighted average grade. He derived in his 1960 Krigeage d’un Panneau Rectangulaire par sa Périphérie (Note Géostatistique No 28) the length-weighted average grade of an in-situ block of ore but he did not derive the variance of the length-weighted average grade. He showcased his Stationary Random Function at a 1970 geostatistics colloquium in the USA. He invoked Brownian motion to conjecture the continuity of this Riemann integral. He failed to put in plain words what Brownian motion and mineral resources do have in common. Matheron, unlike Hald in 1952 and von Neumann in 1941, never worked with Riemann sums. His protégé was Dr A G Journel, a professor at Stanford University. Journel pronounced in 1992 that spatial dependence might be assumed, and saw fit to denounce “…classical Fischerian [sic] statistics.” He couldn’t spell Fisher’s name or count degrees of freedom, and never applied Fisher’s F-test in his 1978 Mining Geostatistics. What he did do in this textbook was allude to the zero kriging variance.

Agterberg’s Autocorrelation Function in Geology emerged at the same colloquium. Agterberg derived his distance-weighted average of a set of “five known values” at positions with different coordinates but did not derive the variance of this distance-weighted average. He fumbled the variance of the distance-weighted average once more in his 1974 Geomathematics. Incredibly, one-to-one correspondence between functions and variances remained beyond his grasp. He failed to point out that any such set of “five known values” defines an infinite set of distance-weighted averages, none of which is blessed with a variance in Matheronian geostatistics. What added to Agterberg’s problem is the fact that he didn’t count degrees of freedom for “five known values.” On a positive note, he did refer to degrees of freedom elsewhere in his textbook on Geomathematics.

Despite such bizarre discrepancies in the seminal works of Agterberg and Matheron, the world’s mining industry went to work with infinite sets of honorific kriged estimates and smoothable kriging variances and covariances. And work it did! So much so that too many mineral resources failed to make the predicted grades during mining. The odd statistician did fall for geostatistics but most know all about one-to-one correspondence between functions and variances. Not much set theory is required to grasp that an infinite set of Agterberg’s zero-dimensional distance-weighted average grades fits within Matheron’s three-dimensional block or along any one-dimensional borehole.

Agterberg’s problem of functions without variances was solved by selecting the least biased subset out of any infinite set of kriged estimates, deriving its BLUE (Best Linear Unbiased Estimator), and smoothing its pseudo kriging variance to perfection. So it came about that way too many geoscientists assume spatial dependence between measured values in ordered sets. Surely, assuming spatial dependence doesn’t justify interpolation let alone extrapolation. And it’s quite tough to beat the immeasurable odds of selecting the least biased subset out of an infinite set of kriged estimates. So tough, in fact, that it ranks among impossible events in probability theory. Of course, geostatistocrats breach elementary rules of mathematical statistics with reckless abandon. Matheronian geostatistics is so lavishly blessed with honorific kriged stuff that distance- and length-weighted averages got lost in all kinds of kriging babble. Just the same, weighted averages do converge on arithmetic means as variable weights converge on constant weights. The question is then whether or not the Central Limit Theorem will ever make a comeback when so many degrees of freedom fighters stand on guard against real statistics.

Edmund Burke (1729-1797) should have said, “All that is necessary for the triumph of bad geoscience is that good geoscientists do nothing.” H G Wells (1866-1946) did predict, “Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.” Wells was wrong! But he was not as wrong as were Dr Frederik P Agterberg and Professor Dr Georges Matheron! I’ll have to talk a bit more about spatial dependence and sampling variograms.

Properties of variances

Most of my life I have worked with William Volk’s Applied Statistics for Engineers. At present I work with his 1980 Reprint Edition. I lost the 1969 Second Edition while I was preaching sound sampling practices and applied statistics around the world. I’m hanging on to my tattered 1958 Original Edition. Volk’s name translates into the Dutch word for “nation” or “people”. That led me to believe William Volk and Jan Visman may share the same roots. Volk holds a 1959 master’s degree in mathematical statistics from Rutgers University and an undergraduate degree in chemical engineering from New York University. So he must have written much of his Original Edition before graduating from Rutgers. I took a real liking to Chapter 7 Analysis of Variance. What I like most of all is Section 7.1.4 Variance of a general function. For it was in this section that Volk proved that each function ought to have its own variance.
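For a weighted average the variance of a general function takes a simple form. The sketch below is a hedged illustration and not Volk's own notation. It assumes independent measured values, weights normalized to sum to one, and the textbook result that the variance of a linear function is the sum of squared weights times variances. The function name and the weights are hypothetical; 4,480 is borrowed from the uranium example only to have a number to work with.

```python
def variance_of_weighted_average(weights, variances):
    """Variance of sum(w_i * x_i) for independent x_i.

    Weights are normalized to sum to 1 before use. Independence of the
    measured values is an assumption of this sketch.
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    return sum(w ** 2 * v for w, v in zip(normalized, variances))

var_x = 4480.0   # illustrative variance of a single measured value
n = 5

# Constant weights: the weighted average is the arithmetic mean
var_constant = variance_of_weighted_average([1.0] * n, [var_x] * n)

# Variable weights (hypothetical): same measured values, different function
var_variable = variance_of_weighted_average(
    [0.40, 0.30, 0.15, 0.10, 0.05], [var_x] * n
)

print(var_constant, var_x / n, var_variable)
```

With constant weights the result reduces to var(x)/n, the variance of the arithmetic mean under the Central Limit Theorem. With variable weights the same set of measured values gives a different function, and that function has a variance of its own.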

Volk’s grasp of the properties of variances shows how inspired he was by Fisher’s work. Probability theory had spawned applied statistics by the time Sir R A Fisher was knighted in 1952. And it was the concept of degrees of freedom that empowered applied statistics and set it apart from probability theory. Fisher in 1922 introduced the concept of degrees of freedom to correct Pearson’s χ²-distribution for finite sets of measured values. It did bridge the breach between probability theory and applied statistics. This is why applied statistics deals with finite samples selected from sampling units or sample spaces. What degrees of freedom also did at that time was fuel the legendary feud between those giants of statistics. Fisher was right because the F- and t-distributions both derive from the χ²-distribution once degrees of freedom are taken into account. Volk’s 1958 textbook is of lasting value because it links χ²-, F-, and t-distributions in such a logical manner.

Volk’s symbols and terms are mostly clear and concise. I found Volk’s “central tendency measures” less intuitive than “central values” (of sets of measured values with either constant or variable weights). I avoid terms such as “successive observations” when discussing an ordered set of measured values of a stochastic variable in a sampling unit or a sample space. All it takes in my work is text and context to correctly explain applied statistics and its symbols.

Volk applied Fisher’s F-test to verify whether or not a pair of variances is statistically identical. He applied Bartlett’s χ²-test to verify whether or not a set of variances is homogeneous. He did not show how to apply Fisher’s F-test to verify spatial dependence between measured values in ordered sets. All it would have taken is to apply Fisher’s F-test to var(x), the variance of a set of measured values, and var1(x), the first variance term of the ordered set. Volk, a chemical engineer, may well have worked with some ordered set of measured values in a sampling unit or a sample space of time but he never showed how to derive a sampling variogram.

John von Neumann was a brilliant mathematician at Princeton’s Institute for Advanced Study when he coauthored Distribution of the Ratio of the Mean Square Successive Difference to the Variance. He seemed unaware in 1941 that a set of n samples gives df=n–1 degrees of freedom, and that an ordered set of n observations gives dfo=2(n–1) degrees of freedom. Had he added all of the terms x1–x2,…,xi–xi+1,…,xn-1–xn, he would have gotten x1–xn, the nth variance term of the ordered set. Had he counted degrees of freedom, he would have gotten the correct number for the ordered set. He may not have noticed that all but x1 and xn are used twice.
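Both observations are easy to verify. The sketch below uses a hypothetical ordered set of five values. It adds the successive differences to show that they collapse to x1–xn, and it counts how often each measured value enters a successive difference.

```python
from collections import Counter

ordered = [12.0, 15.0, 11.0, 18.0, 14.0]   # hypothetical ordered set, n = 5
n = len(ordered)

# Successive differences x1-x2, ..., x(n-1)-xn
differences = [a - b for a, b in zip(ordered, ordered[1:])]
telescoped = sum(differences)               # collapses to x1 - xn

# Count how often each position is used in a successive difference
uses = Counter()
for i in range(n - 1):
    uses[i] += 1
    uses[i + 1] += 1

df_set = n - 1                              # degrees of freedom for the set
df_ordered = 2 * (n - 1)                    # degrees of freedom for the ordered set

print(telescoped, ordered[0] - ordered[-1])
print([uses[i] for i in range(n)], sum(uses.values()), df_ordered)
```

Only the first and the last value are used once; every other value is used twice. The total number of uses equals 2(n–1), the count given in the text for the ordered set.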

Von Neumann deemed working with random numbers a sin of sorts. It explains why he frowned upon heuristic proof. In those days, random numbers were listed in handbooks of statistical tables. That made the mean squared successive difference of a set about as tedious to derive as its variance. He was a pure mathematician, which may well be why his 1941 study did so little to advance mathematical statistics.

Anders Hald, a Professor of Statistics at the University of Copenhagen, pointed out that the correct number of degrees of freedom for the first variance term of an ordered set of n measured values is dfo=2(n–1). He did so in Section 13.5. The Mean Square of Successive Differences of his 1952 textbook on Statistical Theory with Engineering Applications. Hald, too, studied the distribution of r=var1(x)/var(x) rather than Fisher’s F-distribution. Otherwise, he would have noticed that a significant degree of spatial dependence between measured values in some ordered set gives an observed value of F=var(x)/var1(x)>1.

Textbooks on applied statistics such as Volk’s give a table with F-values at 0.05 and 0.01 probability for a matrix of degrees of freedom. Nowadays, Excel’s FINV makes it easy to get the correct F-value at any probability level and with any number of degrees of freedom for either variance. What’s more, Excel’s RAND makes it simple to prove that Standard Uniform Random Numbers (SURNs) and Normally Distributed Random Numbers (NDRNs) do not display a significant degree of spatial dependence. Visit geostatscam.com and find out about SURNs and NDRNs under Sampling and statistics explained.
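The same check can be run outside Excel. The sketch below swaps in Python's `random` module for Excel's RAND, which is a substitution of mine and not the setup on the website. The seed, the set size of 2,000 and the formula for var1(x) as the sum of squared successive differences divided by 2(n–1) are assumptions made for illustration.

```python
import random

def variance(values):
    """var(x): variance of the set, with df = n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def first_variance_term(ordered):
    """var1(x): assumed to be the sum of squared successive differences
    divided by dfo = 2(n - 1)."""
    n = len(ordered)
    squared = sum((a - b) ** 2 for a, b in zip(ordered, ordered[1:]))
    return squared / (2 * (n - 1))

rng = random.Random(2008)   # fixed seed so that the run is repeatable
n = 2000

surns = [rng.random() for _ in range(n)]          # standard uniform random numbers
ndrns = [rng.gauss(0.0, 1.0) for _ in range(n)]   # normally distributed random numbers

f_surn = variance(surns) / first_variance_term(surns)
f_ndrn = variance(ndrns) / first_variance_term(ndrns)

print(f_surn, f_ndrn)
```

For sets of random numbers in the order in which they were generated, both observed F-values should sit close to one, which is what the absence of spatial dependence looks like. Comparing them with a tabulated F-value at df=n–1 and dfo=2(n–1) completes the test.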

Not all geoscientists know how to test for spatial dependence by applying Fisher’s F-test. In fact, geostatisticians would rather assume than test for spatial dependence. They have also been taught that some functions do not have variances. The problem is that too many geoscientists know too little about sampling and statistics and too much about surreal geostatistics.
