June 01, 2012
Turning Scientific Perplexity into Ordinary Statistical Uncertainty
Cosma Shalizi in American Scientist:
D. R. Cox published his first major book, Planning of Experiments, in 1958; he has been making major contributions to the theory and practice of statistics for as long as most current statisticians have been alive. He is now in a reflective phase of his career, and this book, coauthored with the distinguished biostatistician Christl A. Donnelly, is a valuable distillation of his experience of applied work. It stands as a summary of an entire tradition of using statistics to address scientific problems.
Statistics is a branch of applied mathematics that studies how to draw reliable inferences from partial or noisy data. The field as we know it arose from several strands of scholarship. The word “statistics,” coined in the 1770s, originally referred to the study of the human populations of states and the resources those populations offered: how many men, in what physical condition, with what life expectancies, what wealth and so on. Practitioners soon learned that there was always variation within populations, that there were stable patterns to this variation and that there were relations between these variables. (For instance, richer men tended to be taller and live longer.) Another component strand was formed when scientists began to systematically analyze or “reduce” scientific data from multiple observers or observations (especially astronomical data). It became obvious from this research that there was always variation from one observation to the next, even in controlled experiments, but again, there were patterns to the variation. In both cases, probability theory provided very useful models of the variation. Statistics was born from the weaving together of these three strands: population variability, experimental noise and probability models. The field’s mathematical problems are about how, within a probability model, one might soundly infer something about a given process from the data the model generates, and at the same time quantify how uncertain that inference is.
Applied statistics, in the sense that Cox and Donnelly profess, is about turning vexed scientific (or engineering) questions into statistical problems, and then turning those problems’ solutions into answers to the original questions. The sometimes conflicting aims are to make sure that the statistical problem is well posed enough that it can be solved, and that its solution still helps resolve the original, substantive dilemma—which is, after all, the point.
Rather than spoiling any of Cox and Donnelly’s examples, I will sketch one that recently came up in my department.
More here.
Posted by S. Abbas Raza at 10:30 AM

Comments
Statistics on a Friday? Really?! This is so boring. I thought you hired interns to spice things up.
Posted by: Al | Jun 1, 2012 11:07:51 AM
When statistics are used by some legislatures, the result can be entertaining at first but soon horrifying.
The legislature of the State of N. Carolina is a case in point.
They are working on a law that makes it illegal to say the ocean will rise more than 8 inches in official deliberations.
One wonders what effect this will have on neighboring Virginia, which has already had a 1.5-foot sea level rise, in The Sea of Virginia I presume.
Posted by: Dredd | Jun 1, 2012 11:31:16 AM
Al, I found it pretty interesting but then I am weird that way. And maybe I felt guilty posting the silly piece from the Onion just underneath, so compensated with some serious seriousness! :-)
The interns don't start for another couple of weeks so you'll have to deal with my boring stuff until then.
All best, Abbas
Posted by: Abbas Raza | Jun 1, 2012 11:32:51 AM
The most important facts about probability and statistics are:
"There are lies, damn lies and statistics", Mark Twain.
Likely events may fail to happen while unlikely events may happen.
You may have a 99% chance of winning but still lose. Or, you may have a 99% chance of losing but still win. By the way, people win the lottery every day although the chance of winning is very small.
Finally, the best way to tell a lie, especially to a gullible public, is with statistics. This is true because most of the time the recipients of the lie have neither the data to refute the statistics nor the details needed to check their veracity and assumptions. They also rarely have the intelligence to do so.
"The great masses of the people...will more easily fall victims to a big lie than to a small one." Adolf Hitler
To avoid the pitfalls of statistical data and conclusions, the dictum "check your premises" is appropriate.
Posted by: WJAbbe | Jun 1, 2012 11:45:08 AM
I find it reassuring that science is able to document uncertainty with precision. Or is that a contradiction in terms?
The sea level reference reminds me of one of my favorite Twitter messages. "Don't worry about the rising tide. When it gets within five feet of the house we can sell to a Republican."
I had a friend who admitted that when he was defending his thesis one of the examiners said "Roy, your main problem is that you are rarely specific, and whenever you are, you're nearly always wrong."
Posted by: John Ballard | Jun 1, 2012 5:40:19 PM
Shalizi wished for more graphical models. For those who agree, your nirvana awaits you here:
http://www.itl.nist.gov/div898/handbook/eda/section3/eda3.htm
Posted by: DAS | Jun 1, 2012 6:30:28 PM
That's why scientists came up with the multiverse, so there is no uncertainty left and anything that can happen will happen with probability 1.0.
Posted by: Raza | Jun 1, 2012 6:42:29 PM
I am glad to see that I am not the only one who finds randomness and uncertainty a fascinating topic of discussion. Even more fascinating is the fact that uncertainty can often be precisely quantified (e.g. in the case of signal detection) and (nearly) eliminated through clever exploitation of redundancy and prior knowledge.
Perhaps people often view the theory of estimation, detection and statistical inference as boring because it is so mathematically challenging...and because the results of the improper use of statistical methods are presented to us all too often by lawyers, journalists and politicians...
Posted by: Bill | Jun 2, 2012 8:05:46 AM