As many of you know, John Brockman is literary agent for a parliament of well-known scientists, science journalists, and others – Richard Dawkins, Steven Pinker, Dan Dennett, George Dyson and a cast of, if not thousands, perhaps hundreds. Each year he poses a question and they answer it. Then the answers are posted to the web at The Edge, Brockman’s website. This year’s question, which elicited 206 responses:
What scientific term or concept ought to be more widely known?
I’ve been through them, though only quickly, and selected three for comment: prediction error minimization, Bayes’s Theorem, and attractors.
Andy Clark: Philosopher and Cognitive Scientist; Professor of Logic and Metaphysics, University of Edinburgh, UK; Author: Surfing Uncertainty: Prediction, Action, and the Embodied Mind.
Let’s ease into this one.
Once upon a time, back in my undergraduate days during the 1960s, I was invited to a party at an artist’s loft. This was a real honest-to-god un-renovated loft, large, bare walls, a wood stove for heat (it was mid-winter), raw. Someone remarked they were showing a film “over there.” Sure enough, there was a 16mm projector facing a wall, clicking and buzzing rapidly away, and there were blurry gray smudges dancing on the wall opposite (no sound).
I watched the moving mottled grays for some seconds, five, ten, twenty, who knows, I wasn’t counting, and then SHAZAM! It became clear. On the right, a naked woman standing, bent forward, outstretched arms touching a wall. On the left, a naked man behind her, thrusting away. SEX! First porn film I’d ever seen.
But why did it take me a while to see what was very plainly there in the flickering lights on the wall? Because I didn’t know what I was seeing, that’s why, and that’s what Clark’s prediction error minimization is getting at. If someone had said “hey, dirty movies” or I’d seen a title (say, “Danny Does Debbie”) I’d have known what to look for in the lights. But I didn’t know and it took me a while to figure it out.
Well, not so much ME, considered as a reasoning being, because there was no reasoning involved. I just looked and looked until things became clear. My brain, the cognitive unconscious, did it on my behalf, if you will.
The fact is, the world is much too rich for our perceptual and cognitive systems to keep up with it – hence the blurry splotches of light I saw. Most of the time, however, we are in a world that is, to a non-trivial degree, familiar: not only familiar kinds of objects and events, but even specific things, our ordinary everyday surroundings. In such a world we are reasonably good at predicting what comes next. Still:
Consider something as commonplace as it is potentially extremely puzzling—the capacity of humans and many other animals to find specific absences salient. A repeated series of notes, followed by an omitted note, results in a distinctive experience—it is an experience that presents a world in which that very note is strikingly absent. How can a very specific absence make such a strong impression on the mind?
The best explanation is that the incoming sensory stream is processed relative to a set of predictions about what should be happening at our sensory peripheries right now. These, mostly unconscious, expectations prepare us to deal rapidly and efficiently with the stream of signals coming from the world. If the sensory signal is as expected, we can launch responses that we have already started to prepare. If it is not as expected, then a distinctive signal results: a so-called “prediction-error” signal. These signals, calculated in every area and at every level of neuronal processing, highlight what we got wrong, and invite the brain to try again.
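To make the idea concrete, here is a toy sketch in Python. It is mine, not Clark’s model, and it makes strong simplifying assumptions: the listener has already learned a two-beat pattern, each beat is either 1 (note) or 0 (silence), and the function name and the numbers are invented for illustration. The prediction error is simply what was heard minus what was expected.

```python
def prediction_errors(observed, pattern):
    """Compare each observed beat with the beat the learned pattern predicts."""
    period = len(pattern)
    return [(t, heard - pattern[t % period]) for t, heard in enumerate(observed)]

# Assumed, already-learned expectation: note, silence, note, silence, ...
pattern = [1, 0]

# What actually arrives; the note expected at beat 6 is omitted.
heard = [1, 0, 1, 0, 1, 0, 0, 0, 1, 0]

errors = prediction_errors(heard, pattern)
surprises = [(t, e) for t, e in errors if e != 0]
print(surprises)  # [(6, -1)]
```

Everywhere the signal matches the expectation the error is zero and nothing stands out. The single non-zero value sits on the beat where the note went missing, which is one way to picture how an absence can be salient.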
Here’s an example that is more complex because it involves, not only perception, but timing motor output to anticipate perception. Look at this photo:
As you can see, it’s snowing and visibility is poor. What you don’t see is that there are small lights on the stop sign at the right. They appeared to blink at regular intervals. I wonder if I can catch a shot of the lights? The problem, of course, is if I wait until I see the lights to snap the shutter, I’ll miss them because they’ll have shut off by the time the shutter opens. So I’ve got to anticipate them, just like a baseball batter anticipates a pitch. You can’t start your swing when you see the ball in the strike zone; you start swinging early. That’s tricky.
Anticipating the lights is not so tricky. They’re stationary and they appeared to be blinking at regular intervals. Conscious thought isn’t fast enough to keep up. [Insert standard mystical Zen martial arts clichés about not thinking and about going with the flow.] Don’t think, do.
The photo above was the first in the series. Here’s the sixth:
Got it! After that, six more misses, then I got it again; another miss, got it next time; two misses, got it. And then I quit. It seemed clear that I was converging on the proper timing.* Time to go home.
And time to switch scale, from milliseconds, seconds, and minutes to weeks, months, and years or more, at least in the examples that interest me. At whatever scale, we’re always making assumptions about and anticipating the future and those assumptions are always grounded in our past.
Sean Carroll: Theoretical Physicist, Caltech; Author, The Big Picture.
Here’s what Carroll says about Bayes’s Theorem:
We turn to Bayes’s Theorem whenever we’re uncertain about the truth of some proposition, and new information comes to light that affects the probability of that proposition being true. […]
The theorem itself isn’t so hard: the probability that a proposition is true, given some new data, is proportional to the probability it was true before that data came in, times the likelihood of the new data if the proposition were true.
So there are two ingredients. First, the prior probability (or simply “the prior”) is the probability we assign to an idea before we gather any new information. Then, the likelihood of some particular piece of data being collected if the idea is correct (or simply “the likelihood”). Bayes’s theorem says that the relative probabilities for different propositions after we collect some new data is just the prior probabilities times the likelihoods.
Here’s a powerful case, and perhaps a bit odd, that entered my personal repertoire of “incidents revealing how the world works”: the fall of the Berlin wall in 1989, which marked the collapse of the Cold War. While a prescient few realized that the Soviet empire was rotting within — notably Senator Daniel Moynihan — I was not one of them. I was raised through primary school in the Cold War, had had fantasies of a super-cool bomb shelter in the back yard, and assumed that the Cold War would dominate the international arena through the day I died and beyond.
My prior proposition, or more simply, my prior, was something like: the Cold War is inherent in the nature of the world. The Cold War was not something I merely read about in history books. It was something I became aware of, really, before I was old enough to have any real sense that there is such a thing as history, and it conditioned my life. On that basis an event like the fall of the Berlin Wall was extremely unlikely. When that unlikely event happened, I had to revise that basic sense of the world: Oh, the world CAN change, and in sudden and dramatic ways.
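To put rough numbers on that kind of revision, here is a minimal sketch of Carroll’s recipe: multiply each prior by its likelihood, then rescale so the results sum to one. The two propositions, the function name, and every number are my own assumptions, chosen only to illustrate the arithmetic; I never actually held beliefs this precisely.

```python
def bayes_update(priors, likelihoods):
    """Posterior is proportional to prior times likelihood, renormalized to sum to 1."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}

# Hypothetical priors: near-certainty that the Cold War is a permanent fixture.
priors = {"cold_war_permanent": 0.95, "cold_war_can_end": 0.05}

# Hypothetical likelihoods: how probable is the fall of the Wall under each view?
likelihoods = {"cold_war_permanent": 0.001, "cold_war_can_end": 0.30}

posterior = bayes_update(priors, likelihoods)
for proposition, probability in posterior.items():
    print(f"{proposition}: {probability:.2f}")
```

With these made-up inputs the belief that the Cold War can end climbs from 0.05 to roughly 0.94. A strong prior gives way when the data are wildly improbable under it.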
In the recent past, the election of Donald Trump to the presidency has forced a lot of Democrats to rethink, but rethink just what? That depends. I note, however, that by attributing Clinton’s loss to dirty tricks – by the Russians, the FBI, and by GOP restrictions on voter registration – and by emphasizing that Clinton actually won the popular vote, you can preserve a belief that Clinton’s programs, that is, the programs of the neo-liberal third-way triangulating Democratic Party, are essentially right for the country and are what voters really want. Democrats who emphasize the unexpected success of Bernie Sanders, an old socialist fer cryin’ out loud! and of course the actual victory of Donald Trump, whose own party had doubts about him and who appealed to white working class males, these people are thinking a different way. They may not in fact have been particularly happy with Hillary (even if they voted for her), so they’re defending a different set of priors.
What if they’re all wrong? Is it deeper than that? Do we really know?
That brings me to the last idea:
Kate Jeffery: Professor of Behavioural Neuroscience, Dept. of Experimental Psychology, University College London
Let’s wade right in without bothering with a definition; that can come later. Having presented the annealing of glass as a physical example, then other natural systems, Jeffery looks at human society:
The problem of pairing everybody off so that the species can reproduce successfully is a problem of annealing. Each individual is trying to optimize constraints — they want the most attractive, productive partner but so do all their competitors, and so compromises need to be made — bonds are made and broken, made again and broken again, until each person (approximately speaking) has found a mate. Matching people to jobs is another annealing problem, and one that we haven’t solved yet—how to find a low-strain social organization in which each individual is matched to their ideal job? If this is done badly, and society settles into a strained local minimum in which some people are happy but large numbers of people are trapped in jobs they dislike with little chance of escape, then the only solution may be an annealing one — to inject energy into the system and shake it up so that it can find a better local minimum. This need to de-stabilize a system in order to obtain a more stable one might be why populations sometimes vote for seemingly destructive social change. The alternative is to maintain a strained status quo in which tensions fail to dissipate and society eventually ruptures, like shattered glass.
I don’t know just when Jeffery submitted her answer, but I can’t help but think that she had the 2016 US presidential election in mind when she wrote those last two sentences – or perhaps Brexit back in June, as she’s British. That’s certainly what a lot of Americans had in mind with the candidacies of Trump and Bernie Sanders – “Let’s shake things up!”
The line is that a certain centrist “establishment” has been running things in Washington at least since Bill Clinton. To be sure, George W. Bush was President for eight years, but he was a centrist, no? Perhaps a different brand of center from Clinton or Obama, but still, the corporate Wall Street center. But Sanders and Trump…Whoa, baby!
Let’s get back to the basic physics. Here’s what Jeffery says about attractors:
Systems in which elements interact with their neighbors and settle into stable states are called attractors, and the stable states they settle into are called attractor states, or local minima. The term “attractor” arises from the property that if the system finds itself near one of these states it will tend to be attracted towards it, like a marble rolling downhill into a hollow. If there are multiple hollows — multiple local minima — then the marble may settle into a nearby one that is not necessarily the lowest point it can reach. To find the “global minimum” the whole thing may need to be shaken up so that the marble can jiggle itself out of its suboptimal local minimum and try and find a better one, including eventually (hopefully) the global one. This jiggling, or injection of energy, is what annealing accomplishes, and the process of moving into progressively lower energy states is called gradient descent.
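The marble picture can be sketched in a few lines of Python. This is a toy under stated assumptions, not anything from Jeffery’s piece: the one-dimensional landscape, the step sizes, the cooling schedule, and all function names are invented for illustration. Plain gradient descent rolls into the nearest hollow and stays there. The annealing version adds random jiggling that cools off over time, and it keeps track of the lowest point it has visited.

```python
import math
import random

def energy(x):
    # Assumed landscape: a shallow hollow near x = +1, a deeper one near x = -1.
    return (x * x - 1) ** 2 + 0.3 * x

def slope(x):
    # Derivative of energy(x).
    return 4 * x * (x * x - 1) + 0.3

def gradient_descent(x, rate=0.01, steps=5000):
    """Always step downhill; ends in whichever hollow is nearest the start."""
    for _ in range(steps):
        x -= rate * slope(x)
    return x

def anneal(x, temperature=1.0, cooling=0.9995, steps=20000, seed=0):
    """Random jiggles, sometimes accepted uphill; the jiggling fades as it cools."""
    rng = random.Random(seed)
    best = x
    for _ in range(steps):
        candidate = x + rng.gauss(0, 0.5)
        delta = energy(candidate) - energy(x)
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            x = candidate
            if energy(x) < energy(best):
                best = x
        temperature *= cooling
    return best

start = 1.5                       # the marble starts on the right-hand slope
settled = gradient_descent(start)
shaken = anneal(start)
print(f"gradient descent: x = {settled:.3f}, energy = {energy(settled):.3f}")
print(f"annealing:        x = {shaken:.3f}, energy = {energy(shaken):.3f}")
```

Started on the right, gradient descent stops in the shallow right-hand hollow. The annealed marble has enough early energy to hop the ridge, so the lowest point it reports is in the deeper hollow on the left.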
Let’s consider the simple system I photographed in A Primer on Self Organization. It consists of a tumbler filled with ordinary tap water into which we introduce two or three drops of black ink. Of course we know what’s going to happen. In time the ink will all but disappear into the water.
Here’s the tumbler before I dropped some ink into it:
Consider this photo, the first one I was able to take after dropping the ink:
As I recall, two or three drops entered the water where the individual molecules immediately began to interact with their neighbors and so began to diffuse through the water.
Physicists like to think about this by constructing a very abstract space in which the instantaneous state of the system is but a single point. For that we need a space of very high dimensionality which has a dimension for each molecule (actually, six for each molecule: 3 for its position in space and 3 for its momentum). Just to get a rough feel for this let’s do a crude calculation. Our little system has three spatial dimensions which the photograph flattens (projects) into two. The original of this photo measures 2670 by 2003 pixels making a total of 5,348,010 pixels. Think of that as a two-dimensional measurement of the evolving system taken at an “instant” of time.
That measurement space alone would then require 5,348,010 dimensions. It contains no momentum information at all and each individual pixel is, in effect, a “smear” of billions upon billions of molecules. It’s a pretty poor proxy for real measurement, but it’s enough to give some intuitive sense of the complexity and strangeness of these conceptual objects, these abstract spaces in which a system’s state is represented by a single point.
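For a sense of how far the photograph falls short of the full space, here is the crude calculation carried one step further. The pixel dimensions are the ones given above. The 250 grams of water is my assumption (I did not measure the tumbler), and the ink molecules are ignored.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
WATER_MOLAR_MASS_G = 18.015     # grams per mole of H2O

width_px, height_px = 2670, 2003
pixels = width_px * height_px

water_grams = 250.0             # assumed amount of water in the tumbler
molecules = water_grams / WATER_MOLAR_MASS_G * AVOGADRO

# Six numbers per molecule: three for position, three for momentum.
phase_space_dims = 6 * molecules
molecules_per_pixel = molecules / pixels

print(f"pixels in the photo:       {pixels:,}")
print(f"phase-space dimensions:    {phase_space_dims:.1e}")
print(f"water molecules per pixel: {molecules_per_pixel:.1e}")
```

Under that assumption the phase space has on the order of 10^25 dimensions, and each pixel stands in for something like 10^18 molecules. The five million dimensions of the photograph are a drastic compression.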
Here’s the fourth photo I took:
That state too, however rich and complex it appears in the photographic measurement, is only one point in the phase space. Likewise, the final state (four and a half hours after I dropped the ink) is but a point in this high-dimensional space (ignore the streaks on the tumbler, which are reflections):
The sequence of states a system occupies during its evolution is called its trajectory. That trajectory can be visualized by projecting it into a 3D or 2D space. In the next image I show what such a projection might look like for this system:
I do not in fact know what that trajectory would look like in 2D projection. I made this image purely for illustrative purposes. The nine points represent the nine photographs I took. Each represents a single microstate of the system, all zillions of dimensions. The red point is the first photograph while the blue point is the last. The light gray line represents a WAG (wild-ass guess) about the whole trajectory. Whether or not it’s correct is irrelevant. What’s important is that there IS a trajectory.
If you were to perform this little demonstration many times, it would become evident that, no matter where the ink was dropped into the tumbler, the final state would always be the same. The different trajectories might be somewhat different in character, but the final point is always the same. That final point is called the system’s attractor.
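Here is a toy version of that repeated demonstration, sketched in Python. It is a deliberately crude model and an assumption on my part: a one-dimensional row of 20 cells with closed ends, where at each step a fixed fraction of the concentration difference flows between neighbors. Real ink in real water also swirls and sinks, which this ignores. The function name and parameters are invented for illustration.

```python
def diffuse(concentration, rate=0.25, steps=5000):
    """Neighbor-to-neighbor exchange in a closed row of cells."""
    cells = list(concentration)
    for _ in range(steps):
        updated = cells[:]
        for i in range(len(cells) - 1):
            flow = rate * (cells[i] - cells[i + 1])  # moves from fuller to emptier
            updated[i] -= flow
            updated[i + 1] += flow
        cells = updated
    return cells

def drop_ink(position, n_cells=20):
    """All the ink starts in a single cell."""
    cells = [0.0] * n_cells
    cells[position] = 1.0
    return cells

# Drop the ink at the left end, somewhere in the middle, and at the right end.
final_states = [diffuse(drop_ink(position)) for position in (0, 7, 19)]
for state in final_states:
    print(f"min = {min(state):.6f}, max = {max(state):.6f}")
```

Wherever the drop starts, every cell ends up holding the same share, 1/20 of the ink. The routes differ but the end point does not, and nothing in the code pulls the system toward that state. It falls out of the local exchange rule.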
The name is unfortunate as it suggests that there is something THERE that is ATTRACTING the system to that state, like bees to honey or iron filings to a magnet. There isn’t. Nothing is attracting in that way; there’s nothing attractive about that point. It’s just that, given the internal dynamics of the system, that’s how things work out.
At this point you might be thinking: Really? Do you really think that ink diffusing into a tumbler of water has anything whatever to tell us about, you know, social systems, and elections and stuff? That’s a good question.
And my answer is: maybe. We’ll never know until we try. Remember, though, that that’s only one of the three examples I’ve discussed in this post. And those three are taken from a set of 206 ideas posted at The Edge, a number of which are conceptually close to these three.
What would it take to gather 10, 33, 45, a hundred or more such ideas into a single conceptual system? Would that kind of system give us insight into human behavior that’s deeper than any existing conceptual system? And who could understand such a system?
I don’t know the answers to those questions. But it does seem to me that that’s what the human sciences will be working on for the rest of the century, and beyond. In the words of James Tiberius Kirk, “to boldly go where no man has gone before” – that’s our mission.
* I don’t know how long the light blinks on, but it’s likely that it persists in vision longer than it is actually on. If the light is on for only, say, 1/100 of a second, then the nervous system has to be working at that scale in order to successfully anticipate the blink. For what it’s worth, the shutter speed when I caught the light was 1/250 of a second.