
3quarksdaily

An Eclectic Digest of Science, Art and Literature


August 17, 2009

Psychological Science: The [Non-]Theory of Psychological Testing – Part 3

"Psychological Science: The [Non-]Theory of Psychological Testing – Part 1" can be found HERE.

"Psychological Science: The [Non-]Theory of Psychological Testing – Part 2" can be found HERE.

Note: My views in these three articles on Psychological Test Theory (PTT) are limited to psychological science, particularly what we know as the statistical theory of psychological testing: Classical Test Theory (CTT) and Item Response Theory (IRT). While I do not cover, explicitly, classical inferential statistics in psychological research, some of my ideas would extend to that domain, particularly on Plato's Ideal Forms, and the tautological nature of some psychological statistics. I have nothing to say about how my views apply, or not, to engineering, quantum physics, and neural activity in the brain. At times, I use 'overstatement' as a rhetorical device to make a point.


"If a thing exists, it exists in some amount; and if it exists in some amount, it can be measured." *

* –E. L. Thorndike (1874-1949), Introduction to the Theory of Mental and Social Measurements (1904)



"Thus, if we perceive the presence of some attribute, we can infer that there must also be present an existing thing or substance to which it may be attributed." **

"For I freely acknowledge that I recognize no matter in corporeal things apart from that which the geometers call quantity, and take as the object of their demonstrations, …." **

** –Rene Descartes (1596-1650), Principles of Philosophy. I:52 and II:64 (1644).

 

More philosophical embarrassments for psychology

    These oft quoted, or paraphrased, ideas have been unfortunate for psychological science. E. L. Thorndike's pioneering contributions to educational, social, general, and industrial psychology, and animal behavior are substantial, without dispute. However, this forceful attempt to establish, with 'common sense', a justification for psychological testing, was no more than a restatement of Plato's Ideal Forms. At the beginning of his illustrious career, psychology and philosophy were commonly administered in the same college and university departments. During his lifetime we saw the ascendancy of psychological science as a discipline separate from philosophy, but with a vestige of relationship issues from the prior marriage of long standing.

    Descartes gave us another problem, frustrating when we look back on it, that limited progress in science and philosophy for nearly 400 years. When it came to mental life (thinking, reasoning, cognition, memory), there was a clear line of demarcation between humans and the rest of the entire animal world. Humans could think, plan, imagine, reason, and solve complex problems; animals functioned at the level of instinct and base neural connections. Thorndike reinforced this notion by a refusal to see the possibility of human-like thought processes in research on animals. The problem of mind and body, since Descartes, advanced only by putting a hyphen between the two words, 'Mind-Body'. Fortunately, philosophy has stopped asking itself questions that can't be answered.

    Alright, not all scientists and philosophers are perfect. The point I wish to make, though, is that the same philosophical and mathematical assumptions that help to spur advances in psychological science can also limit its future development. If psychology, as we know it, does not get its scientific-philosophical-mathematical act together, it will be eclipsed by neuro-cognitive science, fMRI, genetics, biology, endocrinology, pharmacology, and the commercial testing industry. The prefix 'psycho' may be sprinkled, amply, through course catalogs, but we might be hard pressed to justify administering it as a separate discipline, and distributing research dollars on a par with other departments. It is my personal view that we have less than one generation to shake ourselves loose of an entrenched failure to reform our scientific shortcomings.


The Revolutions of 1996-7

    I was completely oblivious to the opening salvos, and the barricades, when a few anarchist psychologists and disillusioned social science researchers went into the streets and called for an end to psychological and inferential statistics as we knew them. The voices of discontent and progress were battling the entrenched department heads who thought their own research, and that of their students, would not reach publication. Sequences of courses were breached, content was revised, and a few of the old guard went, voluntarily, to reeducation camps (pre-convention seminars). In April 1996, I started a social science research company in Brewster, NY. I was preoccupied with hiring staff, renting office space, buying equipment, marketing, writing proposals, and funding my own start-up business. I don't think I read a professional journal for a couple of years. Certainly, there was no time or money to spend the better part of a week at the annual conventions of the American Psychological Association (APA), and the Society for Industrial/Organizational Psychology (SIOP).

    When I finally discovered what happened, the smoke and the barricades were gone, and there was no obvious trace of the birth of an important movement. I had to go looking for it. It's always chancy when you try to pin down the one event that started it all, especially when many factors, over time, may have made that one moment an auspicious one. So, here's my candidate for the shot heard round the world of psychological statistics. It was an article by Frank L. Schmidt, "Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers," in Psychological Methods, 1996, 1, 115-129. The following year, 1997, saw a great deal of focus at the APA convention on some of the issues discussed by Schmidt. At the risk of oversimplifying the central ideas that were debated, I make the following observations: The concept of the effective, well-conducted primary study is an illusion; statistical testing in the single study is virtually worthless; the value of the primary or single study is only assessed, years later, in a meta-analysis. The revolutionaries wanted to ban all reporting of significance results. The brave comrades who manned and womaned the barricades did change some of the rules of peer-reviewed journals in psychology for the better.

    What the hell was going on that led bookish, nerdy researchers to take to the streets, and put their jobs and reputations on the line? My view is that the fundamental suppositions of psychology as a science were fatally flawed. They were not rebelling against psychology, nor were they rebelling against the science of psychology. They were simply saying that what they had learned in graduate school, and continued to teach their students, was frustrating their progress as social science researchers. They were wondering, in my opinion, if psychology as a science could hold its own with the other more successful sciences. I think many of the courageous warriors DID NOT understand that the philosophy and statistical theories undergirding their science were dooming them to failure. They didn't appreciate this problem nor articulate it this way; they just knew it wasn't working. The result of brave men and women standing up to the tyranny of the past and the established order of things was to deemphasize statistical tests in refereed journals, and provide results that were more descriptive rather than purely inferential.


Pre-revolution history

    The pre-revolution history shows that this moment in time was inevitable. Psychology, as a science, was given a huge boost by the psychologists who cut their teeth on psychological testing and psychological research in the U. S. Army Air Corps during WWII. It is not an exaggeration to say that the science of psychology in the second half of the twentieth century is indebted, immeasurably, to the Army Air Corps. Among many truly outstanding psychologists was Robert L. Thorndike (1910-1990), who followed his father, E. L. Thorndike, into Teachers College, Columbia University, in New York City. He was one of the best psychologists and psychometricians of the twentieth century. Few, however, took note that he was very clear that PTT was a tautology. Many of my colleagues will probably bristle at this and find it implausible. After all, he was one of the giants in the field of PTT, and contributed mightily to the literature, and the texts that are still used today. Is there a paradox or contradiction here? No. He was intelligent enough, and so well versed in the statistical theory of mental tests, that he understood it for what it was: a highly useful tool for society (as it still is) that was based on a tautology.

    R. L. Thorndike demonstrated the tautological nature of PTT with a simple example. First, we need a little background. The two most important cornerstones of PTT are the concepts of validity and reliability. Validity is a property that is imputed to a psychological or educational test, if it can be demonstrated that it is measuring what it is intended to measure. For example, a school district wants to use a standardized test to assess mathematics achievement among its eighth graders. How do the school superintendent, principals, and parents know that a particular test really measures mathematics achievement as it relates to their educational requirements? Test publishers claim a test, in their catalog, measures eighth grade math achievement; upon inspection it may even look like a test of eighth grade math. Is it valid as a test of eighth grade math achievement? It is valid (it has validity) only if the test is subjected to specific kinds of research and examination that are spelled out in a document called, "The Standards for Educational and Psychological Testing."

    Reliability is a property that is imputed to a psychological or educational test, if it can be demonstrated that it yields consistent results with repeated use, all other things being equal. The statistical determination of the property of reliability is founded upon the concept of parallel tests. Achievement Test A, and Achievement Test B, are parallel if the content is essentially the same with, possibly, some variation. For example, Test A asks a student to solve for the unknown in the equation, 24 = 4 + x. Test B uses the equation, 13 = 3 + x. If there is only one test form available, the items of the single test could be divided into two tests of equal numbers of items. Thus, we have a Test A, and a Test B, administered at the same time on one form, and in one sitting. These parallel tests are referred to as split-half, parallel tests. It is also possible to use a single test as its own parallel test, with two different administrations of the same test. Like validity, the determination of a test's reliability is the result of prescribed research and statistics found in "The Standards for Educational and Psychological Testing." Reliability is an indispensable, but not sufficient, condition for the validity of a test.

    Now, how does Thorndike demonstrate the tautological nature of PTT? He does it very simply. Reliability is defined in terms of parallel tests, and parallel tests are defined in terms of reliability. If you want to determine a test's reliability, then create a parallel test and follow the recipe in the "Standards." Tests are parallel, if they can be used to measure reliability. What do we have? We have circular reasoning, also known as a tautology: Reliability cannot be imputed without parallel tests, and parallel tests, as a concept, do not exist apart from their use in determining reliability. All statements in a tautology are necessarily true.

    Some of the leading, early psychometricians recognized this tautological problem as early as the 1930s – definitely in the 1940s and 1950s. The brilliant psychologist, Jane Loevinger (1918-2008), was probably the first (and for a long time the only) psychologist to make a stink about the fact that there was no non-circular definition of test reliability. She was ignored by the big name psychometricians of her day, but her assertion stands, and has never been challenged, successfully. Her personal history is fascinating. In spite of blatant gender discrimination for decades, hers was an exceptional career as scholar, teacher, and researcher. TRIVIA ALERT: Jane Loevinger singlehandedly created the academic area of women's studies in the university. Please say a prayer of thanks, or give a moment of reflection, for her gift to all of us, women and men.

    The biggest antecedent to the revolution occurred 37 years earlier. The philosopher H. Feigl published the article, "Philosophical Embarrassments of Psychology," in the APA's flagship publication, American Psychologist, 1959, 14, 115-128. Feigl was one of the most influential philosophers in America, following his immigration just prior to the outbreak of WWII, and his appointment at the University of Minnesota. He is associated with ideas like philosophical analysis, logical empiricism, and scientific empiricism. With my penchant for oversimplification, I would like to say that much of his thinking, and influence, were summed up in two humble questions: "What do you mean?" and "How do you know?" Continuing with great brevity, I would like to say that his paper had two intended effects, and two that were unintended. Feigl correctly pointed out the serious flaws in the psychoanalytic traditions that still believed they were doing the Lord's work as good scientists. Their pretenses to empirical science were – shall we say – embarrassing. The second, and probably intended, result was to give succor, of a philosophical kind, to researchers who had had enough of the hitherto arrogant psychoanalytic pretenders to science. One of the unintended consequences was to give justification to the positivist behaviorists to seize the offices of the recently deposed, arrogant pretenders. Subsequently, it was harder to get your paper published if it spoke of cognitive function, mental process, meaningful verbal learning, and object relations theory in ego psychology. Scientific speculation resulted in the 666 branding of the foreheads of the incorrigible researchers. The second unintended result was the cumulative frustration, thirty-five years later, of the psychological research community, after three decades of a free hand at a positivist, reductionist research model did not get them any closer to answering important questions.
Richard Feynman, in an interview, summed up the lack of satisfying scientific progress in psychology, very nicely. He said psychology had adopted the proper scientific form, but we were not producing any laws of nature. Until we do, he said, psychology was a pseudo-science. Personally, as a psychologist and a scientist, that hurts. There are a few fundamental aspects about mental life and behavior that we have described, but, in the main I have to agree with him. The solution is clear: get our philosophy and science right before we go ahead. Otherwise, we will lead each other down a path to more frustration.


What got us here, and how are we going to get out?

    The Revolution confirmed the inadequacies of the established regime. What was so familiar to them in the past was, and still is, hard to see as wrong. For example, classical inferential statistics for psychological research appear to be joined at the hip to the experimental design that we teach in psychological research methods classes. The classical model presupposes an ongoing accumulation of data that stand apart from the researcher, who is objective and dispassionate. A Bayesian model of statistical inference compensates for many of the limitations of classical statistical inference. I won't go into all the goodies associated with a Bayesian approach. I want to focus, instead, on the major roadblock to incorporating Bayesian inference by social science researchers. What is untenable, if not downright unnatural for the classically trained, is that the Bayesian approach functions by modifying the beliefs and projected assumptions of the researcher. The researcher is prompted, constantly in the research process, to take a position on what is likely to happen, based on prior data. What happened to the objective, detached psychological scientist? The Bayesian focus of shaping the belief system of the researcher just doesn't compute for most investigators in the social sciences.
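
To make the idea of belief updating concrete, here is a minimal sketch in Python of the textbook Beta-Binomial update for a proportion, such as the share of employees who are satisfied with their pay. The prior parameters and the survey counts are hypothetical numbers chosen for illustration, and the function names are invented for this example; it is not a reconstruction of any particular study.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Return the Beta posterior parameters after observing binomial data."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical prior belief: roughly 40% satisfied, held with the weight of 10 responses.
prior = (4.0, 6.0)

# Hypothetical new survey: 7 of 10 respondents say they are satisfied.
posterior = update_beta(*prior, successes=7, failures=3)

print(f"prior mean:     {beta_mean(*prior):.2f}")
print(f"posterior mean: {beta_mean(*posterior):.2f}")
```

In this sketch the researcher must state a position before seeing the data, and the data then move that position: the estimate shifts from the prior mean toward the observed proportion, without discarding the prior entirely.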

    Those not familiar with a Bayesian approach to statistical inference assume there must be a corollary in the classical model. There isn't. I gave a talk on Bayesian sampling, some years ago, to our graduate and post-doc interns in industrial psychology at IBM. One of our very bright interns suggested that we could change the value of alpha, the probability of making a Type I error (incorrectly rejecting established knowledge when, in fact, it was true all along), using a classical model of statistical inference. I asked him if he would like to report to the executive of compensation and benefits that the percent of employees who were satisfied with their pay was 38 percent, plus or minus 41 percent. He would get thrown out of the executive's office, and asked to pack his bags and head back to the University of South Florida. That is the consequence of trying to use classical inference when Bayesian is more appropriate. It is a very different process that sounds like make believe to the uninitiated. It will be at least a generation, if at all, for those trained in classical inferential statistics to consider using a Bayesian approach.

    Another thing we must do is to understand the stifling effect on progress in psychological science by: 1. Our philosophy that fails to relinquish the World of Ideal Forms; 2. the tautology of the statistical theory of mental tests; and 3. the assumptions that our models of the distribution of traits are depictions of reality. No matter how closely an observed distribution of a measured psychological trait APPEARS TO LOOK LIKE a Normal Bell Curve, or another idealized distribution, we must understand that the curve is a mathematical model, a human construction, that is used because it has utility. The model of the Normal Bell Curve is no more a depiction of reality than Ptolemy's model of the universe. Ptolemy's model was accepted as the truth of reality because reality, as it was perceived, fit it perfectly. Ptolemy's description of reality allowed western civilization to make very accurate calendars, and predict events that were so important to sustaining civilization, like when to plant. Observation fit his idealized curve, exactly. It was so successful, and accepted as obviously real, that it obviated the need to explore a different model of reality for many centuries. When it was virtually synonymous with truth, as determined by the church, the arbiter of all truth, investigation into new ideas was aborted, discouraged, or persecuted.

    The problem for PTT is not that it isn't very useful, in the way that Ptolemy's model of the universe was very useful – in fact essential to the survival of whole peoples. The problem is that the current state of PTT limits progress in psychological science. Here's how. Let's look at the philosophical straitjacket that is Plato's Ideal Forms. One of the fundamental assumptions of Forms is that, since they can't be observed directly, we can only observe their manifestation in the World of Experience as successive approximations of the REAL THING, which REALLY EXISTS. Personal experience over a lifetime gets us ever closer to the truth because, by definition, life is an accumulation of closer, successive approximations. So what the hell is wrong with that, you ask. Plenty! Assuming an unchanging reality that we continue to approximate completely shuts off the option to chuck the whole thing, say it was all bullshit, and start over with something (Monty Python) completely different. This was the quandary that Kepler and Bruno were in. Since we all know the earth is at the center of God's creation, then there must be something wrong with our data – worse still, we have to discount them as an illusion.

    Let's take a look at Isaac Newton's work on physics and astronomy, resulting in his Principia. The mathematical principles of the day, influenced greatly by Archimedes, Pythagoras, and Hindu and Islamic scholars, were insufficient for the work he was doing. So he invented a new system of mathematics that would work for him – Calculus. (Calculus was also invented independently and contemporaneously by Leipzig.) Try to imagine Newton trying to do his research with only Pythagorean mathematics, back in the time when Pythagoras was keeping secret his discovery of irrational numbers, and solid geometric constructions like the dodecahedron. Newton would have nowhere to go. Imagine the Church saying, this is the extent of truth, and there is nowhere else to go, anyway. This is the highly circumscribed situation we find ourselves in regarding psychological science, in general, and PTT in particular. We accept representational models as actual reality. We can't see the tautologies for the forest, because we've become too accustomed to using them as if they were legitimate scientific theories based on observation. Einstein's general relativity was not an extension, elaboration, or refined approximation of Newton's work on gravity and motion. Einstein threw it all out. Newton was almost completely wrong. Sometimes scientific progress is incremental, but let's not confine ourselves in a scientific prison with highly circumscribed assumptions before we even begin.


Thank you for taking the time to read and, hopefully, comment on my ideas. At another time I will return to the intricacies of PTT and discuss them in more technical detail. However, that will not be for a while. Next month I will return to a more familiar genre of non-fiction. All I will reveal at this time is the title, "My Life as an Observer: Target Practice." See you on September 14, 2009.

Posted by Norman Costa at 12:00 AM

Comments

A fascinating series, Norman. Today's installment reminds me of painting and writing class, where one is told to beware of the gigantic and alluring thing that you won't let go of, for, if it is wrong after all, you shall have ordered everything else in its light. When that happens, all you have is a bad painting, but with this...

Posted by: Elatia Harris | Aug 17, 2009 10:15:16 AM

The Calculus was invented by Leibniz, Leipzig is a town in Germany.

Posted by: Chris | Aug 17, 2009 3:52:42 PM


Chris,

How the hell did I do that! Thanks.

Posted by: Norman Costa | Aug 17, 2009 4:01:58 PM

Norman, I have returned to this and am wondering something. To follow your own analogy a bit, if PTT is wrong but useful (the Ptolemaic model that dictates timely planting of crops), then is it wrong in light of another model that is superseding it as we speak (the Copernican model), slowly winning adherents, slowly becoming official? Or, just wrong? Is it one of those things that cannot be shown wrong in the absence of an idea demonstrably correct?

I understand you don't plan to write Part 4, but if you were going to write Part 4, about what could replace PTT, what might that be?

Posted by: Elatia Harris | Aug 19, 2009 12:50:45 AM

Norman,
re your reliability example, it seems like the best reason to regard solving those two equations as parallel is mathematical: they're both single variable linear equations. They even both have unit slope. You understand how to solve the one iff you understand how to solve the other. Doesn't that do anything to rescue reliability tests from circular reasoning?

Posted by: D | Aug 19, 2009 7:13:23 AM


Oh Lord,

After spending a not-very-delightful evening doing an installation repair on my main Linux workstation, the first thing I did this morning - no, it was the second thing I did - was to see how my pride-and-joy was doing. First stop, through blurry eyes, was 3QD. Praise the Saints in Heaven, I got two more comments! August is redeemed!

Well, it's not redeemed yet.

So what did I get? I got what amounts to two requests to write Part 4, now, as a comment. AAARRRGGGHHH! Fie! on both you morning spoilers. I need some breakfast first.

Posted by: Norman Costa | Aug 19, 2009 9:27:01 AM


Elatia,

A fundamental concept in the philosophy of modern science is that scientific knowledge is always proximate, and always subject to being superseded by new knowledge. New knowledge may enhance, extend, limit, or even obliterate its predecessor. This is true, today, for Psychological Test Theory (PTT) as a model for studying mental life and behavior in animals and humans, as it was for the Ptolemaic System of the Universe. PTT is a model for mathematical argument in psychology, as the Ptolemaic System was a model for mathematical argument in astronomy.

The Ptolemaic model was superseded by the Copernican (1473-1543) model, which was superseded by the Kepler (1571-1630) and Galileo (1564-1642) models, which were superseded by the Newton (1643-1727) model, which was superseded by the Einstein (1879-1955) model, and so on. This is the sense in which PTT is wrong. Unlike Ptolemy's heavenly orbs, however, we do not know what will supersede PTT. In the 1960s and 1970s, it was proposed that Criterion Based Testing (CBT) replace PTT. However, it was argued, successfully, that CBT was a special case of PTT.

Another way of looking at this is to say that a model is wrong when it can't accommodate new data, and can't solve new problems. It becomes wrong, incrementally, even before there is a new model to replace it. This is the state of PTT, today. Even when a new model is proposed, and largely accepted, the most brilliant among us can find it impossible to embrace. The best examples are Fred Hoyle (Big Bang), Albert Einstein (Quantum Mechanics), and Sigmund Freud (Origins of Hysteria). In order for psychological science to find a better model, we have to discard the vestiges of Ideal Forms, tautological theory, and the idea that models are reality.

Posted by: Norman Costa | Aug 19, 2009 11:07:48 AM


D.

Based on your prior comments and questions, my guess is that you are among the engineer-scientist-programmer lot. You are partially uncovered by the double-eff iff.

You have uncovered THE core problem I have with the concepts of reliability and validity in PTT.

But, I must be off to a meeting, and then some lunch before tackling this very difficult issue.

Posted by: Norman Costa | Aug 19, 2009 11:23:14 AM

I'm not putting my question very well. Doesn't there have to be new data that can't be accommodated, then, for one world view to subside, another to rise? What I'm wondering is how acute is the perception of this new data? Are you saying a majority view *will* definitely give way to a more accurate minority view -- one that is available right now -- or that it *should* do, when the more advanced view is articulated?

Posted by: Elatia Harris | Aug 19, 2009 11:40:39 AM

Norman,
Yep. Physics student.

Unlike Ptolemy's heavenly orbs, however, we do not know what will supersede PTT.

I thought you were suggesting Ptolemy::Copernicus ~ PTT+frequentist statistics :: (something new, sought for) + bayesian methods, given the latter's more explicit focus on researcher priors and beliefs. Weren't you?

Posted by: D | Aug 19, 2009 12:31:15 PM


Elatia and D,

Both of you have asked questions that are variations on the same theme. Can we pause for a moment to appreciate the coming together of the minds of an artist, writer, and expert on saffron, on the one hand, and a physics student, on the other?

I just got back from my meeting, and now it's time to lunch. I need to supply protein to the gray matter, before I can get back to this subject.


Posted by: Norman Costa | Aug 19, 2009 1:47:10 PM


D,

My response will be a little formal and long, because I am writing for my colleagues, as well as answering your question. My purpose in writing on this topic is to develop my ideas in anticipation of a more scholarly presentation on these issues. I'm sure you can identify with the value of successive drafts in developing your views, and responding to questions and critiques as a way of testing one's own thought process and argument.

The short of it: I do not want to rescue test reliability in the context of the current formulation of Psychological Test Theory (PTT). There are logical inconsistencies, confusions between the concepts of reliability and validity, and questions that cannot be answered.

Thus begins my descriptions of ideas that do not work: Test reliability as consistency of repeated test use; and test reliability (consistency) as distinct from test validity (the appropriateness of using a test for a particular purpose.)

Let's start with some assumptions. There are two distinct tests of eighth-grade math achievement. We will call them Test A and Test B. Each test, commonly, might have many items covering the agreed upon curriculum. Let us assume each test form has 100 items. Each item is scored 1 or 0, if correct or incorrect. A total score is computed by summing the scores of all items.

Now, how do we determine that Test A and Test B are 'parallel' tests – alternate forms of each other? It's very simple. Create two tests that appear to be parallel, and then examine the Pearson product-moment correlation coefficient between the two. We compute the correlation coefficient after administering each test form to the same sample of eighth-graders. A high positive correlation means that high and low scores on Test A are associated with high and low scores on Test B. This is interpreted to mean that both test forms are measuring the same thing, AND they are doing so consistently.

But, how do we construct Tests A and B so that they are parallel, and can be considered as alternate forms of the same test? There are several ways to do this. 1. The Mirror Effect - Create Test A to reflect the curriculum. Then, create a mirror item for each test item, varying only the non-essential content. The assembly of mirrored items becomes Test B. Test A and Test B LOOK LIKE each other and are assumed to be alternate forms of the same test. This was the example used in my article, above. Test A, item i reads,

Solve for the unknown variable in the following equation: 24 = 4 + x.

Test B, item i reads,

Solve for the unknown variable in the following equation: 13 = 3 + x.

The final step is to administer Tests A and B to the same sample of 8th graders and compute the correlation coefficient. The value of the correlation coefficient IS the reliability coefficient. We would expect the reliability coefficient to be high.

2. Double Your Pleasure – Create two tests of 8th grade math achievement from the same curriculum, but independently of each other. The simplest case is Teacher A creating Test A, and Teacher B creating Test B. Let's assume both teachers are expert at what they do and know the curriculum inside and out. Let's assume all other things are equal, except that Teacher A likes to use short-answer and fill-in-the-blank items, and Teacher B likes to use fill-in-the-blank items and multiple-choice items. An inspection of the items confirms the two tests cover exactly the same content and APPEAR to be alternate forms of each other. The two tests are administered to the same sample of 8th graders, and the reliability coefficient is computed. We would expect the reliability coefficient to be high.

3. Cheat – Create a single test of 100 items. Administer the test to a sample of 8th graders. Divide the 100-item test into two tests, each one 50 items in length. The items can be assigned randomly, or conveniently by odd- and even-numbered items. Score the 'split-half' tests separately as if they were Test A and Test B, and compute the reliability coefficient. We would expect the reliability coefficient to be high.

4. The Ultimate Parallel Test – Administer Test A, today. Administer Test A, next week, to the same sample of 8th graders. Test A is its own parallel test. Compute the reliability coefficient. All things being equal, we would expect the reliability coefficient to be high.
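The split-half procedure in method 3 can be sketched in a few lines of Python. Everything here is an assumption for illustration: the response matrix is invented (1 = correct, 0 = incorrect), the test has only eight items rather than 100, and the function names are hypothetical. The sketch splits the items by odd and even position, scores each half as if it were a separate test, and correlates the two half-scores.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

def split_half_scores(responses):
    """Score odd-numbered and even-numbered items as two separate 'tests'."""
    # row[0::2] picks items 1, 3, 5, ...; row[1::2] picks items 2, 4, 6, ...
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    return odd_half, even_half

# Hypothetical item responses: one row per student, one column per item
responses = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]

half_a, half_b = split_half_scores(responses)
split_half_r = pearson_r(half_a, half_b)
print(f"Split-half correlation: {split_half_r:.3f}")
```

The same `pearson_r` step would apply to method 4 as well, with the two lists being this week's and next week's scores on Test A.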

Reliability, defined as consistency of scores with repeated test use, is determined by the correlation between parallel tests. Tests are parallel if they yield sufficiently high reliability coefficients.

End of part 1 of ?? of my answer to D.

Posted by: Norman Costa | Aug 19, 2009 4:44:07 PM


Elatia,

“Doesn't there have to be new data that can't be accommodated, then, for one world view to subside, another to rise? What I'm wondering is how acute is the perception of this new data? Are you saying a majority view *will* definitely give way to a more accurate minority view -- one that is available right now -- or that it *should* do, when the more advanced view is articulated?”

This is going to be a very, very hard sell. What I am offering is not so much 'new' data, as making the point that we are stuck in the same old thing which is not taking us anywhere. There is nothing in the present context of psychological science that can be called an acute awareness that something is not working. A few, but a mighty few, are trying to move psychological research, in general, out of its straitjacket. However, none of these, to my knowledge, is focusing on PTT. There is no reason to think that the majority will, should, or want to change whether there are new data or no. What I am hoping to do is to lay a foundation and focus for examining the limitations and then to progress from there. It's a near impossible task, but that only means it will take a little longer.

In order for my views, which I will begin to articulate in my extended answer to D, to have any traction, there will need to be a supportive political climate (political in a broad sense) that will provide the energy and will to bring about change. In my view, the only possibility of finding the political will for change in PTT is for those who are adversely affected by test use to push for change. I have in mind the recent court case in Hartford CT concerning a test for advancement in the Fire Department. There is a better chance of that happening if they can be given a solid foundation to criticize PTT, and a viable alternative to compete with PTT.

D.

“I thought you were suggesting Ptolemy::Copernicus ~ PTT+frequentist statistics :: (something new, sought for) + Bayesian methods, given the latter's more explicit focus on researcher priors and beliefs. Weren't you?”

I'm in a quandary over this. I'm not feeling secure enough, yet, to say that I have a prescription for the next big thing in PTT, or to say that I have some tools and ideas that can get us there. I have more thinking and writing to do, as well as challenges to respond to, which I will continue in my long, formal answer to your other question.

Posted by: Norman Costa | Aug 19, 2009 6:55:11 PM

At the risk of being accused of blasphemy, I am going to say that Richard Feynman is not infallible, and that when he said psychology had come up with no natural laws, he was wrong. One natural law I can think of, right off the top of my head, is that behavior, particularly human behavior, is over determined.

You are right, Norman, in predicting that psychology will be eclipsed by neuroscience. I see it happening right now as I read textbooks for the introductory course. However, an eclipse does not mean the death of that which is eclipsed.

It is because the above natural law is true, that it is also true that some psychology is going to survive. For example, neuroscience can teach us that mirror neurons cause primates to imitate behavior. But I am pretty sure that no matter how you study those individual neurons, the structure or formation of neural nets, that you are not going to be able to tell WHICH behavior will be imitated. You need psychology for that.

Finally, neuroscience is not likely to eliminate psychology AS AN ART. I am aware that you are writing about psychological SCIENCE, and that a discussion of the art is perhaps not germane. Nonetheless, I think it extraordinarily unlikely that neuroscience will ever eliminate the practice of all forms of psychotherapy, which is an art, not a science. (People treasure the PROCESS of therapy too much for it to ever die out entirely.)

Which brings me back round to your points and Richard Feynman. It has been said that psychology suffers from physics envy. Perhaps it does, and that is why we have clung so long to what we hope are mathematical models used in physics and the easy sciences. (Surely psychology is the hardest science, since we have so few natural laws?) If we accepted the fact that art is as valuable as science, perhaps we could let go of our needless envy and move on to the revolution you predict.

Posted by: Rhea | Aug 20, 2009 2:27:34 PM


Rhea,

Thanks for your observations. There are more things in your comments with which I am inclined to agree, than not. What do you mean by "over determined," as opposed to "determined?"

"One natural law I can think of, right off the top of my head, is that behavior, particularly human behavior, is over determined."

Posted by: Norman Costa | Aug 20, 2009 4:32:21 PM

When psychologists say behavior is overdetermined, we mean that it is caused by more than one thing. For example, my dining with a friend was caused by, amongst other things, my hunger (a biological event), my cultural training, my family upbringing (they taught me that dining with friends is fun), an unconscious desire for ...., etc., etc.

Posted by: Rhea | Aug 28, 2009 10:32:12 PM


Rhea,

I'm going to disagree with you on two counts. First, overdetermination, in the way you intend, does not mean that there are many causes to an event. Rather, it means that there are many causes, any one of which is sufficient to produce the event.

Second, I don't see that this is really a law of nature, other than a descriptive category of a subset of phenomena we call cause and effect. I believe we have a few concepts that touch upon basic processes of mental life and behavior in animals and humans. Among these are the law of effect, classical conditioning, operant learning, and schedules of reinforcement. In the area of cultural learning and development there is social modeling theory. These are only a few areas that have some history and still have promise for further research. Yet, none of them rise to the level of fundamental laws of nature - at least not today.

Finding joy and satisfaction in dining with a friend over Jordanian salad, baba ghanoush, pita, Turkish coffee, and homemade baklava may not be a law of nature, but it should be.

Posted by: Norman Costa | Aug 29, 2009 12:22:26 PM

Norman and Rhea,

Yes, please, about dinner!

About overdetermination, I am not sure it was ever intended to be understood as a law of nature. Rather, it's a concept that at least partially explains the ineluctable drift towards certain outcomes in the lives of people at risk for those outcomes. Long ago, when a person was said to "fall ill of a neurosis," psychoanalytic theory held that it didn't just happen, but was overdetermined. I would be interested to see what they make of the usage now.

Posted by: Elatia Harris | Aug 29, 2009 2:00:55 PM
