The Inherent Problem of Psychology Research


Psychology parades as a science when it is in fact a philosophy.


Recently, Quillette published two articles that explained why the replication crisis happened—and continues to happen. One blames sensationalism, and the other blames the more fundamental, epistemological issue of mistaking correlation for causation. While both the media and shoddy data analysis play their roles, these points don’t explain why many of the studies that failed to replicate were, by conventional standards, conducted properly. It may in fact be psychology’s reliance on research itself that is at the root of the field’s tailspin.

The Variable Problem

The replication crisis has been blamed, in part, on one of the hallmarks of poor science—the incorrect inference of causation from correlation, as Dr. Bleske-Rechek elucidates. Although she’s correct, there’s something inherent in psychology research that makes it effectively impossible to tease apart correlation and causation. That something is variables. Specifically, there are far too many of them for valid research, even research conducted with the utmost prudence and analysis.

What determines the power of a study, or the extent to which an independent variable influences a dependent variable? The simplicity of the system being studied. The simpler the system, the fewer moving parts it contains, and the larger the effect size tends to be. The more complex the system, the more moving parts it contains, and the smaller the effect size. And there is, of course, no system more complex than the human brain.
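To make this concrete, here is a toy simulation of my own (not drawn from any study cited here) showing how a fixed, real treatment effect gets diluted as uncontrolled variables pile up. The effect size of 1.0, the noise model, and the sample size are all illustrative assumptions; the measured standardized effect (Cohen's d) shrinks as the number of independent noise sources grows.

```python
import random
import statistics

def cohens_d(a, b):
    """Standardized mean difference between two samples."""
    pooled_sd = statistics.pstdev(a + b)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

def simulate(n_confounds, n=5000, seed=0):
    """The treatment shifts the outcome by a fixed 1.0; each
    uncontrolled variable adds independent noise, so the same true
    effect looks smaller and smaller in the data."""
    rng = random.Random(seed)

    def outcome(treated):
        noise = sum(rng.gauss(0, 1) for _ in range(n_confounds))
        return (1.0 if treated else 0.0) + noise

    treated = [outcome(True) for _ in range(n)]
    control = [outcome(False) for _ in range(n)]
    return cohens_d(treated, control)

for k in (1, 4, 16, 64):
    print(k, round(simulate(k), 2))
```

The true effect never changes; only the complexity of the system does. With one noise source the effect is easy to detect, and with dozens it sinks toward the floor of statistical detectability.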

To demonstrate what I mean, let’s look at Galileo’s falling bodies experiment, in which he showed that the acceleration of a falling object is the same regardless of its mass. This is a simple study with a few concrete variables. He gathered objects of various masses, took them to the same height on the Tower of Pisa, dropped each one, and timed how long each took to hit the ground. Of course there would be some confounding variables—air resistance, timing errors, drop inconsistency—but this is a relatively small amount of variability that could easily be corrected for with many different tests on many different objects, i.e., a large N. Overall, it’s a simple system with one concrete dependent variable and one concrete independent variable. If you fail to replicate this experiment, you should be barred from replicating in the gene pool.
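Galileo's system is simple enough to sketch in a few lines. The height, timing error, and trial count below are assumptions chosen for illustration; the point is that the true fall time depends only on height, and averaging many noisy trials (a large N) converges on it.

```python
import math
import random
import statistics

G = 9.81        # gravitational acceleration, m/s^2
HEIGHT = 55.0   # meters; roughly the Tower of Pisa

def true_fall_time(height, g=G):
    """Ignoring air resistance, fall time depends only on height:
    t = sqrt(2h / g). Mass never enters the formula."""
    return math.sqrt(2 * height / g)

def measured_fall_time(height, rng, timing_sd=0.1):
    """One noisy trial: the real fall time plus human timing error."""
    return true_fall_time(height) + rng.gauss(0, timing_sd)

rng = random.Random(42)
trials = [measured_fall_time(HEIGHT, rng) for _ in range(1000)]
print(round(true_fall_time(HEIGHT), 3))   # ~3.35 s
print(round(statistics.mean(trials), 3))  # close to the true value
```

One dependent variable, one independent variable, and noise that averages out: that is what a replicable experiment looks like.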

Now let’s look at a previously lauded experiment, the one on power poses, which will serve as a paragon for research in psychology, especially the social sciences. The results suggested that if an individual held a power pose—think Wonder Woman’s iconic stance—before entering a job interview, then they would be more likely to get the job. The proposed explanation was hormonal: power poses were said to raise testosterone and lower cortisol. When we feel important to our social group, this is what happens to our physiology, so we can supposedly trick ourselves into feeling important by mimicking the body language of importance. In the interview, the subjects would therefore be more likely to act as if they deserved the job, and so more likely to get it.

But what are the variables here? Well, there’s a bunch: the power pose position, the pre-pose feeling, the post-pose feeling, the experiences of both the interviewer and the subject, the explicit and implicit values of the subject, and the qualifications of the subject. I could go on, but suffice it to say there’s far more going on here than the inclination of the subject’s thoracic vertebrae 15 minutes before the interview.

Not only are there effectively infinite confounds, but they’re indefinite. Though Amy Cuddy, the study’s best-known author, has reaffirmed through further analysis that the results are significant, the question remains: significant of what? The only variable that comes close to being objective is the power pose. Except even this isn’t objective, because the pose itself matters less than the pose’s effect on the subject. If Galileo’s beliefs about himself, shaped by his relationship with his father, had somehow affected the Earth’s gravitational pull, then the falling bodies experiment would have been difficult to replicate too.

My Research Fail

I’ve become more intimate with the inherent flaws of psychological research since recently submitting my own project. Though I’ve worked on several before, this was the first that I spearheaded. It took a lot of my time, so it’s difficult to admit that it was a waste, but it was.

The research measured the relationship between stress and substance use, which is an established connection, but it did so using something called ecological momentary assessment (EMA). Instead of being asked about substance use and stress every week or so during a visit to the clinic, subjects were prompted with these questions on their smartphones. The purpose of EMA is to increase ecological validity and minimize reporting bias, which it does. However, increasing the validity of self-reports is a mortarboard on a pig. Self-reported stress leaves itself open to countless confounding variables, no matter how ecological or momentary the collection. Relying on people to accurately gauge their stress hinges the research on the emotional literacy of the subject, and most psychologists cannot give you a straight answer when asked what an emotion is, let alone what emotional literacy is. Besides, stress affects different people in different ways. The only valid or accurate thing about self-reports is how invalid and inaccurate they are. A momentary report doesn’t fix this, and, in fact, it could hurt accuracy: the app prompted the subjects six times per day, which in itself could aggravate stress and introduce reporting error.

The next generation of ecological momentary assessment will measure body chemistry to determine stress: higher cortisol in the blood is taken to mean the subject is more stressed. This is a step in the right direction, but it’s barely a step. Two people with the same levels of cortisol can feel two very different levels of stress. One person may be more acclimated to the cortisol, for instance, so it won’t affect him the way it would affect someone else. But “measuring cortisol” sounds scientific, so you’ll be able to present this research at a conference and receive a lot of validating nods from PhDs. Those feel good. And you’ll get free monies to do it. Those feel even better. But unless the research measures objective criteria, and we know exactly what the effects of those criteria are, the field of psychology will remain in its bubble—and its results will remain in crisis.

A Call for Philosophical Principles in Psychology

In the 1960s, psychology as a field made the conscious, APA-driven decision to become a hard science like physics or biology. This wasn’t a principled decision so much as one of immediate survival. When the alleged success of Sputnik shamed America, the country increased funding for science and technology more than tenfold. Psychology noticed this, looked down at itself, saw it was as soft as a marshmallow, and fabricated a hard coat for a piece of the government pie.

Previously, psychology was built upon philosophy. Freud used his observations from hypnosis to create a theory of an unconscious. And every philosophical tradition—from Ancient Egypt to Christianity—had a psychological tradition. This makes sense—a view of existence and man’s relationship with it would be intimately tied to a view of the mind (not brain), heart, or soul.

There are of course epistemological flaws in Freud’s philosophy, but this is an ancillary objection to him and to the traditions that came before. The point is that the process was correct. He looked at the available information, his observations from Charcot and from his own patients, and induced abstractions from those observations. In this respect, Freud is the Thales of psychology. More than 2,500 years ago, Thales postulated that everything was made of water. This turned out to be untrue, but his process was sound. He sought the “one in the many.” He tried to find commonality; he attempted to seek unity. This has been the progress of all scientific endeavors, including physics, chemistry, and biology—to explain the same phenomena with fewer and fewer laws. It will be the progress of psychology as well. The current trend in psychology, however, is to do the opposite: to seek the disconnected many in the one.

A few questions regarding the philosophy of psychology that need to be answered are:

  • How to treat abnormal behaviors? And first, how to distinguish normal behavior from abnormal behavior?

  • How to distinguish between well-regulated and dysregulated emotions? And first, what is an emotion?

  • What integrations can be made from current findings in neuroscience, behavioral genetics, and comorbidity? (This isn’t data analysis but an analysis of observations based on acknowledged philosophical precepts. The dispute will be over the accuracy of those precepts.)

  • What are the unclear concepts in psychology and how do they suffer from conflation, false dichotomies, and unfounded cultural assumptions?

No amount of research alone will be able to answer these questions.* What’s more, no amount of research is going to matter until we answer these questions. Psychology will need fewer self-reports, regressions, and distributions—and more Aristotle, Locke, and Erasmus. For the field to build a future, it will need to reaffirm its foundations.


* Not until nanobot-level technology is available—and even then I’m not so sure.