Reliability and Validity of Scientific Research

For many years I taught courses about research and measurement. I always thought the most critical concepts for any research project were the reliability and validity of the tools used to measure the phenomena being studied. As a quick review: reliability refers to the degree to which a measuring instrument gives the same answer every time, provided the thing being measured has not changed. Validity refers to the degree to which the instrument actually measures the phenomenon being studied. A quick example may be seen with IQ tests. IQ is an abstract concept of a person’s potential to learn (among other abilities). It has been shown that a person’s IQ score varies somewhat over time but generally provides the same information every time. This suggests the test is reliable if administered in the same fashion every time. The validity of IQ tests has always been questioned because intelligence is a very abstract concept, and psychologists disagree about what it is, how to measure it and how to apply the results.
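One common way to put a number on reliability is test-retest correlation: give the same test twice to the same people and correlate the two sets of scores. What follows is only a minimal sketch of that idea. The scores are made up for illustration, and the names `pearson_r`, `first_testing` and `second_testing` are my own, not taken from any published test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people tested twice (made-up numbers).
first_testing = [95, 102, 110, 121, 130]
second_testing = [97, 100, 112, 119, 133]

test_retest_r = pearson_r(first_testing, second_testing)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

A correlation near 1 means people keep roughly the same rank order from one testing to the next, which is what reliability asks for. It says nothing about whether the test measures what it claims to measure; that is the separate question of validity.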

So far I’ve talked about reliability and validity as applying to a single measuring instrument, like an IQ test, but the concepts also apply to whole research studies and to series of related studies. Often a study is published in which the authors reach one conclusion, and later another study reaches a different conclusion using seemingly the same methods. Remember room temperature nuclear fusion?

I recently became interested in two areas of research, one biological and the other historical. First, let us talk about fungus and trees.

Back in February, I wrote about how scientists had concluded that trees were interconnected via networks of fungi (Oh, 2022; Grant, 2018). I considered the discovery of this symbiotic relationship to be really exciting and a confirmation of one of my long-held beliefs, that everything is connected. See the Gaia hypothesis (Lovelock, 2000). A recent article in the New York Times (Popkin, 2022) raises some concerns about those earlier conclusions. These new researchers did not find fungi connecting the roots of the trees in the forests they studied. These conflicting findings raise questions of reliability. The true test of a hypothesis is whether others find the same results when replicating a study. One of the annoying requirements of science (especially for long-cherished fantasies) is that findings be consistent across studies. This is a good thing.

Now, an historical example. This example deals with a question of validity. After recently watching Ken Burns’ documentary on the Holocaust, I was reminded of a related book by Edwin Black about the role played by IBM. For background, what follows is a quote from Wikipedia.

“IBM and the Holocaust: The Strategic Alliance between Nazi Germany and America’s Most Powerful Corporation is a book by investigative journalist and historian Edwin Black which documents the strategic technology services rendered by US-based multinational corporation International Business Machines (IBM) and its German and other European subsidiaries for the Nazi government of Adolf Hitler from the beginning of the Third Reich in January 1933 through the last day of the regime in May 1945 at the end of World War II. Published in 2001, with numerous subsequent expanded editions, Black outlined the key role of IBM’s technology in the Nazi genocide, by facilitating the regime’s generation and tabulation of punch cards for national census data, military logistics, ghetto statistics, train traffic management, and concentration camp capacity.”

This historical research appears to be reliable, as the activities of IBM in Germany during WWII are well documented. What I wondered was whether the author’s final conclusion, that IBM was complicit in enabling the Holocaust, goes beyond the data. I looked for other research on this question and really found nothing comparable. I cannot find any mention that IBM accepted or denied responsibility beyond providing and maintaining the tabulation machines for the German government. Finding other research that also supported Black’s conclusions would give me greater confidence in the validity of the work. Black’s work left me feeling I’d learned a lot about the subject but unsure exactly what it meant.

A final generalization: One might guess the correct conclusion from unreliable data but one cannot draw a valid conclusion without reliable measurements. Even with reliable measurements, it is possible to reach the wrong conclusion.
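The difference between consistent readings and correct readings can be sketched with a simple weighing example. Everything here is hypothetical: the two scales, their readings and the assumed true weight are made up, and the distance of the average reading from the true value is only a rough stand-in for validity.

```python
from statistics import mean, stdev

TRUE_WEIGHT = 70.0  # hypothetical true value, in kilograms

# Made-up readings from two hypothetical scales weighing the same object.
consistent_but_off = [75.1, 74.9, 75.0, 75.2, 74.8]  # repeatable, but miscalibrated
scattered = [64.0, 77.0, 69.0, 73.0, 67.0]           # not repeatable

def summarize(readings):
    """Return (spread, bias): the standard deviation of the readings
    and the difference between their mean and the true value."""
    return stdev(readings), mean(readings) - TRUE_WEIGHT

for name, readings in [("consistent_but_off", consistent_but_off),
                       ("scattered", scattered)]:
    spread, bias = summarize(readings)
    print(f"{name}: spread={spread:.2f}, bias={bias:+.2f}")
```

The first scale agrees with itself almost perfectly yet is wrong every time, which is reliability without a correct conclusion. The second scale happens to average out to the true weight, but no single reading from it could be trusted.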

References

Psychometric Methods, J. P. Guilford, McGraw-Hill, 1954. (This book was central to my undergraduate degree in psychology and still resides on my bookshelf.)
Test Reliability Indicates More than Just Consistency, Timothy Vansickle, Questar, April 2015.
Measurement Moments – Validity and Reliability, Questar, May 2015.
Cold Fusion, Wikipedia.
Do Trees Talk to Each Other?, Richard Grant, Smithsonian Magazine, March 2018.
Are Trees Talking Underground? For Scientists, It’s in Dispute, Gabriel Popkin, New York Times, November 7, 2022.
The Interconnectedness of Trees, Deepthinker Oh, Hypothesis, Science Circle, February 2022.
Gaia: A New Look at Life on Earth (3rd ed.), James Lovelock, Oxford University Press, 2000 (first published 1979).
IBM and the Holocaust: The Strategic Alliance Between Nazi Germany and America’s Most Powerful Corporation, Edwin Black, Amazon, March 16, 2012.
IBM and the Holocaust, Wikipedia.
Stranger than Science Fiction: Edwin Black, IBM, and the Holocaust, Michael Thad Allen, Technology and Culture, January 2002.

About Author

Deepy https://sciencecircle.org

Deepy (Deepthinker Oh) is an educational psychologist with a long standing love of journalism and previous experience as the editor of MANIERA magazine. Deepthinker Oh's use of the SLBN logo does not constitute approval by or a representation or endorsement from Linden Lab.