Sunday, December 29, 2024

Why Fallacies are False -- 08, Sampling Bias

Living in an echo chamber creates a fallacy called sampling bias.

It automatically excludes some true data from consideration.

You may ask “what if I have a huge circle of experts in my echo chamber?” You have to make sure your people really are experts.

There’s a legal definition for that. If the person has an extensive background of research and publishing in peer-reviewed periodicals, often with some teaching thrown in, that person can testify as an expert witness in court. But only in the field where they have done research. A psychiatrist can testify about a patient’s mental condition, but not about what caused a car crash involving the patient.

Or, if they have trained in a given field and worked there for years, they can testify as an expert witness. But only in the field where they have worked. An FAA controller can testify to how air traffic control works, but not about airplane engineering unless they also meet the research standard described above.

People called as expert witnesses have been disqualified if they published only in periodicals that promote a specific line of thought. Their research is not scientific; the name for it is advocacy research. These people usually commit sampling bias; all their research is done among people in their echo chambers. A court will only accept their testimony if it is supported by work done outside the echo chamber.

Publishing books doesn’t count, either in place of periodicals or in addition to them, unless the books are commercial versions of peer-reviewed professional publications, like the book form of an approved dissertation. Publishers are famous for getting authors to tart up their work to make it more exciting (I’ll say more about the excitement factor in later posts). Too many books get debunked; diet books all get debunked sooner or later.

Dr. A was an expert on biochemistry, but his book would have attracted an invitation from an echo chamber about the Bible, about which he knew nothing but what DH discussed.

Sampling bias and a number of other fallacies fail the data portion of the Test of Occam’s Razor, which says you have to cover all data that meets the description of your dataset, and do it honestly, without corruption or manipulation. I’ll talk about the manipulation part in later posts.

A number of fallacies have similar features to sampling bias; here are three of them.

The first is the one I talked about for the drug test. You pick a dataset and then you claim the drug is effective (a categorical claim) when your dataset only covers 10% of the world population. If you want a true claim that your drug is effective, your dataset has to be the start of a series of tests, each of which will address a different demographic. It has to be the “break up the problem and test one piece at a time” portion of the Cartesian method. It cannot be the whole show.
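Here is a minimal sketch of why one tested group can't stand in for everybody. Everything in it is made up for illustration: the ten equal-sized groups, the names `response_rate` and `overall_rate`, and every number. It is not real drug data.

```python
# Hypothetical population: 10 demographic groups of equal size.
# The response rates below are invented for illustration only.
rates = [0.80, 0.30, 0.25, 0.35, 0.20, 0.30, 0.25, 0.40, 0.30, 0.35]
response_rate = {f"group_{i}": r for i, r in enumerate(rates)}


def overall_rate(rate_by_group, groups):
    """Average response rate over the chosen groups (equal sizes assumed)."""
    return sum(rate_by_group[g] for g in groups) / len(groups)


# The trial only covered one group, i.e. 10% of the population.
tested_only = overall_rate(response_rate, ["group_0"])

# What the claim "the drug is effective" would actually have to cover.
everyone = overall_rate(response_rate, list(response_rate))

print(f"tested group only: {tested_only:.2f}")
print(f"whole population:  {everyone:.2f}")
```

Under these invented numbers the tested group looks far better than the population as a whole, which is the gap a categorical claim papers over. Testing each group in turn is the only way to find out what the other nine numbers really are.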

Another is often called “cherry picking”. You do all the testing, then you report only on the successful trials. I once saw this depicted on ER. A doctor was running trials of a drug and reported only the successful trials. The rest he grouped as “underlying unfavorable conditions”. A younger doctor assisting in the trials questioned this – and was fired. He was right to question it; he got fired over the other doctor’s bruised ego. Yeah, I know that was a TV show. If you know a lot about drug testing, speak up with a more realistic version and we’ll all learn something.
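A small simulation can show what cherry picking does to the numbers. This is a sketch under loud assumptions: the drug has no effect at all, each trial has 50 people per arm, and `run_trial` is a hypothetical helper, not anybody's real trial protocol.

```python
import random


def run_trial(rng, n=50, p=0.5):
    """One trial of a drug with NO real effect.

    Both arms recover with the same probability p, so any difference
    between them is pure chance. Returns treated rate minus placebo rate.
    """
    treated = sum(rng.random() < p for _ in range(n)) / n
    placebo = sum(rng.random() < p for _ in range(n)) / n
    return treated - placebo


rng = random.Random(0)  # fixed seed so the run is repeatable
diffs = [run_trial(rng) for _ in range(40)]

# Cherry picking: keep only the trials where the drug "won".
reported = [d for d in diffs if d > 0]

mean_all = sum(diffs) / len(diffs)
mean_reported = sum(reported) / len(reported) if reported else 0.0

print(f"trials run: {len(diffs)}, trials reported: {len(reported)}")
print(f"mean difference, all trials:      {mean_all:+.3f}")
print(f"mean difference, reported trials: {mean_reported:+.3f}")
```

Because the reported list only contains trials with a positive difference, its average has to come out positive, even though the drug in this sketch does nothing. The honest number is the one computed over all the trials.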

The third is one that I told Gary Curtis about, called the Texas Sharpshooter Fallacy. You create your output, then you write the description of the dataset to cover the outcomes you like. It’s like a guy who shoots at his barn, then decides which bullet holes he wants to brag about, draws a circle around them, and claims that was the target all along.
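The barn story can be sketched in a few lines. The assumptions here are all mine: 100 shots scattered at random over a square wall, and a 5 by 5 grid of possible "targets" (`GRID` and the other names are hypothetical).

```python
import random
from collections import Counter

rng = random.Random(1)  # fixed seed so the run is repeatable

# 100 shots landing at random on a wall measured from 0 to 1 each way.
shots = [(rng.random(), rng.random()) for _ in range(100)]

# Split the wall into a GRID x GRID set of cells and count hits per cell.
GRID = 5
cells = Counter((int(x * GRID), int(y * GRID)) for x, y in shots)

# The sharpshooter move: pick the "target" AFTER seeing where the shots went.
best_cell, best_count = cells.most_common(1)[0]
expected_per_cell = len(shots) / GRID**2

print(f"average hits per cell:        {expected_per_cell:.1f}")
print(f"hits in the cell drawn later: {best_count} at {best_cell}")
```

The fullest cell always has at least the average number of hits, so a target drawn after the fact will look like good shooting even when the aim was random. The fix is to describe the dataset before you look at the outcomes.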

A partially related fallacy is “quoting out of context”, which I discuss on my blog in three posts. 

Quoting out of context has been used for millennia to influence people to think or behave a given way.

Sampling bias also relates to weak analogies. An analogy might ignore inconvenient truths in the interests of making a point. An archaeologist once claimed that an inscription referencing Balaam was a reference to the Balaam in the Bible. It was a Moabite site, which fit with the Balaam in the Bible working for the king of Moab. But Balaam was part of the Exodus story, and the Israelites were in the Holy Land in time for Merneptah to mention them on his stele (1230s BCE), while the Moabite inscription dates to the 800s BCE. Add to that the complete mismatch between what the inscription and the Bible record, and that archaeologist proved nothing except their own incompetence.

There’s a tradeoff between dataset creation and the use you can make of it. But remember, you’re the one who decides on the dataset, and unless you cast a broad enough net, your claims of “I haven’t seen…” become irrelevant.
