Peanut Butter, Anyone?
You have been given the responsibility of catching potential terrorists at an airport. You have complete control over developing procedures to screen passengers and weed out the terrorists as they try to sneak through your security. The burden that has been placed on your shoulders is immense. If a terrorist gets through, lives could be lost. If the terrorist incident is then linked to a particular country, a war could ensue, resulting in the deaths of thousands. Travelers are counting on you to protect their safety, as are your fellow citizens and your country. As you consider what kind of screening process to develop, you can be sure of one thing: whatever process you come up with, it will be flawed and will result in errors, guaranteed.
You are a sixteen-year-old girl who is scared to death that you might be pregnant. You attended a party at a friend’s house, and while you had no intention of disappearing with that boy, the hormones were surging and things got out of hand. A few weeks later you confide your fears to your best friend. She goes down to the local drug store, summons up the courage, and buys you a pregnancy test. You take the pregnancy test into the bathroom and follow the instructions. You feel that your whole future hangs in the balance, depending on how the test results come out. There is one thing that you may not know: the test on which you feel your future depends is not 100% accurate. If you are nervous as you take the test and do not follow the procedures exactly, a percentage of the time the test will tell you that you are not pregnant when in fact you are, or it may tell you that you are pregnant when in reality you are not. One such home pregnancy test in 2004 was found to be in error 30% of the time.
You are a 50-year-old male and you religiously monitor your health, having an annual medical checkup. During your most recent checkup the doctor notices some symptoms: weight loss, pain in the upper abdomen, jaundice, dark urine, some blood clots. These are all symptoms of a potentially lethal cancer that kills the vast majority of those afflicted, or they could be symptoms totally unrelated to cancer. The doctor orders more extensive tests, trying to reduce the potential for a misdiagnosis, and you begin to think of your wife and children and what the future might hold for them. The additional tests come back positive. You are found to have cancer, a cancer that you have likely had for a few years. Why wasn’t it detected during an earlier physical? There is no reliable test, one with an acceptable error rate, for that type of cancer in its early stages, and it is usually only detected when it is quite advanced.
You are a quality control inspector at a peanut processing plant. Once an hour you draw a sample of peanut paste off the line to test it for contamination. You take your sampled paste to your inspection room, and one test you conduct is for salmonella. The results of this test are not available until the next day, as the salmonella bacteria have to be cultured in the lab before they can be seen and the test successfully conducted. Today’s production batch sits in storage awaiting your confirmation that it is safe and can be shipped out. You are feeling some pressure from the shipping department, which is awaiting your results. You may or may not know the error rates that your screening tests typically have, but you have been instructed that if your test comes back positive, you should re-run it to see if the positive finding happens again. While I don’t know the error rates of these tests myself, let’s say hypothetically that even if all procedures are correctly followed, with the culture allowed to grow at the right temperature for the right period of time, a test reports no contamination 10% of the time when contamination is in fact present. If that test were run 10 times on a contaminated run of peanut paste, then on average, by chance alone, one of those runs would be given a clean bill of health. Said another way, if 10 samples were taken from a contaminated production run, it would be normal for one of those tests to report no salmonella even though it is actually present. If unscrupulous people wanted to use only that false report of no contamination and ignore the positive results, contaminated peanut paste could routinely enter the market. It is somewhat disturbing that a supplier of tests for salmonella uses words like “confidential” and “little training required” as selling points for its tests on its website. Customers of all types deserve and need to demand transparency and complete disclosure.
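To put numbers on that hypothetical, here is a minimal sketch in Python. It assumes the made-up 10% false-negative rate from above and, more strongly, that repeated tests on samples from the same contaminated batch err independently of one another. Neither assumption comes from a real salmonella test, and the function names are illustrative only.

```python
def expected_false_negatives(n_tests: int, false_negative_rate: float) -> float:
    """Average number of 'clean' reports when every sample is actually contaminated."""
    return n_tests * false_negative_rate


def prob_at_least_one_false_negative(n_tests: int, false_negative_rate: float) -> float:
    """Chance that at least one of n independent tests wrongly reports 'clean'."""
    return 1.0 - (1.0 - false_negative_rate) ** n_tests


if __name__ == "__main__":
    rate = 0.10  # hypothetical error rate from the text, not a measured value
    for n in (1, 2, 10):
        print(n,
              expected_false_negatives(n, rate),
              round(prob_at_least_one_false_negative(n, rate), 3))
```

Under these assumptions, ten tests on a contaminated batch produce one clean report on average, and there is roughly a 65% chance of seeing at least one. That single clean report is exactly what an unscrupulous operator would be tempted to keep.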
The two most common errors in any decision-making process are called false positives (Type I) and false negatives (Type II). A false positive is when a test or decision-making system says that something is there when it is not, or that you should take a course of action when in fact you should not. Are those blips on the radar screen really incoming missiles? If I assume they are and take retaliatory action against what are actually birds flying in formation, I am guilty of false positive decision making, and perhaps of starting WWIII. If I assume, based on the information I have, that the economy is in a downward spiral and that my business is going to go down with it, and I take action based on that assumption, I may create a self-fulfilling prophecy by cutting back on the resources I need to fulfill orders. And if I got it wrong and the economy does not go down, I have fallen victim to a false positive and have missed a business opportunity. A false negative is when your results or logic lead you to believe something is not there when in fact it is, or you forgo a course of action when you should have taken it. If you assume those blips on the radar screen are birds and do nothing when they really are missiles, that is a false negative error in your decision making.
Often, when the consequences of taking action are not horrific, accepting a higher incidence of false positives is the more conservative approach. If, for instance, it is your responsibility to maintain a positive working environment and to prevent unionization in your company, you might take action even when your employee survey scores are relatively favorable. You are accepting a more sensitive trigger point (the point at which you take action), and in all likelihood you will take some actions that you did not need to. That might be more acceptable than setting a tougher, more rigorous trigger point, which leads to fewer false positives but may cause you to miss some cases where you should have taken action and did nothing.
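The trade-off can be sketched with a toy example. The survey scores, the "really needed action" labels, and the function name below are all invented for illustration. The only point is that moving the trigger point shifts errors from one type to the other rather than eliminating them.

```python
from typing import List, Tuple

# Hypothetical data: (favorability score out of 100, did this unit really need action?)
UNITS: List[Tuple[int, bool]] = [
    (35, True), (42, True), (48, False), (55, True),
    (58, False), (63, False), (70, True), (81, False),
]


def count_errors(units: List[Tuple[int, bool]], trigger: int) -> Tuple[int, int]:
    """Act on every unit scoring below `trigger`.

    Returns (false positives, false negatives):
    false positive = acted on a unit that did not need it,
    false negative = left alone a unit that did need it.
    """
    false_positives = sum(1 for score, needed in units if score < trigger and not needed)
    false_negatives = sum(1 for score, needed in units if score >= trigger and needed)
    return false_positives, false_negatives


if __name__ == "__main__":
    for trigger in (45, 60, 75):
        fp, fn = count_errors(UNITS, trigger)
        print(f"trigger={trigger}: false positives={fp}, false negatives={fn}")
```

With this invented data, the strict trigger of 45 produces no unnecessary actions but misses two units, while the sensitive trigger of 75 misses none at the cost of three unnecessary actions.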
No matter what anyone says to you about the accuracy of their testing or decision-making procedure, and regardless of whether we are talking about correctly identifying terrorists as they pass through an airport, medical tests, food inspection tests, personnel selection tests, unionization probability surveys, interpreting blips on a radar screen, or most other decision-making processes, they are all subject to error rates. Do not believe otherwise; anyone who says otherwise is selling snake oil, or perhaps contaminated peanut oil. Increasing the sample size and replication are two keys to reducing error rates, though neither works in every situation. (Other ways of improving your decision making include increasing the sensitivity of your measurement or measuring device; using completely different approaches to come to the same conclusion independently, for instance a second opinion or two or more different types of tests that measure the same issue; or establishing baseline measures against which you can compare future measures.)
When the phlebotomist draws blood from your arm to test for the presence of a certain bacterium, what they are drawing is a sample, which may or may not contain the pathogen. If all of the blood were drained from your body and carefully examined, the doctor would know for sure whether you were infected, but moving from a sampling technique to examining the universe (all the blood in your body) would likely kill you. Sometimes enlarging the sample size just won’t work. In this case replication might be the answer. If over the course of a few hours or days the test were repeated a few times, and each time the test gave the same answer, you could be much more certain that the correct conclusion had been reached. (Doctors typically use multiple different approaches to diagnose illness, examining you for a series of symptoms, X-rays, MRIs, blood work, physiological appearance, pain, etc., all pointing to the same conclusion and yielding an increased likelihood of a correct diagnosis.)
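How much replication helps can be sketched under one strong assumption: that each repeat is an independent draw with the same error rate. The 10% miss rate below is a placeholder, not a property of any real blood test, and in practice repeats on the same patient can share the same blind spot, which would make the real benefit smaller.

```python
def prob_all_repeats_miss(miss_rate: float, repeats: int) -> float:
    """Chance that every one of `repeats` independent tests misses an infection that is present."""
    return miss_rate ** repeats


if __name__ == "__main__":
    miss_rate = 0.10  # hypothetical; chosen only for illustration
    for repeats in (1, 2, 3):
        print(repeats, prob_all_repeats_miss(miss_rate, repeats))
```

Under that assumption, each additional agreeing result cuts the chance of a consistent wrong answer by a factor of ten.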
Sometimes increasing the sample size is the best course of action. If we wanted to know whether male and female voters had differing opinions on whom to elect president of the United States, we could pull a representative, random sample of likely voters containing about 700 females and 700 males. With 95% confidence, that would tell us the gap between how females and males would vote to within roughly +/- 5 points, assuming a 100 percent response rate, for it is not the number in the sample that is critical but the number of responses received. These 700 females and 700 males are acting as representatives of all the females and males who are eligible to vote in the presidential election. But in a tight election, being within 5 points of what will actually happen may not be good enough, for we could end up with a false positive or false negative in our conclusions, and so we need to increase our sample size. If we were able to get 1,200 female and 1,200 male responses to our political poll, the margin on that gap would narrow to roughly +/- 4 points with 95% confidence, yielding a more accurate conclusion. The gains come slowly, though: because the margin shrinks with the square root of the number of responses, getting within 1 point would take on the order of 19,000 responses from each group.
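The link between the number of responses and the margin of error can be checked with the standard normal-approximation formula. This is a rough sketch, assuming simple random sampling, two independent groups of equal size, and the worst-case split of 50/50 within each group. The function names are my own.

```python
import math

Z_95 = 1.96  # standard normal critical value for 95% confidence


def margin_single_group(n: int, p: float = 0.5) -> float:
    """Margin of error for one group's share (normal approximation, worst case p=0.5)."""
    return Z_95 * math.sqrt(p * (1 - p) / n)


def margin_gap(n_per_group: int, p: float = 0.5) -> float:
    """Margin of error for the gap between two independent groups of equal size."""
    return Z_95 * math.sqrt(2 * p * (1 - p) / n_per_group)


if __name__ == "__main__":
    for n in (700, 1200, 2800):
        print(f"n={n} per group: "
              f"one group +/- {100 * margin_single_group(n):.1f} points, "
              f"gap +/- {100 * margin_gap(n):.1f} points")
```

Under these assumptions, 700 responses per group puts the gap within about 5 points. Because the margin shrinks with the square root of the number of responses, halving it takes roughly four times as many.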
There are less commonly mentioned types of errors in decision-making processes that can also throw off your ability to make consistently appropriate decisions. One is called a Type III error, and that is when you come to the right decision but for the wrong reason. Say you get back the results of your web-purchased salmonella lab test on the peanut butter and it shows that the peanut butter is uncontaminated, and in reality the peanut butter is uncontaminated, but the test you used was not really capable of distinguishing between contaminated and uncontaminated lots of peanut butter. In that case you came to the right conclusion simply due to blind luck. Your results in this case may not be consistently reproducible. If we assume for a moment that it is rare for a batch of peanut butter to be contaminated, then our worthless test, if it came back each time saying the peanut butter was free of bacteria, would be right most of the time while still contributing no worthwhile information. Remember the old saying: even a broken clock is correct twice a day.
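The broken-clock point can be made concrete with a short sketch. The 2% contamination rate is an assumption chosen only to stand in for "rare", and the "test" here is the worthless one described above: it reports clean every time.

```python
def always_clean_scores(contamination_rate: float) -> dict:
    """Score a 'test' that reports clean for every batch, whatever the batch contains."""
    return {
        # It is right whenever the batch really is clean.
        "accuracy": 1.0 - contamination_rate,
        # It never flags a contaminated batch, so it catches none of them.
        "share_of_contaminated_batches_caught": 0.0,
    }


if __name__ == "__main__":
    rare = 0.02  # hypothetical contamination rate, for illustration only
    print(always_clean_scores(rare))
```

A headline accuracy figure near 98% looks impressive, yet the test catches none of the contaminated batches, which is the only thing it was bought to do.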
A tarot card or horoscope reader is guilty of making Type III errors. In this case the reader reviews the cards and makes an interpretation that may or may not come to pass. If it does come to pass, the customer is amazed at how the reader could have known and predicted the future. The reader “knew” simply because, when you make vague or general predictions, the human brain, which wants to believe, fills in the blanks: a Type III error, big time.
A Type IV error is when you take incorrect action based on correct findings from your testing or decision-making assessment. Say you use a selection battery to determine whether a candidate is a good fit as a potential employee for an organization. The test administrator gives the test and scores it. The test is a good, solid test, and the right conclusions are being drawn, but then the wrong course of action is set upon for whatever reason. Or if a doctor sees a patient, orders tests to determine whether an illness is present, correctly determines from the test results that it is, but then orders an ineffective treatment for the illness, that is a Type IV error.
Return to our airport, where you are charged with stopping terrorists. If you develop an assessment screen for passengers as they pass through security, and your screen is in fact measuring attributes that correctly identify terrorists, but for whatever reason they are still getting on the planes, that would be a Type IV error.
Clearly there is more to developing tests and decision-making paradigms than meets the casual observer’s eye. Doing it right takes time and effort. Snake charmers and so-called gurus exist all around us, ready to take whatever advantage they can, and in many cases today it is a buyer-beware market. But I am cheered in that I do know a lot of people who want to do it right, in fact most people, and they are working at the highest levels of their various professions.
© 2010 by OrgVitality, Jeffrey M. Saltzman. All rights reserved.
Visit OV: www.orgvitality.com