A couple of weeks ago a research article published in Science Advances made the rounds of healthcare policy Twitter. The article claimed a correlation between police shootings of unarmed black men and black infants being born underweight and premature. The research design took two datasets, one of fatal police shootings and one of births in California, mapped them, and linked each birth to local police shootings that occurred during the infant's gestation within 1 to 3 kilometers of the infant's residence. The working theory was that widespread perception of police discrimination against blacks, combined with exposure to local incidents of police brutality against blacks, was placing undue stress on black mothers, with serious consequences for the gestating infants.
The research appeared to have found a significant result, with police shootings of unarmed blacks associated with negative effects on black infants' gestation period and birth weight. Such a finding would have obvious and serious implications, and the signal was quickly magnified by the usual suspects as well as Twitter in general. Once I looked at the article myself, however, things didn't sit right. Whenever I'm mentoring people I tell them to always do a sanity check and make sure that things make sense. If you look at the results:
The first thing you notice is that unarmed black American victim / black infant is effectively the only combination that produces a significant result. Does this make sense? If the working effect depends on a perception of systemic injustice, why are infants not affected by shootings of armed blacks? Similarly, the police are not exactly renowned for equitable treatment of Hispanics, so why is there no effect for Hispanic infants? In fact, shouldn't we expect mothers to be stressed by shootings of unarmed citizens regardless of the victim's race? This didn't pass my first sanity check. The second thing you'll notice about the results is the size of the confidence intervals for black infants relative to the other groups. This made me wonder whether something was going on with the sample sizes, and since I couldn't find any information in the main article about the number of babies in each group, I looked into the supplemental materials.
I will leave it to the statisticians to debate whether these samples are sufficiently powered. The thing I noticed is the ratio of infants exposed to unarmed vs. armed black victim shootings – 3,296 : 3,888. I've looked into the statistics on police shootings and general gun violence in the past, when the "get in your lane" debate started (I encourage you to explore these publicly collected datasets yourself; you will be sure to come to surprising conclusions on the subject), and from the police shooting dataset maintained by the Washington Post I knew that the true ratio of unarmed:armed police shootings was much closer to 15:85. Fortunately the author supplied the data used for this analysis. This is unfortunately not common practice, and I'm sure many of you have also had cases where you strongly doubted the findings of a research article but couldn't contest them because the data was not available. After teaching myself enough R to open the file, I matched the case IDs in the research dataset against the case IDs in the Fatal Encounters database from which the data was drawn. That let me research the news stories behind these cases, and I found numerous cases coded as unarmed despite concrete evidence that the victim was armed at the time of the shooting. I have my own theories of how these errors happened, but it's all speculative (hint: there are 2,025 cases).
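For readers who want to run the same kind of cross-check, here is a minimal sketch of what it amounts to. The file name and column names below are assumptions on my part rather than the paper's actual schema, but the mechanics are just a join on the case ID followed by a tabulation, which is enough to surface the roughly 46:54 unarmed-to-armed split that first caught my eye.

```r
# Minimal sketch of the cross-check; file names and column names are
# hypothetical (the supplied dataset will have its own schema).
shootings <- readRDS("shootings.rds")          # the author's supplied dataset (assumed .rds)
fe        <- read.csv("fatal_encounters.csv")  # an export of the Fatal Encounters database

# Join the research coding back to the source records by case ID
merged <- merge(shootings, fe, by = "case_id")

# Sanity check: how many cases are coded unarmed vs. armed?
table(merged$armed)
prop.table(table(merged$armed))  # ~46% unarmed is far from the ~15:85 split in the WaPo data

# Pull the unarmed-coded cases so the underlying news stories can be reviewed by hand
unarmed_cases <- merged[merged$armed == "UNARMED",
                        c("case_id", "name", "date", "news_link")]
write.csv(unarmed_cases, "unarmed_to_review.csv", row.names = FALSE)
```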
After bringing these cases to the author's attention and some back and forth, the author stated that he had redone the analysis, that there was no longer a significant effect, and that the paper was being retracted. I'm not sure what to make of the statement that he reviewed the data. It took me 4 hours to review the news stories and recode just 36 cases. Did the author really review all 2,025 cases in the dataset in just a week? Did he have help? Whatever the case, the author must be commended for choosing to make the data available, for responding to the concerns, and for promptly issuing a retraction.
The reason for this post is that in the discussion that ensued after the retraction, a number of people opined that the paper should not be retracted but instead published with the null result. The basic argument is that a null finding is still a finding, and that in addition to fighting the academic bias toward publishing only significant findings, it would further our knowledge and help build a foundation for future research. I disagree with the sentiment that publishing the null result furthers our knowledge, and I propose that this belief makes the same mistake as the original: drawing conclusions without a proper understanding of the underlying reality. As a systems engineer in training, my bias is to understand how data is collected and what it represents before even considering how to operationalize it into system improvement, but in my experience this is something that rarely happens in healthcare policy research. I've written previously about how data fails to accurately represent the underlying reality in comparative system research. By reducing a complex system to a set of binary or even continuous variables, we fail to capture the complexity of interactions and behaviors inherent to that system. Furthermore, scientific analysis imposes a required myopia on researchers: focusing on only one variable while holding all else constant, an obvious conceit when dealing with organic human systems. I'll use the cases from 2014–2017 that I recoded in this police shooting dataset to demonstrate the limitations of the scientific method at drawing conclusions about the real world. As a reminder, each case was originally coded as a simple binary ARMED/UNARMED, and the hypothesis was that the stress of proximity to discriminatory police shootings would impose stress on mothers and adversely affect infants. I will not go into the cases that I found to be clearly coded incorrectly (the victim is known to have had a gun or knife) because that's not relevant to the point I'm trying to make here.
Before we start, let's pay respect and remember the names of the men and women who were unarmed and should never have been shot by the police. Many of these stories involve homelessness and mental illness and are a direct reflection of the mass alienation tolerated by our economic and political structures. These innocents are: Tommy Yancy Jr., Paul Ray Kemp Jr., Jacorey Calhoun, Charly Keunang, Brendon Glenn, Kris Jackson, Nathaniel Harris Pickett, Rodney Watts, Donnell Thompson, Alfred Olango, Nana Àdomako, Mark Roshawn Adkins, Keita Oneil, Elena Mondragon.
The first set of cases is those in which the definition of armed/unarmed explicitly fails to reflect the situation. In 2014, Andre Milton was publicly smashing his girlfriend's head into a nearby vehicle when the female officer, unable to stop him, chose to use her firearm. Also in 2014, Michael Laray Dozer grabbed the handle of a gas pump, soaked a woman with gasoline, and was trying to use a lighter when he was shot by the officer. In a strictly literal sense of the word, these cases could be coded as unarmed. But would any person say that it was unreasonable for the officers to use their firearms in these cases? In the context of the hypothesis, does it make any sense to say that women were stressed by systemic discrimination when a man brutally beating his girlfriend was stopped?
The second set of cases is those in which the concept of armed/unarmed comes under much greater scrutiny, because the urgency that was so clear in the previous set is far more vague here. The first category in this set is cases in which there is a physical struggle between the officer and the victim, and at some point the victim grabs for the officer's firearm or taser. This is the case for James McKinney, Ezell Ford, Anthony Ashford, and Leroy Browning. The police officers involved in these shootings were determined to have acted in self-defense or defense of a partner and did not have charges pressed against them. The author raised the legitimate issue that many of these cases only have other officers as witnesses, and that there's a known concern about police departments covering up to protect their own. But if we correct for systemic bias by assuming that the victim in all these cases never grabbed the firearm, or that they were unarmed because the firearm in question was not theirs, are we sure that we're not introducing an even greater bias into the data? If that's the case, how should one handle cases in which the victim actually did get hold of a firearm or taser and even used it on an officer, such as Carl Blossomgame? And what about cases where a vehicle is being used to endanger officer and civilian lives, such as the case of Dion Ramirez? If you investigate the statistics on police shootings you'll know that this is often how women become police shooting victims. In many jurisdictions, police officers are allowed to treat drivers who use their vehicle in a way that could endanger others as dangerous and to use their firearms to stop them. The issue of police self-reporting bias is present here too, but how would you treat cases where civilian or recorded footage corroborates the victim using their vehicle to ram officers themselves or their vehicles, such as the cases of Nephi Arriguin, Jessice Williams, and Michelle Lee Shirley? Were these victims armed or unarmed? Is a car used as a weapon actually a weapon? What about corded electric clippers swung like a ball and chain? And what about the case of Marquintan Sandlin, who is not known to have had a weapon, when Kisha Michael, who died in the same shooting, did have a gun with her in the vehicle?
In the next set of cases, our understanding of what constitutes a police shooting of an unarmed victim is challenged. Dominic Andrew Hutchinson shouted that he was armed and, when the police approached the door, rushed the officers while gripping a set of binoculars like a firearm. There are many such cases of "suicide by police". The victim didn't have a weapon and was in principle harmless, but the research hypothesis was that unjustified shootings cause stress to mothers. How are we to expect the police to react in this scenario? Should the case be coded as armed/unarmed based on reality, or based on the information the police acted on? Similarly, Augustus Crawford was being chased as a suspect in another case and was suspected to have a gun. The gun associated with the previous case was found in the area after the shooting. The author suggested that this case was up to interpretation. The interpretation seems to be whether Crawford discarded the weapon during the chase without the police knowing, whether the police knew he had discarded the weapon and still shot him, or even whether the police planted the weapon. How are we to interpret this? The Washington Post interpreted it as a shooting of an armed victim. Similarly, how are we to interpret the case of Marquez Warren, who broke into the home of off-duty police officer Vedder Li and was shot with a personal (not service) firearm? Is this a police shooting? The Washington Post did not include the case in its database.
Finally, there are the cases where there's simply not enough information. Damian Murray held a teacher hostage for 6 hours before being shot by SWAT, but the news does not confirm or deny whether he had a weapon. The news coverage of Jeffrey Smith's case says that whether the victim was armed is still under investigation. These cases were coded as unarmed, but is that appropriate? Should they be coded at all? Should more information be collected by reaching out to the local authorities? Or should the data be discarded?
The point I hope I've conveyed is that reducing complex events to a simple binary is completely insufficient when trying to provide evidence for a complex hypothesis like public perception of police brutality against a racial group resulting in long-term health consequences for infants. The best that such evidence can support is an opinion, precisely because so much of the data is compromised or confounded in one way or another, and because the way the data is coded often comes down to opinion itself. In the above events, a reasonable case can be made for both the armed and unarmed option. That's a major problem. Is a car driving at you a weapon? If you're a strict legalist then it may not be, but if you're in situ and the vehicle is coming directly at you, I bet you're more than likely to interpret it as a weapon that threatens you. The connection to healthcare system research is that this flattening of complex information and interpretation is pervasive in policy research. If you take Commonwealth Fund data saying that some percentage of people were seen within 3 hours of arriving at a hospital, and treat that as a metric of healthcare accessibility, what does that actually mean? What is the underlying reality? Was the patient "seen" for 2 minutes and told to come back next week for a scheduled clinic appointment, or was the patient seen for a 2-hour full workup? I don't know, and I guarantee that the researchers don't either. The bias of research toward finding significant differences causes the data to be flattened to the point that it no longer represents the system from which it was drawn. Making conclusions and policy based on this data is, in my opinion, flying close to fraud. I encourage healthcare system researchers to not only seek data but to deeply understand what the data means qualitatively, what it doesn't mean, and what its limitations are, and to be honest and explicit about these details. It's the only way to avoid the systemic disasters that are guaranteed when we try to reduce reality to a binary.
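As a small postscript to the Commonwealth Fund point above, here is a toy sketch, with entirely made-up numbers and hypothetical column names, of how two very different realities flatten into the identical "seen within 3 hours" statistic:

```r
# Toy illustration only: the numbers and columns are invented.
set.seed(1)
n <- 1000

# System A: everyone is "seen" quickly, but the visit is a 2-minute
# instruction to come back for a clinic appointment next week.
system_a <- data.frame(wait_hours = runif(n, 0, 2.5), visit_minutes = 2)

# System B: similar waits, but the visit is a 2-hour full workup.
system_b <- data.frame(wait_hours = runif(n, 0, 2.5), visit_minutes = 120)

# The flattened "accessibility" metric: share of patients seen within 3 hours.
mean(system_a$wait_hours <= 3)  # 1
mean(system_b$wait_hours <= 3)  # 1
# The metric is identical even though the underlying care is nothing alike.
```

The binary "seen/not seen" tells you nothing about what "seen" actually meant, which is exactly the information a policymaker would need.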