AI and Eyewitness Testimony

Featured Article

Journal of Applied Research in Memory and Cognition | 2024, pp. 1–11

Article Title

Does Artificial Intelligence (AI) Assistance Mitigate Biased Evaluations of Eyewitness Identifications?

Authors

Lauren E. Kelso; Department of Psychology, University of Virginia, United States

Jesse H. Grabman; Department of Psychology, New Mexico State University, United States

David G. Dobolyi; Leeds School of Business, University of Colorado Boulder, United States

Chad S. Dodson; Department of Psychology, University of Virginia, United States

Abstract

Artificial intelligence (AI) is playing an increasing role in human decision making. We use eyewitness lineup identification to show when AI assistance can and cannot help people avoid a cognitive bias known as the featural justification effect. People are biased to judge highly confident eyewitnesses as less likely to be correct when their lineup identification is based on an observable feature than on an expression of recognition. Our participants (N = 1,010) saw an eyewitness’s lineup identification, accompanied by the eyewitness’s verbal confidence statement (e.g., “I’m certain”) and either a featural (“I remember his eyes”) or a recognition justification (“I remember him”). They then rated the likely accuracy of the eyewitness’s identification. AI assistance eliminated the featural justification effect but only in participants who regarded the AI as very useful. This project is the first step in evaluating human–algorithm interactions before the widespread use of AI assistance by law enforcement.

Keywords

Explainable artificial intelligence, cognitive bias, eyewitness identification, cognitive forcing, artificial intelligence usefulness

Summary of Research

“From medical treatment to predicting recidivism, artificial intelligence (AI) is playing an increasingly larger role in human decision-making. We use eyewitness lineup identification as a model paradigm to show when AI assistance can and cannot help people overcome a cognitive bias when judging the accuracy of an eyewitness’s lineup identification. One example of a cognitive bias that distorts how people interpret an eyewitness’s lineup decision and confidence statement is the featural justification effect (FJE). When an eyewitness makes an identification from a lineup and refers to an observable feature in their confidence statement (e.g., “I am confident it’s him. I remember his eyes”), they are perceived as less likely to be correct as compared with when an eyewitness’s confidence statement is either recognition based (e.g., “I am confident it’s him. I recognize him.”) or consists of only an expression of confidence (e.g., “I am confident it’s him”) without an accompanying feature… However, no one has investigated whether AI assistance can improve people’s evaluation of an eyewitness’s identification. Previous studies have shown that AI assistance can mitigate certain cognitive biases, but it can also exacerbate the effect of other biases” (p. 2). 

“Will AI assistance improve people’s evaluation of eyewitness identifications by minimizing the FJE? We chose the FJE to test AI assistance for two reasons. First, the FJE has been shown to be a strong cognitive bias with respect to effect size... Second, the FJE illustrates an interdisciplinary problem of what causes people to misunderstand verbal probability statements… All participants saw a series of trials, each involving an eyewitness’s identification from a lineup that was accompanied by the eyewitness’s confidence statement. This confidence statement included either a featural or a recognition justification. Participants also received either no AI assistance (control condition) or AI assistance, which took one of three forms. They saw either the AI’s prediction about the likely accuracy of the identification (Prediction Only condition), the AI’s prediction as well as a graphical explanation (Prediction + Graphical Explanation condition), or they were in a Cognitive Forcing condition. Participants then rated the likely accuracy of the eyewitness’s identification… Our final sample consisted of 1,010 participants” (p. 3). 

“Can AI assistance help people overcome a cognitive bias? When judging the accuracy of a highly confident eyewitness’s lineup identification, people are biased against and perceive an eyewitness as less likely to be correct when their lineup identification is based on a visible feature (e.g., “I remember his eyes”) than when it is based on a recognition response (e.g., “I remember him”)—a bias that we call the FJE. Consistent with previous studies, participants in our control (No AI assistance) condition showed the featural justification bias and rated identifications as less likely to be correct when they were accompanied by featural statements than recognition statements” (p. 8).

“Our key novel finding is that we show that AI assistance can eliminate the featural justification bias. But whether or not this occurs depends on participants’ perception of AI usefulness; this bias is eliminated in participants who rate the AI as very useful, but it is robust in participants who distrust the AI. These results highlight the necessity of collecting participant perceptions of AI usefulness when evaluating the influence of AI assistance on people’s behavior. Previous work has shown that how useful participants find a tool to be influences the way that tool is used, and our findings further support this result” (p. 8).

“We also predicted that participants would be more resistant to considering the AI’s advice when evaluating featural than recognition statements. In addition, we predicted that this resistance would be more pronounced in the Cognitive Forcing and the Prediction + Graphical Explanation conditions because we thought that the colorization of particular words in these conditions would exacerbate the featural justification bias. These predictions were only partially supported. Though we did find more resistance to the AI’s advice when judging featural statements than recognition statements, this resistance was present in all conditions involving AI assistance” (p. 8).

Translating Research into Practice

AI Decision Aids Can Correct Cognitive Biases: People who evaluate eyewitness identifications are prone to the featural justification effect (FJE), judging identifications justified by an observable feature (e.g., "I remember his eyes") as less credible than those justified by recognition. AI assistance can help correct this bias—but only if evaluators find the AI useful.

Perceived Usefulness of AI Matters: The AI's ability to reduce bias depends on how useful users perceive it to be. Practically, this means training and user-experience design are critical for implementation; evaluators must understand and value the AI's guidance in order to benefit from it.

Simple AI Predictions May Be More Effective: Among the tested formats, the Prediction Only AI assistance (just showing the probability) was more effective than cognitive forcing or graphical explanations, implying that overcomplicating AI output may reduce impact.

Bias Is Strongest Without Support: In the absence of AI assistance, participants consistently undervalued highly confident identifications accompanied by featural justifications. Relying on unaided human judgment in high-stakes settings such as legal proceedings may therefore leave this cognitive bias unchecked.

Caution for Implementation in Real-World Systems: While promising, the study was conducted in a controlled laboratory setting, and the AI's accuracy predictions were confined to a narrow range (63%–86%). Further testing is needed before AI decision aids are integrated into real-world law enforcement practice.

Other Interesting Tidbits for Researchers and Clinicians

“Because there is no other (to our knowledge) research on the topic of AI assistance and eyewitness lineup identifications, our study leaves many questions open for future research. One important question is whether AI assistance can improve people’s ability to discriminate between correct and incorrect eyewitness identifications. Additionally, due to material constraints, our AI’s predictions were restricted to a range of 63%−86%. There is the potential that participants respond quite differently to an AI’s prediction when it is higher or lower than the range in our study. Overall, we are not arguing for the immediate adoption of AI assistance by law enforcement. Before that can happen, we need greater confidence that our AI predictions about identification accuracy—which are based on laboratory paradigms—scale up and generalize to real-world eyewitness lineup identifications” (p. 8).