The views expressed by contributors are their own and not the view of The Hill

Bridging tech and humanity: The role of AI foundation models in reducing civilian harm 

In August, the Department of Defense (DOD) established a generative AI task force charged with advising the department on how it might best use AI in everything from warfighting to health and business affairs. The task force should also consider how best to use a particular application of AI, called foundation models, in support of one of Secretary of Defense Lloyd Austin’s growing priorities: understanding, predicting and preventing civilian harm.

Foundation models can recognize patterns and make predictions like human brains; however, they lack the subjective experiences, contextual reasoning, creativity and embodied grounding that characterize human intelligence and cognition. While foundation models hold massive potential to limit civilian harm, they must be used to supplement — not replace — human analysts. 

Unlike many AI models, which are designed for niche tasks like recognizing faces in photos or predicting stock market trends, foundation models, like ChatGPT, are trained on vast datasets that include a wide variety of information from the internet. This means they’ve been exposed to everything from scientific papers to historical documents to movie reviews. Consequently, they can engage in conversations, generate creative text, translate languages and more — sometimes outpacing analysts. Their big-picture analysis could fill the inevitable gaps in human cognition that may otherwise lead to civilian harm.  

Indeed, such gap filling is necessary.  

A 2021 report by the RAND Corporation found that the DOD’s failure to protect civilians stems in part from a lack of the data and technology necessary to detect potential risks to civilians or to verify reports of harm. Threats to civilians escalate rapidly during conflict, making it difficult for human cognition alone to grasp the full landscape of harms. This is especially true when remote warfighting or reliance on partnered forces obscures the Defense Department’s on-the-ground view. Furthermore, data on harm often emerges from several disjointed sources, including official intelligence assessments, humanitarian and journalistic reporting, and social media chatter. These analytic shortfalls hinder the DOD’s processes for conducting civilian harm risk assessments and for completing post-hoc investigations of harm, which would otherwise help the department learn from past errors.

Foundation models, with their ability to analyze large datasets, can assist by collating data across different sources and highlighting patterns or anomalies that indicate potential risks or past harms to civilians. When used judiciously and in conjunction with human expertise, they can accelerate the DOD’s mission of minimizing civilian harm.  
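To make the idea of collating sources and highlighting anomalies concrete, here is a minimal sketch. It is not a foundation model; it swaps in a simple statistical check (a z-score on daily report counts) purely to illustrate the workflow. The source names, district names and report records are all made up for illustration, and `daily_counts` and `flag_anomalies` are hypothetical helpers, not part of any DOD system.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical, made-up records: (source, area, day). Not real data.
REPORTS = (
    [("ngo", "district_a", 1), ("press", "district_a", 2),
     ("social", "district_a", 3), ("intel", "district_a", 3)]
    + [("social", "district_a", 4)] * 4
    + [("ngo", "district_a", 4), ("press", "district_a", 4)]
    + [("social", "district_b", 1), ("ngo", "district_b", 1),
       ("press", "district_b", 2), ("social", "district_b", 3),
       ("intel", "district_b", 3), ("social", "district_b", 4),
       ("ngo", "district_b", 4)]
)


def daily_counts(reports):
    """Collate reports from all sources into per-area, per-day counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for _source, area, day in reports:
        counts[area][day] += 1
    return counts


def flag_anomalies(reports, latest_day, threshold=2.0):
    """Return areas whose latest-day count is unusually high versus earlier days."""
    flagged = []
    for area, by_day in daily_counts(reports).items():
        history = [by_day.get(d, 0) for d in range(1, latest_day)]
        if len(history) < 2:
            continue  # too little history to judge
        mu, sigma = mean(history), pstdev(history)
        latest = by_day.get(latest_day, 0)
        if sigma > 0:
            z = (latest - mu) / sigma
        else:
            # Flat history: any increase counts as anomalous.
            z = float("inf") if latest > mu else 0.0
        if z >= threshold:
            flagged.append(area)
    return sorted(flagged)


print(flag_anomalies(REPORTS, latest_day=4))
```

In this toy data, only the district with a sudden jump in reports on the last day is flagged. A flag of this kind is a prompt for a human analyst to look closer, not a conclusion in itself.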

As shown by a paper published last year by the Center for Naval Analyses, social media analysis and natural language processing are already being employed or researched to improve situational awareness for forces in combat. For example, a “cyber recon team” used social media analysis to identify a planned ambush and direct a combat team away from harm during training. The DOD should apply this same kind of analysis to assess and mitigate risk to civilians when planning combat operations.

In an interconnected world, achieving true situational awareness means delving deep into the social media landscape of a specific area. Here, foundation models such as large language models (LLMs) shine. They sift through tweets, posts and blogs to spot emerging patterns of harm. The power of these models becomes evident when they decipher messages filled with local nuances or evolving dialects.

In Ukraine, for example, civilians have provided live streams of information on harms and emerging threats to their safety on social media platforms like Twitter, Diia and eVorog. Human analysis alone may struggle to make sense of the “war feed” as it develops in real time, but LLMs can be used to analyze real-time open-source data streams and pinpoint potential escalation areas along with urgent humanitarian needs. LLMs have great potential for predictive analysis in conflict zones.

Satellite imagery can also be transformed into dynamic and informative data that protects civilians when combined with foundation models. Models such as Vision Transformers function not merely as tools that scrutinize isolated parts of an image but as systems that grasp the interconnectedness of every segment. Using these models’ computational capacities, military planners can decode patterns, trace historical trends, and ascertain potential civilian congregation points that could otherwise elude the human gaze. This capability can support the DOD’s endeavor to reduce civilian casualties by offering a more profound understanding of ground realities. 
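The "interconnectedness of every segment" corresponds to self-attention: a Vision Transformer cuts an image into patches and scores every patch against every other patch. The sketch below shows only that core step, under loud assumptions: the 4x4 "image" is made-up numbers rather than real imagery, and the projections are identity (a real model learns separate query, key and value projections and stacks many such layers). `to_patches` and `attention_weights` are hypothetical helper names.

```python
import math


def to_patches(image, size):
    """Split a square 2-D grid into flattened size x size patches, row by row."""
    n = len(image)
    patches = []
    for r in range(0, n, size):
        for c in range(0, n, size):
            patches.append(
                [image[r + i][c + j] for i in range(size) for j in range(size)]
            )
    return patches


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(patches):
    """Scaled dot-product attention weights with identity projections
    (an assumption for illustration; real models learn the projections)."""
    d = len(patches[0])
    weights = []
    for q in patches:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in patches]
        weights.append(softmax(scores))
    return weights


# Toy 4x4 "image": made-up pixel intensities, not real imagery.
IMAGE = [
    [0.9, 0.8, 0.1, 0.0],
    [0.9, 0.7, 0.0, 0.1],
    [0.1, 0.0, 0.8, 0.9],
    [0.0, 0.1, 0.9, 0.9],
]

WEIGHTS = attention_weights(to_patches(IMAGE, 2))
for row in WEIGHTS:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one patch attends to all four patches, itself included. In this toy grid, the bright top-left patch weights the bright bottom-right patch more heavily than the dark patches beside it, which is the sense in which distant segments of an image are related to one another.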

While human judgment remains invaluable in military decision-making, people naturally face challenges in rapidly processing vast amounts of real-time data and identifying nuanced patterns. The tragic drone strike in Kabul in August 2021, the significant civilian casualties during the battle of Raqqa, and the drone strike on a wedding procession in Yemen exemplify pitfalls in human decision-making for targeting. Captain Bill Urban, a U.S. Central Command spokesperson, told The New York Times in 2021 that human error — including misinterpretation, confirmation bias and incomplete information — has led to erroneous strikes. Foundation models can help address shortfalls in human decision-making.  

By synthesizing a colossal range of intelligence data alongside open-source data, they can be used to provide a rapid and comprehensive risk view. More than just a data processor, these models serve as a “gut check” by cross-referencing current decisions against historical data. They can identify potential oversights or biases and flag anomalies. This AI-driven layer of verification can bolster the targeting accuracy of combat drones, helping to align actions with the Biden administration’s “near certainty” standard in drone warfare and to minimize tragic errors.  

The true potential of foundation models lies in so-called unicorn technologies — that is, groundbreaking innovations that redefine our capabilities. Envision a conflict simulator enhanced by these models: it would not just mimic tangible events but would also capture the intricate socio-cultural dynamics and historical contexts.  

While platforms like Google Translate provide real-time language translation, foundation models offer an elevated experience. Drawing inspiration from initiatives like the Universal Dependencies project, unicorn technologies can aid in understanding and navigating the diverse grammatical structures and cultural nuances of languages. Such AI systems don’t merely translate; they bridge profound gaps in cultural understanding, offering soldiers an enriched perspective that facilitates more nuanced interactions in conflict zones.

It is vital to remember that the sophistication of foundation model-driven strategies doesn’t guarantee perfection. While they provide a data-rich perspective, some human intricacies might elude them. Critical decisions, especially life-impacting ones, must center on human judgment. These models also carry a risk of algorithmic bias. If models learn from skewed data, they might unknowingly reinforce these biases.  

A classic concern is data omission. Vulnerable groups that are less likely to be active online — such as the elderly, poor and rural populations — might be underrepresented. Additionally, disinformation remains a challenge. Without proper screening, foundation models might amplify false narratives or even be vulnerable to adversarial techniques like data poisoning. 
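One simple way to check for data omission is to compare each group's share of the collected data with its share of the population. The sketch below is illustrative only: the group names and all numbers are made up, the 0.5 cutoff is an arbitrary assumption, and `representation_ratios` and `underrepresented` are hypothetical helpers.

```python
def representation_ratios(sample_counts, population_shares):
    """Ratio of each group's share of the sample to its share of the population.

    A ratio well below 1.0 suggests the group is underrepresented in the data.
    """
    total = sum(sample_counts.values())
    return {
        group: (sample_counts.get(group, 0) / total) / share
        for group, share in population_shares.items()
    }


def underrepresented(sample_counts, population_shares, cutoff=0.5):
    """Return groups whose representation ratio falls below the cutoff."""
    ratios = representation_ratios(sample_counts, population_shares)
    return sorted(g for g, r in ratios.items() if r < cutoff)


# Made-up illustrative numbers, not real statistics.
POSTS_BY_GROUP = {"urban_under_65": 750, "rural_under_65": 200, "over_65": 50}
POPULATION_SHARE = {"urban_under_65": 0.5, "rural_under_65": 0.3, "over_65": 0.2}

print(underrepresented(POSTS_BY_GROUP, POPULATION_SHARE))
```

A check like this only reveals that a group is missing from the data; deciding how to compensate, for example by seeking out humanitarian or field reporting on that group, remains a human judgment.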

Foundation models don’t merely add a new instrument to our arsenal; they herald a paradigm shift in the landscape of conflict itself. By leveraging these advanced technologies, we’re not only venturing into a profound transformation of warfare’s identity, we’re unlocking unprecedented avenues to keep civilians out of harm’s way. In embracing this future, we commit to a vision where technology is not just an ally in warfare, but a guardian of humanitarian values.

Alisa Laufer is a policy analyst, Lucy Shearer is a research assistant, and Joshua Steier is a technical analyst at the nonprofit, nonpartisan RAND Corporation.