By CDT Intern Bowman Cooper
In my first few days as a CDT intern, I analyzed the various federal agency AI inventories, describing ways in which they fall short and could be improved. As I read through the inventories, I noticed three examples of ways that the federal government is using AI that warrant further attention:
- Chatbots: Several federal agencies are using AI-powered chatbots meant to assist the public in navigating agency websites, which is particularly important given the popularity of generative AI like ChatGPT.
- National security: The Department of Homeland Security (DHS) is using AI for security and surveillance purposes, while the State Department is using it for the detection of potential mis- and disinformation.
- Veterans’ mental health: Finally, and perhaps most concerning, the Department of Veterans Affairs (VA) is using AI to predict mental health outcomes, including deaths by suicide and overdose.
Seven of the 20 agency inventories I reviewed contained chatbot use cases. Some chatbots are simply tailored toward helping users navigate agency websites so that they can find the information they’re looking for. For example, one of the chatbots featured in the Department of Health and Human Services’ (HHS) inventory is the NIH Grants Virtual Assistant, which assists “users in finding grant-related information via OER resources.”
Many of the chatbots, however, play a customer service role, answering questions asked by users navigating the websites. For instance, HHS has another chatbot (simply labeled as “Chatbot” in its inventory) that “can respond to plain language queries in real-time using natural language processing.” Similarly, the Department of Education’s (ED) Aidan chatbot “uses natural language processing to answer common financial aid questions and help customers get information about their federal aid on StudentAid.gov.” According to the inventory, Aidan has interacted with 2.6 million individuals seeking information about student aid in only two years. Aidan can respond both to predetermined questions and to original typed questions.
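Aidan’s actual implementation is not public, so as a rough illustration only: a minimal FAQ-style chatbot can match a typed question against a stored question bank by word overlap. Every entry below is invented for the sketch.

```python
# Illustrative only: a toy FAQ matcher of the general kind an agency
# chatbot might use for "predetermined questions." Aidan's real design
# and question set are not public; these entries are hypothetical.
FAQS = {
    "what is the fafsa": "The FAFSA is the Free Application for Federal Student Aid.",
    "when is the fafsa deadline": "Federal and state deadlines are listed on StudentAid.gov.",
    "how do i check my loan balance": "Log in to your StudentAid.gov account dashboard.",
}

def answer(question: str) -> str:
    """Return the stored answer whose question shares the most words."""
    words = set(question.lower().strip("?").split())
    best, overlap = None, 0
    for stored, reply in FAQS.items():
        shared = len(words & set(stored.split()))
        if shared > overlap:
            best, overlap = reply, shared
    return best or "Sorry, I don't have an answer for that."
```

A production system would use far more robust language understanding than word overlap, which is exactly where generative AI (and its hallucination risk) enters the picture.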
Of course, chatbots are not new, so it is unsurprising to see their use by federal agencies, especially as they seek to provide individuals with a better customer experience (as directed by the current administration in its Customer Service Executive Order). That said, what is new is the use of generative AI by chatbots, which raises new questions about the promises, and perils, of updating chatbots with tools like ChatGPT. For example, known risks like hallucinations (in which generative AI makes up information) could undermine agencies’ efforts to provide individuals with accurate information that makes it easier for them to access public benefits. Therefore, agencies that are using chatbots should exercise caution in whether, and how, they incorporate generative AI into their existing chatbots.
Another prominent set of use cases involves DHS’s and the State Department’s application of AI in law and immigration enforcement. Both agencies are applying AI to a range of activities, from detecting immigration benefits fraud to training cameras to identify moving objects to tracking disinformation, but, unfortunately, the descriptions of these use cases left me with more questions than answers. Additionally, the DHS inventory lacked references to some examples of AI known to be used by the agency.
Immigration enforcement: Within DHS, AI is being employed in a variety of ways by U.S. Citizenship and Immigration Services (USCIS) and Immigration and Customs Enforcement (ICE).
USCIS is largely utilizing AI to verify information in applications and forms related to immigration, asylum, and immigration benefits. For example, the Fraud Detection and National Security (FDNS) Directorate is developing a case management system called FDNS-DS NexGen, which will purportedly “speed up processing” of immigration benefits forms with the use of AI and machine learning. FDNS’s stated goal is to “strengthen the integrity of the nation’s immigration system,” in part by identifying and deterring immigration benefit fraud. FDNS-DS NexGen may well increase the speed of document processing, but how accurate will it be? Might it mistakenly flag legitimate applications as fraudulent?
Another use case that stood out to me is the Sentiment Analysis tool used by the USCIS Service Center Operations Directorate (SCOPS) in surveys it administers to individuals seeking immigration benefits; the tool applies sentiment ratings from “strongly positive to strongly negative” to categories of questions in the survey. Like FDNS, SCOPS’s stated goal is to ensure “the integrity and security of our immigration system.” What is the purpose of this survey? What questions are asked? If an applicant provides responses deemed “negative” by the tool, might they be denied benefits?
ICE is developing a platform called the Repository for Analytics in a Virtualized Environment (RAVEn). The stated purpose of this project is to “support ICE’s mission to enforce and investigate violations of U.S. criminal, civil, and administrative laws.” And RAVEn certainly has the potential to make ICE’s work more efficient — for better or for worse. Earlier this year, the Brennan Center for Justice reported its concerns about RAVEn, which included its capacity to provide a “mountain of information” about individuals, which allows agents to construct detailed profiles for purposes of surveillance.
According to the Brennan Center, this “mountain of information” is sourced from a variety of sellers, and agents are not required to verify the accuracy of the data they acquire. So, how accurate is the data compiled by RAVEn? How likely is it that ICE might mistakenly flag individuals on the basis of inaccurate information?
Interestingly, DHS’s inventory does not mention certain known uses of AI, such as those involving facial recognition technology. U.S. Customs and Border Protection (CBP) currently uses facial recognition in airports and proposed to expand its usage in 2020 (CDT’s comment opposing this expansion can be found here). ICE has also conducted facial recognition searches of driver’s license photos to aid immigration enforcement efforts.
Surveillance activities: CBP is also using AI as a tool for surveilling the border. One such tool is its Autonomous Surveillance Towers, which scan “constantly and autonomously” for “items of interest” and are capable of detecting and recognizing movement. These “items” include human beings thought to be crossing the border without authorization, away from the designated entry points.
The towers can automatically pan to identify people and the items they may be carrying, including weapons. Using AI and machine learning, the towers can alert human operators and track people and other “items of interest.” Again, it would be useful to have further information about this system’s accuracy, any assessments done of its potential biases, and how the towers are integrated into overall border security and surveillance.
Mis/disinformation: The State Department seems particularly focused on identifying potential sources of disinformation online, which is unsurprising, given its proliferation in recent years and the threats it poses to national security. Largely, the identified AI use cases revolve around detecting groups or “communities” of accounts that may be responsible for the dissemination of disinformation. The Louvain Community Detection program “[u]ses social network analysis to cluster nodes together into ‘communities’ to detect clusters of accounts possibly spreading disinformation.” Another use case, referred to as “Disinformation Topic Modeling,” works to discern potential topics of disinformation.
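The inventory names the Louvain method but says nothing about how it is applied. Louvain works by greedily optimizing a quantity called modularity, which rewards partitions whose groups are densely connected internally. The sketch below uses a simpler pairwise greedy modularity merge (not Louvain’s faster local-move heuristic) on an invented co-sharing network, purely to illustrate the idea of clustering accounts into “communities.”

```python
# Sketch of modularity-based community detection, the idea behind the
# Louvain method. This simplified greedy pairwise-merge variant is not
# Louvain itself, and the account network below is invented.
def modularity(adj, communities, m):
    """Newman modularity Q of a partition of an undirected graph."""
    q = 0.0
    for comm in communities:
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree = sum(len(adj[u]) for u in comm)
        q += internal / m - (degree / (2 * m)) ** 2
    return q

def detect_communities(edges):
    """Greedily merge communities while modularity keeps improving."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    m = len(edges)
    comms = [{n} for n in sorted(adj)]
    improved = True
    while improved and len(comms) > 1:
        improved = False
        base = modularity(adj, comms, m)
        best_gain, best = 0.0, None
        for i in range(len(comms)):
            for j in range(i + 1, len(comms)):
                trial = [c for k, c in enumerate(comms) if k not in (i, j)]
                trial.append(comms[i] | comms[j])
                gain = modularity(adj, trial, m) - base
                if gain > best_gain:
                    best_gain, best = gain, trial
        if best is not None:
            comms, improved = best, True
    return comms

# Toy network: accounts a1-a3 and b1-b3 each densely co-share links
# among themselves, with one weak tie between the two groups.
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("a3", "b1")]
clusters = detect_communities(edges)
```

Note what such a tool does and does not show: it finds accounts that behave similarly, but densely connected clusters are not inherently disinformation networks, which is why the inventory’s hedge “possibly spreading disinformation” matters.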
The use of deepfakes to perpetuate disinformation is also of concern to the State Department, which employs a “Deepfake Detector” to classify images of people as either genuine or synthetically generated. The State Department is also using AI to study how disinformation narratives gain steam: the “Image Clustering for Disinformation Detection” use case analyzes similar images as a way of understanding “how images are used to spread and build traction with disinformation narratives.” Automated content analysis tools like these, for both text and multimedia, have fundamental limitations around robustness, accuracy, and bias. More detail about how they work, how they are used, and what testing the State Department has done is needed to evaluate them.
Veterans’ Mental Health
The VA uses AI in a variety of ways, primarily to streamline care and improve health outcomes. The most concerning use cases I identified relate specifically to mental health outcomes, namely, the detection and prevention of suicide and overdose deaths.
In 2017, the VA and the Department of Energy (DoE) began partnering with the mission of improving veteran health outcomes. One of the stated goals of this collaboration is to “improve identification of patients at risk for suicide through new patient-specific algorithms.” True to this goal, the VA lists in its AI inventory five use cases related to the collection and application of patient data in order to improve the agency’s ability to identify patients who are at risk of dying by suicide. Though this objective is certainly an admirable one, secondary uses of individuals’ personally identifiable information, especially sensitive information like risks of self-harm, often result in loss of trust and are more prone to irresponsible use when there is little to no transparency.
Moreover, even when patients give consent to have their information used in other ways, providing meaningful consent is very challenging, especially if the choice is between receiving urgent medical care and having their data shared with and used by someone other than their provider. The inventory is generally unclear about patients’ knowledge of, or consent to, this use of their data, as well as about any privacy protections in place.
In addition to secondary uses of medical information, two specific use cases raise red flags: both are high-stakes and appear to lack evidence that they work. The SoKat Suicidal Ideation Detection Engine, developed through a partnership between the VA and SoKat, an AI consulting company, is a pilot tool that purports to use natural language processing to flag language in text responses associated with suicidal ideation, so that “the Veterans Crisis Line will be able to more quickly identify and help Veterans in crisis.”
The hope is to reduce the burden on the Veterans Crisis Line caused by a high false-positive rate in detecting cases of suicidal ideation. Naturally, an overloaded crisis line will be less effective at accomplishing the goal of supporting veterans facing mental health crises. However, a tool that lowers the false-positive rate might in turn increase the false-negative rate. This is worrisome given the stakes, and it calls for rigorous testing and transparency.
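This tradeoff ultimately comes down to where the flagging threshold is set. A toy sketch with invented classifier scores (nothing here reflects SoKat’s actual model or numbers) makes the tension concrete:

```python
# Hypothetical risk scores from a text classifier (1.0 = certain crisis).
# All numbers are invented for illustration; the SoKat engine's actual
# scores and thresholds are not public.
crisis = [0.95, 0.80, 0.65, 0.40]             # messages truly indicating crisis
non_crisis = [0.70, 0.55, 0.30, 0.10, 0.05]   # messages that do not

def rates(threshold):
    """False-positive and false-negative counts at a given flagging threshold."""
    fp = sum(s >= threshold for s in non_crisis)  # wrongly flagged
    fn = sum(s < threshold for s in crisis)       # crises missed
    return fp, fn

strict = rates(0.75)   # high bar: fewer false alarms, more missed crises
lenient = rates(0.35)  # low bar: fewer missed crises, more false alarms
```

In this toy data, the strict threshold produces zero false alarms but misses two real crises, while the lenient one catches every crisis at the cost of two false alarms; no threshold eliminates both errors, which is why published error rates matter for a tool like this.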
Like SoKat, Behavidence represents another partnership between the VA and a private company. Behavidence is an app that can be installed on veterans’ phones; once installed, it compares users’ “phone usage to that of a digital phenotype that represents people with a confirmed diagnosis of mental health conditions.” Behavidence’s website states that the data collected is anonymized and that the app cannot read any content within apps opened on a user’s phone. Even so, the idea that the VA might “remotely monitor… daily mental health scores” of patients, as determined by their phone usage, is unnerving. The app is also advertised to other employers and is available to individuals who wish to see their daily stress scores. In fact, I downloaded the app. It doesn’t work very well.
SoKat and Behavidence became affiliated with and supported by the VA in 2021 through an innovation competition called the VA AI Tech Sprint, reinforcing the need for closer attention to how federal funds are used to engage companies offering AI-based products.
As I stated in a previous blog post, the lack of information provided in some of these use cases is concerning and makes them difficult to evaluate. Many of them are being used in ways that could have major impacts on real people’s lives—such as those seeking immigration benefits, asylum seekers, and patients of VA hospitals. To fully understand how these technologies are being applied, we need clearer, more thorough information about them.
As a step in that direction, CDT has just submitted a Freedom of Information Act (FOIA) request to the VA for more information about the way they are using AI to predict veterans’ mental health outcomes.