Like Looking for a Needle in an AI-Stack

The challenges of navigating federal agencies’ AI inventories

By CDT Intern Bowman Cooper

In December 2020, the White House released an Executive Order (EO) titled Promoting the Use of Trustworthy Artificial Intelligence in the Federal Government. The purpose of the EO was to push agencies to use AI in a way that preserves the public’s trust, including a requirement that agencies be “transparent in disclosing relevant information regarding their use of AI to appropriate stakeholders, including the Congress and the public” (emphasis mine).

Similarly, the Biden Administration’s Blueprint for an AI Bill of Rights states that Americans “should know that an automated system is being used and understand how and why it contributes to outcomes that impact” them.

Fast forward three years, to when I joined CDT as a summer intern and set out to review the inventories of AI use cases that federal agencies had compiled as the EO directed. Unfortunately, my experience trying to track down and understand the AI initiatives described by federal agencies showed those inventories to be anything but transparent and easy to understand. Although any level of visibility into the government’s use of AI is long overdue, true transparency requires both clarity and ease of use, descriptors that the agency inventories cannot claim.

The information provided by each agency is inconsistent and unclear, making it difficult for the public to understand exactly how the use of AI impacts them.

The 2023 Guidance provided by the Federal Chief Information Officers (CIO) Council to assist agencies in complying with the EO offers detailed requirements concerning what agencies must include when sharing the results of their AI inventories with other federal agencies, including a summary of the use case and its current stage of development, along with seven other categories. Despite this, the public inventories vary widely and are often anemic in the information they provide. For example, the level of detail in the summaries of AI uses is inconsistent across and even within agencies, ranging from one-line descriptions to full paragraphs.

Some use case summaries are so vague as to render them meaningless. For example, the Department of Labor uses “Claim Document Processing” technology to “identify if [a] physician’s note contains causal language by training custom natural language processing models.” This language is neither descriptive nor accessible to most members of the public. Does the program identify language that indicates that an injury or illness was caused by work-related activity, or something else? How is the system’s conclusion used? If the program impacts individuals’ eligibility for workers’ compensation or disability benefits, that impact should be made known.

Another demonstration of vagueness can be found in the description of the “handwriting recognition” program employed by the Social Security Administration. According to the agency-provided description, “Artificial Intelligence (AI) performs Optical Character Recognition (OCR) against handwritten entries on specific standard forms submitted by clients. This use case is in support of a Robotic Process Automation (RPA) effort as well as a standalone use.” Nothing in the description indicates the purpose of this technology. How is a typical person supposed to guess what purpose the OCR serves here and how it affects them (if they even know what optical character recognition is)?

Is it just meant to convert handwritten forms into text so they can be processed more easily? How would this impact individuals with motor difficulties whose handwriting may be difficult for the OCR to process? Would their claims be rejected or subject to delays? And that’s without getting into the issues that arise if the OCR system is designed to assist with something like fraud detection (seeing if the same handwriting appears on multiple forms, for instance, which could just be the local librarian assisting patrons). Again, for members of the public to understand how AI impacts them, there must be more clarity in agency descriptions. 

Where agencies are consistent is in failing to include information that is critical to interested parties who are trying to understand and keep track of the federal government’s use of AI: the dates that the inventories were posted and/or updated. Only 7 out of the 20 agencies that I reviewed provided the date that each use case was added or when the inventory was updated. 

Agencies are required to update their inventories on an annual basis, but because these updates do not happen on a specific date and agencies do not proactively communicate when their updates are posted, it is difficult for individuals to monitor whether there have been changes in AI use by the federal government, further undermining transparency and public accountability. 

Each agency presents its information in a different format, making the inventories clunky and difficult to navigate.

Despite the Guidance provided by the CIO Council in 2023, there is little uniformity in the way this information is delivered to the public. Some agencies, like the United States Department of Agriculture (USDA) and the United States Agency for International Development (USAID), have their AI use cases compiled in a downloadable Excel file. Others, like the Department of Health and Human Services (HHS), link to online PDF tables of their use cases. Still others present the information in a variety of other formats. Some formats are easier to read than others, with the Department of Transportation (DOT) winning my award for the least navigable: users have to mouse over cells in order to see their entire contents, and the publicly available use cases are buried among a pile of redacted ones.

Screenshot of how DOT’s inventory appears on Bowman’s laptop screen.
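
To get a feel for what this variety means for anyone trying to aggregate the data, consider a minimal sketch of the “one parser per agency” problem. The file names and column headers below are hypothetical, and each real inventory has its own layout, but an Excel inventory and a PDF inventory already demand entirely different tooling:

```python
# Minimal sketch of the "one parser per agency" problem.
# File names and column headers are hypothetical; each agency's
# real inventory uses its own layout.
import pandas as pd
import pdfplumber

# USDA-style inventory: a downloadable Excel file.
usda = pd.read_excel("usda_ai_inventory.xlsx")
usda_names = usda["Use Case Name"].tolist()

# HHS-style inventory: a PDF table that must be scraped page by page.
hhs_rows = []
with pdfplumber.open("hhs_ai_inventory.pdf") as pdf:
    for page in pdf.pages:
        table = page.extract_table()
        if table:
            hhs_rows.extend(table[1:])  # skip each page's header row
hhs_names = [row[0] for row in hhs_rows]

print(len(usda_names), "USDA use cases;", len(hhs_names), "HHS use cases")
```

Multiply that by twenty-plus agencies, each with its own quirks, and the cost of simply reading the inventories adds up quickly.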

The variety of formats used by the various agencies makes navigating the inventories complicated and time-consuming. In December 2021, the White House released another executive order requiring federal agencies to “design and deliver services in a manner that people of all abilities can navigate.” Importantly, the EO noted the “time tax” levied on Americans who waste excessive time navigating government services. 

Unfortunately, the way AI.gov currently presents information about AI use cases within the federal government creates just such a “time tax” for any American who wishes to understand how federal agencies use AI technology. If it is truly a priority of the government to be transparent about how it uses AI and how that use may impact individuals, that information should be presented in a way that is easily accessible by any American.

Conclusion and Recommendations

The government needs to focus on the consistency, navigability, and accessibility of its publicly available AI inventories. Fortunately, there are years of research on open data best practices and on how the government can make information easily available to members of the public.

For example, the information required for the published inventories should be standardized and consistent within and across agencies. The information provided should use language that members of the public can understand without a law degree or a PhD in computer science.

The inventories should include dates so that it is clear when use cases are added. Even better, the government could set up a notification system, allowing those interested to be alerted when inventories are updated. Finally, the format of the inventories matters: this information should be provided in ways that can be easily analyzed by computers (e.g., not PDFs).
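
To illustrate how dates and machine-readable formats work together, here is a minimal sketch of the monitoring that a standardized inventory would make trivial. The CSV layout and column names (use_case, summary, date_added) are hypothetical assumptions, not anything agencies currently publish:

```python
# Sketch: flag use cases added since a reader's last check.
# Assumes a hypothetical standardized CSV with "use_case",
# "summary", and "date_added" (ISO 8601) columns.
import csv
from datetime import date

def new_use_cases(path: str, since: date) -> list[dict]:
    """Return inventory rows added after the given date."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    return [r for r in rows if date.fromisoformat(r["date_added"]) > since]

for case in new_use_cases("agency_ai_inventory.csv", since=date(2023, 1, 1)):
    print(case["use_case"], "-", case["summary"])
```

With a file like that, a notification system is a few lines of code; with today’s patchwork of PDFs and web tables, it is a research project.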

Federal agencies have taken an important step in making more information available about their uses of AI, but there is a long way to go. More detailed guidance, along with agency compliance, centered on the EO’s goal of true transparency, is needed. All of these solutions would provide greater clarity to the growing number of people who wish to monitor government agencies’ use of AI. The time tax is real, and we shouldn’t be forced to pay it.