The NSA’s Laziness Masquerading as Reasonableness
The “Summer of Snowden” has thrown into stark relief just how difficult it is to make sense of laws designed to protect privacy given the current state of surveillance technology. Perhaps nowhere is this difficulty more evident than in the 2011 Foreign Intelligence Surveillance Court (FISC) opinion concerning so-called “multi-communication transactions” (MCTs) that the US government declassified and released to the public as a result of a Freedom of Information Act request filed by the Electronic Frontier Foundation. Because the FISC opinion is partially redacted and cites other documents that are not public, it is impossible to draw completely firm conclusions about the NSA surveillance activities it describes. But a close reading suggests that the NSA’s arguments about its technological capabilities are disingenuous and that the FISC lacks the technical understanding to challenge the government’s assertions.
Understanding the FISC opinion is no simple task. Rather than adopting the well-understood terminology of the technical community, the NSA has created its own jargon for describing what it does. To keep the analysis simple, this piece focuses on the NSA’s collection of email (similar analysis is possible for voice calls and other kinds of communications).
The FISC opinion concerns the NSA’s “upstream” collection of communications, broadly understood as the NSA’s collection of Internet traffic as it transits the physical infrastructure (wires, cable, switches, routers, etc.) of telecommunications companies like Verizon and AT&T. Upstream collection is distinct from PRISM collection, which reportedly involves collection of emails directly from the companies that provide email services, such as Google and Microsoft.
The goal of the upstream collection is for the NSA to acquire “communications” – individual emails – to, from, or about individuals or organizations targeted for foreign intelligence purposes, such as terrorists or terrorist organizations. The NSA does this by collecting “transactions,” which are essentially streams of raw data extracted from telecommunication infrastructure. The opinion cites an NSA definition of “transaction” as “a complement of ‘packets’ traversing the Internet that together may be understood by a device on the Internet and, where applicable, rendered in an intelligible form to the user of that device.” Thus upstream collection involves the NSA obtaining streams of packets that, if reassembled, could form complete subject lines, email messages, or inbox listings.
The NSA and the FISC distinguish between two types of transactions. “Single Communication Transactions” (SCTs) contain only a single discrete communication: one email, or perhaps the subject line and to/from addresses of one email. “Multi-Communication Transactions” (MCTs) contain more than one communication: multiple emails, or a list of subject lines of emails sent or received to or from an individual, for example. The FISC opinion implies that all of the communications within one MCT are either to or from the same email address.
Prior to this summer’s revelations, it was generally understood that the NSA aimed to collect only emails that were to or from individuals or entities targeted for surveillance. But the targeting guidelines that the government released earlier this summer
Holes in the NSA’s Logic
The FISC opinion arose because the government previously disclosed to the Court that in the process of acquiring MCTs, the NSA is collecting communications that are neither to, from, nor about surveillance targets. What the NSA explained to the Court is that its surveillance equipment extracts transactions, both SCTs and MCTs, that contain identifiers for surveillance targets (“Zawahiri” in the example above). But some MCTs that it collects also contain communications that are neither to, from, nor about any target. NSA cites as an example a user who logs into his email inbox and receives a list of emails, with the sender, subject line, and date of each one listed. If one of the subject lines is a news headline about Zawahiri, or if one of the messages is to or from Zawahiri, the NSA has said that it sometimes collects the entire list of emails in one MCT.
The NSA claims that it collects these other communications within an MCT because its equipment cannot distinguish between discrete communications within an MCT at the point of acquisition. In fact, the NSA explained to the Court that it couldn’t even determine whether a transaction contains a single communication or multiple communications at the point of acquisition. As a result, the NSA collects and stores many communications that are neither to, from, nor about a surveillance target. Instead of selecting for retention only the communications within an MCT that contain mention of the surveillance target, it stores those along with every other communication within the MCT. Nearly all of these communications, which are collected solely because the NSA claims that it is infeasible to distinguish one communication from another at the point of acquisition, are retained for at least two years. If an analyst searches for and finds a communication within an MCT that is “wholly domestic” – provably to and from persons within the US – then that MCT is deleted unless one of several exceptions applies. But this rule only applies to communications that analysts explicitly search for and that they can definitively determine to be wholly domestic. No other MCTs get purged. Prior to the 2011 FISC opinion, there was no such rule for deletion and all MCTs were retained for at least five years.
There are several reasons to question how the NSA handles MCTs. First, how is it possible that the NSA is able to find target identifiers at the point of acquisition, but not the beginning and end of a single communication? Consider the user who logged into his inbox and received the list of emails, one of which contained a news headline containing the name “Zawahiri.” When that inbox information is sent over the network, it begins with some word or symbol (known here as a “delimiter”) to indicate that a list of emails is about to be sent. For example, one popular email protocol uses the word FETCH for this purpose: the user’s email program sends the FETCH command to his email server, and the response from the server will likewise begin with the word FETCH. Every email protocol uses some such delimiter, because this is the only way that email programs can know that they are about to receive a particular kind of information that needs to be displayed to the user – in this case, a list of subject lines, senders, and dates. Similarly, email protocols must use delimiters to indicate the end of each line of the response so that email programs will know where one subject line ends and the next one starts. Otherwise, they would not be able to display the inbox information properly. This is how all email protocols work – and how the vast majority of all Internet protocols work. There must be delimiters sent to distinguish one piece of data from the next so that the software on either end of the communication knows how to interpret the streams of data that it receives.
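The delimiter argument can be made concrete with a minimal sketch in Python. It assumes a simplified, hypothetical inbox listing modeled loosely on the FETCH responses described above; real protocol responses are more involved, and the names SAMPLE_STREAM, split_communications, and matching_entries are invented for illustration. Nothing here describes the NSA’s actual equipment. The sketch only shows that the same byte-level scanning that finds a target identifier can also find the markers separating one entry from the next.

```python
import re
from typing import List, Tuple

# Simplified, hypothetical inbox listing. Each entry starts with "* <n> FETCH"
# and ends with CRLF; real mail protocol responses are more complex.
SAMPLE_STREAM = (
    b'* 1 FETCH (FROM "alice@example.com" SUBJECT "Lunch on Friday?" DATE "01-Aug-2013")\r\n'
    b'* 2 FETCH (FROM "news@example.org" SUBJECT "Zawahiri issues new statement" DATE "02-Aug-2013")\r\n'
    b'* 3 FETCH (FROM "bob@example.com" SUBJECT "Quarterly report" DATE "03-Aug-2013")\r\n'
    b'A1 OK FETCH completed\r\n'
)

# One inbox entry: the FETCH delimiter, then the parenthesized entry data.
ENTRY = re.compile(rb'^\* (\d+) FETCH \((.*)\)$')


def split_communications(stream: bytes) -> List[Tuple[int, bytes]]:
    """Split a raw stream into (sequence number, entry data) pairs."""
    entries = []
    for line in stream.split(b"\r\n"):  # CRLF is the end-of-entry delimiter
        match = ENTRY.match(line)
        if match:
            entries.append((int(match.group(1)), match.group(2)))
    return entries


def matching_entries(stream: bytes, selector: bytes) -> List[int]:
    """Return the sequence numbers of only those entries containing the selector."""
    return [num for num, data in split_communications(stream) if selector in data]


print(matching_entries(SAMPLE_STREAM, b"Zawahiri"))
```

Under these assumptions, only the second entry would be selected; the other two entries in the same listing are identified as separate communications and could be left uncollected.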
The NSA’s definition of “transaction” indicates that transactions contain packet streams that, if reassembled, could be intelligible to end user software. Thus the transactions that the NSA collects surely contain the same delimiters that email programs use to interpret the data they receive. The NSA’s argument about the infeasibility of distinguishing one communication from another therefore boils down to this: its equipment is capable of locating the word “Zawahiri” within a transaction, but not the word “FETCH.” As a purely technical matter, this seems implausible. If the equipment can first look for “Z,” then “a,” then “w,” and so on, surely it can do the same for “F,” “E,” “T,” and so on. There is an entire market for deep packet inspection and network analytics products
In theory it is possible that the equipment that the NSA is using to conduct its upstream collection is so poorly designed that it can recognize the names of surveillance targets but not the standard delimiters in use by email protocols or other Internet protocols. If this is the case, the NSA is essentially claiming that its collection procedures are reasonable – in the sense of FISA and the Fourth Amendment – because the agency purchased sub-par equipment. For an agency that spends $2.5 billion annually on data collection (according to The Washington Post), this amounts to no less than laziness masquerading as reasonableness. If the problem lies with poor equipment, the agency clearly has the means to upgrade, but it has declined to do so.
It is possible that MCTs are largely composed of some sort of niche protocol or data format that causes the NSA’s equipment to be unable to distinguish one communication from another – emails written using unusual characters or languages, or protocols that are not in common use. Nonetheless, the NSA collects millions of MCTs per year – plenty to justify resolving the equipment’s parsing problems if they exist. Alternatively, it may be the case that some services change the way they use delimiters so often that the NSA has decided not to continually upgrade its equipment to be able to correctly parse the communications sent using these services. Again, even if such services are not in wide use, the number of MCTs collected clearly justifies upgrades to keep pace with changes in communications services.
Even if the NSA’s claims that it cannot distinguish one communication from another at the point of acquisition are plausible – because the processing required to do so would take too long to be able to decide in real time whether a particular communication should be acquired or not – why does the NSA retain all of the extra communications within an MCT that are neither to, from, nor about a surveillance target? Surely if it has the equipment capable of finding identifiers like “Zawahiri” (and perhaps millions of other identifiers) in all of the traffic it intercepts in real time, it can afford the far less powerful equipment that would be necessary to sift through all transactions after they are acquired and stored, and delete the communications that were only incidentally collected in an MCT and contain no foreign surveillance information. But the NSA does not do that. It claims that distinguishing between communications is infeasible, again cloaking its own laziness in claims of reasonableness.
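The post-acquisition sifting described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any real system: it assumes a stored MCT has already been split into discrete communications, and STORED_MCT, purge_incidental, and the plain substring matching are all hypothetical stand-ins for whatever selector matching is actually in use.

```python
from typing import Iterable, List

# Hypothetical stored MCT: an inbox listing already split into discrete entries.
STORED_MCT = [
    "From: alice@example.com | Subject: Lunch on Friday?",
    "From: news@example.org | Subject: Zawahiri issues new statement",
    "From: bob@example.com | Subject: Quarterly report",
]


def purge_incidental(communications: Iterable[str], selectors: Iterable[str]) -> List[str]:
    """Keep only the communications that mention at least one selector.

    Case-insensitive substring matching is an assumption made for this sketch.
    """
    lowered = [s.lower() for s in selectors]
    return [c for c in communications if any(s in c.lower() for s in lowered)]


# Offline pass over stored data: no real-time constraint applies here.
retained = purge_incidental(STORED_MCT, ["Zawahiri"])
print(retained)
```

Because this pass runs over stored data rather than live traffic, it carries none of the real-time pressure that might excuse imprecision at the point of acquisition.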
The FISC opinion is laudable for challenging the NSA over upstream collection. However, read from a technical perspective, it leaves the impression of a sophisticated NSA pulling the wool over the much less tech-savvy eyes of the Court. The FISC has publicly conceded
The fact that the NSA can pick out surveillance target identifiers from communications at the point of acquisition but cannot accomplish the much simpler task of finding the beginning or the end of an email suggests that something is amiss in the NSA’s logic. That the agency scoops up millions of extraneous communications and then does nothing to identify and purge them indicates not that the NSA is incapable of conducting its activities reasonably, but that it is not even trying. Because the FISC is one of the only available checks against government surveillance overreach, it is imperative that the Court be equipped with the knowledge necessary to identify and challenge these kinds of claims. Should FISC procedures change to provide for a public advocate to argue opposite the government on legal matters before the Court, that advocate should be given the resources to bring to the FISC statements, reports, and testimony from independent technical experts to help the Court dissect and debunk the government’s technical claims. Moreover, when the government tells the FISC that a sledgehammer it has developed is acquiring so many communications each year that should not be acquired, a technical expert should be available to describe any scalpels the government could use instead.