“Delete” is a word that has been part of users’ vocabulary since the beginning of personal computing. When commanded to “del,” an operating system appeared to erase a file completely. However, right from the start, a user’s commonsense understanding of the command to “delete” differed from companies’ practices; rather than erasing a file, “delete” meant “put in the recycle bin.” “Deleted” files were not really gone but rather out of sight, available to be recovered if necessary. The rise of cloud computing, where files live on remote servers rather than on their owners’ devices and are frequently accessible from multiple devices, has further muddied the concept of deletion by keeping all of a user’s files in an ambiguous location until they are called upon. While this change may seem trivial, it represents a larger truth about our digital files: they are almost never really “deleted.”
In this paper, we argue that companies should reconsider their concept of deletion and implement sound technical and policy processes to formalize their practices. We believe there comes a point when the value of data has been extracted and the costs of retaining it (both operational costs and potential liability) outweigh the potential benefits of keeping it. While the price of physical storage may be plummeting, data management costs continue to grow. Data breaches are ubiquitous and massive, and show no sign of abating. Retaining large amounts of data greatly increases the potential harms that could result from a data breach; the more data a database holds, the more appealing it is to malicious actors. Headline-grabbing breaches of major retailers, financial institutions, healthcare providers, and even government agencies have damaged companies’ reputations, exposed individuals to identity theft and embarrassment, and undermined trust in both institutional and organizational security efforts.