Schools collect a lot of data about students. Of course, data can be a valuable tool for improving student outcomes. For instance, it can identify students who are at risk of dropping out, allowing teachers to intervene early. (If you’re curious about how else schools put data to work, check out the Data Quality Campaign’s Data-Rich Year.) However, that same information can pose a substantial risk to students and their families if it’s not managed well, and managing that information is no simple task. Take the seemingly straightforward topic of data deletion. Just drag a file to the trash and empty the trash, right? Unfortunately, deleting data is much more complicated than that, with a number of important policy, legal, and technical considerations.
You don’t have to look too hard to find instances where student data was exposed because information that should have been deleted was not. For example, in New Orleans a school district sold laptops containing the names, addresses, and Social Security numbers of 210 students, putting them at risk of identity theft or phishing attacks. And the danger doesn’t just apply to electronic records. In Tulsa, a woman discovered student records including both academic and personal information alongside other discarded school supplies.
One way educational institutions can minimize risk for students and families is to cut down on how much data they collect and store (an approach often called “data minimization”), and keep only what they really need. After all, if the data doesn’t exist, it can’t be misused or breached. And there are other benefits to minimizing data: large data sets are expensive to maintain, and as they grow more complex, updating and searching them gets harder too, making it more difficult to separate the signal from the noise.
Unfortunately, in the education context, minimizing data is a complicated task. Institutions have very good reasons to retain some types of data for a very long time. For example, students who want to pursue higher education need their high-school transcripts, whether they graduated one year ago or 50. Schools have to find a way to balance retaining and deleting data so they can provide students with the services they need while limiting the risk as much as possible.
Further complicating the difficult balance between data minimization and data retention is a convoluted legal landscape. Institutions have to comply with a thicket of both federal and state laws that don’t always fit together smoothly. At the federal level, the Family Educational Rights and Privacy Act (FERPA), the Children’s Online Privacy Protection Act (COPPA), and the Individuals with Disabilities Education Act (IDEA) all govern handling students’ data. There are state laws to consider as well, which makes the picture even more chaotic. California, for example, has at least nine laws that impact how institutions must manage student data, both education-specific and more general consumer and child privacy laws. As with the federal laws, it’s not always obvious how the laws interact with one another, or what to do if they seem to conflict.
Let’s suppose a school does manage to navigate these waters, and settles on a solid retention/deletion plan. Well, there’s still one more wrench to throw in the works: deleting data is more technically complex than it might seem. Deleting a file isn’t as simple as dragging it to the trash or recycle bin and then emptying the bin. That approach typically removes only the pointer to the file and leaves the underlying data on the disk, where it remains fairly susceptible to recovery; that is, simple forms of deletion can often be simply reversed. More care must be taken. There are other approaches (like overwriting data or destroying the storage media itself) that provide better security, but these are more expensive or time-consuming. In addition to determining when to delete what data, educational institutions also have to consider how, technically, to delete that data.
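To make the “overwriting” option a little more concrete, here is a minimal Python sketch of what overwriting a file before deleting it might look like. The function names (`overwrite_file`, `overwrite_and_delete`) and the single-pass default are illustrative assumptions, not a recommended standard. It is also worth knowing that on solid-state drives, journaling or copy-on-write file systems, and anything with backups or cloud sync, an in-place overwrite like this may not reach every copy of the data, so treat it as an illustration of the idea rather than a complete sanitization procedure.

```python
import os


def overwrite_file(path: str, passes: int = 1, chunk_size: int = 64 * 1024) -> None:
    """Overwrite a file's contents in place with random bytes.

    Illustrative only: the number of passes and the chunk size are
    assumptions, and some storage (SSDs, copy-on-write file systems,
    backups) may keep old copies that this never touches.
    """
    size = os.path.getsize(path)
    # "r+b" opens the existing file for reading and writing without truncating it.
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            # Push the new bytes out of Python's and the OS's buffers.
            f.flush()
            os.fsync(f.fileno())


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite the file's contents, then remove its directory entry."""
    overwrite_file(path, passes=passes)
    os.remove(path)
```

Calling `overwrite_and_delete("old_roster.csv")` (a hypothetical file name) would replace the file’s contents with random bytes and then remove it. Even this small example shows why the approach costs more than emptying the trash: every byte of the file has to be rewritten before it is removed.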
But have no fear! CDT is developing recommendations for practitioners, and the companies that work with them, about how to balance data retention and deletion in ways that maximize the value of data and technology while protecting students’ privacy. Stay tuned for more materials in early 2019. In the meantime, try to remember to forget.