Age Estimation Requires Verification for Many Users
Online services can now guess a user’s age with as little as a selfie or a phone number, according to leading age assurance providers. Age estimation providers in particular claim their processes are easy and privacy-protective, because they don’t seek to tie anyone’s identity to their estimated age (or use only data the service already holds to make their guess). As a result, policymakers increasingly see age estimation as a potential panacea: a way for providers to determine a user’s age while preserving their privacy, anonymity, and safety.
Even without legal requirements, companies are increasingly implementing methods to determine users’ ages. For example, just last month, Google committed to using machine learning methods to guess age, including analyzing users’ online activity such as sites browsed and YouTube videos watched. Apple is equipping parents and caregivers with tools to add their children’s ages when setting up an iCloud account, and to age-gate access to apps at the App Store level.
Age estimation methods are little more than educated guesses, which means they have an error rate. But age assurance providers rarely mention what happens when an error arises. Despite promises of convenience and privacy, age estimation could easily become a gateway to secondary verification.
First, what is the difference between age verification, estimation, and assurance?
Age verification, estimation, and assurance are terms that are often used interchangeably but mean different things. “Age assurance” is the umbrella term for the range of methods an online service provider may employ to determine a user’s age. These methods may include collecting government IDs, accessing mobile phone records, checking credit agencies and private databases, performing biometric analysis, and using algorithmic profiling, all of which require substantial data retention or further collection. Age assurance methods fall into the following discrete buckets:
- Age declaration refers to methods users encounter such as being asked to supply a website with their date of birth or age or to check a box that affirms that they are above 18 (an “age gate”).
- Age verification connotes more rigorous methods to determine a user’s age, such as supplying a website with a scan of a driver’s license or passport, or using third-party databases of digital driver’s licenses or other government IDs to verify a user’s age. It can also involve reference checks with credit agencies or utility companies.
- Age estimation methods use machine learning technology to infer or estimate a user’s age based on their characteristics or features (for example, face scanning or voice analysis) or behavior (for example, likes or videos watched). Estimation can also include social vouching, where users are asked to vouch for another user’s age.
When does age estimation become verification?
Some platforms opt for seemingly convenient estimation methods such as analyzing existing user data, as Google says it will do, or conducting facial age estimation. For example, Instagram partnered with Yoti to ask users to submit a video selfie. When they work, these systems are a more convenient and frictionless way to generally assess whether someone is old enough to use a service. But when these systems fall short, as they often do, users are prompted to share additional and often sensitive information so the platform can confirm their age. Instagram, for example, requires users to submit a government ID. Roblox asks for a driver’s license, passport, residency card, or any other government-issued identification document with a photo to verify users’ age.
Predictive age estimation techniques may fall short particularly when it comes to teenagers. Age estimation systems often provide an age range to online services rather than a specific age. For adults well into their twenties or beyond, these ranges pose little problem as whether someone is 28 or 32 makes little difference. However, an age estimation system may have trouble distinguishing different ages for teenagers and return an age range such as 15 to 19. Offering this age range may be meaningless for online services trying to enforce legal requirements to grant age-appropriate access based on whether the user has reached the age of 16 or 18. To verify a user’s age more granularly, a teenager may be asked to provide government identification they simply don’t have. Department of Transportation data shows fewer American teenagers are obtaining driver’s licenses at 16 than in previous generations. Other forms of government ID—birth certificates, social security cards, or passports—are typically held by parents, not by the teens themselves. This could lead to a situation where even legally adult 18-year-olds living independently at college might still need parental involvement to download an app or access an online service, undermining their autonomy right when they’re supposed to be developing it.
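The threshold problem described above can be sketched in a few lines of code. This is a hypothetical illustration, not any provider’s actual logic; the function name, the range representation, and the decision labels are all assumptions made for the sake of the example.

```python
def can_grant_access(est_low: int, est_high: int, required_age: int) -> str:
    """Decide access from an estimated age range [est_low, est_high].

    Hypothetical sketch: real age estimation APIs differ, but the logic
    of comparing a coarse range against a legal threshold is the same.
    """
    if est_low >= required_age:
        return "grant"      # the whole range clears the threshold
    if est_high < required_age:
        return "deny"       # the whole range falls below the threshold
    return "escalate"       # the range straddles the threshold, so the
                            # service falls back to document-based checks

# An adult well into their twenties: even a wide range is conclusive.
can_grant_access(28, 32, 18)   # → "grant"

# A teenager whose estimated range is 15 to 19: the range straddles
# both common legal thresholds, so estimation alone cannot decide.
can_grant_access(15, 19, 16)   # → "escalate"
can_grant_access(15, 19, 18)   # → "escalate"
```

In this sketch, every "escalate" result is a user pushed from frictionless estimation into ID-based verification, which is exactly the scenario a teenager without a driver’s license cannot resolve on their own.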
What’s more, certain groups of adults are more likely to bear the brunt when these supposedly “convenient” estimation methods fail. Nonbinary and trans people are likely to be misclassified by facial age estimation technologies and often do not have access to IDs reflecting their gender and name. People with disabilities that affect their physical appearance often face misclassification, as facial estimation technologies struggle with variations outside their training parameters, and they may be limited from attaining IDs like driver’s licenses as well. People of color are routinely misidentified by facial recognition and estimation technology—something that Yoti’s white paper acknowledges in reporting higher error rates for people with darker skin tones—and consequently may distrust facial scanning systems and prompts to upload more invasive documentation. Finally, people from different socioeconomic contexts, particularly low-income people, and some immigrant communities may lack the documentation and IDs even if they want to supply them—in fact, millions of Americans lack government ID. This creates a troubling pattern: those who don’t fit algorithmic “norms” must surrender more personal data to access the same services or eschew the use of online services that help people access information, seek employment opportunities, and speak freely altogether.
Taken together, these affected groups—people with disabilities, people of color, gender-diverse individuals, and more—represent a significant portion of users, both in the United States and globally. If any of these users’ ages is not correctly detected upon facial scan or is misclassified by a machine learning model, they may be asked to provide their social security number or their government ID. Should an adult be misclassified as a child simply because they watch a lot of roleplay game reviews on YouTube, they will have to choose between appealing the decision by uploading their ID and forgoing access to the service entirely, assuming the service even tells them what age it classified them as in the first place. A recent study conducted in the aftermath of Louisiana enacting an age verification law found that users adapted by moving to services that did not require age verification, at minimum demonstrating their interest in avoiding handing over ID to online services. These decisions put many users in a difficult position, forced to choose between their privacy and their ability to access the benefits of online services.
Companies have an obligation to protect the human rights of all of their users, adults and children alike. Before rolling out new age assurance methods to access vital online services or applications, companies should carefully assess the potential impacts these methods will have on privacy and access to information. They should also advance greater research and development, particularly in internet standards spaces, to ensure that advancements in age assurance serve all users, not just a few.