The pitfalls & potential of pseudonymized data strategies within organizations
The most valuable human-made asset of today is likely data. For proof, look to the complex regulations and standards expanding into the market, or the hundreds of millions of dollars in ransomware payments made every year. In response, data gathering strategies have multiplied, with data pseudonymization arising to help entities maintain quality datasets without compromising individual privacy.
Too often, however, pseudonymization is deployed poorly, leaving data subjects exposed to breaches of confidentiality.
So, how can organizations ensure high-quality, effective pseudonymization? What are the implications and benefits of opting for this strategy? What is the role of pseudonymization in today’s data-based economy? Why is it so essential to overall progress, both in the private and public sector?
An overview of data security strategies
Data minimization, data masking, anonymization, pseudonymization… Each brings unique advantages and disadvantages. In data minimization, for example, entities reduce risk and simplify data management by retaining only essential data for a minimal amount of time. Irrelevant data is discarded immediately.
For cases in which data collection is unavoidable, two main techniques rise to the top: anonymization and pseudonymization. Their difference lies in how they manage personal information—names, ID numbers, physical features, phone numbers, location and more.
Anonymization involves permanently deleting personal information from data, making reidentification forever impossible. When personal information is not deleted but instead replaced by meaningless information (pseudonyms), we generally refer to this data as pseudonymized.
GDPR defines pseudonymization as “the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person…”
The role of data pseudonymization
Anonymization’s appeal lies in its limited regulatory burden. With no personal data to protect, organizations are released from GDPR responsibilities. In return, however, they sacrifice potentially valuable datasets, along with the ability to reidentify data subjects.
Take the real-life example of a Parkinson’s disease study. Its data, originating in the Netherlands, was made usable by researchers around the world through a polymorphic approach in which each team had its own local pseudonyms. When teams accessed shared information via the Personalized Parkinson’s Project, it would appear under their own pseudonyms.
This scenario presents common challenges:
- Reliance on data sharing: The best results require global collaboration, complicating security
- Reliance on datasets with identifying characteristics: Such as DNA and highly personalized wearable device data
Pseudonymization makes it possible to use the necessary datapoints while preserving the ability to reidentify data subjects (i.e. linkability): If improvements are made to diagnosis or treatment during the study, or if the study results in actionable information, all of the data subjects can be reidentified and informed.
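To make the idea of local pseudonyms and later reidentification more concrete, here is a minimal sketch in Python. It is not the scheme used by the study, whose polymorphic cryptography is not described here. As a simplified stand-in, it derives a different pseudonym per team with HMAC-SHA256. The team names, keys, subject identifiers and function names are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical per-team keys, assumed to be held by a trusted custodian
# and never handed to the research teams themselves.
TEAM_KEYS = {
    "team_a": b"example-key-for-team-a",
    "team_b": b"example-key-for-team-b",
}


def local_pseudonym(subject_id: str, team_key: bytes) -> str:
    """Derive a team-specific pseudonym from a subject identifier."""
    digest = hmac.new(team_key, subject_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened for readability only


def build_reidentification_index(subject_ids, team: str) -> dict:
    """Custodian-side lookup from a team's pseudonyms back to subjects."""
    key = TEAM_KEYS[team]
    return {local_pseudonym(s, key): s for s in subject_ids}


subjects = ["NL-0001", "NL-0002"]  # made-up identifiers

# The same subject appears under a different pseudonym for each team.
pseudonym_a = local_pseudonym("NL-0001", TEAM_KEYS["team_a"])
pseudonym_b = local_pseudonym("NL-0001", TEAM_KEYS["team_b"])

# Only the custodian, who holds the keys, can map a pseudonym back.
index_a = build_reidentification_index(subjects, "team_a")
reidentified = index_a[pseudonym_a]
```

In this sketch, two teams cannot link their records by comparing pseudonyms, while the custodian can still reidentify a subject when there is a legitimate reason to inform them.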
Under GDPR, data subjects have a “right of access.” In this case, they cannot view the data online, but can visit and view it via an onsite repository. Should they choose to withdraw their personal data, it will stay stored for the sake of study repeatability but will be excluded from future studies.
The advantages of pseudonymization extend beyond medical sciences and healthcare:
- Smart cities: Analyze urban planning data without risking citizen privacy
- Financial services: Protect customers’ sensitive information during transactions and investigations
- Retail: Analyze customer behavior without compromising identities
- Human resources: Analyze overall workforce performance without exposing employees
- Education: Monitor learning outcomes or demographics without disclosing students
- Telecoms: Study & optimize network performance without revealing individual users
- Government: Review programs, policies & economic indicators while protecting citizens’ rights to privacy
The chosen approach depends entirely on context. In the case of pseudonymization, a best practice for one organization might prove highly irresponsible and risky for another. Keys&More’s pseudonymization service line arose from the realization that these decisions require support from cryptographic experts, along with the technological capability to provide customized solutions.
Cryptographic algorithms: building blocks of pseudonymization
Cryptography plays two main roles in pseudonymization:
1. Furnishing the algorithms used as the basis for pseudonymization, regardless of the approach: random number generator, counter, keyed-hash, encryption, etc.
2. Ensuring confidentiality of the pseudonymization table: Secures information to prevent reidentification, protect data subject privacy and, thereby, meet GDPR requirements
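Two of the approaches named above, counter and keyed-hash, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a production design: the class name, function name, pseudonym format and demo key are all hypothetical, and HMAC-SHA256 is only one possible keyed-hash construction.

```python
import hashlib
import hmac
import itertools


class CounterPseudonymizer:
    """Assigns sequential pseudonyms and records them in a table."""

    def __init__(self):
        self._counter = itertools.count(1)
        # The pseudonymization table: identifier -> pseudonym.
        # This table is what must be kept confidential.
        self.table = {}

    def pseudonymize(self, identifier: str) -> str:
        if identifier not in self.table:
            self.table[identifier] = f"P{next(self._counter):06d}"
        return self.table[identifier]


def keyed_hash_pseudonym(identifier: str, secret_key: bytes) -> str:
    """Keyed-hash pseudonym: no table needed, but the key must stay secret."""
    return hmac.new(
        secret_key, identifier.encode("utf-8"), hashlib.sha256
    ).hexdigest()


counter = CounterPseudonymizer()
first = counter.pseudonymize("alice@example.com")
second = counter.pseudonymize("bob@example.com")
repeat = counter.pseudonymize("alice@example.com")  # same as `first`

hashed = keyed_hash_pseudonym("alice@example.com", b"demo-key-not-for-production")
```

The trade-off shown here is where the secret lives: the counter approach concentrates risk in the table, while the keyed-hash approach concentrates it in the key.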
If the building blocks (i.e. algorithms) are low quality, data subjects can be identified, which can prove catastrophic for an organization, especially one acting in service to the public.
Subpar random number generators can result in numbers that are not in fact random and, when enough datapoints are gathered, linkable to data subjects.
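The difference between a general-purpose generator and a cryptographically secure one can be illustrated with Python's standard library. The sketch below assumes pseudonyms are plain hexadecimal tokens. The function names and the seed value are hypothetical, and the weak variant is shown only as a counterexample.

```python
import random
import secrets


def weak_pseudonyms(seed: int, count: int) -> list[str]:
    """Counterexample: a seeded general-purpose generator.

    Anyone who learns or guesses the seed can regenerate the
    entire sequence of pseudonyms.
    """
    rng = random.Random(seed)
    return [f"{rng.getrandbits(64):016x}" for _ in range(count)]


def strong_pseudonym() -> str:
    """Pseudonym drawn from the operating system's secure random source."""
    return secrets.token_hex(16)  # 16 random bytes, 32 hex characters


# Two runs with the same seed produce identical "random" pseudonyms.
run_one = weak_pseudonyms(seed=2024, count=3)
run_two = weak_pseudonyms(seed=2024, count=3)
predictable = run_one == run_two

token = strong_pseudonym()
```

Python's documentation itself advises against using the `random` module for security purposes and points to `secrets` instead.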
Keys&More provides algorithms that meet Germany’s robust BSI standards, and has a proven track record as a trusted public and private sector partner specializing in cryptography: Key Management Systems (KMS), PKI and Identity Pseudonymization Management Systems.
Management system for identity pseudonymization
In the same way that encrypted assets are only as secure as their corresponding encryption keys, pseudonymized data relies on the secure management of reidentification information and strong encryption algorithms.
In order to meet compliance standards, (1) pseudonymized data must not be usable on its own to identify someone, (2) information for reidentifying data subjects must be stored separately, and (3) measures must be implemented to secure that reidentifying data.
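The first two requirements can be sketched as a simple split of each record into two stores. This is a minimal illustration, assuming that the direct identifiers are known field names. The field names, store variables and sample record are hypothetical, and in practice the reidentification store would sit in a separate, access-controlled and encrypted system (requirement 3), which this sketch does not implement.

```python
import secrets

# Hypothetical list of fields treated as direct identifiers.
DIRECT_IDENTIFIERS = ("name", "phone")


def split_record(record: dict) -> tuple[str, dict, dict]:
    """Split a record into a pseudonymized row and an identity row."""
    pseudonym = secrets.token_hex(8)
    identity_row = {k: record[k] for k in DIRECT_IDENTIFIERS if k in record}
    research_row = {
        k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS
    }
    research_row["pseudonym"] = pseudonym
    return pseudonym, research_row, identity_row


# Two separate stores; here they are plain in-memory objects for illustration.
research_store = []          # shared with analysts
reidentification_store = {}  # kept separately, access restricted

record = {"name": "Jane Doe", "phone": "+31 6 0000 0000", "diagnosis_year": 2019}
pseudonym, research_row, identity_row = split_record(record)
research_store.append(research_row)
reidentification_store[pseudonym] = identity_row
```

Note that removing direct identifiers is not always enough: the remaining fields may still act as indirect identifiers in combination, so the list of fields to separate depends on context.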
In terms of these measures, leading regulations and standards generally demand some form of demonstrable cybersecurity system, complete with security processes and policies, risk assessment and crisis response plans.
Management systems—whether key management (KMS), identity pseudonymization management or verifiable credential management—play a central role in meeting regulatory requirements by upholding organization-wide policies and consistency.
Partnering for data security driven by gold-standard cryptography
Keys&More partners with entities to provide end-to-end cryptographic services:
- Assess risk and need
- Identify approach, drawing on the full spectrum of data security innovations
- Develop customized solutions, incorporating business objectives and holistic context
- Support deployment
- Provide high-caliber advising services
As experienced encryption partners, we help organizations and institutions not only achieve compliance but also get the most out of their cybersecurity investments by streamlining their operations: removing redundancies, cutting costs, simplifying audits and boosting efficiency.
For many organizations, data management is an obligation on the way to their end goal, whether that be medical breakthroughs, technical innovation or product development. An innovation partner with unique expertise in cryptography ensures that they use cybersecurity improvements as stepping stones to a stronger core business.