Blockchain and data protection (GDPR)
Protecting personal data when using blockchain technology is tricky, cumbersome, and may be full of surprises. According to the European Data Protection Board (EDPB) opinion that inspired this assessment and advice, the inherent characteristics of blockchain, such as immutability and decentralisation, pose serious compliance issues under the General Data Protection Regulation (GDPR). These features make it difficult to tick the data protection boxes: storage limitation, accuracy, and the right to erasure.
Technically speaking, a blockchain is a type of distributed ledger technology that maintains a synchronised and tamper-evident record of transactions across multiple nodes. It’s a database of linked records: transactions are grouped into blocks, and each block is cryptographically linked to the previous one. Consensus algorithms ensure that all participants agree on the validity of new blocks.
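To make the "cryptographically linked" part concrete, here is a minimal Python sketch of a hash chain. It is an illustration under my own assumptions, not how any particular blockchain works: the names block_hash, append_block and is_valid are hypothetical, SHA-256 over sorted JSON is one arbitrary choice, and there is no consensus, networking, or signing involved.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically (keys sorted before hashing)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list) -> dict:
    """Group transactions into a block that stores the previous block's hash."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "transactions": transactions,
        "prev_hash": prev_hash,
    }
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True
```

Changing a transaction in an earlier block changes that block's hash, so the next block's prev_hash no longer matches and is_valid returns False. That is the tamper-evidence in a nutshell, and it is also why quietly editing or erasing recorded data is so hard.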
Blockchain is not a magic wand made of buzzwords. GDPR still applies in full. Storing personal data directly on-chain is a really bad idea. The EDPB is unambiguous on this point: data that is not strictly necessary should never be written to a blockchain, and directly identifiable data should be avoided entirely. Once recorded, on-chain data is typically immutable and distributed, which makes modification or deletion technically complex or outright impossible. That is unfortunate, as it may breach the GDPR. Even if changes are theoretically possible, they would require consensus across all nodes, an unrealistic scenario in practice. Not to mention that the historical data could still be available, because why not. But there’s more.
The trickiest part is privacy by design.
Personal data must not be accessible to an indefinite number of recipients. This requirement stems directly from data protection by design and by default. Public blockchains, by their nature, expose data to an unrestricted audience.
Let’s also not forget about smart contracts, the structures that automate activities when certain conditions are met. These must be assessed for whether they amount to automated decision-making (Art. 22 GDPR), that is, whether they are used to make decisions with legal or similarly significant effects.
Although blockchain can theoretically offer high integrity and availability via cryptographic tools and decentralised storage, these guarantees are not standardised in practice. Moreover, the technology’s irreversibility and lack of rollback mechanisms introduce significant legal and ethical risks, particularly in personal data contexts. Once implemented, blockchain systems leave little room to undo past decisions or adopt them incrementally, which increases the stakes for getting the design and governance right from the start.
Used improperly, blockchain technologies may infringe key GDPR principles such as data minimisation, accuracy, and storage limitation, particularly because of their inherent immutability, persistence, and replication across distributed systems. By default, there’s no good reason to store private user data on a public chain forever. The principle of data minimisation applies with full force here: any processing should be limited to what is necessary, and permanent, public exposure of private data should be avoided unless explicitly justified. Any use of blockchain must come with a documented rationale for choosing it over other technologies. The EDPB explicitly calls on controllers to demonstrate the necessity and proportionality of using blockchain, especially when personal data is involved. This means documenting the alternatives considered and explaining why blockchain was selected despite its limitations. On this view, the use of blockchain is treated as somewhat suspicious in itself. That’s likely a side effect of years of buzzword-led PR campaigns and of attempts to deploy blockchains in some unfortunate use cases, and not only e-voting.
The EDPB also raises concerns about cryptographic strength and mentions "the possibility of encryption algorithms being broken". I consider that an extremely unlikely scenario, in fact a practically impossible one, so I’m not worried about algorithms being broken. Modern encryption schemes are strong, and cryptography remains one of the most stable and reliable aspects of these systems.
Closing suggestion: perform Data Protection Impact Assessments (DPIAs) before deployment. The use of blockchain may qualify as high-risk processing due to its complexity, scale, and potential for long-term impact on data subjects. DPIAs should be used not only to assess legality and proportionality but also to decide (prove!) whether blockchain is the most suitable technical solution in the first place. In other words, the DPIA here serves as a form of justification (!), both for the controller and towards the Data Protection Authority. That’s right: the use of blockchain should be justified.
Below is a checklist I extracted to support the compliance of blockchain design or deployment with EU data protection law (GDPR).
Checklist for building blockchain systems with data protection by design
Architecture & Documentation
- Have you identified whether the blockchain will process personal data?
- Have you documented why blockchain is a necessary and proportionate solution?
- Have you considered alternatives to blockchain (e.g. a database)?
- Have you justified the choice of blockchain type?
- Have you described technical and organisational safeguards?
Data Location
- Is all personal data stored off-chain, except for what unavoidably has to be on-chain?
- Have you documented what remains on-chain?
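One common way to approach the Data Location questions is to keep the personal data in an ordinary, access-controlled store and write only a salted commitment to the chain. The Python sketch below illustrates the idea under assumptions: off_chain_store and on_chain_log are hypothetical in-memory stand-ins for a real database and a real ledger, commit and verify are names made up for this example, and HMAC-SHA256 keyed with a random per-record salt is just one possible construction. Whether such a commitment still counts as personal data while the off-chain record exists is a legal question this sketch does not settle.

```python
import hashlib
import hmac
import secrets

off_chain_store = {}  # hypothetical stand-in for an access-controlled database
on_chain_log = []     # hypothetical stand-in for the append-only ledger


def commit(personal_data: bytes) -> tuple:
    """Keep the data off-chain; append only a salted commitment to the ledger."""
    record_id = secrets.token_hex(16)  # random, not derived from the data
    salt = secrets.token_bytes(32)     # per-record secret, kept off-chain
    commitment = hmac.new(salt, personal_data, hashlib.sha256).hexdigest()
    off_chain_store[record_id] = {"data": personal_data, "salt": salt}
    on_chain_log.append({"record_id": record_id, "commitment": commitment})
    return record_id, commitment


def verify(record_id: str, commitment: str) -> bool:
    """Recompute the commitment from the off-chain record and compare."""
    entry = off_chain_store.get(record_id)
    if entry is None:
        return False  # nothing off-chain to check against
    expected = hmac.new(entry["salt"], entry["data"], hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, commitment)
```

The random salt matters: without it, anyone could hash guessable values (an e-mail address, a national ID number) and compare them against the public ledger.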
Transparency & Information
- Are data subjects informed before data is added to the blockchain?
- Is the information easily accessible at any time?
- Do data subjects understand their rights and how to exercise them?
Data Protection by Design, Default & Minimisation
- Is the amount, visibility, and duration of data strictly limited to what is necessary?
- Have you minimised on-chain payloads and metadata exposure?
- Are all GDPR principles (minimisation, accuracy, limitation) supported from the start?
- Is public accessibility of personal data prevented by default?
- Are trust mechanisms in place?
- Are software and infrastructure components vetted for security and reliability?
Legal Basis
- How much did you pay for the legal advice and the paperwork? Was this the only thing that you actually did? Is this checklist being filled in by an LLM like ChatGPT?
Vulnerability & Incident Management
- Is there a plan for disclosing and responding to software vulnerabilities?
- Are breach notification and response procedures established?
Rights & Consent
- Is there a technical mechanism to remove off-chain data if consent is withdrawn?
- Do you avoid relying on consent where erasure or modification is not possible?
- Can all rights (access, rectification, erasure, objection) be fulfilled?
- Is there a mechanism to render data anonymous if full erasure isn’t possible?
Data Retention
- Is the retention period defined and proportionate to purpose?
- Can the data be deleted or effectively anonymised when no longer needed? If not, is no personal data stored on-chain?
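For the off-chain part, retention can be enforced mechanically. The Python sketch below is only an illustration under assumptions: the store is a plain dict standing in for a real database, store_record and purge_expired are hypothetical names, and the retention period is whatever you have documented as proportionate to the purpose. It does nothing for data already written on-chain, which is exactly why the second question above matters.

```python
from datetime import datetime, timedelta, timezone


def store_record(store: dict, record_id: str, data: bytes,
                 retention: timedelta, now: datetime) -> None:
    """Save an off-chain record together with an explicit expiry time."""
    store[record_id] = {"data": data, "expires_at": now + retention}


def purge_expired(store: dict, now: datetime) -> list:
    """Delete every record whose retention period has ended; return their ids."""
    expired = [rid for rid, rec in store.items() if rec["expires_at"] <= now]
    for rid in expired:
        del store[rid]
    return expired


# Usage: one record kept for 30 days, another for a year.
start = datetime(2025, 1, 1, tzinfo=timezone.utc)
records = {}
store_record(records, "r1", b"short-lived", timedelta(days=30), start)
store_record(records, "r2", b"long-lived", timedelta(days=365), start)
removed = purge_expired(records, start + timedelta(days=31))
```

In a real deployment the purge would run on a schedule and would also have to cover backups and replicas, which this sketch ignores.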
Security & Cryptographic Resilience
- Have security risks been assessed and mitigated for both on-chain and off-chain components?