Privacy of IPv6 address configuration - what can the tale of a 20-year-old privacy bug teach us?
IPv6 is a big enough topic that no short piece can account for all the issues. But it’s important to realize that IP version 6 (which was, and still is, supposed to replace IPv4 at some point) is at its core a technology from the ‘90s (1995). Like many internet technologies, for example those designed in the ‘80s, IPv6 was not designed with privacy in mind. Paradoxically, the slow pace of deployment (we’re still mostly on IPv4 in 2020) was an opportunity to identify security and privacy issues and to address them in the 2000s and 2010s.
IPv6 address configuration problem
Autoconfiguration of the IP interface is probably the best example of such privacy flaws. To put it simply, IPv6 makes it possible to auto-configure the interface - to “auto-assign” an IP address. For reasons that are unclear, in the '90s a design choice was made to include the full device MAC address in such an IP (v6) address (I’m keeping this description simple for clarity!). But MAC addresses are statically assigned to devices and generally do not change. Including this identifier in the IP address means that devices (and their users) leave a nearly unique trail in every network they use. This trail links the actions of users across networks. Perhaps this was not an issue in 1995, but it certainly is a problem with today's use patterns (wifi, roaming, etc.). We can easily imagine users leaving trails at home, while commuting, when using the wifi in cafes, and so on. This is not hypothetical, it is very realistic. Fortunately, the slow rollout of IPv6 was an opportunity to identify and fix this issue.
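To make the tracking risk concrete, here is a minimal Python sketch of how a MAC address ends up inside an autoconfigured address. It assumes the "modified EUI-64" construction (the MAC is split in half, `ff:fe` is inserted in the middle and one bit of the first byte is flipped). The function names, the MAC address and the two network prefixes (taken from the 2001:db8::/32 documentation range) are made up for illustration.

```python
import ipaddress


def eui64_interface_id(mac: str) -> bytes:
    """Build the 64-bit interface identifier from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address like 00:11:22:33:44:55")
    # Flip the universal/local bit of the first byte, then insert ff:fe
    # between the two halves of the MAC.
    first = octets[0] ^ 0x02
    return bytes([first]) + octets[1:3] + b"\xff\xfe" + octets[3:]


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 network prefix with the MAC-derived identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    iid = int.from_bytes(eui64_interface_id(mac), "big")
    return ipaddress.IPv6Address(int(network.network_address) | iid)


if __name__ == "__main__":
    mac = "00:11:22:33:44:55"  # hypothetical device
    home = slaac_address("2001:db8:1::/64", mac)  # hypothetical home network
    cafe = slaac_address("2001:db8:2::/64", mac)  # hypothetical cafe wifi
    print(home)
    print(cafe)
    # The lower 64 bits are identical in both networks: that is the trail.
    print(int(home) & (2**64 - 1) == int(cafe) & (2**64 - 1))
```

The network part of the address changes from place to place, but the device part does not, so anyone who sees both addresses can link them to the same device.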
Civilising IPv6 configuration
The problem was identified quite early, and there are actually many proposed fixes. Some include early suggestions from the '00s to “pseudo-randomize” this “static” part of the IP (v6) address (RFC 3041, RFC 4941). Others advocated generating IPs with quite heavy cryptographic methods (RFC 3972). Others still favor cryptographically secure hash functions that take the device MAC and a bunch of other inputs (like SLAAC using a per-device 128-bit secret key in the hash function computation).
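The two lighter-weight ideas can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact algorithm of any RFC: the first function draws a fresh random identifier (in the spirit of the “pseudo-randomize” proposals), the second derives a stable-per-network identifier from a keyed hash. The choice of HMAC-SHA256, the input encoding, the function names and the example prefixes are all assumptions made here for illustration.

```python
import hashlib
import hmac
import ipaddress
import secrets


def temporary_interface_id() -> int:
    """Fresh random 64-bit identifier, unrelated to the MAC address."""
    return secrets.randbits(64)


def stable_interface_id(prefix: str, iface: str, secret_key: bytes,
                        dad_counter: int = 0) -> int:
    """Keyed-hash identifier: stable within one network, different across networks.

    Assumption: HMAC-SHA256 stands in for the hash function; `iface` could be
    the device MAC or an interface name.
    """
    network = ipaddress.IPv6Network(prefix)
    fields = (
        network.network_address.packed,
        iface.encode(),
        dad_counter.to_bytes(1, "big"),
    )
    # Length-prefix each field so different inputs cannot collide by accident.
    message = b"".join(len(f).to_bytes(2, "big") + f for f in fields)
    digest = hmac.new(secret_key, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def make_address(prefix: str, interface_id: int) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a 64-bit interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    return ipaddress.IPv6Address(int(network.network_address) | interface_id)


if __name__ == "__main__":
    key = secrets.token_bytes(16)  # per-device 128-bit secret, kept on the device
    for prefix in ("2001:db8:1::/64", "2001:db8:2::/64"):  # hypothetical networks
        stable = stable_interface_id(prefix, "00:11:22:33:44:55", key)
        print(make_address(prefix, stable))
        print(make_address(prefix, temporary_interface_id()))
```

With the keyed hash, the device keeps the same address each time it returns to a given network, but an observer without the secret key cannot link its addresses across different networks. The random variant gives up that stability in exchange for addresses that also change over time.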
Summary
It was never necessary to include a static identifier in publicly exposed IP addresses in the first place. Yet this was the chosen approach. What is perhaps more puzzling is that it took around 20 years to identify and propose an effective fix. And even today we do not know which strategy is followed in practical deployments (including on Android or iOS).
Fortunately, IPv6 did not “ship” in 2000, and the specification and technologies could benefit from the extra time to find and address concerns. But developing technologies on such huge timescales is generally not sustainable, so this should not be taken as a general recommendation. Moving faster is much better, and doing so carefully still allows problems to be identified and addressed.
Did you like the assessment and analysis? Any questions, comments, complaints or maybe even offers? Feel free to reach out: me@lukaszolejnik.com