Making third-party hosted scripts safer with Subresource Integrity

Websites routinely include third-party hosted resources - images, scripts, stylesheets and so on. It's now a standard practice. One thing to keep in mind is that if your website includes a JavaScript script from another site (example.org):

<script src="https://example.org/CoolLib.js"></script>

technically speaking, you’re relying on the security of the script provider’s server, example.org. If example.org started serving malicious content, your site including this content would be affected. In other words, if your site relies on scripts hosted by a third party, and this third party gets compromised, your site is in trouble.
This is a legitimate risk. We’re speaking about data loss, defacement, and so on. That’s why including files from reputable CDNs (such as Google - let's say it’s pretty difficult to hack Google) improves not only performance, but also security.

To counter this real threat, W3C came up with a neat thing called Subresource Integrity (SRI). Resources can be annotated with integrity information, and if the included content has been maliciously modified, modern browsers will simply refuse to load such a resource.

It’s a really simple tag extension:

<script src="https://example.org/resource.js"
integrity="sha256-ZZ1+PnEb/tZuGbksyz7RzvsyZ8YRLgsVIQBinUe8dyk=" crossorigin="anonymous">
</script>

The additional “integrity” attribute carries the integrity metadata. The first part of the attribute (“alg”) names the cryptographic hash function, and the second part (“val”), separated from the first by a hyphen, is the digest.
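
To make that structure concrete, here is a minimal Python sketch that splits the integrity value from the example above at the first hyphen and decodes the digest. The function name split_integrity is made up for this illustration; it is not part of any browser API.

```python
import base64

def split_integrity(metadata: str) -> tuple[str, bytes]:
    """Split a single 'alg-val' integrity token into algorithm name and raw digest."""
    # Standard Base64 never contains '-', so the first hyphen ends the algorithm name.
    alg, _, val = metadata.partition("-")
    return alg, base64.b64decode(val)

alg, digest = split_integrity("sha256-ZZ1+PnEb/tZuGbksyz7RzvsyZ8YRLgsVIQBinUe8dyk=")
print(alg, len(digest))  # sha256 32
```

The decoded value is 32 bytes long, which is what you would expect from a SHA-256 digest.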

Algorithm part

In modern browsers, you’ll find all the popular choices: SHA-256, SHA-384 and SHA-512. The W3C specification needs to be future-proof, so it stresses that web browsers should offer hash functions considered safe according to the latest knowledge, and update the list of functions when relevant.

Hash collisions are definitely in scope of SRI’s threat model - if a hash function turns out to be weak, it will have an impact on SRI’s security guarantees. That said, modern hash function design appears to be strong, with some knowledgeable people saying that modern constructs may never be broken.

Considering the lifecycle of a typical web project, I don’t think that being overly concerned about algorithm strength is justified in the case of SRI. So just pick your favourite algorithm.

Value part

The value part (“ZZ1+PnEb/tZuGbksyz7RzvsyZ8YRLgsVIQBinUe8dyk=” above) is a Base64-encoded digest of the included file (resource.js here). In my case, the file contents are:

{'value': 'secret'}

Computing values

Computation is easy:

~$ shasum -a 256 resource.js | awk '{ print $1 }' | xxd -r -p | base64
ZZ1+PnEb/tZuGbksyz7RzvsyZ8YRLgsVIQBinUe8dyk=

That’s it.
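
If you would rather do this from a script, the same computation can be sketched in Python using only the standard library. The helper name sri_value and the sample bytes are my own assumptions for illustration, not something from a particular tool.

```python
import base64
import hashlib

def sri_value(content: bytes, alg: str = "sha256") -> str:
    """Return an integrity attribute value ('alg-val') for the given bytes."""
    digest = hashlib.new(alg, content).digest()  # raw digest bytes, not hex
    return f"{alg}-{base64.b64encode(digest).decode('ascii')}"

# In-memory stand-in for a file; for a real file, pass
# open("resource.js", "rb").read() instead.
sample = b"console.log('hello');"
print(sri_value(sample))
print(sri_value(sample, "sha384"))
```

Keep in mind that the digest covers the exact bytes of the file, so even a trailing newline changes the result.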

How it works

Assume we’re including resource.js from a remote location (e.g. a CDN). Let’s suppose the contents of this file have changed for whatever reason (malicious or not). Unless the integrity attribute was updated to match, the cryptographic digest verification will simply fail and the script won’t be executed:

Failed to find a valid digest in the 'integrity' attribute for resource 'http://example.org/resource.js' with computed SHA-256 integrity 'ZZ1+PnEb/tZuGbksyz7RzvsyZ8YRLgsVIQBinUe8dyk='. The resource has been blocked.

And that’s it.
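
The check itself can be approximated in a few lines of Python. This is only a sketch of the idea, under my own simplifying assumptions: the function name integrity_matches is hypothetical, and real browsers follow additional rules (for example, preferring the strongest algorithm when several are listed) that are left out here.

```python
import base64
import hashlib

def integrity_matches(content: bytes, integrity: str) -> bool:
    """Return True if content matches at least one 'alg-val' token."""
    for token in integrity.split():
        alg, _, val = token.partition("-")
        if alg not in ("sha256", "sha384", "sha512"):
            continue  # unknown algorithms are skipped in this sketch
        actual = base64.b64encode(hashlib.new(alg, content).digest()).decode("ascii")
        if actual == val:
            return True
    return False

original = b"{'value': 'secret'}"
integrity = "sha256-" + base64.b64encode(hashlib.sha256(original).digest()).decode("ascii")

print(integrity_matches(original, integrity))                    # True
print(integrity_matches(original + b" // tampered", integrity))  # False
```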

SRI does not generally protect from MITM

What SRI generally doesn’t protect against is a man-in-the-middle (MITM) attack. Suppose that a user’s browser connects to a website. Any intermediary, such as a proxy (for example, at the user's hotel), can intercept and modify the page itself - meaning the script content may be modified and the integrity attribute(s) in any tags may simply be changed to match.

SRI does not protect against the kind of MITM attack that TLS protects from (oddly, people sometimes appear to claim otherwise); SRI is an integrity mechanism meant for a higher layer.

Abusing SRI+CORS?

You should realize that configuring your website to enable broad cross-origin resource sharing (CORS) while serving sensitive content creates a potential risk. In this case, Subresource Integrity might also simply be used to brute-force the contents of the served files. Suppose you serve customer.js, which is a JSON file:

{secret: 123}

The attacker might simply brute-force the contents by repeatedly injecting the script with a dynamically changed integrity attribute and observing whether it loads. In the simplest model, the attacker could compute the digest of {secret: 1} and continue towards 123, eventually learning the value of secret.
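
The guessing loop can be sketched offline in Python. Everything here is a simulation under my own assumptions: in a real attack the "oracle" would be the browser reporting whether the injected script loaded, while make_oracle below is just a hypothetical stand-in that compares integrity values directly.

```python
import base64
import hashlib

def sri_value(content: bytes) -> str:
    """SHA-256 integrity value ('sha256-<base64 digest>') for the given bytes."""
    digest = hashlib.sha256(content).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")

def make_oracle(served: bytes):
    """Stand-in for the browser: tells whether a guessed integrity value would load."""
    expected = sri_value(served)
    return lambda guess: guess == expected

def brute_force(loads, max_value: int = 1000):
    """Try {secret: 1}, {secret: 2}, ... until the oracle accepts a guess."""
    for n in range(1, max_value + 1):
        candidate = f"{{secret: {n}}}".encode()
        if loads(sri_value(candidate)):
            return n
    return None  # secret not in the searched range

oracle = make_oracle(b"{secret: 123}")
print(brute_force(oracle))  # 123
```

This only works because the search space is tiny and the attacker knows the exact format of the file; every byte of the guess, whitespace included, has to match.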

There are other ways of obtaining the contents too, of course. Just make sure your CORS configuration is sound anyway!

What you need to keep in mind is not to simply turn on broad CORS in response to this message from your browser during app development:

Subresource Integrity: The resource 'http://example.org/res.js' has an integrity attribute, but the resource requires the request to be CORS enabled to check the integrity, and it is not. The resource has been blocked because the integrity cannot be enforced.

Future

SRI is pretty much working, but there are possible extensions coming.

Signatures

At the moment, integrity is a simple output of a cryptographic hash function - it provides guarantees about the resource content (authenticity). Adding cryptographic signatures would bring a neat capability: easy discovery of the script’s source (provenance) - like, who its author is (“where does this come from”). This exact thing is being debated now.
This addition won't be standardized anytime soon, although Chrome is shipping it.

But aside from discussing the merits of this extension, I would just acknowledge that it could offer an interesting new tool for internet privacy/security measurements. In particular, it would give new ways of linking content to its source and answering the question of who maintains or produces it.

As mentioned in the debate surrounding the development, one of the issues would be that attackers could force the use of obsolete library versions (for example, ones with known security issues). Mitigating that particular point would require a slight increase in the complexity of the otherwise very simple SRI.

New tags

Currently, SRI only works for <link> and <script>. In future, you might expect an extension to other tags where integrity might be useful such as: a, audio, embed, iframe, img, link, object, script, source, track, video.

I especially like the idea of extending the <a> tag, which would be useful to prevent downloading malicious files presented as “legitimate” from (e.g.) compromised sites. As for some of the others, they would not only improve security but also make some variations of Rickrolling attacks more difficult.

Summary

SRI is an important web security tool. It’s not a panacea, but it does one job and it does it right. It’s also an interesting example of applied cryptography on the web in its highest layers.