February 04, 2011

So you think *your* capability model is bad?

In his recent post, Brad Spengler mocked the Linux capability system - a somewhat ill-conceived effort to add modern access controls on top of the traditional Unix permission model. Brad noted that most of the CAP_* boundaries are not particularly well aligned with the underlying OS, and not internally consistent - and therefore, much of the resulting granularity is almost completely meaningless: for example, there is no substantial benefit in giving an application just CAP_SYS_MODULE, CAP_MKNOD, CAP_SYS_PTRACE, or CAP_SYS_TTY_CONFIG privileges, as all of these are essentially equivalent to giving root access to the ACLed program.

I thought it would be interesting to engage in a similar thought experiment for the browser environment - after all, it is quickly becoming the equivalent of a complex and powerful operating system for modern web applications.

As far as normal web applications are concerned, there is no concept of a globally privileged access level; instead, permissions to access content on the client and server side are controlled by four separate, implicit authentication schemes:

  • HTTP cookies (reference):

    • Visibility: explicitly visible to client and server code.

    • Scoping: scoped to the originating functional domain (or subdomain thereof). Can be additionally scoped to a specific document path; this is meaningless as a security boundary.

    • Notes: a kludge to allow scoping to HTTPS only is present - the secure flag; this mechanism offers far less benefit than it could, because HTTP and HTTPS cookie jars are not isolated otherwise.

  • Legacy HTTP authentication (reference):

    • Visibility: explicitly visible to server code; sometimes exposed to client code.

    • Scoping: scoped to a protocol-host name-port tuple. In some but not all browsers, additionally scoped to a specific request path; or to a server-declared "realm" string.

  • Client SSL certificates:

    • Visibility: visible to server code only.

    • Scoping: scoped globally in the browser.

    • Notes: in most but not all browsers, user must confirm sending a certificate to a particular destination host name once within a browsing session.

  • Script origin:

    • Visibility: principally visible to client code only; unreliably disclosed to server on some requests.

    • Scoping: origin is defined by a protocol-host tuple; port number is also included in most, but not all, browsers.

  • Notably absent: network context. The information about the circumstances in which a particular credential is established is not analyzed or preserved. Because of the persistence of web content, this poses a significant problem with public wireless networks.
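
To make the "script origin" entry a bit more concrete, here is a minimal Python sketch of the comparison, assuming fully qualified http:// or https:// URLs. The names origin(), same_origin(), and include_port are made up for illustration and do not correspond to any browser API; the include_port switch merely stands in for the browsers that leave the port number out of the check.

```python
from urllib.parse import urlsplit

# Default ports, so that http://host/ and http://host:80/ compare as equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url, include_port=True):
    """Return the protocol-host(-port) tuple for a fully qualified URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    if not include_port:
        # Hypothetical model of a browser that ignores the port.
        return (scheme, host)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(a, b, include_port=True):
    """Compare two URLs under the chosen origin definition."""
    return origin(a, include_port) == origin(b, include_port)


# Protocol always matters; port only matters when it is part of the tuple.
print(same_origin("http://example.com/", "https://example.com/"))
print(same_origin("http://example.com/", "http://example.com:8080/"))
print(same_origin("http://example.com/", "http://example.com:8080/",
                  include_port=False))
```

The three calls print False, False, and True: the last pair is treated as a single origin only under the port-less definition, which is exactly the kind of disagreement between browsers noted in the list above.
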

The above set of overlapping credential schemes is then used to build a number of client-side mechanisms with a range of conflicting security boundaries:

  • Subresource loads (reference):

    • Relevant to: the ability to load images, scripts, plugins, frames, and other types of embedded content; and to navigate the top-level window.

    • Security boundaries: this capability is not generally restricted in modern browsers. Certain response types can be read directly across websites; others can be requested, and then examined only indirectly.

    • Interactions: server response can and often will be tied to server-recognized credentials, including cookies, SSL certificates, or client-supplied origin (non-universal Origin or unsafe Referer header).

  • DOM access (reference):

    • Relevant to: the ability to directly access loaded documents through the JavaScript Document Object Model - a method considerably more versatile than the previous scenario.

    • Security boundaries: privilege scoped to origin; when origin is not fully qualified, behavior is undefined. Scope can be expanded to functional domain via document.domain; this has unintended consequences and is usually unsafe.

    • Interactions: access is not tied to any other credentials; for example, replacing or removing cookies does not revoke access from old documents to the new ones, and vice versa.

  • Most types of browser API access:

    • Relevant to: access to browser-managed interfaces such as postMessage(), localStorage, geolocation information, pop-up privileges, and so forth.

    • Security boundaries: permissions theoretically scoped to origin, but Firefox and MSIE currently violate this rule for localStorage and sessionStorage, and scope to host; when origin is not fully qualified, behavior is undefined. Additional top-level window scoping is introduced for sessionStorage. Unlike with DOM access, these permissions are not affected by document.domain.

    • Interactions: access is not tied to any other credentials.

  • XMLHttpRequest API (reference):

    • Relevant to: the ability to make almost arbitrary, credential-bearing HTTP requests, and read back raw responses, from within JavaScript code.

    • Security boundaries: permission scoped to origin; when origin is not fully qualified, behavior is undefined. Port number is always compared, even in browsers that do not include it in other origin checks. Scope not affected by document.domain.

    • Notes: access to another origin is possible after a simple HTTP handshake in modern browsers.

    • Interactions: server response can and often will be tied to server-recognized credentials.

  • Web sockets API (reference):

    • Relevant to: a new HTML5 feature in WebKit browsers, allowing scripts to establish long-lived stream connections to arbitrary servers.

    • Security boundaries: scripts can access any server and port after a successful completion of a challenge-response handshake.

    • Interactions: server is provided with requestor's origin information and cookies to authenticate the request.

  • Cookie access (reference):

    • Relevant to: the ability to read or write the document.cookie property.

    • Security boundary: content is scoped to a particular domain, path, and secure flag level, as governed by cookie scoping rules. Cookies may also be tagged as httponly, preventing reads (but not writes) from within JavaScript.

    • Notes: document.cookie has highly asymmetrical write and read behavior; it is possible to overwrite cookies for subdomains, paths, or secure / httponly settings well outside setter's nominal visibility.

    • Interactions: not tied to any other credentials or network context. Substantially incompatible with DOM access boundaries, affecting both schemes: DOM rules make cookie path scoping useless, while lax cookie scoping often undermines DOM origin-based isolation in cookie-authenticated web applications.

  • Password managers:

    • Relevant to: password auto-completion capabilities integrated with most browsers.

    • Security boundaries: stored credentials are scoped to origin, path, and form layout; only the first part constitutes a meaningful security boundary.

    • Notes: in some but not all browsers, an explicit user action is needed to expose credentials to the origin.

    • Interactions: incompatibility with DOM access rules makes path and form scoping useless from a security perspective. Existing credentials are not taken into account when completing form data. Saved passwords are generally converted to cookie-based credentials by the server using an application-specific mapping.

  • Cache control:

    • Relevant to: implicit and explicit retrieval of previously cached documents when requested by the client-side code.

    • Security boundaries: cached content is scoped to original request URL and POST payload; once retrieved from the cache, it follows the same rules as any fresh response would.

    • Interactions: caches may be shared by multiple users. Cached content is not explicitly tied to any credentials - logging out does not invalidate cached documents, and does not prevent same-origin access later on. If shared proxies are accidentally permitted to cache the response, it may be returned to other users, even though their requests do not bear relevant cookies.

  • Internet Explorer zone model:

    • Relevant to: a proprietary mechanism that allows elevated privileges to be granted to certain content; and to prevent navigation between certain groups of websites.

    • Security boundaries: a mix of origin scoping for explicitly defined URLs; protocol-level scoping for file:// content; and IP, host name, and proxy configuration heuristics for Intranet resources.

    • Notes: local network heuristics can fail spectacularly in certain settings. Zone settings are fairly cryptic and difficult to understand. Users frequently add not-particularly-trustworthy websites to more privileged zones to work around usability problems.

    • Interactions: not consistently synchronized with any other security boundaries. Largely neglects to consider the impact of cross-site scripting flaws.

  • Plugin access:

    • Relevant to: various activities of plugin-delivered active content, which generally shares the HTTP stack, cookie jar, and document cache with the browser; and has DOM access to the embedding page.

    • Security boundaries: a variety of custom, inconsistent models: for example, Java considers all content originating from the same IP as same-origin; Flash glances over redirects and considers their result same-origin with the initial URL. Most plugins also offer multiple ways to negotiate cross-domain access.

    • Notes: plugin origin is derived from the URL from which the code is retrieved; Content-Type and Content-Disposition are usually ignored during this operation.

    • Interactions: largely inconsistent with all other browser security mechanisms.
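
The mismatch between cookie scoping and DOM origin checks mentioned in the list above can be sketched in a few lines of Python. This is a deliberately simplified model, not what any browser actually implements: cookie_sent_to() and origin_of() are hypothetical helpers, the cookie is assumed to carry an explicit domain, and path scoping is left out because it is not a meaningful security boundary anyway.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def cookie_sent_to(cookie_domain, secure, url):
    """Simplified check: would a domain-scoped cookie accompany this request?"""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = cookie_domain.lower().lstrip(".")
    # The cookie matches the domain itself and every subdomain below it.
    domain_ok = host == domain or host.endswith("." + domain)
    # The secure flag limits the cookie to HTTPS; port is never considered.
    scheme_ok = parts.scheme.lower() == "https" or not secure
    return domain_ok and scheme_ok


def origin_of(url):
    """Protocol-host-port tuple of the kind used for DOM access checks."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


login = "https://login.example.com/"
blog = "http://blog.example.com:8080/"

# Two distinct origins for DOM purposes...
print(origin_of(login) != origin_of(blog))
# ...yet a non-secure cookie scoped to example.com goes to both of them.
print(cookie_sent_to("example.com", False, login))
print(cookie_sent_to("example.com", False, blog))
```

All three lines print True: the two documents cannot touch each other through the DOM, but they share the same cookie-based credential, which is how a flaw in the less important application can end up undermining the more important one.
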

Operating systems are more complex and more diverse than browsers; but I dare you to come up with an example of a design nearly as messy and dangerous as this. It is not just that the set of capabilities is odd, spurious, and sometimes redundant; but that each and every one of them has a slightly different understanding of who you are, and what permissions you need to be given.


  1. I think the closest analogy is the multilevel security model promoted by DoD/NSA from 1985 to 1999 and still alive in some corners. The TCSEC, TNI, TDI, entire Rainbow Series, Federal Criteria, Common Criteria, and countless conferences (including the National Computer Security Conference) just scratch the surface of the legacy of this failed idea. In the end, I suppose you are right, though, that the browser situation is worse. At least there was a strong fundamental model by Bell and LaPadula that grounded this work. I don't think the same-origin policy is in the same category.

  2. Linux capabilities have little to do with MLS, whose main subjects are data files that carry classification tags. The main reason MLS is not very usable in a Linux environment is the need to tag every file in an operating system that is constantly reconfigured, updated, etc. Some systems managed to make this more usable by either protecting only a subset of files or implementing various "learning modes" (SELinux, AppArmor, systrace). The second thing is that in an operating system we are less concerned with confidentiality (Bell-LaPadula) than with integrity (Biba and others). And I consider it quite a success that Microsoft managed to implement integrity protection based on Biba (Windows Integrity Mechanism) in a general-use, consumer-level operating system (and I find it quite a shame that they were also first to implement DEP in a mainstream distribution :)