Content control and capabilities in Mozilla
Mike Shaver
Draft: April 8, 2001

Note to the reader

This is a forward-looking document: it describes a state of affairs in Mozilla that is not, as of the time of this writing, consistent with source-tree reality.  For updates on the state of various work items associated with achieving this content-control Nirvana, please consult the content control tracking bug.

Overview

For a variety of reasons, including accessibility, privacy, security and personal preference, it is desirable that Mozilla expose a set of APIs to users and embedders for controlling how it interacts with various types of content.  The types of interaction can be described in terms of three questions that Mozilla must, with assistance from its components, answer:

Should I load this content?

This question can be meaningfully asked of top-level URLs entered in the location bar or seen as the target of a redirection command from JavaScript, HTTP headers or HTML <META> tags.  It is also useful to ask it about content referenced from HTML and XML content, such as <IMG>, <FRAME>, and <SCRIPT SRC="..."> tags.  If the answer to this question is no, then Mozilla should not make a network request to fetch the content, and layout should proceed without it (honouring provided width and height attributes, if present).

Should I process this content?

Once content has been loaded, or for content such as <SCRIPT> and <META>, which can be found inline in a document, we must decide whether or not to interpret or otherwise act on the content.  If the answer is no, Mozilla should simply ignore the contents of the tag or document, leaving them intact in the document tree.

What is this content permitted to access?

For some types of content, particularly executable content such as Java and JavaScript, it is useful to restrict the capabilities of the content without preventing execution/interpretation altogether.

nsIContentPolicy and the Capabilities Manager

Mozilla's mechanisms for controlling and restricting content fall into two categories: nsIContentPolicy handles the first two questions ("Should I load this content? Should I process it?") and the Capabilities Manager handles the last ("What can this content do?").  A brief description of the nsIContentPolicy API follows:

[scriptable,uuid(1cb4085d-5407-4169-bcfe-4c5ba013fa5b)]
interface nsIContentPolicy : nsISupports
{
  const short OTHER       = 0;
  /**
   * <SCRIPT>
   */
  const short SCRIPT      = 1;

  /**
   * <IMG>, or background images.
   */
  const short IMAGE       = 2;

  /**
   * <STYLE>.
   * XXX should this control processing of STYLE= attributes?
   */
  const short STYLESHEET  = 3;

  /**
   * <APPLET>, <OBJECT>, <EMBED>
   */
  const short OBJECT      = 4;

  /**
   * <FRAME>, <IFRAME>, <OBJECT TYPE=text/html>
   */
  const short SUBDOCUMENT = 5;

  /**
   * <LINK>, <META>, HTTP refresh, etc.
   */
  const short CONTROL_TAG = 6;

  /**
   * Network request without document context.
   */
  const short RAW_URL     = 7;

  /**
   * Should the content at this location be loaded?
   */
  boolean shouldLoad(in PRInt32 contentType, in nsISupports context,
                     in nsIURI contentLocation);

  /**
   * Should the contents of the element in question be processed?
   */
  boolean shouldProcess(in PRInt32 contentType, in nsISupports context,
                        in nsIURI contentLocation);
};
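
To make the division of labour between the two methods concrete, here is a minimal Python model of an implementor.  It is a sketch, not Mozilla code: the class name NoRemoteImagesPolicy, its page_host parameter, and the use of plain URL strings in place of nsIURI are all assumptions made for illustration.  Only the constant values and the two method names are taken from the interface above.

```python
from urllib.parse import urlparse

# Content-type constants, mirroring the values in the IDL above.
OTHER, SCRIPT, IMAGE, STYLESHEET, OBJECT, SUBDOCUMENT, CONTROL_TAG, RAW_URL = range(8)


class NoRemoteImagesPolicy:
    """Hypothetical policy: load images only from the page's own host,
    and never process scripts."""

    def __init__(self, page_host):
        self.page_host = page_host  # assumption: caller knows the page host

    def shouldLoad(self, contentType, context, contentLocation):
        # Veto the network fetch for images served from another host.
        if contentType == IMAGE:
            return urlparse(contentLocation).hostname == self.page_host
        return True

    def shouldProcess(self, contentType, context, contentLocation):
        # Scripts may be inline and so never pass through shouldLoad;
        # the veto has to happen at processing time instead.
        return contentType != SCRIPT


policy = NoRemoteImagesPolicy("www.example.org")
print(policy.shouldLoad(IMAGE, None, "http://ads.example.com/banner.gif"))
print(policy.shouldLoad(IMAGE, None, "http://www.example.org/logo.gif"))
print(policy.shouldProcess(SCRIPT, None, None))
```

The image check lives in shouldLoad because the goal is to avoid the network request, while the script check lives in shouldProcess because inline <SCRIPT> content is never fetched separately.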

Implementors of nsIContentPolicy register themselves in the "@mozilla.org/layout/content-policy;1" category, and the content policy manager calls them in turn.  A false return from any content policy object vetoes the loading or processing of the content in question, and no further objects in the list are consulted.  At present, there is no way to control the ordering of the content policy list, or to register for only certain content types: every registered content policy object may be consulted for every decision, whatever the content type, in an arbitrary order (likely that of registration).
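
The veto behaviour described above can be modelled in a few lines of Python.  This is a sketch under stated assumptions, not the manager's actual implementation: ContentPolicyManager, register, and RecordingPolicy are hypothetical names, and a plain list in registration order stands in for the category.

```python
IMAGE = 2  # value taken from the interface definition


class ContentPolicyManager:
    """Hypothetical stand-in for the content policy manager."""

    def __init__(self):
        self._policies = []  # assumption: consulted in registration order

    def register(self, policy):
        self._policies.append(policy)

    def _ask(self, method, contentType, context, contentLocation):
        for policy in self._policies:
            if not getattr(policy, method)(contentType, context, contentLocation):
                return False  # first veto wins; later policies are skipped
        return True  # nobody objected (also the result for an empty list)

    def shouldLoad(self, contentType, context, contentLocation):
        return self._ask("shouldLoad", contentType, context, contentLocation)

    def shouldProcess(self, contentType, context, contentLocation):
        return self._ask("shouldProcess", contentType, context, contentLocation)


class RecordingPolicy:
    """Test helper: returns a fixed verdict and logs that it was consulted."""

    def __init__(self, name, verdict, log):
        self.name, self.verdict, self.log = name, verdict, log

    def shouldLoad(self, contentType, context, contentLocation):
        self.log.append(self.name)
        return self.verdict

    shouldProcess = shouldLoad


log = []
manager = ContentPolicyManager()
manager.register(RecordingPolicy("a", True, log))
manager.register(RecordingPolicy("b", False, log))
manager.register(RecordingPolicy("c", True, log))

allowed = manager.shouldLoad(IMAGE, None, "http://www.example.org/logo.gif")
print(allowed, log)
```

Policy "c" is never consulted because "b" has already vetoed the load.  Since ordering is not under an implementor's control, a policy should not rely on being called for every decision.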

In the case of a contentType of RAW_URL, the context argument will be the network request, and should be QueryInterfaced to nsINetworkRequest.  In other cases, the context argument will be a DOM element (nsIDOMElement) or a network header (nsIResponseHeader), as appropriate.
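
As a rough illustration of how a policy might interpret the context argument, the Python sketch below dispatches on the content type first and on the kind of context second.  The three small classes are hypothetical stand-ins for nsINetworkRequest, nsIDOMElement, and nsIResponseHeader, their fields are invented for the example, and isinstance checks play the role of QueryInterface.

```python
from dataclasses import dataclass

IMAGE, CONTROL_TAG, RAW_URL = 2, 6, 7  # values from the interface definition


@dataclass
class NetworkRequest:      # stand-in for nsINetworkRequest
    url: str


@dataclass
class DOMElement:          # stand-in for nsIDOMElement
    tag_name: str


@dataclass
class ResponseHeader:      # stand-in for nsIResponseHeader
    name: str


def describe_context(contentType, context):
    """Return a short description of what triggered the policy check."""
    if contentType == RAW_URL:
        # RAW_URL checks carry the network request itself.
        if not isinstance(context, NetworkRequest):
            raise TypeError("RAW_URL context must be a network request")
        return "request for " + context.url
    if isinstance(context, DOMElement):
        return "<" + context.tag_name + "> element"
    if isinstance(context, ResponseHeader):
        return context.name + " header"
    raise TypeError("unexpected context for content type %d" % contentType)


print(describe_context(RAW_URL, NetworkRequest("http://www.example.org/")))
print(describe_context(IMAGE, DOMElement("IMG")))
print(describe_context(CONTROL_TAG, ResponseHeader("Refresh")))
```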

For per-window content policy, such as preventing script execution in mail, a content policy object gets the current window for the content and queries it for nsIContentPolicy. (XXX more thought required for that.)
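
One possible shape for that delegation, sketched in Python purely as a thinking aid.  Everything here is an assumption: how the window is found from the context is not yet decided, so the sketch simply gives the context a hypothetical window attribute, and a getattr lookup models the QueryInterface to nsIContentPolicy.

```python
SCRIPT, IMAGE = 1, 2  # values from the interface definition


class MailWindow:
    """Hypothetical window that implements the policy methods itself."""

    def shouldLoad(self, contentType, context, contentLocation):
        return True

    def shouldProcess(self, contentType, context, contentLocation):
        return contentType != SCRIPT  # no script execution in mail


class BrowserWindow:
    """Hypothetical window with no policy of its own."""


class Element:
    """Hypothetical context object that knows its containing window."""

    def __init__(self, window):
        self.window = window


class PerWindowPolicy:
    """Forwards each question to the window, if the window has an opinion."""

    def _delegate(self, method, contentType, context, contentLocation):
        window = getattr(context, "window", None)
        handler = getattr(window, method, None)  # models QueryInterface
        if handler is None:
            return True  # window has no policy: do not veto
        return handler(contentType, context, contentLocation)

    def shouldLoad(self, contentType, context, contentLocation):
        return self._delegate("shouldLoad", contentType, context, contentLocation)

    def shouldProcess(self, contentType, context, contentLocation):
        return self._delegate("shouldProcess", contentType, context, contentLocation)


policy = PerWindowPolicy()
print(policy.shouldProcess(SCRIPT, Element(MailWindow()), None))
print(policy.shouldProcess(SCRIPT, Element(BrowserWindow()), None))
```

Registering one such forwarding object globally would let each window carry its own rules without the policy list needing per-window entries.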

Get Mitch to help me write the caps stuff.