Offline resources in Firefox

Introduced in Gecko 1.9.1

(Firefox 3.5 / Thunderbird 3 / SeaMonkey 2)

Firefox 3.5 supports the HTML 5 specification for offline caching of web applications' resources; this is done using the application cache -- a collection of resources obtained from a resource manifest provided by the web application.

Firefox 3 note

Parts of this specification were supported by Firefox 3; however, the specification has changed since Firefox 3 shipped, so targeting Firefox 3 for web application support is not covered here.

Terminology

This section defines a few terms that will be helpful to understand while reading this documentation.

cache manifest
A cache manifest is a file that describes the resources that should be cached for offline use.
Client ID
The client ID identifies an application cache to the nsICacheService interface, which uses it when managing that cache. It is an opaque string, generated when a new application cache is created.
cache group
A cache group is a set of the different versions of a cache for the same cache manifest.
Group ID
The group ID is the URI of a cache manifest file. All caches that share the same manifest are in the same application cache group.

The application cache

Each web application's resource manifest has its own application cache.  Since applications can share resources (and can even share the same manifest URI), each application's cache has a separate copy of each shared resource.  These caches are versioned; each time the site is visited while online, a new version of the application is synched into the application cache.  That new version will be used on the next visit to the site.
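
The relationship between group IDs, client IDs, and cache versions can be pictured with a small data model. This is only an illustrative sketch and not Gecko's implementation: the CacheGroups class, its method names, and the client ID format are all hypothetical.

```javascript
// Hypothetical model of application cache groups; not Gecko's actual code.
class CacheGroups {
  constructor() {
    this.groups = new Map(); // group ID (manifest URI) -> array of cache versions
    this.counter = 0;        // only used to make the made-up client IDs unique
  }

  // Record a newly synced version of the cache for a manifest.
  // `resources` is an iterable of [uri, content] pairs.
  addVersion(manifestUri, resources) {
    this.counter += 1;
    // Real client IDs are opaque strings; this format is invented for the sketch.
    const clientID = manifestUri + "|" + this.counter;
    const version = { clientID, resources: new Map(resources) };
    if (!this.groups.has(manifestUri)) {
      this.groups.set(manifestUri, []);
    }
    this.groups.get(manifestUri).push(version);
    return clientID;
  }

  // The most recently added version is the one used on the next visit.
  newest(manifestUri) {
    const versions = this.groups.get(manifestUri) || [];
    return versions.length > 0 ? versions[versions.length - 1] : null;
  }
}
```

Each call to addVersion() stands for one online visit that synced a new version; newest() returns the version a later visit would load from.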

How the application cache works

The application cache modifies the process of loading a document; items in the application cache are now loaded directly from the cache, without accessing the network (other than to update the copy of the web application in the cache; this is done in the background and doesn't affect performance significantly).

Application caches can also become obsolete. If the manifest is removed from the server, Firefox removes all application caches that use that manifest, then sends an "obsolete" event to the application cache object. The application cache's status is then set to OBSOLETE.

Example work flow

This section offers an example of the work flow of how loading a page works with the cache.  If you're an end user of the caching mechanism, you don't really need to know this, but it's useful if you're developing Firefox or embedding Gecko.

Let's assume you have a cache manifest located at http://www.foo.com/cache.manifest, which includes two resources: test.html and test.png.  Loading test.html results in the following steps being taken:

The HTTP channel calls the nsIApplicationCacheService method chooseApplicationCache() to select the appropriate application cache for the requested resource:

appCache = nsIApplicationCacheService::ChooseApplicationCache("http://www.foo.com/test.html");

This returns the most recent application cache in the group identified by group ID "http://www.foo.com/cache.manifest".  Then, the HTTP channel does something similar to the following:

cacheSession = nsICacheService::CreateSession(appCache.clientID, STORAGE_OFFLINE, true);
cacheEntry = cacheSession.openCacheEntry("http://www.foo.com/test.html");

This creates a cache session and opens the requested resource from within the cache.  From this point on, it loads the data from cacheEntry, but without validation.

When test.html is loading, the docshell tells sub-loads to use the same application cache that was found by the call to chooseApplicationCache(), so that any resources loaded by test.html will also be loaded from the same cache.

Application cache states

Each application cache has a state, which indicates the current condition of the cache.  Caches that share the same manifest URI share the same cache state, which will be one of the following:

UNCACHED
A special value that indicates that an application cache object is not fully initialized.
IDLE
The application cache is not currently in the process of being updated.
CHECKING
The manifest is being fetched and checked for updates.
DOWNLOADING
Resources are being downloaded to be added to the cache, due to a changed resource manifest.
UPDATEREADY
There is a new version of the application cache available.  There is a corresponding updateready event, which is fired instead of the cached event when a new update has been downloaded but not yet activated using the swapCache() method.
OBSOLETE
The application cache group is now obsolete.
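
When debugging, it can help to turn the numeric status into one of the names listed above. The sketch below assumes the numeric values that HTML5 assigns to these constants (0 through 5, in the order listed); the helper name describeStatus is made up for this example.

```javascript
// Status names indexed by the numeric value HTML5 assigns to each constant.
const STATUS_NAMES = [
  "UNCACHED",    // 0
  "IDLE",        // 1
  "CHECKING",    // 2
  "DOWNLOADING", // 3
  "UPDATEREADY", // 4
  "OBSOLETE",    // 5
];

// Hypothetical helper: map a numeric status to its name.
function describeStatus(status) {
  return STATUS_NAMES[status] || "UNKNOWN";
}
```

In a page, this would typically be called as describeStatus(window.applicationCache.status).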

Resources in the application cache

The cache always includes at least one resource, identified by URI.  All resources fit into one of the following categories:

Master entries
These are resources added to the cache because a browsing context visited by the user included a document that indicated that it was in this cache using its manifest attribute.
The manifest
This is the resource manifest itself, loaded from the URI specified in the manifest attribute of a master entry's html element. The manifest is downloaded and processed during the application cache update process. Master entries must have the same scheme, host, and port as the manifest.
Explicit entries
These are resources listed in the cache's manifest.
Fallback entries
These are resources that were listed in the cache's manifest as fallback entries. (New in Firefox 3.5.)
Note: Resources can be tagged with multiple categories, and can therefore be categorized as multiple entries.  For example, an entry can be both an explicit entry and a fallback entry.

Master entries

Master entries are any HTML files that include a manifest attribute in their <html> element.  For example, let's say we have the HTML file http://www.foo.bar/entry.html, which looks like this:

<html manifest="foo.manifest">
<h1>Entry</h1>
</html>

If entry.html isn't included in the manifest, visiting the entry.html page causes entry.html to be added to the application cache as a master entry.

Fallback entries

Fallback entries are used when an attempt to load a resource fails.  For example, imagine that there is a cache manifest located at http://www.foo.bar/test.manifest, with the following contents:

CACHE MANIFEST
FALLBACK:
foo/bar/ foo.html

Any request to http://www.foo.bar/foo/bar/ or its subdirectories and their contents will cause a network request to attempt to load the requested resource.  If the attempt fails, either due to a network failure or a server error of some kind, the contents of the file foo.html are loaded instead.
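
The lookup that picks a fallback page can be sketched as a prefix match on resolved URIs. This is a simplified illustration, not Firefox's code: the function name findFallback and the shape of its arguments are assumptions, and preferring the longest matching prefix is assumed from the HTML5 draft rather than stated here. It only answers "which fallback applies?" once the network load has already failed.

```javascript
// Find the fallback page for a failed request.
// `fallbackEntries` is an array of [namespace, fallbackPage] pairs as written
// in the FALLBACK: section; both parts are resolved against the manifest URI.
function findFallback(requestUrl, manifestUrl, fallbackEntries) {
  let best = null;
  for (const [namespace, page] of fallbackEntries) {
    const prefix = new URL(namespace, manifestUrl).href;
    const matches = requestUrl.startsWith(prefix);
    // Assumption: when several namespaces match, the longest one wins.
    if (matches && (best === null || prefix.length > best.prefix.length)) {
      best = { prefix, page: new URL(page, manifestUrl).href };
    }
  }
  return best === null ? null : best.page;
}
```

With the manifest above, a failed request for http://www.foo.bar/foo/bar/deep/page.html resolves to http://www.foo.bar/foo.html, while a request outside foo/bar/ has no fallback and yields null.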

The online whitelist

The online whitelist may contain zero or more URIs of resources that the web application needs to fetch from the server rather than from the offline cache. This lets the browser's security model protect the user from potential security breaches by limiting access to approved resources only.

Note: The online whitelist is ignored in versions of Firefox prior to 3.5.

This lets you ensure that, for example, scripts and other code are loaded and executed from the server instead of the cache:

CACHE MANIFEST
NETWORK:
/api

This ensures that requests to load resources contained in the http://www.foo.bar/api/ subtree will always go to the network without attempting to access them from the cache.
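
The whitelist check can be pictured as a prefix comparison against each NETWORK: entry after resolving it relative to the manifest. The sketch below is an approximation under that assumption; the function name isWhitelisted is invented, and the manifest location used in the usage note is hypothetical because the example above does not state one.

```javascript
// Return true if the request should bypass the cache and go to the network.
// `networkEntries` holds the data lines of the NETWORK: section as written.
function isWhitelisted(requestUrl, manifestUrl, networkEntries) {
  return networkEntries.some((entry) => {
    // Entries are resolved relative to the manifest URI, then prefix-matched.
    const prefix = new URL(entry, manifestUrl).href;
    return requestUrl.startsWith(prefix);
  });
}
```

Assuming the manifest lives at http://www.foo.bar/cache.manifest, isWhitelisted("http://www.foo.bar/api/users", "http://www.foo.bar/cache.manifest", ["/api"]) is true, and the same call for http://www.foo.bar/index.html is false.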

The cache manifest

Cache manifest files must be served with the text/cache-manifest MIME type, and all resources served using this MIME type must follow the syntax for an application cache manifest, as defined here. Cache manifests are UTF-8 format text files and may, optionally, include a BOM character. Newlines may be represented by line feed (U+000A), carriage return (U+000D), or carriage return and line feed both.

The first line of the cache manifest must consist of the string "CACHE MANIFEST" (with a single U+0020 space between the two words), followed by zero or more space or tab characters. Any other text on the line will be ignored.

The remainder of the cache manifest must consist of zero or more of the following lines:

Blank line
You may use blank lines consisting of zero or more space and tab characters.
Comment
Comments consist of zero or more tabs or spaces followed by a single "#" character, followed by zero or more characters of comment text. Comments may only be used on their own lines, and cannot be appended to other lines.
Section header
Section headers specify which section of the cache manifest is being manipulated. There are three possible section headers:

  • CACHE: Switches to the explicit section. This is the default section.
  • FALLBACK: Switches to the fallback section. (Firefox 3.5 note: The fallback section is ignored by versions of Firefox prior to 3.5.)
  • NETWORK: Switches to the online whitelist section. (Firefox 3.5 note: The online whitelist section is ignored by versions of Firefox prior to 3.5.)

The section header line may include whitespace, but must include the colon in the section name.
Data for the current section
The format of data lines varies from section to section. In the explicit section, each line is a valid URI or IRI reference to a resource to cache. Whitespace is allowed before and after the URI or IRI on each line.

Cache manifests may switch back and forth from section to section at will (so each section header can be used more than once), and sections are allowed to be empty.

Note: Relative URIs are relative to the cache manifest's URI, not to the URI of the document referencing the manifest.
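
To make the syntax rules concrete, here is a minimal parser sketch. It is not the parser Firefox uses; the function name parseCacheManifest and the shape of its return value are assumptions made for illustration. It groups data lines by section and silently skips lines it does not understand, and it does not resolve or validate URIs. It also assumes that each fallback line pairs a namespace with a fallback resource.

```javascript
// Minimal cache manifest parser following the syntax rules described above.
function parseCacheManifest(text) {
  // Drop an optional BOM, then split on CRLF, CR, or LF.
  const lines = text.replace(/^\uFEFF/, "").split(/\r\n|\r|\n/);

  // The first line must start with "CACHE MANIFEST", optionally followed by
  // spaces or tabs; anything after that whitespace is ignored.
  if (!/^CACHE MANIFEST([ \t]|$)/.test(lines[0])) {
    throw new Error("not a cache manifest");
  }

  const headers = { "CACHE:": "cache", "FALLBACK:": "fallback", "NETWORK:": "network" };
  const result = { cache: [], fallback: [], network: [] };
  let section = "cache"; // the explicit section is the default

  for (const rawLine of lines.slice(1)) {
    // Simplification: trim() strips more than just spaces and tabs.
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) {
      continue; // blank line or comment
    }
    if (Object.prototype.hasOwnProperty.call(headers, line)) {
      section = headers[line]; // section header: switch sections
      continue;
    }
    const tokens = line.split(/[ \t]+/);
    if (section === "fallback") {
      // Assumed format: a namespace followed by a fallback resource.
      if (tokens.length === 2) {
        result.fallback.push({ namespace: tokens[0], fallback: tokens[1] });
      }
    } else if (tokens.length === 1) {
      result[section].push(tokens[0]);
    }
  }
  return result;
}
```

Because sections can be switched back and forth, the parser keeps a single "current section" variable and appends each data line to whichever section is active at that point.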

A sample cache manifest

This is a simple cache manifest for an imaginary web site at foo.com.

CACHE MANIFEST
# v1
# This is a comment.
http://www.foo.com/index.html
http://www.foo.com/header.png
http://www.foo.com/blah/blah

In this example, there is no section header, so all data lines are assumed to be in the explicit section.

The "v1" comment is there for a good reason. The cache is only updated when the manifest itself changes, so if you change a resource (for example, updating the header.png image with new content), you must also change the manifest file to let the browser know that it needs to refresh the cache. Any change to the manifest will do, but bumping a version number in a comment is a good way to do it.
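
If the manifest is produced or edited by a script, the version bump can be automated. The following is a small sketch of that idea, assuming the manifest contains a comment of the form "# v" followed by a number; the function name bumpManifestVersion is hypothetical.

```javascript
// Increment the number in the first "# v<number>" comment line of a manifest.
function bumpManifestVersion(manifestText) {
  let bumped = false;
  const result = manifestText.replace(
    /^([ \t]*#[ \t]*v)(\d+)([ \t]*)$/m,
    (match, prefix, digits, trailing) => {
      bumped = true;
      return prefix + (parseInt(digits, 10) + 1) + trailing;
    }
  );
  if (!bumped) {
    throw new Error("no '# v<number>' comment found");
  }
  return result;
}
```

Only the comment line changes, which is enough for the browser to treat the manifest as modified and refetch the listed resources.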

Specifying a cache manifest

To tell Firefox to use offline application caching for a given web site, the site needs to use the manifest attribute on the html element, like this:

<html manifest="http://www.foo.com/cache-manifest">
  ...
</html>

The first time the user loads your application, Firefox displays a notification bar saying "This website (www.example.com) is asking to store data on your computer for offline use. [Allow] [Never for This Site] [Not Now]". The term "offline(-enabled) applications" sometimes refers specifically to applications that the user has allowed to use the offline capabilities.

The update process

  1. When Firefox visits a document that includes a manifest attribute, it sends a checking event to the window.applicationCache object, then fetches the manifest file, following the appropriate HTTP caching rules. If the currently-cached copy of the manifest is up-to-date, the noupdate event is sent to the applicationCache, and the update process is complete.
  2. If the manifest file hasn't changed since the last update check, the noupdate event is likewise sent to the applicationCache, and the update process is complete. This is why you need to change the manifest file whenever you change the resources: otherwise Firefox has no way of knowing that it needs to re-cache them.
  3. If the manifest file has changed, all files in the manifest -- as well as those added to the cache by calling applicationCache.add() -- are fetched into a temporary cache, following the appropriate HTTP caching rules. For each file fetched into the cache, a progress event is sent to the applicationCache object. If any errors occur, an error event is sent, and the update halts.
  4. Once all the files have been successfully retrieved, they are moved into the real offline cache atomically, and a cached event is sent to the applicationCache object.
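
A page can observe these steps by listening for the events on the application cache object. The sketch below is one possible way to wire that up, not a prescribed pattern: the function name watchAppCache and the log callback are assumptions, and calling swapCache() as soon as updateready fires is just one reasonable policy.

```javascript
// Attach logging listeners for the events named in the steps above.
// `appCache` is expected to behave like window.applicationCache;
// `log` is any function that accepts a string.
function watchAppCache(appCache, log) {
  const events = ["checking", "noupdate", "progress", "error", "cached", "updateready"];
  for (const name of events) {
    appCache.addEventListener(name, () => log("applicationCache: " + name));
  }
  // Once an update has been downloaded, activate it with swapCache().
  appCache.addEventListener("updateready", () => appCache.swapCache());
}
```

In a browser that implements the application cache, this would be called as watchAppCache(window.applicationCache, (msg) => console.log(msg)).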

Storage location and clearing the offline cache

The offline cache data is stored separately from the Firefox profile -- next to the regular disk cache:

  • Windows Vista/7: C:\Users\<username>\AppData\Local\Mozilla\Firefox\Profiles\<salt>.<profile name>\OfflineCache

The current status of the offline cache can be inspected on the about:cache page (under the "Offline cache device" heading).

  • The offline cache is not cleared via Tools -> Clear Recent History (bug 538595).
  • The offline cache is not cleared via Tools -> Options -> Advanced -> Network -> Offline data -> Clear Now (bug 538588).
  • The offline cache can be cleared for each site separately using the "Remove..." button in Tools -> Options -> Advanced -> Network -> Offline data.

See also clearing the DOM Storage data.

Page last modified 11:35, 9 Jan 2010 by Nickolay
