Serving Content to Users

Content preserved in a LOCKSS cache may be served to users by either a proxy server or a direct server.

Proxy

The proxy provides transparent access to preserved content. It requires either that each user's browser be configured to send (selected) requests to the cache, or that a departmental or institutional proxy server be configured to do the same for a community of users. Users may then access content at its original URLs.

The box normally defers to the publisher site, and serves its own preserved content only if there is no newer content available from the publisher. That is, each request received is first forwarded to the publisher's site, and if the publisher responds with content, that response is sent back to the user. If the publisher does not have the requested content or fails to respond within a short time, or if the box has an up-to-date copy, the box serves its preserved copy. The process is transparent to the user.
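The decision described above can be sketched as a small function. This is only an illustration, not the LOCKSS daemon's code: `choose_response`, the `fetch_publisher` callable, and the `preserved` dictionary are hypothetical names, and treating an HTTP 304 reply as "the box has an up-to-date copy" is an assumption.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical fetcher type: returns (HTTP status, body) or raises TimeoutError.
Fetcher = Callable[[str], Tuple[int, Optional[bytes]]]


def choose_response(
    url: str,
    fetch_publisher: Fetcher,
    preserved: Dict[str, bytes],
) -> Tuple[str, Optional[bytes]]:
    """Return (source, body), where source says which copy was served."""
    try:
        # The request is forwarded to the publisher first.
        status, body = fetch_publisher(url)
    except TimeoutError:
        # The publisher failed to respond within a short time.
        status, body = None, None

    if status == 200:
        # The publisher responded with content: pass it through.
        return "publisher", body

    # Publisher has no content, timed out, or (assumed: 304) reports that
    # the preserved copy is up to date, so serve the preserved copy.
    if url in preserved:
        return "cache", preserved[url]

    # Neither the publisher nor the box can supply the content.
    return "unavailable", None
```

In this sketch the user never learns which branch was taken unless the caller inspects the returned source label, which mirrors the transparency described above.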

Direct

The direct server requires no browser or institutional setup. Content is accessed at URLs using the hostname of the LOCKSS box. The pages that are served have their internal links rewritten to point to the copies of the target pages stored on the LOCKSS box. Integration with SFX and other link resolvers will be available in early 2011.
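To illustrate what link rewriting involves, here is a minimal sketch. The URL layout in `BOX_PREFIX`, the host name, the port, and the `rewrite_links` function are all hypothetical and chosen for illustration only; the actual layout used by a LOCKSS box may differ. The sketch also assumes that only links whose targets are preserved on the box are rewritten.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical direct-server URL layout; not taken from the LOCKSS documentation.
BOX_PREFIX = "http://lockss.example.edu:8082/ServeContent?url="

# Matches double-quoted href and src attributes (a simplification of real HTML parsing).
LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def rewrite_links(html: str, page_url: str, preserved: set) -> str:
    """Point links to preserved pages at the copies held on the box."""

    def repl(match):
        attr, target = match.group(1), match.group(2)
        # Resolve relative links against the page's original URL.
        absolute = urljoin(page_url, target)
        if absolute not in preserved:
            # Leave links to non-preserved resources untouched.
            return match.group(0)
        encoded = quote(absolute, safe="")
        return f'{attr}="{BOX_PREFIX}{encoded}"'

    return LINK_ATTR.sub(repl, html)
```

A regular expression is used here only to keep the example short; a production rewriter would need a real HTML parser to handle single-quoted attributes, CSS, and scripts.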


Verifying Stored Material

You will likely want to confirm that your LOCKSS box has correctly retrieved and stored an Archival Unit. There are two methods you can use to monitor the stored content.

Viewing the Status of an Archival Unit

From the cache administration interface, select Daemon Status -> Archival Units. You will be presented with a list of the Archival Units (AUs) currently stored in your LOCKSS machine. Selecting the link corresponding to an Archival Unit will display the current status of that Archival Unit.

Information of note here includes the Disk Usage, the current Status, whether the AU is still Available From Publisher, and the tree of individual files and directories that have been stored for the Archival Unit. You can also view individual files stored in your LOCKSS machine from this page by selecting individual NodeUrls. Note, however, that you will not be able to navigate between linked pages contained in the same Archival Unit. Currently, that requires the LOCKSS Content Audit, described below.

Auditing a LOCKSS Box's Contents

In normal use, the LOCKSS proxy server transparently serves either locally stored content or content from the publisher (if available), depending on which is more recent. To facilitate auditing, an alternate proxy server is available. The audit proxy serves only locally stored content, never content from the publisher, which makes it easy to see what content is actually present in the box.

To use the audit proxy, go to Content Access Options / Content Server Options in the admin UI. Check the "Enable audit proxy" box, choose a port that is neither in use nor blocked by packet filter rules, then click Update Content Servers. You may also need to add the IP address of the machine running the browser you will be using to the Allow Access list on the Content Access Control page.

Configure a web browser to proxy all HTTP requests to the designated port on the LOCKSS box. (This will make the browser unable to fetch any content *not* preserved in the box.) Now when you fetch a URL that is preserved in the box you will see its contents, but any other URL will return "404 Not Found". Pages that are displayed may have broken links or missing images, because some pages point to resources that are not part of any preserved AU (e.g., off-site links). The audit proxy will return a 404 error for these. When you have finished auditing content, restore your browser's proxy configuration to its original settings; you may also wish to disable the audit proxy on the LOCKSS cache.
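If you would rather spot-check a list of URLs from a script than reconfigure a browser, the same audit can be sketched with Python's standard library. The host name, port, and function names below are placeholders chosen for illustration; substitute the values you set in the admin UI. The machine running the script must also be permitted in the Allow Access list.

```python
import urllib.error
import urllib.request


def build_audit_opener(box_host: str, audit_port: int) -> urllib.request.OpenerDirector:
    """Build an opener that sends all HTTP requests through the audit proxy."""
    proxy = urllib.request.ProxyHandler({"http": f"http://{box_host}:{audit_port}"})
    return urllib.request.build_opener(proxy)


def audit(urls, opener) -> dict:
    """Map each URL to the HTTP status returned via the audit proxy."""
    results = {}
    for url in urls:
        try:
            with opener.open(url, timeout=30) as resp:
                results[url] = resp.status
        except urllib.error.HTTPError as err:
            # A 404 here means the URL is not preserved in the box.
            results[url] = err.code
    return results


if __name__ == "__main__":
    # Placeholder host, port, and URL; replace with your own values.
    opener = build_audit_opener("lockss.example.edu", 8888)
    for url, status in audit(["http://journal.example.org/vol1/"], opener).items():
        print(status, url)
```

Because the proxy is set only on this opener, nothing needs to be restored afterwards, unlike the browser configuration.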


Proxy Integration

See Proxy Integration