The LOCKSS Program engineers and maintains open-source software for distributed digital preservation. We have been serving libraries and memory organizations for over two decades.
The classic LOCKSS system (version 1.x) is a mature, vertically-integrated, open-source, distributed digital preservation system developed and maintained by the LOCKSS Program.
LOCKSS 1.x is a Java application that runs on Linux distributions in the RHEL family, such as Rocky Linux, AlmaLinux, or CentOS.
The classic LOCKSS system stores preserved content in a POSIX file system, harvests content from the Web with the LOCKSS Crawler, extracts bibliographic metadata from preserved content, and facilitates access to preserved content through a proxy, an OpenURL resolver, and the LOCKSS ServeContent Web replay engine. The system's Web crawling and metadata extraction activities are driven by LOCKSS plugins, and the stored content is preserved by the LOCKSS Poller's state-of-the-art implementation of the LOCKSS polling and repair protocol. The metadata extraction subsystem requires an external PostgreSQL database.
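As a rough illustration of how a client might query an OpenURL resolver like the one mentioned above, the sketch below builds an OpenURL 1.0 (Z39.88-2004) key/encoded-value query for a journal article. The resolver base URL and the function name `build_openurl` are hypothetical, and the source does not document the actual resolver endpoint or which keys the LOCKSS resolver accepts, so treat this as a generic OpenURL example rather than a description of the LOCKSS API.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base: str, issn: str, volume: str, start_page: str) -> str:
    """Build an OpenURL 1.0 KEV query string for a journal article.

    resolver_base is a hypothetical endpoint; substitute the real
    resolver URL of the system being queried.
    """
    params = [
        # Version tag defined by the Z39.88-2004 (OpenURL 1.0) standard.
        ("url_ver", "Z39.88-2004"),
        # Declares that the referent is described with the journal KEV format.
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.issn", issn),
        ("rft.volume", volume),
        ("rft.spage", start_page),
    ]
    # urlencode percent-encodes reserved characters in keys and values.
    return resolver_base + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical resolver URL and bibliographic values, for illustration only.
    print(build_openurl("https://resolver.example.org/openurl", "1234-5678", "12", "345"))
```

A resolver that holds the matching article would typically respond by redirecting to, or serving, the preserved copy.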
The latest release of the classic LOCKSS system is LOCKSS 1.75.9.
The re-engineered LOCKSS system (version 2.0), also nicknamed LAAWS for "LOCKSS Architected As Web Services", is an upcoming, modular, open-source, distributed digital preservation system developed by the LOCKSS Program.
LOCKSS 2.0 is a containerized application orchestrated by Kubernetes; it can run on a single Linux host using an installable Kubernetes distribution.
The re-engineered LOCKSS system consists of a configurable set of components driven by the LOCKSS Installer and connected to the LOCKSS Configuration Service. The LOCKSS Repository Service stores preserved content in WARC files. Content can be harvested from the Web using a Web crawler registered with the LOCKSS Crawler Service, typically the classic LOCKSS Crawler, or deposited into the LOCKSS Repository Service using its REST API. The LOCKSS Metadata Extraction Service extracts bibliographic metadata from preserved content, and the LOCKSS Metadata Service provides access to it via an OpenURL resolver. Access to the preserved content is facilitated by up to three Web replay engines: Pywb, OpenWayback, and LOCKSS ServeContent. As in the classic system, Web crawling and metadata extraction activities are driven by LOCKSS plugins, and the stored content is preserved by the LOCKSS Poller Service's state-of-the-art implementation of the LOCKSS polling and repair protocol. The LOCKSS Repository Service requires a Solr index, and the metadata-related components require a PostgreSQL database; both can run as containers provided in the stack or be run externally if desired.
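To make the WARC storage format mentioned above more concrete, here is a minimal sketch that serializes a single uncompressed WARC `resource` record using only the Python standard library. The function name `build_warc_resource_record` and the example URI are hypothetical, and this is not the LOCKSS Repository Service's own code; it only shows the general record layout (version line, named headers, blank line, payload, two trailing CRLFs) defined by the WARC format.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri: str, payload: bytes,
                               content_type: str = "application/octet-stream") -> bytes:
    """Serialize one uncompressed WARC 'resource' record (illustrative only)."""
    headers = [
        ("WARC-Type", "resource"),
        # Record IDs are URIs; a random UUID URN is a common choice.
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        # Capture time in UTC, formatted as an ISO 8601 timestamp.
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        # Length of the record block in bytes, excluding the trailing CRLFs.
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.1\r\n" + "".join(f"{name}: {value}\r\n" for name, value in headers) + "\r\n"
    # A record ends with two CRLF pairs after the block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical URI and payload, for illustration only.
    record = build_warc_resource_record("https://example.org/article.txt", b"hello", "text/plain")
    print(record.decode("utf-8"))
```

Real WARC files usually concatenate many such records, often with each record gzip-compressed individually, and crawlers normally write `response` records that wrap the full HTTP exchange rather than a bare payload.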
Turtles is a Python tool for managing LOCKSS plugin sets and LOCKSS plugin registries. It is available from GitHub.
Cross-PLN Technical Working Group
The Cross-PLN Technical Working Group maintains a GitHub organization that brings together software developed by Private LOCKSS Networks (PLNs) and made available to all PLNs.