Speaker
Description
Long-term digital preservation of scholarly publications, research datasets, and cultural-heritage collections requires redundancy, integrity verification, and automated recovery distributed across independently administered institutions. Yet existing systems force a choice. Centralized archives like LOCKSS and Portico provide strong preservation guarantees but depend on trusted nodes, manual configuration, and passive dissemination, leaving them vulnerable to institutional failure, governance fragility, and single points of control. Decentralized alternatives like IPFS offer content-addressed distribution but no persistence guarantees. We present D-LOCKSS (Distributed LOCKSS), a fully decentralized preservation network built on IPFS and libp2p that reimagines the "lots of copies keep stuff safe" principle for content-addressed, peer-to-peer infrastructure. D-LOCKSS introduces two mechanisms: dynamic binary-tree sharding, which partitions preservation responsibility by content hash, and per-shard conflict-free replicated data types (CRDTs), which synchronize collections across institutions without central coordination. Content integrity and provenance are ensured through signed, content-addressed manifests, supporting FAIR findability and accessibility while aligning with NDSA Level 4 preservation objectives. We evaluate D-LOCKSS on a testnet of N nodes, demonstrating that the network reaches target replication within X minutes and self-heals after node failure without manual intervention. D-LOCKSS is implemented in Go, released as open source under MIT/Apache-2.0, and deployed on Wikimedia Cloud. Our findings offer libraries, archives, and cultural-heritage institutions a practical path toward sovereign and resilient digital preservation that does not depend on any single institution's continued operation.
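To make the sharding idea concrete, here is a minimal sketch of how responsibility could be partitioned by content hash with a binary tree that splits on demand. It is not the D-LOCKSS implementation. It assumes that shards are named by bit prefixes of a plain SHA-256 digest (real IPFS content identifiers wrap the digest in a multihash) and that a shard splits in two once it holds more than a fixed number of items. The names `Tree`, `NewTree`, `ShardFor`, `Add`, and the capacity parameter are hypothetical and chosen for illustration only.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// bitAt returns bit i of the digest, where bit 0 is the most significant.
func bitAt(d [32]byte, i int) byte {
	return (d[i/8] >> (7 - uint(i%8))) & 1
}

// Tree holds the current leaf shards. Each leaf is named by a bit prefix
// ("" is the root) and stores the digests it is responsible for.
// Assumption: capacity-triggered splitting; the abstract does not state the rule.
type Tree struct {
	leaves map[string][][32]byte
	max    int // hypothetical per-shard capacity
}

// NewTree starts with a single root shard that covers every digest.
func NewTree(max int) *Tree {
	return &Tree{leaves: map[string][][32]byte{"": nil}, max: max}
}

// ShardFor follows the digest bits from the root until the prefix names a leaf.
func (t *Tree) ShardFor(d [32]byte) string {
	prefix := ""
	for i := 0; i < 256; i++ {
		if _, ok := t.leaves[prefix]; ok {
			return prefix
		}
		if bitAt(d, i) == 1 {
			prefix += "1"
		} else {
			prefix += "0"
		}
	}
	return prefix
}

// split replaces leaf p with its two children and redistributes its items
// according to the next bit of each digest.
func (t *Tree) split(p string) {
	items := t.leaves[p]
	delete(t.leaves, p)
	t.leaves[p+"0"] = nil
	t.leaves[p+"1"] = nil
	for _, d := range items {
		child := p + "0"
		if bitAt(d, len(p)) == 1 {
			child = p + "1"
		}
		t.leaves[child] = append(t.leaves[child], d)
	}
}

// Add hashes the content, stores the digest in its shard, and splits that
// shard until it is back within capacity. It returns the shard that holds
// the item immediately after the call; later splits may move it deeper.
func (t *Tree) Add(content []byte) string {
	d := sha256.Sum256(content)
	p := t.ShardFor(d)
	t.leaves[p] = append(t.leaves[p], d)
	for len(t.leaves[p]) > t.max && len(p) < 256 {
		t.split(p)
		p = t.ShardFor(d)
	}
	return p
}

func main() {
	tree := NewTree(2) // tiny capacity so that splits are visible
	for _, name := range []string{"paper.pdf", "dataset.csv", "scan.tiff", "thesis.pdf"} {
		shard := tree.Add([]byte(name))
		fmt.Printf("%-12s -> shard %q\n", name, shard)
	}
	fmt.Println("leaf shards:", len(tree.leaves))
}
```

Because a shard name is a prefix of the content hash, any node can compute which shard an item belongs to from the hash and the current set of leaves alone, which is what allows responsibility to be divided without a central coordinator in this sketch.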