Switch OpenWayback to CDX Indexing

CDX indexes are sorted plain-text files, and they scale better than the BDB index for large archives.

Steps

  1. Generate CDX files:
    bin/cdx-indexer archive.warc.gz cdx-index/index.cdx
  2. Edit WEB-INF/wayback.xml:
    <!-- Disable BDB collection -->
    <!-- <ref bean="localbdbcollection" /> -->
    <ref bean="localcdxcollection" />
  3. If you have more than one CDX file, list each of them under CompositeSearchResultSource in CDXCollection.xml.
  4. Map ARC/WARC file names to their locations using FlatFileResourceFileLocationDB.
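
Steps 1 and 4 can be sketched in shell as follows. This is a minimal sketch under two assumptions that are not stated above: that the CDX file must be sorted in byte order (LC_ALL=C) before OpenWayback can search it, and that FlatFileResourceFileLocationDB reads a sorted text file with one "name<TAB>path" line per ARC/WARC file. The function names sort_cdx and build_path_index and the file name path-index.txt are hypothetical, not part of OpenWayback.

```shell
#!/bin/sh
# Illustrative helpers only; names and paths are assumptions, not OpenWayback APIs.

# Sort a CDX file in byte order. "sort -o" makes it safe for the input and
# output to be the same file.
# Usage: sort_cdx input.cdx output.cdx
sort_cdx() {
    LC_ALL=C sort -o "$2" "$1"
}

# Write one "<file name><TAB><path>" line for every ARC/WARC file found under
# a directory, sorted in byte order. Paths are printed as find reports them,
# so pass an absolute directory to get absolute paths.
# Usage: build_path_index /path/to/warcs path-index.txt
build_path_index() {
    find "$1" -type f \( -name '*.warc.gz' -o -name '*.warc' \
                      -o -name '*.arc.gz' -o -name '*.arc' \) |
    while IFS= read -r path; do
        printf '%s\t%s\n' "$(basename "$path")" "$path"
    done | LC_ALL=C sort > "$2"
}
```

For example, "sort_cdx cdx-index/index.cdx cdx-index/index.cdx" sorts the file from step 1 in place, and "build_path_index /srv/warcs path-index.txt" writes the path index (here /srv/warcs stands in for wherever your ARC/WARC files live).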

Diagram

  flowchart LR
    A[CDX files] --> B[CompositeSearchResultSource]
    B --> C[OpenWayback queries]

Restart Tomcat after making these changes, then watch the logs for ARC/WARC files that cannot be located.
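
Missing paths can also be caught before the restart. The sketch below assumes a path index in "name<TAB>path" form that points at local files (not HTTP URLs). The function name check_path_index is hypothetical, and this check is a convenience rather than anything OpenWayback itself provides.

```shell
#!/bin/sh
# Illustrative pre-restart check; not an OpenWayback tool.

# Print every path-index entry whose target file is missing or unreadable.
# Returns 1 if at least one entry is missing, 0 otherwise.
# Usage: check_path_index path-index.txt
check_path_index() {
    missing=0
    tab=$(printf '\t')
    # Split on tabs only, so paths containing spaces stay intact.
    while IFS="$tab" read -r name path; do
        [ -n "$name" ] || continue          # skip blank lines
        if [ ! -r "$path" ]; then
            printf 'missing: %s -> %s\n' "$name" "$path"
            missing=1
        fi
    done < "$1"
    return "$missing"
}
```

Running "check_path_index path-index.txt" prints one line per unreachable file and exits non-zero if any were found, so it can gate a deployment script.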