We want to start automatically indexing FC WARCs for full-text search.
See e.g. website/scripts/run-solr-indexer.sh for the basic operation.
The Solr indexer uses a SurtPrefixSet for the Open Access list, so the entries in that list are expected to be SURTs. The list should come from the OA SURTs file generated by w3act_export.
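To illustrate what SURT-prefix matching means for the Open Access list, here is a minimal Python sketch. It is not the indexer's SurtPrefixSet implementation: `to_surt` and `is_open_access` are hypothetical helpers, the SURT conversion is deliberately simplified (it ignores ports, userinfo, query strings and scheme normalisation), and the example prefix is made up.

```python
from urllib.parse import urlsplit
from typing import Iterable


def to_surt(url: str) -> str:
    """Return a simplified SURT form: scheme, reversed host labels, then path.

    Assumption: this skips the canonicalisation a real SURT library applies.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    labels = ",".join(reversed(host.split(".")))
    path = parts.path or "/"
    return f"{parts.scheme}://({labels},){path}"


def is_open_access(url: str, oa_prefixes: Iterable[str]) -> bool:
    """True if the URL's SURT starts with any prefix in the OA list."""
    surt = to_surt(url)
    return any(surt.startswith(prefix) for prefix in oa_prefixes)


# Hypothetical OA list with one prefix covering every host under bl.uk.
oa_prefixes = ["http://(uk,bl,"]
print(to_surt("http://www.bl.uk/collections/"))
print(is_open_access("http://www.bl.uk/collections/", oa_prefixes))
print(is_open_access("http://example.org/", oa_prefixes))
```

The point of the sketch is that a prefix such as `http://(uk,bl,` only works if the list really contains SURTs; a plain URL in that file would never match.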
The Solr indexer uses a StaticMapExclusionFilterFactory for exclusions, like Open Wayback, so this can be a mixture of URLs and SURTs. The PyWB block files are manually-managed files from the internal GitLab repo.
Some of this will be done in ukwa-manage rather than here, but we'll need an Airflow runner.
Create a Solr indexer that:
Runs like the CDX Indexer, tracking progress in TrackDB.
commit at the end?
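A rough sketch of the "track progress, then maybe commit" loop, in Python since the Airflow runner would be Python anyway. Everything here is an assumption for illustration: the `tracker` dict stands in for TrackDB (whose real API and field names are not shown here), `index_warc` and `commit` are placeholder callables, and `commit_at_end` is a hypothetical flag for the open question above.

```python
from typing import Callable, Dict, Iterable, List


def index_pending_warcs(
    tracker: Dict[str, str],
    warcs: Iterable[str],
    index_warc: Callable[[str], None],
    commit: Callable[[], None],
    commit_at_end: bool = True,
) -> List[str]:
    """Index every WARC not yet marked as indexed, recording progress as we go."""
    newly_indexed: List[str] = []
    for warc in warcs:
        if tracker.get(warc) == "indexed":
            continue  # already handled by an earlier run
        index_warc(warc)
        tracker[warc] = "indexed"  # record progress only after indexing succeeds
        newly_indexed.append(warc)
    # Only commit when something new was actually submitted.
    if commit_at_end and newly_indexed:
        commit()
    return newly_indexed


# Usage with in-memory stand-ins for TrackDB and Solr.
tracker = {"a.warc.gz": "indexed"}
submitted: List[str] = []
commits: List[int] = []
result = index_pending_warcs(
    tracker,
    ["a.warc.gz", "b.warc.gz"],
    submitted.append,
    lambda: commits.append(1),
)
```

Marking a WARC as indexed only after `index_warc` returns means a failed run can simply be retried, which is the behaviour we would want from an Airflow task.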