Rename msmarco-passage -> msmarco-passage-wp in YAML for wp-based regressions (#1858)

Otherwise, if we run the "base" and "wp" versions concurrently, they clash with each other.
lintool authored Apr 26, 2022
1 parent d4155e2 commit 38cd408
Showing 6 changed files with 24 additions and 24 deletions.
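
The clash described in the commit message can be illustrated with a minimal sketch. It assumes that the indexing log is named after the `corpus` field, following the `logs/log.<corpus>` pattern visible in this diff, and that the base regression keeps `corpus: msmarco-passage`. The helper `log_path` is hypothetical and is not an Anserini function.

```python
def log_path(corpus: str) -> str:
    """Hypothetical helper: indexing log named after the corpus, as in the diff."""
    return f"logs/log.{corpus}"


# Assumed corpus names: the base regression, and the wp regression
# before and after this commit.
base = log_path("msmarco-passage")
wp_before = log_path("msmarco-passage")
wp_after = log_path("msmarco-passage-wp")

# Before the rename both jobs target the same log file.
print(base == wp_before)
# After the rename the two jobs write to different files.
print(base == wp_after)
```

Under these assumptions, two concurrent jobs would write to the same log file before the rename and to separate files after it.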
14 changes: 7 additions & 7 deletions docs/regressions-dl19-passage-wp.md
@@ -25,11 +25,11 @@ Typical indexing command:
```
target/appassembler/bin/IndexCollection \
-collection JsonCollection \
- -input /path/to/msmarco-passage \
+ -input /path/to/msmarco-passage-wp \
-index indexes/lucene-index.msmarco-passage-wp/ \
-generator DefaultLuceneDocumentGenerator \
-threads 9 -storePositions -storeDocvectors -storeRaw -pretokenized \
- >& logs/log.msmarco-passage &
+ >& logs/log.msmarco-passage-wp &
```

The directory `/path/to/msmarco-passage-wp/` should be a directory containing the corpus in Anserini's jsonl format.
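
A minimal sketch of a sanity check for one line of such a corpus file. It assumes the commonly documented layout of one JSON object per line with `id` and `contents` fields. The function name `check_jsonl_line` and the sample record are made up for illustration.

```python
import json


def check_jsonl_line(line: str) -> bool:
    """Return True if the line is a JSON object with 'id' and 'contents' keys."""
    try:
        doc = json.loads(line)
    except json.JSONDecodeError:
        return False
    return isinstance(doc, dict) and "id" in doc and "contents" in doc


# Made-up sample record; a wp corpus would hold pretokenized text in 'contents'.
sample = '{"id": "0", "contents": "an example passage"}'
print(check_jsonl_line(sample))
print(check_jsonl_line('{"id": "0"}'))
```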
@@ -49,17 +49,17 @@ target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-passage-wp/ \
-topics src/main/resources/topics-and-qrels/topics.dl19-passage.wp.tsv.gz \
-topicreader TsvInt \
- -output runs/run.msmarco-passage.bm25-default.topics.dl19-passage.wp.txt \
+ -output runs/run.msmarco-passage-wp.bm25-default.topics.dl19-passage.wp.txt \
-bm25 -pretokenized &
```

Evaluation can be performed using `trec_eval`:

```
- tools/eval/trec_eval.9.0.4/trec_eval -m map -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl19-passage.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -m ndcg_cut.10 -c src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl19-passage.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -m recall.100 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl19-passage.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -m recall.1000 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl19-passage.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m map -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl19-passage.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m ndcg_cut.10 -c src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl19-passage.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m recall.100 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl19-passage.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m recall.1000 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl19-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl19-passage.wp.txt
```

## Effectiveness
14 changes: 7 additions & 7 deletions docs/regressions-dl20-passage-wp.md
@@ -25,11 +25,11 @@ Typical indexing command:
```
target/appassembler/bin/IndexCollection \
-collection JsonCollection \
- -input /path/to/msmarco-passage \
+ -input /path/to/msmarco-passage-wp \
-index indexes/lucene-index.msmarco-passage-wp/ \
-generator DefaultLuceneDocumentGenerator \
-threads 9 -storePositions -storeDocvectors -storeRaw -pretokenized \
- >& logs/log.msmarco-passage &
+ >& logs/log.msmarco-passage-wp &
```

The directory `/path/to/msmarco-passage-wp/` should be a directory containing the corpus in Anserini's jsonl format.
@@ -49,17 +49,17 @@ target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-passage-wp/ \
-topics src/main/resources/topics-and-qrels/topics.dl20.wp.tsv.gz \
-topicreader TsvInt \
- -output runs/run.msmarco-passage.bm25-default.topics.dl20.wp.txt \
+ -output runs/run.msmarco-passage-wp.bm25-default.topics.dl20.wp.txt \
-bm25 -pretokenized &
```

Evaluation can be performed using `trec_eval`:

```
- tools/eval/trec_eval.9.0.4/trec_eval -m map -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl20.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -m ndcg_cut.10 -c src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl20.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -m recall.100 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl20.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -m recall.1000 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage.bm25-default.topics.dl20.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m map -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl20.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m ndcg_cut.10 -c src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl20.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m recall.100 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl20.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -m recall.1000 -c -l 2 src/main/resources/topics-and-qrels/qrels.dl20-passage.txt runs/run.msmarco-passage-wp.bm25-default.topics.dl20.wp.txt
```

## Effectiveness
14 changes: 7 additions & 7 deletions docs/regressions-msmarco-passage-wp.md
@@ -22,11 +22,11 @@ Typical indexing command:
```
target/appassembler/bin/IndexCollection \
-collection JsonCollection \
- -input /path/to/msmarco-passage \
+ -input /path/to/msmarco-passage-wp \
-index indexes/lucene-index.msmarco-passage-wp/ \
-generator DefaultLuceneDocumentGenerator \
-threads 9 -storePositions -storeDocvectors -storeRaw -pretokenized \
- >& logs/log.msmarco-passage &
+ >& logs/log.msmarco-passage-wp &
```

The directory `/path/to/msmarco-passage-wp/` should be a directory containing the corpus in Anserini's jsonl format.
@@ -45,17 +45,17 @@ target/appassembler/bin/SearchCollection \
-index indexes/lucene-index.msmarco-passage-wp/ \
-topics src/main/resources/topics-and-qrels/topics.msmarco-passage.dev-subset.wp.tsv.gz \
-topicreader TsvInt \
- -output runs/run.msmarco-passage.bm25-default.topics.msmarco-passage.dev-subset.wp.txt \
+ -output runs/run.msmarco-passage-wp.bm25-default.topics.msmarco-passage.dev-subset.wp.txt \
-bm25 -pretokenized &
```

Evaluation can be performed using `trec_eval`:

```
- tools/eval/trec_eval.9.0.4/trec_eval -c -m map src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -c -M 10 -m recip_rank src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
- tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -c -m map src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage-wp.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -c -M 10 -m recip_rank src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage-wp.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.100 src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage-wp.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
+ tools/eval/trec_eval.9.0.4/trec_eval -c -m recall.1000 src/main/resources/topics-and-qrels/qrels.msmarco-passage.dev-subset.txt runs/run.msmarco-passage-wp.bm25-default.topics.msmarco-passage.dev-subset.wp.txt
```

## Effectiveness
2 changes: 1 addition & 1 deletion src/main/resources/regression/dl19-passage-wp.yaml
@@ -1,5 +1,5 @@
---
- corpus: msmarco-passage
+ corpus: msmarco-passage-wp
corpus_path: collections/msmarco/msmarco-passage-wp

index_path: indexes/lucene-index.msmarco-passage-wp/
2 changes: 1 addition & 1 deletion src/main/resources/regression/dl20-passage-wp.yaml
@@ -1,5 +1,5 @@
---
- corpus: msmarco-passage
+ corpus: msmarco-passage-wp
corpus_path: collections/msmarco/msmarco-passage-wp

index_path: indexes/lucene-index.msmarco-passage-wp/
2 changes: 1 addition & 1 deletion src/main/resources/regression/msmarco-passage-wp.yaml
@@ -1,5 +1,5 @@
---
- corpus: msmarco-passage
+ corpus: msmarco-passage-wp
corpus_path: collections/msmarco/msmarco-passage-wp

index_path: indexes/lucene-index.msmarco-passage-wp/
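
The YAML changes above bring the `corpus` name in line with the last component of `corpus_path`. A minimal sketch of a check for that consistency follows. The convention that the two should match is an assumption drawn from this diff, not a rule the repository is known to enforce, and both helper names are hypothetical. The parser handles only flat `key: value` lines and is not a general YAML parser.

```python
from pathlib import PurePosixPath


def read_top_level_keys(text: str) -> dict:
    """Parse flat 'key: value' lines; enough for the fields shown in this diff."""
    fields = {}
    for line in text.splitlines():
        # Skip indented lines, comments, and the '---' document marker.
        if ":" in line and not line.startswith((" ", "#", "-")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def corpus_matches_path(text: str) -> bool:
    """Hypothetical check: corpus name equals the last component of corpus_path."""
    fields = read_top_level_keys(text)
    return fields["corpus"] == PurePosixPath(fields["corpus_path"]).name


before = """---
corpus: msmarco-passage
corpus_path: collections/msmarco/msmarco-passage-wp
"""

after = """---
corpus: msmarco-passage-wp
corpus_path: collections/msmarco/msmarco-passage-wp
"""

print(corpus_matches_path(before))
print(corpus_matches_path(after))
```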