Complete end-to-end documentation for single-node dstlr #20

Open
lintool opened this issue Oct 2, 2019 · 5 comments

@lintool
Member

lintool commented Oct 2, 2019

We need complete end-to-end documentation for a single-node dstlr:

  • Ingesting Washington Post into Solr.
  • Running extraction on a subset of the docs. (I understand that extraction over the entire corpus might be unrealistic on a single node.)
  • Running enrichment.
  • Running sample data cleaning queries.

We have parts here and there already, but I'd like documentation down to the level of "copy and paste these commands" into a shell... and it should just work.

@ryan-clancy
Member

I've started a branch here for the updated documentation. So far I've added instructions for building dstlr and for fixing an issue with CoreNLP 3.8 and Spark, added the Anserini/Solrini instructions, and updated some of the neo4j docs.

@x389liu Are you able to flesh out more of the Running section? It would be good to point out what needs changing in each of the scripts (e.g., the neo4j password, the amount of memory, the number of executors and cores, etc.).

@x389liu
Member

x389liu commented Oct 2, 2019

@r-clancy yeah, I'll add more details to that branch.

@x389liu
Member

x389liu commented Oct 9, 2019

@lintool Ryan and I have added detailed instructions on running single-node dstlr in #26.
I think this issue can be closed?

@x389liu
Member

x389liu commented Oct 9, 2019

Talked to @r-clancy. Before we close this issue, we'd like to run dstlr on a single himrod node following these instructions, to check whether more details are needed.

@lintool
Member Author

lintool commented Feb 11, 2020

Bumping this. @x389liu, you should work on this.
The Core18 instructions in the README can now simply be replaced by a pointer to the Solrini docs in Anserini.
