Parse_wikidump: programmatic access #28
…ality, e.g., initialize db, and initiate downloads via code.
help='Download snapshot if it does not exist as snapshot.xml.bz2. The corpus file name should match that of snapshot.')
parser.add_argument('-N', '--ngram', dest='ngram', default=7, type=int,
                    help='Maximum order of ngrams, set to None to disable [default: 7].')
args = parser.parse_args()
I think @IsaacHaze will be very unhappy when he sees his darling docopt replaced by ArgumentParser...
Oh, I didn't know. I was inspired by how things are done in xtas.
That was written before I knew about docopt :)
:'(
OK, so docopt is the current default for parsing args?
Yep.
Alright. I'll revert the arg parsing logic to use docopt, then.
I see how this is a useful feature, but I think it should be implemented differently, viz. by moving functionality from
Good points. Let me work on it a bit more; you will hear from me soon.
…unction in __init__ that takes care of downloading.
But apart from the sadness, I'm all for this change (I did something similar in my joblib branch).
Refactored parse_wikidump to allow programmatic access to its functionality, e.g., initialize db, and initiate downloads via code.
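For readers unfamiliar with the pattern, here is a minimal sketch of what "programmatic access" can look like: argument parsing lives in a thin `main()` wrapper, while the real work sits in a plain function that other code can import and call. This is not the code from this PR. The names `build_parser`, `run`, the `--download` flag, and the `snapshot` positional argument are hypothetical, and the steps are returned as strings instead of actually downloading or touching a database. It uses `argparse` only to stay dependency-free; the thread above settled on docopt for the real command-line interface.

```python
import argparse


def build_parser():
    # Hypothetical CLI definition; only -N/--ngram mirrors the diff above.
    parser = argparse.ArgumentParser(description="Parse a wiki dump (sketch).")
    parser.add_argument("snapshot", help="Snapshot name (hypothetical argument).")
    parser.add_argument("--download", action="store_true",
                        help="Download the snapshot first (hypothetical flag).")
    parser.add_argument("-N", "--ngram", dest="ngram", default=7, type=int,
                        help="Maximum order of ngrams [default: 7].")
    return parser


def run(snapshot, download=False, ngram=7):
    """Library entry point: callable from code, never reads sys.argv.

    Returns the planned steps as strings; a real implementation would
    perform the download, database setup, and parsing here.
    """
    steps = []
    if download:
        steps.append("download %s.xml.bz2" % snapshot)
    steps.append("initialize db")
    steps.append("parse %s with ngram=%s" % (snapshot, ngram))
    return steps


def main(argv=None):
    # Thin CLI wrapper: parse arguments, then delegate to run().
    args = build_parser().parse_args(argv)
    return run(args.snapshot, download=args.download, ngram=args.ngram)


# Programmatic use, bypassing the command line entirely.
for step in run("example-snapshot", download=True):
    print(step)
```

Because `main()` accepts an explicit `argv` list, the command-line path can be exercised from tests as well, e.g. `main(["example-snapshot", "-N", "3"])`.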