nispio/gutenbergwords


gutenbergwords

Script used to download random books from Project Gutenberg's catalog and strip out all punctuation.

Usage: ./getbooks.sh [-f FILE] [[-d FOLDER] | [-o OUT]]
       ./getbooks.sh [-n NUM] [-m URL] [-d FOLDER]
       ./getbooks.sh [-b BOOK] [-m URL] [[-d FOLDER] | [-o OUT]]
Retrieve book(s) from Project Gutenberg catalog and strip the file
of all characters except A-Z and <space>.
  -f FILE        open and parse the book found at FILE
                   (this will ignore options -b -n -m)
  -b BOOK        download and parse book number BOOK from PG catalog
                   (this will ignore option -n)
  -n NUM         download NUM random books (default 1)
  -d FOLDER      save results to FOLDER (default "./words/")
  -o FILE        save results to output file FILE
                   (do not use with option -n)
  -m URL         retrieve books from mirror at URL
                   (default "ftp://mirrors.xmission.com/gutenberg")
  -u             do not strip header and footer
  -l             language-agnostic download
  -v             show verbose output (for debugging)
  -h, --help     display this message and exit
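
The two text-processing steps described above (dropping the Project Gutenberg header and footer, then reducing the text to A-Z and spaces) can be sketched as follows. This is a minimal illustration, not the code in getbooks.sh: the function names `strip_header_footer` and `letters_only` are hypothetical, the sketch assumes the book marks its body with lines beginning `*** START OF` and `*** END OF`, and it assumes lowercase letters are folded to uppercase rather than discarded.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not taken from getbooks.sh.

# Keep only the lines between the Project Gutenberg START and END markers.
# Assumption: the markers look like "*** START OF ..." and "*** END OF ..."
# (with or without a space after the asterisks).
strip_header_footer() {
    awk '
        /^\*\*\* ?END OF/   { inside = 0 }   # stop before the END marker line
        inside              { print }        # emit body lines only
        /^\*\*\* ?START OF/ { inside = 1 }   # start after the START marker line
    '
}

# Reduce text to the characters A-Z and <space>.
# Assumption: lowercase is converted to uppercase first; every run of other
# characters (punctuation, digits, newlines) collapses to a single space.
letters_only() {
    LC_ALL=C tr 'a-z' 'A-Z' | LC_ALL=C tr -cs 'A-Z' ' '
}
```

With these definitions sourced, a locally saved book could be cleaned with something like `strip_header_footer < book.txt | letters_only > book.words`, where both file names are placeholders. Skipping the first stage roughly corresponds to what the `-u` option is documented to do.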

Report bugs to [email protected]
