For this report, a Python script was written to process the Project Gutenberg ebooks and build a subject classifier for them using machine learning techniques. To cope with the size of the data set, Spark was run on City University’s network through its Python API (PySpark).
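As a rough illustration of the approach described above, the sketch below turns each ebook into hashed term counts and trains a classifier with PySpark's RDD-based MLlib API. It is not the script from this repository: the function names (`tokenize`, `hashed_term_counts`, `train_subject_classifier`), the `labels_by_file` mapping, the feature count, and the choice of Naive Bayes are all assumptions made for the example.

```python
import re
import zlib
from collections import Counter

NUM_FEATURES = 1 << 14  # assumed size of the hashed feature space
TOKEN_RE = re.compile(r"[a-z]+")


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return TOKEN_RE.findall(text.lower())


def hashed_term_counts(text, num_features=NUM_FEATURES):
    """Map a document to a sparse {bucket: count} dict via feature hashing.

    zlib.crc32 is used instead of the built-in hash() because it gives the
    same value in every Python process, which matters on Spark workers.
    """
    counts = Counter()
    for token in tokenize(text):
        counts[zlib.crc32(token.encode("utf-8")) % num_features] += 1
    return dict(counts)


def train_subject_classifier(path_glob, labels_by_file):
    """Train a Naive Bayes model on ebooks matched by path_glob.

    labels_by_file is a hypothetical dict mapping each file path, exactly as
    Spark reports it, to a numeric subject label.
    """
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.mllib.classification import NaiveBayes
    from pyspark.mllib.linalg import Vectors
    from pyspark.mllib.regression import LabeledPoint

    sc = SparkContext(appName="ebook-subject-classifier-sketch")
    books = sc.wholeTextFiles(path_glob)  # RDD of (file path, file content)
    points = (
        books.filter(lambda kv: kv[0] in labels_by_file)
        .map(lambda kv: LabeledPoint(
            labels_by_file[kv[0]],
            Vectors.sparse(NUM_FEATURES, hashed_term_counts(kv[1])),
        ))
    )
    model = NaiveBayes.train(points, 1.0)  # 1.0 is the smoothing parameter
    sc.stop()
    return model
```

Keeping the tokenising and hashing steps as plain Python functions means they can be checked on a single machine before being mapped over the full data set on the cluster.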
ilektram/pySpark-for-ebook-Classification