DennisMcWherter/spark-common-crawl

Misc. Spark Common Crawl

Miscellaneous examples of using Spark to analyze Common Crawl data.

These scripts were originally written for some simple evaluations. Use them at your own risk, and as an example of how to work with the data.

I copied the Common Crawl datasets from S3 to a local HDFS cluster.
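As a rough illustration of what working with the data can look like, here is a minimal sketch (not taken from this repository) of a plain-Python parser for WET-style text records. The function name `parse_wet_records`, the sample data, and the HDFS path in the comments are all hypothetical. The parser assumes that every record starts at a line beginning with `WARC/` and ignores `Content-Length`, which is a simplification of the real WARC format.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Tuple


def _emit(headers: Dict[str, str], body: List[str]) -> Optional[Tuple[str, str]]:
    """Return (target_uri, text) for a finished record, or None if it has no URI."""
    uri = headers.get("WARC-Target-URI")
    if uri is None:
        return None  # e.g. the leading "warcinfo" record
    return (uri, "\n".join(body).strip())


def parse_wet_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (target_uri, text) pairs from an iterable of WET-style lines.

    Simplifying assumption: any line starting with "WARC/" opens a new
    record, so a body line with that prefix would be misread.
    """
    headers: Dict[str, str] = {}
    body: List[str] = []
    in_headers = False
    seen_record = False

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("WARC/"):
            # Close the previous record (if any) before starting a new one.
            if seen_record:
                record = _emit(headers, body)
                if record is not None:
                    yield record
            headers, body = {}, []
            in_headers, seen_record = True, True
        elif in_headers:
            if line == "":
                in_headers = False  # a blank line ends the header block
            else:
                key, _, value = line.partition(":")
                headers[key.strip()] = value.strip()
        elif seen_record:
            body.append(line)

    # Flush the final record once the input is exhausted.
    if seen_record:
        record = _emit(headers, body)
        if record is not None:
            yield record


# Hypothetical Spark usage (assumes an existing SparkContext `sc` and that
# no record is split across partitions; the path is a placeholder):
#
#   pages = sc.textFile("hdfs:///path/to/wet/files").mapPartitions(parse_wet_records)

SAMPLE = """WARC/1.0
WARC-Type: warcinfo
Content-Length: 0

WARC/1.0
WARC-Type: conversion
WARC-Target-URI: http://example.com/
Content-Length: 11

Hello world

WARC/1.0
WARC-Type: conversion
WARC-Target-URI: http://example.org/page
Content-Length: 17

Line one
Line two
"""

if __name__ == "__main__":
    for uri, text in parse_wet_records(SAMPLE.splitlines()):
        print(uri, len(text))
```

Keeping the parser free of Spark imports means it can be tested locally on a few lines before being passed to `mapPartitions` on the cluster.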

About

Spark examples for parsing common crawl data
