GitHub - udeyrishi/alchemy_dir_analyzer: App for NLP-ing entire directories using Alchemy API

##Alchemy Directory Analyzer

A text analysis app that uses Alchemy API to analyze all the plain text documents in a directory.

###Dependencies

This project has been built and tested using Python 2.7.11. Get all the required packages as follows:

# Install all pip requirements
$ pip install -r requirements.txt

# Install Python Alchemy API (submodule)
$ git submodule init && git submodule update

###Usage

$ chmod +x ./main.py

$ ./main.py -h
usage: main.py [-h] [-r] [-v] key source destination

positional arguments:
  key              The AlchemyAPI key
  source           The path to the source directory
  destination      The path to the destination directory. Directories will be
                   created if they don't exist

optional arguments:
  -h, --help       show this help message and exit
  -r, --recursive  Recursively process subdirectories
  -v, --verbose    Enable verbose log output
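
For readers who want to script around this interface, the help text above can be approximated with `argparse`. This is only a sketch built from the usage output: the function name `build_parser` is hypothetical, and the project's actual `main.py` may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the usage text above (a sketch, not the project's code)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("key", help="The AlchemyAPI key")
    parser.add_argument("source", help="The path to the source directory")
    parser.add_argument(
        "destination",
        help="The path to the destination directory. Directories will be "
             "created if they don't exist",
    )
    # Both flags are plain on/off switches, so they can be combined as -rv.
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="Recursively process subdirectories")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose log output")
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["-rv", "KEY", "path/to/src", "path/to/dest"])
print(args.recursive, args.verbose)  # prints: True True
```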

Example usage:

$ ./main.py -rv 1234567890098765432112345678900987654321 path/to/src path/to/dest
2016-04-16 22:49:14,009 alchemy_dir_analyzer INFO     Starting analysis
2016-04-16 22:49:14,009 alchemy_dir_analyzer DEBUG    Processing file: path/to/src/f1.txt >> Output file: path/to/dest/f1.txt.json
...
2016-04-16 22:49:18,528 alchemy_dir_analyzer INFO     Analysis finished. Results in path/to/dest

Note that files whose names start with a '.' (hidden system files on Linux and OS X) are skipped. This applies to the source directory and, if -r is used, to its subdirectories.
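
The skipping rule can be pictured with a short directory walk. The sketch below is an illustration only: `iter_source_files` is a hypothetical helper, not a function from this project, and it assumes that only file names are checked for a leading '.' (hidden directories get no special treatment here).

```python
import os


def iter_source_files(source, recursive=False):
    """Yield paths of files under `source`, skipping names that start with '.'."""
    for root, dirs, files in os.walk(source):
        if not recursive:
            # Emptying dirs in place stops os.walk from descending further.
            dirs[:] = []
        for name in sorted(files):
            if not name.startswith("."):
                yield os.path.join(root, name)
```

Calling `list(iter_source_files("path/to/src", recursive=True))` would list every non-hidden file under `path/to/src`, which corresponds to running the tool with `-r`.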

###Output

The destination directory mirrors the structure of the source directory, except that every file name has '.json' appended. Each output file contains the corresponding Alchemy API analysis as a JSON object, and its fields have the same meaning as specified in the Alchemy API documentation. Some unimportant fields, such as 'url', 'usage', 'status', 'statusInfo', and 'totalTransactions', have been removed from the API results.
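
The path mapping and field removal described above could look roughly like this. Both helpers and the `DROPPED_FIELDS` tuple are hypothetical names introduced for illustration; the tuple lists only the fields named in this README, and the project may drop others as well.

```python
import os

# Fields this README names as removed; the real list may be longer.
DROPPED_FIELDS = ("url", "usage", "status", "statusInfo", "totalTransactions")


def destination_path(source_root, destination_root, source_file):
    """Mirror source_file's position under destination_root and append '.json'."""
    relative = os.path.relpath(source_file, source_root)
    return os.path.join(destination_root, relative) + ".json"


def strip_fields(response, dropped=DROPPED_FIELDS):
    """Return a copy of one API response dict without the dropped top-level keys."""
    return {key: value for key, value in response.items() if key not in dropped}
```

With these definitions, `destination_path("path/to/src", "path/to/dest", "path/to/src/f1.txt")` yields the same `path/to/dest/f1.txt.json` shown in the example log output.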

Sample result JSON:

{
    "category": {
        "category": "arts_entertainment",
        "language": "english",
        "score": "0.85"
    },
    "concepts": {
        "concepts": [
            {
                "dbpedia": "http://dbpedia.org/resource/Film",
                "freebase": "http://rdf.freebase.com/ns/m.02vxn",
                "opencyc": "http://sw.opencyc.org/concept/Mx4rwP19XJwpEbGdrcN5Y29ycA",
                "relevance": "0.962047",
                "text": "Film"
            },
            ...
        ],
        "language": "english"
    },
    "doc_sentiment": {
        "docSentiment": {
            "mixed": "1",
            "score": "-0.314192",
            "type": "negative"
        },
        "language": "english"
    },
    "entities": {
        "entities": [
            {
                "count": "2",
                "disambiguated": {
                    "dbpedia": "http://dbpedia.org/resource/Melissa_Sagemiller",
                    "freebase": "http://rdf.freebase.com/ns/m.0b6z88",
                    "name": "Melissa Sagemiller",
                    "subType": [
                        "Actor",
                        "FilmActor",
                        "TVActor"
                    ],
                    "yago": "http://yago-knowledge.org/resource/Melissa_Sagemiller"
                },
                "relevance": "0.737991",
                "sentiment": {
                    "score": "-0.714528",
                    "type": "negative"
                },
                "text": "melissa sagemiller",
                "type": "Person"
            },
            {
                "count": "1",
                "relevance": "0.584015",
                "sentiment": {
                    "score": "0.357154",
                    "type": "positive"
                },
                "text": "the deal",
                "type": "FieldTerminology"
            },
            ...
        ],
        "language": "english"
    },
    "keywords": {
        "keywords": [
            {
                "relevance": "0.943157",
                "sentiment": {
                    "score": "0.310866",
                    "type": "positive"
                },
                "text": "pretty neat concept"
            },
            ...
        ],
        "language": "english"
    }
}
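
Because numeric values such as "relevance" and "score" are stored as strings in the sample above, they need converting before comparison or sorting. The following sketch shows one way to read a result and rank its entities. `load_result` and `top_entities` are hypothetical helpers, and the inline `sample` dict is a trimmed copy of the sample result, not real tool output.

```python
import json


def load_result(path):
    """Load one '<name>.json' output file into a dict."""
    with open(path) as handle:
        return json.load(handle)


def top_entities(result, min_relevance=0.5):
    """Return (text, type, relevance) tuples sorted by descending relevance."""
    entities = result.get("entities", {}).get("entities", [])
    # Relevance is a string in the output, so convert it to float first.
    rows = [(e["text"], e["type"], float(e["relevance"])) for e in entities]
    rows = [row for row in rows if row[2] >= min_relevance]
    return sorted(rows, key=lambda row: row[2], reverse=True)


sample = {"entities": {"entities": [
    {"text": "the deal", "type": "FieldTerminology", "relevance": "0.584015"},
    {"text": "melissa sagemiller", "type": "Person", "relevance": "0.737991"},
]}}
print(top_entities(sample, min_relevance=0.6))
# prints: [('melissa sagemiller', 'Person', 0.737991)]
```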

