Package for identifying the topics present in a collection of text documents and creating summaries of texts

CarlosSanabriaM/topics_and_summary

Topics and Summary

Example of some of the topics obtained with this tool on the 20_newsgroups dataset

What is it?

topics_and_summary is a library that identifies the topics present in a collection of text documents and assigns each document to one of those topics. It also generates summaries of text documents. Both tasks rely on NLP techniques.

Main Features

  • Identify the topics present in the collection of documents.
  • Identify how each document in the collection relates to each topic.
  • Classify each document in the collection into a topic.
  • Identify how a given text document relates to each topic.
  • Classify a given text document into a topic.
  • Obtain the most representative documents of each topic.
  • Obtain the documents of the collection most related to a given text document.
  • Create an extractive summary of a given text document.
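
To illustrate what "extractive summary" means in the last item, here is a minimal sketch in plain Python. It is not the API of topics_and_summary and does not claim to reproduce the algorithm the library uses; the function names, the tiny stop-word list, and the word-frequency scoring are all assumptions chosen for illustration. The idea is only that an extractive summary is built by selecting existing sentences from the text rather than writing new ones.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real pipelines use larger ones).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep the words that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average frequency of the sentence's words, so long sentences
        # are not favoured just for being long.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Calling `extractive_summary(text, num_sentences=2)` on a short text returns the two sentences whose words are most frequent across the whole text, copied verbatim from the input.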

Dependencies, Installation and Usage

All this information and more is covered in the documentation. To generate it, execute:

cd topics_and_summary/docs
./generate-api-doc.sh

The documentation will be generated in HTML format in the folder topics_and_summary/docs/build/html. The index.html file is the main page of the documentation.
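
Once the documentation has been generated, the main page can be located from Python. The snippet below is a small convenience sketch, not part of the package: `docs_index_uri` is a hypothetical helper, and it assumes the path topics_and_summary/docs/build/html is relative to the directory from which the `cd` command above was run.

```python
from pathlib import Path


def docs_index_uri(base_dir="."):
    """Build a file:// URI for the generated index.html.

    base_dir is assumed to be the directory that contains the
    topics_and_summary folder (hypothetical parameter).
    """
    index = Path(base_dir) / "topics_and_summary" / "docs" / "build" / "html" / "index.html"
    # resolve() makes the path absolute, which as_uri() requires.
    return index.resolve().as_uri()
```

Passing the result to the standard library's `webbrowser.open` would open the page in the default browser, provided the documentation has already been built.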
