The goal of this project is to implement a primitive version of Elasticsearch. It is purely for educational purposes and is not meant to be used in any production environment.
While simplistic, this project should be modular and open to future extension, even if none is planned.
The main dependencies are:
- Python 3.8 or older.
- MongoDB.
The current version of the project supports only a locally installed MongoDB instance as its database.
- Install the project with
  poetry install
- Run
  poetry run cli
  to access the CLI. It shows the schema of the project; the part that will interest you most is the COMMANDS section.
- Pick a command from the COMMANDS section and run it. For example:
  poetry run cli insert_document --name='MyDocument' --content='I love reading tutorials'
  If the command you want to run requires arguments, the CLI will let you know.
The basic commands are insert_document and search. The insert_document command populates the database with documents, and the search command lets you search through your documents with a query.
The project consists of three main parts:
- Database - functionality to store data and use it for search purposes.
  - Add documents.
  - Delete documents.
  - Clear the entire database.
- Search engine - functionality to run a query against the database and return the result. For simplicity, only one type of search is available: word match, which returns the documents with the closest word match, where longer words carry more weight.
  - The algorithm to handle the search.
  - Search across a specific collection.
  - Limit query output.
- CLI - the program is accessed through a command-line interface.
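
To make the database and search parts more concrete, here is a minimal sketch in Python. It is not the project's actual code: the InMemoryStore class, the tokenize, score and search functions, and the "default" collection name are all hypothetical. The in-memory dictionary only stands in for MongoDB, and scoring by the summed length of matching words is just one plausible reading of "longer words have more weight".

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and return its distinct words as a set."""
    return set(re.findall(r"\w+", text.lower()))


class InMemoryStore:
    """Hypothetical stand-in for the MongoDB-backed database layer."""

    def __init__(self):
        # collection name -> {document name -> content}
        self._collections = defaultdict(dict)

    def add_document(self, name, content, collection="default"):
        self._collections[collection][name] = content

    def delete_document(self, name, collection="default"):
        # Deleting a document that does not exist is a no-op.
        self._collections[collection].pop(name, None)

    def clear(self):
        # Drop every collection and every document.
        self._collections.clear()

    def documents(self, collection="default"):
        # Return a copy so callers cannot mutate the store by accident.
        return dict(self._collections.get(collection, {}))


def score(query, content):
    """Assumed weighting: sum the lengths of the query words found in
    the content, so a longer matching word counts for more."""
    return sum(len(word) for word in tokenize(query) & tokenize(content))


def search(store, query, collection="default", limit=10):
    """Return up to `limit` (name, score) pairs, best match first."""
    scored = [
        (name, score(query, content))
        for name, content in store.documents(collection).items()
    ]
    # Keep only documents that share at least one word with the query.
    matches = [pair for pair in scored if pair[1] > 0]
    # Highest score first; ties are ordered by document name.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return matches[:limit]


store = InMemoryStore()
store.add_document("MyDocument", "I love reading tutorials")
store.add_document("Other", "I love cats")
results = search(store, "love tutorials", limit=5)
# results == [("MyDocument", 13), ("Other", 4)]
```

In this sketch "MyDocument" ranks first because it matches both "love" (4 letters) and "tutorials" (9 letters), while "Other" matches only "love". Keeping the store behind a small class is one way to stay open to other backends later, in line with the modularity goal stated at the top.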