diff --git a/docs/_toc.yml b/docs/_toc.yml
index e3ac587626..be2c3cdc66 100644
--- a/docs/_toc.yml
+++ b/docs/_toc.yml
@@ -78,8 +78,14 @@ parts:
     - file: source/contribute/debugging
       title: Debugging EvaDB

-    - file: source/contribute/new-command
+    - file: source/contribute/extend
       title: Extending EvaDB
+      sections:
+        - file: source/contribute/new-data-source
+          title: Structured Data Source Integration
+        - file: source/contribute/new-command
+          title: Operators
+
+
     - file: source/contribute/release
       title: Releasing EvaDB
diff --git a/docs/source/contribute/extend.rst b/docs/source/contribute/extend.rst
new file mode 100644
index 0000000000..30f42778cc
--- /dev/null
+++ b/docs/source/contribute/extend.rst
@@ -0,0 +1,5 @@
+Extending EvaDB
+====
+This document details the steps involved in extending EvaDB.
+
+.. tableofcontents::
diff --git a/docs/source/contribute/new-command.rst b/docs/source/contribute/new-command.rst
index 8ce2ccdb1d..337a78adc9 100644
--- a/docs/source/contribute/new-command.rst
+++ b/docs/source/contribute/new-command.rst
@@ -1,4 +1,4 @@
-Extending EvaDB
+Operators / Commands
 =============
 This document details the steps involved in adding support for a new operator (or command) in EvaDB. We illustrate the process using a DDL command.

diff --git a/docs/source/contribute/new-data-source.rst b/docs/source/contribute/new-data-source.rst
new file mode 100644
index 0000000000..d8d37483b9
--- /dev/null
+++ b/docs/source/contribute/new-data-source.rst
@@ -0,0 +1,89 @@
+Structured Data Source Integration
+====
+This document details the steps involved in adding a new structured data source integration to EvaDB.
+
+
+Example Data Source Integration In EvaDB
+----
+
+- `PostgreSQL `_
+
+
+Create Data Source Handler
+----
+
+1. Create a new directory at `evadb/third_party/databases/ `_
+~~~~
+
+.. note::
+
+    The directory name is also the engine name used in the `CREATE DATABASE mydb_source WITH ENGINE = "..."` statement. In this document, we use **mydb** as the example data source we want to integrate into EvaDB.
+
+The directory should contain three files:
+
+- __init__.py
+- requirements.txt
+- mydb_handler.py
+
+The *__init__.py* can contain copyright information. The *requirements.txt* lists the extra Python libraries that need to be installed via pip for the mydb data source.
+
+.. note::
+
+    EvaDB installs a data source's specific dependency libraries only when the user creates a connection to that data source, e.g., via `CREATE DATABASE mydb_source WITH ENGINE = "mydb";`.
+
+2. Implement the data source handler
+~~~~
+
+In *mydb_handler.py*, you need to implement the `DBHandler` declared at `evadb/third_party/databases/types.py `_. You need to implement seven functions:
+
+.. code:: python
+
+    class MydbHandler(DBHandler):
+
+        def __init__(self, name: str, **kwargs):
+            ...
+        def connect(self):
+            ...
+        def disconnect(self):
+            ...
+        def check_connection(self) -> DBHandlerStatus:
+            ...
+        def get_tables(self) -> DBHandlerResponse:
+            ...
+        def get_columns(self, table_name: str) -> DBHandlerResponse:
+            ...
+        def execute_native_query(self, query_string: str) -> DBHandlerResponse:
+            ...
+
+*get_tables* should retrieve the list of tables from the data source, and *get_columns* should retrieve the columns of a specified table. *execute_native_query* specifies how to execute a query through the data source's engine. For more details, check the function signatures and documentation at `evadb/third_party/databases/types.py `_.
+
+You can get the data source's configuration parameters from `__init__(self, name: str, **kwargs)`. Below is an example:
+
+.. code:: python
+
+    def __init__(self, name: str, **kwargs):
+        super().__init__(name)
+        self.host = kwargs.get("host")
+        self.port = kwargs.get("port")
+        self.user = kwargs.get("user")
+        self.password = kwargs.get("password")
+
+.. note::
+
+    These parameters are specified when the user creates a connection to the data source: `CREATE DATABASE mydb_source WITH ENGINE = "mydb", PARAMETERS = {"host": "localhost", "port": "5432", "user": "eva", "password": "password"};`.
+
+You can check the PostgreSQL handler at `evadb/third_party/databases/postgres/postgres_handler.py `_ for ideas.
+
+
+Register the Data Source Handler
+----
+
+Add your data source handler to the `get_database_handler` function in `evadb/third_party/databases/interface.py `_. Below is an example of registering the mydb data source:
+
+.. code:: python
+
+    ...
+    elif engine == "mydb":
+        return mod.MydbHandler(engine, **kwargs)
+    ...
+
diff --git a/evadb/third_party/databases/types.py b/evadb/third_party/databases/types.py
index bdedb3a4be..5fc547adc2 100644
--- a/evadb/third_party/databases/types.py
+++ b/evadb/third_party/databases/types.py
@@ -53,7 +53,7 @@ class DBHandler:
         name (str): The name associated with the database handler instance.
     """

-    def __init__(self, name: str):
+    def __init__(self, name: str, **kwargs):
         self.name = name

     def connect(self):
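As a rough illustration of the handler interface documented in this change, the sketch below fills in the seven methods using Python's built-in sqlite3 module as a stand-in for the hypothetical mydb client. The `DBHandler`, `DBHandlerStatus`, and `DBHandlerResponse` classes defined here are simplified stand-ins so the example runs on its own; their fields, the `database` parameter, and the return shapes are assumptions, not the definitions in evadb/third_party/databases/types.py.

```python
import sqlite3
from dataclasses import dataclass
from typing import Any, Optional


# Stand-ins for the types in evadb/third_party/databases/types.py.
# Their fields are assumptions made so this sketch is self-contained.
@dataclass
class DBHandlerStatus:
    status: bool
    error: Optional[str] = None


@dataclass
class DBHandlerResponse:
    data: Any
    error: Optional[str] = None


class DBHandler:
    def __init__(self, name: str, **kwargs):
        self.name = name


class MydbHandler(DBHandler):
    """Hypothetical handler; sqlite3 plays the role of the "mydb" client."""

    def __init__(self, name: str, **kwargs):
        super().__init__(name)
        # "database" is an assumed parameter name for this sketch.
        self.database = kwargs.get("database", ":memory:")
        self.connection = None

    def connect(self):
        self.connection = sqlite3.connect(self.database)

    def disconnect(self):
        if self.connection is not None:
            self.connection.close()
            self.connection = None

    def check_connection(self) -> DBHandlerStatus:
        if self.connection is None:
            return DBHandlerStatus(status=False, error="Not connected.")
        return DBHandlerStatus(status=True)

    def get_tables(self) -> DBHandlerResponse:
        response = self.execute_native_query(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
        if response.error is None:
            # Flatten one-column rows into a plain list of table names.
            response.data = [row[0] for row in response.data]
        return response

    def get_columns(self, table_name: str) -> DBHandlerResponse:
        if self.connection is None:
            return DBHandlerResponse(data=None, error="Not connected.")
        # Escape double quotes so the table name is a safe quoted identifier.
        quoted = table_name.replace('"', '""')
        try:
            cursor = self.connection.execute(f'SELECT * FROM "{quoted}" LIMIT 0')
            # cursor.description holds one 7-tuple per column; item 0 is the name.
            return DBHandlerResponse(data=[col[0] for col in cursor.description])
        except sqlite3.Error as e:
            return DBHandlerResponse(data=None, error=str(e))

    def execute_native_query(self, query_string: str) -> DBHandlerResponse:
        if self.connection is None:
            return DBHandlerResponse(data=None, error="Not connected.")
        try:
            cursor = self.connection.execute(query_string)
            rows = cursor.fetchall()
            self.connection.commit()
            return DBHandlerResponse(data=rows)
        except sqlite3.Error as e:
            # Report failures through the response instead of raising.
            return DBHandlerResponse(data=None, error=str(e))


if __name__ == "__main__":
    handler = MydbHandler("mydb", database=":memory:")
    handler.connect()
    handler.execute_native_query("CREATE TABLE t (id INTEGER, name TEXT)")
    print(handler.get_tables().data)
    print(handler.get_columns("t").data)
    handler.disconnect()
```

Returning errors inside the response object, rather than raising, is one possible design choice here; a real handler should follow whatever convention the `DBHandler` documentation in types.py prescribes.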