feat: UDF migrates to Function (#1034)
jiashenC committed Sep 5, 2023
1 parent 9d78832 commit 18f98de
Showing 199 changed files with 2,435 additions and 1,927 deletions.
7 changes: 6 additions & 1 deletion .gitignore
@@ -205,8 +205,13 @@ dep.txt
*eva_data/
*evadb_data/

# models
# models, but not apply to codebase
models/
!evadb/models
!evadb/catalog/models
!test/unit_tests/models
!test/unit_tests/catalog/models


# test files
test.py
2 changes: 1 addition & 1 deletion README.md
@@ -196,7 +196,7 @@ SELECT ChatGPT('Is this video summary related to Ukraine russia war', text)
* Train an ML model using the <a href="https://ludwig.ai/latest/">Ludwig AI</a> engine to predict a column in a table.

```sql
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
CREATE FUNCTION IF NOT EXISTS PredictHouseRent FROM
( SELECT * FROM HomeRentals )
TYPE Ludwig
'predict' 'rental_price'
4 changes: 2 additions & 2 deletions docs/source/dev-guide/packaging.rst
@@ -12,8 +12,8 @@ Models

Please follow the following steps to package models:

* Create a folder with a descriptive name. This folder name will be used by the UDF that is invoking your model.
* Place all files used by the UDF inside this folder. These are typically:
* Create a folder with a descriptive name. This folder name will be used by the function that is invoking your model.
* Place all files used by the function inside this folder. These are typically:
* Model weights (The .pt files that contain the actual weights)
* Model architectures (The .pt files that contain model architecture information)
* Label files (Extra files that are used in the process of model inference for outputting labels.)
8 changes: 4 additions & 4 deletions docs/source/overview/getting-started.rst
@@ -70,7 +70,7 @@ The program runs a SQL query for listing all the built-in functions in EvaDB. It
cursor = evadb.connect().cursor()
# List all the built-in functions in EvaDB
print(cursor.query("SHOW UDFS;").df())
print(cursor.query("SHOW FUNCTIONS;").df())
Now, run the Python program:

@@ -83,9 +83,9 @@ You should see a list of built-in functions including but not limited to the fol
.. code-block:: bash
name inputs ... impl metadata
0 ArrayCount [Input_Array NDARRAY ANYTYPE (), Search_Key ANY] ... /home/jarulraj3/evadb/evadb/udfs/ndarray/array... []
1 Crop [Frame_Array NDARRAY UINT8 (3, None, None), bb... ... /home/jarulraj3/evadb/evadb/udfs/ndarray/crop.py []
2 ChatGPT [query NDARRAY STR (1,), content NDARRAY STR (... ... /home/jarulraj3/evadb/evadb/udfs/chatgpt.py []
0 ArrayCount [Input_Array NDARRAY ANYTYPE (), Search_Key ANY] ... /home/jarulraj3/evadb/evadb/functions/ndarray/array... []
1 Crop [Frame_Array NDARRAY UINT8 (3, None, None), bb... ... /home/jarulraj3/evadb/evadb/functions/ndarray/crop.py []
2 ChatGPT [query NDARRAY STR (1,), content NDARRAY STR (... ... /home/jarulraj3/evadb/evadb/functions/chatgpt.py []
[3 rows x 6 columns]
66 changes: 33 additions & 33 deletions docs/source/reference/ai/custom.rst
@@ -1,60 +1,60 @@
.. _udf:

User-Defined Functions
Functions
======================

This section provides an overview of how you can create and use a custom user-defined function (UDF) in your queries. For example, you could write an UDF that wraps around your custom PyTorch model.
This section provides an overview of how you can create and use a custom function in your queries. For example, you could write an function that wraps around your custom PyTorch model.

Part 1: Writing a custom UDF
------------------------------
Part 1: Writing a custom Function
---------------------------------

During each step, use `this UDF implementation <https://github.com/georgia-tech-db/evadb/blob/master/evadb/udfs/yolo_object_detector.py>`_ as a reference.
During each step, use `this function implementation <https://github.com/georgia-tech-db/evadb/blob/master/evadb/functions/yolo_object_detector.py>`_ as a reference.

1. Create a new file under `udfs/` folder and give it a descriptive name. eg: `yolo_object_detection.py`.
1. Create a new file under `functions/` folder and give it a descriptive name. eg: `yolo_object_detection.py`.

.. note::

UDFs packaged along with EvaDB are located inside the `udfs <https://github.com/georgia-tech-db/evadb/tree/master/evadb/udfs>`_ folder.
Functions packaged along with EvaDB are located inside the `functions <https://github.com/georgia-tech-db/evadb/tree/master/evadb/functions>`_ folder.

2. Create a Python class that inherits from `PytorchClassifierAbstractUDF`.
2. Create a Python class that inherits from `PytorchClassifierAbstractFunction`.

* The `PytorchClassifierAbstractUDF` is a parent class that defines and implements standard methods for model inference.
* The `PytorchClassifierAbstractFunction` is a parent class that defines and implements standard methods for model inference.

* The functions setup and forward should be implemented in your child class. These functions can be implemented with the help of Decorators.

Setup
-----

An abstract method that must be implemented in your child class. The setup function can be used to initialize the parameters for executing the UDF. The parameters that need to be set are
An abstract method that must be implemented in your child class. The setup function can be used to initialize the parameters for executing the function. The parameters that need to be set are

- cacheable: bool

- True: Cache should be enabled. Cache will be automatically invalidated when the UDF changes.
- True: Cache should be enabled. Cache will be automatically invalidated when the function changes.
- False: cache should not be enabled.
- udf_type: str
- function_type: str

- object_detection: UDFs for object detection.
- object_detection: functions for object detection.
- batchable: bool

- True: Batching should be enabled
- False: Batching is disabled.

The custom setup operations for the UDF can be written inside the function in the child class. If there is no need for any custom logic, then you can just simply write "pass" in the function definition.
The custom setup operations for the function can be written inside the function in the child class. If there is no need for any custom logic, then you can just simply write "pass" in the function definition.

Example of a Setup function
Example of a Setup Function

.. code-block:: python
@setup(cacheable=True, udf_type="object_detection", batchable=True)
@setup(cacheable=True, function_type="object_detection", batchable=True)
def setup(self, threshold=0.85):
#custom setup function that is specific for the UDF
#custom setup function that is specific for the function
self.threshold = threshold
self.model = torch.hub.load("ultralytics/yolov5", "yolov5s", verbose=False)
Forward
--------

An abstract method that must be implemented in your UDF. The forward function receives the frames and runs the deep learning model on the data. The logic for transforming the frames and running the models must be provided by you.
An abstract method that must be implemented in your function. The forward function receives the frames and runs the deep learning model on the data. The logic for transforming the frames and running the models must be provided by you.
The arguments that need to be passed are

- input_signatures: List[IOColumnArgument]
@@ -91,7 +91,7 @@ A sample forward function is given below
],
)
def forward(self, frames: Tensor) -> pd.DataFrame:
#the custom logic for the UDF
#the custom logic for the function
outcome = []
frames = torch.permute(frames, (0, 2, 3, 1))
@@ -113,39 +113,39 @@ A sample forward function is given below
----------

Part 2: Registering and using the UDF in EvaDB Queries
------------------------------------------------------
Part 2: Registering and using the function in EvaDB Queries
-----------------------------------------------------------

Now that you have implemented your UDF, we need to register it as a UDF in EvaDB. You can then use the UDF in any query.
Now that you have implemented your function, we need to register it as a function in EvaDB. You can then use the function in any query.

1. Register the UDF with a query that follows this template:
1. Register the function with a query that follows this template:

`CREATE UDF [ IF NOT EXISTS ] <name>
`CREATE FUNCTION [ IF NOT EXISTS ] <name>
IMPL <path_to_implementation>;`

where,

* <name> - specifies the unique identifier for the UDF.
* <path_to_implementation> - specifies the path to the implementation class for the UDF
* <name> - specifies the unique identifier for the function.
* <path_to_implementation> - specifies the path to the implementation class for the function

Here, is an example query that registers a UDF that wraps around the 'YoloObjectDetection' model that performs Object Detection.
Here, is an example query that registers a function that wraps around the 'YoloObjectDetection' model that performs Object Detection.

.. code-block:: sql
CREATE UDF YoloDecorators
IMPL 'evadb/udfs/decorators/yolo_object_detection_decorators.py';
CREATE FUNCTION YoloDecorators
IMPL 'evadb/functions/decorators/yolo_object_detection_decorators.py';
A status of 0 in the response denotes the successful registration of this UDF.
A status of 0 in the response denotes the successful registration of this function.

2. Now you can execute your UDF on any video:
2. Now you can execute your function on any video:

.. code-block:: sql
SELECT YoloDecorators(data) FROM MyVideo WHERE id < 5;
3. You can drop the UDF when you no longer need it.
3. You can drop the function when you no longer need it.

.. code-block:: sql
DROP UDF IF EXISTS YoloDecorators;
DROP FUNCTION IF EXISTS YoloDecorators;
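The statements in this diff rename `CREATE UDF`, `DROP UDF`, and `SHOW UDFS` to `CREATE FUNCTION`, `DROP FUNCTION`, and `SHOW FUNCTIONS`. For queries written against the old keywords, a rewrite along the following lines may help. This is a minimal sketch, not part of the commit: `migrate_query` is a hypothetical helper, it only covers the three keyword forms visible above, and it deliberately leaves `IMPL` paths alone, because the renamed file names (for example `openai_chat_completion_udf.py` to `openai_chat_completion_function.py`) do not follow one mechanical rule.

```python
import re

# Keyword renames taken from the statements shown in this diff.
# Each pattern is case-insensitive and anchored on word boundaries.
_RENAMES = [
    (re.compile(r"\bCREATE\s+UDF\b", re.IGNORECASE), "CREATE FUNCTION"),
    (re.compile(r"\bDROP\s+UDF\b", re.IGNORECASE), "DROP FUNCTION"),
    (re.compile(r"\bSHOW\s+UDFS\b", re.IGNORECASE), "SHOW FUNCTIONS"),
]


def migrate_query(query: str) -> str:
    """Hypothetical helper: rewrite old UDF keywords to the FUNCTION keywords.

    Assumption: the query does not contain these keyword pairs inside string
    literals; a plain regex cannot tell SQL keywords from quoted text.
    """
    for pattern, replacement in _RENAMES:
        query = pattern.sub(replacement, query)
    return query


if __name__ == "__main__":
    print(migrate_query("DROP UDF IF EXISTS YoloDecorators;"))
```

Paths such as `evadb/udfs/...` inside `IMPL` clauses still need to be updated by hand against the new `evadb/functions/` layout.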
8 changes: 4 additions & 4 deletions docs/source/reference/ai/hf.rst
@@ -6,14 +6,14 @@ HuggingFace Models
This section provides an overview of how you can use out-of-the-box HuggingFace models in EvaDB.


Creating UDF from HuggingFace
------------------------------
Creating Function from HuggingFace
----------------------------------

EvaDB supports UDFS similar to `Pipelines <https://huggingface.co/docs/transformers/main_classes/pipelines>`_ in HuggingFace.
EvaDB supports functions similar to `Pipelines <https://huggingface.co/docs/transformers/main_classes/pipelines>`_ in HuggingFace.

.. code-block:: sql
CREATE UDF IF NOT EXISTS HFObjectDetector
CREATE FUNCTION IF NOT EXISTS HFObjectDetector
TYPE HuggingFace
'task' 'object-detection'
'model' 'facebook / detr-resnet-50'
2 changes: 1 addition & 1 deletion docs/source/reference/ai/index.rst
@@ -1,7 +1,7 @@
Models
------------------------------------------

EvaDB facilitates the utilization of thin wrappers around deep learning, commonly referred to as User Defined Functions (UDFs). These UDFs enable the incorporation of deep learning models into AI queries.
EvaDB facilitates the utilization of thin wrappers around deep learning, commonly referred to as functions. These functions enable the incorporation of deep learning models into AI queries.

This section compiles a comprehensive catalog of the model integrations that EvaDB supports.

10 changes: 5 additions & 5 deletions docs/source/reference/ai/model-train.rst
@@ -9,19 +9,19 @@ Training and Finetuning

.. code-block:: sql
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
CREATE FUNCTION IF NOT EXISTS PredictHouseRent FROM
( SELECT sqft, location, rental_price FROM HomeRentals )
TYPE Ludwig
'predict' 'rental_price'
'time_limit' 120;
In the above query, you are creating a new customized UDF by automatically training a model from the `HomeRentals` table. The `rental_price` column will be the target column for predication, while `sqft` and `location` are the inputs.
In the above query, you are creating a new customized function by automatically training a model from the `HomeRentals` table. The `rental_price` column will be the target column for predication, while `sqft` and `location` are the inputs.

You can also simply give all other columns in `HomeRentals` as inputs and let the underlying automl framework to figure it out. Below is an example query:

.. code-block:: sql
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
CREATE FUNCTION IF NOT EXISTS PredictHouseRent FROM
( SELECT * FROM HomeRentals )
TYPE Ludwig
'predict' 'rental_price'
@@ -31,13 +31,13 @@ You can also simply give all other columns in `HomeRentals` as inputs and let th

Check :ref:`create-udf-train` for available configurations for training models.

2. After training completes, you can use the `PredictHouseRent` like all other UDFs in EvaDB
2. After training completes, you can use the `PredictHouseRent` like all other functions in EvaDB

.. code-block:: sql
CREATE PredictHouseRent(sqft, location) FROM HomeRentals;
You can also simply give all columns in `HomeRentals` as inputs for inference. The customized UDF with the underlying model can figure out the proper inference columns via the training columns.
You can also simply give all columns in `HomeRentals` as inputs for inference. The customized function with the underlying model can figure out the proper inference columns via the training columns.

.. code-block:: sql
12 changes: 6 additions & 6 deletions docs/source/reference/ai/openai.rst
@@ -6,15 +6,15 @@ OpenAI Models
This section provides an overview of how you can use OpenAI models in EvaDB.


Chat Completion UDFs
--------------------
Chat Completion Functions
-------------------------

To create a chat completion UDF in EvaDB, use the following SQL command:
To create a chat completion function in EvaDB, use the following SQL command:

.. code-block:: sql
CREATE UDF IF NOT EXISTS OpenAIChatCompletion
IMPL 'evadb/udfs/openai_chat_completion_udf.py'
CREATE FUNCTION IF NOT EXISTS OpenAIChatCompletion
IMPL 'evadb/functions/openai_chat_completion_function.py'
'model' 'gpt-3.5-turbo'
EvaDB supports the following models for chat completion task:
@@ -26,4 +26,4 @@ EvaDB supports the following models for chat completion task:
- "gpt-3.5-turbo"
- "gpt-3.5-turbo-0301"

The chat completion UDF can be composed in interesting ways with other UDFs. Please check the `Google Colab <https://colab.research.google.com/github/georgia-tech-db/evadb/blob/master/tutorials/08-chatgpt.ipynb>`_ for an example of combining chat completion task with caption extraction and video summarization models from Hugging Face and feeding it to chat completion to ask questions about the results.
The chat completion function can be composed in interesting ways with other functions. Please check the `Google Colab <https://colab.research.google.com/github/georgia-tech-db/evadb/blob/master/tutorials/08-chatgpt.ipynb>`_ for an example of combining chat completion task with caption extraction and video summarization models from Hugging Face and feeding it to chat completion to ask questions about the results.
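The `CREATE FUNCTION ... IMPL ... 'key' 'value'` statement in the hunk above has a regular shape, so it can be assembled programmatically before being passed to a cursor. The following is a minimal sketch under that assumption: `build_create_function` is a hypothetical helper rather than an EvaDB API, and it does no quote escaping, so it should only be used with trusted names, paths, and values.

```python
from typing import Dict, Optional


def build_create_function(
    name: str,
    impl_path: str,
    params: Optional[Dict[str, str]] = None,
    if_not_exists: bool = True,
) -> str:
    """Hypothetical helper: assemble a CREATE FUNCTION statement as text.

    The layout mirrors the statement shown in the diff: a header line,
    an IMPL line, then one quoted key/value pair per line.
    """
    header = ["CREATE FUNCTION"]
    if if_not_exists:
        header.append("IF NOT EXISTS")
    header.append(name)

    lines = [" ".join(header), f"IMPL '{impl_path}'"]
    for key, value in (params or {}).items():
        lines.append(f"'{key}' '{value}'")
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(
        build_create_function(
            "OpenAIChatCompletion",
            "evadb/functions/openai_chat_completion_function.py",
            {"model": "gpt-3.5-turbo"},
        )
    )
```

The resulting string could then be handed to `cursor.query(...)` in the same way as the `SHOW FUNCTIONS;` call in the getting-started hunk.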
12 changes: 6 additions & 6 deletions docs/source/reference/ai/yolo.rst
@@ -7,11 +7,11 @@ This section provides an overview of how you can use out-of-the-box Ultralytics
Creating YOLO Model
-------------------

To create a YOLO UDF in EvaDB using Ultralytics models, use the following SQL command:
To create a YOLO function in EvaDB using Ultralytics models, use the following SQL command:

.. code-block:: sql
CREATE UDF IF NOT EXISTS Yolo
CREATE FUNCTION IF NOT EXISTS Yolo
TYPE ultralytics
'model' 'yolov8m.pt'
@@ -30,11 +30,11 @@ The following models are currently supported by Ultralytics in EvaDB:

Please refer to the `Ultralytics documentation <https://docs.ultralytics.com/tasks/detect/#models>`_ for more information about these models and their capabilities.

Using Ultralytics Models with Other UDFs
----------------------------------------
Using Ultralytics Models with Other Functions
---------------------------------------------
This code block demonstrates how the YOLO model can be combined with other models such as Color and DogBreedClassifier to perform more specific and targeted object detection tasks. In this case, the goal is to find images of black-colored Great Danes.

The first query uses YOLO to detect all images of dogs with black color. The ``UNNEST`` function is used to split the output of the ``Yolo`` UDF into individual rows, one for each object detected in the image. The ``Color`` UDF is then applied to the cropped portion of the image to identify the color of each detected dog object. The ``WHERE`` clause filters the results to only include objects labeled as "dog" and with a color of "black".
The first query uses YOLO to detect all images of dogs with black color. The ``UNNEST`` function is used to split the output of the ``Yolo`` function into individual rows, one for each object detected in the image. The ``Color`` function is then applied to the cropped portion of the image to identify the color of each detected dog object. The ``WHERE`` clause filters the results to only include objects labeled as "dog" and with a color of "black".

.. code-block:: sql
@@ -44,7 +44,7 @@ The first query uses YOLO to detect all images of dogs with black color. The ``U
AND Color(Crop(data, bbox)) = 'black';
The second query builds upon the first by further filtering the results to only include images of Great Danes. The ``DogBreedClassifier`` UDF is used to classify the cropped portion of the image as a Great Dane. The ``WHERE`` clause adds an additional condition to filter the results to only include objects labeled as "dog", with a color of "black", and classified as a "great dane".
The second query builds upon the first by further filtering the results to only include images of Great Danes. The ``DogBreedClassifier`` function is used to classify the cropped portion of the image as a Great Dane. The ``WHERE`` clause adds an additional condition to filter the results to only include objects labeled as "dog", with a color of "black", and classified as a "great dane".

.. code-block:: sql
8 changes: 4 additions & 4 deletions docs/source/reference/evaql.rst
@@ -2,19 +2,19 @@ EvaDB Query Language Reference
===============================

EvaDB Query Language (EvaDB) is derived from SQL. It is tailored for AI-driven analytics. EvaDB allows users to invoke deep learning models in the form
of user-defined functions (UDFs).
of functions.

Here is an example where we first define a UDF wrapping around the FastRCNN object detection model. We then issue a query with this function to detect objects.
Here is an example where we first define a function wrapping around the FastRCNN object detection model. We then issue a query with this function to detect objects.

.. code:: sql
--- Create an user-defined function wrapping around FastRCNN ObjectDetector
CREATE UDF IF NOT EXISTS FastRCNNObjectDetector
CREATE FUNCTION IF NOT EXISTS FastRCNNObjectDetector
INPUT (frame NDARRAY UINT8(3, ANYDIM, ANYDIM))
OUTPUT (labels NDARRAY STR(ANYDIM), bboxes NDARRAY FLOAT32(ANYDIM, 4),
scores NDARRAY FLOAT32(ANYDIM))
TYPE Classification
IMPL 'evadb/udfs/fastrcnn_object_detector.py';
IMPL 'evadb/functions/fastrcnn_object_detector.py';
--- Use the function to retrieve frames that contain more than 3 cars
SELECT id FROM MyVideo
16 changes: 8 additions & 8 deletions docs/source/reference/evaql/create.rst
@@ -49,30 +49,30 @@ To create a table, specify the schema of the table.
object_id INTEGER
);
CREATE UDF
----------
CREATE FUNCTION
---------------

To register an user-defined function, specify the implementation details of the UDF.
To register an user-defined function, specify the implementation details of the function.

.. code-block:: sql
CREATE UDF IF NOT EXISTS FastRCNNObjectDetector
CREATE FUNCTION IF NOT EXISTS FastRCNNObjectDetector
INPUT (frame NDARRAY UINT8(3, ANYDIM, ANYDIM))
OUTPUT (labels NDARRAY STR(ANYDIM), bboxes NDARRAY FLOAT32(ANYDIM, 4),
scores NDARRAY FLOAT32(ANYDIM))
TYPE Classification
IMPL 'evadb/udfs/fastrcnn_object_detector.py';
IMPL 'evadb/functions/fastrcnn_object_detector.py';
.. _create-udf-train:

CREATE UDF via Training
-----------------------
CREATE FUNCTION via Training
----------------------------

To register an user-defined function by training a predication model.

.. code-block:: sql
CREATE UDF IF NOT EXISTS PredictHouseRent FROM
CREATE FUNCTION IF NOT EXISTS PredictHouseRent FROM
(SELECT * FROM HomeRentals)
TYPE Ludwig
'predict' 'rental_price'