Skip to content

Cookbook: User Defined Function

Stefan Dombek edited this page May 7, 2024 · 43 revisions

User Defined Function

This page is still under construction!

A User Defined Function (UDF) is a SQL function that is written by the user. Like built-in functions a UDF can accept input values, can return outputs or just execute SQL statements.

Please note:

  • In this cookbook we use PL/pgSQL (Procedural Language/PostgreSQL Structured Query Language). The procedural language of the object-relational database PostgreSQL.
  • This cookbook is only intended to get you started with FOLIO reporting. Only selected things are explained.

Table of Contents

1. Creating an UDF

To create a UDF you can use the following structur. First check whether the function already exists. In this case, delete them first before create the new version to the database.

Example:

DROP FUNCTION IF EXISTS schema.func;

CREATE FUNCTION schema.func(
/* parameters */
...
)
/* Set return type */
RETURNS TABLE(
...
) AS 
/* Code block */
$$
BEGIN
    /* DECLARE section */
    DECLARE
        ...

    /* PROCEDURE section */
    BEGIN
        ...
		
        /* EXCEPTION section */
        ...
		
    END;
END;
$$
LANGUAGE plpgsql -- The language of the function body
STABLE           -- For the DBMS
PARALLEL SAFE    -- For the DBMS

You can find an example under 5.1. Creating an UDF to get a table output.

[top]

2. Using Parameters

2.1. Difference between arguments and parameters

There is a difference between the terms arguments and parameters.

Arguments are values that you can set when you call a function. In the following example staff is the argument.

Example:

SELECT * FROM reporting.user_groups('staff');

Parameters are values that are used inside the function. The values for these parameters were given via the arguments when the function was called. In the following example user_group is the parameter.

Example:

CREATE FUNCTION reporting.user_groups(
    user_group varchar DEFAULT ''
)
...

[top]

2.2. Creating parameters

In the first block of the definition from the function you can create parameters and their datatype. It is also possible to set a default value.

Example:

CREATE FUNCTION reporting.user_groups(
    user_group varchar DEFAULT ''
)
...

You should use logical names for the parameters. The behavior of the parameters is like variables in other script languages. The parameters can be used inside the function multiple times and storing the values that were given via the arguments when the function was called. For this reason it is helpful to use default values. Default values can help you catch errors when executing the function.

[top]

3. Set return type

In the next step you can define a return type.

In the following example you can see that we define which kind of return type should be shown in the output. In this case it is a table. You can also set the datatypes for the columns.

Example:

CREATE FUNCTION reporting.user_groups(
    user_group varchar DEFAULT ''
)
RETURNS TABLE (
    id uuid,
    group_name text
)
AS
...

Sometimes you only need a return value instead of a table. The following example shows how this could be done.

Example:

CREATE FUNCTION reporting.example(
    parameter INTEGER DEFAULT 0
) 
RETURNS INTEGER AS
...

There is also the option to execute a function without a return value. A good example can be found under 4.2.1.2. Functions without return value.

Example:

CREATE FUNCTION reporting.example(
    parameter INTEGER DEFAULT 0 
)
RETURNS VOID AS 
...

[top]

4. Structure of the code block

There are usually three sections in an UDF code block that can be used in the part between the $$. Some of them are optional.

Section Purpose Required
DECLARE Definition of variables, objects, constants and cursors Optional
PROCEDURE SQL statements and conditions. It also execute the UDF Mandatory
EXCEPTION Definition of exception handlers Optional

This is the basic syntax:

BEGIN             -- optional, shows the beginning of the block
    DECLARE       -- optional, in this section you can declare variables
    BEGIN         -- mandatory, begin of the PROCEDURE section
        PROCEDURE -- mandatory, SQL statements
        EXCEPTION -- optional, begin of the EXCEPTION section
    END           -- mandatory, end of the PROCEDURE section
END               -- optional, shows the end of the block

[top]

4.1. DECLARE section

In this section you can define your variables, constants or cursors with their data types. The data types must be available in the database. You can also initializing these with data values if you want.

Example:

...
$$
BEGIN
    DECLARE
        total INTEGER;
    BEGIN
        /* PROCEDURE */
    END;
END;
$$
...

To initialize the variables and constants with data values you use :=.

Example:

...
$$
BEGIN
    DECLARE
        total INTEGER := 0;
    BEGIN
        /* PROCEDURE */
    END;
END;
$$
...

Cursors will not be explained in this cookbook

[top]

4.2. PROCEDURE section

This section is the main part of the code block from the UDF. Within this section you can call the defined variables, constants, cursors and the set parameters. It is also the place to execute the SQL statement for the UDF.

The section is surrounded by the words BEGIN and END.

Example:

...
$$
BEGIN
    /* PROCEDURE */
END;
$$
...

[top]

4.2.1. RETURN

4.2.1.1. Functions with return value

Functions are used in programming to eliminate redundancies. This means that pieces of code that are needed more often in the program in the same or a similar form are not rewritten each time, but are encapsulated in a function. This function can then be called at the points where it is needed.

In the most cases you want to generate a return value with a function. To get a return value from a function you have to use the SQL statement RETURN.

Examples:

[top]

4.2.1.2. Functions without return value

You can also write functions without a return value where you use the statement RETURN.

Example: We have the following table.

personid lastname firstname address city
1 Meyer Max Musterstraße Musterstadt
2 Schultze Maxi Musterstraße Musterstadt

We would like delete records via a function from this table by using the personid.

Function:

DROP FUNCTION IF EXISTS reporting.delete_person;

CREATE FUNCTION reporting.delete_person(
    person_id INTEGER DEFAULT NULL 
)
RETURNS VOID AS 
$$
BEGIN
    /* PROCEDURE section */
    BEGIN
        IF person_id IS NULL THEN 
            RETURN; /* Return, person_id is invalid */
        END IF;    
		
        DELETE
        FROM
            reporting.persons
        WHERE
            personid = person_id;
    END;
END; /* Return, person_id is valid */
$$
LANGUAGE plpgsql
VOLATILE

Function call:

SELECT * FROM reporting.delete_person(1)

We call the function and set the argument 1. The function will call and the value of the argument will be used as the parameter person_id. Then we have to define the return type void. With the return type void we are saying that we simply want to execute something without getting a return value. In the PROCEDURE section we check the value of the parameter. If the value of the parameter is NULL then the statement RETURN exit the sub program (the function) and we go back to the main program (the query) which called the function. Otherwise the rest of the PROCEDURE section will execute and end with the statement END.

[top]

4.2.2. Conditions

You can use the conditions to control the processing of the SQL statements inside the PROCEDURE section.

[top]

4.2.2.1. IF THEN

To control the processing you can use the statement IF ... THEN. It is like in other programming languages. When a condition is met, then a statement will execute. This statement can define inside the condition.

Structure of a simple IF ... THEN statement:

IF condition THEN
    statement;
END IF;

Structure when you have to check 2 conditions:

IF condition1 THEN
    statement1;
ELSE
    statement2;
END IF;

Structure when you have to check multiple conditions:

IF condition1 THEN
    statement1;
ELSEIF condition2 THEN
    statement2;
ELSE
    statement3;
END IF;

If you have more conditions to check then add the new condition by using the statement ELSEIF condition THEN.

[top]

4.2.2.2. CASE WHEN

Instead of an IF... THEN statements, CASE WHEN can also be used.

Structure:

CASE
    WHEN condition1 THEN
        statement1;
    WHEN condition2 THEN
        statement2;
    ELSE
        statement3;
END CASE;

Example:

CREATE OR REPLACE FUNCTION reporting.case(
    firstname varchar
)
RETURNS VARCHAR AS 
$$
BEGIN
    /* PROCEDURE section */
    BEGIN
        CASE 
            WHEN firstname = 'Max' THEN
                RETURN firstname;
            WHEN firstname = 'Maxi' THEN
                RETURN firstname;
            ELSE
                RETURN 'Blubb';
        END CASE;
    END;
END;
$$
LANGUAGE plpgsql;

Output:

Function call Output
SELECT * FROM reporting.case('Max'); Max
SELECT * FROM reporting.case('Maxi'); Maxi
SELECT * FROM reporting.case('Fred'); Blubb

[top]

4.3. EXCEPTION section

The most exception are handled by the DBMS but sometimes you would like to control specific exceptions and execute a special SQL statement when the exception raised.

Example how you can raise an exception inside the PROCEDURE section:

BEGIN
    /* DECLARE section */
    DECLARE
        exception_name EXCEPTION;
		
    /* PROCEDURE section */
    BEGIN
        IF condition THEN
            RAISE exception_name;
        ENDIF;
		
        /* EXCEPTION section */
        EXCEPTION
            WHEN exception_name THEN
                statement;
				
    END;
END;

In the PROCEDURE section the word RAISE refer to the EXCEPTION section. In the example above the exception will call if a condition is met.

How to handle multiple exceptions:

/* EXCEPTION section */
EXCEPTION
    WHEN exception1 THEN
        statement1;
    WHEN exception2 THEN
        statement2;
    WHEN exception3 THEN
        statement3;
    ...

There is also a way to create an exception if the other exceptions will not raise:

/* EXCEPTION section */
EXCEPTION
    WHEN exception1 THEN
        statement1;
    WHEN exception2 THEN
        statement2;
    WHEN OTHERS THEN
        statement3;

When the exceptions 1 and 2 not raised then the statement3 will execute. The WHEN OTHERS clause allows you to control the errors inside the block.

[top]

5. Use cases

5.1. Creating an UDF to get a table output

In the last section, the code block, you can create a SQL statement to query the data from the database. You can use the parameters within the code block to querying the data.

Example:

DROP FUNCTION IF EXISTS reporting.user; 

CREATE FUNCTION reporting.user(
    user_gr varchar DEFAULT ''
)
RETURNS TABLE(
    id varchar,
    groups varchar
) AS 
$$
/* Code block */
BEGIN
    /* PROCEDURE section */
    BEGIN
        RETURN QUERY
            SELECT 
                id AS id,
                groups__t.group AS groups
            FROM 
                folio_users.groups__t
            WHERE 
                ((groups__t.group = user_gr) OR (user_gr = ''));
    END;
END;
$$
LANGUAGE plpgsql
STABLE
PARALLEL SAFE

In this function we create the parameter user_group with a default value. The default value is an empty string. In the query we use the parameter in the WHERE clause.

Function call:

SELECT * FROM reporting.user_groups('staff')

Output:

id group_name
3684a786-6671-4268-8ed0-9db82ebca60b staff

[top]

5.2. Creating an UDF for calculations and an INTEGER output

You can use UDF to calculate. There are a lot of built-in functions in the DBMS, but sometimes you have individual calculations and then it is helpful to create your own function.

Example:

DROP FUNCTION IF EXISTS reporting.addition_example;

CREATE FUNCTION reporting.addition_example(
    addend1 INTEGER DEFAULT 0,
    addend2 INTEGER DEFAULT 0
) 
RETURNS INTEGER AS 
$$
BEGIN
    /* DECLARE section */

    DECLARE total INTEGER := 0;
        
    /* PROCEDURE section */
    BEGIN 

        total := addend1 + addend2;  
   
        RETURN total;
	
    END;
END;
$$
LANGUAGE plpgsql
STABLE
PARALLEL SAFE

Function call:

SELECT * FROM reporting.addition_example(1,2)

Return value: 3 (INTEGER)

[top]

5.3. Creating an UDF for formatting and a VARCHAR output

Another use case could be to format strings in a specific way instead of formatting them using the built-in functions. With UDF you have the opportunity to centrally updating the formatting in all scripts where you use the function.

Example:

DROP FUNCTION IF EXISTS reporting.format_example;

CREATE FUNCTION reporting.format_example(
    firstname VARCHAR DEFAULT NULL, 
    lastname VARCHAR DEFAULT NULL 
)
RETURNS VARCHAR AS
$$
BEGIN
    /* PROCEDURE section */
    BEGIN
        IF firstname IS NULL AND lastname IS NULL THEN 
            RETURN NULL;
        ELSIF firstname = '' AND lastname IS NOT NULL THEN 
            RETURN lastname;
        ELSIF firstname IS NOT NULL AND lastname IS NULL THEN 
            RETURN firstname;
        ELSIF firstname IS NOT NULL AND lastname = '' THEN 
            RETURN firstname;
        ELSE 
            RETURN firstname || ' ' || lastname;
        END IF;    
    END;
END;
$$
LANGUAGE plpgsql
STABLE
PARALLEL SAFE

Sample outputs:

Function call Return value
SELECT * FROM reporting.format_example('John', 'Doe') John Doe
SELECT * FROM reporting.format_example('John') John
SELECT * FROM reporting.format_example('John', '') John
SELECT * FROM reporting.format_example('', 'Doe') Doe

[top]

6. Reports for the FOLIO app Reporting

6.1. Basics and Tips

You need two files to write a report.

  • SQL file with an UDF
  • JSON file with the definition of the fields for the formular in the FOLIO app

The SQL file contains an UDF that will be used to querying the data in the database. It must be an UDF because the values for the parameters are individual. The app passes the individual values ​​from the form to the UDF. The mapping between the values ​​from the form and the parameters in the UDF is done by using the named notation when calling the function.

Example using named notation

SELECT function(a => 'Hello', b => 'World');

To guarantee the mapping you need the expression in the JSON file for the fields. It is the key name in the array parameters.

The full description of how you write the reports you can find in the repository of the FOLIO app Reporting.

Tips

  • The data types for the parameters must match those of the attributes in the database. Otherwise there is an error message.
  • The data types of the attributes in the query (PROCEDURE section) must match those that are then used in the output table (return section). A good practice is to have an explicit type cast for each attribute in the query (PROCEDURE section) that matches the data types in the output (return section).
  • Everything must be unique, that means the schemas must be added to the attributes.
  • SQL and PL/pgSQL can be used.

[top]

6.2. LDP and Metadb functions

Creating functions for LDP and Metadb is similar to the normal way. Like the derived tables, a directive is required at the beginning.

  • For LDP: --ldp:function <NAME>
  • For Metadb: --metadb:function <NAME>

Example for LDP:

--ldp:function get_users

DROP FUNCTION IF EXISTS get_users;

CREATE FUNCTION get_users(

    /* Parameters */

    start_date DATE DEFAULT '2000-01-01',
    end_date   DATE DEFAULT '2050-01-01'
	
)
/* Set return type */
RETURNS TABLE(

    /* Output table */

    id           UUID,
    barcode      TEXT,
    created_date TIMESTAMPTZ
)
AS 
/* Code block */
$$
BEGIN
    
    /* PROCEDURE section */
    BEGIN

        /* RETURN value is the result of a query */
        RETURN QUERY

            /* Query Start */
            SELECT 
                u.id           :: UUID,
                u.barcode      :: TEXT,
                u.created_date :: TIMESTAMPTZ
            FROM 
                user_users AS u
            WHERE 
                start_date <= u.created_date AND u.created_date < end_date
            ;
            /* Query End */
    END;
END;
$$
LANGUAGE plpgsql
STABLE
PARALLEL SAFE;

Example for Metadb:

--metadb:function get_users

DROP FUNCTION IF EXISTS get_users;

CREATE FUNCTION get_users(

    /* Parameters */

    start_date date DEFAULT '2000-01-01',
    end_date date   DEFAULT '2050-01-01'
	
)
/* Set return type */
RETURNS TABLE(

    /* Output table */

    id           UUID,
    barcode      TEXT,
    created_date TIMESTAMPTZ
)
AS 
/* Code block */
$$
BEGIN

    /* PROCEDURE section */
    BEGIN

        /* RETURN value is the result of a query */
        RETURN QUERY

            /* Query Start */
            SELECT 
                u.id           :: UUID,
                u.barcode      :: TEXT,
                u.created_date :: TIMESTAMPTZ
            FROM 
                folio_users.users__t AS u
            WHERE 
                start_date <= u.created_date AND u.created_date < end_date
            ;
            /* Query End */
    END;
END;
$$
LANGUAGE plpgsql
STABLE
PARALLEL SAFE;

[top]