Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider supporting FIFO + message keys #294

Open
fggrtech opened this issue Aug 9, 2024 · 12 comments
Open

Consider supporting FIFO + message keys #294

fggrtech opened this issue Aug 9, 2024 · 12 comments
Labels
enhancement New feature or request

Comments

@fggrtech
Copy link

fggrtech commented Aug 9, 2024

Firstly, this project is real neat.

Any thoughts on supporting FIFO queues with message key values, similar to SQS FIFO + MessageGroupId ?

Riffing off your existing work, here are the prototype functions which illustrate the idea:

CREATE TYPE message_record AS (
    msg_id BIGINT,
    read_ct INTEGER,
    enqueued_at TIMESTAMP WITH TIME ZONE,
    vt TIMESTAMP WITH TIME ZONE,
    group_id UUID,
    message JSONB
);

CREATE FUNCTION create_queue(queue_name TEXT)
RETURNS void AS $$
BEGIN

  EXECUTE FORMAT(
    $QUERY$
    CREATE TABLE IF NOT EXISTS %s (
        msg_id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
        read_ct INT DEFAULT 0 NOT NULL,
        enqueued_at TIMESTAMP WITH TIME ZONE DEFAULT now() NOT NULL,
        vt TIMESTAMP WITH TIME ZONE NOT NULL,
        group_id UUID NOT NULL,
        message JSONB NOT NULL
    )
    $QUERY$,
    queue_name
  );

  EXECUTE FORMAT(
    $QUERY$
    CREATE INDEX IF NOT EXISTS %s_vt_idx ON %s (vt ASC);
    $QUERY$,
    queue_name, queue_name
  );

  EXECUTE FORMAT(
    $QUERY$
    CREATE INDEX IF NOT EXISTS %s_group_id_order ON %s (group_id, msg_id);
    $QUERY$,
    queue_name, queue_name
  );

END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION enqueue(
    queue_name TEXT,
    group_id UUID,
    msg JSONB,
    delay INTEGER DEFAULT 0
) RETURNS SETOF BIGINT AS $$
DECLARE
    sql TEXT;
BEGIN
    sql := FORMAT(
        $QUERY$
        INSERT INTO %s (vt, group_id, message)
        VALUES ((clock_timestamp() + interval '%s seconds'), '%s'::uuid, $1)
        RETURNING msg_id;
        $QUERY$,
        queue_name, delay, group_id
    );
    RETURN QUERY EXECUTE sql USING msg;
END;
$$ LANGUAGE plpgsql;

CREATE FUNCTION dequeue(
    queue_name TEXT,
    vt INTEGER,
    qty INTEGER
)
RETURNS SETOF message_record AS $$
DECLARE
    sql TEXT;
BEGIN
    sql := FORMAT(
        $QUERY$

        WITH cte0 AS 
        (
          SELECT group_id, MIN(msg_id) AS msg_id
          FROM %s
          GROUP BY group_id
        ),
        cte1 AS
        (
          SELECT t1.msg_id AS msg_id
          FROM %s AS t1
          JOIN cte0 AS t2
            ON t1.group_id = t2.group_id
            AND t1.msg_id = t2.msg_id
          WHERE vt <= clock_timestamp()
          ORDER BY msg_id ASC
          LIMIT $1
          FOR UPDATE SKIP LOCKED
        )

        UPDATE %s m
        SET
            vt = clock_timestamp() + interval '%s seconds',
            read_ct = read_ct + 1
        FROM cte1
        WHERE m.msg_id = cte1.msg_id
        RETURNING m.msg_id, m.read_ct, m.enqueued_at, m.vt, m.group_id, m.message;
        $QUERY$,
        queue_name, queue_name, queue_name, vt
    );
    RETURN QUERY EXECUTE sql USING qty;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM create_queue('test');

-- Two enqueued messages with the same group id. These should respect FIFO rules.
SELECT * FROM enqueue('test', '0470da73-c6a4-4273-aa31-f48b05555539', '{"email-address": "[email protected]"}');
SELECT * FROM enqueue('test', '0470da73-c6a4-4273-aa31-f48b05555539', '{"email-address": "[email protected]", "phone-number": 5551112222}');

-- Third enqueued message with different group id.
SELECT * FROM enqueue('test', gen_random_uuid(), '{"email-address": "[email protected]}");

-- This should return message id 1 and 3. Message id 2 will NOT be available until 1 has been deleted.
SELECT * FROM dequeue('test', 5, 5);

Thoughts?

@ChuckHend
Copy link
Member

Hi @fggrtech , thank you. I'd like to see features like this make it into the project too.

I'll have to think a bit about how this would impact the existing queue APIs and tables. I don't think we'd want to break the simple queue structure that currently exists and ideally we could find a way to add this functionality without adding overhead or requiring the usage of the new keys, but worst case we add a new API specifically for FIFO.

@tazmaniax
Copy link

tazmaniax commented Sep 20, 2024

I'm looking for exactly this functionality. Does the group ID need to be a UUID or can it be just some arbitrary text?

@fggrtech
Copy link
Author

I'm looking for exactly this functionality. Does the group ID need to be a UUID or can it be just some arbitrary text?

I think that is one of difficult questions about this feature request. A message id means different things based on the use context. My personal case is a uuid, but i would assume it would be more generic like a Kafka message key - bytes or a string field.

@nimrodkor
Copy link
Contributor

nimrodkor commented Oct 31, 2024

Hey @ChuckHend !

We've run into this too. I like the approach of not breaking existing queues, so I'd like to suggest a new create_fifo_queue, send_fifo_message and read_fifo_message methods that will support the message group ID and ordering requirements.
WDYT?

Agreeing with the above that the group ID should be TEXT, allowing the implementor to choose whether he should stringify his UUID or put in some other value

Up for a contribution?

CC: @gruebel

@v0idpwn
Copy link
Collaborator

v0idpwn commented Oct 31, 2024

Hi.

That's me (and not PGMQ), but I feel like we should keep the same interface for send. The main reason is that we'd like to support exchanges in the future, and exchanges should be able to push to all queue types. Currently, all queue types have the same interface to push, so there's no problem doing that.

A way to achieve that would be:

  1. Adding headers support (in our list for v1.5.0)
  2. Using a special header for FIFO, e.g.: "pgmq_fifo_key"
  3. Having a custom read function which respects header-based FIFO (we can also make the key header parameterizable here, and let people use any header for that in step 2)

If we really wanted, we could also keep the same interface for read. But this would imply an additional check (for queue type) on every read. This is a bit hard to write in a performant way.

@ChuckHend
Copy link
Member

But this would imply an additional check (for queue type) on every read. This is a bit hard to write in a performant way.

@v0idpwn , @nimrodkor - yes this would be a subsequent read to pgmq.meta on every pgmq.read() call. Are there ANY scenarios where someone might want to do a regular pgmq.read() from a FIFO queue? If not, then I don't mind if we add a function for pgmq.read_fifo().

We already have pgmq.create_unlogged(), so adding a pgmq.create_fifo() doesn't seem so bad, but we may want to refactor both of those to pgmq.create() in v2.0.

@v0idpwn
Copy link
Collaborator

v0idpwn commented Nov 1, 2024

Proposed api:

-- We create the queue in the same way. It's a regular queue
select pgmq.create('myqueue');

-- When sending messages, we can add a special header
select pgmq.send('myqueue', '"hello"'::jsonb, '{"x-pgmq-fifo": "ordering-key"}'::jsonb);
select pgmq.send('myqueue', '"fifo"'::jsonb, '{"x-pgmq-fifo": "ordering-key"}'::jsonb);

-- We can get multiple messages of the same ordering key on a query
select msg_id, msg, headers from pgmq.read_fifo('myqueue', 10, 10);
| msg_id | msg       | headers                           |
+--------+-----------+-----------------------------------|
|      1 | '"hello"' | '{"x-pgmq-fifo": "ordering-key"}' |
|      2 | '"fifo"'  | '{"x-pgmq-fifo": "ordering-key"}' |

 
select pgmq.send('myqueue', '"bye"'::jsonb, '{"x-pgmq-fifo": "ordering-key"}'::jsonb);
select pgmq.send('myqueue', '"world"'::jsonb, '{"x-pgmq-fifo": "another-ordering-key"}'::jsonb);

-- One the second select, we don't get the `"bye"` message because it's on FIFO with the `"ordering-key"` key, which still
-- has undeleted messages, even if they are not visible
select msg_id, msg, headers from pgmq.read_fifo('myqueue', 10, 10);
| msg_id | msg       | headers                                   |
+--------+-----------+-------------------------------------------+
|      4 | '"world"' | '{"x-pgmq-fifo": "another-ordering-key"}' |

SELECT PG_SLEEP(10);

-- After the VT is expired or messages are deleted, we can read the messages behind in the queue
select msg_id, msg, headers from pgmq.read_fifo('myqueue', 10, 10);
| msg_id | msg       | headers                                   |
+--------+-----------+-------------------------------------------+
|      1 | '"hello"' | '{"x-pgmq-fifo": "ordering-key"}'         |
|      2 | '"fifo"'  | '{"x-pgmq-fifo": "ordering-key"}'         |
|      3 | '"bye"'   | '{"x-pgmq-fifo": "ordering-key"}'         |
|      4 | '"world"' | '{"x-pgmq-fifo": "another-ordering-key"}' |

@ChuckHend
Copy link
Member

@fggrtech , @tazmaniax , @nimrodkor - what are your thoughts on the API proposed above? Do you think this will meet your requirements?

@v0idpwn
Copy link
Collaborator

v0idpwn commented Nov 1, 2024

Important notes:

  • if you pgmq.read instead of pgmq.read_fifo, you may get out of order messages
  • messages without a fifo key should be treated as if they had a default order. key So if you have messages 1, 2, 3 and 4 without a header, read 1 and 2, you can't read 3 and 4 until 1 and 2 are "completed"

@fgasperino
Copy link

@ChuckHend Works for me.

@kevbook
Copy link

kevbook commented Nov 10, 2024

This makes sense. Correct if I'm wrong below:

If I have a user-A (fifo-group-id = user-A), and the user has 10 messages, and I'm reading them 1 by 1 because ordering matters:

  • On any read, I'm only going to get 1st message (or no message from this fifo-group-id = user-A) since visibility timeout has not passed, and hence maintaining FIFO
  • If message 1 processing fails (ie. visibility timeout has passed, or I've set it to be 0 via set_vt when code throws an error), and we re-read, we'll get 1st message?

@v0idpwn
Copy link
Collaborator

v0idpwn commented Nov 10, 2024

That's right, @kevbook!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

7 participants