Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

31 event trigger #664

Merged
merged 10 commits into from
Nov 21, 2022
Merged

31 event trigger #664

merged 10 commits into from
Nov 21, 2022

Conversation

matthewhilton
Copy link
Contributor

@matthewhilton matthewhilton commented Nov 6, 2022

PR to add event triggers.

What has been added

New event trigger

This event trigger works the following:

  1. The plugin (tool_dataflows) listens to EVERY moodle event.
  2. When it observes a Moodle event, it checks if any event_reader steps are configured to listen to this event.
  3. If they are, it stores the event data in the tool_dataflows_events table
  4. Then depending on the execution policy of the event_reader step, it will either:
    4.1. Execute the dataflow immediately
    4.2. Create an adhoc task to execute the dataflow
    4.3. Wait until the next CRON call to dataflow to run it.
  5. When the dataflow is run, it gets the record from the database and loads it as an array iterator and makes the event variable available to later steps.

What has been changed

  • nextruntime is now optional (since now event triggered dataflows don't have a nextruntime like cron ones do)
  • Handle bad webservices causing the form to be unusable
  • Updated docs around the recent variable handling change
  • Refactored the calling of a dataflow so it can be reused in a couple of new places

Things left to do:

  • Unit tests - will do once the structure is settled
  • A user guide for how to use the event trigger
  • Codechecker, etc...

Some areas I would like comment / feedback / ideas on

  • Should this be a trigger, reader, or a mix of both - currently it's a reader but styled as if its a trigger.
  • Is a race condition possible the current db queue setup? I had a good think and I don't think this is possible even with concurrent running, but i'm not 100% confident.
  • General comments regarding the way everything has been setup. I'll admit OOP isn't my strongest suite but happy for any tips

Closes issues

Closes #31

Closes #665

Closes #666

Example dataflow

See example dataflow below (captures course viewed event, then calls course_get_contents webservice)
Just replaced the user with the admin user on your site

name: 'Test dataflow'
config:
  enabled: '1'
  concurrencyenabled: '0'
steps:
  course_viewed:
    name: 'Course viewed'
    type: tool_dataflows\local\step\trigger_event
    config:
      eventname: \core\event\course_viewed
      executionpolicy: adhocqueued
    vars:
      courseid: '${{ event.courseid }}'
  debugging_writer:
    name: 'Debugging writer'
    depends_on:
      1: course_viewed
    type: tool_dataflows\local\step\writer_debugging
  flow_web_service:
    name: 'Flow web service'
    depends_on: debugging_writer
    type: tool_dataflows\local\step\flow_web_service
    config:
      webservice: core_course_get_contents
      user: admin
      parameters: 'courseid: ${{steps.course_viewed.event.courseid}}'
      failure: abortflow

@matthewhilton
Copy link
Contributor Author

This should be ready for an initial review now
Not 100% complete but would like feedback regarding the setup before continuing

Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitting feedback in parts.

I'm going to skip most phpdoc related ones, as they are indicated on the CI. But noted the ones which might not have appeared.

VARIABLES.md Outdated Show resolved Hide resolved
VARIABLES.md Outdated Show resolved Hide resolved
VARIABLES.md Outdated Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
classes/local/event_processor.php Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
classes/local/step/reader_event.php Outdated Show resolved Hide resolved
classes/local/step/reader_event.php Outdated Show resolved Hide resolved
classes/local/step/reader_event.php Outdated Show resolved Hide resolved
classes/local/step/reader_event.php Outdated Show resolved Hide resolved
Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments.

Thanks for making the changes @matthewhilton, I'm still a bit on the fence regarding whether or not it should appear under the reader list or trigger list.

image

In its current state, it can't act like a typical trigger-connector step. I think the sweet spot would be to tweak the handling of the dataflow iterator to allow for trigger steps that are "flows" to act like readers. This is because you wouldn't have a step that comes before this.

ref: #663

classes/local/event_processor.php Outdated Show resolved Hide resolved
Comment on lines 36 to 41
if (!$this->enginestep->engine->isdryrun) {
event_processor::consume_event($input->id);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does a dry run event data get passed to the run at the moment? I scanned the class and didn't see a form field for this sort of data, and below it seems to be referenced.

$variables->set('event', $input->eventdata);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed that dry runs would pull the event data just like any other run would. Although that might be a bad assumption, i'll look into it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tested the dry run and the get_iterator is called to get the event data and it is passed to the execute function just as like any other run.

When I go to dry-run there is no form that pops up - is this a feature that steps can implement? I'm not too familiar with ti

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no such feature currently. It would be nice, for the interface to display / prompt the user for an input when required (or stop until the input has been supplied).

A similar issue had been created here, but this is for the initial trigger (form trigger)
#443

If / once that has been implemented, it could potentially be expanded to other steps and allow for gating.

For testing purposes, perhaps the data could be set up in the step itself? And would only be used for testing (e.g. dry runs) due to the absense of a form feature.

Though, at the same time it might be good to make a generic field to hold dry-run test data that is used if certain conditions were checked. But I think this is somewhat scope creep, and should be implemented in another issue.

I'm leaning towards adding a generically named field to hold the test (event) data for now, and building on that concept later down the track.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards adding a generically named field to hold the test (event) data for now, and building on that concept later down the track.

I think this should probably be implemented later since to me it seems a bit complex and not necessary at the moment:

  • Different events have different structures, so would we store a test event for every event type? What about installed plugin events?
  • If I wanted to dry-run for e.g. a course_viewed event, it would probably have to load the test data and then extra form field for e.g. what user, what course id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Different events have different structures, so would we store a test event for every event type? What about installed plugin events?

By default it will be empty and that either passes through no data or fails. Users can define their own test data which would be based on what they expect it to look like.

If I wanted to dry-run for e.g. a course_viewed event, it would probably have to load the test data and then extra form field for e.g. what user, what course id

Yes, and those things could come from expressions as well, or hard-coded based on what the tester wants.

version.php Outdated Show resolved Hide resolved
db/upgrade.php Outdated Show resolved Hide resolved
db/install.xml Outdated Show resolved Hide resolved
Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good if we could add some unit tests as well. This should work and be testable for the 'run immediately' type of execution, and can verify the event data comes through as expected.

@matthewhilton
Copy link
Contributor Author

Thanks @keevan for the review 😄 .

Unit tests are definietely coming just wanted to settle down the whole structure of the new code first

I think the sweet spot would be to tweak the handling of the dataflow iterator to allow for trigger steps that are "flows" to act like readers. This is because you wouldn't have a step that comes before this.

Could you give some pointers about how this would be achieved ?

@matthewhilton
Copy link
Contributor Author

matthewhilton commented Nov 10, 2022

Test case ideas:

  • dataflow disabled, shouldn't run / capture events
  • the different execution policies
  • config validation
  • delete step, events recorded are deleted

@matthewhilton matthewhilton force-pushed the 31-event-trigger branch 3 times, most recently from ef3be93 to 04522d9 Compare November 10, 2022 04:54
@matthewhilton
Copy link
Contributor Author

matthewhilton commented Nov 10, 2022

I added a new method to the base_step:

/**
* Does the iterator in this step produce values?
* If it does, this must be true otherwise not all the iterator values will be executed.
* Generally, this is only true for reader steps, but exceptions do exist.
*
* @return bool
*/
public function has_producing_iterator(): bool {
return false;
}

And overrode it in the reader step

/**
* Reader steps have a producing iterator.
* This enables the dataflow engine to pull the first record from the iterator.
*
* @return bool
*/
public function has_producing_iterator(): bool {
return true;
}

And so instead of checking group == 'reader', we can check instead the output of this function:

/**
* Should this iterator pull the next value down?
*
* @return bool
*/
public function should_pull_next(): bool {
return !empty($this->pulled) || !$this->steptype->has_producing_iterator();
}

This is equivalent to the code before; readers will not pull the next. However, this now means the new reader_event class can be in the trigger group but still specify that it has a producing iterator (since it is a reader) meaning we don't encounter the bug #663

Furthermore, I styled the trigger as blue to indicate to the user 'This is a trigger, but its also kinda a reader' since ultimately it is a reader with a bit of trigger'ry stuff sprinkled on top

image

I thought about going the other way and making this new Moodle event trigger based on the trigger_step class with iterator stuff added, but because triggers are based off the connector_step class things get hairy pretty quickly. And since this step is probably 80% reader I though it would be best to keep it that way.

Would like some feedback on ^^ though.

Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks mostly good. As discussed internally regarding the colour choices, I think keep it the same as the normal trigger step. It will ultimately head towards that direction after the future refactor.

I'll wait until you've added the rest of the things you plan to add such as those tests before taking another look.

Also I'm curious, have you tried testing if you can reference the event data using ${{ record.eventdata.courseid }} based on your example in VARIABLES.md, as that should by default work.

One test case I'd like to cover as well, is one where on an event, you read some SQL data based on the data from the event and write it somewhere. It'd be interesting to inspect the shape of the $input that gets passed through.

@matthewhilton matthewhilton force-pushed the 31-event-trigger branch 3 times, most recently from 2559a4b to 635d810 Compare November 11, 2022 01:34
@matthewhilton
Copy link
Contributor Author

matthewhilton commented Nov 11, 2022

Was running into #639 causing the unit tests on 3.5 and 3.6 to fail. I implemented a check that if the function exists, use it, otherwise just queue it.

Apart from just copying the entire logic of the function in OR splitting the MOODLE_XX_STABLE branches, I think this is the best solution.

// Only available 3.7 onwards.
if (function_exists('\core\task\manager::reschedule_or_queue_adhoc_task')) {
\core\task\manager::reschedule_or_queue_adhoc_task($task);
} else {
\core\task\manager::queue_adhoc_task($task);
}

@matthewhilton matthewhilton force-pushed the 31-event-trigger branch 4 times, most recently from 532eb10 to f33cf70 Compare November 14, 2022 02:05
@matthewhilton matthewhilton marked this pull request as ready for review November 14, 2022 02:06
@matthewhilton
Copy link
Contributor Author

matthewhilton commented Nov 14, 2022

One test case I'd like to cover as well, is one where on an event, you read some SQL data based on the data from the event and write it somewhere. It'd be interesting to inspect the shape of the $input that gets passed through.

This currently isn't allowed since the sql_reader step does not allow upstream flows (such as the event trigger step). #671 would cover this functionality

Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more comments. Otherwise it looks like it's in a good shape. Once those are cleaned up I'll do another round of functional testing.

VARIABLES.md Outdated Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
classes/local/event_processor.php Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
classes/local/step/trigger_event.php Show resolved Hide resolved
classes/local/step/trigger_event.php Outdated Show resolved Hide resolved
classes/task/process_dataflow_ad_hoc.php Outdated Show resolved Hide resolved
classes/task/process_dataflows.php Outdated Show resolved Hide resolved
@matthewhilton matthewhilton force-pushed the 31-event-trigger branch 2 times, most recently from 05fefba to 1d30817 Compare November 14, 2022 22:27
db/upgrade.php Outdated
@@ -234,5 +234,28 @@ function xmldb_tool_dataflows_upgrade($oldversion) {
upgrade_plugin_savepoint(true, 2022091600, 'tool', 'dataflows');
}

if ($oldversion < 2022100601) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to match the one defined in version.php, and would be better if it was bumped by date instead of a micro bump. I had issues adding the table and noticed the zero was meant to be a 2. :')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I thought i fixed this... will fix it again 😆

@keevan
Copy link
Contributor

keevan commented Nov 15, 2022

Functional testing seems OK. I encountered some issues.

Using the example workflow in this PR

  • I had to change the type to tool_dataflows\local\step\trigger_event (from reader_event)
  • I encountered this on the course page. I believe it to be outputting directly into the page which is not desired.
    image
    When using this setting
    image

@matthewhilton
Copy link
Contributor Author

I noticed something similar too, but I thought it was the result of the debugging writer. Maybe I might add output buffer catching and pass this to the debugging output instead when using this option.

@matthewhilton
Copy link
Contributor Author

matthewhilton commented Nov 17, 2022

Also some things discussed internally:

  • Capture output buffer when running in the same process to stop the dataflow debugging leaking. Can then redirect this to debugging() so if debug is enabled you can see the run.
  • Give warning when changing the event type that previous events are deleted
  • Switch wording of policies:
Run immediately
Run asap in individual tasks in parallel
Run asap in batch tasks in series
  • When a flow doesn't allow parallelism, method 2 cannot run, so it must use either 1 or 3 (probably 3, the queue method)

@matthewhilton
Copy link
Contributor Author

matthewhilton commented Nov 18, 2022

I'm not super fussed on exactly what happens if the config changes as long as nonthing is dropped andit is clearly defined
like I might guess that if you swap from single events in paralell to batch in series then its easier to let the old ones drain and just pick up the new ones
so you might have 4 in parallel + another one in series running at the same time
and then after they are all done we are back to just 1 in series

When adhoc tasks are created these are only recording the dataflow ID to run - no event data is included in the adhoc task itself, this is loaded from the DB when it is run based on the trigger's step id. And once it is loaded into a dataflow, the record is deleted.

The trigger step itself does not really care at all how it is run, you could queue 500 events and 1000 adhoc tasks to process them and it wouldn't care, after all the events are processed the remaining tasks would just have empty iterators.

I think I may move the event consuming from when the execute() is called to when the iterator is created. Then I will add a lock around this to ensure parallel tasks don't access the queue at the same time otherwise it might be possible for multiple parallel tasks to read from the queue twice.

Reorganised the code that handles the execution of the dataflows to
enable outside classes to access it (such as the event trigger)
Not all steps will define a nextruntime. So now it is only logged
if it is present
When determining if the iterator should pull the next,
instead of checking the group, we check now the function
has_producing_iterator. This allows the group to be independent.
Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small comments. Going through functional testing now.

classes/local/step/trigger_event.php Outdated Show resolved Hide resolved
classes/local/event_processor.php Outdated Show resolved Hide resolved
lib.php Outdated Show resolved Hide resolved
Adds event reader step along with an extra helper class event_processor
to handle triggering dataflows from Moodle events
Issues fixed:
      1 - Comma required after last value in array declaration
     10 - Each line in an array declaration must end in a comma
      1 - Expected a blank line before function
      1 - Expected a space after FUNCTION keyword
      1 - Expected a space after SWITCH keyword
      1 - Multi-line array contains a single value
@keevan keevan enabled auto-merge November 21, 2022 03:39
Copy link
Contributor

@keevan keevan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Merging in for now and we'll tweak if we encounter any issues

@keevan keevan merged commit 6ff610c into MOODLE_35_STABLE Nov 21, 2022
@keevan keevan deleted the 31-event-trigger branch November 21, 2022 03:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants