Skip to content

Cron interface for generated reports

Ashwin Date edited this page Sep 6, 2018 · 2 revisions

Background

In several cases, the report needs a complicated query to get the necessary data. In such cases the reports become highly non-performant, since there are complex queries on multiple tables to show aggregated data.

E.g Shika has a “course report” with the below columns -

Course_id | course_name | total enrolled users | users completed the course | total_time_spent by users

Here the total time spent, comes from the lesson_track table, which stores the timespent by each user for each attempt. Total enrolled users count comes from the enrolled_users table, where there are entries for the user_id and course_id. Users completed the course count comes the course_track table, where there are separate entries for course_id and user_id

When there are hundreds of courses with thousands of users, if the report queries are fired in real-time the server will get heavily loaded.

To avoid, this a solution is proposed to run the heavy queries periodically and store the output in a separate table. The report is then shown on this table. This will reduce the load considerably and will allow for complicated reports with no noticeable performance degradation.

Proposed Solution

  • Plugins may implement a install SQL file which creates the necessary tables upon installation

  • Plugins that need this feature will implement a new method “summarize”

  • This method runs the complex queries / processing and stores the normalised output in the table In case the data to normalise is a lot and cannot be processed in a single pass, the summarize method can return a variable containing state information. The reports framework will serialize and save this state information and send it on the next iteration

  • In TJ Reports introduce a new cron task index.php?option=com_tjreports&task=summarizer.run&plugins[]=plug1&plugins[]=plug2&secret=XXXXX

  • The cron URL should accept an optional plugins[] array. If values are specified for this array the cron will summarize the data only for those plugins. This will allow the crons to be set in a staggered way so that all the plugins don’t run at once.

  • The cron triggers the function “summarize” from the specified plugins or for all plugins if the plugins[] variable is not specified. The previously saved state information is passed to this method.

  • The cron will also save the state information returned by the summarize() method

Tables

#__tjreports_crons

Column Name Details
id Primary key
plugin varchar(255) (Name of the plugin triggered)
last_executed Datetime (The time when this plugin last executed
cron_state Text (Serialized string containing the state information from the cron run)