Fixture generation utilities #9652

Closed
wants to merge 11 commits into from


abdelrahman725
Contributor

@abdelrahman725 abdelrahman725 commented Aug 23, 2022

GSoC Work Summary

Created three Python scripts that generate all relevant data for the models of the following apps:

content
kolibriauth
lessons
exams
logger

Goal

before

Kolibri lacked authentic test data representing real usage scenarios. The existing data generation utilities run during unit tests, which makes them inefficient.

after

New scripts generate authentic data (for the models of the apps specified above) that is representative of real Kolibri data and covers most use cases and scenarios. Generation is deterministic: developers can specify exactly what data to produce (e.g. the range or number of records, which fields to include) depending on the requirements of the current testing scenario. The generated data can also be dumped into .json fixture files for direct use in unit tests.

Features

to do ..

Usage

Run kolibri manage (script_name) with or without arguments,

where script_name can be:

generate_content_data for content app

  • --channels: number of channel trees (default: 1)
  • --levels: number of channel tree levels (default: 2)
  • --children: number of children for each parent node of kind topic (default: 3)
  • --resources_kind: kind of resources (default: random)
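
To get a feel for how --levels and --children affect the size of a generated channel, here is a rough sketch. It assumes a complete tree in which topics fill every depth above the last one and all leaves are resources. tree_size is a hypothetical helper, not a function from these scripts, and the real generator may count levels differently.

```python
def tree_size(depth, children):
    """Return (topics, resources) for a complete tree.

    Assumption (not taken from the scripts): topics occupy depths
    0..depth-1 and each topic at the deepest topic level has
    `children` resource leaves.
    """
    # One root, then `children` times as many nodes at every deeper level.
    topics = sum(children ** d for d in range(depth))
    resources = children ** depth
    return topics, resources


if __name__ == "__main__":
    for depth in (2, 3, 4):
        topics, resources = tree_size(depth, children=3)
        print("depth={} topics={} resources={}".format(depth, topics, resources))
```

Under the same assumptions, tree_size(4, 7) gives 400 topics and 2401 resources, so node counts grow very quickly as either argument increases.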

generate_auth_data for kolibriauth, lessons and exams apps

  • --facilities: number of facilities (default: 1)
  • --not_assigned_users: number of facility users that are not assigned to any collection (default: 5)
  • --admins: number of facility admins (default: 1)
  • --coaches: number of facility coaches (default: 1)
  • --classes: number of classes (default: 2)
  • --class_coaches: number of class coaches (default: 1)
  • --class_learners: number of class learners (default: 20)
  • --class_lessons: number of class lessons (default: 3)
  • --class_exams: number of class exams (default: 3)
  • --groups: number of groups per class (default: 1)
  • --group_members: number of group members (default: 5)
  • --adhoc_lessons: number of lessons assigned to specific learners (default: 0)
  • --adhoc_lesson_learners: number of adhoc_lesson learners (default: 0)
  • --adhoc_exams: number of exams assigned to specific learners (default: 0)
  • --adhoc_exams_learners: number of adhoc_exam learners (default: 0)

generate_interactions for logger app

  • --users: number of authenticated users (default: 20)
  • --visitors: number of anonymous users (default: 5)
  • --start_time: minimum start_timestamp for all logs (default: 2022-01-01)
  • --end_time: maximum end_timestamp for all logs (default: current run time)
  • --session: Kolibri user session duration in minutes, >= 15 (default: 15)
  • --n_sessions: number of user sessions in Kolibri (not used yet)
  • --n_resources: how many resources each user should interact with (not used yet)
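
As an illustration of how --start_time, --end_time and --session could fit together, the sketch below draws one session window that lies entirely inside the allowed range. random_session and MIN_SESSION_MINUTES are hypothetical names made up for this example; this is not the code in generate_interactions.py.

```python
import random
from datetime import datetime, timedelta

# Assumed lower bound, mirroring the ">= 15" note on --session.
MIN_SESSION_MINUTES = 15


def random_session(rng, start, end, minutes=MIN_SESSION_MINUTES):
    """Return (session_start, session_end) with start <= both <= end."""
    if minutes < MIN_SESSION_MINUTES:
        raise ValueError("session must last at least 15 minutes")
    duration = timedelta(minutes=minutes)
    latest_start = end - duration
    if latest_start < start:
        raise ValueError("time range is shorter than one session")
    # Pick a start offset so the whole session still ends before `end`.
    offset = rng.uniform(0, (latest_start - start).total_seconds())
    session_start = start + timedelta(seconds=offset)
    return session_start, session_start + duration


if __name__ == "__main__":
    rng = random.Random(1)  # seeded, so the window is reproducible
    print(random_session(rng, datetime(2022, 1, 1), datetime(2022, 2, 1)))
```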

shared arguments

  • --mode: destination of the generated data, either a JSON fixtures file or the local db (default: default_db)
  • --seed: random seed value, so all operations can be randomized predictably (default: 1)
  • --fixtures_path: fixtures file path
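
The point of --seed is that the same seed yields the same data on every run. A minimal sketch of that idea using only the standard library follows. fake_usernames is a hypothetical helper for illustration, and the scripts themselves may seed their randomness differently.

```python
import random


def fake_usernames(seed, count):
    """Return `count` pseudo-random usernames that depend only on `seed`."""
    # A private Random instance avoids touching the global RNG state.
    rng = random.Random(seed)
    return ["user_{:04d}".format(rng.randrange(10000)) for _ in range(count)]


if __name__ == "__main__":
    first = fake_usernames(seed=1, count=5)
    second = fake_usernames(seed=1, count=5)
    # Same seed, same output: this is what makes the data predictable.
    print(first == second)
```

Using a dedicated random.Random instance rather than random.seed keeps the generated data independent of any other code that draws random numbers in the same process.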

Testing checklist

  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Critical and brittle code paths are covered by unit tests

PR process

  • PR has the correct target branch and milestone
  • PR has 'needs review' or 'work-in-progress' label
  • If PR is ready for review, a reviewer has been added. (Don't use 'Assignees')
  • If this is an important user-facing change, PR or related issue has a 'changelog' label
  • If this includes an internal dependency change, a link to the diff is provided

Reviewer checklist

  • Automated test coverage is satisfactory
  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

Member

@jredrejo jredrejo left a comment


I've left some comments inline in the code. After testing the command I've seen these issues too:

  • In generate_auth_data, lessons and exams are always assigned to a single classroom, even if there are multiple classrooms

  • kolibri manage generate_auth_data --mode=default_db --groups=5 --class_exams=3 --class_lessons=5 --classes=4 creates 2 classes instead of 4

  • In generate_content_data, level=3 produces 400 topics and 2401 nodes. Can you explain these numbers? I ask because kolibri manage generate_content_data --mode=default_db --levels=5 seems to run into an infinite loop

  • In generate_content_data, only videos and topics are created, no other kinds

  • One test is failing in generate_auth_data.py, line 332, because you're trying to unpack a list inside another list with the * operator, which only works in function calls

  • Creating the fixtures does not work because it looks for a nonexistent directory:

 start dumping fixtures for content app 

CommandError: Unable to serialize database: [Errno 2] No such file or directory: 'fixtures/all_content_data.json'
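
One possible way to avoid the missing-directory error above is to create the target directory before writing the fixture file. This is only a sketch with a hypothetical dump_fixture helper that writes plain JSON; the command in this PR goes through Django's serialization, and exist_ok requires Python 3.

```python
import json
import os


def dump_fixture(records, path):
    """Write `records` as JSON to `path`, creating parent directories first."""
    directory = os.path.dirname(path)
    if directory:
        # exist_ok=True makes this a no-op when the directory is already there.
        os.makedirs(directory, exist_ok=True)
    with open(path, "w") as fixture_file:
        json.dump(records, fixture_file, indent=2)
    return path


if __name__ == "__main__":
    # Example path only; it mirrors the one in the error message.
    dump_fixture([], os.path.join("fixtures", "all_content_data.json"))
```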

Two separate comments:

  • You've created the PR from your develop branch. It's better to create a separate branch in your repository and open the PR from it; that way it is much easier to rebase if needed and to work on several issues at the same time.
  • When filing a PR, it is good to follow the provided template. In particular, the "reviewing guidance" section helps when reviewing code that can be complex. Bear in mind that many PRs are reviewed by QA people who are not necessarily developers, so it's good to provide instructions on how to test and what to test.

@abdelrahman725
Contributor Author

In generate_auth_data, lessons and exams are always assigned to a single classroom, even if there are multiple classrooms

I tested with 2 classes and 5 lessons for each, and it was working!

@jredrejo jredrejo requested a review from rtibbles September 12, 2022 15:17
@jredrejo jredrejo added the TODO: needs review Waiting for review label Sep 12, 2022
@jredrejo jredrejo added this to the 0.16.0 milestone Sep 12, 2022
Member

@jredrejo jredrejo left a comment


Hello @abdelrahman725 ,
The code for the first two commands looks good and seems to work properly.
But generate_interactions.py does not: just executing it without any args fails with

  File "/datos/le/mio/kolibri/kolibri/core/logger/management/commands/generate_interactions.py", line 369, in generate_interactions
    generate_visitor_content_session_logs(
  File "/datos/le/mio/kolibri/kolibri/core/logger/management/commands/generate_interactions.py", line 280, in generate_visitor_content_session_logs
    generate_content_session_log(
  File "/datos/le/mio/kolibri/kolibri/core/logger/management/commands/generate_interactions.py", line 200, in generate_content_session_log
    return ContentSessionLog.objects.create(
  File "/datos/le/mio/kolibri/venv/lib/python3.10/site-packages/django/db/models/manager.py", line 85, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/datos/le/mio/kolibri/venv/lib/python3.10/site-packages/django/db/models/query.py", line 394, in create
    obj.save(force_insert=True, using=self.db)
  File "/datos/le/mio/kolibri/kolibri/core/logger/models.py", line 131, in save
    super(ContentSessionLog, self).save(*args, **kwargs)
  File "/datos/le/mio/kolibri/kolibri/core/auth/models.py", line 287, in save
    self.pre_save()
  File "/datos/le/mio/kolibri/kolibri/core/auth/models.py", line 282, in pre_save
    self.ensure_dataset()
  File "/datos/le/mio/kolibri/kolibri/core/auth/models.py", line 296, in ensure_dataset
    inferred_dataset_id = self.infer_dataset(*args, **kwargs)
  File "/datos/le/mio/kolibri/kolibri/core/logger/models.py", line 93, in infer_dataset
    raise AssertionError("Before you can save logs, you must have a facility")
AssertionError: Before you can save logs, you must have a facility

This was tested after running the other two commands, so the db has content, facilities and users. However, no device settings were created, so the logs cannot find the default facility of the system when adding a visitor that has no associated facility.
So https://github.com/learningequality/kolibri/blob/develop/kolibri/core/device/utils.py#L93 needs to be executed in generate_auth_data.py.

OTOH, as this PR is going to be converted into documentation for these commands, it would be good to add the default values for each of the commands' params. This is not a blocker anyway.

@jredrejo
Member

@abdelrahman725 I can confirm that using
kolibri manage generate_interactions --visitors=0
the code seems to work as expected, so the only pending issue is ensuring that generate_auth_data provisions a device by calling the provision_device function,
i.e. something like

from kolibri.core.device.utils import device_provisioned
from kolibri.core.device.utils import provision_device
...
...
        # if device has not been provisioned, set it up
        if not device_provisioned():
            provision_device()

at the end of the start_generating function would do it.

@rtibbles rtibbles changed the title Scripts 1st version (Fixtures generation for Testing) Fixture generation utilities Sep 16, 2022
@rtibbles
Member

Superseded by #11859

@rtibbles rtibbles closed this Dec 19, 2024
Labels
TODO: needs review Waiting for review
4 participants