Allow customizable demographic fields to be set from the backend. #11999

rtibbles · 2024-03-19T19:50:06Z

Overview

Allow customizable demographic data collection for contextualized demographic reporting.

Description and outcomes

During the initial design of the demographic data collection currently in Kolibri, we were aware of additional demographic reporting requirements, but were unable to come up with a more generalizable set of fields than the limited ones we currently collect.

Because of the resource and data transmission constraints in the contexts in which Kolibri is used, collecting additional demographic data outside of the platform places a significant additional burden on implementations, and is a barrier to effective, targeted measurement and evaluation work.

To address this, this feature will pilot demographic data collection beyond the current fixed fields by allowing someone with command line access to Kolibri to add additional fields to be collected. This can be scripted for automated setup when piloting this feature. Only enumerated fields can be added: in the user interface each will be a dropdown menu, and in the backend the schema specification will require an enum of allowed string values.

Technical specifications

A JSONSchema of this rough form will be added to the FacilityDataset extra_fields_schema as an additional property. It will also be reused in the DeviceSettings extra_settings_schema.

translations_schema = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "language": {
                "type": "string",
                "enum": list(KOLIBRI_SUPPORTED_LANGUAGES)
            },
            "message": {
                "type": "string"
            },
        },
    },
    "optional": True,
}


demographic_field_schema = {
    "type": "object",
    "properties": {
        "description": {
            "type": "string"
        },
        "enumValues": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "value": {
                        "type": "string"
                    },
                    "defaultLabel": {
                        "type": "string"
                    },
                    "translations": translations_schema,
                },
            },
        },
        "translations": translations_schema,
    },
}
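
For illustration, a single field specification matching the rough schema above might look like the following. This is only a sketch: the field, its values, and its labels are invented, and "es-es" is assumed to be one of the codes in KOLIBRI_SUPPORTED_LANGUAGES. The helper `allowed_values` is hypothetical and not part of Kolibri.

```python
# Hypothetical example data: the field, its values, and its labels are invented.
# "es-es" is assumed to be a supported language code.
school_type_field = {
    "description": "School type",
    "enumValues": [
        {
            "value": "public",
            "defaultLabel": "Public",
            "translations": [{"language": "es-es", "message": "Pública"}],
        },
        {
            "value": "private",
            "defaultLabel": "Private",
            "translations": [{"language": "es-es", "message": "Privada"}],
        },
    ],
    "translations": [{"language": "es-es", "message": "Tipo de escuela"}],
}


def allowed_values(field):
    """Return the raw string values a user may pick for this field."""
    return [option["value"] for option in field["enumValues"]]


values = allowed_values(school_type_field)
```

The stored value for a user would be one of the plain `value` strings; the labels and translations are only for display.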

An extra_fields field will be added to the FacilityUser model, which will be dynamically validated, depending on the value of the schema saved in the extra_fields of its associated FacilityDataset model.

For ease of programmatic setup, a management command will be created to set these field specifications in the device settings of an unprovisioned device. Any Facility created on a device that has been set up this way will copy the field specifications stored in the DeviceSettings.
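
A minimal sketch of the copy step, using plain dicts in place of the Django models. The key name `demographic_fields` and the function `new_dataset_extra_fields` are assumptions made for illustration, not existing Kolibri names.

```python
import copy


def new_dataset_extra_fields(device_extra_settings):
    """Build extra_fields for a new facility dataset from device-level defaults.

    Assumption: the defaults live under a hypothetical "demographic_fields" key.
    """
    defaults = device_extra_settings.get("demographic_fields", [])
    # Deep copy so later edits to the facility do not change the device defaults.
    return {"demographic_fields": copy.deepcopy(defaults)}


device_settings = {
    "demographic_fields": [
        {"description": "School type", "enumValues": [{"value": "public", "defaultLabel": "Public"}]}
    ]
}
dataset_extra_fields = new_dataset_extra_fields(device_settings)
```

Copying rather than referencing means the device settings act only as defaults at facility creation time.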

All user creation and editing workflows will also be updated to allow editing and saving of these new fields. As each field will be a dropdown, this will be implemented using a KSelect, with one displayed for each field. The order of the fields will be determined by the order of the fields saved in the array in JSON.
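
The dropdown options for one field could be derived roughly as below. This is a sketch in Python of logic that would actually live in the frontend; `resolve_label` and `select_options` are hypothetical names, and falling back to `defaultLabel` when no translation matches is an assumption.

```python
def resolve_label(default_label, translations, language):
    """Return the message for `language`, falling back to the default label."""
    for entry in translations or []:
        if entry.get("language") == language:
            return entry["message"]
    return default_label


def select_options(field, language):
    """Build dropdown options for one field, keeping the order of enumValues."""
    return [
        {
            "value": option["value"],
            "label": resolve_label(
                option["defaultLabel"], option.get("translations"), language
            ),
        }
        for option in field["enumValues"]
    ]


# Invented example field; "es-es" is assumed to be a supported language code.
field = {
    "description": "School type",
    "enumValues": [
        {
            "value": "public",
            "defaultLabel": "Public",
            "translations": [{"language": "es-es", "message": "Pública"}],
        },
        {"value": "private", "defaultLabel": "Private"},
    ],
}
options = select_options(field, "es-es")
```

Because lists preserve order, iterating over the saved array of fields and over each field's `enumValues` gives the display order directly.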

This will affect the setup wizard, the user profile page, and facility user management.


jredrejo · 2024-03-20T10:32:30Z

While implementing this pull request, it seemed that the json-schema-validator library (https://pypi.org/project/json-schema-validator/) was the only option compatible with Python 2.7. However, it does not support several newer features of the current JSON Schema standard.

Looking at the proposed schema, I can't find support for translations or enumValues in that library. If this has to be applied in Kolibri 0.16, which supports Python 2.7, I suggest taking a look at https://github.com/zyga/json-schema-validator/blob/master/json_schema_validator/tests/test_schema.py to check what can be used with this library.

Regarding the proposed approach, using customizable fields in the demographic data seems like the least disruptive option. It maintains backward compatibility with the current model and minimizes the changes needed for synchronization. However, I remember there were some problems in the past with Morango syncing JSON data. Perhaps @bjester can shed light on whether this now works as expected.

bjester · 2024-03-20T14:53:18Z

@jredrejo No updates have been made to how JSON fields are handled when synced. If both devices in a sync have modified the data, the changes are not merged, so only one side's writes are preserved.
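
To make the consequence concrete, here is a toy illustration of whole-value replacement. It does not use Morango's API; `resolve_without_merge` is a hypothetical stand-in, the field names are invented, and which side wins is left as a parameter because that is not stated here.

```python
def resolve_without_merge(local_value, remote_value, remote_wins):
    """Whole-value conflict resolution: one side's JSON replaces the other's."""
    return remote_value if remote_wins else local_value


device_a = {"school_type": "public"}  # device A set one field
device_b = {"age_group": "10-14"}     # device B set a different field

result = resolve_without_merge(device_a, device_b, remote_wins=True)
# `result` holds only device B's key; device A's edit is gone, not merged in.
```

Even though the two devices touched different keys, the JSON value is treated as a single unit.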

rtibbles · 2024-03-20T20:40:38Z

> translations or enumValues in the mentioned library

These are just the names of properties that I am defining for the schema. We would then generate the JSONSchema from this for validation of the entries in the extra_fields of the FacilityUser.
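
A rough sketch of that generation step. It assumes each field specification also carries an identifier under a hypothetical `id` key, which the schema in the issue description does not define, to name the property in the user's extra_fields. `build_extra_fields_schema` is likewise a made-up name, and required/optional handling is left out.

```python
def build_extra_fields_schema(field_specs):
    """Turn a list of field specifications into a JSON Schema dict.

    Assumption: each spec has a hypothetical "id" key naming the stored property.
    """
    properties = {}
    for spec in field_specs:
        properties[spec["id"]] = {
            "type": "string",
            # Only the raw values are validated; labels are display-only.
            "enum": [option["value"] for option in spec["enumValues"]],
        }
    return {"type": "object", "properties": properties}


# Invented example specification.
specs = [
    {
        "id": "school_type",
        "description": "School type",
        "enumValues": [
            {"value": "public", "defaultLabel": "Public"},
            {"value": "private", "defaultLabel": "Private"},
        ],
    }
]
generated_schema = build_extra_fields_schema(specs)
```

The generated schema uses only "type", "properties", and "enum", so the custom property names never reach the validator.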

jamalex · 2024-03-20T23:13:26Z

A couple of thoughts:

On the DeviceSettings, it might be clearer to name it something like default_demographic_field_schema, to make it clear that it's only referenced during facility creation and wouldn't override what's set on an existing facility dataset.
I like having the schema synced, and I agree with validating any data entered from the frontend against the schema before saving it to the FacilityUser. I would caution against adding strict model validation on it, though, as to me the consequences of a FacilityUser (and all their data) being entirely blocked from syncing would outweigh the benefits. It would also mean that a formerly valid model could suddenly become invalid because of a change in another model. That change may have happened on another device, meaning we'd need to worry about upgrade routines etc., but where would the logic for mapping between schema versions live? Ensuring that what gets saved from the frontend matches the schema, and then blanking out any invalid field values when loading from the model to the frontend, seems like it could be a safer alternative. For the edge cases where data on a user doesn't match the current schema, we could apply centralized normalization on a case-by-case basis on KDP.
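
A minimal sketch of the "blank out on load" idea. It assumes each field specification has a hypothetical `id` key naming the stored property, and that a blanked value is represented as None; `sanitize_for_frontend` is an invented name.

```python
def sanitize_for_frontend(stored_values, field_specs):
    """Return stored values with anything not allowed by the schema blanked.

    Assumptions: specs carry a hypothetical "id" key; blank is None.
    Stored keys with no matching spec are left out of the result.
    """
    cleaned = {}
    for spec in field_specs:
        allowed = [option["value"] for option in spec["enumValues"]]
        value = stored_values.get(spec["id"])
        cleaned[spec["id"]] = value if value in allowed else None
    return cleaned


# Invented example: "charter" was valid under an older schema on another device.
specs = [
    {"id": "school_type", "enumValues": [{"value": "public"}, {"value": "private"}]},
    {"id": "age_group", "enumValues": [{"value": "10-14"}, {"value": "15-18"}]},
]
stored = {"school_type": "charter", "age_group": "10-14", "retired_field": "x"}
cleaned = sanitize_for_frontend(stored, specs)
```

The stored model is never rejected, so the user keeps syncing, while the frontend only ever shows values the current schema allows.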

rtibbles · 2024-03-21T19:46:34Z

Thanks - it does seem like syncing the schema is desirable, but doing softer validation for a dynamic schema will be necessary.

The issue of it being updated on two different devices and then having conflicting schemas within the FacilityUser data does seem like a nightmare. I'll try to add some test cases around that to make sure we're not making a mess for ourselves.

rtibbles · 2024-04-02T15:13:45Z

Implemented in #12032

rtibbles self-assigned this Mar 21, 2024

rtibbles added the DEV: backend Python, databases, networking, filesystem... label Mar 21, 2024

rtibbles mentioned this issue Mar 28, 2024

Add customizable demographic field entry to facility admin interface #12032

Merged

9 tasks

rtibbles added this to the Kolibri 0.16: Planned Patch 1 milestone Mar 29, 2024

rtibbles closed this as completed Apr 2, 2024
