Basic server-side implementation of configurable expectation proposal #828

GabrielKS · 2021-07-13T23:40:04Z

What this PR does:

Provides a sample expectation/notification configuration file and a complete schema for such files
Exposes functions to calculate expectation/notification settings for a given trip based on its end date and inferred labels
Adds a stage to the intake pipeline that adds expectation data to each trip
Sends expectation data to the client app (which currently ignores it)
Fixes a bug in TestLabelInferencePipeline.py
Includes unit tests for most of the above

What this PR doesn't do:

Validate the expectation/notification configuration file against the schema
Fully implement the expectation proposal — for now, the only expect types implemented are "all" and "none"; the others default to "all"
Implement any notification behavior aside from the logic to calculate what the notifications should be

+ Adds pipeline stage EXPECTATION_POPULATION, which will compute the user input expectation for each trip, between LABEL_INFERENCE and CREATE_CONFIRMED_OBJECTS - The stage currently uses a placeholder algorithm that marks every trip as not expected Tested by confirming that confirmed trips are as desired and that client behavior is unchanged

+ expectation_notification_config.py: * Loads the configuration file * Implements rule- and schedule-matching algorithms * Ultimately exposes methods to get trip- and collection mode-level settings + TestExpectationNotificationConfig.py: Tests all of that. Currently passing. Schema changes: - Removes draftDelay option + Notes that omitting a time zone is valid and has a specific meaning None of this is yet connected to the pipeline.

The bug was due to the fact that the test datasets shankari_2015-07-22 and shankari_2015-jul-22 both exist and are different things. I was loading shankari_2015-jul-22 and it was not actually loading any trips; switching to shankari_2015-07-22 solved this. + Added a test case so if this happens in the future the tests actually fail

+ The expectation stage of the pipeline now consults the configuration to determine a trip's expectation status - For now, the only expect types implemented are "all" and "none"; the others default to "all" - Still no verification against the schema + Includes extensive unit tests Tested by running the unit tests and confirming that client-side behavior is as expected.

GabrielKS · 2021-07-13T23:46:39Z

The everlasting UI proposal, for reference — though it's a bit out of date and could use a new draft. A better reference for the config file is the schema itself.

shankari

I don't see any code changes required; these are primarily requests for clarification/additional documentation to improve long-term maintainability.

conf/ux/expectations.conf.schema.json

shankari · 2021-07-17T14:56:29Z

conf/ux/expectations.conf.schema.json

+                "enabled": {
+                    "description": "Whether or not to enable this mode; mode is enabled if this property is omitted",
+                    "type": "boolean"
+                },


When do you anticipate deployers will/should use this v/s setting up a schedule?

I just thought I'd add a way to "comment out" a mode. It's not strictly necessary. Do you think it adds excessive complexity?

no, that's fine. Just update the documentation in a future commit to highlight this.

shankari · 2021-07-17T15:06:18Z

conf/ux/expectations.conf.schema.json

+                    "type": "string"
+                },
+                "confidenceThreshold": {
+                    "description": "Only display yellow labels with confidence greater than this threshold (0 to display all, 1 to display none)",


Is this "only display" or "only mark"?

From this
https://github.com/e-mission/e-mission-server/pull/828/files#diff-5e8452621bcf334e7ef59c9d10744703d736adbb6109ce4cced282bf9b8c0631R92

It looks like if the confidence is below the threshold, we always show it.

Concretely, I have a trip with confidence 0.6.
Am I correct that in the "intensive" mode, this will be marked ("trigger": -1,) and in the "relaxed" mode, this will be marked "yellow"? An inline example will help because, as you might have figured out, nobody RTFMs.

This is actually something I haven't figured out yet. If the confidence for a given label is below the threshold, the phone should always display that label as red and blank; that part I'm sure of (and have implemented accordingly on the client side). The current implementation has these below-threshold labels also counting as red for the purposes of rule matching as well, but I'm not sure what the best choice is there.

If we wanted to be maximally flexible, we would not have below-threshold labels match "trigger": -1 because there is a semantic difference between very low confidence inferences and no inferences at all, and if a study admin wanted to treat them the same they could just write one more rule to explicitly match these trips. The argument in favor of having below-threshold labels match "trigger": -1 (the status quo) is that it forces expectation/notification behavior to match what the user actually sees: to the user, red labels are red labels, if some trips that display red labels are appearing in To Label and some aren't (and it's not because of some random sample option), they might get confused. It's also simpler.

I agree that this is complex. Let's discuss it after tomorrow. I really want to deploy this today so we can walk through the new screen at tomorrow's deployer/beta tester meeting.

shankari · 2021-07-17T15:09:33Z

conf/ux/expectations.conf.schema.json

+            "type": "object",
+            "properties": {
+                "startDate": {
+                    "description": "The base date from which to measure time; omit time zone to signify user's local time",


Timezones are indeed tricky (remind me to tell you about the Brazilian timezone problem sometime), but I am concerned that this is over-engineering.

I anticipate that the related code is also tricky and brittle (now I understand why you had to do complicated time management), and it also introduces ambiguity.

Why do you think it is important to have "the user's local time" as opposed to using UTC, for example? Again:

we are designing this for deployers, not end users, AND

hopefully, we can auto-detect individual schedules in the future

The logic for this part is actually relatively simple; the phone's time zone is embedded in the trip data object, so the server just inserts it if we're missing a time zone. (The complicated part is the schedule matching, regardless of time zones, and in particular figuring out how many months/weeks/days there are between two dates — see expectation_notification_config.py:64).

Let's say we want to do primary data collection for one week, Monday to Sunday, every eight weeks. For simplicity, say we want the user to label all trips at the end of the day under primary data collection and none under secondary data collection. Under the existing schema, we can implement this as a schedule that begins at 00:00 user's local time and lasts 7 days, and specify a startDate that is a Monday. Then the user will have trips appear on their To Label screen for exactly 7 days every 56 days; if they label trips at the end of each day as they are supposed to, they will be interacting with the app 1 in 8 days.

Say we remove the ability to specify a time in the user's local time zone. We implement the schedule as beginning at 00:00 UTC. If the user is in a US time zone, primary mode will begin on Sunday evening and last until the next Sunday evening. Though this is the same length of time and on average will cover the same number of trips, it now spans 8 days instead of 7. If the user is labeling at the end of each day, they will be interacting with the app 1 in 7 days. This is worse.

To summarize, the ability to use a user's local time zone minimizes the number of days the user has to interact with the app. Even if they'll be labeling the same number of trips either way, every day we can avoid making the user do something is a hassle avoided.

I suppose if we wanted to simplify we could only use local time and not give the option to specify a time zone.

Again, let's discuss this after tomorrow. Right now, I'm going to merge this and get the actual recommendations to show up.

emission/core/wrapper/confirmedtrip.py

emission/analysis/userinput/expectations.py

shankari · 2021-07-17T19:02:42Z

emission/tests/pipelineTests/TestExpectationPipeline.py

+        eaue._test_options = test_options_stash
+
+    def preprocess(self, trip):
+        # See eacilp.placeholder_predictor_2 for an explanation of the "fingerprint" technique


I don't see this explanation here - only a one-liner about Timestamp2index.

I can flesh out that explanation in an upcoming commit.

shankari · 2021-07-17T19:12:06Z

emission/analysis/userinput/expectations.py

+def _get_expectation_for_trip(trip):
+    raw_expectation = eace.get_expectation(trip)
+    # For now, expect always labeling unless the config file specifies no labeling at all
+    processed_expectation = not raw_expectation["type"] == "none"
+    return {"to_label": processed_expectation}


Can you add in a comment about what the difference between raw and processed is and is intended to be?
My understanding is that the raw expectation is of the form

"expect": { "type": "randomDays", "value": 2 },

and the processed is a simple boolean true/false on whether trips need to be labeled or not.

That's correct. Will add the documentation in an upcoming commit.

shankari · 2021-07-17T19:14:40Z

emission/tests/pipelineTests/TestExpectationPipeline.py

+        }
+        for trip in self.expected_trips:
+            fingerprint = trip["data"]["start_local_dt"]["hour"]*60+trip["data"]["start_local_dt"]["minute"]
+            if answers[fingerprint] is not None: self.assertEqual(trip["data"]["expectation"]["to_label"], answers[fingerprint])


why do you need this? won't the "randomDays" expectations also return True with your current implementation?

I want the tests to enforce the actual intended behavior, not the current implementation. The actual intended behavior for the random options is not currently well-defined, so we don't test it — returning True for the random options is an implementation detail (and a placeholder, at that).

shankari · 2021-07-17T19:46:02Z

emission/tests/pipelineTests/TestExpectationNotificationConfig.py

+
+
+    def testGetRuleForLabel(self):
+        # An answers entry is a (labels, index of corresponding rule under intensive mode, index of corresponding rule under relaxed mode) tuple


nit: expand on this a bit to explain how you go from the list of all rules to the selected rule.

Please handle this in a subsequent commit.

I'm not sure I understand what you mean.

shankari

I am going to go ahead and merge this now so we can deploy tonight, demo tomorrow and start getting feedback. I've left the "will handle in subsequent commit" comments unresolved for your reference.

This makes it compatible with https://github.com/e-mission/e-mission-server/releases/tag/v3.0.0 in particular, by merging e-mission/e-mission-server#829 and e-mission/e-mission-server#828 e-mission/e-mission-server#825

GabrielKS added 5 commits July 8, 2021 15:26

Add previously written expectations config sample and schema

5b32a41

shankari requested changes Jul 17, 2021

View reviewed changes

shankari approved these changes Jul 19, 2021

View reviewed changes

shankari merged commit ed2adbb into e-mission:master Jul 19, 2021

shankari mentioned this pull request Sep 16, 2021

Add the new label model creation to the crontab e-mission/e-mission-docker#20

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Basic server-side implementation of configurable expectation proposal #828

Basic server-side implementation of configurable expectation proposal #828

GabrielKS commented Jul 13, 2021

GabrielKS commented Jul 13, 2021

shankari left a comment

shankari Jul 17, 2021

GabrielKS Jul 19, 2021

shankari Jul 19, 2021

shankari Jul 17, 2021

GabrielKS Jul 19, 2021

shankari Jul 19, 2021

shankari Jul 17, 2021

GabrielKS Jul 19, 2021

shankari Jul 19, 2021

shankari Jul 17, 2021

GabrielKS Jul 19, 2021

shankari Jul 17, 2021

GabrielKS Jul 19, 2021

shankari Jul 17, 2021

GabrielKS Jul 19, 2021

shankari Jul 17, 2021

shankari Jul 19, 2021

GabrielKS Jul 19, 2021

shankari left a comment



		def testGetRuleForLabel(self):
		# An answers entry is a (labels, index of corresponding rule under intensive mode, index of corresponding rule under relaxed mode) tuple

Basic server-side implementation of configurable expectation proposal #828

Basic server-side implementation of configurable expectation proposal #828

Conversation

GabrielKS commented Jul 13, 2021

GabrielKS commented Jul 13, 2021

shankari left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shankari left a comment

Choose a reason for hiding this comment