User reports lots of spurious trips on iOS #704
Hm, the user does not appear to have any transitions for the past week
Returns an empty dataframe.
Searching backwards, we find that the last transition was from
shows us that the last trip is indeed from
Need to investigate why we stopped getting transitions and how our algorithm works when they are not present.
Doing an initial pass at classifying good vs. bad:
And plotting the trips, they are indeed in a straight line across town (maps redacted for privacy reasons).
Checking to see if this is a characteristic of all potential bad trips and of any potential good trips.
There are no location points for the bad trips.
There are no location points for the good trips either.
There are apparently no location points for the entire month of Feb.
The last location point was from December as well.
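The per-trip location check above could be sketched roughly as follows. This is only an illustration, not the code used in the investigation: `count_points_per_trip` is a hypothetical helper, and it assumes trips are available as `(start_ts, end_ts)` pairs and locations as plain timestamps in the same units.

```python
from bisect import bisect_left, bisect_right


def count_points_per_trip(trips, location_ts):
    """Count location points falling inside each trip's time window.

    trips: list of (start_ts, end_ts) tuples (hypothetical representation).
    location_ts: iterable of location timestamps, in the same units.
    Returns one count per trip, in the same order as `trips`.
    """
    sorted_ts = sorted(location_ts)
    counts = []
    for start_ts, end_ts in trips:
        # Index of the first point at or after the trip start
        lo = bisect_left(sorted_ts, start_ts)
        # Index just past the last point at or before the trip end
        hi = bisect_right(sorted_ts, end_ts)
        counts.append(hi - lo)
    return counts
```

A trip whose count is zero has no location points at all, which is what we saw here for both the "good" and the "bad" trips.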
It also turned out that we hadn't filtered for the 7th correctly. After fixing this, we now have:
Every single trip seems to be a straight line, but they don't always have the same endpoints. The main difference between the "good" and "bad" trips seems to be that the endpoints sometimes double back. But given that they are straight lines, the distance between the endpoints and the distance of the trip are likely to be the same. Let's see if that helps.
Ah, they are straight lines there and back. The actual O-D distance, even for the "bad trips", is very small.
Unfortunately, that means that we can't actually use the O-D distance, since this could happen legitimately for a round trip.
So if we categorize further:
Visualizing those 4 trips, we get what appear to be one-way trips. Let's see how many trips from the beginning of Feb would be affected.
The majority are from the 6th, 7th and 8th, with one from the 1st. The scatter plot shows vertical lines at various distances.
To recap, at this point, we have a pretty good check (O-D distance < 100 m). Let's plot the trip from the 1st, since it is the most likely false positive (if one exists).
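For reference, an O-D distance check of this kind could look something like the sketch below. It assumes start and end points are `(lat, lon)` pairs in degrees and uses the haversine formula. The helper names and the choice of distance formula are illustrative assumptions, not necessarily what the analysis used; only the 100 m threshold comes from the recap.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_small_od(start, end, threshold_m=100.0):
    """True if the origin-destination distance is below the threshold.

    start, end: (lat, lon) tuples in degrees (hypothetical representation).
    """
    return haversine_m(start[0], start[1], end[0], end[1]) < threshold_m
```

As noted earlier, a small O-D distance also occurs for legitimate round trips, so this check is not sufficient by itself.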
Checking the other fields, it is a lot more than 7k in distance.
So there are 8 trips > 7k in distance
On mapping them, the first and last entries (3 and 71) are valid round trips. Plotting the various trip level metrics, we don't see a clear separation between valid and invalid.
Re-exported data for only the year 2022. We now see transitions, and all the transitions for the 7th seem to be visit only, without a corresponding geofence exit.
Re-running the rest of the analysis, we now have 79 trips, so it looks like the issue resolved itself after the 8th?
Looking at these last four trips, one has a clearly defined trajectory. The others are little groups of points, similar to some trips on the 8th. But the number of locations seems like a potential discriminator.
Still have the same potentially bad trips that are actually good. Plotting this, we get
So it looks like that will work!
Double checking by mapping some known bad trips from the morning of the 7th. The last few trips from the 8th onwards include one trip that looks like that, and others that just look like a cluster of points at the destination. So the big gap/sparse points check seems good, at least for this user at this time. We need to think about whether we want to incorporate it into the regular pipeline. Double checking...
And for the mixed dataset
Note that our filter distance is supposed to be 1 meter.
So a possible threshold could be 100x that, i.e. a density of > 100 m.
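A minimal sketch of such a density check is below. It assumes "density" means trip distance divided by the number of location points (meters per point). That definition is an interpretation of the discussion above, and the function names are hypothetical.

```python
FILTER_DISTANCE_M = 1.0  # expected spacing between location points, per the note above
DENSITY_THRESHOLD_M = 100 * FILTER_DISTANCE_M  # 100x the filter distance


def meters_per_point(trip_distance_m, num_locations):
    """Average distance covered per location point (assumed meaning of "density")."""
    if num_locations <= 0:
        # No points at all: treat the trip as infinitely sparse
        return float("inf")
    return trip_distance_m / num_locations


def is_sparse(trip_distance_m, num_locations, threshold_m=DENSITY_THRESHOLD_M):
    """True if the trip has far fewer points than the filter distance would suggest."""
    return meters_per_point(trip_distance_m, num_locations) > threshold_m
```

For example, a 7000 m trip with only 10 points averages 700 m per point and would be flagged, while the same trip with several thousand points would not.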
To summarize, our check for "invalid trip" is:
Let's see how many of these show up for this user overall
Recomputing in a different way, we get the same result:
Visualizing the maps before 6th Feb, we get a bunch of valid trips. We need to add the density check as well.
After adding the density check, it looks good.
Note also that the user does not have any motion activity data.
I wonder if that is the reason why our spurious trip detection code is not catching this automatically.
If we were incorporating this into the pipeline, we would reset the pipeline to before the 6th and then re-run. Before inserting the entries, the user inputs are as below. It looks like the user confirmed several trips before stopping.
After manually inserting entries on a copy of the database and then re-running the pipeline, we get
After configuring the analysis pipeline to include replaced mode, we get
We are now ready to change this on the production server once we get confirmation from the user that the trips are in fact spurious.
"Thank you. Trips are showing a straight line across town. "