Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow offline usage for POIs #1169

Closed
HarelM opened this issue Feb 27, 2020 · 19 comments
Closed

Allow offline usage for POIs #1169

HarelM opened this issue Feb 27, 2020 · 19 comments
Assignees
Labels
enhancement Low Nice to have or feature or a very edge case bug

Comments

@HarelM
Copy link
Member

HarelM commented Feb 27, 2020

Infra

The following library might help, but I'm not sure how to utilize it correctly.
Loading a file/large data to memory every time the app starts can be annoying.
https://github.com/mapbox/geojson-vt
Also there's a way to check which network is currently being used here:
https://developer.mozilla.org/en-US/docs/Web/API/NetworkInformation
There's a question of how to update the data without downloading everything every time.
Relevant SO question:
https://stackoverflow.com/questions/58330896/how-to-index-geojson-data-in-browser-to-store-in-indexdb-using-geojson-vt

@HarelM
Copy link
Member Author

HarelM commented Feb 27, 2020

Reading more into the geojson-vt code reveals that the tiles are saved inside a variable called tiles - this variable might can be used as a post-processing data that can be stored inside IndexDB or mbtiles file for offline use or online performance improve.

@HarelM
Copy link
Member Author

HarelM commented Feb 27, 2020

Also there's this library which can convert geojson to MVT (I hope):
https://github.com/anyways-open/NetTopologySuite.IO.VectorTiles
See here:
NetTopologySuite/NetTopologySuite.IO.GeoJSON#39

@HarelM HarelM added enhancement Low Nice to have or feature or a very edge case bug labels Mar 3, 2020
@HarelM
Copy link
Member Author

HarelM commented May 19, 2020

Problem:

MVT does not support clustering right now see here (Only geojson source allows clustering):
mapbox/mapbox-gl-js#5856

Soluton 1

Following this guide to eliminate properties that are not needed for the clustering allows to lower the geojson size to be ~ 6 Mb (Which can be shipped with the app to allow initial usage without waiting).
https://docs.mapbox.com/help/troubleshooting/working-with-large-geojson-data/
But there's still a need for the extra data when opening a POI. The file that has all the data needed is ~50Mb
So splitting this into tiles doesn't seem to solve much, as the data still needs to be aggregated in the client to facilitate for the GeoJSON source.
The main problem the way I see it is that there's a need to download a lot of data in order for this to work, which I'm not sure when to download it and how.

Solution 2

Another option is to ditch the clustering and move to zoom based POI showing - much like how the map is currently working.
This will allow using vector tiles with a dedicated style layer that the code will hold. There might be a lot of duplication between this and the map style, but that's a small detail.
Current categories can be used as a source-layer to allow filtering.
This is a far more easy solution to implement, but will make the POIs UX cumbersome in that the user will zoom out and POI will vanish.

Solution 3 - preferred:

Another option which might combine both of the above is the following:
https://docs.mapbox.com/mapbox-gl-js/api/#map#querysourcefeatures
Using a vector tile source to update the in-memory geojson source that will be presented to the user, clustered - we are currently building the geojson by sending a request to the server and getting the features back, this is basically changing to "tiled based" server requests, I think...

@HarelM
Copy link
Member Author

HarelM commented May 19, 2020

OK, finally got the POC working. Clustering seems to be working, speed is a lot better.
How it will work:

  1. Add a source to POIs MBTiles
  2. Add a dummy layer to make sure the tiles are in use, otherwise no features are returned
  3. Get the features from the source and add them to the relevant geojson (maybe I can merge the geojson sources in this case to avoid overlapping clusters, not sure...)
  4. Show the geojson as a cluster

There's a point to take into consideration here - when using features from a tiled source features can be duplicated, need to make sure they are unique.

There's a need to create a POI MBtiles now to fully test this.

@HarelM
Copy link
Member Author

HarelM commented May 20, 2020

When opening a POI that is off screen (like search and POI link) there's a need to get a POI by ID.
This won't work with a source that returns only features that are within the screen. Opening a tile and extracting a feature from it won't work in terms of performance, there's need to be a mapping between ID and feature in a way that is fast.

I'm starting to think there's a need for two different databases - one for the POIs that can be tiled base, only to show the POIs on screen and a second one to allow getting a POI by ID. MBtiles database can do that assuming there are multiple tables, offline, and elastic can do it when online in the server side...

@HarelM
Copy link
Member Author

HarelM commented May 21, 2020

OK, latest phone discussion:
The problem here should be split into two unrelated problems:

  1. Present a POI layer on the map - the best solution right now is using a slim geojson (contains minimal data on each point) stored in the site and on the device, refresh will download this entire file - current test shows it's about 6 Mb.
  2. Use a database for search and indexing that will store the extended data - POIs' geometry, links, image Url, etc... this database can be either index DB or SQLite. Current test show that the size is around 42 Mb, As a first solution, this database can be fully downloaded every time, later we'll need a better way to incrementally update it, probably out of scope for this issue.

@HarelM HarelM added this to the Next Release milestone May 24, 2020
@HarelM HarelM self-assigned this May 24, 2020
@HarelM HarelM changed the title Move POIs requests to be tile based Move POIs requests to be tile based and allow offline usage May 24, 2020
@HarelM HarelM changed the title Move POIs requests to be tile based and allow offline usage Improve public POI performance and allow offline usage May 24, 2020
HarelM added a commit that referenced this issue May 25, 2020
HarelM added a commit that referenced this issue May 26, 2020
HarelM added a commit that referenced this issue May 26, 2020
HarelM added a commit that referenced this issue May 26, 2020
…ablity to create only the poi geojson file
HarelM added a commit that referenced this issue May 26, 2020
… to use geojson file for offline use and site regular use.
HarelM added a commit that referenced this issue May 27, 2020
@HarelM
Copy link
Member Author

HarelM commented May 27, 2020

An initial version was created which supports only showing the clustering when offline.
This version can fully work offline in terms of requests that were made when the app starts - i.e. all requests that were made when the app started now have timeout and cached state for offline use.

The current experience of loading the geojson file is noticeable every time you open the app as you see the progress bar downloading the file - this is a bit wasteful in terms of network traffic and probably needs to be optimized using incremental update somehow.

HarelM added a commit that referenced this issue May 31, 2020
@HarelM
Copy link
Member Author

HarelM commented Jun 1, 2020

Latest changes: Moved away from using a file and create the clustering in the client side from the database when the app starts.
Remaining tasks:

  • Allow seeing full details of POIs offline
  • Allow getting updates for POIs including progressbar
  • Supported deleted POIs - server side
  • Support basic offline search - no indexing - in memory mapping.
  • Supported deleted POIs - Delete items in the client side
  • Support images - need to shrink and add to updates request
  • Reduce parallel computation due to high CPU usage
  • Make sure there are not performance issues on the server when all POIs are requested
  • progress text and UX finalization - allow continue before initial download?
  • Handle adding/updating a POI - set update date to the past and wait for rebuild, specific
  • Fix problem with elastic database and duplicate POIs (Duplicate IDs when getting data from database #1247)
  • Rebuild external source before merging the pois
  • Fully migrate to geojson for POIs? Currently using a specific domain model.

HarelM added a commit that referenced this issue Jun 1, 2020
HarelM added a commit that referenced this issue Jun 2, 2020
HarelM added a commit that referenced this issue Jun 2, 2020
@HarelM HarelM changed the title Improve public POI performance and allow offline usage Allow offline usage for POIs Jun 2, 2020
HarelM added a commit that referenced this issue Jun 6, 2020
HarelM added a commit that referenced this issue Jun 6, 2020
@HarelM
Copy link
Member Author

HarelM commented Jun 7, 2020

This seems to be happening on the server which causes a failure in databases rebuild :-(
NetTopologySuite/NetTopologySuite.IO.GeoJSON#46

HarelM added a commit that referenced this issue Jun 9, 2020
HarelM added a commit that referenced this issue Jun 16, 2020
…uest, Added update of all sources to global update procedure.
@HarelM
Copy link
Member Author

HarelM commented Jun 19, 2020

The following scenarios need to be tested - it might be a good idea to add unit test for each scenario (if possible).

  1. Add POI from the app when update is not running
  2. Add POI from the app when update is running
  3. Update POI from the app when update is not running
  4. Update POI from the app when update is running
  5. Add POI from the site when update is running - check what happens and when in the mobile version
  6. Add POI from the site when the update is not running - check what happens and when in the mobile version
  7. Update POI from the site when update is running - check what happens and when in the mobile version
  8. Update POI from the site when the update is not running - check what happens and when in the mobile version
  9. Add POI from external editor
  10. Update POI from external editor

After each scenario above it would be best to check the OSM element in the OSM editor, The site and the app to see how it looks.

The update is running every hour, on the hour, and takes about 15 minutes.

Note to myself:
Update candidate:
https://israelhiking.osm.org.il/poi/OSM/node_650493756?language=he
Add candidate: El-Roi old train station

@HarelM
Copy link
Member Author

HarelM commented Jun 19, 2020

When rebuilding the databases (external and OSM) the following scenario should be taken into consideration:
8:00 - Start getting all external source updates
8:01 - Added a point to Wikipedia which will be missed by the above process
8:04 - Update a point in OSM which won't be missed by the below process
8:05 - Finished updating external sources without the point added in 8:01
8:06 - Start updating to latest OSM file that will include the point updated in 8:04
8:08 - Finished - all elements updated until 8:06 are now being processed.
8:15 - All the elements are updated in the database.

HarelM added a commit that referenced this issue Jun 19, 2020
@HarelM
Copy link
Member Author

HarelM commented Jun 20, 2020

The following scenario should be tested as well, Same as the scenario above only
07:58 the point that is going to be updated in 8:04 is updated now

@HarelM HarelM closed this as completed in f708d68 Jun 20, 2020
@daniela-green
Copy link

Hey @HarelM, i just visited you site: https://israelhiking.osm.org.il/map/12.53/31.4760/35.0611.
I see you have clusters there.
Did you manage to make clusters work with tiles?

Thanks,
Daniela

@HarelM
Copy link
Member Author

HarelM commented Jul 29, 2020

Hi @daniela-green,
Well, the answer is a bit complicated.
I'm using a geojson source, this type of source doesn't have a tiled solution, as far as I managed to research, when it comes to using it in mapbox.
I wanted to allow offline usage and filtering so all the data needs to be in the client side so there was no added value for using tiled data (as can be read in this thread, there are also problems with using it).
For the maps "background" we are using tiled data and I belive vector tile source can be clustered, although I haven't tested it, but mapbox examples are usually good, so you should be able to google it.
We are using a geojson source with 20k records and it takes a few seconds (1-2) to load to the map so its faster for us to load it all (since it's already in the client DB).

I hope this helps.
Feel free to send me a private message on linked-in if you would like to talk on the phone.

@HarelM
Copy link
Member Author

HarelM commented Jul 29, 2020

Ok, sorry for the partially correct message I just wrote. I needed to read more in order to remember what I did.
As can be seen in the linked mapbox issue, clustering is not available for tiled vector source.
What I managed to do was get the data from the vector tile source that is currently visible, add it to a dynamic geojson source and cluster that geojson source.
After I did it I understood that it's redundant for offline use since I need all the data in the client and storing it in a database as a list of features is easier and simpler.
Having said that, I did manage to make it work.

@daniela-green
Copy link

daniela-green commented Jul 30, 2020

Hey @HarelM, thank you for your thorough reply!
You wrote:
"..What I managed to do was get the data from the vector tile source that is currently visible, add it to a dynamic geojson source and cluster that geojson source."

Can you explain more on how you manage to extract the data from a vector tile request and add it to a geojson?
Cause from mapbox spec (image below), they dont let you deal with the response of vector tiles. It seems that they load it directly into the map.
Mapbox spec on vector tiles sources:
image

Thanks again,
Daniela

@HarelM
Copy link
Member Author

HarelM commented Jul 30, 2020

@daniela-green check out the following comment I wrote here:
#1169 (comment)
Follow the four steps where mbtiles is equal to a vector source.
You can use the following mapbox method:
https://docs.mapbox.com/mapbox-gl-js/api/map/#map#querysourcefeatures

@HarelM
Copy link
Member Author

HarelM commented Jul 30, 2020

Just to clarify, I'm not doing anything with the network traffic. I let mapbox do all the heavy lifting of network traffic, caching and what not, and simply query the source from the map. This requires a "dummy" layer (which uses the source but doesn't draw anything on the map) to make sure mapbox uses the relevant source and loads its features.
You can also open the pbf tile in theory using mapbox js open source libraries but I'm not sure you would want to go that way...

@daniela-green
Copy link

Got it, thank you for this great explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Low Nice to have or feature or a very edge case bug
Projects
None yet
Development

No branches or pull requests

2 participants