Google Summer of Code #313

Closed
pnorman opened this issue Feb 19, 2018 · 15 comments

@pnorman
Contributor

pnorman commented Feb 19, 2018

OpenStreetMap is taking part in Google Summer of Code (GSoC) this year, and I was thinking a Tegola project might be possible. I wouldn't be able to mentor this myself, not being a Go programmer.

A good scope would be issues which apply to using OpenStreetMap data with Tegola. In particular, those which touch on some combination of

  • world-wide data;
  • continually updating data; or
  • tools to make queries easier to write against standard OSM PostGIS schemas (e.g. Merging multiple layers #127)

Is there any interest in this from Tegola developers?

@gdey
Member

gdey commented Feb 19, 2018

I would/could help out. Never done something like this before. What would be needed from our side?

@pnorman
Contributor Author

pnorman commented Feb 19, 2018

There are two parts - proposals, then doing the work. We're in the first part right now, so what's needed is a project idea. There are some others at https://wiki.openstreetmap.org/wiki/Google_Summer_of_Code/2018/Project_Ideas.

An idea needs to be

  • software development, not pure documentation or other similar activities, although a good project should include documenting it;
  • about three months of coding for a student;
  • doable without expert knowledge of the subject; and
  • something that will get merged and used if done properly.

It can consist of multiple sub-issues.

Once the project idea is listed, it's a matter of answering questions from potential students.

If a student is interested and submits a proposal, it can be selected. This depends on the quality of the proposal, the number of slots OSM gets assigned, and other factors.

During the coding period, the mentor needs to be in regular communication with the student, making sure that they're progressing, not getting stuck, etc. What this means varies from project to project. Some use email, some use blog posts on osm.org, some use other means. You can treat it similarly to an internship or co-op student. How much help they need to get started depends a great deal on how much technical debt a project has and the quality of its documentation.

The mentor is also responsible for filling out two evaluations of the student, one at mid-term, the other at the end.

https://google.github.io/gsocguides/mentor/ has the full mentors guide.

@ARolek
Member

ARolek commented Feb 20, 2018

@pnorman absolutely interested! Lots of possible projects a student could work on. Just a couple off the top of my head:

  • data providers: we have a backlog of data providers that could be implemented for various geospatial data stores.
  • performance profiling: we're always looking for performance gains. Learning to analyze programs via pprof and then implementing the changes is rewarding and valuable.
  • continually updating data: like you mentioned, deploying OSM data and keeping it up to date requires stitching together several technologies. R&D on this task and documenting it would be valuable to numerous parties.

Looks like the student application period starts March 12th. When do you need proposals by?

@pnorman
Contributor Author

pnorman commented Feb 20, 2018

> data providers: we have a backlog of data providers that could be implemented for various geospatial data stores.

Realistically the only non-PostGIS provider that would get used in an OSM setup would be shapefiles, for the coastlines. Other spatial data stores tend not to be used with OSM.

> performance profiling: we're always looking for performance gains. Learning to analyze programs via pprof and then implementing the changes is rewarding and valuable.

This would be difficult to wrap into a proposal. I'm not saying it couldn't be turned into a good project, but it's hard to distill down into a statement of what work is intended.

> continually updating data: like you mentioned, deploying OSM data and keeping it up to date requires stitching together several technologies. R&D on this task and documenting it would be valuable to numerous parties.

Do we have a good grasp of what is missing from Tegola for this? If all that's missing is documentation then there's not enough for a project. I suspect there are features missing, but we have to identify what. I know that the other parts of the toolchain have all the code required - after all, that's what the standard raster setup does.

> Looks like the student application period starts March 12th. When do you need proposals by?

Soon ;) Interested students are already asking questions about projects, and the ones that ask early are more likely to succeed in my experience.

@gdey
Member

gdey commented Feb 21, 2018

> data providers: we have a backlog of data providers that could be implemented for various geospatial data stores.

> Realistically the only non-PostGIS provider that would get used in an OSM setup would be shapefiles, for the coastlines. Other spatial data stores tend not to be used with OSM.

@pnorman @ARolek

This has been suggested before (#160). There is already a library that can read and write shapefiles in Go. The main thing would be to figure out the indexing strategy: whether an r-tree index makes sense, or the system @murphy214 is proposing.

The surface area is nice, there is a need, and any code they create will be used.
Even creating an initial unindexed provider that is slow is a good start.

@ingenieroariel

ingenieroariel commented Feb 21, 2018

I think an imposm + tegola + golang/geo combination would be a nice experiment that could yield a single binary that looks at a protobuf file and gets you vector tiles.

A single binary with a country .pbf extract and maputnik styles would let you start your own vector tile basemap for country or state level websites. That .exe file could run on Windows without any other dependencies.

@pnorman
Contributor Author

pnorman commented Feb 21, 2018

> This has been suggested before (#160). There is already a library that can read and write shapefiles in Go. The main thing would be to figure out the indexing strategy: whether an r-tree index makes sense, or the system @murphy214 is proposing.
>
> The surface area is nice, there is a need, and any code they create will be used.
> Even creating an initial unindexed provider that is slow is a good start.

How many weeks of work do you think that would be for a student?

@ARolek
Member

ARolek commented Feb 21, 2018

@pnorman I would estimate a month end to end, depending on the foundational knowledge the student has. If we need to bring the student up to speed on the world of GIS then we should allow some additional time.

The task may require sending in code to the Go Shapefile package depending on what is discovered during the implementation. There are also a few r-tree index packages out there but I have not had time to review them. A ground-up implementation would be a fun and challenging project. Additionally, we could use the r-tree implementation for the planned GeoJSON provider (#162).

@murphy214

murphy214 commented Feb 22, 2018

If you guys need help or need me to elaborate more on what index I've implemented / proposed you can shoot me an email any time. I'd be happy to go through some of the advantages / disadvantages of such an index.

@gdey
Member

gdey commented Feb 23, 2018

@pnorman Do you think the #160 idea makes sense? Should we add it to the list of projects for OSM?

@pnorman
Contributor Author

pnorman commented Feb 23, 2018

I'm worried that it's not enough work, otherwise yes. On the other hand, it's typical to underestimate the time required...

For that idea, I'd draw the connection between shapefiles and coastlines and other preprocessed data, and note how all OSM styles use data that comes from shapefiles.

Another idea that would be useful for debugging and flexibility would be GeoJSON output as well as MVT.

@gdey
Member

gdey commented Feb 23, 2018

We are going to be doing some major rework on the generation part of the stack; currently it's not nicely set up for generating other formats. @ARolek and I still have to come up with how we want to architect that piece, so I don't think it would be a good spot for a student project, as it will change quite a bit.

@pnorman
Contributor Author

pnorman commented Feb 26, 2018

I'd add #338 as another that's necessary, and it doesn't look like a large amount of work.

@ARolek
Member

ARolek commented Feb 26, 2018

Another detail to note about the Shapefile provider is that we have two approaches that have been discussed:

  1. Reading in the features on startup of the server and building a spatial index that lives in memory.
  2. Reading the .sbn file which MAY come alongside the shapefile.

In-memory spatial indexes will be helpful for other providers (e.g. GeoJSON), but there is a UX cost to generating those indexes on init of the server. Ideally we could use both strategies, leveraging the sbn file if it exists. It would also be nice to help add support for sbn file generation to the go-shp package (jonas-p/go-shp#23).

@pnorman
Contributor Author

pnorman commented Mar 30, 2018

We had no students apply for the Tegola-related projects this year. Interestingly, we did have one student submit a proposal for a project in Go that's related to the API, so I'm now looking for a mentor for it.

@pnorman pnorman closed this as completed Mar 30, 2018

5 participants