Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API preferences? #5

Closed
chrispahm opened this issue Jul 4, 2023 · 11 comments
Closed

API preferences? #5

chrispahm opened this issue Jul 4, 2023 · 11 comments

Comments

@chrispahm
Copy link
Owner

@sanak mentions the following in #4 (comment),

Personally, I prefer that users can choose how to serialize/deserialize from/to GEOS Geometry by themselves, like shapely IO.
https://shapely.readthedocs.io/en/stable/io.html

from_geojson
to_geojson
from_wkb
to_wkb
from_wkt
to_wkt
Also, it would be nice to support direct geometry creation, something like const point = new Point([0, 0]) or const point = geos.point([0, 0]), but it will be next step.

I think there are many good arguments for exposing a low-level API of all functions exported by GEOS! These would create little overhead, and should be simple(r) to maintain in the long run. However, at least for a few functions, I'd like to implement a high-level API that brings the convenience of not having to worry about pointers/memory management etc.

API proposal

The package could provide the low-level API (essentially everything wrapped in allCFunctions) using a sub-module (not sure if that's the correct term) e.g. geos-wasm/lib/core , as well as a high-level API (everything in allJSFunctions) for all function where someone contributes an implementation → imported through geos-wasm

Example

// importing the low-level API
import { 
  GEOSGeomFromWKT,
  GEOSBuffer,
  GEOSGeomToWKT,
  GEOSGeom_destroy,
  GEOSFree,
  UTF8ToString,
  stringToUTF8
} from "geos-wasm/lib/core";
// importing the high-level API
import { buffer } from "geos-wasm";

import { stringify, parse } from "wkt";
import geometry from "./path/geometry.geojson" assert { type: "json" };

// using the low-level API
const wkt = stringify(geometry);
const wktPtr = stringToUTF8(wkt);
const geomPtr = GEOSGeomFromWKT(wktPtr);
GEOSFree(wktPtr);
const bufferPtr = GEOSBuffer(geomPtr, (radius = 10));
const bufferedWKTptr = GEOSGeomToWKT(bufferPtr);
const buffered = parse(UTF8ToString(bufferedWKTptr));
GEOSFree(wktPtr);
GEOSFree(bufferedWKTptr);
GEOSGeom_destroy(geomPtr);
GEOSGeom_destroy(bufferPtr);

// using the high-level API
const buffered_2 = buffer(geometry, (radius = 10));

// JSON.stringifiy(buffered) === JSON.stringify(buffered_2) → true

What do you think about this proposal? Also, are there any preferences towards the naming of functions? Right now, we're just using the function names that we get from Emscripten, but the repeated use of "GEOS" in the function names seems unnecessarily verbose.

@sanak
Copy link
Contributor

sanak commented Jul 4, 2023

@chrispahm Thanks for response and creating this issue!

I think there are many good arguments for exposing a low-level API of all functions exported by GEOS! These would create little overhead, and should be simple(r) to maintain in the long run. However, at least for a few functions, I'd like to implement a high-level API that brings the convenience of not having to worry about pointers/memory management etc.

Yes, sure.

The package could provide the low-level API (essentially everything wrapped in allCFunctions) using a sub-module (not sure if that's the correct term) e.g. geos-wasm/lib/core , as well as a high-level API (everything in allJSFunctions) for all function where someone contributes an implementation → imported through geos-wasm
:
What do you think about this proposal?

That is really nice idea!
One thing my concerning is GEOS CAPI has xxx_r version which seems to support multi-thread (context), so defining both or using either of those needs to be decided at some points.
(But it could be done in future major version update.)

Also, are there any preferences towards the naming of functions? Right now, we're just using the function names that we get from Emscripten, but the repeated use of "GEOS" in the function names seems unnecessarily verbose.

I am not sure whether JavaScript supports something like C++ namespace, but if it's possible in either of definition or import timing, geos_core.free or geos_c.free may be understandable for users, even if those are low level function.


Related with high-level API, I personally prefer Geometry class which manage geometry pointer inside and other Point, LineString classes inherit from it.

class Geometry {
  ptr: null,
  buffer: function(radius, options),
  :
  free: function()
}

Some other GEOS wrapper libraries seem to have such strategy, but in my understanding, JavaScript has no destructor nor reference counter, so explicit free call will be necessary at somewhere (in above Geometry class or something pointer manager class)...

@chrispahm
Copy link
Owner Author

I see the benefits of having a very similar API to shapely, but unfortunately this would still mean that we'd need to write a lot of glue code since it's also an abstraction of the original GEOS API.

What I was thinking about is that if this library would export the barebones GEOS C API through e.g. geos-wasm/lib/core, someone else could use this to resemble the shapely API in another JS package if they wanted (maybe you @sanak?) 😊 But also, someone else could use it to write some highly optimised implementation for a custom use case that uses the power of the underlying C code. At least no one would have to compile GEOS first, because they could just depend on this package instead!

Anyways, if you'd implement a Geometry class like it's used in shapely, you can use JavaScripts FinalizationRegistry to free the memory when GC is picking up the memory.

import geos from "geos-wasm/lib/core"

const registry = new FinalizationRegistry((heldValue) => {
  geos.free(heldValue);
});

class Geometry {
  constructor() {
    this.ptr = null,
    registry.register(this, this.ptr, this);
  }
 
  // define Geometry methods
  buffer () {} //...

  destroy() { // manual destructor for backup, you don't have to call this
    registry.unregister(this);
    geos.free(this.ptr);
  }
}

This way users wouldn't have to deal with it, which makes the code feel more JavaScripty for sure. But as I said in the Background section of the readme, that would be way to much work for me to pull off 😅

@sanak
Copy link
Contributor

sanak commented Jul 5, 2023

What I was thinking about is that if this library would export the barebones GEOS C API through e.g. geos-wasm/lib/core, someone else could use this to resemble the shapely API in another JS package if they wanted (maybe you @sanak?) 😊 But also, someone else could use it to write some highly optimised implementation for a custom use case that uses the power of the underlying C code. At least no one would have to compile GEOS first, because they could just depend on this package instead!

Okay, sure! 😄

Anyways, if you'd implement a Geometry class like it's used in shapely, you can use JavaScripts FinalizationRegistry to free the memory when GC is picking up the memory.
:
This way users wouldn't have to deal with it, which makes the code feel more JavaScripty for sure. But as I said in the Background section of the readme, that would be way to much work for me to pull off 😅

Thanks for the information about JavaScript FinalizationRegistry method. (I didn't know it. 😅)
And yes, it will be much work and not high priority for now.


By the way, I missed to read README.md - API section carefully and noticed that high-level API (buffer) method seems to be designed to be compatible with turf. 🙇
But, will this turf-compatible be high-level API default ?
If so, I guess that battle-test in high-level API with using GEOS test data may become a bit hard work.

In my understanding, GeoJSON (default) and turf assume that the projection (unit) is longitude/latitude (decimal degree) ( ex. EPSG:4326), but GEOS accepts any projections/units and assumes radius, distance and area .etc as same projection unit.
I couldn't find the good way to solve both use cases, but I will try to think about that more.

@sanak
Copy link
Contributor

sanak commented Jul 6, 2023

@chrispahm After thinking about high-level API more, I remembered that GEOS has geosop from GEOS >= 3.10.0,

and I felt that its interface matches to solve above pointer management and battle-test issues.

I will try it on my fork repository's develop branch (which is same as PR:#3 for now) over this weekend.

@chrispahm
Copy link
Owner Author

Very nice find! Maybe you can get geosop compiled to WASM similar to how it's done in GDAL3.js? GDAL3.js also supports ogr2ogr, which is a CLI as well so it should be doable!

I'm a very big fan of mapshaper as well, so I guess this would be a great feature to have!

@sanak
Copy link
Contributor

sanak commented Jul 6, 2023

@chrispahm Thanks for comments!

Maybe you can get geosop compiled to WASM similar to how it's done in GDAL3.js?

Maybe possible, but geosop seems to switch geometry operation (buffer .etc) by argument, so user side usability may not be so good like gdal3.js.

So, high-level API will be basically same as current geos.buffer(geojson, radius, options), but difference is:

  • 1st geojson parameter will be replaced to geometry (but not pointer) and its format can be WKT (for battle-test), WKB (for performance) and GeoJSON (for usability).
    • Fortunately, each format has another JavaScript type (WKT=string, WKB=byte-array and GeoJSON=object), so something like typeof geometry check can be used.
  • last option can have format option which has WKT, WKB or GeoJSON for return value.

How do you think about the above I/O interface ?


I'm a very big fan of mapshaper as well, so I guess this would be a great feature to have!

I first heard about it. Thanks for the information!

@chrispahm
Copy link
Owner Author

More and more agreeing with you that the high-level API should rather resemble Shapely than turf and the like! I guess other solutions are somewhere in between, so I guess Shapely is familiar to many, but also highly flexible!

What do you think about these potential next steps:

  1. Make this repo just contain GEOS-WASM and solely provide the low level API → essentially what comes out of Emscripten, ideally with tests. Merge your PR upgrading GEOS, but remove the buffer example for the current high-level API, and instead just give a few examples on how the low level API can be used.
  2. Create a new repository (ideally in an organisation?), where the aim is to port more and more of Shapelys functionality to JS. Shapely.js is already taken, what do you think of Geometra? 😄 We'd need to talk to the Shapely maintainers if their ok with this approach, at least that's how I understand [their license]!

@sanak
Copy link
Contributor

sanak commented Jul 7, 2023

@chrispahm

More and more agreeing with you that the high-level API should rather resemble Shapely than turf and the like! I guess other solutions are somewhere in between, so I guess Shapely is familiar to many, but also highly flexible!

Thanks for confirmation!

About the next steps, I feel that it's too early to make decision on above No.1 and No.2.
And sorry if my PR #3 made you confusing, because I wrongly commented out turf compatible buffer implementation. 🙇💦

I think that following steps will be possible, but how do you think about these steps ?

  1. Merge your WKB PR Switch to WKB as data transfer format #6.
  2. I will rebase and separate my PR Upgrade to GEOS 3.10.5 and use GeoJSON Reader/Writer #3 as following smaller PRs, because it's too big.
    • Support make clean in Makefile (40e2ac9)
    • Upgrade GEOS 3.10.5 with CMake build and fill ALL_GEOS_FUNCTIONS.mk.
    • Add Geom(From|To)GeoJSON.mjs with GEOS_EMCC_FLAGS.mk update.
  3. Discuss about supporting WKB/WKT/GeoJSON input/output.
    • The following is my current idea, but are you agree with that ?
      • Rename Buffer_simple.mjs to Buffer.mjs for GEOS thin wrapper with supporting above I/O formats.
      • Rename Buffer.mjs to BufferWithUnits.mjs for turf compatible and call GEOS thin wrapper inside.

@sanak
Copy link
Contributor

sanak commented Jul 8, 2023

Well, I set my PR #3 to draft mode and appends some commits, but I noticed that npm run test result shapes are different between current buffer (turf compatible) and buffer_simple (geos original).

  • EPSG:4326 (lat/lon decimal degree) => buffer_simple (geos original, #00ff00) is expected result.
    • buffer-results-epsg4326
  • EPSG:3857 (Web Mercator) => buffer (turf compatible, #0000ff) is expected result. (Background #ff0000 is turf.buffer result.)
    • buffer-results-epsg3857

I felt that high-level buffer API is the most difficult part when porting GEOS to web-side, and now I think that something like additional spheroid: boolean option may be necessary...
I will try it on my fork develop branch, and I delete above my comment No.3 by strike-through, because it is not appropriate design. 🙇

About merging steps, rebasing, squashing and separating each parts are okay for me, so if you feel it's too big, let me know about that.

@sanak
Copy link
Contributor

sanak commented Jul 8, 2023

@chrispahm Thanks for merging WKB PR #6 !

Now I understand that WKB is a lot better than GEOS GeoJSON Reader/Writer.
(Here is the output of performance which includes buffer_simple.)

Testing feature Simple
GEOS buffer method took 21 milliseconds to run 100 times.
Average time per iteration: 0 milliseconds.
GEOS buffer_simple method took 20.248874962329865 milliseconds to run 100 times.
Average time per iteration: 0 milliseconds.
@turf/turf buffer method took 37 milliseconds to run 100 times.
Average time per iteration: 0 milliseconds.
GEOS buffer method was 1.76x faster than @turf/turf buffer method.

Testing feature Australia
GEOS buffer method took 5715 milliseconds to run 100 times.
Average time per iteration: 57 milliseconds.
GEOS buffer_simple method took 9943.73612499237 milliseconds to run 100 times.
Average time per iteration: 99 milliseconds.
@turf/turf buffer method took 4616 milliseconds to run 100 times.
Average time per iteration: 46 milliseconds.
@turf/turf buffer method was 1.24x faster than GEOS buffer method.

Real life performance test
Buffering all geometries at once
GEOS buffer method took 296 milliseconds to run 1 times.
Average time per iteration: 296 milliseconds.
GEOS buffer_simple method took 2003.0125830173492 milliseconds to run 1 times.
Average time per iteration: 2003 milliseconds.
@turf/turf buffer method took 239 milliseconds to run 1 times.
Average time per iteration: 239 milliseconds.
@turf/turf buffer method was 1.24x faster than GEOS buffer method.

Buffering each geometry individually
GEOS buffer method took 236 milliseconds to run 1 times.
Average time per iteration: 236 milliseconds.
GEOS buffer_simple method took 410.59958404302597 milliseconds to run 1 times.
Average time per iteration: 411 milliseconds.
@turf/turf buffer method took 142 milliseconds to run 1 times.
Average time per iteration: 142 milliseconds.
@turf/turf buffer method was 1.66x faster than GEOS buffer method.

Anyway, I will do the following No2, except last "Add Geom(From|To)GeoJSON.mjs with `GEOS_EMCC_FLAGS.mk update.", because it is useless for performance. 😅

  1. I will rebase and separate my PR Upgrade to GEOS 3.10.5 and use GeoJSON Reader/Writer #3 as following smaller PRs, because it's too big.
    • Support make clean in Makefile (40e2ac9)
    • Upgrade GEOS 3.10.5 with CMake build and fill ALL_GEOS_FUNCTIONS.mk.
    • Add Geom(From|To)GeoJSON.mjs with GEOS_EMCC_FLAGS.mk update.

This was referenced Jul 9, 2023
@chrispahm
Copy link
Owner Author

Closing this since #9 is merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants