Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable import via vsicurl #482

Merged
merged 1 commit into from
Nov 2, 2023
Merged

Enable import via vsicurl #482

merged 1 commit into from
Nov 2, 2023

Conversation

mmacata
Copy link
Member

@mmacata mmacata commented Sep 26, 2023

This PR enables the user to import online raster resources via vsicurl via importer. In comparison to the current raster import it skips the wget step and passes the URL directly to r.import.
The vsicurl URL can directly be specified in the process chain:

    {
      "id": "importer_1",
      "module": "importer",
      "inputs": [
        {
          "import_descr": {
            "source": "/vsicurl/https://github.com/giswqs/data/raw/main/raster/srtm90.tif",
            "type": "raster",
            "resample": "bilinear_f",
            "resolution": "value",
            "resolution_value": "100"
          },
          "param": "map",
          "value": "mangkawuk_srtmgl1_v003_30m"
        }
      ]
    },

Output before without vsicurl URL:

"process_log": [
    {
      "executable": "/usr/bin/wget",
      "id": "importer_wget_myraster.tiff",
      "mapset_size": 421,
      "parameter": [
        "-t5",
        "-c",
        "-q",
        "-O",
        "/actinia_core/workspace/temp_db/gisdbase_aab038cc892a4eb199b3e3251ea51b9a/.tmp/myraster.tiff",
        "https://raw.githubusercontent.com/mmacata/pagestest/gh-pages/myraster.tiff"
      ],
      "return_code": 0,
      "run_time": 4.188900709152222,
      "stderr": [
        ""
      ],
      "stdout": ""
    },
    {
      "executable": "r.import",
      "id": "r_import_myraster.tiff",
      "mapset_size": 421,
      "parameter": [
        "input=/actinia_core/workspace/temp_db/gisdbase_aab038cc892a4eb199b3e3251ea51b9a/.tmp/myraster.tiff",
        "output=myraster",
        "--q",
        "resample=bilinear_f",
        "resolution=value",
        "resolution_value=100"
      ],
      "return_code": 1,
      "run_time": 1.5571587085723877,
      "stderr": [
        "WARNING: Invalid east longitude -246.534",
        "WARNING: Unable to determine area of interest for '+proj=lcc +lat_1=36.16666666666666 +lat_2=34.33333333333334 +lat_0=33.75 +lon_0=-79 +x_0=609601.22 +y_0=0 +no_defs +a=6378137 +rf=298.257222101 +towgs84=0.000,0.000,0.000 +type=crs '",
        "ERROR: Unable to initialize coordinate transformation",
        "ERROR: Unable to reproject to source location",
        ""
      ],
      "stdout": ""
    }
  ],

This is still possible. But now with vsicurl:

"process_log": [
    {
      "executable": "r.import",
      "id": "r_import_myraster.tif",
      "mapset_size": 24187896,
      "parameter": [
        "input=/vsicurl/https://github.com/giswqs/data/raw/main/raster/srtm90.tif",
        "output=myraster",
        "--q",
        "resample=bilinear_f",
        "resolution=value",
        "resolution_value=100"
      ],
      "return_code": 0,
      "run_time": 45.551167488098145,
      "stderr": [
        ""
      ],
      "stdout": ""
    }
  ],

These changes were tested with superadmin and normal user. The users limits were decreased to show the effect. For only 1 allowed pixel, v.import runs through but then the next step fails with "message": "AsyncProcessError: Region too large, set a coarser resolution to minimum nsres: 343448.918425 ewres: 343448.918425 [num_cells: 11781980]",. This might be critical as a large image could do harm to a running actinia installation? What do you think @anikaweinmann @neteler ?
With only one second allowed for calculation, actinia behaves as it should: "message": "AsyncProcessTimeLimit: Time (1 seconds) exceeded to run executable r.import",.

@mmacata mmacata added enhancement New feature or request question Further information is requested labels Sep 26, 2023
@mmacata mmacata added this to the next_minor milestone Sep 26, 2023
@anikaweinmann
Copy link
Member

anikaweinmann commented Sep 26, 2023

Is it possible to set something by the importer so that extent=region is used? So that only parts of large data sets will be imported.

@mmacata
Copy link
Member Author

mmacata commented Sep 26, 2023

I guess it would be easy to add it here. But then it would be a breaking change for actinia, no? If the importer behaves differently.
Either way we decide I think it would be good to have the same behaviour for imports via wget and for those with vsicurl.

@ninsbl
Copy link
Collaborator

ninsbl commented Sep 26, 2023

As a heads-up: /vsicurl/ uses (or at least may use) userfaultfd syscalls that will fail in a docker container, unless that capability is added to the container /service with seccomp / capadd. Ran into that recently myself and have not properly solved that yet, but currently looking for a good solution for our stack deployment...

@mmacata
Copy link
Member Author

mmacata commented Sep 26, 2023

As a heads-up: /vsicurl/ uses (or at least may use) userfaultfd syscalls that will fail in a docker container, unless that capability is added to the container /service with seccomp / capadd. Ran into that recently myself and have not properly solved that yet, but currently looking for a good solution for our stack deployment...

Interesting. When I tested it locally with vscode / docker dev setup, all went fine! Used mundialis/actinia:latest and started actinia as root user there.

@anikaweinmann
Copy link
Member

I guess it would be easy to add it here. But then it would be a breaking change for actinia, no? If the importer behaves differently.
Either way we decide I think it would be good to have the same behaviour for imports via wget and for those with vsicurl.

If we add a parameter which can be set but have not be added then it would not be a breaking change, or?

@ninsbl
Copy link
Collaborator

ninsbl commented Sep 27, 2023

Interesting. When I tested it locally with vscode / docker dev setup, all went fine! Used mundialis/actinia:latest and started actinia as root user there.

The issue I encountered seems to be format dependent. I tested your Geotiff and that works fine for me too. NetCDF however, does not seem to work...

So, in a docker container, this will fail:
gdalinfo /vsicurl/https://nbstds.met.no/thredds/fileServer/NBS/S2A/2023/09/26/S2A_MSIL1C_20230926T134931_N0509_R110_T31XEG_20230926T154938.nc

@neteler
Copy link
Member

neteler commented Sep 27, 2023

Issue confirmed also with podman:

podman run -it --rm osgeo/grass-gis:releasebranch_8_3-debian bash
root@94aa9494148b:/grassdb# gdalinfo /vsicurl/https://nbstds.met.no/thredds/fileServer/NBS/S2A/2023/09/26/S2A_MSIL1C_20230926T134931_N0509_R110_T31XEG_20230926T154938.nc
ERROR 1: CPLCreateUserFaultMapping(): syscall(__NR_userfaultfd) failed: insufficient permission. add CAP_SYS_PTRACE capability, or set /proc/sys/vm/unprivileged_userfaultfd to 1
gdalinfo failed - unable to open '/vsicurl/https://nbstds.met.no/thredds/fileServer/NBS/S2A/2023/09/26/S2A_MSIL1C_20230926T134931_N0509_R110_T31XEG_20230926T154938.nc'.

@ninsbl
Copy link
Collaborator

ninsbl commented Sep 28, 2023

OK, how I plan to tackle the issue in our swarm deployment (actinia-swarm.yml) is to add permissions / capabilities like this:

cap_add:
  - SYS_PTRACE

And here with context...

---
version: "3.1"
services:
  actinia:
    cap_add:
      - SYS_PTRACE
    environment:

With that addition /vsicurl/ works in docker also with netcdf...

My knowledge of Unix system admin stuff does not go deep enough to know if there is an option to allow only userfaultfd. A web search produced this:
https://lore.kernel.org/lkml/CAJHvVcgbCL7+4bBZ_5biLKfjmz_DKNBV8H6NxcLcFrw9Fbu7mw@mail.gmail.com/
But I have no idea if that patch was merged.

@mmacata
Copy link
Member Author

mmacata commented Oct 12, 2023

Map pixel limit check needed:

  • Create VRT with current region
  • determine allowed user pixel limit from actinia
  • check with GDALINFO if pixel limit is fine
  • if yes, proceed
  • if no, throw "message": "AsyncProcessError: Region too large, set a coarser resolution to minimum nsres: 343448.918425 ewres: 343448.918425 [num_cells: 11781980]"

@mmacata
Copy link
Member Author

mmacata commented Nov 2, 2023

  • Follow up work on pixel limit check is started here: Checking pixellimit for r.import commands #491
  • gdalinfo with above vsicurl + netcdf example works fine for me in mundialis/actinia:latest !?
  • merging now, as by default a normal user is not allowed to run r.import and Importer up to now also doesn't check pixel limit in early stage.

@mmacata mmacata merged commit 493456e into actinia-org:main Nov 2, 2023
11 checks passed
@mmacata mmacata modified the milestones: next_minor, 4.11.0 Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request question Further information is requested
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants