Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GDAL segfault loading massive VRT raster into std::vector #9713

Closed
Ryanf55 opened this issue Apr 20, 2024 · 3 comments
Closed

GDAL segfault loading massive VRT raster into std::vector #9713

Ryanf55 opened this issue Apr 20, 2024 · 3 comments

Comments

@Ryanf55
Copy link
Contributor

Ryanf55 commented Apr 20, 2024

What is the bug?

When you have a massive VRT dataset, and call RasterIo on the dataset, it can segfault. I am using a std::vector as the buffer.
Even if your hardware allocated the buffer, it can fail.

Steps to reproduce the issue

main.cpp

// Usage: ./build/vrt_loader /vsizip/vsicurl/https://terrain.ardupilot.org/SRTM1/ap_srtm1.zip/ap_srtm1.vrt

#include "gdal/gdal_priv.h"

#include <errno.h>
#include <vector>
#include <cassert>

int main(int argc, const char* argv[])
{
    if (argc != 2) {
        return EINVAL;
    }
    const char* pszFilename = argv[1];

    GDALDatasetUniquePtr poDataset;
    GDALAllRegister();
    const GDALAccess eAccess = GA_ReadOnly;
    poDataset = GDALDatasetUniquePtr(GDALDataset::FromHandle(GDALOpen( pszFilename, eAccess )));
    assert(poDataset != nullptr);
    assert(poDataset->GetRasterCount() == 1);
    const auto band = poDataset->GetRasterBand(1);

    const auto xSz = poDataset->GetRasterXSize();
    const auto ySz = poDataset->GetRasterYSize();

    std::vector<float> buffer;
    buffer.resize(xSz * ySz);
    // Same results with the "reserve" operator

    assert(band->RasterIO(GF_Read, 0, 0, xSz, ySz, &buffer[0],
                        xSz, ySz, GDT_Float32, 0, 0) != CE_None);

    return 0;
}

CMakeLists.txt

cmake_minimum_required(VERSION 3.22)
project(vrt_segfault)

set(CMAKE_FIND_DEBUG_MODE ON)
find_package(GDAL REQUIRED)
set(CMAKE_FIND_DEBUG_MODE OFF)
add_executable(vrt_loader main.cpp)
target_link_libraries(vrt_loader PRIVATE ${GDAL_LIBRARIES})
cmake -S . -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
./build/vrt_loader /vsizip/vsicurl/https://terrain.ardupilot.org/SRTM1/ap_srtm1.zip/ap_srtm1.vrt

backtrace

gdb --args build/vrt_loader /vsizip/vsicurl/https://terrain.ardupilot.org/SRTM1/ap_srtm1.zip/ap_srtm1.vrt

image

Versions and provenance

Linux Ubuntu 22.04

$ apt show libgdal-dev
Package: libgdal-dev
Version: 3.4.1+dfsg-1build4

Additional context

I would like some information on how to protect against this - user-supplied datasets may be large.

See gazebosim/gz-common#596 (comment) for more context on how this is a problem in a user-application.

@Ryanf55 Ryanf55 changed the title GDAL segfault loading massive VRT GDAL segfault loading massive VRT into std::vector Apr 20, 2024
@Ryanf55 Ryanf55 changed the title GDAL segfault loading massive VRT into std::vector GDAL segfault loading massive VRT raster into std::vector Apr 20, 2024
@rouault
Copy link
Member

rouault commented Apr 20, 2024

This isn't a GDAL bug. You are just asking for something impossible

First there is an integer overflow occurring in your code when you do buffer.resize(xSz * ySz); since xSz and ySr are 32 bit integer. So you would need to cast them to int64 first, but that would trigger a 1.5 TB allocation ... You can't just read such a big raster at once. You need to proceed in a piecewise chunk.

@rouault rouault closed this as completed Apr 20, 2024
@Ryanf55
Copy link
Contributor Author

Ryanf55 commented Apr 20, 2024

Thanks for pointing out the error in my code. I was not intending to load the whole terrain, I was just hoping the library would fail in a safer way.

Is it possible to add a C++ API that can take in an STL container for the buffer argument that is size-aware and add some error protection? I would be happy to contribute such an alternative.

For anyone else who finds this, here is my solution, which will ensure it throws with an allocation failure rather than overflow and segfault:

  static_assert(std::numeric_limits<uint64_t>::max() >= std::numeric_limits<uint32_t>::max() * std::numeric_limits<uint32_t>::max());
  buffer.resize(static_cast<uint64_t>(destWidth) * static_cast<uint64_t>(destHeight));

@rouault
Copy link
Member

rouault commented Apr 20, 2024

I was just hoping the library would fail in a safer way.

although that's not easily tested, I believe GDAL should be able to load the data if you had sufficient RAM and patience... :-)

Is it possible to add a C++ API that can take in an STL container for the buffer argument that is size-aware and add some error protection

That would be a reasonable enhancement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants