Performance issues with ReaderLas #69

Closed
MBunel opened this issue Aug 10, 2023 · 5 comments · Fixed by #70
Labels
enhancement New feature or request

Comments

@MBunel

MBunel commented Aug 10, 2023

Hi,

I'm trying to use PDAL's Java bindings to read LAS and LAZ point clouds in Scala (as shown in the documentation), and I'm running into performance problems. When I call the execute method (see below), my code takes a long time to run, as if execute loaded all the points into memory.

The main problem is that it's impossible to access the header content before calling the execute method, but I'd like to use the header content to filter the files I actually want to read.

I haven't found a solution to this problem in the Java bindings. They don't seem to expose a simple way to read a LAS file, for example through a file pointer, as a library like las_rs does. However, I haven't looked at the C++ API, so I don't know whether this is a limitation of the Java bindings or a PDAL design choice.

So I have two questions:

  1. Can I read a LAS file in Scala more efficiently?
  2. Can I access the header without running a pipeline?

Thanks


This is the current version of my code.

// Imports assumed from the pdal-scala and Spark packages
import io.pdal._
import io.pdal.pipeline._
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.connector.read.PartitionReader

class LASPdalReader(path: String) extends PartitionReader[InternalRow] {

  private val expression = ReadLas(path)
  private val pipeline = expression.toPipeline
  pipeline.initialize()
  // This step is very slow
  pipeline.execute()

  private val pvs: PointViewIterator = pipeline.getPointViews()
  private val pv = pvs.next()

  private val points_count = pv.length()
  private var counter = 0

  // True while there are still points left to read
  override def next(): Boolean = this.counter < this.points_count

  override def get(): InternalRow = {
    val row = InternalRow(
      pv.getX(this.counter).toFloat,
      pv.getY(this.counter).toFloat,
      // Third coordinate is Z (the original read Y twice)
      pv.getZ(this.counter).toFloat,
      pv.getShort(this.counter, "Classification")
    )
    this.counter += 1
    row
  }

  override def close(): Unit = pvs.close()
}
@hobu
Member

hobu commented Aug 10, 2023

Do the Java bindings have the preview() method for pdal::Stage? This is what you want...

pomadchin added the enhancement label Aug 10, 2023
@pomadchin
Collaborator

pomadchin commented Aug 10, 2023

Nope, it's not exposed.

I guess we need to implement something similar to Python's getQuickInfo.
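
As a stopgap for the second question, the header can also be read without PDAL at all. The sketch below is not part of the Java bindings and is not what getQuickInfo does; it only parses the fixed-size public header block that the LAS specification places at the start of every LAS/LAZ file. The names LasHeader, parse and read are hypothetical and chosen for illustration, and only a handful of fields are extracted.

```scala
import java.io.RandomAccessFile
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.charset.StandardCharsets

// Hypothetical container for the header fields that are useful for filtering files.
final case class LasHeader(
  versionMajor: Int, versionMinor: Int,
  pointFormat: Int, pointCount: Long,
  minX: Double, maxX: Double,
  minY: Double, maxY: Double,
  minZ: Double, maxZ: Double
)

object LasHeader {
  private val MinHeaderBytes = 227 // header size for LAS 1.0 to 1.2
  private val MaxHeaderBytes = 375 // header size for LAS 1.4

  // Parses the public header block from the first bytes of a LAS/LAZ file.
  def parse(bytes: Array[Byte]): LasHeader = {
    require(bytes.length >= MinHeaderBytes, "too short to hold a LAS header")
    val signature = new String(bytes, 0, 4, StandardCharsets.US_ASCII)
    require(signature == "LASF", s"not a LAS/LAZ file (signature: $signature)")

    // All multi-byte header fields are little-endian.
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    val major = buf.get(24) & 0xff
    val minor = buf.get(25) & 0xff
    // LAZ flags compression in the high bits of the format byte; mask them off.
    val format = buf.get(104) & 0x3f
    // Legacy 32-bit point count (unsigned).
    val legacyCount = buf.getInt(107) & 0xffffffffL
    // LAS 1.4 adds a 64-bit point count at offset 247.
    val count =
      if (minor >= 4 && bytes.length >= 255) buf.getLong(247) else legacyCount

    // Bounds are stored as max/min pairs: X, then Y, then Z.
    LasHeader(
      major, minor, format, count,
      minX = buf.getDouble(187), maxX = buf.getDouble(179),
      minY = buf.getDouble(203), maxY = buf.getDouble(195),
      minZ = buf.getDouble(219), maxZ = buf.getDouble(211)
    )
  }

  // Reads only the first few hundred bytes of the file, never the point records.
  def read(path: String): LasHeader = {
    val file = new RandomAccessFile(path, "r")
    try {
      val n = math.min(file.length(), MaxHeaderBytes.toLong).toInt
      val bytes = new Array[Byte](n)
      file.readFully(bytes)
      parse(bytes)
    } finally file.close()
  }
}
```

With this in place, a list of paths could be filtered on the bounding box or point count (for example `paths.filter(p => LasHeader.read(p).maxX >= xMin)`, where `paths` and `xMin` are placeholders) before any pipeline is built for the remaining files.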

@pomadchin
Collaborator

There's also https://github.com/geotrellis/geotrellis-pointcloud, which is slightly outdated but may be of help (since I see some Spark code)! Sadly, it does not implement the DataSourceV2 API yet; the project needs some attention and time.

@pomadchin
Collaborator

pomadchin commented Sep 24, 2023

Hey @MBunel, see #70, which exposes quickInfo.

Also, for the needs described in the question, the metadata will most likely suffice!

@MBunel
Author

MBunel commented Sep 25, 2023

Many thanks for this implementation.
