-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
92 lines (66 loc) · 5.64 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
---
title: "lasr"
author: "Donny Keighley"
date: "6/8/2022"
output: github_document
---
[lasr](https://github.com/donald-keighley/lasr) is a package designed for reading and writing [Log Ascii Standard (LAS)](https://www.cwls.org/products/) files in R. Currently it is in the beta testing stages. As such, it is subject to significant ongoing changes and is not complete. For instance it can't write LAS files yet...
## Goals
lasr is primarily designed to import LAS files at high speed and in large batches. To accomplish this, most of it is written in C++ and connected to R with [Rcpp](http://www.rcpp.org/). It stores the data in lists of [data.table's](https://rdatatable.gitlab.io/data.table/) for fast manipulation.
Currently, the focus is on supporting reading [LAS 3.0](https://www.cwls.org/wp-content/uploads/2014/09/LAS_3_File_Structure.pdf) as fully as possible. There will be some effort to handle non-standard LAS files but nothing too cute. lasr is being written to load files using as little information from the header as possible which should alleviate many common issues. Beyond that, the aim is to output helpful error messages so that non-standard files can be fixed. Afterall, one of the great things about LAS files is that they are human readable and can often be fixed using a simple text editor.
[**lasr**](https://github.com/donald-keighley/lasr) is not designed to accomplish traditional petrophysical workflows and there are no plans to do so. There's plenty of industry software for that. lasr is intended as a building block to facilitate people doing new and creative things with large volumes of log data. It's also intended to help build up the library of geoscience packages for R.
For a more traditional approach, or if you'd prefer to use Python, you should check out the excellent [**lasio**](https://lasio.readthedocs.io/en/latest/index.html) package.
## Installation
You can install [**lasr**](https://github.com/donald-keighley/lasr) from github using the [`install_github`](https://www.rdocumentation.org/packages/devtools/versions/1.13.6/topics/install_github) function from the [devtools](https://devtools.r-lib.org/) package. You will need to install [RTools](https://cran.r-project.org/bin/windows/Rtools/) first since the package needs compilation.
```{r install lasr, eval=FALSE}
if (!require('devtools')) install.packages('devtools')
library(devtools)
install_github('https://github.com/donald-keighley/lasr')
```
Currently, the only function is `read.las` which will import a vector of LAS file paths into a multi-part list. Each section of the file is stored as a separate element. In order to accomodate LAS 3.0 files which may have multiple log data sections, the log parameter, log definition, and log data are combined into numbered log elements. If your vector of paths contains more than one file, the output list will have an element for each file.
Here is an example reading a single LAS file that is included with the package:
```{r readlas, eval=TRUE, message=FALSE}
library(lasr)
las = read.las(system.file("extdata", "las_3_cwls.las", package = "lasr"))
#Display the WELL section
head(las$well, 10)
#Display the log curves
head(las$log$log.1$data, 10)
```
Most LAS files are version 2 and only have one log data section. If you know this is the case you can set `flatten = TRUE` and only the first log section will be returned. This makes referencing the log data quicker.
```{r readlas_flatten, eval=TRUE, message=FALSE}
las = read.las(system.file("extdata", "las_3_cwls.las", package = "lasr"), flatten=TRUE)
head(las$log$data, 10)
```
## Speed Test
Since the purpose of this package is to load LAS files as quickly as possible, a speed test is included here with a comparison to python's [**lasio**](https://lasio.readthedocs.io/en/latest/index.html). First, download a test dataset from the KGS website. In this case we're using the [2016 logs](http://www.kgs.ku.edu/PRS/Scans/Log_Summary/2016.zip) data. Download and unzip them into a folder called *"C:/temp/logs"*, or modify the code for wherever you put it.
Next, import the first 500 files. We'll use 4 threads for this comparison, although if you have more cores you can increase the number of threads to speed it up further. Only use this option if you are importing more than a handful of files, otherwise the parallel overhead will slow it down.
```{r, import_kgs_logs, eval=TRUE}
files = list.files('C:/temp/logs', pattern = '.las?', full.names=TRUE)
start.time = Sys.time()
las = read.las(files[1:500],nthreads=4)
end.time = Sys.time()
time.taken = end.time - start.time
time.taken
```
Now in Python in parallel using 4 cores:
```{r setup, include=FALSE}
library(reticulate)
use_python("C:/Users/donal/AppData/Local/Programs/Python/Python310")
```
```{python import_kgs_logs_lasio, eval=FALSE}
import lasio, glob, datetime, multiprocessing
from joblib import Parallel, delayed
num_cores = 4
files = glob.glob('C:/temp/logs/*.las')
start_time = datetime.datetime.now()
if __name__ == "__main__":
las = Parallel(n_jobs=num_cores)(delayed(lasio.read)(file) for file in files[0:499])
end_time = datetime.datetime.now()
print('Duration: {}'.format(end_time - start_time))
```
```{python print_time_python, echo=FALSE}
print('Duration: 0:04:45.961226')
```
Clearly, lasr is faster, however, please don't take this as a shot at lasio. The primary goal of this package is speed, and as such countless hours have been put into speed testing, de-bottlenecking, and enduring the pain of writing in C++. As with anything, there are tradeoffs, and lasr errs toward speed where lasio tends more toward user convenience. They are simply different products.
Good luck, and if you have any suggestions reach out!