-
Notifications
You must be signed in to change notification settings - Fork 1
/
DESCRIPTION
41 lines (41 loc) · 2.58 KB
/
DESCRIPTION
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
Package: analysisPipelines
Type: Package
Date: 2020-06-12
Title: Compose Interoperable Analysis Pipelines & Put Them in Production
Version: 1.0.2
Authors@R: c(
person("Naren","Srinivasan", email = "[email protected]", role = c("aut")),
person("Zubin Dowlaty","", email = "[email protected]", role = c("aut")),
person("Sanjay","", email = "[email protected]", role = c("ctb")),
person("Neeratyoy","Mallik", email = "[email protected]", role = c("ctb")),
person("Anoop S","", email = "[email protected]", role = c("ctb")),
person("Mu Sigma, Inc.", email = "[email protected]", role = c("cre"))
)
Description: Enables data scientists to compose pipelines of analysis which consist of data manipulation, exploratory analysis & reporting, as well as modeling steps. Data scientists can use tools of their choice through an R interface, and compose interoperable pipelines between R, Spark, and Python.
Credits to Mu Sigma for supporting the development of the package.
Note - To enable pipelines involving Spark tasks, the package uses the 'SparkR' package.
The SparkR package needs to be installed to use Spark as an engine within a pipeline. SparkR is distributed natively with Apache Spark and is not distributed on CRAN. The SparkR version needs to directly map to the Spark version (hence the native distribution), and care needs to be taken to ensure that this is configured properly.
To install SparkR from Github, run the following command if you know the Spark version: 'devtools::install_github('apache/[email protected]', subdir='R/pkg')'.
The other option is to install SparkR by running the following terminal commands if Spark has already been installed: '$ export SPARK_HOME=/path/to/spark/directory && cd $SPARK_HOME/R/lib/SparkR/ && R -e "devtools::install('.')"'.
Depends: R (>= 3.4.0), magrittr, pipeR, methods
Imports: ggplot2, dplyr, futile.logger, RCurl, rlang (>= 0.3.0), proto, purrr
Suggests: plotly, knitr, rmarkdown, parallel, visNetwork, rjson, DT, shiny, R.devices, corrplot, car, foreign
Enhances: SparkR, reticulate
BugReports: https://github.com/Mu-Sigma/analysis-pipelines/issues
URL: https://github.com/Mu-Sigma/analysis-pipelines
Encoding: UTF-8
License: MIT
LazyLoad: yes
LazyData: yes
RoxygenNote: 6.1.1
VignetteBuilder: knitr
Collate:
'analysisPipelines_package.R'
'core-functions.R'
'core-functions-batch.R'
'core-functions-meta-pipelines.R'
'core-streaming-functions.R'
'r-batch-eda-utilities.R'
'r-helper-utilites-python.R'
'spark-structured-streaming-utilities.R'
'zzz.R'