title | tags | authors | affiliations | date | bibliography
---|---|---|---|---|---
DynaML: A Scala Library/REPL for Machine Learning Research | | | | 29 June 2018 | paper.bib
DynaML is a Scala platform that aims to provide the user with a toolbox for research in
data science and machine learning. It can be used as:
- A Scala shell, run locally or hosted remotely.
$ dynaml
Welcome to DynaML v1.5.3-beta.3
Interactive Scala shell for Machine Learning Research
Currently running on:
(Scala 2.11.8 Java 1.8.0_101)
DynaML>
- A standalone script engine.
$ dynaml ./scripts/cifar.sc
- A binary dependency for JVM-based machine learning applications.
libraryDependencies += "com.github.transcendent-ai-labs" % "DynaML" % "master-SNAPSHOT"
DynaML aims to provide an end-to-end solution for research and development in
machine learning, statistical inference and data science. Towards these goals, it
provides the user with modules for:
- Data pre-processing using functional transformations. These transformations, or pipes, can be joined to form complex processing pipelines.
- Training predictive models, with a special focus on stochastic processes, kernel methods & neural networks. The model API can be extended to implement customized and complicated algorithms.
- Model tuning & hyper-parameter optimization.
- Visualization: two and three dimensional charts.
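The idea of joining functional transformations into a pipeline can be sketched in plain Scala. The `Pipe` class, its `>` operator and the three example stages below are hypothetical names chosen for illustration; they are not DynaML's actual API, only a minimal sketch of the composition pattern, assuming each stage is a pure function from one data shape to the next.

```scala
// Hypothetical stand-in for a data pipe: a thin wrapper around a function.
// This is NOT DynaML's API, only an illustration of the composition idea.
final case class Pipe[A, B](run: A => B) {
  // Join this pipe with the next one, feeding this pipe's output into it.
  def >[C](next: Pipe[B, C]): Pipe[A, C] = Pipe((a: A) => next.run(run(a)))

  // Apply the pipe to an input value.
  def apply(input: A): B = run(input)
}

object PipeSketch {
  type Rows = Seq[Array[Double]]

  // Stage 1: split comma-separated lines into numeric fields.
  val parse: Pipe[Seq[String], Rows] =
    Pipe((lines: Seq[String]) => lines.map(_.split(',').map(_.trim.toDouble)))

  // Stage 2: drop records that contain a non-finite value.
  val dropInvalid: Pipe[Rows, Rows] =
    Pipe((rows: Rows) => rows.filter(_.forall(x => !x.isNaN && !x.isInfinity)))

  // Stage 3: shift the first column so that it has zero mean.
  val centerFirst: Pipe[Rows, Rows] =
    Pipe { (rows: Rows) =>
      val mean = rows.map(_(0)).sum / rows.length
      rows.map(r => (r(0) - mean) +: r.tail)
    }

  // The composed pipeline is itself a pipe and can be reused or extended.
  val pipeline: Pipe[Seq[String], Rows] = parse > dropInvalid > centerFirst
}
```

Calling `PipeSketch.pipeline(Seq("1.0, 2.0", "3.0, 4.0", "NaN, 5.0"))` runs all three stages in order. Because every stage is typed, joining two pipes whose output and input types do not match is rejected at compile time.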
Scala [@scala] is a high-level, object-oriented and functional programming language that runs on the Java Virtual Machine (JVM). Its expressiveness, multi-threading model and ability to execute on the JVM enable the prototyping and development of potentially large-scale, data-intensive applications.
The Scala ecosystem has a number of useful packages that DynaML leverages, such as
TensorFlow [@tensorflow2015] support through TensorFlow for Scala [@tfscala],
the Breeze linear algebra library and the Ammonite project.
DynaML has been applied in research into Gaussian process based
geomagnetic time series prediction [@GPDst] and in ongoing research on
MCMC-based Bayesian inverse PDE problems, specifically Fokker-Planck
based plasma radial diffusion systems [@2017AGU].
It can be accessed via the online repository, or imported as a managed dependency into JVM projects via JitPack.
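A minimal `build.sbt` sketch for pulling the library through JitPack might look as follows. The project name is a placeholder, the Scala version is taken from the shell banner shown earlier, and the dependency coordinates are the ones quoted in this paper; whether a given tag or snapshot resolves depends on the state of the repository, so treat this as an assumption-laden starting point rather than a verified build.

```scala
// build.sbt (sketch)
name := "dynaml-example"      // hypothetical project name
scalaVersion := "2.11.8"      // Scala version reported by the DynaML shell banner

// JitPack builds and serves artifacts directly from Git repositories.
resolvers += "jitpack" at "https://jitpack.io"

// Coordinates as quoted in this paper. For reproducible builds, a fixed
// release tag is usually preferable to a moving snapshot.
libraryDependencies += "com.github.transcendent-ai-labs" % "DynaML" % "master-SNAPSHOT"
```

The resolver line is what distinguishes this from a Maven Central dependency: without it, sbt does not know to query JitPack for the artifact.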
The user guide covers installation and usage, and includes API documentation (Scaladoc) and worked examples.
DynaML was conceived during the Master of Science, Artificial Intelligence
program at KU Leuven and further developed during PhD research carried out in the project
Machine Learning for Space Weather, which is part of the
CWI-INRIA International Lab.