title: DynaML: A Scala Library/REPL for Machine Learning Research
tags: Scala, REPL/ssh-server, Machine Learning, TensorFlow, Kernel Methods
authors: Mandar H. Chandorkar (ORCID 0000-0001-6025-7113; affiliations 1, 2)
affiliations:
  1. Centrum Wiskunde en Informatica, Multiscale Dynamics
  2. INRIA Paris-Saclay/Laboratoire de Recherche en Informatique, TAU
date: 29 June 2018
bibliography: paper.bib

Summary

DynaML is a Scala platform that aims to provide the user with a toolbox for research in data science and machine learning. It can be used as:

  • A Scala shell, local or remotely hosted.
$ dynaml

Welcome to DynaML v1.5.3-beta.3 
Interactive Scala shell for Machine Learning Research

Currently running on:
(Scala 2.11.8 Java 1.8.0_101)

DynaML> 
  • A standalone script engine.
$ dynaml ./scripts/cifar.sc
  • A binary dependency for JVM-based machine learning applications.
libraryDependencies += "com.github.transcendent-ai-labs" % "DynaML" % "master-SNAPSHOT"

Motivation & Design

DynaML aims to provide an end-to-end solution for research and development in machine learning, statistical inference and data science. Towards these goals, it provides the user with modules for:

  • Data pre-processing using functional transformations. These transformations, or pipes, can be joined to form complex processing pipelines.

  • Training predictive models, with a special focus on stochastic processes, kernel methods & neural networks. The model API can be extended to implement customized and complicated algorithms.

  • Model tuning & hyper-parameter optimization.

  • Model evaluation.

  • Visualization: two and three dimensional charts.
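The idea of joining small transformations into a pipeline can be sketched in plain Scala. The `Pipe` class, its `>` operator and the three example stages below are hypothetical stand-ins written for illustration only; they are not DynaML's own classes, and the library's actual pipe API should be taken from its user guide.

```scala
// Hypothetical stand-in for a data pipe; NOT DynaML's actual API.
final case class Pipe[A, B](run: A => B) {
  // Join this pipe with the next one: our output becomes its input.
  def >[C](next: Pipe[B, C]): Pipe[A, C] = Pipe(run andThen next.run)

  // Allow a pipe to be called like an ordinary function.
  def apply(a: A): B = run(a)
}

object PipeSketch {
  // Stage 1: split a comma-separated line into numeric fields.
  val parse: Pipe[String, Vector[Double]] =
    Pipe((line: String) => line.split(",").toVector.map(_.trim.toDouble))

  // Stage 2: centre the values around their mean.
  val centre: Pipe[Vector[Double], Vector[Double]] =
    Pipe { (xs: Vector[Double]) =>
      val mean = xs.sum / xs.size
      xs.map(_ - mean)
    }

  // Stage 3: reduce the centred values to their sum of squares.
  val sumOfSquares: Pipe[Vector[Double], Double] =
    Pipe((xs: Vector[Double]) => xs.map(x => x * x).sum)

  // The composed pipeline is itself a pipe from raw text to a number.
  val pipeline: Pipe[String, Double] = parse > centre > sumOfSquares
}
```

Because each stage is an ordinary typed function, the compiler rejects a pipeline whose stages do not line up, and any stage can be tested or reused on its own.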

Scala Ecosystem

Scala [@scala] is a high-level, object-oriented and functional programming language which runs on the Java Virtual Machine (JVM). Its expressiveness, multi-threading model and ability to execute on the JVM enable the prototyping and development of potentially large-scale and data-intensive applications.

The Scala ecosystem has a number of useful packages which DynaML leverages, such as TensorFlow [@tensorflow2015] support through TensorFlow for Scala [@tfscala], the Breeze linear algebra library and the Ammonite project.

Applications

DynaML has been applied in research into Gaussian process based geomagnetic time series prediction [@GPDst] and in ongoing research into MCMC-based Bayesian inverse PDE problems, specifically Fokker-Planck based plasma radial diffusion systems [@2017AGU].

It can be accessed via the online repository, or imported as a managed dependency into JVM projects via JitPack.
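As a rough sketch of the managed-dependency route, an sbt build might add the JitPack resolver next to the coordinates quoted in the Summary. The resolver label `jitpack` is an arbitrary name chosen here, and `master-SNAPSHOT` is simply the version shown earlier in this paper; a tagged release may be a better choice for reproducible builds.

```scala
// build.sbt (sketch): resolve DynaML through JitPack.
// The label "jitpack" is arbitrary; only the URL matters to sbt.
resolvers += "jitpack" at "https://jitpack.io"

// Coordinates as given in the Summary section of this paper.
libraryDependencies += "com.github.transcendent-ai-labs" % "DynaML" % "master-SNAPSHOT"
```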

The user guide covers installation, usage and API documentation (Scaladoc), as well as worked examples.

Example figure.

Acknowledgements

DynaML was conceived during the Master of Science, Artificial Intelligence program at the KU Leuven and further developed during the PhD research carried out in the project Machine Learning for Space Weather which is a part of the CWI-INRIA International Lab.

References