Analytics Toolkit Overview

sharibenko edited this page May 24, 2016 · 3 revisions

AT Overview

The Analytics Toolkit (AT) provides an extensible API for ETL, feature engineering, graph building and querying, and ML analytics, with deep programming-language integration.

  • Python package for data scientists:
    • Makes Big Data easier to use.
  • REST web service:
    • Shields clients from the details of the underlying Big Data services.
  • Processing engine:
    • Coordinates, executes, monitors, audits.
    • Implements all the analytics and machine learning.

API functionality

  • Operations on tabular data:
    • Import from HDFS files.
    • Add/remove columns, impute missing values.
    • Filter, join.
    • Machine learning.
  • Operations on graph data:
    • Load from tabular data frames.
    • Query.
    • Machine learning.

Read the AT User Documentation for the full API documentation.

Client installation, configuration, and usage

Creating a toolkit server instance

You can create an Analytics Toolkit for Apache Hadoop* software server directly from Data Scientist View in the Console. Type the desired toolkit server instance name and click the Create new instance button.

Installation on Windows*

  1. To use the toolkit Python client on a Windows-based computer, install the Anaconda Python distribution. An MSI installer is available from the Anaconda download page.

  2. Before installing the toolkit Python client, install NumPy and Pandas through Anaconda. Then open a command-line terminal and run the following command to install the client:

    $ pip install --extra-index-url https://pypi.analyticstoolkit.intel.com/latest/simple/ trustedanalytics

    The remaining dependencies of the Analytics Toolkit for Apache Hadoop* software package are installed automatically.

Installation on Linux*

  1. You will need Python 2.7 and pip installed before you can install the toolkit client.

  2. After installing Python 2.7 and pip, run the following command to install the toolkit client:

    $ pip install --extra-index-url https://pypi.analyticstoolkit.intel.com/latest/simple/ trustedanalytics
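The two Linux steps above can also be scripted. The following is a minimal sketch, not part of the toolkit: the helper names `python_is_supported`, `build_install_command`, and `main` are hypothetical, and only the index URL and package name come from the command above. It runs pip through the chosen interpreter (`python -m pip`), so the client is installed into the same Python that was checked.

```python
import subprocess
import sys

# Index URL and package name are taken from the install command above.
INDEX_URL = "https://pypi.analyticstoolkit.intel.com/latest/simple/"
PACKAGE = "trustedanalytics"


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 2.7 (hypothetical helper)."""
    return tuple(version_info[:2]) == (2, 7)


def build_install_command(python=sys.executable):
    """Build the pip invocation as an argument list (hypothetical helper)."""
    return [python, "-m", "pip", "install",
            "--extra-index-url", INDEX_URL, PACKAGE]


def main():
    """Check the interpreter version, then run the install command."""
    if not python_is_supported():
        sys.exit("Python 2.7 is required, found %d.%d" % tuple(sys.version_info[:2]))
    sys.exit(subprocess.call(build_install_command()))
```

Calling `main()` from a Python 2.7 interpreter performs the check and then runs the install; under any other version it exits with an error message instead of installing.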

Configuration

After installing the toolkit Python client, you must configure it to connect to the desired REST server.
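As a rough illustration of what that configuration might look like, the sketch below sets the server address on the client module and then connects. It assumes the client exposes a `server.uri` attribute and a `connect()` function that optionally takes a credentials file; verify both against the AT User Documentation before relying on them. The `configure_client` helper, the hostname, and the credentials path are hypothetical.

```python
def configure_client(ta, server_uri, credentials_file=None):
    """Point the client module `ta` at a REST server and connect.

    Assumes `ta.server.uri` and `ta.connect()` exist as described in
    the lead-in; this is not confirmed by this page.
    """
    ta.server.uri = server_uri
    if credentials_file is None:
        ta.connect()
    else:
        ta.connect(credentials_file)
    return ta


# Example usage (hypothetical hostname and credentials path):
#
#   import trustedanalytics as ta
#   configure_client(ta, "atk-instance.example.com", "~/atk.creds")
```

Passing the module in as an argument keeps the helper independent of the installed package, so it can be exercised with a stand-in object before pointing it at a real server.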
