Analytics Toolkit Overview
The Analytics Toolkit (AT) provides an extensible API for ETL, feature engineering, graph building and querying, and machine learning analytics, with deep programming language integration.
- Python package for data scientists:
  - Makes Big Data easier to use.
- REST web service:
  - Protects clients from needing to know details of the Big Data services being used.
- Processing engine:
  - Coordinates, executes, monitors, audits.
  - Implements all the analytics and machine learning.
- Operations on tabular data:
  - Import from HDFS files.
  - Add/remove columns, impute missing values.
  - Filter, join.
  - Machine learning.
- Operations on graphical data:
  - Load from tabular data frames.
  - Query.
  - Machine learning.
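To make the "filter, then load a graph from tabular data, then query" flow listed above concrete, here is a minimal sketch in plain Python. It does not use the AT API; the `rows` table, its column names, and the helpers `filter_rows`, `build_graph`, and `neighbors` are all hypothetical and only illustrate the idea.

```python
# Plain-Python illustration only; none of these names come from the AT API.

# Hypothetical edge table: each row describes one relationship.
rows = [
    {"src": "alice", "dst": "bob", "weight": 3},
    {"src": "bob", "dst": "carol", "weight": 1},
    {"src": "alice", "dst": "carol", "weight": 0},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def build_graph(rows, src_col, dst_col):
    """Build a directed adjacency map from two columns of a row table."""
    graph = {}
    for row in rows:
        graph.setdefault(row[src_col], set()).add(row[dst_col])
        # Make sure vertices with no outgoing edges still appear.
        graph.setdefault(row[dst_col], set())
    return graph


def neighbors(graph, vertex):
    """Query: sorted list of vertices reachable in one step."""
    return sorted(graph.get(vertex, ()))


# Filter the table, then load the surviving rows into a graph and query it.
kept = filter_rows(rows, lambda row: row["weight"] > 0)
graph = build_graph(kept, "src", "dst")
print(neighbors(graph, "alice"))
```

In the toolkit itself these steps run on the server-side processing engine rather than in local memory; see the AT User Documentation for the actual frame and graph calls.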
Read the AT User Documentation for the full API documentation.
You can create an Analytics Toolkit for Apache Hadoop* software server directly from the Data Scientist View in the Console. Enter the desired toolkit server instance name and click the Create new instance button.
To use the toolkit Python client on a Windows-based computer, you need to install the Anaconda Python distribution. It is available as an MSI installer on the Anaconda download page.
Before installing the toolkit Python client, you must first install NumPy and Pandas through Anaconda. Then open a command line terminal and run the following command to install the client:
$ pip install --extra-index-url https://pypi.analyticstoolkit.intel.com/latest/simple/ trustedanalytics
All the dependencies for the Analytics Toolkit for Apache Hadoop* software package will be installed automatically.
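As a quick sanity check for the NumPy and Pandas prerequisite, a short script along these lines can report which packages are not importable from the current interpreter. The helper name `missing_packages` is hypothetical; the script relies only on the standard library.

```python
import importlib


def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Module names for the two prerequisites named in this section.
    not_found = missing_packages(["numpy", "pandas"])
    if not_found:
        print("Install through Anaconda first: " + ", ".join(not_found))
    else:
        print("NumPy and Pandas are available.")
```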
You will need Python 2.7 and pip installed before you can install the toolkit client.
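One way to confirm both prerequisites is a small standard-library script such as the sketch below. The helper names `is_supported_python` and `pip_available` are hypothetical, and the pip check assumes that pip can be invoked as `python -m pip`.

```python
import os
import subprocess
import sys

REQUIRED_PYTHON = (2, 7)  # version stated in this section


def is_supported_python(version_info, required=REQUIRED_PYTHON):
    """True if the (major, minor) part of version_info matches required."""
    return tuple(version_info[:2]) == required


def pip_available(python=sys.executable):
    """True if `<python> -m pip --version` exits successfully."""
    with open(os.devnull, "w") as devnull:
        try:
            status = subprocess.call(
                [python, "-m", "pip", "--version"],
                stdout=devnull,
                stderr=devnull,
            )
        except OSError:
            # The interpreter path does not exist or is not executable.
            return False
    return status == 0


if __name__ == "__main__":
    print("Python 2.7: %s" % is_supported_python(sys.version_info))
    print("pip found:  %s" % pip_available())
```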
After installing Python 2.7 and pip, run the following command to install the toolkit client:
$ pip install --extra-index-url https://pypi.analyticstoolkit.intel.com/latest/simple/ trustedanalytics
After the installation of the toolkit Python client, you must configure it to connect to the desired REST server.
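A minimal sketch of that configuration step is shown below. It assumes the client exposes a `server.uri` setting and a `connect()` call that optionally takes a credentials file; check both against the AT User Documentation. The wrapper `configure_client`, the host name, and the credentials file name are hypothetical placeholders.

```python
try:
    import trustedanalytics as ta
except ImportError:
    # The client is not installed in this environment.
    ta = None


def configure_client(server_uri, credentials_file=None):
    """Point the client at a REST server and open a connection.

    Assumption: `ta.server.uri` and `ta.connect()` behave as described
    in the AT User Documentation; verify before relying on this.
    """
    if ta is None:
        raise RuntimeError("trustedanalytics is not installed")
    ta.server.uri = server_uri
    if credentials_file is None:
        ta.connect()
    else:
        ta.connect(credentials_file)


# Example usage (placeholder values, left commented out on purpose):
# configure_client("my-toolkit-server.example.com", "my-user.creds")
```

Wrapping the import in `try`/`except` lets the script fail with a clear message when the client has not been installed yet.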