This repository has been archived by the owner on Dec 31, 2020. It is now read-only.

Operations

Eron Wright edited this page Sep 6, 2017 · 11 revisions

Please see the Examples to best understand how to use flink-htm.

Imports

To get started, import the flink-htm API into your streaming program.

Java

import org.numenta.nupic.flink.streaming.api.HTM;

Scala

import org.apache.flink.streaming.api.scala._
import org.numenta.nupic.encoders.scala._
import org.numenta.nupic.flink.streaming.api.scala._

DataStream Transformations

flink-htm provides transformations of Flink DataStreams for online learning with HTM.

Here's a usage example:

val anomalyScores: DataStream[(DateTime,Double)] = env
      .addSource(nycTraffic)
      .keyBy("streamId")
      .learn(network)
      .resetOn(input => input.isStartOfDay)
      .select(inference => (inference._1.datetime, inference._2.getAnomalyScore))

Learn

DataStream → HTMStream

Processes every input element with a single global HTM model, producing one inference element per input. Because all elements pass through the same model, consider using a keyed stream to improve program parallelism.

The learn function accepts a network specification as described in the next section.

Learn

KeyedStream → HTMStream

Processes each input element of a keyed data stream with a separate HTM model for each key.
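The per-key behavior can be pictured with a small plain-Java sketch. It does not use the flink-htm API: CountingModel and PerKeyModels are hypothetical names, and the "model" only counts its inputs. The point is that state learned under one key never mixes with state learned under another.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an HTM model: it only counts the inputs it has seen.
class CountingModel {
    private long seen = 0;

    long process(double value) {
        seen++;
        return seen;
    }
}

// Routes each element to the model owned by its key, creating models lazily.
class PerKeyModels {
    private final Map<String, CountingModel> models = new HashMap<>();

    long process(String key, double value) {
        return models.computeIfAbsent(key, k -> new CountingModel()).process(value);
    }

    int modelCount() {
        return models.size();
    }
}
```

In a real job, Flink manages this partitioning itself once the stream is keyed; the map above only illustrates the isolation between keys.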

Select

HTMStream → DataStream

Transforms each inference produced by the HTMStream. For each input element, the corresponding inference contains:

  • an anomaly score for that input, and
  • results computed by the algorithms contained in the configured HTM network, typically prediction data.

Statefulness: Keep in mind that predictions relate to future inputs. To compare a predicted value to an actual value, a stateful function is needed to store the prediction. See Working with State for more information. The HotGym example demonstrates how to use mapWithState to correctly compare a predicted value with the actual value.
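As a rough illustration of why state is needed, the following plain-Java sketch holds the prediction made for the next input and scores it once that input arrives. PredictionTracker is a hypothetical helper, not part of flink-htm; in a Flink job the stored value would live in managed state (for example through mapWithState, as the HotGym example does) rather than in an ordinary field.

```java
import java.util.OptionalDouble;

// Remembers the prediction made at one step so it can be scored
// against the actual value that arrives at the next step.
class PredictionTracker {
    private OptionalDouble pending = OptionalDouble.empty();

    // Returns the absolute error of the previously stored prediction,
    // or empty when no prediction has been stored yet (first element).
    OptionalDouble update(double actual, double predictionForNext) {
        OptionalDouble error = pending.isPresent()
                ? OptionalDouble.of(Math.abs(pending.getAsDouble() - actual))
                : OptionalDouble.empty();
        pending = OptionalDouble.of(predictionForNext);
        return error;
    }
}
```

Comparing a prediction with the actual value from the same element would score the model against an input it was never asked to predict, which is the mistake the stored value avoids.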

Reset

HTMStream → HTMStream

Applies a reset function to the HTMStream to identify inputs that mark the start of a temporal sequence. For example, given a stream with the repeating sequence A,B,C,D,E, apply a reset function that returns true whenever 'A' is encountered.

Statefulness: Use a stateful reset function for complex sequences, such as when the end of a sequence is easier to identify than its start.
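The two kinds of reset function can be sketched as plain Java predicates. ResetFunctions and its method names are hypothetical and independent of the flink-htm API. The stateless variant recognizes the start marker directly. The stateful variant only recognizes the end marker, so it remembers that it saw one and flags the following element.

```java
import java.util.function.Predicate;

class ResetFunctions {
    // Stateless: the first element of each sequence is recognizable on its own.
    static Predicate<String> onStartMarker(String startMarker) {
        return startMarker::equals;
    }

    // Stateful: only the last element is recognizable, so the element
    // that follows it is reported as the start of a new sequence.
    static Predicate<String> afterEndMarker(String endMarker) {
        return new Predicate<String>() {
            private boolean previousWasEnd = false;

            @Override
            public boolean test(String input) {
                boolean reset = previousWasEnd;
                previousWasEnd = endMarker.equals(input);
                return reset;
            }
        };
    }
}
```

With the sequence A,B,C,D,E repeating, afterEndMarker("E") returns true for each 'A' that follows an 'E'. The very first element of the stream is not flagged, because no end marker precedes it. The flag here is an ordinary field; in a distributed job it would need to be kept in state that survives failures.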

Reliability

The flink-htm operators fully support the checkpointing mechanism of Flink. At checkpoint time, the HTM model is saved to checkpoint storage. In the event of failure, the HTM model is restored from the most recent checkpoint. The program should then resume normally without losing the state learned up to that checkpoint.

Note on savepoints: the stored format of HTM models does not support format upgrades. Do not upgrade the flink-htm dependency in an application that you expect to restore from a savepoint.
