SDR: Implement ways to measure quality of produced SDR #155

Open · 4 of 11 tasks
breznak opened this issue Dec 10, 2018 · 19 comments
Labels: enhancement (New feature or request), research (new functionality of HTM theory, research idea), SP

breznak commented Dec 10, 2018

Relevant classes:

  • SpatialPooler
  • SDR
  • Topology

Why?

  • when making changes to SP, we don't have ways to measure the quality of its outputs: SDRs.

Functionality:
SDR = sparse distributed representation

  • sparse:
    • min, max active bits in SDR, compared to % size (see the sparsity sketch after this list)
    • avg distance between active bits in SDR (should be similar for all SDRs), uses Topology
  • distributed:
    • uses activeDutyCycles of the active bits (=columns) to check that all columns are used equally
    • information/entropy of the bit/SDR
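
A minimal sketch of the instant, per-SDR sparsity checks above (the names and the flat numpy-array representation are illustrative, not a final API):

import numpy as np

def sparsity(sdr):
    # Fraction of active bits in a single binary SDR (0/1 numpy array).
    return np.count_nonzero(sdr) / sdr.size

def sparsity_stats(sdrs):
    # Min / mean / max sparsity over a stream of SDRs, to compare against the target % size.
    s = [sparsity(x) for x in sdrs]
    return min(s), float(np.mean(s)), max(s)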

Implementation:

  • helper method to SP (SDR?)

Hypothesis:

  • what is a "quality" SDR?
  • does higher quality SDR translate to better (how?) results? (in what?)

EDIT: latest update 14/01/2019

Update: Implemented in PR #184

  • SDR Sparsity Metrics
  • SDR Activation Frequency Metrics
  • SDR Average Overlap Metrics
  • SDR All Metrics Convenience Class

Summary: Ideas which are discussed here but not yet implemented:

  • Cell death (via method SDR.killCells)
  • SDR topology
  • SP noise resistance (via method SDR.addNoise, also the example sp_tutorial will demonstrate this)
  • SP long term stability
  • TM estimate false positive & negative rates
  • Test hypotheses
  • Write about how to measure HTM's using these metrics

dkeeney commented Dec 10, 2018

when making changes to SP, we don't have ways to measure the quality of its outputs: SDRs.

Yes, I agree. Having some sort of measure would be very useful. 👍

@ctrl-z-9000-times

Great idea!

I would add stats: min/mean/std-dev/max for activeDutyCycles, and then binary entropy, which is a single fraction (in range 0-1) that describes utilization.

The SDR class has a hook which is called every time its value is updated; could that be useful for this task?


breznak commented Dec 11, 2018

I would add stats: min/mean/std-dev/max for activeDutyCycles

So maybe two types of implementation; for the first type, I'd like only metrics that can be computed instantly, just from the SDR. That keeps it simpler (no logic needs to be added to SP) and faster.

entropy .. which describes utilization.

For a single bit, whole SDR, or the SP?

@ctrl-z-9000-times

We could split these metrics into different methods, and then have a print method which calls all four. Then the min/max can be computed fast and separately, but the print method (which is typically only called once at end of program) can display all of the stats.

class SDR_ActivationFrequency {
public:
    SDR_ActivationFrequency( SDR &dataSource );
    Real min();
    Real max();
    Real mean();
    Real std();
    Real entropy();
    String pretty_print(); // Uses all the metrics.
};

entropy .. which describes utilization.

For a single bit, whole SDR, or the SP?

Entropy is for the activation frequency of the SDR as a whole. Here is my python function for it:

import numpy as np

def _binary_entropy(p):  # p is an array of floats in range [0, 1]
    p_ = (1 - p)
    s  = -p * np.log2(p) - p_ * np.log2(p_)
    return np.mean(np.nan_to_num(s))

Then to scale entropy into range [0, 1] simply divide by the theoretical maximum entropy which is: entropy(mean(activationFrequency)).
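
For example, a possible normalization following that recipe (a sketch only, not part of any existing API; it reuses the _binary_entropy function above):

# Assumes the _binary_entropy() function defined above.
def normalized_entropy(activation_frequency):
    # Entropy of per-bit activation frequencies, scaled into [0, 1].
    max_entropy = _binary_entropy(np.full(1, np.mean(activation_frequency)))
    return _binary_entropy(activation_frequency) / max_entropy

# A perfectly uniform SP, every column active 2% of the time, scores 1.0:
print(normalized_entropy(np.full(2048, 0.02)))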

@ctrl-z-9000-times

min, max active bits in SDR, compared to % size

Another good idea, to which I would add mean & std. Min & max tell you about the extremes & outliers, which can be helpful for spotting bugs. Mean & std tell you about its normal operating behaviour.

Yet another interesting metric to track is: Average overlap between consecutive assignments to an SDR. This measures how quickly an SDR changes, sort of like a derivative. I have in past experiments used this to measure the quality of encoders, w/ regards to semantic similarity property. I've also used this metric in experiments with Layer 2/3 cell stability / view-point invariance.


breznak commented Dec 11, 2018

split these metrics into different methods,

Yes, I'd like the metric to provide a mapping to [0, 1], but (also) return the separate stats.

Mean & std tell you about its normal operating behaviour.

At first I thought of "quality" as a one-shot measure of an SDR; you're suggesting adding statistics over the run of the program on the dataset (which is a good thing!). The only question is whether these should be separate: quality of the SDR, and stats of the SP. Or keep it together in one.

def _binary_entropy(p):  # p is an array of floats in range [0, 1]

And the p here is? activation freq for each column(bit) after N runs?

Yet another interesting metric to track is: Average overlap between consecutive assignments to an SDR

What do you mean by this? If it's the overlap between 2 consecutive (any) SDR values produced by the SP, that imho has no meaning, as these do not have to be correlated in any way...?

@ctrl-z-9000-times

I think a good way to organize all of these would be to give each metric its own class. Then create a class named SDR_Metrics which would gather up all of the metrics into a single easy to use package.

Each metric could follow a common design pattern, such as:

class SDR_MetricName {
public:
    SDR_MetricName( SDR &dataSource, ... );
    Real statistics(); // Min Mean Std Max
    String print();
    void save( ... );
    void load( ... );
};


breznak commented Dec 11, 2018

  • I'm looking into measuring the following property of an SDR:
    "Distributed = each bit can be reused in several different contexts (SDRs), and a collection of multiple bits is unique (an SDR)"

We can evaluate sparsity quite well, but what about this distributed-ness? Would information/entropy over column activations over the run over the dataset (too many over-s :D) be enough? In SP we can use (active)DutyCycles as well...


breznak commented Dec 11, 2018

I would like to turn this into a paper. The main ideas are:

  • we can & should measure the quality of the encoding (SDRs) - how? What features?
  • (How) does the quality correlate with good algorithm results? (prediction & anomaly)
  • Compare quality of (output) encodings of other ML algorithms. (Which? Only sparse representations?)
    • sparse auto-encoders
    • cortical.io retina (SDRs for NLP)
    • biological (BCI data from which regions, retina, ...)

@ctrl-z-9000-times

And the p here is? activation freq for each column(bit) after N runs?

Yes.

What do you mean by [average overlap]? If it's the overlap between 2 consecutive (any) SDR values produced by the SP, that imho has no meaning, as these do not have to be correlated in any way...?

This is only relevant for time-series datasets. The encoder output should have an overlap when the input value is slowly and smoothly moving, which indicates semantic similarity between encoded values. The SP should have very little overlap because it should map similar inputs to distinctive outputs. The column-pooler should have a significant average overlap because it is supposed to do view-point invariance.

@ctrl-z-9000-times

For reference: I got a lot of ideas for statistics by reading numenta's papers. In their SP paper they describe several ways to measure the quality of their results. IIRC the SDR paper was also useful.

"Distributed = each bit can be REused in several different contexts(SDRs), and a collection of multiple bits is unique (a SDR)"

In this context I think that "distributed" means "decorrelated". You can measure the correlation between two SDRs, and between every pair of SDRs in a set, and then average those correlations together into a single result describing overall quality. In past experiments I've measured correlations between & within labelled categories, which I found useful.
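
A rough sketch of that averaging (illustrative only; here "correlation" is approximated by the normalized overlap between the active-bit sets of binary numpy arrays):

import itertools
import numpy as np

def overlap(a, b):
    # Shared active bits, normalized by the smaller active-bit count.
    shared = np.count_nonzero(np.logical_and(a, b))
    return shared / max(1, min(np.count_nonzero(a), np.count_nonzero(b)))

def mean_pairwise_overlap(sdrs):
    # Average overlap over every pair of SDRs in a set; lower means more decorrelated.
    pairs = itertools.combinations(sdrs, 2)
    return float(np.mean([overlap(a, b) for a, b in pairs]))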

I would like to turn this into a paper

Alternatively, this info would be great for our wiki too. It would be helpful for other people to understand how to build & debug HTM systems. I have been meaning to write on the htm-community wiki. I've started writing a wiki in my fork of nupic.cpp but it's not done yet. I am hoping to turn the wiki into a practical guide for using HTMs. The numenta wiki already has a lot of good material & docs which we should copy into this wiki at some point.


breznak commented Dec 20, 2018

Average overlap between consecutive assignments to an SDR. This measures how quickly an SDR changes, sort of like a derivative. I have in past experiments used this to measure the quality of encoders, w/ regards to semantic similarity property. I've also used this metric in experiments with Layer 2/3 cell stability / view-point invariance.

What would be good datasets to test this?

You can measure the correlation between two SDRs

I'm trying to figure out how to eliminate the (error) caused by encoders, which are written by hand. We could use a set of SDRs and just modify them (to get semantically similar data with a known difference); MNIST would be a good example from a practical domain.

Also, would c++ or py be the better repo to start this research in?

@ctrl-z-9000-times

What would be good datasets to test [SP-AverageOverlap]?

This would be useful in conjunction with any encoder. Use artificial data as input so that you can control the rate it changes at, and check that the resulting SDR has a reasonable average overlap. The SP-AverageOverlap class should use an exponential rolling average, so it is possible to get the exact overlap (rather than an average) for testing purposes by setting its parameter to 1.
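
A minimal sketch of that idea (class and method names are hypothetical, not the PR #184 API): an exponential rolling average of the overlap between consecutive SDR values, where alpha = 1 reduces to the exact overlap of the last two values:

import numpy as np

class AverageOverlap:
    # Exponential rolling average of overlap between consecutive binary SDRs.
    def __init__(self, alpha=0.005):
        self.alpha = alpha      # alpha = 1.0 -> exact overlap of the last two SDRs
        self.prev  = None
        self.value = 0.0

    def add_data(self, sdr):
        if self.prev is not None:
            active  = max(1, np.count_nonzero(sdr))
            current = np.count_nonzero(np.logical_and(sdr, self.prev)) / active
            self.value += self.alpha * (current - self.value)
        self.prev = np.copy(sdr)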

What would be good datasets to test [Layer 2/3 cell stability / view-point invariance]?

An artificial dataset. Numenta created 3D objects to test this.

In my experiments I used words: I encoded each letter of the alphabet as a random SDR, and fed the two layer network a sequence of words (with whitespace removed). I judged the quality of layers 2/3 by the average overlap, as well as a more detailed analysis of the actual overlaps within & between categories (where each word is a category).

Also, would c++ or py be the better repo to start this research in?

IMO C++. I would rather make this repo really good, and then have python bindings.

@ctrl-z-9000-times

From the SP paper: Two more metrics for the SP, not generic for all SDRs. These metrics depend on an input dataset and prior training, so there is some work required from the user.

  • Noise resistance
    • We could have a method SP.computeNoisy(inputs, outputs, percentNoise) -> noiseResistance which would calculate this metric and return the percent overlap between the clean & noisy results (see the sketch after this list).
  • Long term stability of inputs -> outputs
    • Method SDR.overlap() can help with this.
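
A sketch of the noise-resistance measurement mentioned in the list above (sp_compute is a hypothetical wrapper around the SP's compute call, not an existing API): flip percentNoise of the input bits and report the average overlap between the clean and noisy outputs.

import numpy as np

def noise_resistance(sp_compute, inputs, percent_noise, seed=0):
    # sp_compute(input_bits) -> binary output array (hypothetical wrapper around the SP).
    rng = np.random.default_rng(seed)
    overlaps = []
    for x in inputs:
        clean_out = sp_compute(x)
        noisy = np.copy(x)
        flip  = rng.choice(noisy.size, int(percent_noise * noisy.size), replace=False)
        noisy[flip] = 1 - noisy[flip]        # flip the chosen input bits
        noisy_out = sp_compute(noisy)
        shared = np.count_nonzero(np.logical_and(clean_out, noisy_out))
        overlaps.append(shared / max(1, np.count_nonzero(clean_out)))
    return float(np.mean(overlaps))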

From "Properties of Sparse Distributed Representations and their Application to Hierarchical Temporal Memory": Both of the following metrics could be methods of TM class.

  • False positive rate (estimate): needs input SDR-Sparsity (see the sketch after this list)
  • False negative rate (estimate): needs input SDR-Sparsity & percentNoise
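
A sketch of the false-positive estimate from that paper, referenced in the list above (assuming both SDRs have w of n bits active and an exact-match threshold theta):

from math import comb

def false_positive_rate(n, w, theta):
    # P(a random SDR with w of n bits active overlaps a fixed SDR in >= theta bits).
    matches = sum(comb(w, b) * comb(n - w, w - b) for b in range(theta, w + 1))
    return matches / comb(n, w)

# e.g. n=2048, w=40, theta=20 gives an astronomically small false-match chance:
print(false_positive_rate(2048, 40, 20))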

Cell death experiments: We could make an SDR subclass which kills a fraction of cells in an SDR and filters them out of its value.


breznak commented Dec 22, 2018

These metrics depend on an input dataset and prior training, so there is some work required from the user.

I figured most of the interesting metrics would be task (dataset) dependent, in the form of a sliding window, as HTM is doing online learning.

Noise resistance

I'd add this under the autoassociative memory experiment, with dropout:

Cell death experiments: We could make an SDR subclass which kills a fraction of cells in an SDR and filters them out of its value.

Also, about this

Cell death experiments:

I would not add a subclass, but a constructor param float dropoutRatio that kills (=flips) each bit randomly with the given probability.
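
A minimal sketch of that parameter's effect (dropoutRatio is the name suggested above; the function itself is hypothetical, not the SDR.killCells API from the summary):

import numpy as np

def apply_dropout(sdr, dropout_ratio, seed=0):
    # Flip each bit of a binary SDR (0/1 array) independently with probability dropout_ratio.
    rng   = np.random.default_rng(seed)
    flips = rng.random(sdr.shape) < dropout_ratio
    return np.logical_xor(sdr, flips).astype(np.uint8)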

False positive rate (estimate): needs input SDR-Sparsity
False negative rates (estimate): needs input SDR-Sparsity & percentNoise

FP, FN rates: 👍


breznak commented Dec 22, 2018

Some other hypotheses to verify:

  • H3: low accumulated SDR quality -> a hint to change the (running) params of the network (SP, TM, ... params); anomalies should be ignored (as it implies "I don't understand the problem").

  • H4: Quality acts as a "confidence measure", orthogonal to anomaly score. Allows us to say: "I'm highly confident this is a contextual anomaly" (=high quality, high anomaly) vs. "anomaly && low quality" = "I'm new to the problem, don't take predictions too seriously" (= we may filter out the anomaly) vs (high quality & low anomaly) -> "don't filter out, just small anomaly, but I'm confident about that"

  • H5: a cumulative quality drop indicates a domain change -> could trigger auto-reset(), param tuning, or just hint at the domain change (e.g. a sine wave switches to a stairs pattern); find datasets for this.


ctrl-z-9000-times commented Jan 6, 2019

Update: Implemented in PR #184

  • SDR Sparsity Metrics
  • SDR Activation Frequency Metrics
  • SDR Average Overlap Metrics
  • SDR All Metrics Convenience Class

TODO: This is not critical, but maybe useful? I'd like all the SDR Metrics to have another constructor which does not accept an SDR; instead the user must call Metric.addData( SDR ). This lets users manage their own data and is a more flexible solution.

UPDATE: Metric.addData( SDR ) implemented.

Summary: Ideas which are discussed here but not yet implemented:

  • Cell death (via SDR subclass)
  • SDR topology
  • SP noise resistance
  • SP long term stability
  • TM estimate false positive & negative rates
  • Test hypotheses
  • Write about how to measure HTM's using these metrics

@ctrl-z-9000-times

does higher quality SDR translate to better (how?) results? (in what?)

I accidentally introduced a bug in the mnist branch, which resulted in a 2% decrease in accuracy, from 95% to 93%. This bug also caused the entropy to drop from ~95% to less than 75%!
