SDR: Implement ways to measure quality of produced SDR #155
Yes, I agree. Having some sort of measure would be very useful. 👍
Great idea! I would add stats: min/mean/std-dev/max for activeDutyCycles, and then binary entropy, a single fraction (in the range 0-1) which describes utilization. The SDR class has a hook which is called every time its value is updated; that could be useful for this task.
So maybe two types of implementation. For the first type, I'd like only metrics that are computed instantly, just from the SDR, to make it simpler (no logic needs to be added to the SP) and faster.
For a single bit, whole SDR, or the SP?
We could split these metrics into different methods, and then have a print method which calls all four. Then the min/max can be computed fast and separately, but the print method (which is typically only called once at the end of the program) can display all of the stats.
Entropy is for the activation frequency of the SDR as a whole. Here is my python function for it:
Then to scale entropy into the range [0, 1], simply divide by the theoretical maximum entropy, which is reached when every bit is active at exactly the mean sparsity rate.
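A minimal sketch of both pieces, assuming per-bit activation frequencies in [0, 1] (the function names and exact normalization here are illustrative, not the original code):

```python
import numpy as np

def binary_entropy(frequencies):
    """Mean binary entropy (in bits) of the per-bit activation frequencies."""
    p = np.asarray(frequencies, dtype=float)
    q = 1.0 - p
    # Treat 0 * log2(0) as 0.
    h = -(p * np.log2(p, out=np.zeros_like(p), where=p > 0)
          + q * np.log2(q, out=np.zeros_like(q), where=q > 0))
    return float(h.mean())

def normalized_entropy(frequencies, sparsity):
    """Scale into [0, 1]: the theoretical maximum is reached when every
    bit is active at exactly the mean sparsity rate."""
    return binary_entropy(frequencies) / binary_entropy([sparsity])
```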
Another good idea, to which I would add mean & std. Min & max tell you about the extremes & outliers, which can be helpful for spotting bugs. Mean & std tell you about its normal operating behaviour. Yet another interesting metric to track is: average overlap between consecutive assignments to an SDR. This measures how quickly an SDR changes, sort of like a derivative. I have used this in past experiments to measure the quality of encoders, with regards to the semantic similarity property. I've also used this metric in experiments with layer 2/3 cell stability / view-point invariance.
Yes, I'd like the metric to provide a mapping to [0, 1], but (also) return the separate stats.
First I thought of "quality" as a one-shot measure of an SDR; you're suggesting to also add statistics over the run of the program on the dataset (which is a good thing!). The only question is whether these should be separate (quality of the SDR, and stats of the SP) or kept together in one.
And the "average overlap between consecutive assignments to an SDR": what do you mean by this? If it's the overlap between 2 consecutive (any) SDR values produced by the SP, that imho has no meaning, as these do not have to be correlated in any way...?
I think a good way to organize all of these would be to give each metric its own class, and then create a class which ties them all together. Each metric could follow a common design pattern, such as:
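A hypothetical sketch of one such common pattern, assuming the SDR update hook mentioned earlier (all names here are illustrative, not the repo's API):

```python
class MetricBase:
    """Common pattern: subscribe to an SDR's update hook, accumulate
    statistics online, and pretty-print a summary on demand."""
    def __init__(self, sdr, period):
        self.period = period              # time constant for rolling averages
        sdr.add_callback(self.add_data)   # assumed name of the update hook

    def add_data(self, sdr):
        raise NotImplementedError         # each metric accumulates here

    def __str__(self):
        raise NotImplementedError         # min/mean/std/max, entropy, etc.
```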
We can evaluate sparsity quite well, but this distributed-ness? Would information/entropy over column activations, over the run, over the dataset (too many over-s :D) be enough? In the SP we can use (active)DutyCycles as well...
I would like to turn this into a paper. The main ideas are:
Yes.
This is only relevant for time-series datasets. The encoder output should have an overlap when the input value is moving slowly and smoothly, which indicates semantic similarity between encoded values. The SP should have very little overlap because it should map similar inputs to distinct outputs. The column-pooler should have a significant average overlap because it is supposed to do view-point invariance.
For reference: I got a lot of ideas for statistics by reading Numenta's papers. In their SP paper they describe several ways to measure the quality of their results. IIRC the SDR paper was also useful.
In this context I think that "distributed" means "decorrelated". You can measure the correlation between two SDRs, and between every pair of SDRs in a set, and then average those correlations together into a single result describing overall quality. In past experiments I've measured correlations between & within labelled categories, which I found useful.
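For instance, a rough sketch of averaging pairwise overlaps across a set of SDRs (overlap stands in here for a proper correlation measure; the helper is hypothetical):

```python
import itertools

def mean_pairwise_overlap(sdrs):
    """Average fraction of shared active bits over all pairs in a set.
    Lower values suggest more decorrelated, better-distributed SDRs."""
    sets = [set(s) for s in sdrs]
    pairs = list(itertools.combinations(sets, 2))
    if not pairs:
        return 0.0
    return sum(len(a & b) / max(min(len(a), len(b)), 1)
               for a, b in pairs) / len(pairs)
```

The same helper can be run within a labelled category and between categories, and the two averages compared.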
Alternatively, this info would be great for our wiki too. It would be helpful for other people to understand how to build & debug HTM systems. I have been meaning to write on the htm-community wiki. I've started writing a wiki in my fork of nupic.cpp but it's not done yet. I am hoping to turn the wiki into a practical guide for using HTMs. The Numenta wiki already has a lot of good material & docs which we should copy into this wiki at some point.
What would be good datasets to test this?
I'm trying to figure out how to eliminate the error caused by encoders, which are written by hand. We could use a set of SDRs and just modify them (to get semantically similar data with a known difference); MNIST would be a good example from a practical domain. Also, would c++ or py be the better repo to start this research in?
This would be useful in conjunction with any encoder. Use artificial data as input so that you can control the rate at which it changes, and check that the resulting SDR has a reasonable average overlap. The SP-AverageOverlap class should use an exponential rolling average, so it is possible to get the exact overlap (rather than an average) for testing purposes by setting its parameter to 1.
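A minimal sketch of that idea (the class name and update method are assumptions, not the actual API):

```python
class AverageOverlap:
    """Exponential rolling average of the overlap between consecutive
    SDR assignments. With alpha=1 the value is exactly the overlap of
    the two most recent assignments, which is handy for testing."""
    def __init__(self, alpha=0.005):
        self.alpha = alpha
        self.prev = None
        self.value = 0.0

    def add_data(self, active_bits):
        current = set(active_bits)
        if self.prev:
            overlap = len(current & self.prev) / len(self.prev)
            self.value += self.alpha * (overlap - self.value)
        self.prev = current
```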
An artificial dataset. Numenta created 3D objects to test this. In my experiments I used words: I encoded each letter of the alphabet as a random SDR, and fed the two-layer network a sequence of words (with whitespace removed). I judged the quality of layers 2/3 by the average overlap, as well as a more detailed analysis of the actual overlaps within & between categories (where each word is a category).
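A sketch of that setup, assuming one fixed random SDR per letter (sizes and names are illustrative):

```python
import numpy as np

def make_letter_sdrs(size=2048, active=40, seed=42):
    """One fixed random SDR per letter, so a word becomes a sequence of
    known SDRs with no built-in semantic similarity between letters."""
    rng = np.random.default_rng(seed)
    return {ch: set(rng.choice(size, active, replace=False).tolist())
            for ch in 'abcdefghijklmnopqrstuvwxyz'}

letters = make_letter_sdrs()
stream = [letters[ch] for ch in 'thequickbrownfox']  # whitespace removed
```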
IMO C++. I would rather make this repo really good, and then have python bindings.
Related Numenta papers: https://arxiv.org/abs/1503.07469 https://arxiv.org/abs/1602.05925 (please add more resources)
From the SP paper: Two more metrics for the SP, not generic for all SDRs. These metrics depend on an input dataset and prior training, so there is some work required from the user.
From "Properties of Sparse Distributed Representations and their Application to Hierarchical Temporal Memory": Both of the following metrics could be methods of TM class.
Cell death experiments: We could make an SDR subclass which kills a fraction of cells in an SDR and filters them out of its value.
I figured most of the interesting metrics would be task (dataset) dependent, in the form of a sliding window, as HTM is doing online learning.
I'd add this under the autoassociative memory experiment, with dropout:
Also, about this: I would not add a subclass, but a constructor param.
FP, FN rates: 👍
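A hedged sketch of the cell-death idea above, using a constructor parameter rather than a subclass (everything here is hypothetical, not the repo's API):

```python
import numpy as np

class DropoutFilter:
    """Permanently 'kills' a random fraction of cells and filters them
    out of every SDR value passed through it."""
    def __init__(self, size, dead_fraction=0.0, seed=None):
        rng = np.random.default_rng(seed)
        n_dead = int(round(dead_fraction * size))
        self.dead = set(rng.choice(size, n_dead, replace=False).tolist())

    def __call__(self, active_bits):
        # Drop the dead cells from the SDR's active set.
        return [b for b in active_bits if b not in self.dead]
```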
Some other hypotheses to verify:
Update: Implemented in PR #184
TODO: This is not critical, but maybe useful: I'd like all the SDR metrics to have another constructor which does not accept an SDR; instead the user must call Metric.addData( SDR ). This lets users manage their own data and is a more flexible solution (a sketch follows below).

UPDATE Summary: Ideas which are discussed here but not yet implemented:
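Regarding the TODO above, a sketch of the two construction modes, continuing the hypothetical MetricBase pattern from earlier (the hook and SDR attribute names are assumptions):

```python
class Sparsity:
    """Example metric with both modes: attach to an SDR's update hook,
    or construct without one and feed data manually via addData()."""
    def __init__(self, sdr=None, period=1000):
        self.period = period
        self.value = 0.0
        if sdr is not None:
            sdr.add_callback(self.addData)  # automatic mode (assumed hook)

    def addData(self, sdr):
        # Manual mode: the user calls this with each new SDR value.
        sparsity = len(sdr.active) / sdr.size   # assumed SDR attributes
        self.value += (sparsity - self.value) / self.period
```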
I accidentally made a bug in the mnist branch, which resulted in a 2% decrease in accuracy, from 95% to 93%. This bug also caused the entropy to drop from ~95% to less than 75%!
Relevant classes:
Why?
Functionality:
SDR = sparse distributed representation
Implementation:
Hypothesis:
EDIT: latest update 14/01/2019
Update: Implemented in PR #184
Summary: Ideas which are discussed here but not yet implemented: