
[WIP] Add Tutorial and Derivations Notebooks for VALMOD #585 #586

Open
wants to merge 69 commits into main

Conversation

NimaSarajpoor
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor commented Apr 6, 2022

This notebook addresses issue #585. In this notebook, we would like to implement the VALMOD method proposed in VALMOD_2018 and VALMOD_2020.

What I have done so far:

  • Provided an introduction that gives the gist of the concept proposed in the paper
  • Calculated the lower-bound distance profile after correcting the typo in Eq. (1) of the paper, and verified the calculation against a np.random.uniform time series

For now, I calculated the LB for q > 0 (see Eq. (2) in the paper). However, we still need to find the LB when q <= 0.
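The q > 0 case can be sketched in NumPy. This is a rough illustration of my reading of Eq. (2) of VALMOD_2020, not the notebook's final code; the function name is hypothetical, and since the q <= 0 case is still open at this point, the sketch falls back to the trivial bound 0 there.

```python
import numpy as np

def lower_bound_dist_profile(q, ell):
    # Hypothetical helper illustrating my reading of Eq. (2) of
    # VALMOD_2020: for Pearson correlation q > 0 at length ell, the
    # z-normalized distance at any longer length is bounded below by
    # sqrt(ell * (1 - q**2)). The q <= 0 case is still open at this
    # point in the thread, so we fall back to the trivial bound 0.
    q = np.asarray(q, dtype=float)
    lb = np.zeros(q.shape)                        # trivial bound for q <= 0
    pos = q > 0
    lb[pos] = np.sqrt(ell * (1.0 - q[pos] ** 2))  # Eq. (2), q > 0 branch
    return lb

# Higher correlation => tighter (smaller) lower bound
print(lower_bound_dist_profile([0.9, 0.2, -0.4], ell=50))
```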

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@NimaSarajpoor
Copy link
Collaborator Author

@seanlaw
Please allow me some time to see whether I can calculate the LB for q <= 0 (see Eq. (2) of the paper). I will let you know when I am done...

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Apr 6, 2022

I had a miscalculation. Although there is a typo in the paper, Eq. (2) itself turns out to be correct. I had fixed the paper's typo while doing the calculation, but I made a mistake somewhere else; after correcting it, I arrived at Eq. (2) for q > 0. I will fix the notebook. (So I assume the equation is correct for the q <= 0 case as well.)

@codecov-commenter
Copy link

codecov-commenter commented Apr 6, 2022

Codecov Report

Patch and project coverage have no change.

Comparison is base (275b998) 99.24% compared to head (f6126ca) 99.24%.


Additional details and impacted files
@@           Coverage Diff           @@
##             main     #586   +/-   ##
=======================================
  Coverage   99.24%   99.24%           
=======================================
  Files          82       82           
  Lines       12956    12956           
=======================================
  Hits        12858    12858           
  Misses         98       98           


☔ View full report at Codecov.

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Apr 7, 2022

@seanlaw

The notebook is ready. It covers the first 12 pages of the VALMOD_2020 paper. I fixed my miscalculation and things are good now! I also implemented a lower-bound distance profile function for the time being to see how it performs (we may use it later in the VALMOD algorithm).

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Apr 9, 2022

Just wanted to let you know that you can ignore the function _calc_LB_dist_profile at the end of the notebook. It works, but I don't think it is clean, and I will probably remove it since the VALMOD algorithm does not use such a function. I created it only to compute the lower bound of the distance profile for now and show the result.

@seanlaw
Copy link
Contributor

seanlaw commented Apr 9, 2022

I will first need to go over the initial 12 pages myself and then I will review the notebook :)

@seanlaw
Copy link
Contributor

seanlaw commented Apr 10, 2022

@NimaSarajpoor I've gone over your notebook quickly but haven't verified the derivation. Usually, with derivations, I like to write things out fully without skipping any steps (see https://github.com/TDAmeritrade/stumpy/blob/main/docs/Matrix_Profile_Derivation.ipynb). Some of your equations don't seem to be rendering for me and it's a bit hard for me to follow. I can try to find some time to work through the derivation to verify your work if that's helpful?

@NimaSarajpoor
Copy link
Collaborator Author

I see. Please let me rewrite it. I will try to follow the same approach/style you used in the link you provided. I will check ReviewNB, and if it renders well, I will let you know. Sounds good?
(Btw, is it necessary to provide the derivation? I did it because of the typo in the paper, but then I realized the result is the same as Eq. (2) for q > 0.)

@seanlaw
Copy link
Contributor

seanlaw commented Apr 10, 2022

Yes, that would be great!

Personally, I think writing out the derivation clearly will help (me) and others reduce any doubt in understanding. Also, I find that it provides an opportunity to help maintain consistency in the code regarding variable names.

@NimaSarajpoor
Copy link
Collaborator Author

Weird...still not rendering well.... please let me do some investigation on my end to see what's going on...

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Apr 11, 2022

@seanlaw
So, I tried the Welford_Review, Matrix_Profile_Derivation, and Pearson notebooks to see if ReviewNB on GitHub can render them. Unfortunately, it cannot render the first two properly. The Pearson notebook renders properly, though!

I guess you wrote the notebooks on your end and pushed them to the repo, and things looked good when rendered locally as .ipynb. Did you, by any chance, check your notebooks via ReviewNB on GitHub?

It seems the problem is related to ReviewNB on GitHub. I enclosed the math equations with $$ and the problem is almost resolved: when I check ReviewNB here, one error still appears, but when I push the same notebook to a test repo I created and check it with ReviewNB there, that single error does not appear.

my_test_repo
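As an illustration of the fix (this is a generic example, not one of the notebook's actual derivation steps): wrapping display math in `$$` delimiters renders reliably, e.g. the standard identity relating z-normalized Euclidean distance and Pearson correlation:

```latex
$$ d_{i,j} = \sqrt{2\,\ell\,\bigl(1 - \rho_{i,j}\bigr)} $$
```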

@NimaSarajpoor
Copy link
Collaborator Author

@seanlaw
Just for the record:
If I click the purple ReviewNB button (at the top of this PR page), there still seems to be one rendering error. However, when I click the blue ReviewNB hyperlink (just below the purple button) and navigate to the notebook from my fork of STUMPY, everything seems fine and there is no rendering error...

STUMPY_my_fork

@seanlaw
Copy link
Contributor

seanlaw commented Apr 11, 2022

Sounds good

@seanlaw
Copy link
Contributor

seanlaw commented Apr 11, 2022

Apologies, these comments are for an older commit. I forgot to hit "Finish Review" along with my last comment.

@seanlaw
Copy link
Contributor

seanlaw commented Apr 11, 2022

@NimaSarajpoor I provided some comments and stopped at the "Expanding (3)" line

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Apr 11, 2022

@seanlaw
Thanks for the comments!

Apologies, these comments are for an older commit. I forgot to hit "Finish Review" along with my last comment.

I found two of those comments in ReviewNB. Maybe they got mixed together(?). I will address those two comments and then ignore/resolve the rest. Please let me know if I missed anything.

@NimaSarajpoor
Copy link
Collaborator Author

I think we are all set. I can push commits after revising the notebook.

@seanlaw
Copy link
Contributor

seanlaw commented Apr 24, 2022

@NimaSarajpoor I think things look good.

@NimaSarajpoor
Copy link
Collaborator Author

@seanlaw
Great! Thanks for finding time in your busy schedule to check it out. So, I will continue working on the implementation of VALMOD.

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Apr 25, 2022

@seanlaw
please feel free to review.

  • Improved section 2: Lower Bound of Distance Profile
  • Added section 3: Core Idea, to briefly explain VALMOD
  • Added section 4: VALMOD Algorithm (implemented Algorithm 3 of the paper)
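To make the target concrete, the structure VALMOD computes efficiently can be reproduced naively by brute force: compute a z-normalized matrix profile for each length and keep, per start index, the best length-normalized match. This is my own illustration (all names are hypothetical), not the notebook's code; dividing by sqrt(m) follows the paper's length-normalized Euclidean distance.

```python
import numpy as np

def znorm_dist_profile(T, i, m):
    # z-normalized distances between T[i:i+m] and every length-m subsequence,
    # with an exclusion zone around i to avoid trivial self-matches
    n = len(T) - m + 1
    subs = np.lib.stride_tricks.sliding_window_view(T, m)
    Z = (subs - subs.mean(axis=1, keepdims=True)) / subs.std(axis=1, keepdims=True)
    d = np.linalg.norm(Z - Z[i], axis=1)
    excl = max(1, m // 2)
    d[max(0, i - excl):min(n, i + excl + 1)] = np.inf
    return d

def naive_valmp(T, m_min, m_max):
    # Brute-force variable-length matrix profile: for each start index,
    # the smallest length-normalized distance over lengths m_min..m_max
    n = len(T) - m_min + 1
    best = np.full(n, np.inf)
    best_len = np.zeros(n, dtype=int)
    for m in range(m_min, m_max + 1):
        for i in range(len(T) - m + 1):
            d = znorm_dist_profile(T, i, m).min() / np.sqrt(m)  # length-normalize
            if d < best[i]:
                best[i], best_len[i] = d, m
    return best, best_len

rng = np.random.default_rng(0)
T = rng.standard_normal(64)
valmp, lengths = naive_valmp(T, 8, 12)
```

This is O(n^2 * m) per length, so it is only useful as a correctness reference against the VALMOD implementation on small inputs.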

@NimaSarajpoor
Copy link
Collaborator Author

@seanlaw

We basically need to spend some time figuring out how to allow stumpy.stump, stumpy.stumped, and stumpy.gpu_stump to return top-k nearest neighbors.

So, should I now go and study stump/stumped/gpu_stump, and then try to change all of them to return top-k nearest neighbors?

@seanlaw
Copy link
Contributor

seanlaw commented Apr 27, 2022

@NimaSarajpoor Yes, I also added a new issue #592 where we can discuss it in more detail

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Jun 22, 2022

A couple of notes that are confirmed by the main author of VALMOD:

  • On page 13, in Algorithm 2 (update VALMP), line 3 should have been: if VALMP.normdistances[i] > lNormDist or ...
  • On page 16, Algorithm 4 (ComputeSubMP) can be further optimized by updating minDistABS in the for-loop (line 28)

(This is to make sure that we do not lose this information later)
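A minimal sketch of the corrected update rule for Algorithm 2, line 3, as I understand it (the code and names are my own illustration mirroring the paper's VALMP fields; the paper's second clause, elided as "or ..." above, is deliberately omitted here):

```python
def update_valmp(valmp, i, l_norm_dist, dist, nn_index, length):
    # Corrected condition from Algorithm 2, line 3: overwrite entry i only
    # when the candidate's length-normalized distance is smaller.
    # (The second clause, elided as "or ..." above, is omitted.)
    if valmp["normDistances"][i] > l_norm_dist:
        valmp["normDistances"][i] = l_norm_dist
        valmp["distances"][i] = dist
        valmp["indices"][i] = nn_index
        valmp["lengths"][i] = length

valmp = {"normDistances": [1.0], "distances": [5.0], "indices": [7], "lengths": [10]}
update_valmp(valmp, 0, 0.5, 4.0, 3, 12)  # smaller norm dist: entry updated
update_valmp(valmp, 0, 0.8, 9.0, 9, 14)  # larger norm dist: no change
```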

@seanlaw
Copy link
Contributor

seanlaw commented Jan 31, 2023

So, should I now go and study stump/stumped/gpu_stump? And, then try to change all of them to return top-k nearest neighbors?

We've come a long way @NimaSarajpoor! I wonder how easy/hard it would be to implement VALMOD now that we have top-k nearest neighbors?

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Feb 1, 2023

@seanlaw

We've come a long way

We have indeed!

I wonder how easy/hard it would be to implement VALMOD now that we have top-k nearest neighbors?

I took a quick look at the paper. I don't remember the details, but I think the first four algorithms are the core ones. The first two are easy. The third is already done (the top-k feature was added to STUMPY). In my opinion, the main remaining task is Algorithm 4, and I think its implementation should be straightforward.
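For reference, the selection step behind a top-k nearest-neighbor distance profile can be sketched with plain NumPy (the real STUMPY work happened in the top-k effort referenced above; this standalone helper with a hypothetical name only illustrates the selection idea):

```python
import numpy as np

def top_k_smallest(dist_profile, k):
    # Pick the k smallest distances (and their indices) in ascending order.
    dp = np.asarray(dist_profile, dtype=float)
    idx = np.argpartition(dp, k - 1)[:k]  # k smallest, in arbitrary order
    order = np.argsort(dp[idx])           # sort only those k entries
    idx = idx[order]
    return idx, dp[idx]

d = np.array([3.2, 0.5, 2.7, 0.9, 4.1])
idx, vals = top_k_smallest(d, 3)  # idx -> [1, 3, 2], vals -> [0.5, 0.9, 2.7]
```

Using argpartition keeps the selection O(n) plus an O(k log k) sort, instead of sorting the whole distance profile.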

@NimaSarajpoor
Copy link
Collaborator Author

NimaSarajpoor commented Mar 7, 2023

I think the algorithm presented in the paper has a flaw. (I sent an email to the main author and am waiting for his response.)

On page 14 of the paper, the following can be read:

Algorithm 2 shows the routine to update the VALMP structure. The final VALMP consists of four parts. The ith entry of the normDistances vector stores the smallest length-normalized Euclidean distance between the ith subsequence and its nearest neighbor, while the ith place of the vector distances stores their straight Euclidean distance. The location of each subsequence’s nearest neighbor is stored in the vector indices. The structure lengths contains the length of the ith subsequence pair.

Let's assume P denotes the variable-length matrix profile obtained by VALMOD. According to my investigation, P cannot be exact. Although min(P) is exact, P itself is not (see Algorithm 4, line 29).

It is also possible that the author is aware of this and accounted for it in the other algorithms in the paper, such as the ones for discovering motifs/discords. In other words, the paragraph above might just be poorly written.
