[REVIEW] Deterministic UMAP with floating point rounding. #3848

Conversation

@trivialfis trivialfis (Member) commented May 11, 2021

Use floating-point rounding to make the UMAP optimization deterministic. This is a breaking change, as the batch size parameter is removed.

  • Add a procedure for rounding the gradient updates (see the sketch below).
  • Add a buffer for the gradient updates.
  • Add an internal parameter `deterministic`, which should be set to `true` when `random_state` is set.

The test file is removed due to #3849.
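The rounding procedure follows the standard trick for making floating-point accumulation order-independent: pre-round every addend to a fixed absolute precision, chosen from a bound on the total sum, so that every partial sum is exact in float32. Below is a minimal NumPy sketch of the idea; `rounding_factor` and `truncate` are illustrative names, not this PR's actual API.

import math
import numpy as np

def rounding_factor(max_abs, n):
    # Smallest power of two bounding the worst-case absolute sum of n addends.
    return np.float32(2.0 ** math.ceil(math.log2(max_abs * n)))

def truncate(x, factor):
    # (factor + x) - factor snaps x onto a fixed grid whose spacing is set by
    # factor's exponent; float32 sums of such grid values are exact.
    return (factor + x.astype(np.float32)) - factor

rng = np.random.default_rng(0)
grads = rng.standard_normal(100_000).astype(np.float32)  # stand-in gradient updates
factor = rounding_factor(float(np.abs(grads).max()), grads.size)
rounded = truncate(grads, factor)

def serial_sum(values):
    s = np.float32(0.0)
    for v in values:
        s = np.float32(s + v)
    return s

perm = rng.permutation(grads.size)
assert serial_sum(rounded) == serial_sum(rounded[perm])  # bit-identical in any order
print(serial_sum(grads) == serial_sum(grads[perm]))      # usually False for raw float32

Because the pre-rounded updates sum exactly, concurrent atomic additions into the gradient buffer can produce bit-identical results regardless of scheduling order, at the cost of a small, bounded loss of precision per update.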

@trivialfis trivialfis changed the title [WIP] Deterministic UMAP with floating point roundng. [WIP] Deterministic UMAP with floating point rounding. May 11, 2021
@github-actions github-actions bot added CUDA/C++ Cython / Python Cython or Python issue labels May 11, 2021
@trivialfis trivialfis added breaking Breaking change feature request New feature or request labels May 11, 2021
@trivialfis trivialfis (Member, Author) commented May 11, 2021

Related:

I will look into them once the issue in optimize_layout is resolved.

@trivialfis trivialfis requested a review from cjnolet May 11, 2021 13:38
@cjnolet cjnolet (Member) commented May 11, 2021

@trivialfis, I compiled and executed both of your approaches against the current approach to reproducibility. So far, this approach does look faster, and I've verified that it appears to be reproducible. For completeness it would still be nice to do some profiling and isolate where the bottlenecks are in the warp-reduction approach, but this approach has the benefit of requiring fewer changes to the code.

Here are some preliminary results from a very informal benchmark on a V100 (using defaults). Notice that your truncation approach comes in at about the same timing as the non-reproducible path in the current UMAP implementation.

>>> def do_it():
...   import time
...   s = time.time()
...   m = UMAP().fit(X)
...   print("TooK %ss" % (time.time() - s))

Current UMAP (non-reproducible)
>>> X, y = make_blobs(100000, 256)
>>> do_it()
TooK 0.7839245796203613s
>>> do_it()
TooK 0.790290355682373s

Current UMAP (reproducible)
>>> X, y = make_blobs(100000, 256)
>>> do_it()
TooK 1.065580129623413s
>>> do_it()

Warp-level reductions:
>>> X, y = make_blobs(100000, 128)
>>> do_it()
TooK 1.012941837310791s
>>> do_it()
TooK 0.9855947494506836s
>>> X, y = make_blobs(100000, 256)
>>> do_it()
TooK 1.2152369022369385s

Truncation: 
>>> X, y = make_blobs(100000, 256)
>>> do_it()
TooK 1.2795426845550537s
>>> do_it()
TooK 0.7870500087738037s
>>> do_it()
TooK 0.7900457382202148s
>>> do_it()
TooK 0.7811644077301025s
>>> def do_it():
...   import time
...   s = time.time()
...   m = UMAP(random_state=42).fit(X)
...   print("TooK %ss" % (time.time() - s))
...   print(m.embedding_)
... 
>>> do_it()
TooK 0.9100887775421143s
[[-8.367687  -3.012333 ]
 [-9.082779  -4.595929 ]
 [-0.6999092  1.0279074]
 ...
 [ 8.140747   3.4932442]
 [ 8.890211   3.0183897]
 [ 8.434332   2.4182014]]
>>> do_it()
TooK 0.7889752388000488s
[[-8.367687  -3.012333 ]
 [-9.082779  -4.595929 ]
 [-0.6999092  1.0279074]
 ...
 [ 8.140747   3.4932442]
 [ 8.890211   3.0183897]
 [ 8.434332   2.4182014]]
>>> do_it()
TooK 0.7994036674499512s
[[-8.367687  -3.012333 ]
 [-9.082779  -4.595929 ]
 [-0.6999092  1.0279074]
 ...
 [ 8.140747   3.4932442]
 [ 8.890211   3.0183897]
 [ 8.434332   2.4182014]]
>>> do_it()
TooK 0.8132762908935547s
[[-8.367687  -3.012333 ]
 [-9.082779  -4.595929 ]
 [-0.6999092  1.0279074]
 ...
 [ 8.140747   3.4932442]
 [ 8.890211   3.0183897]
 [ 8.434332   2.4182014]]

@cjnolet cjnolet (Member) left a comment

Just providing some initial feedback. I'll go through another round when you're ready. So far I'm excited by the timings I'm seeing for both approaches.

Review threads (resolved):
  • cpp/src/umap/simpl_set_embed/optimize_batch_kernel.cuh (outdated)
  • cpp/src/umap/simpl_set_embed/algo.cuh (outdated; 2 threads)
  • cpp/test/sg/umap_parametrizable_test.cu
  • python/cuml/test/test_umap.py (outdated)
@trivialfis trivialfis marked this pull request as ready for review May 12, 2021 07:44
@trivialfis trivialfis requested review from a team as code owners May 12, 2021 07:44
@trivialfis trivialfis changed the title [WIP] Deterministic UMAP with floating point rounding. [REVIEW] Deterministic UMAP with floating point rounding. May 12, 2021
@cjnolet cjnolet added the 4 - Waiting on Author Waiting for author to respond to review label May 12, 2021
@trivialfis trivialfis (Member, Author) commented May 13, 2021

It seems the mnmg test is flaky.

Update: NVM, fixed.

@trivialfis trivialfis removed the 4 - Waiting on Author Waiting for author to respond to review label May 13, 2021
@dantegd dantegd added the 4 - Waiting on Reviewer Waiting for reviewer to review or respond label May 13, 2021
@mdemoret-nv mdemoret-nv linked an issue May 13, 2021 that may be closed by this pull request: [BUG] UMAP test is not built.
@cjnolet cjnolet (Member) left a comment

Finished the last review round. Have you had a chance to run these changes on larger datasets such as Fashion-MNIST or the Google News embeddings? It would help to test against a few more datasets just to verify there aren't any violated assumptions in the rounding (a minimal version of such a check is sketched below). Otherwise, these changes are looking great.

Review threads (resolved):
  • cpp/src/umap/simpl_set_embed/algo.cuh (outdated)
  • cpp/test/sg/umap_parametrizable_test.cu (outdated)
  • python/cuml/manifold/umap.pyx
  • cpp/src/umap/simpl_set_embed/optimize_batch_kernel.cuh (outdated)
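A minimal sketch of such a check, assuming Fashion-MNIST is fetched via scikit-learn's `fetch_openml` (the loading path is illustrative; any sufficiently large dataset works):

import numpy as np
from sklearn.datasets import fetch_openml
from cuml.manifold import UMAP

# Fashion-MNIST: 70,000 x 784, large enough to stress the rounding bound.
X, _ = fetch_openml("Fashion-MNIST", version=1, return_X_y=True, as_frame=False)
X = X.astype(np.float32)

# Setting random_state selects the deterministic code path; repeated fits
# should then produce bit-identical embeddings.
e1 = UMAP(random_state=42).fit_transform(X)
e2 = UMAP(random_state=42).fit_transform(X)
assert np.array_equal(e1, e2)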
@trivialfis trivialfis (Member, Author) commented May 14, 2021

[Figure: Fashion-MNIST embeddings compared, branch-0.20 vs. this PR]

@trivialfis trivialfis force-pushed the fea-deterministic-umap-truncation-max branch 2 times, most recently from b850a33 to bc26e07 on May 14, 2021 17:07
@codecov-commenter commented

Codecov Report

Merging #3848 (bc26e07) into branch-0.20 (46174b7) will decrease coverage by 8.60%.
The diff coverage is 52.66%.


@@               Coverage Diff               @@
##           branch-0.20    #3848      +/-   ##
===============================================
- Coverage        85.96%   77.35%   -8.61%     
===============================================
  Files              225      214      -11     
  Lines            16986    16552     -434     
===============================================
- Hits             14602    12804    -1798     
- Misses            2384     3748    +1364     
Flag       Coverage Δ
dask       ?
non-dask   77.35% <52.66%> (-0.46%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
python/cuml/benchmark/nvtx_benchmark.py 0.00% <0.00%> (ø)
python/cuml/common/memory_utils.py 76.82% <ø> (-1.93%) ⬇️
python/cuml/dask/common/dask_arr_utils.py 27.77% <0.00%> (-68.00%) ⬇️
python/cuml/dask/common/utils.py 28.15% <0.00%> (-15.54%) ⬇️
python/cuml/dask/ensemble/base.py 19.55% <0.00%> (-64.36%) ⬇️
python/cuml/ensemble/randomforestclassifier.pyx 83.61% <ø> (ø)
python/cuml/linear_model/logistic_regression.pyx 89.21% <ø> (ø)
python/cuml/neighbors/nearest_neighbors.pyx 93.11% <ø> (-0.03%) ⬇️
python/cuml/common/base.pyx 74.10% <29.41%> (-6.23%) ⬇️
python/cuml/model_selection/_split.py 88.99% <75.67%> (-1.87%) ⬇️
... and 100 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4d06991...bc26e07.

@trivialfis trivialfis force-pushed the fea-deterministic-umap-truncation-max branch from bc26e07 to 3945822 on May 17, 2021 17:00
@trivialfis trivialfis (Member, Author) commented May 17, 2021

Rebased onto branch-21.06. Not sure why conda is failing.

@trivialfis trivialfis (Member, Author) commented

rerun tests

@trivialfis trivialfis changed the title [REVIEW] Deterministic UMAP with floating point rounding. [REVIEW] Deterministic UMAP with floating point rounding May 17, 2021
@trivialfis trivialfis changed the title [REVIEW] Deterministic UMAP with floating point rounding [REVIEW] Deterministic UMAP with floating point rounding. May 17, 2021
@trivialfis trivialfis requested a review from cjnolet May 18, 2021 00:52
@trivialfis trivialfis (Member, Author) commented

rerun tests

@cjnolet cjnolet (Member) left a comment

LGTM. The evaluations on the datasets I've seen look great.

@cjnolet cjnolet (Member) commented May 20, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 99a80c8 into rapidsai:branch-21.06 May 20, 2021
@trivialfis trivialfis deleted the fea-deterministic-umap-truncation-max branch May 20, 2021 18:53
@trivialfis trivialfis (Member, Author) commented
@cjnolet Thanks for all the advice! Learned a lot during this.

rapids-bot bot pushed a commit that referenced this pull request May 13, 2022
Closes #4725

#3848 removes the usage of `optim_batch_size` in code. This PR removes the parameter from the docstring and from `UMAPParams`.

Authors:
  - Thomas J. Fan (https://github.com/thomasjpfan)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #4732