
Adding tiny amount of noise to PCA/spectral init to prevent points from overlapping #224

Closed
dkobak opened this issue Jan 10, 2023 · 0 comments · Fixed by #225
Labels
bug Something isn't working

Comments


dkobak commented Jan 10, 2023

A student encountered an unusual situation with spectral initialization: due to the specifics of the dataset, some points overlapped EXACTLY in the initialization. These points then stayed "stuck" to each other forever: they felt the same repulsive force from all other points, and the repulsion between two points at the same location is zero because it is proportional to the difference of their positions. So they could never separate, even though there should actually be repulsion between them. Adding a tiny amount of random noise to the initialization solved this problem and made the points spread out over the embedding, as expected.
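To illustrate why coincident points cannot separate, here is a minimal sketch that assumes the standard t-SNE repulsive term with a Student-t kernel. The function name `repulsive_forces` and the toy coordinates are made up for illustration and are not openTSNE code.

```python
import numpy as np

def repulsive_forces(y):
    """Repulsive part of the standard t-SNE gradient (hypothetical helper)."""
    diff = y[:, None, :] - y[None, :, :]            # pairwise y_i - y_j, shape (n, n, d)
    w = 1.0 / (1.0 + np.sum(diff ** 2, axis=-1))    # Student-t kernel
    np.fill_diagonal(w, 0.0)                        # no self-interaction
    q = w / w.sum()                                 # normalized similarities
    # Each pairwise term is proportional to (y_i - y_j), so it vanishes
    # when two points sit at exactly the same location.
    return 4.0 * np.sum((q * w)[:, :, None] * diff, axis=1)

# Toy embedding: the first two points overlap exactly.
y = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.5], [-0.5, 2.0]])
forces = repulsive_forces(y)
same_without_noise = np.allclose(forces[0], forces[1])

# With a tiny jitter the two points receive different forces.
rng = np.random.default_rng(0)
forces_jittered = repulsive_forces(y + rng.normal(scale=1e-5, size=y.shape))
same_with_noise = np.array_equal(forces_jittered[0], forces_jittered[1])
```

In this sketch the two overlapping points receive identical forces, so every update moves them together. After the jitter their forces differ and they can drift apart.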

This reminded me of another issue that we discussed a while ago, #180 (still open), where points exactly overlapping in the initialization were causing some problems.

My suggestion is to always add a tiny amount of noise to all initializations that we compute. Specifically, in the rescale() function here https://github.com/pavlin-policar/openTSNE/blob/master/openTSNE/initialization.py#L9 I would replace

x /= np.std(x[:, 0]) * 10000

with

x /= np.std(x[:, 0]) * 10000
x += np.random.randn(*x.shape) * 1e-5

This would affect PCA and spectral init, but would not affect a user-provided init.
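As a self-contained sketch of the proposed change (not the actual openTSNE implementation): the function name `rescale_with_jitter`, its parameters, and the seeded generator below are assumptions for illustration. Only the two arithmetic steps come from the suggestion above.

```python
import numpy as np

def rescale_with_jitter(x, target_std=1e-4, noise_std=1e-5, seed=0):
    """Rescale an initialization and add tiny Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)   # seeded here only to make the sketch reproducible
    x = np.array(x, dtype=float)        # work on a copy of the input
    # Same as `x /= np.std(x[:, 0]) * 10000` when target_std is 1e-4.
    x /= np.std(x[:, 0]) / target_std
    # Tiny noise so that exactly overlapping points become distinct.
    x += rng.standard_normal(x.shape) * noise_std
    return x

# Toy initialization in which the first two rows are exact duplicates.
init = np.array([[1.0, 2.0], [1.0, 2.0], [3.0, -1.0], [-2.0, 0.5]])
out = rescale_with_jitter(init)
duplicates_separated = not np.array_equal(out[0], out[1])
```

The noise scale is one tenth of the target standard deviation of the first column, so the overall layout of the initialization is barely changed while duplicates are pulled apart.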

pavlin-policar added the bug label on Feb 2, 2023