Document vector similarity functions #913

Merged
JPryce-Aklundh merged 7 commits into neo4j:dev on Mar 11, 2024

Conversation

nilsceberg
Contributor

No description provided.

@nilsceberg nilsceberg added the 5.18 label Mar 5, 2024
Contributor

@gem-neo4j gem-neo4j left a comment

LGTM :)

modules/ROOT/pages/functions/vector.adoc (outdated)
| `vector.similarity.euclidean(a, NULL)` returns `NULL`.
| Both vectors must be of the same dimension.
| Both vectors must be {link-vector-indexes}#indexes-vector-similarity-euclidean[*valid*] with respect to Euclidean similarity.
| The implementation is exactly identical to that of the `latest` available vector index provider (currently `vector-2.0`).
Contributor

I think we avoid saying words like "currently" in the docs :)
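The argument rules quoted from vector.adoc in this thread (a `NULL` argument gives `NULL`, and both vectors must have the same dimension) can be illustrated with a small Python sketch. This is not Neo4j's implementation: the function name `euclidean_similarity` is hypothetical, the scoring formula `1 / (1 + d^2)` is an assumption taken from how the vector index documentation is generally understood to describe Euclidean similarity, and treating "valid" as "all components finite" is likewise an assumption.

```python
import math
from typing import Optional, Sequence


def euclidean_similarity(
    a: Optional[Sequence[float]], b: Optional[Sequence[float]]
) -> Optional[float]:
    """Hypothetical stand-in for vector.similarity.euclidean(a, b)."""
    # NULL propagation: a missing argument yields a missing result.
    if a is None or b is None:
        return None
    # Both vectors must be of the same dimension.
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    # ASSUMPTION: a vector is "valid" here if every component is finite.
    if not all(math.isfinite(x) for x in (*a, *b)):
        raise ValueError("vector components must be finite")
    # ASSUMPTION: the score is 1 / (1 + squared Euclidean distance),
    # so identical vectors score 1.0 and distant vectors approach 0.
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + squared_distance)
```

Under these assumptions, `euclidean_similarity([0.0, 0.0], [3.0, 4.0])` evaluates `1 / (1 + 25)`, and passing `None` for either argument returns `None` without raising.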

Contributor

@JPryce-Aklundh JPryce-Aklundh left a comment

Thanks @nilsceberg, looks good overall! I have added some release version information, and also have a few editorial suggestions. Let me know what you think :)

modules/ROOT/pages/functions/index.adoc (outdated)
modules/ROOT/pages/functions/vector.adoc
modules/ROOT/pages/functions/vector.adoc (outdated)
These vector similarity functions are identical to those used by Neo4j's {link-vector-indexes}[vector search indexes].

Consider for example _k_-nearest neighbor queries, which return the _k_ entities with highest similarity scores based on comparing their associated vectors with a query vector.
Such queries can be run against vector indexes to quickly get an approximate result.
Contributor

I'm not sure it is clear what "approximate result" means here?

Suggested change
- Such queries can be run against vector indexes to quickly get an approximate result.
+ Such queries can be run against vector indexes to return an approximate similarity score.

Contributor Author

That's fair. I wanted to avoid restating too much from the vector index documentation, but I'll see if I can clarify this bit.

It's not the score that's approximate; it's whether or not the true nearest neighbours are included in the result set.

Contributor

I think we should not assume that people have read that page, so a bit more might be necessary here (maybe something like "return an approximate result indicating whether or not the true nearest neighbors are included in the results").

modules/ROOT/pages/functions/vector.adoc (outdated)
Consider for example _k_-nearest neighbor queries, which return the _k_ entities with highest similarity scores based on comparing their associated vectors with a query vector.
Such queries can be run against vector indexes to quickly get an approximate result.
However, they can also be expressed as an exhaustive search using similarity functions directly.
While this is typically significantly slower than using an index, it is exact and does not require an existing index.
Contributor

Suggested change
- While this is typically significantly slower than using an index, it is exact and does not require an existing index.
+ While vector functions are typically slower than using an index, they are exact and do not require an existing index.
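
The distinction discussed in this thread (an index gives a fast but approximate neighbor set, while calling the similarity function on every candidate is slower but exact) can be sketched in Python. This is an illustration only, not how Neo4j evaluates such a query: `exact_knn` and `euclidean_similarity` are hypothetical names, and the scoring formula `1 / (1 + d^2)` is an assumption.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def euclidean_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    # ASSUMPTION: score = 1 / (1 + squared Euclidean distance).
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(
    query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Exhaustive k-nearest-neighbor search: score every stored vector."""
    # Every candidate is compared with the query, so no true neighbor
    # can be missed; the cost grows linearly with the number of vectors.
    scored = ((euclidean_similarity(query, v), key) for key, v in vectors.items())
    # Keep the k highest scores, best first.
    return [(key, score) for score, key in heapq.nlargest(k, scored)]


if __name__ == "__main__":
    # Hypothetical data: three entities with two-dimensional vectors.
    data = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
    for key, score in exact_knn([0.9, 0.0], data, k=2):
        print(key, round(score, 4))
```

In Cypher, the equivalent shape would be to compute the similarity for every matched node, order by the score in descending order, and limit the result to _k_ rows; no index has to exist for that to work.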

modules/ROOT/pages/functions/vector.adoc (outdated)
modules/ROOT/pages/functions/index.adoc (outdated)
modules/ROOT/pages/functions/vector.adoc (outdated)
modules/ROOT/pages/functions/vector.adoc (outdated)
Co-authored-by: Jens Pryce-Åklundh <[email protected]>
@nilsceberg
Contributor Author

Thanks @JPryce-Aklundh, great suggestions - I'll revisit the part you had questions on.

@nilsceberg
Contributor Author

@JPryce-Aklundh What do you think about something like this?

@JPryce-Aklundh
Contributor

Great @nilsceberg - thanks! We just need to add the page-role indicating 5.18 on the full vector functions page. Are you OK with me fixing that?

@nilsceberg
Contributor Author

@JPryce-Aklundh Please do, thanks! :)

@JPryce-Aklundh
Contributor

You had already added it @nilsceberg :) so I will merge this 👍

@neo-technology-commit-status-publisher
Collaborator

Thanks for the documentation updates.

The preview documentation has now been torn down - reopening this PR will republish it.

@JPryce-Aklundh JPryce-Aklundh merged commit 6cb60fe into neo4j:dev Mar 11, 2024
5 checks passed
4 participants