feat: Perform RAG in search instead of simple keywords #11891

Open
petros94 opened this issue Nov 19, 2024 · 0 comments

On my team, we often use the search field to find datasets with particular characteristics. Currently, this field performs keyword matching against column names or words in a dataset's description.

Since adoption of retrieval-augmented generation (RAG) is on the rise, I'd like to propose a new feature that uses natural language to search datasets:

  • All the information about each dataset (description, tags, owner, etc.) is transformed into embeddings and stored in a vector DB.
  • The user then types a complete question into the search bar, such as "What datasets do we have about X?" or "Who is the owner of dataset Y?", and the system runs a similarity search followed by LLM generation to answer it.
  • Since DataHub already defines an ontology, a more sophisticated graph RAG could answer questions involving relationships, such as "How many datasets do we have regarding Z?" or "Which dataset is the parent of W?".
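
To make the first two points concrete, here is a rough sketch of the retrieval step. It is only an illustration under loud assumptions: the metadata records and field names are made up (not DataHub's schema), `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `INDEX` stands in for a vector DB, and `build_prompt` only assembles the text that would be sent to an LLM (no model is called).

```python
import math
import re
from collections import Counter

# Hypothetical dataset metadata; names and fields are illustrative only.
DATASETS = [
    {"name": "sales_orders", "owner": "alice", "tags": ["sales", "finance"],
     "description": "Daily customer orders with revenue per region"},
    {"name": "web_clicks", "owner": "bob", "tags": ["marketing"],
     "description": "Clickstream events from the public website"},
    {"name": "hr_headcount", "owner": "carol", "tags": ["hr"],
     "description": "Monthly employee headcount by department"},
]


def to_text(ds):
    """Flatten one metadata record into a single string to embed."""
    return (f"{ds['name']} owner {ds['owner']} "
            f"tags {' '.join(ds['tags'])} {ds['description']}")


def embed(text):
    """Toy embedding (assumption): sparse term counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Stand-in for the vector DB: every dataset paired with its embedding.
INDEX = [(ds, embed(to_text(ds))) for ds in DATASETS]


def retrieve(question, k=2):
    """Return the k datasets whose metadata is most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [ds for ds, _ in ranked[:k]]


def build_prompt(question, k=2):
    """Assemble the retrieved context and the question into an LLM prompt."""
    context = "\n".join(f"- {to_text(ds)}" for ds in retrieve(question, k))
    return (f"Answer using only this dataset metadata:\n{context}\n\n"
            f"Question: {question}")


if __name__ == "__main__":
    print(build_prompt("Which datasets do we have about customer revenue?"))
```

In a real implementation the term counts would be replaced by dense vectors from an embedding model and the linear scan by a nearest-neighbour query, but the flow (embed metadata, embed question, rank by similarity, pass the top hits to the LLM) would stay the same.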

I think this feature would greatly improve user experience and productivity, provide a competitive advantage over other solutions (https://www.secoda.co/blog/transforming-data-discovery-using-secoda-ai), and open new possibilities for the platform as a whole.
