
AI search for all the best resources in AI – powered by Ben's Bites 💯


ramzimalhas/bens-bites-ai-search

 
 


Ben's Bites

Ben's Bites Link Search

Search across all of the AI-related links in the Ben's Bites newsletter – using AI-powered semantic search.


Intro

The goal of this app is to provide a highly curated search for staying up-to-date with the latest AI resources and news.

All search results are extracted from the Ben's Bites AI Newsletter, which serves as the curated data source.

How it works

A cron job runs every 24 hours to update the database.

The steps involved include:

  1. Crawling the source Beehiiv newsletter
  2. Converting each post to markdown
  3. Extracting and resolving unique links
  4. Fetching opengraph metadata for each link
  5. Fetching provider-specific metadata for some links (e.g. tweet text)
  6. Generating vector embeddings for each link using OpenAI
  7. Upserting all links into a Pinecone vector database
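To make step 3 more concrete, here is a minimal sketch of pulling unique links out of a post's markdown. The helper names (`normalizeUrl`, `extractUniqueLinks`) and the choice to drop `utm_*` parameters are assumptions made for illustration, not the project's actual code, and the sketch does not follow redirects.

```typescript
// Hypothetical helpers for illustration; not taken from the project's source.

/** Normalizes a URL so near-identical copies collapse to one entry. */
function normalizeUrl(raw: string): string | null {
  try {
    const url = new URL(raw)
    url.hash = ''
    // Assumption: utm_* query parameters are tracking noise and safe to drop.
    for (const key of Array.from(url.searchParams.keys())) {
      if (key.startsWith('utm_')) url.searchParams.delete(key)
    }
    return url.toString()
  } catch {
    return null // not a valid absolute URL
  }
}

/** Extracts unique, normalized http(s) links from a markdown string. */
function extractUniqueLinks(markdown: string): string[] {
  // Matches inline markdown links of the form [text](http...)
  const linkPattern = /\[[^\]]*\]\((https?:\/\/[^\s)]+)\)/g
  const seen = new Set<string>()
  let match: RegExpExecArray | null
  while ((match = linkPattern.exec(markdown)) !== null) {
    const url = normalizeUrl(match[1])
    if (url) seen.add(url)
  }
  return Array.from(seen)
}
```

A `Set` keeps insertion order, so links come back in the order they first appear in the post.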

We're using IFramely to extract opengraph metadata for each link, and we also special-case tweet links to extract the tweet text.
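Special-casing tweets first requires recognizing which links are tweets. The sketch below shows one way to do that check; `getTweetId` is a hypothetical helper, and the set of accepted hostnames (including `x.com`) is an assumption rather than something the project documents.

```typescript
// Hypothetical helper for illustration; not taken from the project's source.

/** Returns the numeric status ID if the link points at a tweet, else null. */
function getTweetId(link: string): string | null {
  let url: URL
  try {
    url = new URL(link)
  } catch {
    return null // not a valid absolute URL
  }

  // Assumption: treat www./mobile. subdomains and x.com as tweet hosts.
  const host = url.hostname.replace(/^(www\.|mobile\.)/, '')
  if (host !== 'twitter.com' && host !== 'x.com') return null

  // Tweet paths look like /<user>/status/<id>
  const match = url.pathname.match(/^\/[^/]+\/status\/(\d+)/)
  return match ? match[1] : null
}
```

Links that return an ID could then be routed to a tweet-specific fetcher, while everything else goes through the generic opengraph path.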

Once we have all of the links locally, we upsert them into a Pinecone vector database for semantic search.
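Upserts to a hosted vector database are typically sent in batches rather than one request per link. The following is a rough sketch of that batching, with the database call injected as a function so no particular client API is assumed. The `LinkVector` shape, `upsertAll`, and the default batch size of 100 are all illustrative assumptions.

```typescript
// Hypothetical types and helpers for illustration; not the project's code.

interface LinkVector {
  id: string
  values: number[] // the embedding for this link
  metadata: { url: string; title?: string }
}

// Stand-in for whatever call actually writes a batch to the vector database.
type UpsertBatch = (vectors: LinkVector[]) => Promise<void>

/** Splits an array into consecutive chunks of at most `size` items. */
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('size must be at least 1')
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

/** Upserts all vectors batch by batch and returns how many were sent. */
async function upsertAll(
  vectors: LinkVector[],
  upsertBatch: UpsertBatch,
  batchSize = 100 // assumed default; tune to the database's request limits
): Promise<number> {
  let count = 0
  for (const batch of chunk(vectors, batchSize)) {
    await upsertBatch(batch)
    count += batch.length
  }
  return count
}
```

Because upserts are keyed by `id`, re-running the job on links that already exist overwrites them instead of creating duplicates, which suits a daily refresh.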

Semantic Search

Semantic search is powered by OpenAI's `text-embedding-ada-002` embedding model and Pinecone's hosted vector database.
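Conceptually, the vector database ranks stored link embeddings by their similarity to the embedding of the search query. The in-memory sketch below illustrates that idea using cosine similarity. It is a stand-in for what the hosted database does at scale, the metric is an assumption (the actual one depends on how the index is configured), and `cosineSimilarity`, `topK`, and `IndexedLink` are hypothetical names.

```typescript
// Hypothetical in-memory illustration; the real search runs in the vector database.

interface IndexedLink {
  url: string
  embedding: number[]
}

/** Cosine similarity between two vectors of equal length. */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0 // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

/** Returns the k links whose embeddings are most similar to the query. */
function topK(query: number[], links: IndexedLink[], k: number) {
  return links
    .map((link) => ({
      url: link.url,
      score: cosineSimilarity(query, link.embedding)
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
}
```

In practice the query text would first be embedded with the same model as the links, so that both vectors live in the same space.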

TODO

  • better search UX so back button works
  • show the number of posts / links on the home page so it's clear when it was last updated
  • actually sort by recency instead of faking it
  • set up cron to update the DB daily
  • test on safari/firefox
  • display which newsletter the post first appeared in
  • explore hybrid search
  • infinite scroll so you can keep scrolling results

License

MIT © Travis Fischer

All link data is extracted from Ben's Bites AI Newsletter and is licensed under CC BY-NC-ND 4.0.

If you found this project interesting, please consider sponsoring me or following me on Twitter.


Languages

  • TypeScript 85.7%
  • CSS 13.1%
  • JavaScript 1.2%