Commit

Update notebook
davidmezzetti committed Nov 18, 2024
1 parent 439b9d7 commit 9444fcc
Showing 1 changed file with 51 additions and 1 deletion.
52 changes: 51 additions & 1 deletion examples/67_Whats_new_in_txtai_8_0.ipynb
@@ -15,6 +15,8 @@
"Agents automatically create workflows to answer multi-faceted user requests. Agents iteratively prompt and/or interface with tools to\n",
"step through a process and ultimately come to an answer for a request.\n",
"\n",
"This release also adds support for Model2Vec vectorization.\n",
"\n",
"**Standard upgrade disclaimer below**\n",
"\n",
"While everything is backwards compatible, it's prudent to back up production indexes before upgrading and test before deploying."
@@ -40,7 +42,7 @@
"outputs": [],
"source": [
"%%capture\n",
- "!pip install git+https://github.com/neuml/txtai autoawq"
+ "!pip install git+https://github.com/neuml/txtai autoawq model2vec"
]
},
{
@@ -462,6 +464,54 @@
"💥 Look at that! A full API service from a simple configuration file. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Vectorization with Model2Vec\n",
"\n",
"While the agent framework is the headline change, there is another major update: support for Model2Vec models.\n",
"\n",
"[Model2Vec](https://github.com/MinishLab/model2vec) is a technique to turn any sentence transformer into a small static model, reducing model size by 15x and making the models up to 500x faster, with a small drop in accuracy."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Canada's last fully intact ice shelf has suddenly collapsed, forming a Manhattan-sized iceberg\""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from txtai import Embeddings\n",
"\n",
"# Data to index\n",
"data = [\n",
" \"US tops 5 million confirmed virus cases\",\n",
" \"Canada's last fully intact ice shelf has suddenly collapsed, forming a Manhattan-sized iceberg\",\n",
" \"Beijing mobilises invasion craft along coast as Taiwan tensions escalate\",\n",
" \"The National Park Service warns against sacrificing slower friends in a bear attack\",\n",
" \"Maine man wins $1M from $25 lottery ticket\",\n",
" \"Make huge profits without work, earn up to $100,000 a day\"\n",
"]\n",
"\n",
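"# Create an embeddings index backed by a Model2Vec static model\n",
"embeddings = Embeddings(method=\"model2vec\", path=\"minishlab/M2V_base_output\")\n",
"embeddings.index(data)\n",
"\n",
"# Search returns (id, score) tuples; take the id of the top match and show its text\n",
"uid = embeddings.search(\"climate change\")[0][0]\n",
"data[uid]"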
]
},
{
"cell_type": "markdown",
"metadata": {
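Note on the Model2Vec cell added in this commit: the snippet below is a rough, standalone sketch of why a static model is fast, not txtai's or Model2Vec's actual implementation. It assumes a static model boils down to a token-to-vector lookup table with mean pooling, followed by cosine similarity for search. The vocabulary, the 3-dimensional vectors, the shortened documents, and the helper names (`embed`, `cosine`, `search`) are all invented for illustration.

```python
import math

# Hypothetical toy lookup table: token -> fixed vector (values invented for illustration)
TOKEN_VECTORS = {
    "ice": [0.9, 0.1, 0.0],
    "shelf": [0.8, 0.2, 0.0],
    "iceberg": [0.9, 0.0, 0.1],
    "climate": [0.85, 0.15, 0.0],
    "change": [0.6, 0.3, 0.1],
    "lottery": [0.0, 0.1, 0.9],
    "ticket": [0.1, 0.0, 0.8],
    "wins": [0.1, 0.2, 0.7],
}
DIMENSIONS = 3


def embed(text):
    """Mean-pool the static vectors of known tokens; unknown tokens are skipped."""
    vectors = [TOKEN_VECTORS[t] for t in text.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0] * DIMENSIONS
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def search(query, documents):
    """Return (index, score) pairs sorted by descending similarity to the query."""
    query_vector = embed(query)
    scores = [(i, cosine(query_vector, embed(d))) for i, d in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


documents = [
    "ice shelf collapsed forming iceberg",
    "man wins lottery ticket",
]
results = search("climate change", documents)
best = documents[results[0][0]]
print(best)
```

Because embedding here is only a dictionary lookup and an average, there is no transformer forward pass at query time, which is the intuition behind the speed-up described in the notebook text. The `(index, score)` result shape mirrors how the notebook cell reads `search(...)[0][0]`.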