docs: add lms for data post #8222
forgot to wrap some lines -- want to get feedback, then finish all the prose/cleanup
## Natural language processing
This includes tasks like:
i think it's powerful to be able to say: extract the sentiment of each row, then do a join/groupby, all on the GPU (instead of doing sentiment on GPUs and the join/groupby on CPUs)
we're not there with Ibis just yet! soon...
docs/posts/lms-for-data/index.qmd
Outdated
We can think of three approaches to analytical code with language models:
1. Use LMs in an analytic subroutine
like an agent?
no, just what's shown above -- basically using LLMs in UDFs
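To make "LMs in UDFs" concrete, here is a minimal sketch in plain Python. It is not the code from the post: `fake_lm_sentiment` and the `rows` data are hypothetical, and the keyword matching only stands in for a real language-model call. In practice the function would be registered as a UDF with the query engine rather than called in a loop.

```python
from collections import defaultdict


def fake_lm_sentiment(text: str) -> str:
    """Hypothetical stand-in for an LM call; a real UDF would query a model here."""
    lowered = text.lower()
    if any(word in lowered for word in ("great", "love", "good")):
        return "positive"
    if any(word in lowered for word in ("bad", "hate", "broken")):
        return "negative"
    return "neutral"


# Made-up rows standing in for a table of product reviews.
rows = [
    {"product": "a", "review": "I love it"},
    {"product": "a", "review": "It arrived broken"},
    {"product": "b", "review": "Great value"},
    {"product": "b", "review": "It is a box"},
]

# Apply the "UDF" to each row, then group by product and count each sentiment.
counts = defaultdict(lambda: defaultdict(int))
for row in rows:
    counts[row["product"]][fake_lm_sentiment(row["review"])] += 1

result = {product: dict(by_sentiment) for product, by_sentiment in counts.items()}
print(result)
```

The point is only the shape of the computation: the LM is called once per row as an ordinary scalar function, and everything after that is a regular groupby.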
@cpcloud this should be good to merge!
lgtm.
- summarization
- translation
- question answering
are embedding or encoding good use cases in the future?
you could but I haven't really understood why these are useful (as opposed to storing text)
docs/posts/lms-for-data/index.qmd
Outdated
```
1. Import Ibis, the data engineering toolkit
2. Import Marvin, the AI engineering toolkit
Maybe some confusion here, "our team" means ibis. I thought something like langchain.
I believe our team already invested lots of time on the toolkit, I am not very sure how popular marvin is.
argh let me fix some of those last things before we merge
ready to merge
Description of changes
work in progress but code is finalized IMO
follow-up post for local "open source" LMs needed
covers:
Issues closed