Improve RetrieveChat #6

Merged: 11 commits merged into main from improve_retrieve on Sep 27, 2023

thinkall (Collaborator) commented Sep 22, 2023

Why are these changes needed?

  • Upsert in batches of 40000 to improve chromadb upsert stability and avoid errors on very large corpora.
  • Improve the Update Context logic and support customized trigger words via customized_answer_prefix. If customized_answer_prefix is given and does not appear in the answer, Update Context is also triggered.
  • Add a parameter no_update_context to disable Update Context, which is useful for ablation experiments.
  • Add an example of customizing the prompt, trigger word, and few-shot learning to the RetrieveChat notebook.
  • Update tests.
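
The batched upsert from the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `iter_batches`, `upsert_in_batches`, and `RecordingCollection` are hypothetical names, and the only assumption about chromadb is that a collection exposes `upsert(ids=..., documents=...)`.

```python
from typing import Iterator, List, Sequence, Tuple


def iter_batches(
    ids: Sequence[str], documents: Sequence[str], batch_size: int = 40000
) -> Iterator[Tuple[Sequence[str], Sequence[str]]]:
    """Yield aligned (ids, documents) slices of at most batch_size items."""
    if len(ids) != len(documents):
        raise ValueError("ids and documents must have the same length")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(ids), batch_size):
        end = start + batch_size
        yield ids[start:end], documents[start:end]


def upsert_in_batches(collection, ids, documents, batch_size: int = 40000) -> int:
    """Call collection.upsert once per batch and return the number of calls.

    `collection` is assumed to behave like a chromadb collection, i.e. to
    accept upsert(ids=..., documents=...).
    """
    calls = 0
    for batch_ids, batch_docs in iter_batches(ids, documents, batch_size):
        collection.upsert(ids=list(batch_ids), documents=list(batch_docs))
        calls += 1
    return calls


class RecordingCollection:
    """Hypothetical in-memory stand-in used only to demonstrate the batching."""

    def __init__(self) -> None:
        self.calls: List[Tuple[List[str], List[str]]] = []

    def upsert(self, ids: List[str], documents: List[str]) -> None:
        self.calls.append((ids, documents))


if __name__ == "__main__":
    demo = RecordingCollection()
    doc_ids = [f"doc_{i}" for i in range(5)]
    texts = [f"text {i}" for i in range(5)]
    print(upsert_in_batches(demo, doc_ids, texts, batch_size=2))
```

Keeping each call under a fixed size bounds the payload of a single upsert, which is the stability concern the bullet describes.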

Related issue number

Checks

codecov-commenter commented Sep 22, 2023

Codecov Report

Merging #6 (596ff65) into main (6f54631) will increase coverage by 0.94%.
Report is 11 commits behind head on main.
The diff coverage is 20.96%.

@@            Coverage Diff             @@
##             main       #6      +/-   ##
==========================================
+ Coverage   34.22%   35.16%   +0.94%     
==========================================
  Files          17       17              
  Lines        1911     1965      +54     
  Branches      416      465      +49     
==========================================
+ Hits          654      691      +37     
- Misses       1207     1223      +16     
- Partials       50       51       +1     
| Flag | Coverage Δ |
|---|---|
| unittests | 35.11% <20.96%> (+0.94%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files | Coverage Δ |
|---|---|
| autogen/code_utils.py | 45.53% <92.30%> (+2.19%) ⬆️ |
| autogen/retrieve_utils.py | 14.78% <0.00%> (+14.78%) ⬆️ |
| ...gen/agentchat/contrib/retrieve_user_proxy_agent.py | 4.46% <2.17%> (+3.74%) ⬆️ |

sonichi (Contributor) commented Sep 22, 2023

This PR decreases coverage. Can we avoid that?

thinkall (Collaborator, Author) commented

> This PR decreases coverage. Can we avoid that?

I see test/agentchat/test_retrievechat.py is skipped. Are there any concerns about these tests?

sonichi (Contributor) commented Sep 22, 2023

> > This PR decreases coverage. Can we avoid that?
>
> I see test/agentchat/test_retrievechat.py is skipped. Are there any concerns about these tests?

Some tests are skipped because chromadb is not installed in [test]. We can add it.
The test that depends on OpenAI should check whether openai is installed and skip if not, like other tests.
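
A minimal sketch of the skip-if-not-installed pattern described above, written with the standard library's `unittest` so it runs without extra dependencies. The helper `is_installed` and the test class are hypothetical and not taken from the repository; with pytest, the `pytest.mark.skipif` marker serves the same purpose.

```python
import importlib.util
import unittest


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical guard: the whole class is skipped unless both optional
# dependencies are available in the current environment.
DEPS_AVAILABLE = is_installed("openai") and is_installed("chromadb")


@unittest.skipUnless(DEPS_AVAILABLE, "openai or chromadb is not installed")
class RetrieveChatDependencyTest(unittest.TestCase):
    def test_dependencies_are_discoverable(self) -> None:
        # Runs only when the guard above passed.
        self.assertTrue(is_installed("openai"))
        self.assertTrue(is_installed("chromadb"))


if __name__ == "__main__":
    unittest.main()
```

Checking with `find_spec` avoids importing the package just to decide whether to skip, so a missing dependency is reported as a skip rather than as a collection error.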

thinkall (Collaborator, Author) commented

> One last change we need: bump the version of autogen and the version number in the notebook. Then I'll need to make a new release. Otherwise the notebook will not work.

Updated to 0.1.2.

pcdeadeasy (Contributor) left a comment

I saw a chunk of code that was commented out in generate_init_message.
Also consider making _generate_retrieve_user_reply more readable.

@thinkall thinkall temporarily deployed to openai September 25, 2023 14:32 — with GitHub Actions Inactive
@@ -230,6 +230,11 @@ def _generate_retrieve_user_reply(
        sender: Optional[Agent] = None,
        config: Optional[Any] = None,
    ) -> Tuple[bool, Union[str, Dict, None]]:
        """In this function, we will update the context and reset the conversation based on different conditions.
        We'll update the context and reset the conversation if no_update_context is False and either of the following:
Contributor

It'll be easier to think of "update_context is True" than "no_update_context is False".
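
The trigger condition discussed in this thread, combined with the customized_answer_prefix rule from the PR description, might look roughly like the sketch below. It is illustrative only: the function name, the default trigger phrase "UPDATE CONTEXT", and the case-insensitive matching are assumptions, not the merged implementation.

```python
def should_update_context(
    reply: str,
    customized_answer_prefix: str = "",
    no_update_context: bool = False,
    trigger_phrase: str = "UPDATE CONTEXT",  # assumed default trigger phrase
) -> bool:
    """Decide whether the retrieval context should be refreshed.

    Returns False whenever no_update_context is set. Otherwise returns True if
    the reply asks for an update, or if a customized answer prefix was
    configured but does not appear in the reply.
    """
    if no_update_context:
        return False
    text = reply.upper()  # case-insensitive matching is an assumption
    asked_for_update = trigger_phrase.upper() in text
    prefix_missing = bool(customized_answer_prefix) and (
        customized_answer_prefix.upper() not in text
    )
    return asked_for_update or prefix_missing


if __name__ == "__main__":
    print(should_update_context("UPDATE CONTEXT"))
    print(should_update_context("Paris.", customized_answer_prefix="The answer is"))
    print(should_update_context("UPDATE CONTEXT", no_update_context=True))
```

With a positively named flag such as update_context, as suggested in the comment above, the early return would read `if not update_context: return False`.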

pcdeadeasy (Contributor) left a comment

Changes look good to me. One suggestion you could consider is renaming retrieve_user_proxy_agent.py to RAG_user_proxy_agent.py.

@sonichi sonichi added this pull request to the merge queue Sep 27, 2023
Merged via the queue into main with commit 1108818 Sep 27, 2023
16 of 19 checks passed
@thinkall thinkall deleted the improve_retrieve branch September 27, 2023 01:57
whiskyboy pushed a commit to whiskyboy/autogen that referenced this pull request Apr 17, 2024
* Upsert in batch

* Improve update context, support customized answer prefix

* Update tests

* Update intermediate answer

* Fix duplicate intermediate answer, add example 6 to notebook

* Add notebook results

* Works better without intermediate answers in the context

* Bump version to 0.1.2

* Remove commented code and add descriptions to _generate_retrieve_user_reply

---------

Co-authored-by: Qingyun Wu <[email protected]>
randombet pushed a commit to randombet/autogen that referenced this pull request Sep 9, 2024
* intial commit for aws-bedrock

* format

* converse setup for model req-response

* Renamed to bedrock.py, updated parameter parsing, system message extraction, client class incorporation

* Established Bedrock class based on @astroalek and @ChristianT's code, added ability to disable system prompt separation

* Image parsing and removing access credential checks

* Added tests, added additional parameter support

* Amazon Bedrock documentation

* Moved client parameters to init, align parameter names with Anthropic, spelling fix, remove unnecessary imports, use base and additional parameters, update documentation

* Tidy up comments

* Minor typo fix

* Correct comment re aws_region

---------

Co-authored-by: Hk669 <[email protected]>
Co-authored-by: HRUSHIKESH DOKALA <[email protected]>
jackgerrits pushed a commit that referenced this pull request Oct 2, 2024
add envvars in all projects
jackgerrits pushed a commit that referenced this pull request Oct 2, 2024
* namespace fixes + remove skills definitios from Actors project

* add waf context to actors

* deploy to Azure WIP

* add bicep for gh-flow and cosmos

* azure deploy fixes

* azure deploy WIP
jackgerrits added a commit that referenced this pull request Oct 2, 2024
jackgerrits pushed a commit that referenced this pull request Oct 2, 2024
* Renamed TeamOne to MagenticOne

* Updated uv.lock

* Fixed workflows.