
update to 0.1, remove deprecated functionality and focus on api catalog backend #48

Merged
merged 42 commits into main from mattf/dev-v0.1
May 31, 2024

Conversation

@mattf (Collaborator) commented May 30, 2024

this is version 0.1.0 of the connectors with two primary changes -

  1. [user visible] deprecated or unavailable functionality removed
  2. [not user visible] use of the api catalog (integrate.api.nvidia.com and ai.api.nvidia.com) instead of nvcf (api.nvcf.nvidia.com) for inference

all playground_* model endpoints have been decommissioned.

functionality removed -

  • models: playground_mamba_chat, playground_smaug_72b, playground_nemotron_qa_8b, playground_nemotron_steerlm_8b, playground_steerlm_llama_70b, playground_yi_34b, playground_nvolveqa_40k
  • methods: available_functions, get_available_functions, get_model_details, get_binding_model, mode, validate_model, validate_base_url, reset_method_cache, validate_client, aifm_deprecated, aifm_bad_deprecated, aifm_labels_deprecated, custom_preprocess, preprocess_msg, custom_postprocess, get_generation, get_stream, get_astream, get_payload, prep_payload, prep_msg, deprecate_max_length
  • properties: ChatNVIDIA.bad, ChatNVIDIA.labels, ChatNVIDIA.infer_endpoint, ChatNVIDIA.client, ChatNVIDIA.streaming, NVIDIAEmbeddings.infer_endpoint, NVIDIAEmbeddings.client, NVIDIAEmbeddings.max_length

functionality deprecated -

  • NVIDIAEmbeddings.model_type, instead of setting model_type="query" or "passage", use NVIDIAEmbeddings.embed_query and NVIDIAEmbeddings.embed_documents

migration guide -

  • ChatNVIDIA().mode("nim", base_url="http://...") must become ChatNVIDIA(base_url="http://...")
  • NVIDIAEmbeddings().mode("nim", base_url="http://...") must become NVIDIAEmbeddings(base_url="http://...")
  • NVIDIARerank().mode("nim", base_url="http://...") must become NVIDIARerank(base_url="http://...")
  • compatibility is maintained for the playground_nvolveqa_40k (aka nvolveqa_40k) model. when specifying model="nvolveqa_40k", the NV-Embed-QA model will be used and truncate="END" will be set. note: the old nvolveqa_40k model endpoint would silently truncate input, while all available endpoints raise an error instead of truncating.
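The nvolveqa_40k compatibility shim described above can be sketched in plain Python. This is an illustrative sketch only; EmbeddingConfig and resolve_embedding_model are hypothetical names, not the connector's actual internals.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EmbeddingConfig:
    model: str
    truncate: Optional[str] = None

def resolve_embedding_model(model: str) -> EmbeddingConfig:
    # deprecated aliases map to NV-Embed-QA; truncate="END" preserves the
    # old endpoint's silent-truncation behavior, since the new endpoints
    # raise an error on over-long input instead
    if model in ("playground_nvolveqa_40k", "nvolveqa_40k"):
        return EmbeddingConfig(model="NV-Embed-QA", truncate="END")
    return EmbeddingConfig(model=model)
```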

model migration - the following models will raise a warning and use an alternative (model changes in italics) -

| model | alternative |
| --- | --- |
| playground_llama2_13b | *meta/llama2-70b* |
| playground_llama2_code_13b | *meta/codellama-70b* |
| playground_llama2_code_34b | *meta/codellama-70b* |
| playground_nv_llama2_rlhf_70b | *meta/llama2-70b* |
| playground_phi2 | *microsoft/phi-3-mini-4k-instruct* |
| playground_llama2_code_70b | meta/codellama-70b |
| playground_gemma_2b | google/gemma-2b |
| playground_gemma_7b | google/gemma-7b |
| playground_llama2_70b | meta/llama2-70b |
| playground_mistral_7b | mistralai/mistral-7b-instruct-v0.2 |
| playground_mixtral_8x7b | mistralai/mixtral-8x7b-instruct-v0.1 |
| playground_deplot | google/deplot |
| playground_fuyu_8b | adept/fuyu-8b |
| playground_kosmos_2 | microsoft/kosmos-2 |
| playground_neva_22b | nvidia/neva-22b |
| playground_nvolveqa_40k | *NV-Embed-QA* |
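The warn-and-fall-back behavior in the table can be sketched as a simple lookup plus a warning. This is a hypothetical sketch: MODEL_ALIASES mirrors a subset of the table above, and the warning text is illustrative rather than the connector's actual message.

```python
import warnings

# subset of the alias table above; remaining entries follow the same pattern
MODEL_ALIASES = {
    "playground_llama2_13b": "meta/llama2-70b",
    "playground_llama2_code_13b": "meta/codellama-70b",
    "playground_phi2": "microsoft/phi-3-mini-4k-instruct",
    "playground_gemma_7b": "google/gemma-7b",
    "playground_nvolveqa_40k": "NV-Embed-QA",
}

def resolve_model(name: str) -> str:
    """Return the model to use, warning when a deprecated alias is given."""
    if name in MODEL_ALIASES:
        alternative = MODEL_ALIASES[name]
        warnings.warn(f"{name} is deprecated, using {alternative} instead")
        return alternative
    return name
```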

@mattf mattf requested a review from dglogo May 30, 2024 18:49
@mattf mattf self-assigned this May 30, 2024
@mattf mattf requested a review from raspawar May 30, 2024 18:49
@dglogo (Collaborator) left a comment


@mattf I mostly reviewed the tests, docs, and notebook changes. All LGTM! Thanks

infer_path = "{base_url}/embeddings"
# not all embedding models are on https://integrate.api.nvidia.com/v1,
# those that are not are served from their own endpoints
if model := determine_model(self.model):
Collaborator:

I do see code validating the model (determine_model) in _NVIDIAClient; is it necessary to determine it again here? Since determine_model returns a Model obj, we can check whether the endpoint is present via _client.model.endpoint.

Collaborator (author):

unfortunately yes. self.model is only the model's id str. this call gets the full Model so the endpoint can be checked.

a better approach is to store the Model instead of the id str.

self.model = self._client.model

# todo: remove when nvolveqa_40k is removed from MODEL_TABLE
if "model" in kwargs and kwargs["model"] in [
Collaborator:

Is this to validate the user's original model input? Why check it from kwargs?

Collaborator (author):

yes, checking the original model input by the user. at this point in the code self.model will be a transformed version of the original input, e.g. nvolveqa_40k -> NV-Embed-QA.
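A toy illustration of the point: after initialization, self.model already holds the transformed id, so only the original kwargs still carry the user's input. The Emb class and ALIASES dict below are hypothetical, not the connector's actual code.

```python
ALIASES = {"nvolveqa_40k": "NV-Embed-QA", "playground_nvolveqa_40k": "NV-Embed-QA"}

class Emb:
    def __init__(self, **kwargs):
        requested = kwargs.get("model", "NV-Embed-QA")
        # self.model is transformed, e.g. nvolveqa_40k -> NV-Embed-QA
        self.model = ALIASES.get(requested, requested)
        # so the deprecated-alias check must look at kwargs, not self.model
        self.used_deprecated_alias = kwargs.get("model") in ALIASES
```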

# those that are not are served from their own endpoints
if model := determine_model(self.model):
if model.endpoint: # some models have custom endpoints
infer_path = model.endpoint
Collaborator:

Same comment as for embeddings: can we check model.endpoint after _NVIDIAClient initialization?

Collaborator (author):

you are quite correct. this is code that's duplicated across all connectors and should move into _NVIDIAClient.

i ran into an issue w/ differing types when making _NVIDIAClient.model a Model instead of a str.

let me follow up this commit w/ cleanup of the Model/str handling.

mattf added 25 commits May 31, 2024 09:32
@mattf mattf force-pushed the mattf/dev-v0.1 branch from b95a069 to f1504f3 on May 31, 2024 13:39
@mattf mattf merged commit 2719ca5 into main May 31, 2024
12 checks passed
@mattf mattf deleted the mattf/dev-v0.1 branch May 31, 2024 15:11