
update to 0.1, remove deprecated functionality and focus on api catalog backend #48

Merged
merged 42 commits into main from mattf/dev-v0.1
May 31, 2024

Conversation

@mattf (Collaborator) commented May 30, 2024

this is version 0.1.0 of the connectors with two primary changes -

  1. [user visible] deprecated or unavailable functionality removed
  2. [not user visible] use of the api catalog (integrate.api.nvidia.com and ai.api.nvidia.com) instead of nvcf (api.nvcf.nvidia.com) for inference

all playground_* model endpoints have been decommissioned.

functionality removed -

  • models: playground_mamba_chat, playground_smaug_72b, playground_nemotron_qa_8b, playground_nemotron_steerlm_8b, playground_steerlm_llama_70b, playground_yi_34b, playground_nvolveqa_40k
  • methods: available_functions, get_available_functions, get_model_details, get_binding_model, mode, validate_model, validate_base_url, reset_method_cache, validate_client, aifm_deprecated, aifm_bad_deprecated, aifm_labels_deprecated, custom_preprocess, preprocess_msg, custom_postprocess, get_generation, get_stream, get_astream, get_payload, prep_payload, prep_msg, deprecate_max_length
  • properties: ChatNVIDIA.bad, ChatNVIDIA.labels, ChatNVIDIA.infer_endpoint, ChatNVIDIA.client, ChatNVIDIA.streaming, NVIDIAEmbeddings.infer_endpoint, NVIDIAEmbeddings.client, NVIDIAEmbeddings.max_length

functionality deprecated -

  • NVIDIAEmbeddings.model_type, instead of setting model_type="query" or "passage", use NVIDIAEmbeddings.embed_query and NVIDIAEmbeddings.embed_documents

migration guide -

  • ChatNVIDIA().mode("nim", base_url="http://...") must become ChatNVIDIA(base_url="http://...")
  • NVIDIAEmbeddings().mode("nim", base_url="http://...") must become NVIDIAEmbeddings(base_url="http://...")
  • NVIDIARerank().mode("nim", base_url="http://...") must become NVIDIARerank(base_url="http://...")
  • compatibility is maintained for the playground_nvolveqa_40k (aka nvolveqa_40k) model. when specifying model="nvolveqa_40k", the NV-Embed-QA model will be used and truncate="END" will be set. note: the old nvolveqa_40k model endpoint would silently truncate input, while all available endpoints raise an error instead of truncating.
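The nvolveqa_40k compatibility shim described above can be sketched in plain Python. This is an illustrative sketch only; EmbeddingConfig and resolve_embedding_model are hypothetical names, not the connector's actual internals.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EmbeddingConfig:
    model: str
    truncate: Optional[str] = None

def resolve_embedding_model(model: str) -> EmbeddingConfig:
    # deprecated aliases map to NV-Embed-QA; truncate="END" preserves the
    # old endpoint's silent-truncation behavior, since the new endpoints
    # raise an error on over-long input instead
    if model in ("playground_nvolveqa_40k", "nvolveqa_40k"):
        return EmbeddingConfig(model="NV-Embed-QA", truncate="END")
    return EmbeddingConfig(model=model)
```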

model migration - the following models will raise a warning and use an alternative (model changes in italics) -

| model | alternative |
| --- | --- |
| playground_llama2_13b | *meta/llama2-70b* |
| playground_llama2_code_13b | *meta/codellama-70b* |
| playground_llama2_code_34b | *meta/codellama-70b* |
| playground_nv_llama2_rlhf_70b | *meta/llama2-70b* |
| playground_phi2 | *microsoft/phi-3-mini-4k-instruct* |
| playground_llama2_code_70b | meta/codellama-70b |
| playground_gemma_2b | google/gemma-2b |
| playground_gemma_7b | google/gemma-7b |
| playground_llama2_70b | meta/llama2-70b |
| playground_mistral_7b | mistralai/mistral-7b-instruct-v0.2 |
| playground_mixtral_8x7b | mistralai/mixtral-8x7b-instruct-v0.1 |
| playground_deplot | google/deplot |
| playground_fuyu_8b | adept/fuyu-8b |
| playground_kosmos_2 | microsoft/kosmos-2 |
| playground_neva_22b | nvidia/neva-22b |
| playground_nvolveqa_40k | *NV-Embed-QA* |
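The warn-and-fall-back behavior in the table can be sketched as a simple lookup plus a warning. This is a hypothetical sketch: MODEL_ALIASES mirrors a subset of the table above, and the warning text is illustrative rather than the connector's actual message.

```python
import warnings

# subset of the alias table above; remaining entries follow the same pattern
MODEL_ALIASES = {
    "playground_llama2_13b": "meta/llama2-70b",
    "playground_llama2_code_13b": "meta/codellama-70b",
    "playground_phi2": "microsoft/phi-3-mini-4k-instruct",
    "playground_gemma_7b": "google/gemma-7b",
    "playground_nvolveqa_40k": "NV-Embed-QA",
}

def resolve_model(name: str) -> str:
    """Return the model to use, warning when a deprecated alias is given."""
    if name in MODEL_ALIASES:
        alternative = MODEL_ALIASES[name]
        warnings.warn(f"{name} is deprecated, using {alternative} instead")
        return alternative
    return name
```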

@mattf mattf requested a review from dglogo May 30, 2024 18:49
@mattf mattf self-assigned this May 30, 2024
@mattf mattf requested a review from raspawar May 30, 2024 18:49
@dglogo (Collaborator) left a comment


@mattf I mostly reviewed the tests, docs, and notebook changes. All LGTM! Thanks

infer_path = "{base_url}/embeddings"
# not all embedding models are on https://integrate.api.nvidia.com/v1,
# those that are not are served from their own endpoints
if model := determine_model(self.model):
Collaborator:

I do see code validating the model (determine_model) in _NVIDIAClient; is it necessary to determine it again here? Since determine_model returns a Model obj, we can check whether the endpoint is present via _client.model.endpoint.

Collaborator (author):

unfortunately yes. self.model is only the model's id str. this call gets the full Model so the endpoint can be checked.

a better approach is to store the Model instead of the id str.

self.model = self._client.model

# todo: remove when nvolveqa_40k is removed from MODEL_TABLE
if "model" in kwargs and kwargs["model"] in [
Collaborator:

Is this to validate the user's original model input? Why check it from kwargs?

Collaborator (author):

yes, checking the original model input by the user. at this point in the code self.model will be a transformed version of the original input, e.g. nvolveqa_40k -> NV-Embed-QA.
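A toy illustration of the point: after initialization, self.model already holds the transformed id, so only the original kwargs still carry the user's input. The Emb class and ALIASES dict below are hypothetical, not the connector's actual code.

```python
ALIASES = {"nvolveqa_40k": "NV-Embed-QA", "playground_nvolveqa_40k": "NV-Embed-QA"}

class Emb:
    def __init__(self, **kwargs):
        requested = kwargs.get("model", "NV-Embed-QA")
        # self.model is transformed, e.g. nvolveqa_40k -> NV-Embed-QA
        self.model = ALIASES.get(requested, requested)
        # so the deprecated-alias check must look at kwargs, not self.model
        self.used_deprecated_alias = kwargs.get("model") in ALIASES
```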

# those that are not are served from their own endpoints
if model := determine_model(self.model):
if model.endpoint: # some models have custom endpoints
infer_path = model.endpoint
Collaborator:

Same comment as for embeddings: can we check model.endpoint after _NVIDIAClient initialization?

Collaborator (author):

you are quite correct. this is code that's duplicated across all connectors and should move into _NVIDIAClient.

i ran into an issue w/ differing types when making _NVIDIAClient.model a Model instead of a str.

let me follow up this commit w/ cleanup of the Model/str handling.

mattf added 25 commits May 31, 2024 09:32
@mattf mattf force-pushed the mattf/dev-v0.1 branch from b95a069 to f1504f3 on May 31, 2024 13:39
@mattf mattf merged commit 2719ca5 into main May 31, 2024
12 checks passed
@mattf mattf deleted the mattf/dev-v0.1 branch May 31, 2024 15:11