Fix ExLlamaV2 context length setting (closes #5750)
oobabooga committed Mar 31, 2024
1 parent 70c58b5 commit 624faa1
Showing 1 changed file with 4 additions and 3 deletions.
7 changes: 4 additions & 3 deletions modules/models_settings.py
@@ -77,9 +77,10 @@ def get_model_metadata(model):
     # Transformers metadata
     if hf_metadata is not None:
         metadata = json.loads(open(path, 'r', encoding='utf-8').read())
-        if 'max_position_embeddings' in metadata:
-            model_settings['truncation_length'] = metadata['max_position_embeddings']
-            model_settings['max_seq_len'] = metadata['max_position_embeddings']
+        for k in ['max_position_embeddings', 'max_seq_len']:
+            if k in metadata:
+                model_settings['truncation_length'] = metadata[k]
+                model_settings['max_seq_len'] = metadata[k]

         if 'rope_theta' in metadata:
             model_settings['rope_freq_base'] = metadata['rope_theta']
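
In isolation, the new lookup behaves like this minimal, runnable sketch; the two sample dicts are hypothetical stand-ins for parsed config.json files, not real model configs:

    # Standalone sketch of the new key lookup. The dicts below are
    # hypothetical examples of the two config.json conventions.
    llama_style = {'max_position_embeddings': 4096}  # the usual HF key
    dbrx_style = {'max_seq_len': 32768}              # the key DBRX ships instead

    for metadata in (llama_style, dbrx_style):
        model_settings = {}
        for k in ['max_position_embeddings', 'max_seq_len']:
            if k in metadata:
                model_settings['truncation_length'] = metadata[k]
                model_settings['max_seq_len'] = metadata[k]

        print(model_settings)
    # -> {'truncation_length': 4096, 'max_seq_len': 4096}
    # -> {'truncation_length': 32768, 'max_seq_len': 32768}

The point of iterating over both keys is that configs using max_seq_len instead of max_position_embeddings were silently ignored by the old code, leaving the context length unset.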

2 comments on commit 624faa1

@turboderp
Contributor

Note that this isn't directly related to EXL2. EXL2 just retains the config.json from the original model (apart from adding the quantization_config key). So the Transformers loader would need this change as well for DBRX.

If rope_theta is used anywhere, also be aware that DBRX moved it into the attn_config section for some reason.
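
A minimal sketch of one way to read the nested key, assuming metadata is the parsed config.json dict; this is an illustration of the idea, not the actual follow-up fix, and the sample dict and 500000 value are hypothetical:

    # Illustration only: check the top-level key first, then fall back to
    # DBRX's nested attn_config location. The dict below is hypothetical.
    metadata = {'attn_config': {'rope_theta': 500000}}
    model_settings = {}

    rope_theta = metadata.get('rope_theta')
    if rope_theta is None:
        rope_theta = metadata.get('attn_config', {}).get('rope_theta')

    if rope_theta is not None:
        model_settings['rope_freq_base'] = rope_theta

    print(model_settings)  # -> {'rope_freq_base': 500000}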

@oobabooga
Owner Author

I hadn't noticed they had moved rope_theta somewhere else, thanks. That should be accounted for as well now: 9ab7365.
