Default pretrained model config #1

Open · MiqG opened this issue Dec 5, 2024 · 0 comments

MiqG commented Dec 5, 2024

Hi @saberiato ,

Thanks for developing and sharing such a novel model; it is very exciting to finally be able to model full isoforms.

When I gave the pre-trained model a try, I found that the default config does not match the pre-trained model architecture and that the tokenizer cannot be loaded with the .from_pretrained method.

I only managed to load the model by manually tweaking the parameters based on your manuscript, plus quite a bit of trial and error. Is the config below correct? Could you upload the one you trained with, to make sure everything matches the trained model and its outputs can be trusted?

This code was run from the training-model directory without errors:

from modeling_hyena import StripedHyenaModelForCausalLM, StripedHyenaModelForExtractingEmbeddings
from configuration_hyena import StripedHyenaConfig

config_dict = {
    "vocab_size": 32,
    "hidden_size": 128,
    "num_filters": 128,
    "inner_mlp_size": 352,
    "attn_layer_idxs": [4, 8, 12],  # 3 attention layers
    "hyena_layer_idxs": [0, 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15],  # 13 hyena layers
    "num_layers": 16,  # 3 attention + 13 hyena
    "tie_embeddings": True,
    "short_filter_length": 3,
    "num_attention_heads": 16,
    "proj_groups": 1,
    "hyena_filter_groups": 1,
    "split_k0": True,
    "column_split_hyena": True,
    "column_split": False,
    "model_parallel_size": 1,
    "pipe_parallel_size": 1,
    "short_filter_bias": True,
    "mha_out_proj_bias": True,
    "qkv_proj_bias": True,
    "final_norm": True,
    "use_cache": False,
    "use_flash_attention_2": True,
    "use_flash_rmsnorm": True,
    "use_flash_depthwise": False,
    "use_flashfft": False,
    "inference_mode": True,
    "prefill_style": "fft",
    "max_seqlen": 65536,  # 2**16, matches the tokenizer's model_max_length below
    "eps": 1e-5,
    "state_size": 8,
    "rotary_emb_base": 500000,
    "smeared_gqa": False,
    "make_vocab_size_divisible_by": 8,
    "log_intermediate_values": False,
    "bidirectional": False,
}

config = StripedHyenaConfig(**config_dict)

checkpoint = "../weights"
model = StripedHyenaModelForCausalLM.from_pretrained(checkpoint, config=config)
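
For reference, a quick forward pass at least confirms that the checkpoint and this config agree on shapes. This is only a sketch: the token IDs are random placeholders, and with the flash-attention flags enabled the model may need to run on a GPU in half precision:

import torch

# Minimal shape check with random placeholder token IDs (not real sequence
# data). With use_flash_attention_2 / use_flash_rmsnorm enabled, the model
# may need to be moved to CUDA in (b)float16 before this runs.
input_ids = torch.randint(0, config_dict["vocab_size"], (1, 16))
with torch.no_grad():
    out = model(input_ids)
print(out.logits.shape)  # expected (1, 16, vocab_size) if config and weights agree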

However, instantiating the model for extracting embeddings resulted in an error inside StripedHyenaForEmbeddings:

from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast(
    tokenizer_file="../processing-seqs/lornash_tokenizer.json",
    padding_side='right',
    truncation_side='right',
    cls_token='[CLS]',
    bos_token='[CLS]',
    sep_token='[SEP]',
    eos_token='[SEP]',
    unk_token='[UNK]',
    mask_token='[MASK]',
    pad_token='[PAD]',
    model_max_length=2**16
)

model = StripedHyenaModelForExtractingEmbeddings.from_pretrained(checkpoint, tokenizer=tokenizer, config=config)

which fails with:

    self.backbone = StripedHyenaForEmbeddings(model_config)
TypeError: __init__() missing 1 required positional argument: 'tokenizer'
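
The tokenizer itself loads fine with the PreTrainedTokenizerFast call above. As a quick smoke test (the input string is a placeholder; I have not checked which alphabet lornash_tokenizer.json actually expects):

# Placeholder smoke test: encode a dummy sequence and check the shape.
# "ACGT" is illustrative only; the real vocabulary may use different symbols.
enc = tokenizer("ACGT" * 4, return_tensors="pt")
print(enc["input_ids"].shape)  # 16 characters, plus any special tokens the post-processor adds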

The wrapper never passes the tokenizer along to StripedHyenaForEmbeddings. Changing these lines of the StripedHyenaModelForExtractingEmbeddings class (a subclass of StripedHyenaPreTrainedModel in modeling_hyena.py):

    def __init__(self, config, **kwargs):
        super().__init__(config, **kwargs)
        model_config = dotdict(config.to_dict())
        self.backbone = StripedHyenaForEmbeddings(model_config)

into

    def __init__(self, config, tokenizer, **kwargs):
        super().__init__(config, **kwargs)
        model_config = dotdict(config.to_dict())
        self.backbone = StripedHyenaForEmbeddings(model_config, tokenizer)

made it run without errors. I will submit this as a PR.
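
In the meantime, a local subclass avoids editing modeling_hyena.py at all. This is just a sketch mirroring the patch above; the import locations for dotdict and StripedHyenaForEmbeddings are my guesses about the repo layout:

from modeling_hyena import StripedHyenaPreTrainedModel
# NOTE: the next two import locations are assumptions; adjust to wherever
# dotdict and StripedHyenaForEmbeddings actually live in this repo.
from modeling_hyena import dotdict                    # assumed location
from model import StripedHyenaForEmbeddings           # assumed location

class PatchedExtractingEmbeddings(StripedHyenaPreTrainedModel):
    """Stand-in for StripedHyenaModelForExtractingEmbeddings that threads
    the tokenizer through, mirroring the patch above."""

    def __init__(self, config, tokenizer, **kwargs):
        super().__init__(config, **kwargs)
        model_config = dotdict(config.to_dict())
        self.backbone = StripedHyenaForEmbeddings(model_config, tokenizer)

# Same call as before, no source edit needed:
model = PatchedExtractingEmbeddings.from_pretrained(checkpoint, tokenizer=tokenizer, config=config)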

Thanks in advance! Best,

Miquel
