Default pretrained model config #1

Open · MiqG opened this issue Dec 5, 2024 · 0 comments

MiqG commented Dec 5, 2024

Hi @saberiato ,

Thanks for developing and sharing such a novel model; it is very exciting to finally be able to model full isoforms.

When I gave the pre-trained model a try, I found that the default config does not match the pre-trained model architecture and that the tokenizer cannot be loaded with the .from_pretrained method.

I only managed to load the model by manually tweaking the parameters based on your manuscript, plus quite a bit of trial and error. Is the config below correct? Could you upload the one you trained with, to make sure everything matches the trained model and its outputs can be trusted?

This code was run from the training-model directory without errors:

from modeling_hyena import StripedHyenaModelForCausalLM, StripedHyenaModelForExtractingEmbeddings
from configuration_hyena import StripedHyenaConfig

config_dict = {
    "vocab_size": 32,
    "hidden_size": 128,
    "num_filters": 128,
    "inner_mlp_size": 352,
    "attn_layer_idxs": [4, 8, 12],  # 3 attention layers
    "hyena_layer_idxs": [0, 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15],  # 13 hyena layers
    "num_layers": 16,  # 3 attention + 13 hyena
    "tie_embeddings": True,
    "short_filter_length": 3,
    "num_attention_heads": 16,
    "proj_groups": 1,
    "hyena_filter_groups": 1,
    "split_k0": True,
    "column_split_hyena": True,
    "column_split": False,
    "model_parallel_size": 1,
    "pipe_parallel_size": 1,
    "short_filter_bias": True,
    "mha_out_proj_bias": True,
    "qkv_proj_bias": True,
    "final_norm": True,
    "use_cache": False,
    "use_flash_attention_2": True,
    "use_flash_rmsnorm": True,
    "use_flash_depthwise": False,
    "use_flashfft": False,
    "inference_mode": True,
    "prefill_style": "fft",
    "max_seqlen": 65536,  # 2**16, matches the tokenizer's model_max_length below
    "eps": 1e-5,
    "state_size": 8,
    "rotary_emb_base": 500000,
    "smeared_gqa": False,
    "make_vocab_size_divisible_by": 8,
    "log_intermediate_values": False,
    "bidirectional": False,
}

config = StripedHyenaConfig(**config_dict)

checkpoint = "../weights"
model = StripedHyenaModelForCausalLM.from_pretrained(checkpoint, config=config)
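
For reference, a quick forward pass at least confirms that the checkpoint and this config agree on shapes. This is only a sketch: the token IDs are random placeholders, and with the flash-attention flags enabled the model may need to run on a GPU in half precision:

import torch

# Minimal shape check with random placeholder token IDs (not real sequence
# data). With use_flash_attention_2 / use_flash_rmsnorm enabled, the model
# may need to be moved to CUDA in (b)float16 before this runs.
input_ids = torch.randint(0, config_dict["vocab_size"], (1, 16))
with torch.no_grad():
    out = model(input_ids)
print(out.logits.shape)  # expected (1, 16, vocab_size) if config and weights agree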

However, instantiating the model for extracting embeddings resulted in an error inside StripedHyenaForEmbeddings:

from transformers import PreTrainedTokenizerFast

tokenizer = PreTrainedTokenizerFast(
    tokenizer_file="../processing-seqs/lornash_tokenizer.json",
    padding_side='right',
    truncation_side='right',
    cls_token='[CLS]',
    bos_token='[CLS]',
    sep_token='[SEP]',
    eos_token='[SEP]',
    unk_token='[UNK]',
    mask_token='[MASK]',
    pad_token='[PAD]',
    model_max_length=2**16
)

model = StripedHyenaModelForExtractingEmbeddings.from_pretrained(checkpoint, tokenizer=tokenizer, config=config)

which fails with:

    self.backbone = StripedHyenaForEmbeddings(model_config)
TypeError: __init__() missing 1 required positional argument: 'tokenizer'
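
The tokenizer itself loads fine with the PreTrainedTokenizerFast call above. As a quick smoke test (the input string is a placeholder; I have not checked which alphabet lornash_tokenizer.json actually expects):

# Placeholder smoke test: encode a dummy sequence and check the shape.
# "ACGT" is illustrative only; the real vocabulary may use different symbols.
enc = tokenizer("ACGT" * 4, return_tensors="pt")
print(enc["input_ids"].shape)  # 16 characters, plus any special tokens the post-processor adds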

The wrapper never passes the tokenizer along to StripedHyenaForEmbeddings. Changing these lines of the StripedHyenaModelForExtractingEmbeddings class (a subclass of StripedHyenaPreTrainedModel in modeling_hyena.py):

    def __init__(self, config, **kwargs):
        super().__init__(config, **kwargs)
        model_config = dotdict(config.to_dict())
        self.backbone = StripedHyenaForEmbeddings(model_config)

into

    def __init__(self, config, tokenizer, **kwargs):
        super().__init__(config, **kwargs)
        model_config = dotdict(config.to_dict())
        self.backbone = StripedHyenaForEmbeddings(model_config, tokenizer)

made it run without errors. I will submit this as a PR.
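
In the meantime, a local subclass avoids editing modeling_hyena.py at all. This is just a sketch mirroring the patch above; the import locations for dotdict and StripedHyenaForEmbeddings are my guesses about the repo layout:

from modeling_hyena import StripedHyenaPreTrainedModel
# NOTE: the next two import locations are assumptions; adjust to wherever
# dotdict and StripedHyenaForEmbeddings actually live in this repo.
from modeling_hyena import dotdict                    # assumed location
from model import StripedHyenaForEmbeddings           # assumed location

class PatchedExtractingEmbeddings(StripedHyenaPreTrainedModel):
    """Stand-in for StripedHyenaModelForExtractingEmbeddings that threads
    the tokenizer through, mirroring the patch above."""

    def __init__(self, config, tokenizer, **kwargs):
        super().__init__(config, **kwargs)
        model_config = dotdict(config.to_dict())
        self.backbone = StripedHyenaForEmbeddings(model_config, tokenizer)

# Same call as before, no source edit needed:
model = PatchedExtractingEmbeddings.from_pretrained(checkpoint, tokenizer=tokenizer, config=config)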

Thanks in advance! Best,

Miquel
