
Implementation of Distilbert and added Example of Fill Mask task for Distilbert and Roberta #200

Open
wants to merge 12 commits into master

Conversation

deveshjawla

No description provided.

@AbrJA

AbrJA commented Nov 13, 2024

Hi @deveshjawla I hope you are doing well,

There was an error running the tests; it's the same issue across all versions. In the load file, line 286 has this:

if !isnothing(m.pooler)
        get_state_dict(HGFDistilBertModel, m.pooler.layer.dense, state_dict, joinname(prefix, "pooler.dense"))
    end

But the attribute pooler is not defined for HGFDistilBertModel, because the line with this definition is commented out:

# pooler = DistilBertPooler(Layers.Dense(NNlib.tanh_fast, weight, bias))

I'm gonna try to solve it but maybe you already have the solution!

Best regards, I really appreciate your contribution
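
A minimal sketch of one way to make that serialization guard safe when the struct may simply not carry a pooler field at all (this is an editor's illustration of the failure mode, not the patch actually applied in the PR):

```julia
# Sketch: only serialize the pooler when the field both exists on the
# struct and is set. `hasproperty` avoids the undefined-field error that
# occurs when `pooler` is not part of the model at all (e.g. because its
# construction is commented out).
if hasproperty(m, :pooler) && !isnothing(m.pooler)
    get_state_dict(HGFDistilBertModel, m.pooler.layer.dense, state_dict,
                   joinname(prefix, "pooler.dense"))
end
```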

@deveshjawla
Author

deveshjawla commented Dec 9, 2024

> Hi @deveshjawla I hope you are doing well,
>
> There was an error running the tests; it's the same issue across all versions. In the load file, line 286 has this:
>
>     if !isnothing(m.pooler)
>         get_state_dict(HGFDistilBertModel, m.pooler.layer.dense, state_dict, joinname(prefix, "pooler.dense"))
>     end
>
> But the attribute pooler is not defined for HGFDistilBertModel, because the line with this definition is commented out:
>
>     # pooler = DistilBertPooler(Layers.Dense(NNlib.tanh_fast, weight, bias))
>
> I'm gonna try to solve it but maybe you already have the solution!
>
> Best regards, I really appreciate your contribution

Dear Abraham,

I hope you are doing well. Apologies for a very late response. I have been working on another project.

Thank you for bringing this to my attention. I was having problems running the HuggingFaceValidation script while implementing DistilBert. An EnvironmentException was raised where the Hugging Face checkpoint is loaded from Python, in the following code in Transformers.jl/example/HuggingFaceValidation/main.jl:

@info "Load configure file in Python"
global pyconfig = @tryrun begin
    cfg = hgf_trf.AutoConfig.from_pretrained(model_name, layer_norm_eps = 1e-9, layer_norm_epsilon = 1e-9)
    if cfg.model_type == "clip"
        if haskey(cfg, "text_config")
            cfg.text_config.layer_norm_eps = 1e-9
            cfg.text_config.layer_norm_epsilon = 1e-9
        end
        if haskey(cfg, "vision_config")
            cfg.vision_config.layer_norm_eps = 1e-9
            cfg.vision_config.layer_norm_epsilon = 1e-9
        end
    end
    cfg
end "Failed to load configure file in Python, probably unsupported"

So I removed the eps overrides, and my validation for all models worked fine, as shown below:
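
With the overrides removed, the Python-side load reduces to roughly the following (an editor's sketch of the simplification described above, not the exact patch; hgf_trf is the handle to the Python transformers package, as in main.jl):

```julia
@info "Load configure file in Python"
global pyconfig = @tryrun begin
    # Load the checkpoint's config as-is. Passing no layer_norm_eps /
    # layer_norm_epsilon overrides avoids the exception seen when some
    # model configs reject those keyword arguments.
    hgf_trf.AutoConfig.from_pretrained(model_name)
end "Failed to load configure file in Python, probably unsupported"
```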
[Screenshot: validation output for all models passing, 2024-12-09 5:53 PM]

Perhaps when you ran the validation, it caught errors that my build had not.

I had implemented the pooler layer with NextSentencePrediction in mind, but I think it is not implemented in the Hugging Face DistilBert anyway.

Please let me know if the tests pass now.

As for the other tasks, such as QA and sequence classification, I have put the code in the DistilBert implementation but have not successfully tested it yet; I intend to do so soon. Perhaps someone else will implement them sooner than I do, so I have commented them out.

In any case, please let me know. Thank you.

@deveshjawla
Author

Hi, I have fixed the error related to ForCausalLM, but there are failures which I don't understand.
Could you please let me know what might be causing the following?

```
Load: Log Test Failed at /Users/runner/work/Transformers.jl/Transformers.jl/test/huggingface/load.jl:34
  Expression: load_model(model_name, hgf_model_name, task_type; config = cfg, cache = false)
  Log Pattern: min_level = Logging.Debug
```

@chengchingwen
Owner

/test/huggingface/load.jl:34 tests whether there are parameters that exist in the model but are not found in the state_dict and are thus randomly initialized. You can set ENV["JULIA_DEBUG"] = "Transformers" before calling load_model in the REPL to see which parameter is being initialized.
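
In REPL terms, the suggestion above amounts to something like this (an editor's sketch; the load_model arguments are the placeholders from the failing test, not concrete values):

```julia
using Transformers

# Enable debug-level logging for the Transformers package so that the
# loading code reports, via @debug messages, which parameters are missing
# from the state_dict and are therefore randomly initialized.
ENV["JULIA_DEBUG"] = "Transformers"

model = load_model(model_name, hgf_model_name, task_type;
                   config = cfg, cache = false)
```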
