Replies: 2 comments 5 replies
-
For text generation tasks, we don't have a built-in translator; we are working on one (@KexinFeng). In the meantime, you can consider using our Python engine with djl-serving to serve it. You only need to prepare a
And then start the djl-serving Docker container:
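As a sketch of the suggested Python-engine route (the `djl_python` `Input`/`Output` handler API and the use of a `transformers` pipeline here are assumptions to verify against the djl-serving documentation for your version), a minimal `model.py` entry point might look like:

```python
# model.py -- sketch of a djl-serving Python-engine handler (assumed API).
# djl-serving invokes handle(inputs) for each request when the Python engine
# is selected; Input/Output come from the djl_python package it ships with.
from djl_python import Input, Output
from transformers import pipeline

# Load the text2text-generation pipeline once, at model-load time.
generator = pipeline("text2text-generation", model="tscholak/cxmefzzi")

def handle(inputs: Input) -> Output:
    if inputs.is_empty():
        return None  # warm-up request, nothing to do
    text = inputs.get_as_string()
    result = generator(text)[0]["generated_text"]
    return Output().add(result)
```

With this approach the autoregressive decoding loop is handled inside the `transformers` pipeline, so no custom DJL translator is needed.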
-
@jawaff
-
I'm currently looking into text generation using a T5 Hugging Face model (https://huggingface.co/tscholak/cxmefzzi). I have it loaded in TorchScript format, with the tokenizer loaded from the tokenizer.json file, and I have written a translator that encodes the inputs and attempts to process the outputs. My problem is that the outputs for this model aren't what I expected, and I'm starting to think that either T5 models aren't supported or the conversion to TorchScript wasn't correct.
The shape of the output is (inputSize, vocabSize) and the values are floats (I'm interpreting the first item in the NDList as the output, since the other items have shape (batchSize, inputSize, 128)). I figured I could just pick the most probable token for each input position, but each vocabSize-length array is almost identical, so the decoded output is always the same word repeated.
Do I need to switch to a BERT model, or is there something wrong here?
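For context on why per-position argmax over a single forward pass tends to produce one repeated token: T5 is a seq2seq model, so generation is autoregressive — the decoder must be re-run once per output token, feeding back what it has produced so far. A toy sketch of that loop (the `decoder_step` stub stands in for the real T5 forward pass, which would return logits of shape `(batch, seqLen, vocabSize)` where only the last position matters each step):

```python
# Toy greedy autoregressive decoding loop. `decoder_step` is a stub standing
# in for a real seq2seq forward pass over (encoded input + tokens so far).
EOS = 0

def decoder_step(encoded_input, generated):
    # Stub "model": the next-token logits depend on what has been generated
    # so far -- exactly the dependence a single forward pass cannot capture.
    vocab_size = 5
    next_token = (generated[-1] + 1) % vocab_size if generated else 1
    return [1.0 if i == next_token else 0.0 for i in range(vocab_size)]

def greedy_generate(encoded_input, max_len=10):
    generated = []  # decoder starts from the start token only
    for _ in range(max_len):
        logits = decoder_step(encoded_input, generated)
        token = max(range(len(logits)), key=lambda i: logits[i])  # argmax
        if token == EOS:
            break
        generated.append(token)
    return generated

print(greedy_generate([42]))  # -> [1, 2, 3, 4]
```

The key point is the loop: each step feeds the tokens generated so far back into the decoder. A single forward pass over only the encoded input gives near-identical logits at every position, which matches the repeated-word symptom described above.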