NextViT-tf

This repository is about an implementation of the research paper "Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios" using Tensorflow.

In this paper authors propose a next generation vision Transformer for efficient deployment in realistic industrial scenarios, namely Next-ViT, which dominates both CNNs and ViTs from the perspective of latency/accuracy trade-off. In this work, the Next Convolution Block (NCB) and Next Transformer Block (NTB) are respectively developed to capture local and global information with deployment-friendly mechanisms. Then, Next Hybrid Strategy (NHS) is designed to stack NCB and NTB in an efficient hybrid paradigm, which boosts performance in various downstream tasks. Extensive experiments show that Next-ViT significantly outperforms existing CNNs, ViTs and CNN-Transformer hybrid architectures with respect to the latency/accuracy trade-off across various vision tasks.

Model Architecture

Usage:

Clone Github Repo:

$ git clone https://github.com/IMvision12/NextViT-tf
$ cd nextvit_tf

Then import model

import tensorflow as tf
from models.nextvit_tf import nextvit_base
model = nextvit_base(input_shape=(224, 224, 3), num_classes=100)
print(model.summary(expand_nested=True))

References

[1] NextViT paper: https://arxiv.org/pdf/2207.05501v4

[2] Official NextViT Repo: https://github.com/bytedance/next-vit

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
img		img
models		models
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NextViT-tf

Model Architecture

Usage:

References

About

Releases

Packages

Languages

IMvision12/NextViT-tf

Folders and files

Latest commit

History

Repository files navigation

NextViT-tf

Model Architecture

Usage:

References

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages