Implementation of the vision and language transformer models from the paper https://arxiv.org/abs/2405.15712.
This repository also contains modifications to the Allen AI OLMo codebase (https://github.com/allenai/OLMo) to allow for infinite-width and infinite-depth limits when training on C4.