Description
Currently, we are struggling to train transformer models for inputs larger than a 128^3 volume. With the volumetric grid representation of shape, the achievable resolution may not be sufficient. For example, for large bone shapes such as the hip and ribs, we have had to use a voxel resolution of ~2.5 mm, which is very coarse.
We might be able to output a higher-resolution volume if the transformers were lighter in terms of memory usage.
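For intuition, here is a back-of-envelope estimate of the self-attention matrix alone under straightforward voxel patchification; the patch size and fp32 dtype are illustrative assumptions, not our actual configuration:

```python
# Rough size of one attention matrix (per head, per layer) for a cubic volume.
# Patch size and bytes-per-element are assumed values for illustration only.
def attn_matrix_gib(volume: int, patch: int = 4, bytes_per_el: int = 4):
    tokens = (volume // patch) ** 3              # voxel patches -> tokens
    gib = tokens ** 2 * bytes_per_el / 2 ** 30   # dense token-by-token matrix
    return tokens, gib

print(attn_matrix_gib(128))  # (32768, 4.0)     -> 4 GiB at 128^3
print(attn_matrix_gib(256))  # (262144, 256.0)  -> 256 GiB at 256^3
```

The quadratic growth in token count is what makes anything much beyond 128^3 impractical with full attention.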
Proposal
Use a 2D encoder and a 3D decoder. How do we realize this with transformers? A case in point: how do we port the ideas from TransVert and 3DReconNet into pure transformers? How do we concatenate features from the two parallel X-ray branches and decode a 3D output?
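One way this could look, as a minimal (untested) PyTorch sketch: each X-ray view gets its own patch-embedding transformer encoder, the two token sequences are concatenated, and a grid of learned 3D queries cross-attends to them before a small transposed-conv decoder produces the volume. Every module name, size, and the query-based fusion step are assumptions for illustration, not a port of TransVert or 3DReconNet:

```python
import torch
import torch.nn as nn

class XrayEncoder2D(nn.Module):
    """Patchify one X-ray view and encode it with a small transformer."""
    def __init__(self, img=128, patch=16, dim=256, depth=4, heads=8):
        super().__init__()
        self.patchify = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.randn(1, (img // patch) ** 2, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)

    def forward(self, x):                                  # x: (B, 1, H, W)
        tok = self.patchify(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.encoder(tok + self.pos)

class BiplanarTo3D(nn.Module):
    """Concatenate tokens from both views; learned 3D queries cross-attend
    to them; a transposed-conv decoder upsamples the coarse query grid."""
    def __init__(self, dim=256, grid=8):
        super().__init__()
        self.enc_ap = XrayEncoder2D(dim=dim)
        self.enc_lat = XrayEncoder2D(dim=dim)
        self.grid = grid
        self.queries = nn.Parameter(torch.randn(1, grid ** 3, dim) * 0.02)
        self.fuse = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.decode = nn.Sequential(                       # 8^3 -> 64^3
            nn.ConvTranspose3d(dim, 64, 4, 2, 1), nn.GELU(),
            nn.ConvTranspose3d(64, 32, 4, 2, 1), nn.GELU(),
            nn.ConvTranspose3d(32, 1, 4, 2, 1),
        )

    def forward(self, ap, lat):
        # feature concatenation across the two parallel X-ray branches
        mem = torch.cat([self.enc_ap(ap), self.enc_lat(lat)], dim=1)
        q, _ = self.fuse(self.queries.expand(ap.size(0), -1, -1), mem, mem)
        vol = q.transpose(1, 2).reshape(-1, q.size(-1), *([self.grid] * 3))
        return self.decode(vol)

vol = BiplanarTo3D()(torch.randn(2, 1, 128, 128), torch.randn(2, 1, 128, 128))
print(vol.shape)  # torch.Size([2, 1, 64, 64, 64])
```

Cross-attention from a coarse 3D query grid is only one fusion option; simply concatenating the two branches' tokens and reshaping them into a volume is the most literal reading of the question above.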
Implement/port Conv+transformer hybrids.
These might be more efficient in terms of memory usage, since attention then runs over a heavily downsampled feature grid rather than over raw voxels.
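As a minimal sketch of that idea (sizes and names are illustrative, and this is not CoTr's actual architecture): a strided 3D conv stem shrinks a 128^3 input to an 8^3 token grid before any attention runs, so the transformer only ever sees 512 tokens:

```python
import torch
import torch.nn as nn

class ConvTransformer3D(nn.Module):
    """Conv stem -> global attention on a tiny token grid -> conv head."""
    def __init__(self, dim=256, depth=4, heads=8):
        super().__init__()
        self.stem = nn.Sequential(                       # 128^3 -> 8^3
            nn.Conv3d(1, 32, 4, 2, 1), nn.GELU(),        # 64^3
            nn.Conv3d(32, 64, 4, 2, 1), nn.GELU(),       # 32^3
            nn.Conv3d(64, 128, 4, 2, 1), nn.GELU(),      # 16^3
            nn.Conv3d(128, dim, 4, 2, 1),                # 8^3
        )
        layer = nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)
        self.body = nn.TransformerEncoder(layer, depth)  # attention over 512 tokens
        self.head = nn.Sequential(                       # 8^3 -> 128^3
            nn.ConvTranspose3d(dim, 128, 4, 2, 1), nn.GELU(),
            nn.ConvTranspose3d(128, 64, 4, 2, 1), nn.GELU(),
            nn.ConvTranspose3d(64, 32, 4, 2, 1), nn.GELU(),
            nn.ConvTranspose3d(32, 1, 4, 2, 1),
        )

    def forward(self, x):                                # (B, 1, 128, 128, 128)
        f = self.stem(x)                                 # (B, dim, 8, 8, 8)
        b, c, d, h, w = f.shape
        tok = self.body(f.flatten(2).transpose(1, 2))    # (B, 512, dim)
        return self.head(tok.transpose(1, 2).reshape(b, c, d, h, w))

out = ConvTransformer3D()(torch.randn(1, 1, 128, 128, 128))
print(out.shape)  # torch.Size([1, 1, 128, 128, 128])
```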
Try:
Implement/port various lightweight transformers and other promising ideas.
3D Medical Axial transformer, CoTr, and AFTer-UNet seem promising; a generic axial-attention sketch follows below.
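For the axial direction in particular, the memory win comes from attending along one axis at a time: full attention over a D×H×W grid costs O((DHW)^2), while axial attention costs O(DHW·(D+H+W)). A rough, generic sketch in the spirit of the axial-transformer family (not a faithful port of any of the papers above; all names and sizes are assumptions):

```python
import torch
import torch.nn as nn

class AxialAttention3D(nn.Module):
    """Run self-attention along the D, H, and W axes in turn, treating the
    other two axes as batch dimensions, with residual + LayerNorm per axis."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(3)])
        self.norm = nn.ModuleList([nn.LayerNorm(dim) for _ in range(3)])

    @staticmethod
    def _along(x, axis):
        # Move `axis` to the sequence slot; fold the other two axes into batch.
        perm = [0] + [a for a in (1, 2, 3) if a != axis] + [axis, 4]
        y = x.permute(perm)
        shape = y.shape                       # (B, other, other, axis_len, C)
        return y.reshape(-1, shape[-2], shape[-1]), shape, perm

    def forward(self, x):                     # x: (B, D, H, W, C)
        for axis, (attn, norm) in enumerate(zip(self.attn, self.norm), start=1):
            seq, shape, perm = self._along(x, axis)
            out, _ = attn(seq, seq, seq)      # attention over one axis only
            out = norm(out + seq).reshape(shape)
            inv = [perm.index(i) for i in range(5)]   # undo the permutation
            x = out.permute(inv)
        return x

x = torch.randn(1, 16, 16, 16, 64)            # (B, D, H, W, C)
print(AxialAttention3D()(x).shape)            # torch.Size([1, 16, 16, 16, 64])
```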