Teaser: this version will continue to use the CINN architecture (Live3D-v1, 2021) for mutual conditioning, but there will be no predetermined task formulation.
The model's actual behavior will be determined at inference time, similar to diffusion models and masked autoencoders.
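
To make "behavior determined at inference time" concrete, here is a minimal sketch of the masked-autoencoder-style idea: a mask supplied with the query marks which streams are given, and the same function fills in the rest. Everything in it is an assumption for illustration only. The stream names (`x`, `y`, `z`), `build_query`, and `toy_model` are hypothetical, and the averaging rule merely stands in for a trained network; none of it is the Live3D or CINN implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical stream names, chosen only for illustration.
STREAMS = ("x", "y", "z")


def build_query(
    observed: Dict[str, List[float]], length: int
) -> Tuple[Dict[str, List[float]], Dict[str, bool]]:
    """Pack the observed streams and a mask that marks which ones are given."""
    values: Dict[str, List[float]] = {}
    mask: Dict[str, bool] = {}
    for name in STREAMS:
        given = name in observed
        mask[name] = given
        # Unobserved streams are zero-filled placeholders.
        values[name] = list(observed[name]) if given else [0.0] * length
    return values, mask


def toy_model(
    values: Dict[str, List[float]], mask: Dict[str, bool]
) -> Dict[str, List[float]]:
    """Stand-in for a trained network (assumption): every masked-out stream
    is filled with the element-wise mean of the observed streams."""
    known = [values[name] for name in STREAMS if mask[name]]
    if not known:
        raise ValueError("at least one stream must be observed")
    length = len(known[0])
    mean = [sum(v[i] for v in known) / len(known) for i in range(length)]
    return {name: (values[name] if mask[name] else mean) for name in STREAMS}


# Same function, two different "tasks", selected only by what is observed.
task_a = toy_model(*build_query({"x": [1.0, 3.0]}, length=2))
task_b = toy_model(*build_query({"x": [1.0, 3.0], "y": [3.0, 5.0]}, length=2))
```

In `task_a` only `x` is given, so `y` and `z` are both predicted; in `task_b` both `x` and `y` are given and only `z` is predicted. No task label is passed anywhere: the mask alone decides what the call does.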