This repo extracts MFCC features from an audio frame of a file (input.wav) and regresses them to a 3D geometry label. I implement and train two networks for this task: an LSTM and a MobileNet (which treats the audio feature map as an image).
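
To make the feature step concrete, here is a minimal, dependency-free sketch of computing MFCCs for a single audio frame. It is not the code used in this repo: the function names (`mfcc_frame`, `mel_filterbank`, `power_spectrum`) are hypothetical, and the 16 kHz sample rate, 400-sample frame, 26 mel filters and 13 coefficients are assumed values chosen for illustration. A real pipeline would normally use an audio library and an FFT instead of the naive DFT shown here.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):  # naive O(N^2) DFT, fine for one short frame
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge points: each filter spans three consecutive points.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [hz * n_fft / sample_rate for hz in edges_hz]  # fractional bin positions
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = []
        for k in range(n_bins):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising slope
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate=16000, n_filters=26, n_coeffs=13):
    """MFCCs for one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small floor avoids log(0) when a filter captures no energy.
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

# Example: a 25 ms frame of a 440 Hz sine at the assumed 16 kHz sample rate.
frame = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(400)]
coeffs = mfcc_frame(frame)
print(len(coeffs))  # → 13
```

Stacking these per-frame vectors over time gives a 2D feature map (frames by coefficients), which is the kind of input that can be fed to a sequence model row by row or to an image network as a single-channel picture.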