In deep learning, generation within a single domain already shows excellent performance. Inter-domain generation, however, remains a challenging field, and many studies still seek correlations between different modalities. We propose the Adversarial Conditional VAE (AC-VAE), a model that generates one modality (audio/visual) from the other (visual/audio) by combining the advantages of two representative generative methods with a simple auxiliary classifier. In our experiments, the proposed model achieves promising results in both audio-to-image and image-to-audio generation. We report our results and discussion below.
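The objective described above combines a conditional-VAE part (reconstruction plus KL regularization), an adversarial part, and an auxiliary-classifier part. The toy sketch below shows how such terms are typically weighted and summed; all weights, values, and function names are illustrative assumptions, not the paper's actual loss.

```python
import math

# Toy illustration of an AC-VAE-style combined objective.
# All numbers and weights are illustrative assumptions, not values
# taken from the AC-VAE paper or code.

def kl_standard_normal(mu, logvar):
    """KL divergence between N(mu, exp(logvar)) and N(0, I), summed over dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))

def acvae_loss(recon_err, mu, logvar, adv_score, cls_prob,
               w_kl=1.0, w_adv=0.1, w_cls=0.1):
    # Conditional-VAE part: reconstruction error plus weighted KL term.
    elbo = recon_err + w_kl * kl_standard_normal(mu, logvar)
    # Adversarial part: generator is rewarded when the discriminator
    # assigns the generated sample a high "real" score.
    adv = -math.log(max(adv_score, 1e-8))
    # Auxiliary-classifier part: the generated sample should still be
    # recognized as belonging to its conditioning class.
    cls = -math.log(max(cls_prob, 1e-8))
    return elbo + w_adv * adv + w_cls * cls

loss = acvae_loss(recon_err=0.5, mu=[0.0, 0.1], logvar=[0.0, 0.0],
                  adv_score=0.9, cls_prob=0.8)
print(round(loss, 4))  # prints 0.5379
```

The classifier term is what distinguishes this family of models from a plain conditional VAE-GAN: it keeps cross-modal generations class-consistent rather than merely realistic.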
First, put the dataset in <Code_path>/dataset/
Dataset Link: https://www.cs.rochester.edu/~cxu22/d/vagan/
The results will be saved in <Code_path>/experiment/
python trainA2I.py --name <save_result_name>