
How to achieve high target accuracy given DeepJDOT limitations #7

Open
offchan42 opened this issue Jul 6, 2019 · 2 comments

@offchan42
Contributor

offchan42 commented Jul 6, 2019

I want to improve the accuracy (or loss) on the target dataset, so I'd like to ask a few questions about factors that might affect it.

  1. If I want to increase target accuracy, should I train with a more varied source domain? E.g., to increase accuracy on MNIST (for SVHN→MNIST adaptation), should I augment SVHN with variations such as a grayscale version, differently colored versions, etc.?
  2. What are examples of source and target dataset pairs that would achieve high target accuracy? Please give an idea even if you haven't experimented with them before.
  3. What are examples of source and target dataset pairs that would achieve low target accuracy? Please give an idea even if you haven't experimented with them before.
@bbdamodaran
Owner

DeepJDOT might require proper initialization of the target model. If you initialize the weights of the target model with the source model's weights, DeepJDOT works pretty well in almost all cases.
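To illustrate the initialization above: assuming a Keras setup (as in this repo), it amounts to `target_model.set_weights(source_model.get_weights())` on two models with identical architecture. The sketch below shows the same idea with plain NumPy arrays standing in for the layers (the network shapes are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the layers of a trained source model (hypothetical shapes).
source_weights = {
    "W1": rng.normal(size=(8, 16)), "b1": np.zeros(16),
    "W2": rng.normal(size=(16, 4)), "b2": np.zeros(4),
}

def forward(weights, x):
    """Two-layer toy network: ReLU hidden layer, linear output."""
    h = np.maximum(0.0, x @ weights["W1"] + weights["b1"])
    return h @ weights["W2"] + weights["b2"]

# Initialize the target model by copying every source weight.
# In Keras: target_model.set_weights(source_model.get_weights())
target_weights = {k: v.copy() for k, v in source_weights.items()}

# Sanity check: before any adaptation, the two models must agree exactly.
x = rng.normal(size=(5, 8))
assert np.allclose(forward(source_weights, x), forward(target_weights, x))
```

Copying (rather than sharing) the arrays matters: the target model is updated during adaptation, and the source weights should stay untouched for later comparison.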

@offchan42
Contributor Author

offchan42 commented Jul 16, 2019

I always initialize the weights of the target model from the source model, and I check the target model's accuracy before training to make sure it matches the source model.

I'm training DeepJDOT on my own dataset, and I've found it's quite difficult to guarantee that DeepJDOT will reduce the error. Sometimes it actually increases the error after a few hundred iterations. Maybe regression is a harder problem than classification? This also happens occasionally on the rotated SVHN→MNIST dataset.
My source is face images in normal lighting. My target is face images with a flashlight shining from below the face. My outputs are 50 sigmoid units (how open an eye is, how open the mouth is, etc.).

I've also noticed that if I set the activation of the feature-extraction layer to ReLU instead of sigmoid, DeepJDOT makes the target error increase instead of decrease. Is this expected? Sigmoid trains quite slowly, which is why I wanted to change it.

  1. How do I know when to stop training (so that DeepJDOT doesn't overfit) if I don't have target labels? The loss that DeepJDOT reports doesn't seem to be correlated with, or proportional to, the mean absolute error, and the error seems to start increasing after some number of iterations.
  2. Does DeepJDOT require a lot of target training data? Is there a rule of thumb for this?
  3. Should the feature-extraction layer always be the last layer before the final output layer? How would increasing or decreasing its number of units (128) affect the outcome? Can I use the output layer itself as the feature-extraction layer?
  4. How do batch size and learning rate affect the outcome? Do you think they are very critical?

Without target labels (the real unsupervised case), training with DeepJDOT can be quite scary and unreliable: I can't tell whether the error will increase or decrease. That's why I want to maximize the chance of getting it right.
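For what it's worth, since DeepJDOT doesn't define a stopping rule for the label-free case, any criterion is a heuristic. One option (my own suggestion, not something from the paper or this repo) is to stop once a moving average of the training loss plateaus:

```python
import numpy as np

def should_stop(loss_history, window=50, patience=3, tol=1e-4):
    """Heuristic early stopping without target labels: return True when
    the windowed moving average of the loss has not improved by more than
    `tol` across `patience` consecutive windows."""
    if len(loss_history) < window * (patience + 1):
        return False  # not enough history yet
    n = len(loss_history)
    # means[0] is the most recent window, means[patience] the oldest.
    means = [np.mean(loss_history[n - (i + 1) * window : n - i * window])
             for i in range(patience + 1)]
    # Stop only if no window improved on the one before it by more than tol.
    return all(means[i] > means[i + 1] - tol for i in range(patience))

decreasing = list(np.linspace(1.0, 0.1, 300))
print(should_stop(decreasing))               # -> False (still improving)
plateaued = decreasing + [0.1] * 300
print(should_stop(plateaued))                # -> True  (flat for a while)
```

This only detects convergence of the surrogate loss, not target error, so it can't rule out the overfitting described above; a small labeled target validation set, if available at all, is far more reliable.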

Thank you for your help! This is quite critical for my work.
