On the problems for the training data. #1

Closed
okunoyukihiro2 opened this issue Jul 28, 2020 · 25 comments

Comments

@okunoyukihiro2

Dear Composition-Conditioned-Crystal-GAN developers,

I am trying to run the GitHub code for your paper
'Generative Adversarial Networks for Crystal Structure Prediction'.

Running train.py requires the training data mgmno_2000.pickle,
which I could not find in the repository.
I therefore generated the training data myself by running

  1. 5.make_comp_dict.py
  2. 6.data_augmentation_mgmno.py

on 'unique_sc_mgmno.npy' and 'unique_sc_mgmno_name_list' from the repository.

However, train.py did not work with the training data I generated.

Looking at train.py, I found the following problems.

The code assumes that the training data are packed as pairs of crystal images and labels, for example:

for j, (imgs, label) in enumerate(dataloader):
    batch_size = imgs.shape[0]
    real_imgs = imgs.view(batch_size, 1, 30, 3)

I think the code expects the training data to contain both images (denoted C in your paper: the representation of the crystal structure)
and labels (denoted A in your paper: the atomic status).

However, the data preparation script (6.data_augmentation_mgmno.py)
dumps only the image data. None of the data preparation scripts on GitHub generates the atomic status.
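
To make the mismatch concrete, here is a minimal sketch in plain Python (no PyTorch) of what happens when a loop written for (images, labels) batches receives image-only batches. The placeholder image, the label vector, and the helper name `unpack_batches` are hypothetical and not taken from the repository; the real pickle layout may differ.

```python
def unpack_batches(batches):
    """Mimic the unpacking in `for j, (imgs, label) in enumerate(dataloader)`."""
    sizes = []
    for j, (imgs, label) in enumerate(batches):
        sizes.append((len(imgs), len(label)))
    return sizes


# Hypothetical placeholder data: one 30x3 "image" and one label vector.
image = [[0.0, 0.0, 0.0] for _ in range(30)]
label = [1, 0, 0]

# A batch of three samples stored as (images, labels) unpacks cleanly.
paired_batches = [([image] * 3, [label] * 3)]
paired_result = unpack_batches(paired_batches)

# A batch of three bare images has three items, so unpacking into two names fails.
images_only_batches = [[image] * 3]
try:
    unpack_batches(images_only_batches)
    images_only_error = None
except ValueError as err:
    images_only_error = str(err)

print(paired_result)
print(images_only_error)
```

A tensor batch of shape (batch_size, 30, 3) behaves analogously when unpacked into two names, which is one plausible way the training loop can fail on data that lacks labels.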

In the S.I. of your paper, the loss functions of the classifier are written as

L_class_comp = CE(C_real, \hat(C_real)) + lambda1 * CE(C_gen, \hat(C_gen))
L_class_atom = CE(A_real, \hat(A_real)) + lambda1 * CE(A_gen, \hat(A_gen))
L_class = L_class_atom + lambdaC * L_class_comp

On the other hand, in the code (train.py)

cat_loss_real = 0.3*(cat_loss_mg_real + cat_loss_mn_real + cat_loss_o_real) + cat_loss_mg_real2+cat_loss_mn_real2 + cat_loss_o_real

it seems L_class is given like

L_class = L_class_comp + 0.3 * L_class_atom
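
As a small numerical sketch of how the two weightings relate: the form quoted from the S.I. and the form read off train.py differ only by an overall factor if lambdaC happens to equal 1 / 0.3. Whether that is the value used in the paper is not stated in this thread, and the probabilities below are hypothetical numbers chosen only to produce two loss values.

```python
import math


def cross_entropy(probs, target):
    """Cross-entropy of one predicted distribution against a class index."""
    return -math.log(probs[target])


def class_loss_si(l_atom, l_comp, lambda_c):
    """Form quoted from the S.I.: L_class = L_class_atom + lambdaC * L_class_comp."""
    return l_atom + lambda_c * l_comp


def class_loss_code(l_atom, l_comp, atom_weight=0.3):
    """Form read off train.py: L_class = L_class_comp + 0.3 * L_class_atom."""
    return l_comp + atom_weight * l_atom


# Hypothetical classifier outputs, used only to get two loss values.
l_atom = cross_entropy([0.7, 0.2, 0.1], target=0)
l_comp = cross_entropy([0.1, 0.6, 0.3], target=1)

code_value = class_loss_code(l_atom, l_comp)
si_value = class_loss_si(l_atom, l_comp, lambda_c=1 / 0.3)

# The two forms agree up to the overall factor 0.3 when lambdaC = 1 / 0.3.
print(code_value, 0.3 * si_value)
```

For any other value of lambdaC the relative weighting of the atom and composition terms differs between the two forms, which is the discrepancy raised here.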

Furthermore, train.py does not include the loss term CE(A_gen, \hat(A_gen)). In train.py,

fake_mg_label, fake_mn_label, fake_o_label, fake_mg_cat, ... = net_Q(fake)

the outputs fake_mg_label, fake_mn_label, and fake_o_label are never used afterwards, so the

loss term CE(A_gen, \hat(A_gen)) is not implemented.

I would be very grateful for your reply.

Sincerely,

Yukihiro Okuno.

@1098994933

So much hard-coding and the lack of comments make the code difficult to use.

@sgbaird

sgbaird commented Jan 20, 2022

@jhwann @syaym any update on this? We could really use some instructions. On your part, please make sure someone can actually run the code by following the instructions without error (e.g. test by downloading a fresh copy from GitHub into a fresh conda environment).

@syaym
Collaborator

syaym commented Jan 21, 2022

@okunoyukihiro2

Regarding the training data, note that we have a separate routine for data pretreatment, because the capacity limit of this site prevents us from uploading the entire dataset. I think you are running into a problem because you have probably not run the code needed for data pretreatment. I suggest you run "5.make_comp_dict.py" ~ "7.make_label.py" in the preparing_dataset folder. Alternatively, we have now uploaded the full data to a different website, where you can download the already pretreated, trainable data: https://figshare.com/s/0dce6bb830ae1e392206.

For the second question, regarding the loss function of the classifier: the general form of L_class_atom has the lambda2*CE(A_gen, \hat(A_gen)) term, to be symmetric with L_class_comp, but the final value of lambda2 we used is 0 instead of 1 to ensure structural diversity, and this is why the CE(A_gen, \hat(A_gen)) term does not exist in the original code. In short, the value of the hyperparameter lambda2 in Table S1 of the SI file was a typo; the correction is now in progress with the journal and will be updated shortly.
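
A minimal sketch of the point about lambda2, assuming the general form described above; the function name `atom_class_loss` and the cross-entropy values are hypothetical and only illustrate that lambda2 = 0 removes the generated-sample term.

```python
def atom_class_loss(ce_real, ce_gen, lambda2):
    """General form: L_class_atom = CE(real atom labels) + lambda2 * CE(generated atom labels)."""
    return ce_real + lambda2 * ce_gen


ce_real, ce_gen = 0.42, 1.37  # hypothetical cross-entropy values

general = atom_class_loss(ce_real, ce_gen, lambda2=1.0)     # symmetric with L_class_comp
as_trained = atom_class_loss(ce_real, ce_gen, lambda2=0.0)  # generated term drops out

print(general, as_trained)
```

With lambda2 = 0 the loss depends only on the real-sample term, which is consistent with the generated atom labels going unused in train.py.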

Many thanks for your comments.

(PS. The above reply is an edited/corrected version of my earlier response.)

@sgbaird

sgbaird commented Jan 21, 2022

@syaym thank you for the response. I plan to give the code a try. Perhaps you could upload a copy of mgmno_2000.pickle to figshare, assuming it is less than 20 GB, and then include the link in the README?

@Z-Abbas

Z-Abbas commented Jan 26, 2022

Could anyone please advise where we can get the ".cif" and ".vasp" files?
@syaym are you going to upload the mgmno_2000.pickle file?

@syaym
Collaborator

syaym commented Jan 26, 2022

@sgbaird @Z-Abbas I uploaded the mgmno_2000.pickle file.

@sgbaird

sgbaird commented Jan 26, 2022

@syaym Wonderful. Thank you!

I see that you added the link to the README https://figshare.com/s/0dce6bb830ae1e392206

@Z-Abbas

Z-Abbas commented Feb 2, 2022

@syaym Thank you for the file.
I am now able to run "train.py" after preparing the dataset by running "5.make_comp_dict.py" ~ "7.make_label.py". After running "train.py", it creates two folders: 1. "model_cwgan_mgmno" and 2. "gen_image_cwgan_mgmno". The second one contains npy files that look like the attached screenshot.
Are these the x, y, and z axes? How can I view the newly generated crystal structures?
I would appreciate your earliest response.
[Attached screenshot: Capture]

@sgbaird

sgbaird commented Feb 2, 2022

@Z-Abbas you may consider emailing @syaym

@syaym
Collaborator

syaym commented Feb 2, 2022 via email

@Z-Abbas

Z-Abbas commented Feb 7, 2022

@syaym Could you please tell me how I can get the ".vasp" files for the list below?
vasp_list = glob.glob(vasp_path+'/*.vasp')

@syaym
Collaborator

syaym commented Feb 7, 2022

@Z-Abbas In the "preparing_dataset" folder, please run "5.make_comp_dict.py" ~ "7.make_label.py" to make the data-augmented mgmno dataset. We provided 'unique_sc_mgmno.npy' instead of the vasp files.

@Z-Abbas

Z-Abbas commented Feb 7, 2022

@syaym Thank you for your prompt response.
I have already run 5~7. I just want to follow the workflow from the start by importing the cif and vasp files, which is why I am looking for them. I have already found the cif files but am unable to find the vasp files.

@Z-Abbas

Z-Abbas commented Feb 8, 2022

@syaym I would appreciate it if it's possible for you to share the vasp files.

@syaym
Collaborator

syaym commented Feb 8, 2022

@Z-Abbas https://figshare.com/s/350a8ac4732de2da3a00

@Z-Abbas

Z-Abbas commented Feb 9, 2022

Much appreciated! Thank you!

@sgbaird

sgbaird commented Feb 9, 2022

@Z-Abbas would be interested to hear back once you're able to get it running from start to finish using the VASP files

@Z-Abbas

Z-Abbas commented Feb 9, 2022

@sgbaird sure :)

@Z-Abbas

Z-Abbas commented Feb 24, 2022

Hello! @syaym and @sgbaird
I have successfully run it from start to end using the VASP files.
At the end it generates "gen_images_x.npy" files.
Using one of these npy files (any single npy file), I ran the function below from view_atoms_mgmno.py:
def view_atoms_classifier(image,mg_label,mn_label, o_label, view=True):

When I print the atoms object, the output is:
Atoms(symbols='Mg4Mn4O', pbc=True, cell=[[6.622104644775391, 0.0, 0.0], [0.9188044602466798, 10.297901251162942, 0.0], [4.528352067461127, -7.3943218741460095, 15.134449311591837]])

and when I call "atoms.edit()", it shows the image in the GUI as attached.
[Attached image: Mg4Mn4O]

@syaym Now I am wondering how to check the validity of the generated atoms and convert them into structural form?
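
On the question of converting to a structural form, one option is to write the lattice and atom positions to a VASP POSCAR file, which most structure viewers can open. The sketch below is a plain-Python formatter, not code from the repository: the helper `to_poscar` and the three fractional coordinates are hypothetical, and only the cell is taken from the printed Atoms object above.

```python
def to_poscar(cell, symbols, frac_coords, comment="generated structure"):
    """Format a structure as VASP POSCAR text with fractional ("Direct") coordinates."""
    # POSCAR lists atoms grouped by element; keep the first-seen element order.
    order = []
    for symbol in symbols:
        if symbol not in order:
            order.append(symbol)

    lines = [comment, "1.0"]  # comment line, then the universal scaling factor
    for vec in cell:
        lines.append("  ".join(f"{x:.10f}" for x in vec))
    lines.append(" ".join(order))
    lines.append(" ".join(str(symbols.count(el)) for el in order))
    lines.append("Direct")
    for el in order:
        for symbol, pos in zip(symbols, frac_coords):
            if symbol == el:
                lines.append("  ".join(f"{x:.10f}" for x in pos))
    return "\n".join(lines) + "\n"


# Cell copied from the printed Atoms object; positions are hypothetical placeholders.
cell = [
    [6.622104644775391, 0.0, 0.0],
    [0.9188044602466798, 10.297901251162942, 0.0],
    [4.528352067461127, -7.3943218741460095, 15.134449311591837],
]
symbols = ["Mg", "Mn", "O"]
frac_coords = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.25, 0.25, 0.25]]

poscar_text = to_poscar(cell, symbols, frac_coords)
print(poscar_text)
```

Since the structure is already an ASE Atoms object, `ase.io.write` with the VASP format should produce an equivalent file directly; the manual version is shown only to make the file layout explicit.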

@syaym
Collaborator

syaym commented Feb 24, 2022

@Z-Abbas I think the generator is not yet trained enough to generate structures, so you should run more training epochs.

@Z-Abbas

Z-Abbas commented Feb 24, 2022

@syaym Thank you! Yes, I reduced n_epochs from 501 (in the code) to 300, and constraint_epoch from 10000 (in the code) to 5000. If I use the same number of epochs as in your code, will it generate structures similar to those given in the paper?

And am I viewing the atoms correctly with the "def view_atoms_classifier(image,mg_label,mn_label, o_label, view=True):" function?

@syaym
Collaborator

syaym commented Feb 24, 2022

@Z-Abbas the structures in the paper were post-processed with DFT optimization. Generated structures that have not undergone this post-processing may look somewhat weird.
And yes, it is right to use "view_atoms_classifier".

@Z-Abbas

Z-Abbas commented Feb 24, 2022

@syaym Thank you so much!
What about checking the validity of the generated molecules?
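
The thread does not say how validity was checked for the paper. One common first screen, sketched here under stated assumptions, is to reject structures in which any two atoms sit unrealistically close once periodic images are taken into account. The helper names, the 1.0 angstrom threshold, and the cubic test cell are hypothetical; checking only the 27 neighbouring cell images is an approximation that can miss the true shortest distance in strongly skewed cells.

```python
import itertools
import math


def frac_to_cart(frac, cell):
    """Convert a fractional vector to Cartesian using row-vector lattice `cell`."""
    return [sum(frac[i] * cell[i][k] for i in range(3)) for k in range(3)]


def min_pair_distance(cell, frac_coords):
    """Smallest distance between any two atoms over the 27 neighbouring cell images."""
    shifts = list(itertools.product((-1, 0, 1), repeat=3))
    best = math.inf
    for a, b in itertools.combinations(range(len(frac_coords)), 2):
        for shift in shifts:
            delta = [frac_coords[b][i] - frac_coords[a][i] + shift[i] for i in range(3)]
            best = min(best, math.dist(frac_to_cart(delta, cell), [0.0, 0.0, 0.0]))
    return best


def looks_plausible(cell, frac_coords, min_allowed=1.0):
    """Flag structures whose closest atom pair is nearer than `min_allowed` angstrom."""
    return min_pair_distance(cell, frac_coords) >= min_allowed


# Hypothetical 4 angstrom cubic cell with two test arrangements.
cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
well_separated = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
overlapping = [[0.0, 0.0, 0.0], [0.95, 0.0, 0.0]]  # 0.2 angstrom apart across the boundary

print(looks_plausible(cubic, well_separated), looks_plausible(cubic, overlapping))
```

Passing such a screen is necessary rather than sufficient: as noted earlier in the thread, the structures in the paper were additionally post-processed with DFT optimization.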

@syaym closed this as completed Aug 3, 2022

@loldeng

loldeng commented Nov 26, 2024

Hello! @syaym and @sgbaird I have successfully run it from start to end using the VASP files. At the end it generates "gen_images_x.npy" files. Using one of these npy files (any single npy file), I ran the function below from view_atoms_mgmno.py: def view_atoms_classifier(image,mg_label,mn_label, o_label, view=True):

When I print the atoms object, the output is: Atoms(symbols='Mg4Mn4O', pbc=True, cell=[[6.622104644775391, 0.0, 0.0], [0.9188044602466798, 10.297901251162942, 0.0], [4.528352067461127, -7.3943218741460095, 15.134449311591837]])

and when I call "atoms.edit()", it shows the image in the GUI as attached. Mg4Mn4O

@syaym Now I am wondering how to check the validity of the generated atoms and convert them into structural form?

Dear Zeeshan Abbas, I would like to ask about viewing the newly generated crystal structures. You mentioned taking a single npy file and running a function inside view_atoms_mgmno.py, but I am not sure how this works. Looking forward to your reply.

@loldeng

loldeng commented Nov 26, 2024

would be interested to hear back once you're able to get it running from start to finish using the VASP files

Dear Sterling G. Baird, may I ask whether you are still researching this field? I have some issues running this code, but the publisher has not responded.
