V1.0 stops after first refinement iteration #44

Closed
rfronzes opened this issue May 17, 2023 · 18 comments

rfronzes commented May 17, 2023

Dear ModelAngelo developers,

First, many thanks for the amazing work you are doing!

We have a bug with the new version of ModelAngelo (v1.0).

The prediction runs fine through the C-alpha step and the first refinement round. At the end of the first round, we get an error message and the program stops (see below).

Many thanks

Rémi

2023-05-17 at 14:25:23 | INFO | ModelAngelo with args: {'volume_path': 'map.mrc', 'protein_fasta': 'AdhE-SP.fa', 'rna_fasta': None, 'dna_fasta': None, 'output_dir': 'test2', 'mask_path': None, 'device': None, 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': False, 'func': <function main at 0x7f2c33703d00>}
2023-05-17 at 14:25:23 | INFO | Initial C-alpha prediction with args: {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': 'map.mrc', 'output_path': 'test2/see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-05-17 at 14:25:24 | INFO | Using model file /srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/model.py
2023-05-17 at 14:25:24 | INFO | Using checkpoint file /srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/chkpt.torch
2023-05-17 at 14:25:25 | INFO | Input structure has shape: (186, 186, 186)
2023-05-17 at 14:25:25 | INFO | Running with these arguments:
2023-05-17 at 14:25:25 | INFO | {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': 'map.mrc', 'output_path': 'test2/see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-05-17 at 14:28:15 | INFO | Model prediction done, took 170.09 seconds for 512 sliding windows
2023-05-17 at 14:28:15 | INFO | Average time is 332.199 ms
2023-05-17 at 14:28:15 | INFO | Starting Cα grid to points...
2023-05-17 at 14:28:16 | INFO | Have 25615 Cα points before pruning and 5318 after pruning
2023-05-17 at 14:28:18 | INFO | Starting P grid to points...
2023-05-17 at 14:28:18 | INFO | Have 1448 P points before pruning and 354 after pruning
2023-05-17 at 14:28:18 | INFO | Finished inference!
2023-05-17 at 14:28:18 | INFO | GNN model refinement round 1 with args: {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': True, 'seq_attention_batch_size': 200, 'fp16': False, 'batch_size': 1, 'voxel_size': 1.0, 'map': 'map.mrc', 'protein_fasta': 'AdhE-SP.fa', 'rna_fasta': None, 'dna_fasta': None, 'struct': 'test2/see_alpha_output/see_alpha_merged_output.cif', 'output_dir': 'test2/gnn_output_round_1', 'model_dir': '/srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/gnn', 'device': None, 'write_hmm_profiles': False, 'refine': False}
2023-05-17 at 14:28:18 | INFO | Loaded module from step: 483863
2023-05-17 at 14:33:48 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

File "/app/anaconda3/envs/model_angelo/bin/model_angelo", line 33, in <module>
sys.exit(load_entry_point('model-angelo==1.0.0', 'console_scripts', 'model_angelo')())
│ │ └ <function importlib_load_entry_point at 0x7f2d79967d90>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/main.py", line 52, in main
args.func(args)
│ │ └ Namespace(volume_path='map.mrc', protein_fasta='AdhE-SP.fa', rna_fasta=None, dna_fasta=None, output_dir='test2', mask_path=No...
│ └ <function main at 0x7f2c33703d00>
└ Namespace(volume_path='map.mrc', protein_fasta='AdhE-SP.fa', rna_fasta=None, dna_fasta=None, output_dir='test2', mask_path=No...

File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/apps/build.py", line 241, in main
gnn_output = gnn_infer(gnn_infer_args)
│ └ {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': Tru...
└ <function infer at 0x7f2c33efa9e0>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/gnn/inference.py", line 184, in infer
final_results_to_cif(
└ <function final_results_to_cif at 0x7f2c33703520>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/gnn/flood_fill.py", line 251, in final_results_to_cif
final_results["aa_logits"][existence_mask][c] for c in pruned_chains
│ └ array([ True, True, True, ..., True, True, True])
└ {'pred_positions': array([[ 97.94374 , 121.828865, 35.82869 ],
[ 93.678734, 123.0471 , 36.359264],
[ 98.8719...

NameError: name 'pruned_chains' is not defined

rfronzes changed the title from "Bug in v1" to "V1.0 stops after first refinement iteration" on May 17, 2023
jamaliki added a commit that referenced this issue May 17, 2023
@jamaliki
Collaborator

Hi,

This is strange. I have made a change; could you please update your installation and try again?

Best,
Kiarash.

@rfronzes
Author

Hi,

Unfortunately, it still crashes at the same point, but with a different error message.

Many thanks

Rémi


2023-05-17 at 16:11:49 | INFO | ModelAngelo with args: {'volume_path': 'map.mrc', 'protein_fasta': 'AdhE-SP.fa', 'rna_fasta': None, 'dna_fasta': None, 'output_dir': 'test-commit', 'mask_path': None, 'device': None, 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': False, 'func': <function main at 0x7f30b7b0bc70>}
2023-05-17 at 16:11:49 | INFO | Initial C-alpha prediction with args: {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': 'map.mrc', 'output_path': 'test-commit/see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-05-17 at 16:11:49 | INFO | Using model file /srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/model.py
2023-05-17 at 16:11:49 | INFO | Using checkpoint file /srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/chkpt.torch
2023-05-17 at 16:11:51 | INFO | Input structure has shape: (186, 186, 186)
2023-05-17 at 16:11:51 | INFO | Running with these arguments:
2023-05-17 at 16:11:51 | INFO | {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': 'map.mrc', 'output_path': 'test-commit/see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-05-17 at 16:14:44 | INFO | Model prediction done, took 172.46 seconds for 512 sliding windows
2023-05-17 at 16:14:44 | INFO | Average time is 336.832 ms
2023-05-17 at 16:14:44 | INFO | Starting Cα grid to points...
2023-05-17 at 16:14:45 | INFO | Have 25615 Cα points before pruning and 5318 after pruning
2023-05-17 at 16:14:46 | INFO | Starting P grid to points...
2023-05-17 at 16:14:47 | INFO | Have 1448 P points before pruning and 354 after pruning
2023-05-17 at 16:14:47 | INFO | Finished inference!
2023-05-17 at 16:14:47 | INFO | GNN model refinement round 1 with args: {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': True, 'seq_attention_batch_size': 200, 'fp16': False, 'batch_size': 1, 'voxel_size': 1.0, 'map': 'map.mrc', 'protein_fasta': 'AdhE-SP.fa', 'rna_fasta': None, 'dna_fasta': None, 'struct': 'test-commit/see_alpha_output/see_alpha_merged_output.cif', 'output_dir': 'test-commit/gnn_output_round_1', 'model_dir': '/srv/home/rfuser/.cache/torch/hub/checkpoints/model_angelo_v1.0/nucleotides/gnn', 'device': None, 'write_hmm_profiles': False, 'refine': False}
2023-05-17 at 16:14:47 | INFO | Loaded module from step: 483863
2023-05-17 at 16:20:19 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

File "/app/anaconda3/envs/model_angelo/bin/model_angelo", line 33, in <module>
sys.exit(load_entry_point('model-angelo==1.0.0', 'console_scripts', 'model_angelo')())
│ │ └ <function importlib_load_entry_point at 0x7f31fdd1fd90>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/main.py", line 52, in main
args.func(args)
│ │ └ Namespace(volume_path='map.mrc', protein_fasta='AdhE-SP.fa', rna_fasta=None, dna_fasta=None, output_dir='test-commit', mask_p...
│ └ <function main at 0x7f30b7b0bc70>
└ Namespace(volume_path='map.mrc', protein_fasta='AdhE-SP.fa', rna_fasta=None, dna_fasta=None, output_dir='test-commit', mask_p...

File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/apps/build.py", line 241, in main
gnn_output = gnn_infer(gnn_infer_args)
│ └ {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': Tru...
└ <function infer at 0x7f30b82ca950>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/gnn/inference.py", line 184, in infer
final_results_to_cif(
└ <function final_results_to_cif at 0x7f30b7b0b490>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/gnn/flood_fill.py", line 291, in final_results_to_cif
fix_chains_output = fix_chains_pipeline(
└ <function fix_chains_pipeline at 0x7f30b7b0add0>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/utils/hmm_sequence_align.py", line 521, in fix_chains_pipeline
best_match_output = best_match_to_sequences(
└ <function best_match_to_sequences at 0x7f30b7b0a290>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/utils/hmm_sequence_align.py", line 211, in best_match_to_sequences
hmm_alignment = get_hmm_alignment(
└ <function get_hmm_alignment at 0x7f30b7b0a200>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/utils/hmm_sequence_align.py", line 50, in get_hmm_alignment
msas = pyhmmer.hmmer.hmmalign(
│ │ └ <function hmmalign at 0x7f30b7b096c0>
│ └ <module 'pyhmmer.hmmer' from '/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/pyhmmer/hmmer.py'>
└ <module 'pyhmmer' from '/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/pyhmmer/__init__.py'>
File "/app/anaconda3/envs/model_angelo/lib/python3.10/site-packages/pyhmmer/hmmer.py", line 1369, in hmmalign
traces = aligner.compute_traces(hmm, sequences)
│ │ │ └ DigitalSequenceBlock(pyhmmer.easel.Alphabet.amino(), [<pyhmmer.easel.DigitalSequence object at 0x7f30b442c480>])
│ │ └ <pyhmmer.plan7.HMM object at 0x7f3076949440>
│ └ <method 'compute_traces' of 'pyhmmer.plan7.TraceAligner' objects>
└ TraceAligner()
File "pyhmmer/plan7.pyx", line 8440, in pyhmmer.plan7.TraceAligner.compute_traces
cpdef Traces compute_traces(self, HMM hmm, DigitalSequenceBlock sequences):
│ └ <class 'pyhmmer.plan7.HMM'>
└ <class 'pyhmmer.plan7.Traces'>
File "pyhmmer/plan7.pyx", line 8480, in pyhmmer.plan7.TraceAligner.compute_traces
raise ValueError(f"Invalid HMM: {err_msg}")

ValueError: Invalid HMM: TMD should be 0 for last node

@jamaliki
Collaborator

I have not seen this before. Could you send me the fasta file you used?

@rfronzes
Author

rfronzes commented May 17, 2023

It crashes at the same point even without a FASTA file. Could it come from the map?
I tested two different maps. The same maps and FASTA files worked with the previous version of ModelAngelo.

@jamaliki
Collaborator

It crashes without the Fasta file as well? Could you please provide the log file for that run as well?

Are you willing to share the map with me? I need the map and fasta to be able to see what the issue is.

@rfronzes
Author

rfronzes commented May 17, 2023

Can I send you the map and FASTA file by email?

I tested build_no_seq again. Now it is working!

It is still not working with the FASTA file.

@martinpacesa

martinpacesa commented May 17, 2023

I am also getting an error when trying to build with RNA and DNA nucleotides. The previous version of ModelAngelo ran fine on the same map with just protein:

``2023-05-17 at 18:06:00 | INFO | ModelAngelo with args: {'volume_path': '/local/Maps/cryosparc_P2_J341_003_volume_map.mrc', 'protein_fasta': '/local/seq/test.fasta', 'rna_fasta': '/local/seq/test_RNA.fasta', 'dna_fasta': '/local/seq/test_DNA.fasta', 'output_dir': '.', 'mask_path': None, 'device': None, 'config_path': None, 'model_bundle_name': 'nucleotides', 'model_bundle_path': None, 'keep_intermediate_results': False, 'pipeline_control': False, 'func': <function main at 0x2b48b9c3aaf0>}
2023-05-17 at 18:06:01 | INFO | Initial C-alpha prediction with args: {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/local/Pipelines/ModelAngelo/model_angelo_weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': '/local/Maps/cryosparc_P2_J341_003_volume_map.mrc', 'output_path': './see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-05-17 at 18:06:01 | INFO | Using model file /local/Pipelines/ModelAngelo/model_angelo_weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/model.py
2023-05-17 at 18:06:01 | INFO | Using checkpoint file /local/Pipelines/ModelAngelo/model_angelo_weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha/chkpt.torch
2023-05-17 at 18:06:06 | INFO | Input structure has shape: (194, 194, 194)
2023-05-17 at 18:06:06 | INFO | Running with these arguments:
2023-05-17 at 18:06:06 | INFO | {'model_checkpoint': 'chkpt.torch', 'bfactor': 0, 'batch_size': 4, 'box_size': 64, 'stride': 16, 'dont_mask_input': True, 'threshold': 0.05, 'save_real_coordinates': False, 'save_cryo_em_grid': False, 'do_nucleotides': True, 'save_backbone_trace': False, 'save_output_grid': False, 'crop': 6, 'log_dir': '/local/Pipelines/ModelAngelo/model_angelo_weights/hub/checkpoints/model_angelo_v1.0/nucleotides/c_alpha', 'map_path': '/local/Maps/cryosparc_P2_J341_003_volume_map.mrc', 'output_path': './see_alpha_output', 'mask_path': None, 'device': None, 'auto_mask': False}
2023-05-17 at 18:10:32 | INFO | Model prediction done, took 265.69 seconds for 729 sliding windows
2023-05-17 at 18:10:32 | INFO | Average time is 364.460 ms
2023-05-17 at 18:10:32 | INFO | Starting Cα grid to points...
2023-05-17 at 18:10:33 | INFO | Have 15582 Cα points before pruning and 1887 after pruning
2023-05-17 at 18:10:34 | INFO | Starting P grid to points...
2023-05-17 at 18:10:34 | INFO | Have 6515 P points before pruning and 303 after pruning
2023-05-17 at 18:10:35 | INFO | Finished inference!
2023-05-17 at 18:10:35 | INFO | GNN model refinement round 1 with args: {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': True, 'seq_attention_batch_size': 200, 'fp16': False, 'batch_size': 1, 'voxel_size': 1.0, 'map': '/local/Maps/cryosparc_P2_J341_003_volume_map.mrc', 'protein_fasta': '/local/seq/test.fasta', 'rna_fasta': '/local/seq/test_RNA.fasta', 'dna_fasta': '/local/seq/test_DNA.fasta', 'struct': './see_alpha_output/see_alpha_merged_output.cif', 'output_dir': './gnn_output_round_1', 'model_dir': '/local/Pipelines/ModelAngelo/model_angelo_weights/hub/checkpoints/model_angelo_v1.0/nucleotides/gnn', 'device': None, 'write_hmm_profiles': False, 'refine': False}
2023-05-17 at 18:10:35 | INFO | Loaded module from step: 483863
2023-05-17 at 18:13:06 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

File "/home/pacesa/miniconda3/envs/model_angelo/bin/model_angelo", line 33, in <module>
sys.exit(load_entry_point('model-angelo==1.0.0', 'console_scripts', 'model_angelo')())
│ │ └ <function importlib_load_entry_point at 0x2b4809d32280>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/home/pacesa/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-1.0.0-py3.9.egg/model_angelo/main.py", line 52, in main
args.func(args)
│ │ └ Namespace(volume_path='/local/Maps/cryosparc_P2_J341_003_volume_map.mrc', protein_fas...
│ └ <function main at 0x2b48b9c3aaf0>
└ Namespace(volume_path='/local/Maps/cryosparc_P2_J341_003_volume_map.mrc', protein_fas...

File "/home/pacesa/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-1.0.0-py3.9.egg/model_angelo/apps/build.py", line 241, in main
gnn_output = gnn_infer(gnn_infer_args)
│ └ {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': Tru...
└ <function infer at 0x2b48b8ec4f70>
File "/home/pacesa/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-1.0.0-py3.9.egg/model_angelo/gnn/inference.py", line 184, in infer
final_results_to_cif(
└ <function final_results_to_cif at 0x2b48b9c3a8b0>
File "/home/pacesa/miniconda3/envs/model_angelo/lib/python3.9/site-packages/model_angelo-1.0.0-py3.9.egg/model_angelo/gnn/flood_fill.py", line 251, in final_results_to_cif
final_results["aa_logits"][existence_mask][c] for c in pruned_chains
│ └ array([ True, True, True, ..., True, True, True])
└ {'pred_positions': array([[149.68655 , 158.68929 , 80.635506],
[152.87912 , 157.20757 , 82.420456],
[151.5113...

NameError: name 'pruned_chains' is not defined
``

@jamaliki
Collaborator

> Can I send you the map and FASTA file by email?
>
> I tested build_no_seq again. Now it is working!
>
> It is still not working with the FASTA file.

Yes please, email is [email protected]

@jamaliki
Collaborator

Hi @martinpacesa,

Sorry, this is a bug. It has been fixed; could you please pull the repo again, update the installation, and try again?

@IMhallelujahxn

I got a similar error after the first refinement iteration, shown below. Could you help figure out the problem?

2023-05-18 at 09:56:47 | INFO | Finished inference!
2023-05-18 at 09:56:47 | INFO | GNN model refinement round 1 with args: {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': True, 'seq_atte$
2023-05-18 at 09:56:47 | INFO | Loaded module from step: 483863
2023-05-18 at 09:57:59 | ERROR | Error in ModelAngelo
Traceback (most recent call last):

File "/home/hxn/anaconda3/envs/model_angelo/bin/model_angelo", line 33, in <module>
sys.exit(load_entry_point('model-angelo==1.0.0', 'console_scripts', 'model_angelo')())
│ │ └ <function importlib_load_entry_point at 0x7fbcf6a5bd90>
│ └ <built-in function exit>
└ <module 'sys' (built-in)>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/main.py", line 52, in main
args.func(args)
│ │ └ Namespace(volume_path='J345_map.mrc', protein_fasta='BC-preF-3.fasta', rna_fasta=None, dna_fasta=None, output_dir='angelo_out...
│ └ <function main at 0x7fbbb14d1ea0>
└ Namespace(volume_path='J345_map.mrc', protein_fasta='BC-preF-3.fasta', rna_fasta=None, dna_fasta=None, output_dir='angelo_out...

File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/apps/build.py", line 241, in main
gnn_output = gnn_infer(gnn_infer_args)
│ └ {'num_rounds': 3, 'crop_length': 200, 'repeat_per_residue': 1, 'esm_model': 'esm1b_t33_650M_UR50S', 'aggressive_pruning': Tru...
└ <function infer at 0x7fbbb2054310>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/gnn/inference.py", line 184, in infer
final_results_to_cif(
└ <function final_results_to_cif at 0x7fbbb14d16c0>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/gnn/flood_fill.py", line 291, in final_results_to_cif
fix_chains_output = fix_chains_pipeline(
└ <function fix_chains_pipeline at 0x7fbbb14d1000>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/utils/hmm_sequence_align.py", line 521, in fix_chains_pipeline
best_match_output = best_match_to_sequences(
└ <function best_match_to_sequences at 0x7fbbb14d04c0>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/utils/hmm_sequence_align.py", line 211, in best_match_to_sequences
hmm_alignment = get_hmm_alignment(
└ <function get_hmm_alignment at 0x7fbbb14d0430>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/model_angelo-1.0.0-py3.10.egg/model_angelo/utils/hmm_sequence_align.py", line 50, in get_hmm_alignment
msas = pyhmmer.hmmer.hmmalign(
│ │ └ <function hmmalign at 0x7fbbb20af880>
│ └ <module 'pyhmmer.hmmer' from '/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/pyhmmer/hmmer.py'>
└ <module 'pyhmmer' from '/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/pyhmmer/__init__.py'>
File "/home/hxn/anaconda3/envs/model_angelo/lib/python3.10/site-packages/pyhmmer/hmmer.py", line 1369, in hmmalign
traces = aligner.compute_traces(hmm, sequences)
│ │ │ └ DigitalSequenceBlock(pyhmmer.easel.Alphabet.amino(), [<pyhmmer.easel.DigitalSequence object at 0x7fbba0729b80>])
│ │ └ <pyhmmer.plan7.HMM object at 0x7fbbb2a3cc80>
│ └ <method 'compute_traces' of 'pyhmmer.plan7.TraceAligner' objects>
└ TraceAligner()
File "pyhmmer/plan7.pyx", line 8440, in pyhmmer.plan7.TraceAligner.compute_traces
cpdef Traces compute_traces(self, HMM hmm, DigitalSequenceBlock sequences):
│ └ <class 'pyhmmer.plan7.HMM'>
└ <class 'pyhmmer.plan7.Traces'>
File "pyhmmer/plan7.pyx", line 8480, in pyhmmer.plan7.TraceAligner.compute_traces
raise ValueError(f"Invalid HMM: {err_msg}")

ValueError: Invalid HMM: TMD should be 0 for last node

@martinpacesa

This is now resolved for me, thank you!

@jamaliki
Collaborator

Hi @IMhallelujahxn ,

I am not sure what the problem is. It could be either

  1. Something strange with your FASTA file, or
  2. You are using an old version of pyHMMER

To find out the problem, could you please:

  1. Send me (or upload here) your FASTA file and
  2. With the model_angelo conda environment activated, run the following command and report back the results:
python -c 'import pyhmmer; print(pyhmmer.__version__)'

@IMhallelujahxn

Hi @jamaliki,
I used a FASTA file downloaded from the PDB, so that is probably not the problem.
The command returned version 0.8.0.

@jamaliki
Collaborator

@IMhallelujahxn could you please upload the FASTA file anyway? The PDB uses a myriad of different conventions. For example, if the FASTA file contains "X" amino acids, it won't work. If it is too much trouble to upload the FASTA file, then please send me the link you used to download it. I need to be able to reproduce your problem so that I can help :)
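
For anyone who wants to check a FASTA file locally before uploading it, here is a minimal sketch of such a scan. It is not part of ModelAngelo: the function name `find_nonstandard_residues` is hypothetical, and treating only the 20 standard one-letter amino-acid codes as acceptable is an assumption (the only case named above is "X").

```python
# Hypothetical helper, not part of ModelAngelo.
# Assumption: only the 20 standard one-letter amino-acid codes are acceptable.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def find_nonstandard_residues(fasta_text):
    """Map each FASTA header to the sorted non-standard characters found in its sequence."""
    problems = {}
    header = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; remember its header (without the '>').
            header = line[1:]
            continue
        bad = {ch for ch in line.upper() if ch not in STANDARD_AA}
        if bad:
            problems.setdefault(header, set()).update(bad)
    return {h: sorted(chars) for h, chars in problems.items()}


if __name__ == "__main__":
    example = ">chain_A\nMKTAYIAKQR\n>chain_B\nMKXXAYIAKQR\n"
    print(find_nonstandard_residues(example))
```

To scan a real file, read it first and pass in the text, for example `find_nonstandard_residues(open("my_sequences.fasta").read())`, where the file name is a placeholder. An empty result means no record contains characters outside the assumed standard set.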

@IMhallelujahxn

@jamaliki
I have sent the FASTA file by email.

@jamaliki
Collaborator

Thank you @IMhallelujahxn !

The error "ValueError: Invalid HMM: TMD should be 0 for last node" is caused by pyHMMER version 0.8.0.

@rfronzes and @IMhallelujahxn, to fix this problem, please revert to pyHMMER 0.7.1 like so:

pip install pyhmmer==0.7.1 -U
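
To confirm afterwards which version the environment actually picks up, a small check like the following may help. This is only a sketch: the helper names are hypothetical, and the single "affected" version is the 0.8.0 reported in this thread, not an exhaustive list.

```python
from importlib.metadata import PackageNotFoundError, version

# Version reported in this thread as triggering the "Invalid HMM" error.
KNOWN_BAD = "0.8.0"


def installed_pyhmmer_version():
    """Return the installed pyhmmer version string, or None if it is not installed."""
    try:
        return version("pyhmmer")
    except PackageNotFoundError:
        return None


def pyhmmer_status(installed):
    """Classify a version string against the one reported as problematic here."""
    if installed is None:
        return "not installed"
    if installed == KNOWN_BAD:
        return "affected"
    return "not the affected version"


if __name__ == "__main__":
    v = installed_pyhmmer_version()
    print(v, "->", pyhmmer_status(v))
```

Run it with the model_angelo conda environment activated, so that it inspects the same installation that ModelAngelo uses.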

@jamaliki
Collaborator

This is now fixed as of v1.0.1.
