Error running atacworks train #248

Open
albertdyu opened this issue May 22, 2022 · 2 comments

@albertdyu

Hi there,

I'm running atacworks on some Drosophila ATAC-seq data we had generated, and it's throwing an error at me!

atacworks train \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanbw ClkZT14_1.sort.bam.cutsites.smoothed_100.coverage.bw \
    --cleanpeakfile ClkZT14_1.sort.bam.peaks.bed \
    --genome dm6.chrom.sizes \
    --val_chrom chr2R \
    --holdout_chrom chr3L \
    --out_home "./" \
    --exp_name "856_train" \
    --distributed

INFO:2022-05-22 17:50:19,291:AtacWorks-peak2bw] Reading input file
INFO:2022-05-22 17:50:19,295:AtacWorks-peak2bw] Read 15265 peaks.
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Adding score
INFO:2022-05-22 17:50:19,297:AtacWorks-peak2bw] Writing peaks to bedGraph file
Discarding 0 entries outside sizes file.
INFO:2022-05-22 17:50:19,335:AtacWorks-peak2bw] Writing peaks to bigWig file ./856_train_2022.05.22_17.50/bigwig_peakfiles/ClkZT14_1.sort.bam.peaks.bed.bw
INFO:2022-05-22 17:50:19,364:AtacWorks-peak2bw] Done!
INFO:2022-05-22 17:50:19,367:AtacWorks-intervals] Generating training intervals
INFO:2022-05-22 17:50:20,831:AtacWorks-intervals] Generating val intervals
INFO:2022-05-22 17:50:20,840:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Read 1691 intervals
INFO:2022-05-22 17:50:20,841:AtacWorks-bw2h5] Selecting intervals with nonzero coverage
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in
sys.exit(main())
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 409, in main
prefix, args.pad)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bw2h5.py", line 73, in bw2h5
intervals, noisybw)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bigwigio.py", line 106, in check_bigwig_intervals_nonzero
check_bigwig_nonzero, axis=1, args=(bw,))
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/frame.py", line 6906, in apply
return op.get_result()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 186, in get_result
return self.apply_standard()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 292, in apply_standard
self.apply_series_generator()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 321, in apply_series_generator
results[i] = self.f(v)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/pandas/core/apply.py", line 112, in f
return func(x, *args, **kwds)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/io/bigwigio.py", line 90, in check_bigwig_nonzero
result = bw.values(interval[0], interval[1], interval[2])
RuntimeError: ('Invalid interval bounds!', 'occurred at index 1680')

Any thoughts on what could be going wrong?

@avantikalal
Contributor

@albertdyu, it looks like there's a problem reading the values for the 1680th training interval. If you look at the output files generated by this command, you should find an intervals/ folder containing a file with "train" in the name; it should have 1691 rows. Check the 1680th row of that file. The issue could be that this chromosome name is not included in your bigWig file, or that your bigWig file was produced using a genome assembly different from dm6.chrom.sizes.
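
A minimal sketch of that check in Python, assuming the intervals file is a three-column, tab-separated BED-like file with no header (the intervals filename below is a placeholder):

import pandas as pd
import pyBigWig

# Placeholder paths: point these at the "train" intervals file in the run's
# intervals/ folder and at the bigWig passed to --noisybw or --cleanbw.
intervals = pd.read_csv("intervals/856_train.training_intervals.bed",
                        sep="\t", header=None, names=["chrom", "start", "end"])
bw = pyBigWig.open("ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw")
sizes = bw.chroms()  # dict: chromosome name -> length recorded in the bigWig

for i, row in intervals.iterrows():
    if row.chrom not in sizes:
        print(f"row {i}: chromosome {row.chrom} not present in the bigWig")
    elif row.end > sizes[row.chrom]:
        print(f"row {i}: end {row.end} > bigWig length {sizes[row.chrom]}")

bw.close()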

@albertdyu
Author

That did the trick! I accidentally used a sizes file with slightly different chromosome sizes relative to the assembly I mapped to - good catch! Thank you!
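
One way to keep the two in sync is to derive the sizes file straight from the BAM header, so it always matches the assembly the reads were mapped against. A minimal sketch, assuming pysam is available (the output filename is arbitrary):

import pysam

# Write "chrom<TAB>length" lines taken from the BAM's @SQ header records,
# so the sizes file matches the assembly the BAM was aligned against.
with pysam.AlignmentFile("ClkZT14_1.sort.bam", "rb") as bam, \
        open("dm6_from_bam.chrom.sizes", "w") as out:
    for sq in bam.header.to_dict()["SQ"]:
        out.write(f"{sq['SN']}\t{sq['LN']}\n")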

I was able to successfully train a model, but now I'm having some trouble with denoising, haha...

atacworks denoise \
    --noisybw ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw \
    --genome gfachrome.sizes \
    --weights_path ./856_train_latest/model_best.pth.tar \
    --out_home "./" \
    --exp_name "856_ZT14_2_denoise" \
    --distributed \
    --num_workers 0

INFO:2022-05-22 19:19:49,827:AtacWorks-intervals] Generating intervals tiling across all chromosomes in sizes file: gfachrome.sizes
INFO:2022-05-22 19:19:49,841:AtacWorks-intervals] Done!
INFO:2022-05-22 19:19:49,841:AtacWorks-bw2h5] Reading intervals
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Read 2747 intervals
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Writing data in 3 batches.
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] Extracting data for each batch and writing to h5 file
INFO:2022-05-22 19:19:49,842:AtacWorks-bw2h5] batch 0 of 3
INFO:2022-05-22 19:19:58,719:AtacWorks-bw2h5] Done! Saved to ./856_ZT14_2_denoise_2022.05.22_19.19/bw2h5/ClkZT14_2.sort.bam.cutsites.smoothed_100.coverage.bw.denoise.h5
INFO:2022-05-22 19:19:58,719:AtacWorks-main] Checking input files for compatibility
Building model: resnet ...
Loading model weights from ./856_train_latest/model_best.pth.tar...
Finished loading.
Finished building.
Inference -------------------- [ 0/2747]
Inference -------------------- [ 50/2747]
Inference #------------------- [ 100/2747]
Inference #------------------- [ 150/2747]
Inference #------------------- [ 200/2747]
Inference ##------------------ [ 250/2747]
Inference ##------------------ [ 300/2747]
Inference ###----------------- [ 350/2747]
Inference ###----------------- [ 400/2747]
Inference ###----------------- [ 450/2747]
Inference ####---------------- [ 500/2747]
Inference ####---------------- [ 550/2747]
Inference ####---------------- [ 600/2747]
Inference #####--------------- [ 650/2747]
Inference #####--------------- [ 700/2747]
Inference #####--------------- [ 750/2747]
Inference ######-------------- [ 800/2747]
Inference ######-------------- [ 850/2747]
Inference #######------------- [ 900/2747]
Inference #######------------- [ 950/2747]
Inference #######------------- [1000/2747]
Inference ########------------ [1050/2747]
Inference ########------------ [1100/2747]
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/bin/atacworks", line 8, in
sys.exit(main())
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 569, in main
worker(args.gpu_idx, ngpus_per_node, args, res_queue)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/worker.py", line 290, in infer_worker
pad=args.pad)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/atacworks/dl4atac/infer.py", line 80, in infer
res_queue.put((idxes, batch_res))
File "", line 2, in put
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 772, in _callmethod
raise convert_to_error(kind, result)
multiprocessing.managers.RemoteError:

Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 228, in serve_client
request = recv()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 282, in rebuild_storage_fd
fd = df.detach()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/resource_sharer.py", line 58, in detach
return reduction.recv_handle(conn)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/reduction.py", line 182, in recv_handle
return recvfds(s, 1)[0]
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/reduction.py", line 161, in recvfds
len(ancdata))
RuntimeError: received 0 items of ancdata

Process Process-2:
Traceback (most recent call last):
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/site-packages/scripts/main.py", line 217, in writer
if not res_queue.empty():
File "", line 2, in empty
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/managers.py", line 756, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/home/albz/miniconda3/envs/atacworks/lib/python3.6/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

It writes a half-finished bedGraph file before crashing.

I'm running:

CUDA 11.6
PyTorch 1.7.1
Python 3.6.7

Any thoughts? I appreciate your help!
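
For what it's worth, "RuntimeError: received 0 items of ancdata" typically points at file-descriptor exhaustion when torch.multiprocessing ships tensors between processes using its default file_descriptor sharing strategy. A minimal sketch of two common mitigations, assuming that is the cause here (not confirmed for this atacworks run):

import resource
import torch.multiprocessing as mp

# Mitigation 1: raise this process's soft open-file limit up to its hard limit.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

# Mitigation 2: pass tensors through the filesystem instead of file descriptors.
mp.set_sharing_strategy("file_system")

Since atacworks is invoked as a CLI rather than as a library, raising the shell's open-file limit with ulimit -n before running may be the more practical of the two.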
