Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NextPolish #398

Closed
3 tasks
laraPPr opened this issue Aug 23, 2024 · 24 comments · Fixed by easybuilders/easybuild-easyconfigs#21265 or easybuilders/easybuild-easyconfigs#21588
Closed
3 tasks

NextPolish #398

laraPPr opened this issue Aug 23, 2024 · 24 comments · Fixed by easybuilders/easybuild-easyconfigs#21265 or easybuilders/easybuild-easyconfigs#21588
Assignees
Labels
difficulty: easy software that should be easy to support new new software priority: ASAP site:ugent Software installation request for UGent Tier-2

Comments

@laraPPr
Copy link
Collaborator

laraPPr commented Aug 23, 2024

@laraPPr laraPPr added difficulty: easy software that should be easy to support new new software priority: ASAP site:ugent Software installation request for UGent Tier-2 labels Aug 23, 2024
@pavelToman pavelToman self-assigned this Aug 26, 2024
pavelToman added a commit that referenced this issue Aug 26, 2024
pavelToman added a commit that referenced this issue Aug 26, 2024
@pavelToman
Copy link
Collaborator

@laraPPr
Copy link
Collaborator Author

laraPPr commented Aug 27, 2024

installing now

@laraPPr laraPPr reopened this Aug 27, 2024
@laraPPr
Copy link
Collaborator Author

laraPPr commented Aug 27, 2024

Failed on shinx will check it out

@laraPPr
Copy link
Collaborator Author

laraPPr commented Aug 27, 2024

Sanity check failed: No '(RPATH)' found in 'readelf -d' output for /apps/gent/RHEL9/zen4-ib/software/NextPolish/1.4.1-GCCcore-12.3.0/lib/calgs.so

@boegel
Copy link
Contributor

boegel commented Aug 27, 2024

The issue is that NextPolish comes with a bunch of pre-compiled libraries (which don't use RPATH linking, which makes sense because that would make them difficult to use across different systems):

$ ls -lrt lib/*.so
-rwxr-xr-x@ 1 kehoste  staff  579992 Jul  7  2022 lib/nextpolish1.so
-rwxr-xr-x@ 1 kehoste  staff   10280 Jul  7  2022 lib/calgs.so
-rwxr-xr-x@ 1 kehoste  staff  678328 Jul  7  2022 lib/nextpolish2.so
$ ldd lib/*.so
lib/calgs.so:
        linux-vdso.so.1 (0x00007ffe9b99c000)
        libz.so.1 => /lib64/libz.so.1 (0x00001459853fc000)
        libc.so.6 => /lib64/libc.so.6 (0x0000145985037000)
        /lib64/ld-linux-x86-64.so.2 (0x0000145985817000)
lib/nextpolish1.so:
        linux-vdso.so.1 (0x00007ffda4b8e000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00001510409c6000)
        libm.so.6 => /lib64/libm.so.6 (0x0000151040644000)
        libbz2.so.1 => /lib64/libbz2.so.1 (0x0000151040433000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x000015104020c000)
        libz.so.1 => /lib64/libz.so.1 (0x000015103fff4000)
        libc.so.6 => /lib64/libc.so.6 (0x000015103fc2f000)
        /lib64/ld-linux-x86-64.so.2 (0x0000151040e74000)
lib/nextpolish2.so:
        linux-vdso.so.1 (0x00007ffe3d119000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x000014621a7a3000)
        libm.so.6 => /lib64/libm.so.6 (0x000014621a421000)
        libbz2.so.1 => /lib64/libbz2.so.1 (0x000014621a210000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x0000146219fe9000)
        libz.so.1 => /lib64/libz.so.1 (0x0000146219dd1000)
        libc.so.6 => /lib64/libc.so.6 (0x0000146219a0c000)
        /lib64/ld-linux-x86-64.so.2 (0x000014621ac69000)

The best way to fix this is probably via patchelf, by using that to inject an RPATH section that ensures that the correct libz.so is picked up for example. This would need to be done conditionally, only when EasyBuild is configured to use RPATH, so that probably means using an easyblock (unless we enhance framework so an easyconfig file can know whether or not RPATH is enabled). See also (for example) the use of patchelf in MaSuRCA-4.1.0-GCC-11.3.0.eb.

@pavelToman
Copy link
Collaborator

pavelToman commented Aug 27, 2024

So the solution is to add to postinstallcmds this: "patchelf --force-rpath --set-rpath '$ORIGIN' %(installdir)s/lib/calgs.so" ?
Or the $ORIGIN should be changed?

Slack conversation about calgs.so : https://easybuild.slack.com/archives/GP6UN4MKP/p1724764742743239

@boegel
Copy link
Contributor

boegel commented Aug 27, 2024

$ORIGIN will need to be changed to also include $EBROOTZLIB/lib, likewise for other dependencies. And usually both $ORIGIN and $ORIGIN/../lib are included (though that doesn't seem relevant here since those libraries aren't linking to each other), so something like:

patchelf --force-rpath --set-rpath "\$ORIGIN:\$ORIGIN/..lib:$EBROOTZLIB/lib:..." %(installdir)s/lib/calgs.so

Escaping the $ in $ORIGIN is important, since $ORIGIN should not be resolved, while $EBROOTZLIB should be.

The patchelf should be done on the lib/*.so files, mainly.

That said, I fail to see where any of these libraries are actually used... What's their purpose exactly?

All the provided binaries are actually of dependencies (which ideally don't get built from source when building NextPolish, since we have modules for them already).

@laraPPr Maybe we should remove the modules we've installed already, and take a closer look at this first before informing the people who requested this installation?

@laraPPr
Copy link
Collaborator Author

laraPPr commented Aug 27, 2024

I'm ok with removing them. Have not informed them yet.

@laraPPr
Copy link
Collaborator Author

laraPPr commented Aug 27, 2024

Removed the modules

@laraPPr
Copy link
Collaborator Author

laraPPr commented Aug 27, 2024

This is the extra info that was given. It is still not clear to me what they need NextPolish for so maybe I should even ask for more clarification?

I want to use NextPolish to fix base errors in my assembled genome. So after assembly,
we would run NextPolish to 'polish' the genome.
NextPolish uses less time and has a higher correction accuracy than alternative
programs, especially for large genomes. We are working with Fusarium genomes, which
are known to be quite big.

Underneath is an example script, we will do something similar.
reads="/location of reads/example.fastq.gz"
genome="/path to long read polish/racon-2.fasta"
rounds=2
cpus=XX
memory=XXg

for ((i=1; i<=${rounds};i++))
do
#step 1
#index the genome and do alignment
bwa index $genome
bwa mem -t $cpus -p $genome $reads \
| samtools view -@ $cpus -F 0x4 -b - \
| samtools fixmate -m -@ $cpus - - \
| samtools sort -m $memory -@ $cpus - \
| samtools markdup -@ $cpus -r - sorted.bam
#index bam and genome
samtools index -@ $cpus sorted.bam
samtools faidx $genome
#polish genome
    python nextpolish1.py -g $genome -t 1 -p $cpus -s sorted.bam >tmp~${i}.fasta
#step 2
    #index the genome and do alignment
    bwa index tmp~${i}.fasta
    bwa mem -t $cpus -p tmp~${i}.fasta ${readFile} \
        | samtools view -@ $cpus -F 0x4 -b - \
        | samtools fixmate -m -@ $cpus  - - \
        | samtools sort -m $memory -@ $cpus - \
        | samtools markdup -@ $cpus -r - sorted.bam
    #index bam and genome
    samtools index -@ $cpus sorted.bam
    samtools faidx tmp~${i}.fasta
    #polish genome
    python nextpolish1.py -g tmp~${i}.fasta -t 2 -p $cpus -s sorted.bam > nextPolish~${i}.fasta

     genome=nextPolish~${i}.fasta
done

@boegel
Copy link
Contributor

boegel commented Aug 27, 2024

Since that only shows bwa and samtools commands, can you ask them how NextPolish is different from just using BWA and SAMtools?

@boegel
Copy link
Contributor

boegel commented Aug 29, 2024

I'll ask the user for more input, like where nextpolish1.py comes from, and what's in it...

@boegel
Copy link
Contributor

boegel commented Aug 29, 2024

nextpolish1.py is part of the installation, see $EBROOTNEXTPOLISH/lib/nextpolish1.py, so we can try ourselves whether using with BWA & SAMtools as dependencies and not letting NextPolish installs its own copies works, using the tutorial available at https://nextpolish.readthedocs.io/en/latest/TUTORIAL.html

So I don't think we need more output from the user on this...

@pavelToman
Copy link
Collaborator

minimap2, samtools and bwa is no more from NextPolish but from EB: https://github.com/vscentrum/vsc-software-stack/blob/wip/398_NextPolish/nextpolish.eb

For a problem with RPATH there is another WIP: https://github.com/vscentrum/vsc-software-stack/blob/wip/398_NextPolish/nextpolish_v2.eb
It can be build but still the RPATH is not ok.
You can test it by: $ nextPolish test_data/run.cfg
For now it returns: OSError: /kyukon/scratch/gent/vo/001/gvo00117/easybuild/RHEL8/cascadelake-ampere-ib/software/NextPolish/1.4.1-GCC-12.3.0/lib/calgs.so: ELF load command address/offset not properly aligned

@boegel
Copy link
Contributor

boegel commented Oct 7, 2024

@pavelToman There's a typo in your patchelf command, you should use $ORIGIN/../lib instead of $ORIGIN/..lib (missing /).
And you can drop the $EBROOTGCC entry

@pavelToman
Copy link
Collaborator

I still missing something:
My command to fix RPATH:
patchelf --force-rpath --set-rpath "\$ORIGIN:\$ORIGIN/../lib:$EBROOTZLIB/lib" %(installdir)s/lib/calgs.so
After, the patchelf --print-rpath %(installdir)s/lib/calgs.so cmd returns: $ORIGIN:$ORIGIN/../lib:/apps/gent/RHEL8/cascadelake-ib/software/zlib/1.2.13-GCCcore-12.3.0/lib, but as I run nextPolish test_data/run.cfg it still returns: ELF load command address/offset not properly aligned

@boegel
Copy link
Contributor

boegel commented Oct 7, 2024

@pavelToman What do you get when you run ldd on the library after using patchelf --set-rpath?

NixOS/patchelf#492 suggests you may be hitting a bug in patchelf v0.18.0.
There's an unmerged fix for it in NixOS/patchelf#566, maybe you can try patching and rebuilding patchelf 0.18.0 with that fix, and see if that fixes the problem?

@pavelToman
Copy link
Collaborator

pavelToman commented Oct 7, 2024

ldd returns not a dynamic executable after patchelf --set-rpath,
but file calgs.so returns calgs.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=e746a58f13d563314f1d0504f97bc12057875684, stripped
and readelf -h calgs.so returns DYN (Shared object file)

@boegel
Copy link
Contributor

boegel commented Oct 7, 2024

@pavelToman I think what ldd is reporting is another sign of the library being broken by patchelf

@pavelToman
Copy link
Collaborator

pavelToman commented Oct 7, 2024

Ok, there is a another news - it start working without patchelf. I delete all the NextPolish modules to start fresh and let it be build without patchelf cmd.
Now when I run ldd calgs.so it returns:

linux-vdso.so.1 (0x00001486bc29f000)
libz.so.1 => /apps/gent/RHEL8/cascadelake-ib/software/zlib/1.2.13-GCCcore-12.3.0/lib/libz.so.1 (0x00001486bc27e000)
libc.so.6 => /lib64/libc.so.6 (0x00001486bbaab000)
/lib64/ld-linux-x86-64.so.2 (0x00001486bc073000)

so it seems the libz is ok.
But patchelf --print-rpath returns nothing
and readelf -d calgs.so returns:

  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x898
 0x000000000000000d (FINI)               0x1538
 0x0000000000000019 (INIT_ARRAY)         0x201de0
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x201de8
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x1f0
 0x0000000000000005 (STRTAB)             0x4b8
 0x0000000000000006 (SYMTAB)             0x230
 0x000000000000000a (STRSZ)              295 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000003 (PLTGOT)             0x202000
 0x0000000000000002 (PLTRELSZ)           360 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x730
 0x0000000000000007 (RELA)               0x658
 0x0000000000000008 (RELASZ)             216 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x618
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x5e0
 0x000000006ffffff9 (RELACOUNT)          3
 0x0000000000000000 (NULL)               0x0

Either nextPolish test_data/run.cfg runs ok

@pavelToman
Copy link
Collaborator

pavelToman commented Oct 7, 2024

So I patched the patchelf v0.18 and it looks good:
I used patchelf --force-rpath --set-rpath '\$ORIGIN:\$ORIGIN/../lib:$EBROOTZLIB/lib' %(installdir)s/lib/calgs.so
patchelf --print-rpath calgs.so returns:
\$ORIGIN:\$ORIGIN/../lib:$EBROOTZLIB/lib
ldd calgs.so returns:

linux-vdso.so.1 (0x00007ffe745e4000)
libz.so.1 => /apps/gent/RHEL8/cascadelake-ib/software/zlib/1.2.13-GCCcore-12.3.0/lib/libz.so.1 (0x0000154f7b07
5000)
libc.so.6 => /lib64/libc.so.6 (0x0000154f7a69e000)
/lib64/ld-linux-x86-64.so.2 (0x0000154f7ae64000)

and readelf --dynamic calgs.so returns:

Dynamic section at offset 0x200440 contains 26 entries:
  Tag        Type                         Name/Value
 0x000000000000000f (RPATH)              Library rpath: [\$ORIGIN:\$ORIGIN/../lib:$EBROOTZLIB/lib]
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x898
 0x000000000000000d (FINI)               0x1538
 0x0000000000000019 (INIT_ARRAY)         0x201de0
 0x000000000000001b (INIT_ARRAYSZ)       8 (bytes)
 0x000000000000001a (FINI_ARRAY)         0x201de8
 0x000000000000001c (FINI_ARRAYSZ)       8 (bytes)
 0x000000006ffffef5 (GNU_HASH)           0x400028
 0x0000000000000005 (STRTAB)             0x4002f0
 0x0000000000000006 (SYMTAB)             0x400068
 0x000000000000000a (STRSZ)              336 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000003 (PLTGOT)             0x202000
 0x0000000000000002 (PLTRELSZ)           360 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x730
 0x0000000000000007 (RELA)               0x658
 0x0000000000000008 (RELASZ)             216 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x618
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x5e0
 0x000000006ffffff9 (RELACOUNT)          3
 0x0000000000000000 (NULL)               0x0

Is it ok like that? Should it return \$ORIGIN not $ORIGIN?

@pavelToman
Copy link
Collaborator

@boegel
Copy link
Contributor

boegel commented Oct 8, 2024

@pavelToman Make sure you test with eb --rpath to enable RPATH linking (which will be the default in EasyBuild 5.0, and we're already using that on our RHEL9 cluster)

@boegel
Copy link
Contributor

boegel commented Oct 16, 2024

@pavelToman ready for cleanup

@boegel boegel reopened this Oct 16, 2024
pavelToman added a commit that referenced this issue Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty: easy software that should be easy to support new new software priority: ASAP site:ugent Software installation request for UGent Tier-2
Projects
None yet
4 participants