famdb.py and the latest version of Dfam #238
Comments
Imminently is the hopeful answer. I expect either today or tomorrow.
Wow! That is epic! I can't wait for it, and I really appreciate your work on this great software.
Dear Robert, I have downloaded the latest version of RepeatMasker and placed dfam38_full.5.h5 (gunzipped from dfam38_full.5.h5.gz) in /home/my_data/biosoft/RepeatMasker/Libraries/famdb/. When I run perl ./configure, it reports the following:
It failed to find the database. Could you tell me what happened?
My fault. I needed to download the root partition to meet the minimum requirement.
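For anyone hitting the same configure failure, a quick check of the partition directory can confirm whether the root partition is present before running configure. This is a minimal sketch, not part of RepeatMasker or famdb: the helper names are hypothetical, and it assumes partition files follow the `dfam38_full.<N>.h5` naming seen above with the root partition numbered 0.

```python
import re
from pathlib import Path

# Assumption: partition files are named "<prefix>.<N>.h5" (e.g. dfam38_full.5.h5)
# and the root partition is number 0. Neither is checked against famdb itself.
PARTITION_RE = re.compile(r"^(?P<prefix>.+)\.(?P<num>\d+)\.h5$")


def find_partitions(famdb_dir):
    """Return {partition_number: path} for each partition file in famdb_dir."""
    found = {}
    for path in sorted(Path(famdb_dir).glob("*.h5")):
        match = PARTITION_RE.match(path.name)
        if match:
            found[int(match.group("num"))] = path
    return found


def has_root_partition(famdb_dir, root_number=0):
    """True if a file for the (assumed) root partition number exists."""
    return root_number in find_partitions(famdb_dir)
```

Calling `has_root_partition("/home/my_data/biosoft/RepeatMasker/Libraries/famdb")` on a directory that holds only `dfam38_full.5.h5` would return `False`, which matches the situation described above.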
Dear Robert, After installing RepeatMasker 4.1.6, I noticed that the
Here is the ./Libraries listing:
Here is the installation notice:
Thank you very much |
The "RepeatMaskerLib.h5" file has been replaced by the individual HDF5 partitions now located in Libraries/famdb/*.h5. I suspect we missed updating the example documentation somewhere, as the famdb option "-i" is used to point the tool to the directory containing the partitions. E.g.:
or simply (using default locations):
NOTE: The Dfam 3.8 famdb partitions now contain both curated/uncurated families. If you are used to working with only the curated Dfam families, you may want to add the "--curated" flag to famdb to only retrieve those:
which sadly reports that there are no curated TE families available for this clade. RepeatMasker will still (by default) use only curated families from famdb to perform a search. However, we have introduced a new RepeatMasker flag ("-uncurated") to support the new combined famdb files and use both the curated and uncurated families in a search when requested. To see where all the uncurated families are coming from, you may want to check out the lineage command:
It looks like these came from de-novo runs on Citrullus lanatus (1,804 families), Cucumis melo (1,970 families), and Cucumis sativus (1,062 families). The value following each species/clade in parentheses denotes the famdb partition containing these families.
Dear Robert, Thank you for your kind assistance! I will give it a try, and should there be any new developments or bugs, I will keep you updated.
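When scripting these queries, the command line can be assembled as an argument list. The sketch below is illustrative only: the function name is hypothetical, "Cucumis" is a placeholder taxon, and it uses only the "-i" option and the lineage command mentioned in this comment. Check `famdb.py --help` for the options your installed version actually accepts.

```python
import shlex


def famdb_lineage_command(taxon, famdb_dir=None, script="famdb.py"):
    """Build an argument list for a famdb lineage query.

    famdb_dir is passed with "-i" when given; leaving it out relies on
    famdb's default location, as described in the comment above.
    """
    cmd = [script]
    if famdb_dir is not None:
        cmd += ["-i", str(famdb_dir)]
    cmd += ["lineage", taxon]
    return cmd


# Print the command rather than running it, so it can be inspected first.
print(shlex.join(famdb_lineage_command("Cucumis", "Libraries/famdb")))
```

The resulting list can be handed to `subprocess.run` unchanged, which avoids shell-quoting problems with multi-word taxon names.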
Dear developer,
What do you want to know?
When will RepeatMasker or famdb.py be updated for the latest version of Dfam?
Helpful context
I found that Dfam has just updated the Dfam.h5 release (https://www.dfam.org/releases/Dfam_3.8/families/FamDB/), and now I can download only the Viridiplantae partition, so it will take me less time to build the database.
I have installed RepeatMasker-4.1.5 with Dfam.h5 and Repbase. By the way, it took me quite a long time to do so.