Dfam 3.2

Created 2020-07-09. Updated 2020-07-09.

Dfam is proud to announce the release of Dfam 3.2.  This release represents a significant step in the expansion of Dfam by providing early access to uncurated, de novo generated families.  As a demonstration of this new capability, we imported a set of 336 RepeatModeler generated libraries produced by Fergal Martin and Denye Ogeh at the European Bioinformatics Institute (EBI).  Also in this release, Dfam now provides family alignments to the RepeatMasker TE protein database aiding in the discovery of related families and in the classification of uncurated TEs.

—Xfam blog, https://xfam.wordpress.com/2020/07/09/dfam-3-2-release/


I would like to thank my employer, the Institute for Systems Biology (ISB), which has always been supportive of remote work and especially so during the ongoing COVID-19 pandemic. Thanks especially to our communications and HR teams, for clear messaging during these difficult times; to those at ISB who are not working from home and are doing COVID-19 research, for helping find effective vaccines and treatments; and to our IT, facilities, janitorial, operations, and other departments, for keeping everything running smoothly.


I haven't really described my work on my blog before, so here goes my (long) elevator pitch:


Our small team at ISB has been hard at work on Dfam 3.2, especially over the last two or three months. Dfam now has 40 times as many entries as last year, so this was a perfect opportunity to test our infrastructure. Despite our preparations, the sheer size of the dataset revealed bugs in our scripts and tools, and I for one am happy that things have calmed down a bit. With this release we have a baseline dataset to curate in the future, developing and sharing our curation techniques and inviting collaboration from the community along the way.

The opinions expressed herein are my own and do not necessarily represent the views of ISB or any of its collaborators.