DeepSpeed4Science releases the new Megatron-DeepSpeed framework to unleash the power of training/inference with very long sequences for structural biology

A good understanding of the latent space can help biology models like GenSLMs from Argonne National Laboratory (Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics) tackle new domains beyond viral sequences, extending their ability to model bacterial pathogens and even eukaryotic organisms, e.g., to understand function, pathway membership, and evolutionary relationships. To achieve this scientific goal, GenSLMs and similar models require very long sequence support for both training and inference, beyond what generic LLM long-sequence strategies such as FlashAttention provide. Through DeepSpeed4Science's new designs, scientists can now build and train models with significantly longer context windows, allowing them to explore relationships that were previously inaccessible.
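To make the scaling pressure concrete, the back-of-the-envelope Python sketch below estimates per-GPU activation memory as sequence length grows. The model dimensions (hidden size, layer count, tensors per layer) and the 64-way sequence-parallel degree are illustrative assumptions, not parameters of GenSLMs or of the released framework; the point is only that activation memory grows linearly with sequence length even when FlashAttention avoids materializing the full attention-score matrix, and that sharding the sequence dimension across GPUs divides the per-device cost.

```python
# Illustrative sketch (not code from the release): even with FlashAttention,
# every transformer layer's activations scale linearly with sequence length S
# and, under plain data/tensor parallelism, the whole sequence must fit on one
# device. Sequence parallelism shards the sequence dimension, so each of N
# GPUs holds roughly S/N tokens. All model sizes below are assumed values.

def activation_gib(seq_len: int, hidden: int = 4096, layers: int = 48,
                   tensors_per_layer: int = 10,
                   bytes_per_elem: int = 2) -> float:
    """Rough per-replica activation footprint across all layers, in GiB."""
    per_layer = tensors_per_layer * seq_len * hidden * bytes_per_elem
    return layers * per_layer / 2**30

for s in (8_192, 65_536, 262_144):
    full = activation_gib(s)   # whole sequence held on a single GPU
    sharded = full / 64        # assumed 64-way sequence parallelism
    print(f"S={s:>7,}: ~{full:8.1f} GiB unsharded, "
          f"~{sharded:6.2f} GiB per GPU with 64-way sequence parallelism")
```

Under these assumed dimensions, a 262K-token sequence needs on the order of a terabyte of activation memory unsharded, far beyond a single accelerator, while a 64-way sequence-parallel split brings the per-GPU share down to a tractable size; this is the kind of gap the new Megatron-DeepSpeed framework's long-sequence support is designed to close.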