Announcing the DeepSpeed4Science Initiative:

Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

Open Source

AI System Designs for Science Domains

Committed to AI4Good

Broad AI Hardware Coverage

Optimized for Performance and Scalability

Technology Reuse


Mission Statement

In the next decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. In line with Microsoft’s mission to solve humanity’s most pressing challenges, the DeepSpeed team at Microsoft is responding to this opportunity by launching a new initiative called DeepSpeed4Science, aiming to build unique capabilities through AI system technology innovations to help domain experts unlock today’s biggest science mysteries.


The DeepSpeed system is an industry-leading open-source AI system framework, developed by Microsoft, that enables unprecedented scale and speed for deep learning training and inference on a wide range of AI hardware. Using DeepSpeed’s current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). We work closely with internal and external partners who own AI-driven science models that represent key science missions, to identify and address general domain-specific AI system challenges. These domains include climate science, drug design, biological understanding, molecular dynamics simulation, cancer diagnosis and surveillance, catalyst/material discovery, quantum computing, and others.

Ultimate Goals

  • DeepSpeed4Science eliminates memory explosion problems for scaling Evoformer-centric structural biology

    In collaboration with OpenFold, DeepSpeed4Science has created a set of highly memory-efficient DS4Sci_EvoformerAttention kernels, enabled by sophisticated fusion/tiling strategies and on-the-fly memory reduction methods. They are offered to the broader community as high-quality machine learning primitives for building modern AI-driven protein structure prediction models.

  • DeepSpeed4Science releases the new Megatron-DeepSpeed framework to unleash the power of training/inference with very long sequences for structural biology

    A good understanding of the latent space can help bio models like GenSLMs from Argonne National Lab (Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics) tackle new domains beyond just viral sequences and expand their ability to model bacterial pathogens and even eukaryotic organisms, e.g., to understand things such as function, pathway membership, and evolutionary relationships .…

  • Grand Release of DeepSpeed4Science Initiative

    The DeepSpeed4Science initiative is finally released! Check out our new system support for structural biology research, which addresses the memory explosion problem for EvoformerAttention-centric protein structure prediction models, and our new Megatron-DeepSpeed framework release, which enables extremely long sequences both systematically and algorithmically for domain scientists.

Accessible by clicking on the weather icon on the Windows 10/11 taskbar or from the Microsoft Start homepage, Weather from Microsoft Start provides precise weather information to help users make better decisions for their lifestyles, health, jobs and activities, including accurate 10-day global weather forecasts updated multiple times every hour. It was recently recognized by the Director of NOAA’s National Weather Service for its contributions to public safety, and it was named the most accurate weather forecasting service by ForecastWatch. In the past, the Microsoft Weather team benefited from DeepSpeed technologies to accelerate their multi-GPU training environments. Currently, DeepSpeed4Science is helping Microsoft WebXT’s weather team further enhance Microsoft Weather services with novel scalable model design and highly efficient training and inference support at the service level.

Scientific Foundation Model (SFM), MSR AI4Science

The Scientific Foundation Model (SFM) project aims to create a unified large-scale foundation model to empower natural scientists for scientific discovery by supporting diverse inputs, multiple scientific domains (e.g., drug, materials, biology, health, etc.) and computational tasks (e.g., quantum chemistry, dynamic sampling, etc.). The DeepSpeed4Science partnership will provide new training and inference technologies to support the SFM team’s continuous research on projects such as Distributional Graphormer, one of Microsoft’s new generative AI methods.

ClimaX, MSR AI4Science

Our changing climate is producing more frequent extreme weather events. To mitigate the negative effects, it is increasingly important to predict where these events will occur. ClimaX is the first foundation model designed to perform a wide variety of weather and climate modeling tasks. It can absorb many different datasets with different variables and resolutions, potentially improving weather forecasting. DeepSpeed4Science is creating new system support and acceleration strategies for ClimaX to efficiently pretrain and finetune bigger foundation models while handling very large high-resolution image data (e.g., tens to hundreds of petabytes) with long sequences.

AI Powered Ab Initio Molecular Dynamics (AI2MD), MSR AI4Science

This project simulates the dynamics of large (million-atom) molecular systems with near ab initio accuracy using AI-powered force field models while maintaining the efficiency and scalability of classical molecular dynamics. The simulations are efficient enough to generate trajectories long enough to observe chemically significant events. Typically, millions or even billions of inference steps are required for this process. This poses a significant challenge in optimizing the inference speed of graph neural network (GNN) + LLM models, for which DeepSpeed4Science will provide new acceleration strategies.


OpenFold is a non-profit AI research and development consortium developing free and open-source software tools for biology and drug discovery. Their mission is to bring the most powerful software ever created — AI systems with the ability to engineer the molecules of life — to everyone. These tools can be used by academics, biotech and pharmaceutical companies, or students learning to create the medicines of tomorrow, to accelerate basic biological research, and bring new cures to market that would be impossible to discover without AI. One of their recent works is the OpenFold library, which is a faithful but trainable PyTorch reproduction of DeepMind’s AlphaFold 2. In the past, the initial OpenFold release has benefited from the classic DeepSpeed technologies. As part of our initiative’s kick-off release, we are showcasing how DeepSpeed4Science empowered their continuous research in high-fidelity protein structure prediction. For more information, please visit here




Argonne is pleased to be partnering with Microsoft DeepSpeed in the development of the Aurora model for our core science project through DeepSpeed4Science open-source technologies. Science-driven large language models will be developed at Argonne and trained on the Aurora Exascale system (https://www.anl.gov/aurora). The target models are in the trillion-parameter class, pretrained on a large corpus of general text and code as well as expertly curated large-scale (trillions of tokens) scientific domain datasets in microbiology, biochemistry, ecology, materials, chemistry, mathematics, physics and cosmology. DeepSpeed4Science technology is critical to reaching the training performance and scalability these models need to be feasibly trained. Argonne and partners recently formed the Trillion Parameter Consortium to bring together global-scale stakeholders interested in advancing large-scale scientific discovery. Microsoft’s DeepSpeed4Science team are founding members of the consortium.

Oak Ridge has initiated an exciting new partnership with DeepSpeed4Science, expanding the scope of our target science domains to include cancer research and GeoAI applications. For example, the MOSSAIC project for cancer surveillance is partnering with the National Cancer Institute (NCI) and DeepSpeed4Science to enable high-fidelity extraction and classification of information from unstructured clinical texts.

Fusion energy represents a promising source of clean energy for the future, harnessing the processes that power the sun and stars. However, plasma disruptions in tokamaks present significant challenges to the consistent and safe delivery of fusion energy. The INFUSE project, a partnership between scientists from Princeton University, DOE Argonne National Lab, DOE Brookhaven National Lab, and Microsoft, addresses this challenge by using AI and advanced transformer models to enhance plasma disruption forecasting and control capabilities in magnetically confined tokamak experiments. The DeepSpeed4Science open-source library plays a key role in the project by enabling efficient distributed training of these large transformer models, tailored for intricate time series forecasting tasks.

The DeepSpeed4Science team is excited to welcome AMD as a founding partner. We are looking forward to closely collaborating with AMD and their Research and Advanced Development (RAD) team to accelerate AI4Science models and workflows on world-class AMD hardware and software. AMD RAD is a leading research group with a stellar track record of partnering with the U.S. Department of Energy (DOE) National Laboratories and other organizations for more than a decade to drive critical advances in next-generation high-performance computing technologies and scientific breakthroughs. For example, AMD collaborated with the DOE National Laboratories on the FastForward, FastForward-2, DesignForward, DesignForward-2, and PathForward programs to research and develop hardware and software technologies for exascale computing that are now found in some of the world’s fastest and most power-efficient supercomputers.

AMD offers a wide variety of compute and network technology solutions with unmatched performance for critical workloads across AI/ML and HPC. The AMD ROCm™ open software platform is optimized for major HPC and machine learning frameworks, including PyTorch and DeepSpeed. According to the June 2023 Top500/Green500 list, systems with AMD EPYC™ Processors and/or AMD Instinct™ Accelerators power twelve of the twenty fastest supercomputers and thirteen of the twenty most energy-efficient supercomputers in the world. This includes ORNL’s Frontier supercomputer, the world’s fastest supercomputer on the 2023 Top500 list and the first to break the coveted exascale barrier, and CSC’s LUMI supercomputer, the fastest supercomputer in Europe. The Frontier and LUMI systems also take the top two spots in the latest HPL-MxP mixed-precision benchmark, which highlights the convergence of HPC and AI workloads. The Frontier supercomputer is being used to build a suite of open foundation large language models (LLMs) called FORGE for cutting edge scientific discovery. FORGE includes models with up to 26 billion parameters using 257 billion tokens from over 200 million scientific articles. Furthermore, the LUMI Supercomputer is being used to train a GPT-3 LLM with 13 billion parameters. The Allen Institute is collaborating with AMD to use LUMI to train a new, open LLM called Open Language Model (OLMo), for scientific discovery. This LLM is intended to benefit the research community by providing access and education for all aspects of model creation.

The DeepSpeed4Science team looks forward to closely collaborating with AMD to explore their broad portfolio of compute and network hardware including CPUs, GPUs, AI inference engines, FPGAs, Adaptive SoCs, and Smart NICs in order to deliver adaptable and accelerated systems that drive the AI4Science future. For example, high performance and energy efficiency may be achieved with ML-based surrogate models by running scientific simulations on AMD CPUs and GPUs, training/retraining on AMD GPUs, performing inference on AI Engines, and using FPGAs for data processing. Efficient scale-out performance can be achieved by offloading common distributed deep learning primitives on AMD Smart NICs.

The AMD RAD team has been actively developing and investigating MLIR-based compiler passes, smart runtimes for heterogeneous architectures, fast communication primitives, and optimized neural network libraries for AMD’s heterogeneous hardware portfolio. These techniques and technologies can help optimize foundational AI4Science models on systems with AMD hardware. AMD RAD has also released low-overhead, open-source performance monitoring and analysis tools that will be critical to design, optimize, and deploy AI4Science workloads. We look forward to further exploring these research topics by closely collaborating with AMD researchers and engineers to build new foundational AI4Science models and co-design deep learning system architecture technologies. Together, we plan to advance scientific knowledge and accelerate breakthroughs across various domains, such as material sciences, computational chemistry, health sciences, manufacturing, molecular dynamics, high-energy physics, cosmology, and more.


DS4Sci_EvoformerAttention: eliminating memory explosion problems for scaling Evoformer-centric structural biology models

September 19, 2023

Memory inefficiency is a common system challenge in structural biology research (e.g., protein structure prediction and equilibrium distribution prediction). DeepSpeed4Science addresses this problem by designing customized exact attention kernels for the attention variants (i.e., EvoformerAttention) that appear widely in this category of science models. Specifically, a set of highly memory-efficient DS4Sci_EvoformerAttention kernels, enabled by sophisticated fusion/tiling strategies and on-the-fly memory reduction methods, has been created for the broader community as high-quality machine learning primitives.
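The kernels themselves are not reproduced on this page. As a rough illustration of how tiling can keep attention exact without ever holding the full score matrix, the sketch below applies a running (online) softmax over key tiles for a single query with an additive bias term. This is a generic, well-known technique and only an assumption about the general flavor of such kernels, not the actual DS4Sci_EvoformerAttention design; the function names and the tile size are hypothetical.

```python
import math


def naive_attention_row(q, keys, values, bias):
    """Reference: compute every score for one query, then apply softmax."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) + b_j
              for k, b_j in zip(keys, bias)]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def tiled_attention_row(q, keys, values, bias, tile=2):
    """Same result, but keys are visited tile by tile with a running softmax.

    Only `tile` scores exist at any time (hypothetical illustration, not the
    DS4Sci_EvoformerAttention implementation).
    """
    scale = 1.0 / math.sqrt(len(q))
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * len(values[0])
    for start in range(0, len(keys), tile):
        stop = start + tile  # slicing clamps at the end of the list
        scores = [scale * sum(a * b for a, b in zip(q, k)) + b_j
                  for k, b_j in zip(keys[start:stop], bias[start:stop])]
        new_max = max(running_max, max(scores))
        # Rescale what was accumulated under the old maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, values[start:stop]):
            w = math.exp(s - new_max)
            running_sum += w
            acc = [a + w * x for a, x in zip(acc, v)]
        running_max = new_max
    return [a / running_sum for a in acc]


if __name__ == "__main__":
    q = [0.5, -1.0, 2.0]
    keys = [[1.0, 0.0, 0.5], [0.2, 0.3, -0.4], [-1.0, 2.0, 0.1],
            [0.7, 0.7, 0.7], [0.0, -0.5, 1.5]]
    values = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5], [-2.0, 1.0], [0.5, 0.5]]
    bias = [0.1, -0.2, 0.0, 0.3, -0.1]
    print(naive_attention_row(q, keys, values, bias))
    print(tiled_attention_row(q, keys, values, bias, tile=2))
```

Both functions should agree up to floating-point rounding. The tiled version trades a few extra multiplications for a working set that no longer grows with the number of keys.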

DeepSpeed4Science enables very-long sequence support via both systematic and algorithmic approaches for genome-scale foundation models

September 19, 2023

At the system level, we release the newest Megatron-DeepSpeed framework for very-long sequence support along with other new optimizations. Scientists can now train their large science models like GenSLMs from Argonne National Lab with much longer sequences via a synergetic combination of our newly added memory optimization techniques on attention mask and position embedding, tensor parallelism, pipeline parallelism, sequence parallelism, ZeRO-style data parallelism and model state offloading. Additional support for domain scientists who prefer algorithmic strategies like relative position embedding techniques is also integrated in this new release.
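To see why the attention mask becomes a memory concern at these lengths, the back-of-the-envelope sketch below computes the size of one dense mask and the per-rank token count under an even sequence split. The sequence lengths, the one-byte element size, the rank count, and the helper names are illustrative assumptions; nothing here describes how Megatron-DeepSpeed actually stores masks or partitions sequences.

```python
GIB = 1024 ** 3


def dense_mask_bytes(seq_len: int, bytes_per_element: int = 1) -> int:
    """Bytes needed to store one dense [seq_len, seq_len] attention mask."""
    return seq_len * seq_len * bytes_per_element


def tokens_per_rank(seq_len: int, sequence_parallel_size: int) -> int:
    """Tokens held by each rank when the sequence is split evenly (assumed)."""
    if seq_len % sequence_parallel_size != 0:
        raise ValueError("sequence length must divide evenly across ranks")
    return seq_len // sequence_parallel_size


if __name__ == "__main__":
    # Hypothetical sequence lengths chosen only to show the quadratic growth.
    for seq_len in (8 * 1024, 64 * 1024, 512 * 1024):
        mask_gib = dense_mask_bytes(seq_len) / GIB
        shard = tokens_per_rank(seq_len, 8)
        print(f"{seq_len:>7} tokens: dense mask {mask_gib:.4f} GiB, "
              f"{shard} tokens per rank on 8 ranks")
```

At one byte per element, a dense mask for 512K tokens already takes 256 GiB, while the per-rank token count shrinks only linearly with the number of ranks. That gap is why mask memory has to be treated separately from simply adding more devices.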

Kick-off Release of DeepSpeed4Science Initiative

September 19, 2023

The DeepSpeed4Science initiative is finally released! Check out our MSR blog!

Join the Consortium

We welcome broader community engagement. Members may contribute technically and play a key role in helping us choose new directions and high-priority projects. If you are interested in becoming a member, please contact us with the details of your science model.


The Consortium also benefits from support provided by different non-member organizations and individuals with aligned missions — if you have a research idea or seek other collaboration opportunities, please reach out to us.

Contribute through Code

DeepSpeed4Science is an open-source project, and anyone can contribute directly by opening PRs, joining the discussion or submitting issues. Please help us build this unique open-source community for AI for science.

What is the mission of DeepSpeed4Science?

By partnering with our internal and external collaborators and their key AI-driven science missions, DeepSpeed4Science aims to build unique capabilities through AI system technology innovations to help domain experts unlock today’s biggest science mysteries.

Who are the members?

DeepSpeed4Science’s founding members span technology corporations and their research divisions, academia, startups and key national labs. The launching group includes DeepSpeed@Microsoft, Microsoft WebXT/Bing, MSR AI4Science@Microsoft, Argonne National Lab, Columbia University, AMD, Oak Ridge National Lab, Princeton University and Brookhaven National Lab.

Who can join our mission?

We welcome broader community engagement. Members may contribute technically and play a key role in helping us choose new directions and high-priority projects. We then decide which projects are high priority for us and how we can collaborate.

Where can we find the code for the new releases?

Open-source software releases from DeepSpeed4Science will be hosted here, as part of the DeepSpeed framework. DeepSpeed4Science will not host these science models. Our partners will host their own science models and data. DeepSpeed4Science only provides tailored AI system support for scientific discoveries.
