SHI Collaboration Profiles

Profile pages for Sustainable Horizons Institute SRP 2025-2026 Project Leaders


Xiao Wang

Oak Ridge National Laboratory

Biography

Dr. Xiao Wang is a research staff scientist in the Computational Science and Engineering Division at Oak Ridge National Laboratory (ORNL). He earned dual Bachelor's degrees in Mathematics and Computer Science from Saint John's University, MN (2012), and completed his M.S. and Ph.D. in Electrical and Computer Engineering at Purdue University (2016–2017) under Dr. Charles Bouman and Dr. Samuel Midkiff. Before joining ORNL in 2021, he conducted postdoctoral research at Harvard Medical School and Boston Children's Hospital, focusing on medical imaging.

Dr. Wang's research lies at the intersection of artificial intelligence (AI), high-performance computing (HPC), and computational imaging. He develops algorithms that integrate AI, imaging physics, and HPC to enable high-resolution, data-efficient imaging across modalities such as X-ray, CT, MRI, electron tomography, and satellite imaging, with applications in medicine, biology, climate science, and national security.

He received the 2022 AAPM Truth CT Reconstruction Challenge award, was a finalist for the ACM Gordon Bell Prize in 2017 and 2024, and received the 2024 HPCWire Top Supercomputing Achievement Award. His current work focuses on scalable, energy-efficient, and trustworthy Vision Transformer foundation models for large-scale imaging applications.

SRP Project Title

Computing-Efficient Training for Large-Scale Vision Transformer Foundation Models

NAIRR Project

Computing-Efficient Training for Large-Scale Vision Transformer Foundation Models

Topical Areas

Applied Computer Science; Artificial Intelligence and Intelligent Systems; Computer Science

Abstract

The Vision Transformer (ViT) is a powerful AI architecture for computer vision, used by most imaging foundation models because of its effectiveness in discerning complex visual patterns across many tasks. However, training large-scale ViT foundation models requires considerable computing resources, leading to a significant energy footprint. For example, OpenAI's Sora video generator was trained on more than 10,000 NVIDIA H100 GPUs, and training took more than a month on a supercomputer; the energy consumed was equivalent to the total annual energy consumption of 300 US households. This one-year project aims to improve the computing efficiency of ViT scaling algorithms, shortening the AI development cycle and training time. We will develop a training framework optimized for hardware-conscious scaling and computing efficiency, tailored specifically for large-scale ViT models.
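As a rough sanity check of the scale described above, the energy figure can be estimated from GPU count, per-GPU power draw, and wall-clock time. The sketch below is purely illustrative: the ~700 W per-GPU figure (H100 SXM TDP) and the ~10,500 kWh average annual US household consumption are assumptions, not numbers from the project, and actual training-cluster draw depends on utilization, cooling, and interconnect overhead.

```python
# Back-of-envelope estimate of large-scale training energy, expressed in
# US household-years. All constants here are illustrative assumptions.

def training_energy_kwh(num_gpus: int, watts_per_gpu: float, hours: float) -> float:
    """Total electrical energy drawn by the GPUs alone, in kilowatt-hours."""
    return num_gpus * (watts_per_gpu / 1000.0) * hours

def household_year_equivalents(energy_kwh: float,
                               kwh_per_household_year: float = 10_500) -> float:
    """Express an energy figure as a number of average US household-years."""
    return energy_kwh / kwh_per_household_year

# Assumed scenario: 10,000 GPUs at ~700 W each, running for 30 days.
gpu_energy = training_energy_kwh(num_gpus=10_000, watts_per_gpu=700, hours=30 * 24)
print(f"GPU energy: {gpu_energy:,.0f} kWh")
print(f"~{household_year_equivalents(gpu_energy):.0f} household-years")
```

Under these assumptions the estimate lands at the same order of magnitude as the hundreds of household-years cited in the abstract, which is why reducing per-step compute, not just adding hardware, is the lever this project targets.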

Desired Skills

AI, efficient computing

Lightning Talk Title

Energy-Efficient Vision Transformer Training Framework for Exascale Foundation Models

Keywords

vision transformer, exascale foundation model, high performance computing, energy efficiency