SHI Collaboration Profiles

Profile pages for Sustainable Horizons Institute SRP 2025-2026 Project Leaders


Haohan Wang

University of Illinois Urbana-Champaign

School of Information Sciences

Biography

Haohan Wang is an assistant professor in the School of Information Sciences at the University of Illinois Urbana-Champaign. His research focuses on developing trustworthy machine learning methods for computational biology and healthcare applications, such as decoding the genomic language of Alzheimer's disease. His work combines statistical analysis and deep learning, with an emphasis on analyzing data using methods that are least influenced by spurious signals. Wang earned his PhD in computer science through the Language Technologies Institute of Carnegie Mellon University, where he worked with Professor Eric Xing. In 2019, Wang was recognized as the Next Generation in Biomedicine by the Broad Institute of MIT and Harvard for his contributions to dealing with confounding factors with deep learning.

SRP Project Title

Toward Redefining Disease Taxonomy: Scaling Transcriptomic Analysis with Agentic AI Systems

NAIRR Project

Toward Redefining Disease Taxonomy: Scaling Transcriptomic Analysis with Agentic AI Systems

Topical Areas

Basic Medicine; Biochemistry and Molecular Biology; Health Sciences

Abstract

The project aims to redefine how diseases are classified by using agentic artificial intelligence—a new generation of self-revising, collaborative AI systems—to analyze large-scale human transcriptomic data. Today’s biomedical research relies on rigid pipelines that struggle to integrate data across tissues, populations, and rare conditions. We will build an AI framework composed of autonomous reasoning agents that continuously analyze and validate public datasets, learning to correct themselves and adapt to missing metadata or unexpected patterns. These agents will uncover disease relationships directly from molecular signals rather than from pre-defined diagnostic categories. The outcome is a data-driven taxonomy of disease—linking common and rare conditions through shared regulatory signatures—that could reshape how medicine understands biological variability. Students will join a cross-disciplinary team at the intersection of AI, computational biology, and open science, contributing to a system that learns science itself.

Desired Skills

We welcome students from computer science, data science, biology, or related fields who are curious about how AI can advance scientific discovery. Interest in coding, statistics, or genomics is helpful but not required. The most important traits are curiosity, persistence, and comfort working across disciplines. Experience with Python, R, or machine learning frameworks is a plus, but students can learn these skills during the project.

Additional Comments

Students will gain experience working with large biological datasets and intelligent systems that operate autonomously. The environment emphasizes mentorship, open collaboration, and publication-quality research. Outstanding participants may continue with the lab through independent study or joint conference submissions.

Lightning Talk Title

Toward Redefining Disease Taxonomy

Keywords

Disease should not be defined by clinical manifestation, but by the underlying mechanism