SHI SRP 25-26 Profiles

Profile pages for Sustainable Horizons Institute SRP 25-26 Student Matching Workshop participants.

Naga Venkata Sai Rama Krishna Rohan Avireddy

He/Him/His

University of Washington

School of Engineering and Technology

Biography

Hi, I’m Rohan, originally from Rajahmundry, a small suburban town in Andhra Pradesh, India. I did my schooling there and went on to earn a bachelor’s degree in Computer Science and Engineering at Vellore Institute of Technology, Tamil Nadu. I recently completed my MS in Computer Science and Systems at the University of Washington Tacoma, where I will continue as a PhD student starting in Spring under the mentorship of Dr. Ka Yee Yeung. Her lab develops cloud-based bioinformatics pipelines for biomedical data as part of the NIH MorPhiC consortium. My research domain is Machine Learning and Bioinformatics. I’ve worked on transformer-based models for genomic data, generative modeling for synthetic transcriptomics, and integrating omics and imaging data for multi-modal analysis. At present, I’m working as a Machine Learning Engineer at BioDepot LLC, where I focus on developing and benchmarking ML algorithms for omics data pipelines. Earlier, I interned at Amazon as an Applied Scientist, where I worked on generative AI solutions, and at Samsung R&D, where I built transformer models for the Bixby Assistant. Outside of work, I enjoy hiking, reading, playing pickleball, and listening to Radiohead.

Academic Status

PhD Student - 1st

Research Area/Department

Biology; Computer Science; Machine Learning/AI

Major/Specialty

Computer Science

Degrees Earned or in Progress

1. PhD in Computer Science and Systems, University of Washington, Tacoma, 2026 - 2029
2. Master of Science in Computer Science and Systems, University of Washington, Tacoma, 2023 - 2025
3. Bachelor of Technology in Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamil Nadu, 2019 - 2023

Academic Preparation

1. Data Structures and Algorithms (UG)
2. Object-oriented Programming (UG)
3. Database system design (UG + MS)
4. Machine Learning (UG + MS)
5. Data Science and Analytics (UG)
6. Distributed Computing (UG + MS)
7. Theory of Computation and Compiler Design (UG)
8. Operating Systems (UG)
9. Computer Networks (UG)
10. Cloud Computing (MS)
11. Advanced Machine Learning (MS)
12. Capstone Project (UG + MS)

Research/Publications

Undergraduate Research in Natural Language Processing:
1. Hate Speech Detection using BERT models: https://ceur-ws.org/Vol-3159/T1-39.pdf
2. Beware Haters at ComMA@ICON: Sequence and Ensemble Classifiers for Aggression, Gender Bias and Communal Bias Identification in Indian Languages: https://aclanthology.org/2021.icon-multigen.4/

Research/Academic Interests

My core research interests are in developing machine learning systems for multiomics and tissue imaging, with a strong focus on translational impact in omics, pathology, and precision medicine. I aim to develop AI models and frameworks that analyze and interpret molecular and spatial data to reveal actionable biological insights, collaborating closely with pathologists and clinicians from the Fred Hutchinson Cancer Research Center and with consortia like MorPhiC. Under Dr. Ka Yee Yeung’s guidance, I developed transformer-based models for genomic and transcriptomic data and adapted foundation models like scGPT to identify key biomarkers for Acute Myeloid Leukemia. This work gave me a firsthand understanding of how representation learning can reveal hidden biological relationships, bridge noisy multi-omics signals, and drive scientific insight into disease mechanisms. Looking forward, I am particularly interested in three directions. The first is multimodal representation and fusion: developing architectures that can embed and align diverse biological data types such as spatial proteomics, transcriptomics, and histology images into a shared latent space. I am especially interested in models that combine graph, vision, and molecular components to connect information across scales, for instance linking cell-cell interaction graphs from spatial omics data with histology image features or protein expression patterns to understand how molecular changes manifest in tissue structure. The second is causal and interpretable modeling in tissue contexts, where the goal is to move beyond descriptive correlations toward models that can infer mechanistic relationships across cell neighborhoods. For instance, studies like CellCharter (Nature Methods, 2024) and NicheCompass (Nature Biotechnology, 2023) use graph-based causal inference to uncover how ligand-receptor interactions and spatial signaling contribute to tumor progression.
I find this line of work exciting because it connects machine learning directly to biological reasoning. The third is agentic AI systems for scientific discovery, such as PathChat and SlideSeek from the Mahmood Lab at Harvard, which use multi-agent reasoning to autonomously explore and interpret pathology slides, or STAgent from the Liu Lab at Harvard, which uses code generation, visual reasoning, and retrieval-augmented literature grounding to automate spatial transcriptomics analysis.

Computational and Data Science Areas

Applied Computer Science; Artificial Intelligence and Intelligent Systems; Biochemistry and Molecular Biology; Computer Science

Motivation

One of the things that motivated me to apply to the SRP program is that it brings together some of the largest national laboratories in one community. For someone like me, who is still working out a long-term research direction, that is very meaningful. Each lab has its own way of connecting AI with different science domains, and I see parts of my own interests reflected across them. When I read about the AI Foundation Model for Proteins at Berkeley Lab (LBL), I found it very close to the kind of work I want to do in my PhD. They are training large models that learn relationships between protein sequence, structure, and function, which reminded me of what I try to do with tissue data: understanding how molecular changes affect what we see at the cell and image level. Their approach gives me ideas about using structural or biochemical priors in my own models for spatial omics, especially for learning how local cell neighborhoods influence disease progression. At Brookhaven Lab, the "vision" project caught my attention for a different reason: it connects LLM reasoning with lab tools. I kept thinking about how something similar could work in biology, perhaps as an assistant that helps a biologist explore spatial omics data, interpret patterns, or suggest what to look at next. What I really like is that they are testing these systems in actual physics experiments, because it shows how agentic AI can be made reliable and safe in a controlled environment before being adapted to biological workflows. Argonne Lab's work on scientific machine learning also feels close to my interests. Their research on AI-driven simulation surrogates and adaptive modeling for complex systems made me think about using similar ideas to study biological processes. They built surrogate models using physics-informed neural networks and uncertainty-aware training to approximate expensive simulations in the nuclear energy domain.
In biology, simulations of processes like cell–tissue interactions are often too expensive to run repeatedly, so a learned surrogate that can approximate or refine results would be incredibly useful. Especially for spatial omics or tissue imaging, where the data is noisy and multi-scale, I think surrogate modeling could help biologists test hypotheses faster, explore edge cases, or even guide experimental design in real time. I also came across work at Oak Ridge Lab, where teams are developing AI methods to analyze massive microscopy and genomics datasets on HPC systems. They have built tools like MENNDL that use neural architecture search (NAS) and asynchronous scheduling across thousands of GPUs to evolve deep learning models for biological data annotation. The scale of that work is impressive to me because progress on biological data is often limited by compute rather than ideas. Understanding how they design and optimize their infrastructure, including how they parallelize training, manage hyperparameter tuning, and adapt models to noisy scientific data, could help me design more efficient pipelines for my own models, especially when dealing with large tissue imaging datasets. Through SRP, I want to spend time inside such environments and learn how people in these labs design experiments, handle uncertainty, and validate AI models against real measurements. These are lessons that are hard to learn from papers alone, and I hope to carry them back into my PhD work. Being part of the SRP community also matters to me personally. As an international student, I have always valued the chance to work in teams that combine different perspectives and expertise, and I see SRP as a place where that happens naturally. I hope to learn from mentors who have built long-term careers in this space, contribute to their projects in whatever way I can, and grow into a researcher who can connect AI methods to scientific problems in a useful way.

Lightning Talk Title

Multimodal AI for Mechanistic Insight in Spatial Biology

Keywords (Maximum 20 words)

Multimodal learning; Graph neural networks; Spatial graphs; Causal modeling; Representation learning; Biological mechanisms; Interpretable AI; Scientific reasoning; Agentic AI; Foundation models; Tissue imaging; Spatial transcriptomics; Spatial proteomics; Microenvironment modeling; Cell-cell communication; Molecular phenotyping; AI for discovery