Xingyi Chen
he/him/his
Johns Hopkins University
Department of Applied Mathematics & Statistics
Biography
Xingyi (Daniel) Chen is a junior at Johns Hopkins University majoring in Applied Mathematics & Statistics with minors in Computational Medicine and Mathematics (GPA 3.99). He works in Dr. Stephanie Hicks’s biostatistics lab developing robust, reproducible workflows for spatial transcriptomics, including a pip-installable Visium HD preprocessing toolkit and a Python port of the SpotSweeper quality-control framework. He has also led analyses and created publication-quality visualizations of spatial transcriptomics datasets to support an NIH R01 grant application. Previously, he contributed to exploratory genomics projects in Dr. Michael Beer’s lab and completed a bioinformatics internship in Shanghai, where he synthesized spatial-omics literature and assisted with data pipelines. Daniel is also the lead teaching assistant for Differential Equations: he mentors other TAs, designs worksheets, and leads review sessions (98% student rating). He plans to pursue a Ph.D. in biostatistics or biomedical data science. His goal is to develop statistical methods that make complex biomedical data more interpretable and useful for biological or clinical discovery.
Academic Status
Undergraduate Student - 3rd Year
Research Area/Department
Applied Mathematics; Data Science; Machine Learning/AI
Major/Specialty
Major: Applied Mathematics & Statistics. Minors: Computational Medicine, Mathematics. Areas of Focus: Biostatistics, Biomedical Data Science, Computational Biology, Applied Statistics for Medicine and Healthcare.
Degrees Earned or in Progress
Degree in Progress: Bachelor of Science (B.S.) in Applied Mathematics & Statistics, Expected May 2027
Academic Preparation
Science: General Physics, Chemistry, Biology, Genetics. Math/Stats: Multivariable Calculus, Differential Equations, Linear Algebra, Real Analysis; Probability & Statistics (A+); Applied Statistics & Data Analysis (A+); Data Science (A+); Optimization; Monte Carlo Methods; Causal Inference. Computing: Intermediate Programming (C/C++); Genomic Data Visualization (R); Foundations of Computational Biology & Bioinformatics (Python/R); Computational Medicine (Python).
Research/Publications
Hicks Lab (Biostatistics, JHU): spatial transcriptomics workflows; Visium HD preprocessing package; Python SpotSweeper; R01 dataset analyses and figures. Beer Lab (BME, JHU): exploratory regulatory-genomics analyses and data processing. Course projects: Visium spatial dataset comparison analysis (Genomic Data Visualization, Jean Fan); BRCA-mutant scRNA-seq differential expression and clade-based analysis (Foundations of Computational Biology, Chris Bradburne).
Research/Academic Interests
I am interested in making high-dimensional data analysis reproducible and reliable by combining statistical learning, uncertainty quantification, and reproducible pipelines. Most of my recent work has been with single-cell and spatial transcriptomics, where preprocessing and quality-control choices can significantly change results. I aim to build versioned datasets, tests, readable notebooks, and benchmarks that relate method choice to the stability of results. My coursework in probability, statistical inference, optimization, and Monte Carlo methods, combined with proficiency in Python/R and Unix/Git, supports method design, data analysis, and production-level implementation.
Computational and Data Science Areas
Applied Computer Science; Applied Mathematics; Clinical Medicine; Other Medical Sciences; Performance Evaluation and Benchmarking; Statistics and Probability
Motivation
I'm applying to Sustainable Research Pathways because the program sits exactly where I want to grow: robust, open, and collaborative research at scale. My recent work in Dr. Stephanie Hicks's lab (building reproducible spatial transcriptomics pipelines, porting SpotSweeper to Python, and bundling Visium HD preprocessing into a pip-installable package) taught me that sound assumptions, uncertainty quantification, and accessible software are what turn analyses into shared, durable products. SRP's association with NAIRR projects would let me stress-test these habits on larger, more complex datasets and adopt community standards that extend beyond a single lab. I will contribute as both a builder and a teammate. I write clean, commented code; version data and experiments; and create publication-quality figures that explain parameter choices. As lead TA for Differential Equations, I have learned to explain concepts concisely, scaffold material for students with different starting points, and provide templates others can easily learn from; these are skills I can apply to research onboarding and open-source documentation. In return, I'm seeking close mentorship on method design and benchmarking, exposure to production-grade data and computing platforms, and feedback that raises my bar for statistical rigor and software reliability. In the long term, I plan to pursue a Ph.D. in biostatistics or biomedical data science. SRP's multicultural community and cross-institutional collaborations are the ideal context for me to develop statistically sound, computationally efficient methods and tools that are simple for others to deploy, so that results are not just interesting but also significant and useful.
Lightning Talk Title
Building Trustworthy Analytics for Biomedical Data
Keywords (Maximum 20 words)
Biomedical Data Science; Biostatistics; Statistical Genomics; Machine Learning; Applied Statistics; Data Visualization; Spatial Transcriptomics; Reproducible Workflows