Biography
Varun Kasireddy is a Project Scientist at Carnegie Mellon University's AirLab. His research focuses on multimodal perception, vision-language models, and learning on edge devices. He also collaborates with academic and industry partners, including efforts on autonomous aerial systems for object identification and tracking. Varun is committed to reproducible research, practical evaluation, and human-in-the-loop development. He mentors students on concise experimentation, observability, and data quality, and enjoys designing small-scale prototypes that translate quickly to field trials. Through SRP, he aims to build a collaborative summer project that combines rigorous evaluation with scalable perception to deliver reliable robotic behaviors in real environments.
SRP Project Title
Visual Language Model for Stand-off Triage Sensing
NAIRR Project
LLM Training and Evaluation for Pandemic Prevention and Response
Topical Areas
Computer Science
Abstract
We build vision-language systems that help robots perform fast, reliable casualty triage. Because our work targets practical edge deployment (e.g., NVIDIA Jetson), a central concern in our decision making is balancing algorithm accuracy against latency requirements. By Summer 2026, the project will be at a point where students can test and iterate on existing pipelines: video highlight extraction, robust person and blood segmentation, and Visual Question Answering (VQA) for scene understanding. Each week, we will run systems tests and score performance. Based on the results, students will quickly synthesize progress, create short presentations for the broader team, and help the project leads fine-tune models and manage versioning to push improvements for the next deployment. Work spans dataset preparation, annotation, benchmarking, and optimization on GPU clusters. The aim is clear tools that fit robotic workflows, with impact in emergency response, safety monitoring, and human-robot teaming. The work is hands-on and fast-paced: you'll see changes week to week and get to validate them on real hardware.
Desired Skills
• Core: Python, basic Linux, Docker containerization, and comfort with Git/GitHub for fast iteration.
• Robotics: ROS 2 basics (nodes, topics, launch files) to run weekly systems tests and integrate updated components.
• Experiment tracking: Weights & Biases (W&B) for logging metrics, comparing runs, and sharing dashboards with the team.
• Evaluation: clear thinking about test scores, precision/recall, latency, and failure cases; ability to propose quick fixes.
• Data pipelines: PyTorch, computer vision, segmentation, dataset curation/annotation (e.g., CVAT/FiftyOne), and simple GPU optimization.
• Optional: edge deployment (e.g., NVIDIA Jetson).
Lightning Talk Title
Pushing AI-Driven Robots to the Field
Keywords
Robotics; Multimodal Perception; Vision-Language Models; Edge Computing; ROS 2; Real-Time Evaluation; Reliable AI; Field Testing; Sensor Fusion; LiDAR