SHI Collaboration Profiles

Profile pages for Sustainable Horizons Institute SRP 25-26 Faculty Participants


Tejas Gokhale

Assistant Professor

Computer Science and Electrical Engineering

University of Maryland, Baltimore County

Biography

Tejas Gokhale is an Assistant Professor of Computer Science and Electrical Engineering at the University of Maryland, Baltimore County. He directs the Cognitive Vision Group at UMBC, working on research themes such as: concept-level characterization of the visual world; the interpretation of visual data in the presence of incomplete information; recognizing and adapting to novelty and variations; leveraging external knowledge and reasoning modules to generalize to new contexts, domains, environments, and tasks; and acquiring visual knowledge and communicating it to other machines and humans. Tejas earned his Ph.D. from Arizona State University, M.S. from Carnegie Mellon University, and B.E. (Honours) from Birla Institute of Technology and Science, Pilani. Tejas has held several conference organization roles, including Tutorial Chair at ICCV 2025 and workshop/tutorial organizer at CVPR, ECCV, and WACV. He was selected as a "Highlighted New Faculty" at AAAI 2024, and his work has been funded by DARPA, UMBC, and Microsoft. His work has been published in top-tier proceedings across several fields of AI, such as CV (CVPR, ICCV, ECCV), NLP (ACL, EMNLP, NAACL), ML (AAAI, NeurIPS, ICLR), and others. Tejas previously participated as a sub-team mentor for SCALE 2024, hosted by JHU HLTCOE. Website: https://www.tejasgokhale.com/

Degrees Earned

Ph.D., Computer Engineering, 2025, Arizona State University; M.S., Electrical and Computer Engineering, 2017, Carnegie Mellon University; B.E. (Honours), Electronics and Instrumentation Engineering, 2015, Birla Institute of Technology and Science, Pilani (India)

Research Areas

Computer Science; Engineering; Machine Learning/AI

Research Interests

Tejas grapples with research questions in computational perception, learning, reasoning, and communication. His research interests are broadly in AI, at the intersection of machine learning, computer vision, and natural language processing. The current research theme in his Cognitive Vision Group is "conceptual characterization of visual scenes", which includes concept-level understanding and interpretation of images and videos, event-centric understanding in videos to aid retrieval and reasoning, visual question answering and multimodal reasoning by leveraging external knowledge from various sources, innovations in diffusion models and their robustness and security, and improving the generalization and robustness of data-driven models. These research interests align well with the goals of SCALE 2026, which focuses on Event Understanding and Summarization from Real-Time Videos, and of SCALE 2024 (in which Tejas participated), which focused on Video-Based Event Retrieval. Research synergies and motivations are described below.

Topical Areas

Applied Computer Science; Artificial Intelligence and Intelligent Systems; Computer Science; Other Computer and Information Sciences; Performance Evaluation and Benchmarking

Research Synergy

There are strong synergies between the interests and expertise of Tejas and his group and the goals of the project and project leaders at SCALE 2026 on the topic of "Event Understanding and Summarization from Real-Time Videos". Tejas participated in a previous related iteration, SCALE 2024 (Video-Based Event Retrieval), and has continued to work on video retrieval and visual understanding since then. Tejas has a body of work on multimodal learning and reasoning, including contributions to visual question answering, vision-language entailment, and video captioning, as well as work on multimodal information retrieval and knowledge-intensive tasks. His expertise in dataset and benchmark creation is also relevant to the goals of SCALE 2026. Tejas and his students (Sourajit and Naren) can contribute strongly to "First Stage" topics such as visual event detection and visual frame analysis to describe physical events in unedited long-form videos, and to "Second Stage" topics such as multimodal retrieval and multi-video summarization towards the task of multimodal RAG.

Motivation

As a new (<3 years on tenure track) faculty member at UMBC, I am seeking opportunities to connect with collaborators on NAIRR projects, particularly SCALE (Summer Camp for Applied Language Exploration). In addition to the strong alignment between my background and interests and the NAIRR-funded project, my motivation is to become an integral part of the larger research community in this area. As my new research group grows, I would like to establish new collaborations and reinforce existing ones with the dual goal of (1) working on exciting new topics and (2) providing my students with mentorship and research opportunities beyond UMBC. The SRP program will allow me to work on larger-scale problems (such as the topic of SCALE 2026) that a single research group could not carry out alone and that therefore require many scientists, engineers, faculty, and students to collaborate towards a big goal and make an impact, faster and together. Science is rarely done in isolation, and a community of like-minded researchers pushing towards a shared goal from different perspectives is exactly the sort of environment that I would like my students and myself to experience. One of my goals as a faculty member is to train future professors. SRP offers a summer experience that combines research projects with career development activities, and is thus a unique opportunity for my students to grow as independent researchers and to get insights and feedback from multiple faculty members at various levels.

Supervising Students Plan

I will be supervising two students, Sourajit Saha and Naren Sivakumar; both are my advisees in the Ph.D. Computer Science program at UMBC. Sourajit joined my group in Fall 2023 and Naren in Summer 2025, and I have known both for around 2 years. Supervision of students for the 2026 summer research experience will include: (1) daily "stand-up" meetings with the entire team (including my students and other participants), where we each briefly (in a couple of sentences) describe what we did the day before and what we plan to do today; (2) brainstorming sessions in small groups to discuss ideas, analyze results, and plan next steps; (3) facilitating collaborations between the sub-teams that SCALE 2026 will likely have (for example, OCR, audio, NLP, IR, CV, and infrastructure) to encourage interdisciplinary and impactful work, which will potentially expose my students to different research fields and perspectives on solving problems; and (4) opportunities to give longer presentations to a wider audience, for example at weekly update meetings.

Student Merit

Both Sourajit and Naren have strong programming skills in Python, PyTorch, etc., and have experience using the latest LLMs, VLMs, diffusion models, and computer vision models. They have taken graduate-level AI, ML, NLP, and CV classes as well as advanced seminars on topics such as GenAI and RobustML. Sourajit has participated in SCALE before, in SCALE 2024 on Video-Based Event Retrieval. He is well versed in multimodal LLMs, VLMs, video understanding, captioning, and retrieval through his previous participation in SCALE, his ongoing research projects, and his synergistic work on our DARPA project. He is in the process of submitting a paper on zero-shot video retrieval to CVPR; we believe this work could potentially be part of one of the approaches for the SCALE 2026 primary task. Naren has participated in (and won) hackathons before and would be an asset to SCALE 2026, as he can rapidly prototype AI code and communicates well with large teams. For example, he has been a good contributor to a system we are building for our DARPA project and has already ideated and prototyped an innovative prompting approach that could potentially improve the performance of our model. Because we already have connections between JHU HLTCOE and my group (given our previous participation in SCALE), onboarding for Sourajit and Naren will be quick, both in terms of getting to know new people and in terms of understanding the infrastructure and logistics at the host venue.

Lightning Talk Title

Multimodal Active Data Pursuit for Learning, Reasoning, and Retrieval

Keywords

video understanding; multimodal retrieval; multimodal claim extraction; machine learning; computer vision; natural language processing; robustness; compositionality

Student(s) of Faculty

Sourajit Saha, Naren Sivakumar