Biography
Khaled Ibrahim is a member of the Parallel Performance and AI Nexus (PPAN) group in the Computing Sciences Area. He works on research projects in high-performance computing, focusing on runtime systems, programming models, performance modeling and optimization, and computer architectures. He joined Berkeley Lab in 2009 after working at INRIA, France. He obtained his PhD in computer engineering from North Carolina State University.
SRP Project Title
Performant Scientific Workflows in HPC Environments
Topical Areas
Applied Computer Science; Artificial Intelligence and Intelligent Systems; Computer Science; High Performance Computing; Performance Evaluation and Benchmarking
Abstract
Modern scientific workflows are rapidly evolving, integrating LLMs, AI, and simulation codes to control, predict, and model complex discovery processes. Managing these intricate workflows effectively is crucial for leveraging HPC resources, preventing underutilization, and accelerating scientific progress. This project addresses these challenges through two main approaches: (1) Workload instrumentation to profile and model performance: by instrumenting workflow components, we gather detailed performance metrics to identify bottlenecks, understand resource consumption, and predict performance under various loads, enabling proactive optimization. (2) Orchestration of resource scheduling to improve user experience: building on performance insights, we develop sophisticated scheduling mechanisms to dynamically allocate HPC resources, minimizing wait times, maximizing throughput, and enhancing user experience for scientists by accelerating experiment execution and model iteration.
Desired Skills
Programming languages: C++, Python; Profiling infrastructure: NVIDIA tools
Lightning Talk Title
Performant Scientific Workflows in HPC Environments
Keywords
High Performance Computing; HPC/AI workflows