Deploying Inference Services Webinar at ALCF Dec. 3

Nov. 24, 2025 — The Argonne Leadership Computing Facility will hold a webinar showcasing the ALCF’s Inference Service on Wednesday, Dec. 3, 2025.

Registration information can be found here.

The facility’s Inference Service provides cloud-like access to diverse AI models, including large language models (LLMs), on existing HPC clusters.

ALCF’s Benoit Côté will demonstrate how to integrate the Inference Service into scientific applications and share examples of interacting with its chat interface and API.

The webinar will cover ALCF’s experience deploying and optimizing inference endpoints on the Sophia cluster, leveraging Globus Compute and frameworks like vLLM to support a broad range of models, including science foundation models and open-weight models such as gpt-oss, Meta Llama, and the Mistral family. Côté will also highlight the latest integration with Metis, a SambaNova SN40L cluster highly optimized for inference.

Topics will include technical advancements such as efficient model loading, batch processing for large-scale inference, and authentication via Globus Auth, as well as challenges like resource contention, payload limitations, and fault tolerance. The session will also cover performance metrics and practical applications. The ALCF Inference Service enables researchers to run secure, scalable AI inference on Argonne’s HPC systems, offering accessibility, privacy, and performance tailored to scientific workflows to accelerate data-driven discovery.

Côté is a software developer on the Data Services and Workflows team at ALCF. His work centers on designing and hosting automated workflows and user-facing services for scientific applications. He earned a PhD in Physics from Université Laval (Canada) in 2015. He held postdoctoral appointments at the University of Victoria (Canada), Michigan State University, and the Konkoly Observatory (Hungary), where he combined galaxy formation and nuclear astrophysics to study the origin of the elements and isotopes in the Universe. He became a permanent research staff member at the Konkoly Observatory in 2019, but decided to move back to North America during the pandemic. He then held a remote postdoctoral appointment, contributing to the development of nucleosynthesis data software, before joining Argonne in 2022.