This chalk talk covers optimizing and deploying large language models (LLMs) at scale. Topics include large model hosting, optimization techniques, model partitioning, batch processing, and model fine-tuning.
Speaker
Haowen Huang
Haowen Huang is a senior evangelist at Amazon Web Services, based in Hong Kong. He has more than 20 years of experience in architecture design, technology, and startup management across the telecommunications, internet, and cloud computing industries, and has worked for companies including Microsoft, Sun Microsystems, and China Telecom. His current research interests include generative AI, large language models (LLMs), machine learning, and data science.
Background
Country / Region: Hong Kong
Affiliations: Amazon
Internal
Is Remote Presentation: false