SGLang
High-performance serving framework for large language models and multimodal models.
llm-inference, python, research
Python, PyTorch, CUDA
Things I've built and worked on.
High-performance serving framework for large language models and multimodal models.
Interactive visualization tool for exploring attention patterns in transformer models.
Interactive visualization tool for minimizing language deprivation in deaf and hard-of-hearing children.
This very website - built with Next.js, TypeScript, and Tailwind CSS. Features blog posts, project showcase, and responsive design.