Blog
Thoughts on AI, software, life and more. Short notes and longer write-ups.
Can We Use LLMs Itself to Speed Up LLM Inference?
May 01, 2023
Large language models (LLMs) possess a remarkable ability to anticipate the length of their generated responses. By leveraging this capability, we propose a novel technique called Sequence Scheduling to enhance the efficiency of LLM batch inference.
A Detailed Derivation of Backpropagation
Sep 07, 2022
A step-by-step derivation of the backpropagation algorithm for multi-layer perceptrons.