Blog

Thoughts on AI, software, life and more. Short notes and longer write-ups.

Can We Use LLMs Themselves to Speed Up LLM Inference?

May 01, 2023

Large language models (LLMs) possess a remarkable ability to anticipate the length of their generated responses. By leveraging this capability, we propose a novel technique called Sequence Scheduling to enhance the efficiency of LLM batch inference.

A Detailed Derivation of Backpropagation

Sep 07, 2022

A step-by-step derivation of the backpropagation algorithm for multi-layer perceptrons.