Blog

Thoughts on AI, software, life and more. Short notes and longer write-ups.

Evaluation as Creation: How AI is Redefining Human Productivity

Jan 15, 2025

In the age of AI, the ability to evaluate, judge, and appreciate is becoming the new form of productivity. Human definitions of 'good' and 'beautiful' are reshaping the essence and value of creation itself. From the complete creative loop of the craft era to the evaluation-driven paradigm of the AI age, we're witnessing a fundamental revolution in how we produce and create.

InfoBatch: Dataset Pruning on the Fly

Jan 17, 2024

Multi‑epoch training wastes time on easy, well‑learned samples. InfoBatch dynamically prunes data and rescales the loss to keep accuracy while speeding up training by 20–40% across vision and language tasks.

ZSCL: Fine-tuning Vision-Language Models without Zero‑Shot Transfer Degradation

Jul 15, 2023

Continual fine‑tuning of vision‑language models can damage zero‑shot transfer. ZSCL adds simple constraints in feature and parameter space to keep zero‑shot ability while improving downstream performance.

CAME Optimizer: Adam Performance with Adafactor Memory Requirements

Jul 14, 2023

Training large language models is memory-intensive, and optimizer state accounts for a large share of that memory. CAME cuts optimizer memory to Adafactor levels while keeping Adam-like performance.

Can We Use LLMs Themselves to Speed Up LLM Inference?

May 01, 2023

Large language models (LLMs) possess a remarkable ability to anticipate the length of their generated responses. By leveraging this capability, we propose a novel technique called Sequence Scheduling to enhance the efficiency of LLM batch inference.

A Detailed Derivation of Backpropagation

Sep 07, 2022

A step-by-step derivation of the backpropagation algorithm for multi-layer perceptrons.