danielhanchen
Just a compilation of machine learning notes I wrote some time ago! It's not complete yet, but for now it includes sections on optimization and generalized linear models, with tonnes of maths and a strong focus on implementation and practicality.

Optimization: AdamW optimizers, weight initialization, learning rate range finding, bias initialization, gradient centralization, RAdam, Lookahead, linear regression & least squares, Cholesky, ridge regression

GLMs: IRLS (iteratively reweighted least squares), calculus stuff, statistical inference, diagnostics, high dimensional optimization, logistic regression