Research
Ask anyone (including an LLM) how a frontier AI model works, or how it arrives at its predictions, and you'll be met with one of two answers: "I don't know" or "it's a statistical algorithm reproducing what it saw in its training data." If you're like me, you'll find that neither answer says why these models work, only that they do. What's actually going on inside these models? If they're reproducing patterns in the data they saw during training, which ones? And how, exactly?
Broadly, I want to understand how neural networks learn. I believe this understanding will come from descriptions of simple models (which I will explain in a future blog post), and more specifically of their optimization dynamics and how they create features.
I am an early-stage PhD student, so a lot of my work is still unpublished. I will try to update this page whenever I have a new preprint.