I recently finished studying introductory abstract algebra, and I’d like to retain that knowledge. Also, now that I have a studying-math-shaped hole in my schedule, and given I am a creature of habit, I’ve decided to study something else. Specifically, I want to understand how autoregressive transformers work, at as fine-grained a level as I can pull off. I talk to the things every day, and the whole world seems to be chattering about them, so if I’m going to make myself do homework, this seems like the correct homework to do.
I’ll probably write more later about my experience teaching myself what GPT-2 is, actually, but that’s for another post. This post is about a tool I’m using to retain my abstract algebra knowledge and build some cool new AI knowledge. That tool is Anki.
What is Anki?
Per its website, Anki is “a program which makes remembering things easy.” It’s a digital flashcards program, where you can make whatever flashcards you want, and then study them. Anki leverages spaced repetition, a technique built on research suggesting that the best way to remember information over the long term is to expose yourself to it at gradually lengthening intervals. You review often when you’re first memorizing, then rarely, just to make sure it doesn’t slip.
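To make the idea of stretching out the gap between reviews concrete, here’s a toy sketch in Python. To be clear, this is not Anki’s actual scheduler. The function names, the one-day starting interval, and the 2.5 multiplier are all assumptions I made up for illustration. The only thing it’s meant to show is the shape of the idea: get a card right and the wait grows, get it wrong and the wait resets.

```python
# Toy spaced-repetition schedule. NOT Anki's actual scheduler: the starting
# interval and the multiplier are made-up values chosen for illustration.

def next_interval(current_days: float, remembered: bool,
                  multiplier: float = 2.5, first_interval: float = 1.0) -> float:
    """Return how many days to wait before showing the card again."""
    if not remembered:
        # Forgot the card: start over with a short interval.
        return first_interval
    # Remembered the card: wait longer next time.
    return current_days * multiplier


def simulate(reviews: list[bool]) -> list[float]:
    """Return the interval chosen after each review in a sequence of outcomes."""
    interval = 1.0  # assumed starting interval, in days
    history = []
    for remembered in reviews:
        interval = next_interval(interval, remembered)
        history.append(interval)
    return history


if __name__ == "__main__":
    # Three correct answers, one miss, then one more correct answer.
    print(simulate([True, True, True, False, True]))
    # prints [2.5, 6.25, 15.625, 1.0, 2.5]
```

In this sketch, three correct answers in a row push the next review out past two weeks, and a single miss drops the card back to a one-day wait.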
Though I’ve only recently started using it, I recommend Anki. In fact, I recommend flashcards more generally. I remember having fun with them as a kid, the very few times I used them. I probably would have benefitted in college from using them more. And there’s something nice about the interface of Anki being simple, and the product being free and open source. It’s something in the same spirit as Wikipedia, of people who care about knowledge and learning helping each other out of pure scholastic altruism, an unfussy love of the truth as a beacon for those who know to seek it.
I don’t know. It’s fun to make a flashcard, and it’s fun to get a flashcard right, and it’s fun, even, to get one wrong and realize, oof, guess I didn’t know that as well as I thought! Because there will be opportunity, soon, to try again.
What took me so long?
I feel a little silly, getting into Anki now. I guess for the same reason that I feel a little silly about trying to figure out how transformers work when the world is in a full-blown hype cycle about them, and there are video guides about it.[1] I’ve heard people rumbling about the merits of spaced repetition forever, from Gwern to Dwarkesh Patel. I internalized, conceptually, that it’s one of the most evidence-based ways to learn and retain knowledge. And yet, I deliberately studied a difficult subject for six months, one that includes tons of terminology it’s important to get straight, and just… never tried it.
Part of that, I think, was that there was something romantic about my math habit simply involving a notepad, a textbook, and a pencil. But that’s not even true! I asked modern AI for second opinions all the time. What harm would an Anki practice have done, as a supplement? Do the homework problems, write a few cards, review a few cards. Simple and effective. But I didn’t get around to it.
Then again, I think it’s probably good to celebrate trying new things, rather than noticing that I might have tried them sooner. I spent months feeling vaguely like I should buy NVIDIA, but didn’t because I figured I’d be too late. Then I actually did buy a little just to see (this is not investment advice), and it wasn’t too late after all. I’ve spent a few years thinking about LLMs all the time, but never bothered to actually learn exactly what softmaxing is. But now I know that little piece of the puzzle, and it’s cool.
I won’t forget it, either. It’s on an Anki card.
[1] Though I notice the thumbnail includes a diagram from Attention Is All You Need, which is of a traditional transformer for sequence-to-sequence tasks, while GPT is an autoregressive transformer that doesn’t have an encoder, so the image doesn’t actually show the architecture for how GPT works. A lot of the fun of trying to learn finicky new stuff is how the range of stuff you can notice expands.