This is a special post for quick takes by Jacob G-W. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Over the past few months, I helped develop Gradient Routing, a non-loss-based method for shaping the internals of neural networks. After my team developed it, I realized that I could use the method to do something I have long wanted to do: make an autoencoder with an extremely interpretable latent space.

I created an MNIST autoencoder with a 10-dimensional latent space, with each dimension of the latent space corresponding to a different digit. Before I get into how I did it, feel free to play around with my demo here (it loads the model into the browser): https://jacobgw.com/gradient-routed-vae/.

In the demo, you can see how a random MNIST image is encoded, and you can also manipulate the encoding directly, creating different types of digits just by moving the sliders.

The reconstruction is not that good, and I assume this is due to some combination of (1) using the simplest possible architecture of MLP layers and ReLU, (2) only allowing a 10-dimensional latent space, which could constrain the representation a lot, (3) not doing data augmentation, so it might not generalize that well, and (4) gradient routing targeting an unnatural internal representation, causing the autoencoder to not fit the data that well. This was just supposed to be a fun proof-of-concept project, so I'm not too worried about the reconstruction quality.
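For concreteness, the kind of MLP-plus-ReLU variational autoencoder described in (1) and (2) might look roughly like this; the hidden sizes and the split into mean/variance heads are my guesses, not the actual code:

import torch.nn as nn

# Hypothetical layer sizes; the post only specifies MLP layers, ReLU, and a
# 10-dimensional latent space (one dimension per digit).
encoder_body = nn.Sequential(
    nn.Linear(28 * 28, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
)
to_mean = nn.Linear(64, 10)     # mean of the 10-dimensional latent code
to_log_var = nn.Linear(64, 10)  # (log-)variance for the VAE's sampling step
decoder = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(),
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 28 * 28),
)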

How it works

My implementation of gradient routing is super simple and easy to add onto a variational autoencoder. During training, after I run the encoder, I just detach every dimension of the encoding except for the one corresponding to the label of the image:

def encode_and_mask(self, images: Tensor, labels: Tensor):
    encoded_unmasked, zeta, mean_from_encoded, cov_diag_from_encoded = self.encode(images)
    # One-hot mask selecting the latent dimension that matches each image's label
    mask_one_hot = F.one_hot(labels, num_classes=self.latent_size).float()
    # Gradients flow only through the labeled dimension; all other dimensions are detached
    encoded = mask_one_hot * encoded_unmasked + (1 - mask_one_hot) * encoded_unmasked.detach()
    return encoded, zeta, mean_from_encoded, cov_diag_from_encoded

This causes each dimension of the latent space to “specialize” in representing its corresponding digit, since the error for that digit class can only be propagated through that single dimension of the latent space.
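To see the routing in isolation, here is a tiny standalone check (mine, not from the project code) showing that the gradient only reaches the latent dimension matching the label:

import torch
import torch.nn.functional as F

encoded_unmasked = torch.randn(1, 10, requires_grad=True)
labels = torch.tensor([5])  # pretend this batch contains a single "5"
mask_one_hot = F.one_hot(labels, num_classes=10).float()
encoded = mask_one_hot * encoded_unmasked + (1 - mask_one_hot) * encoded_unmasked.detach()
encoded.sum().backward()
print(encoded_unmasked.grad)  # nonzero only at index 5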

It turns out that if you do this, nothing forces the model to represent “more of a digit” in the positive direction. Sometimes the model represented “5-ness” in the negative direction in the latent space (e.g. as [0, 0, 0, 0, 0, -1.0, 0, 0, 0, 0]). This messed with my demo a bit since I wanted all the sliders to only go in the positive direction. My solution? Just apply ReLU to the encoding so it can only represent positive numbers! This is obviously not practical, and I only included it so the demo would look nice.[1]

In our Gradient Routing paper, we found that models sometimes needed regularization to split the representations well. However, in this setting, I’m not applying any regularization beyond what a variational autoencoder provides by default: the KL penalty on the encoding. I guess it turns out that this regularization is enough to effectively split the digits.
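For reference, here is a rough sketch of a standard VAE objective, not the author's exact code; I'm assuming an MSE reconstruction term, and mean_from_encoded and cov_diag_from_encoded are the encoder outputs from encode_and_mask above. The KL term against a standard normal prior is the only regularization in play:

import torch
import torch.nn.functional as F

def vae_loss(reconstructed, images, mean_from_encoded, cov_diag_from_encoded):
    # Reconstruction term (assumed MSE here; binary cross-entropy is also common for MNIST)
    recon = F.mse_loss(reconstructed, images)
    # KL divergence between N(mean, diag(cov)) and the standard normal prior N(0, I)
    kl = 0.5 * (
        cov_diag_from_encoded + mean_from_encoded**2
        - 1.0 - torch.log(cov_diag_from_encoded)
    ).sum(dim=1).mean()
    return recon + kl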

Classification

It turns out that even though no loss function forced the encoding to activate most strongly on the dimension corresponding to the digit being encoded, it happened anyway! In fact, we can classify digits with 92.58% accuracy just by taking the argmax over the encoding, which I find pretty amazing.
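A sketch of what that evaluation might look like; model and test_loader are placeholder names for the trained autoencoder and a standard MNIST DataLoader, and encode() is the same method called inside encode_and_mask above:

import torch

correct, total = 0, 0
with torch.no_grad():
    for images, labels in test_loader:
        encoded, *_ = model.encode(images)   # take the latent code, ignore the other outputs
        preds = encoded.argmax(dim=1)        # predicted digit = most active latent dimension
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"argmax accuracy: {correct / total:.2%}")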

Code

You can see the code here.

(this was crossposted from my blog)

  1. ^

    I did have to train the model a few times to get something that behaved nicely enough for the demo.

If you're not already aware of the information bottleneck, I'd recommend The Information Bottleneck Method, Efficient Compression in Color Naming and its Evolution, and Direct Validation of the Information Bottleneck Principle for Deep Nets. You can use this with routing for forward training.

At someone's suggestion, I've turned this into a top-level post.

And I migrated my comment.

When I was recently celebrating something, I was asked to share my favorite memory. I realized I didn't have one. Then (since I have been studying Naive Set Theory a LOT), I got tetris-effected, and as soon as I heard the words "I don't have a favorite" come out of my mouth, I realized that favorite memories (and in fact favorites of lots of other things) form partially ordered sets. Some elements are strictly better than others, but not all elements are comparable (in other words, the set of all memories ordered by preference has multiple maximal elements but no greatest element). This gives me a nice framing for thinking about favorites in the future, and it shows that I'm generalizing what I'm learning by studying math, which is also nice!
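For reference, the distinction I mean, in my own wording rather than Halmos's:

$$m \in P \text{ is \textit{maximal}} \iff \neg\,\exists\, x \in P:\ m \prec x, \qquad g \in P \text{ is \textit{greatest}} \iff \forall\, x \in P:\ x \preceq g.$$

A partially ordered set can have many maximal elements and no greatest element at all, which is exactly the "no single favorite" situation.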

I noticed this too, but when trying to rank music based on my taste. I wonder whether, when people are asked to give their favorite (of something), they just randomly give a maximal element, or whether they have an implicit aggregation function that kind of converts the partial order into a total order.

Hmm, my guess is that people initially pick a random maximal element and then when they have said it once, it becomes a cached thought so they just say it again when asked. I know I did (and do) this for favorite color. I just picked one that looks nice (red) and then say it when asked because it's easier than explaining that I don't actually have a favorite. I suspect that if you do this a bunch / from a young age, the concept of doing this merges with the actual concept of favorite.

I just remembered that Stallman also realized the same thing:

I do not have a favorite food, a favorite book, a favorite song, a favorite joke, a favorite flower, or a favorite butterfly. My tastes don't work that way.

In general, in any area of art or sensation, there are many ways for something to be good, and they cannot be compared and ordered. I can't judge whether I like chocolate better or noodles better, because I like them in different ways. Thus, I cannot determine which food is my favorite.

I agree with most of this, but I partially (hah!) disagree with the claim that they cannot be compared at all. Only some elements can be compared (e.g., I like the memory of hiking more than the memory of feeling sick), but not all can be.

Someone recently tried to sell me on the Ontological Argument for God, which begins with "God is that for which nothing greater can be conceived." For the reasons you described, this is completely nonsensical, but it was taken seriously for a long time (even by Bertrand Russell!), which made me realize how much I took modern logic for granted.

How would you go about answering this question post-said insight? What would the mental moves look like?

I'm never good at giving an answer to my favorite book/movie/etc.

Just say something like: here's a memory I like (or a few), but I don't have a favorite.

Do adults actually ask each other “What’s your favorite…” whatever? It sounds to me like the sort of question an adult asks a child in order to elicit a “childish” answer, whereupon the adults in the room can nod and wink at each other to the effect of “isn’t that sweeet?” so as to maintain the power differential.

If I am faced with such a question, I ignore the literal meaning and take it to be a conversation hook (of a somewhat unsatisfactory sort, see above) and respond by talking more generally about the various sorts of whatever that I favour, and ignoring the concept of a “favorite”.

"Fantasia: The Sorcerer's Apprentice": A parable about misaligned AI told in three parts: https://www.youtube.com/watch?v=B4M-54cEduo https://www.youtube.com/watch?v=m-W8vUXRfxU https://www.youtube.com/watch?v=GFiWEjCedzY

Best watched with audio on.

A great example of more dakka: https://www.nytimes.com/2024/03/06/health/217-covid-vaccines.html

(Someone got 217 covid shots to sell vaccine cards on the black market; they had high immune levels!)

So, someone volunteered to test the safety of the vaccine, in an extreme way. Thank you for advancing science!

Something hashed with shasum -a 512 2d90350444efc7405d3c9b7b19ed5b831602d72b4d34f5e55f9c0cb4df9d022c9ae528e4d30993382818c185f38e1770d17709844f049c1c5d9df53bb64f758c