Comments

I'm a little confused about which computation you're trying to sparsify. The paper seems to be written in the context of the technique where one uses sparse autoencoders to extract hopefully interpretable "features" from the embedding space of large language models. (Please correct me if I'm wrong about that!)

The goal would seem to be, then, to sparsify the computation of the language model. However, the method in your paper seems to sparsify the computation of the autoencoders themselves, not the language model. Shouldn't the goal be to sparsify the language model's computation? If so, why not use weight pruning? What is JSAE better at?

Isn't $\beta$ proportional to the inverse temperature, and so should be smaller now (with easier, more frequent trading)?
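
For reference, the relationship I have in mind is the standard statistical-mechanics convention (my gloss on the analogy, not something stated in the post):

$$\beta = \frac{1}{k_B T},$$

so if easier, more frequent trading plays the role of a higher temperature $T$, the corresponding $\beta$ should be smaller, not larger.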

A math textbook leaving certain results as an exercise for the reader?

I think this is usually one of two things: (1) the author not wanting to write out the proof (because it's boring or tedious), or (2) a proof that makes a good exercise because it is easy enough once you understand the big ideas (and coming up with good exercises is not always easy).

I recently came across Backpack Language Models and wanted to share it in case any AI interpretability people have not seen it. (I have yet to see this posted on LessWrong.)

The main difference between a backpack model and an LLM is that it enforces a much stricter rule for mapping the inputs' embeddings to output logits. Most LLMs allow the output logits to be an arbitrary function of the inputs' embeddings; a backpack model requires the output logits to be a linear transformation of a linear combination of the input embeddings. The weights for this linear combination are parameterized by a transformer.
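
Schematically (my own notation, just to pin down what "a linear transformation of a linear combination" means here), the output rule has the form

$$\mathrm{logits}(x_{1:n}) \;=\; W \sum_{j=1}^{n} \alpha_j(x_{1:n})\, e(x_j),$$

where $e(x_j)$ is the embedding of input token $x_j$, $W$ is a fixed linear map to vocabulary space, and the weights $\alpha_j$ are computed by a transformer run over the whole input. (As I understand the paper, each word actually gets several "sense vectors" that are mixed this way rather than a single embedding, but the linear-in-the-embeddings structure is the key point.)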

The nice thing about backpack models is that they are somewhat easier to interpret/edit/control: The output logits are a linear combination of the inputs' embeddings, so you can directly observe how changing the embeddings changes the outputs.


Most students, 48 percent, claimed to be Native American on their application....

According to Intelligent.com Managing Editor Kristen Scatton, the prevalence of applicants who claim Native American ancestry is possibly due to the popular narrative that for many Americans, a small percentage of their DNA comes from a Native American tribe.


Maybe these students are purposely misinterpreting "Native American" to mean someone who was born and raised in the United States, perhaps with ancestors born and raised in the US as well. This is actually the older sense of the term "Native American", found, for example, in the name of the Native American Party back in the mid-1800s.


It is written in More Dakka:

If something is a good idea, you need a reason to not try doing more of it.

Taken at face value, it implies the contrapositive:

If something is a bad idea, you need a reason to not try doing less of it.


This is not the contrapositive. It is not even the opposite.
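
To spell out the logic (my paraphrase of the claims): write the original claim as $P \Rightarrow Q$, with $P$ = "X is a good idea" and $Q$ = "you need a reason not to do more of X". The contrapositive is

$$\neg Q \Rightarrow \neg P,$$

i.e., "if you need no reason to avoid doing more of X, then X is not a good idea." The quoted statement instead flips both $P$ (good $\to$ bad) and the action inside $Q$ (more $\to$ less), which gives you neither the contrapositive nor the converse or inverse.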

Parfit's Hitchhiker: You're stranded in the desert and Omega comes up. It will give you a ride out of the desert iff it predicts you'd give it 10,000 dollars upon reaching civilization again. You get a ride. When in civilization again, do you go over to the bank and withdraw some money? Well, policies which pay up in this specific situation get (value of a life - 10,000 dollars) more than policies which don't pay in this specific situation, which just die.

Why is this called Parfit's Hitchhiker? Who is the Parfit it is referring to? Where was this scenario first written up? (I'm trying to dig up the original reference.)

Not for Mormons. They don't believe in an omnipresent God.


Well, what are your actual steps? Or is this just an advertisement?
