
Linda Linsefors

Hi, I am a Physicist, an Effective Altruist and AI Safety student/researcher.

Comments
Compressed Computation is (probably) not Computation in Superposition
Linda Linsefors · 15d · 20

I did some quick calculations for what the MSE per feature should be for compressed storage, i.e. storing T features in D dimensions where T > D.

I assume every feature is on with probability p. An on feature equals 1, an off feature equals 0. MSE is the mean squared error of a linear readout of the features.

For random embeddings (superposition):

mse_r = Tp/(T+D)
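
One way to recover this expression (my reconstruction, not from the comment: I assume the readout is the dot product with the feature's embedding times a single scale c, with interference of variance roughly Tp/D):

$$\mathrm{mse}(c) \approx (1-c)^2 p + c^2\sigma^2,\quad \sigma^2 \approx \frac{Tp}{D} \;\;\Rightarrow\;\; c^* = \frac{p}{p+\sigma^2},\quad \mathrm{mse}(c^*) = \frac{p\,\sigma^2}{p+\sigma^2} = \frac{Tp}{T+D}.$$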

If instead the D neurons embed D of the features exactly, and the readout outputs the constant value p for the rest:

mse_d = p(1-p)(T-D)/T

 

This suggests we should see a transition between these two types of embedding when mse_r = mse_d, which happens when

T^2/D^2 = (1-p)/p

For T=100 and D=50, this means p=0.2.
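
As a quick sanity check (my own sketch, not from the post: the values of T, D, p, the Gaussian unit embeddings, and the single global readout scale are all illustrative assumptions), both closed forms can be compared against a direct simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, p, n_samples = 100, 50, 0.2, 20_000

# Random unit embedding vectors, one row per feature.
U = rng.normal(size=(T, D))
U /= np.linalg.norm(U, axis=1, keepdims=True)

X = (rng.random((n_samples, T)) < p).astype(float)  # binary features, each on with prob. p
H = X @ U                                            # superposed D-dimensional storage
Y = H @ U.T                                          # naive linear readout, one value per feature

# Single readout scale c minimising the squared error (least-squares optimum).
c = np.sum(Y * X) / np.sum(Y ** 2)
mse_r_sim = np.mean((c * Y - X) ** 2)

# Dedicated embedding: first D features stored exactly, the rest predicted as the constant p.
pred = np.concatenate([X[:, :D], np.full((n_samples, T - D), p)], axis=1)
mse_d_sim = np.mean((pred - X) ** 2)

print("mse_r:", mse_r_sim, "formula:", T * p / (T + D))          # both ≈ 0.13
print("mse_d:", mse_d_sim, "formula:", p * (1 - p) * (T - D) / T)  # both ≈ 0.08
```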

 

The model in this post is doing a bit more than just embedding features. But I don't think it can do better than the most efficient possible embedding of the T output features in the D neurons, can it?

 

mse_r only depends on E[(u·v)^2] = 1/D, where u and v are different embedding vectors. Lots of embeddings have this property, e.g. embedding features along random basis vectors, i.e. assigning each feature to a random single neuron. This will result in some embedding vectors being exactly identical, but the MSE (L2) loss is equally happy with this as with random (almost orthogonal) feature directions.
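
A small numerical check of that claim (my own sketch; the values of D and the sample count are arbitrary): both random unit vectors and random one-hot assignments give E[(u·v)^2] ≈ 1/D.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n_pairs = 50, 200_000

# Random (almost orthogonal) unit vectors.
u = rng.normal(size=(n_pairs, D))
u /= np.linalg.norm(u, axis=1, keepdims=True)
v = rng.normal(size=(n_pairs, D))
v /= np.linalg.norm(v, axis=1, keepdims=True)
print(np.mean(np.sum(u * v, axis=1) ** 2), 1 / D)  # both ≈ 0.02

# One-hot embeddings: each feature assigned to a single random neuron.
# Here (u·v)^2 is 1 when two features share a neuron and 0 otherwise.
i = rng.integers(D, size=n_pairs)
j = rng.integers(D, size=n_pairs)
print(np.mean((i == j).astype(float)), 1 / D)      # both ≈ 0.02
```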

Compressed Computation is (probably) not Computation in Superposition
Linda Linsefors · 15d · 20

Yes, thanks!

Compressed Computation is (probably) not Computation in Superposition
Linda Linsefors · 15d · 20

Does p=1 mean that all features are always on?

If yes, how did it fail to get perfect loss in this case?

A Straightforward Explanation of the Good Regulator Theorem
Linda Linsefors · 22d · 20

johnswentworth's post Fixing The Good Regulator Theorem has the same definition of the Good Regulator Theorem. 

That is enough for me to confirm that this is indeed what it says. Not because I trust John more than Alfred, but because there are now two sufficiently independent claims on LW of the same definition of the theorem, which would be very surprising if the definition were wrong.

A Straightforward Explanation of the Good Regulator Theorem
Linda Linsefors · 22d · 30

If a regulator is 'good' (in the sense described by the two criteria in the previous section), then the variable R can be described as a deterministic function of S.


Really! This is the theorem?

Is there anyone else who understands the Good Regulator Theorem that can confirm this?

The reasons I'm surprised/confused are:

  1. This has nothing to do with modelling
  2. The theorem is too obviously true to be interesting
New Endorsements for “If Anyone Builds It, Everyone Dies”
Linda Linsefors · 24d · 163

I think what is going on is something like what Scott describes in Can Things Be Both Popular And Silenced?

6. Celebrity helps launder taboo ideology. If you believe Muslim immigration is threatening, you might not be willing to say that aloud – especially if you’re an ordinary person who often trips on their tongue, and the precise words you use are the difference between “mainstream conservative belief” and “evil bigot who must be fired immediately”. Saying “I am really into Sam Harris” both leaves a lot of ambiguity, and lets you outsource the not-saying-the-wrong-word-and-getting-fired work to a professional who’s good at it. In contrast, if your belief is orthodox and you expect it to win you social approval, you want to be as direct as possible.

I don't think that admitting to believing in AI X-risk would get you labelled as evil, but it could possibly get you labelled as crazy (which is maybe worse than evil for intellectuals?). To be able to publicly admit to that view, you either have to be able to argue for it really well, or be able to outsource that arguing to someone else.

I don't know why this is happening with this book in particular though, since there are already lots of books, blog posts, YouTube videos, etc., that explain AI X-risk. Maybe none of the previous work was good enough? Or respectable enough? Or advertised enough?

I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?
Linda Linsefors · 1mo · 20

I think that currently the highest-leverage action anyone (with the right skills for this) can take is to increase awareness of AI risk. Given that you're the sort of person who succeeded in getting a few hundred thousand followers, I'm guessing you would be good at this.

 

Kajus's Shortform
Linda Linsefors · 2mo · 20

Who do you have in mind, and what work? The line between safety and capabilities is blurry, and everyone disagrees about where it is. 

Other reasons could be:

  • They needed a job and could not get a safety job, and the skills they learned landed them a capabilities job.
  • They were never that concerned with safety to start with, but just used the free training and career support provided by the safety people.
Kajus's Shortform
Linda Linsefors · 2mo · 20

My theory is that safety ai folk are taught that a rules framework is how to provide oversight over the ai...like the idea that you can define constraints, logic gates, or formal objectives, and keep the system within bounds, like a classic control theory... 

 

I don't know anyone in AI safety who has missed the fact that NNs are not GOFAI.

Kajus's Shortform
Linda Linsefors · 2mo · 20

The LW discussion norm is that you're supposed to say what you mean, and not leave people to guess, because this leads to more precise communication. E.g. I guessed that you did not mean what you literally wrote, because that would be dumb, but I don't know exactly what statement you're arguing for.

I know this is not standard communication practice in most places, but it is actually very valuable; you should try it.

Posts

3 · Linda Linsefors's Shortform · Ω · 5y · 69 comments
63 · Circuits in Superposition 2: Now with Less Wrong Math · Ω · 13d · 0 comments
25 · Is the output of the softmax in a single transformer attention head usually winner-takes-all? · Q · 5mo · 1 comment
36 · Theory of Change for AI Safety Camp · 6mo · 3 comments
36 · We don't want to post again "This might be the last AI Safety Camp" · 6mo · 17 comments
60 · Funding Case: AI Safety Camp 11 · 7mo · 4 comments
38 · AI Safety Camp 10 · Ω · 9mo · 9 comments
17 · Invitation to lead a project at AI Safety Camp (Virtual Edition, 2025) · Ω · 11mo · 2 comments
75 · AISC9 has ended and there will be an AISC10 · Ω · 1y · 4 comments
46 · Some costs of superposition · Ω · 1y · 11 comments
196 · This might be the last AI Safety Camp · 1y · 34 comments
Wikitag Contributions

Outer Alignment · 2y · (+9/-80)
Inner Alignment · 2y · (+13/-84)