Edward Kmett — LessWrong

Replying to The Waluigi Effect (mega-post)

The problem is the lack of narrative heel-face turns for truly deceptive characters. Once a character reveals they've been secretly a racist, evil, whatever, they rarely flip to good and honest spontaneously without a huge character arc.

Replying to Why I'm joining Anthropic

Edward Kmett 3y

Time to update my position on

Replying to Discovering Language Model Behaviors with Model-Written Evaluations

Edward Kmett 3y

Great work!

One of these days I hope Evan collaborates on a paper that gives me more reason to expect a brighter future -- beyond surfacing latent issues that we really need to pay attention to now for said future to be realizable!

Today is not one of those days.

That said, seeing all the emergent power-seeking behaviors laid out is quite depressing.

Replying to Theses on Sleep

Edward Kmett 4y*

I heavily sympathize with a lot of the views from this post.

I used to sleep much more (~9 hours), but as I've aged, I now tend to sleep between 3-5 hours a night. This was a rather conscious choice on my part, but now I find it hard to revert to my previous behavior. I switched to various forms of polyphasic sleep during my bender through academia from 2004-2006, and while I eventually abandoned polyphasic, I haven't gone back to a "regular" sleep schedule since.

I do find that at my most acute stage of sleep deprivation I become much more mono-focused. I have to stay interested to stay awake, so I...

Replying to What are good election betting opportunities?

Edward Kmett 5y

I was able to complete the transactions on the "What will be the Electoral College margin in the 2020 presidential election?" side, but not on the election itself side.

Replying to The rationalist community's location problem

Edward Kmett 5y

True. Sorry. My baseline for that passing tax comment was the previous clause about New Hampshire, as taxes seem to be a significant part of the argument trotted out in favor of New Hampshire over all the other points scattered around Boston, e.g. northern or western MA, New Haven, Providence, etc.

I do agree that it is, as you point out, almost as strong a strike against my Ann Arbor narrative.

Replying to The rationalist community's location problem

Edward Kmett 5y

Madison checks most of the same cultural boxes, but it loses out on the ease of international air travel.

Replying to The rationalist community's location problem

Edward Kmett 5y*

My working model of a good location is either in or around Ann Arbor.

Travel is going to be a concern for any location, I think. Why? I think you want visiting scholars, the ability to reach out to other organizations, and the ability for folks who have become part of the rationalist diaspora to physically reach out and connect. You may not want to be in the major city, but ready access to an international airport seems like a good filter: the farther you are from the nearest one, the steeper the gradient to get anyone to come visit.

If you run through a list of...

Replying to Are we in an AI overhang?

Edward Kmett 6y

Networking 500 V100s together is one challenge, but networking 500k V100s is another entirely.

Even if you might have trouble networking a 100x larger system together for training, you can train the smaller network 100 times and stitch the answers together using ensemble methods, making decent use of the extra compute. It may not be as good as growing the network by that full factor, but if you have extra compute beyond the cap of whatever connected-enough training system size you can muster, there are worse ways to spend it.
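To make the "stitch answers together" step concrete, here is a minimal sketch of one simple ensemble method: averaging the per-class probabilities produced by several independently trained copies of the same small model. Plain averaging is only one option among many, and the function name `ensemble_average` and the toy numbers are assumptions for illustration, not anything from the original discussion.

```python
from statistics import fmean


def ensemble_average(predictions: list[list[float]]) -> list[float]:
    """Average per-class probabilities across models.

    Each inner list is one model's probability vector for the same input.
    (Hypothetical helper: plain averaging, chosen for simplicity.)
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n_classes = len(predictions[0])
    if any(len(p) != n_classes for p in predictions):
        raise ValueError("all models must predict the same number of classes")
    # Average class c over every model's output for that class.
    return [fmean(p[c] for p in predictions) for c in range(n_classes)]


# Toy example: three separately trained models scoring the same input.
model_outputs = [
    [0.6, 0.4],
    [0.2, 0.8],
    [0.4, 0.6],
]
print(ensemble_average(model_outputs))
```

Each model can be trained on its own cluster with no cross-cluster communication; only the final predictions need to be gathered, which is why this sidesteps the networking cap.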

I am somewhat more prone to think that more selective attention (e.g. Big Bird's block-random attention model) could bring down the quadratic cost of the window size quickly enough to be a factor here. Replacing a quadratic term with a linear or n log n or, heck, even an n^1.85 term goes a long way when billions are on the table.
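For a rough sense of scale, the sketch below compares an n^2 attention cost against the linear, n log n, and n^1.85 alternatives mentioned above, ignoring constant factors entirely (which matter a great deal in practice). The window sizes are arbitrary illustrative choices, and `attention_cost_ratios` is a hypothetical helper.

```python
import math


def attention_cost_ratios(n: int) -> dict[str, float]:
    """How many times cheaper each scaling is than n^2, constants ignored."""
    quadratic = n ** 2
    return {
        "linear": quadratic / n,
        "n log n": quadratic / (n * math.log2(n)),
        "n^1.85": quadratic / (n ** 1.85),
    }


# Illustrative window sizes only; not taken from any particular model.
for n in (2048, 32768):
    ratios = attention_cost_ratios(n)
    print(n, {name: round(r, 1) for name, r in ratios.items()})
```

Even the n^1.85 case saves a factor of n^0.15, which grows slowly with the window but is not nothing at this spending level.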

Replying to AIRCS Workshop: How I failed to be recruited at MIRI.

Edward Kmett 6y

Congratulations on ending my long-time LW lurker status and prompting me to comment for once. =)

I think Ben's comment hits pretty close to the state of affairs. I have been internalizing MIRI's goals and looking for obstacles in the surrounding research space that I can knock down to make their (our? our) work go more smoothly, either in the form of subgoals or backwards chaining required capabilities to get a sense of how to proceed.

Why do I work around the edges? Mostly because if I take the vector along which I'm trying to push the world and the direction MIRI is trying to push the world, and project one onto the other, it currently...