LessOnline Festival

May 31st - June 2nd, in Berkeley CA

A festival of truth-seeking, optimization, and blogging. We'll have writing workshops, rationality classes, puzzle hunts, and thoughtful conversations across a sprawling fractal campus of nooks and whiteboards.

elifland
The word "overconfident" seems overloaded. Here are some things I think that people sometimes mean when they say someone is overconfident: 1. They gave a binary probability that is too far from 50% (I believe this is the original one) 2. They overestimated a binary probability (e.g. they said 20% when it should be 1%) 3. Their estimate is arrogant (e.g. they say there's a 40% chance their startup fails when it should be 95%), or maybe they give an arrogant vibe 4. They seem too unwilling to change their mind upon arguments (maybe their credal resilience is too high) 5. They gave a probability distribution that seems wrong in some way (e.g. "50% AGI by 2030 is so overconfident, I think it should be 10%") * This one is pernicious in that any probability distribution gives very low percentages for some range, so being specific here seems important. 6. Their binary estimate or probability distribution seems too different from some sort of base rate, reference class, or expert(s) that they should defer to. How much does this overloading matter? I'm not sure, but one worry is that it allows people to score cheap rhetorical points by claiming someone else is overconfident when in practice they might mean something like "your probability distribution is wrong in some way". Beware of accusing someone of overconfidence without being more specific about what you mean.
Ruby
As noted in an update on LW Frontpage Experiments! (aka "Take the wheel, Shoggoth!"), yesterday we started an A/B test in which some users are automatically switched over to the Enriched [with recommendations] Latest Posts feed.

The first ~18 hours' worth of data does look like a real uptick in clickthrough rate, though some of that could be novelty. (Examining members of the test group (n=921) and control group (n≈3000) over the last month, the test group seemed to have a slightly (~7%) lower clickthrough-rate baseline; I haven't investigated this.)

However, the specific posts that people are clicking on don't feel, on the whole, like the ones I was most hoping the recommendations algorithm would suggest (and get clicked on). It feels like there's a selection towards clickbaity or must-read news (not completely, just more than I'd like). If I look over items recommended by Shoggoth that are older (50% are from the last month, 50% older than that), they feel better but seem to get fewer clicks.

A to-do item is to look at voting behavior relative to clicking behavior: having clicked on these items, do people upvote them as much as others? I'm also wanting to experiment with just applying a recency penalty if it seems that older content suggested by the algorithm is more "wholesome", though I'd like to get some data from the current config before changing it.
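As a rough illustration of the kind of check involved in reading that clickthrough data, here is a minimal sketch of a two-proportion z-test comparing the test group (n=921) against the control group (n≈3000). The click counts are invented for illustration (the real numbers aren't given above), and a proper analysis would also adjust for the test group's ~7% lower pre-experiment baseline, e.g. by comparing each group's change from its own baseline rather than raw rates.

```python
import numpy as np
from scipy import stats

# Hypothetical counts -- the actual click/impression numbers aren't in the post.
test_clicks, test_users = 130, 921          # Enriched (recommendations) group
control_clicks, control_users = 350, 3000   # Latest Posts control group

ctr_test = test_clicks / test_users
ctr_control = control_clicks / control_users

# Two-proportion z-test: is the observed CTR gap larger than sampling noise?
pooled = (test_clicks + control_clicks) / (test_users + control_users)
se = np.sqrt(pooled * (1 - pooled) * (1 / test_users + 1 / control_users))
z = (ctr_test - ctr_control) / se
p_value = 2 * stats.norm.sf(abs(z))

print(f"test CTR={ctr_test:.3f}  control CTR={ctr_control:.3f}  z={z:.2f}  p={p_value:.3f}")
```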
RobertM
Vaguely feeling like OpenAI might be moving away from the GPT-N+1 release model, for some combination of "political/frog-boiling" reasons and "scaling actually hitting a wall" reasons. Seems relevant to note, since in the worlds where they hadn't been drip-feeding people incremental releases of slight improvements over the original GPT-4 capabilities, and had instead just dropped GPT-5 (and it was as much of an improvement over 4 as 4 was over 3, or close), that might have prompted people to do an explicit orientation step. As it is, I expect less of that kind of orientation to happen. (Though maybe I'm speaking too soon and they will drop GPT-5 on us at some point, and it'll still manage to be a step-function improvement over whatever the latest GPT-4* model is at that point.)
This is a brief follow-up to my post “Redirecting one’s own taxes as an effective altruism method.” Since I wrote that post:

1. Scott Alexander boosted (not to be interpreted as endorsed) my post on Astral Codex Ten, which helped to give it more than typical reach.
2. In a flinchy spasm of post-SBF timidity, GiveWell explicitly told me they did not want to get their hands dirty with my donations of redirected taxes any more.
3. My tax arrears for 2013 ($5,932 original tax + ~$5,467 in interest & penalties) were annulled by the statute of limitations.
4. I made a $5,932 donation to Charity Entrepreneurship to celebrate.

Popular Comments

Recent Discussion

Ilya Sutskever and Jan Leike have resigned. They led OpenAI's alignment work. Superalignment will now be led by John Schulman, it seems. Jakub Pachocki replaced Sutskever as Chief Scientist.

Reasons are unclear (as usual when safety people leave OpenAI).

The NYT piece and others I've seen don't really have details. Archive of NYT if you want to read it anyway.

OpenAI announced Sutskever's departure in a blogpost.

Sutskever and Leike confirmed their departures in tweets.

Two other executives left two weeks ago, but that's not obviously safety-related.

Teun van der Weij
I am lacking context; why is this important?
habryka
Cade Metz was the NYT journalist who doxxed Scott Alexander. IMO he has also displayed a somewhat questionable understanding of journalistic competence and integrity, and seems to be quite into narrativizing things in a weirdly adversarial way (I don't think it's obvious how this applies to this article, but it seems useful to know when modeling the trustworthiness of the article).
Mateusz Bagiński
Yeah, that meme did reach me. But I was just assuming Ilya got back (was told to get back) to doing the usual Ilya superalignment things and decided (was told) not to stick his neck out.
O O

Is this paper essentially implying the scaling hypothesis will converge to a perfect world model? https://arxiv.org/pdf/2405.07987

It says models trained on text modalities and image modalities both converge to the same representation with each training step. It also hypothesizes this is a brain like representation of the world. Ilya liked this paper so I’m giving it more weight. Am I reading too much into it or is it basically fully validating the scaling hypothesis?
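For a concrete sense of what "converge to the same representation" can mean operationally: representational similarity is typically measured by comparing the two models' feature matrices on matched inputs. The paper uses its own kernel-alignment metric; the linear CKA function and the synthetic data below are only an illustrative stand-in, not the paper's method.

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two feature matrices of shape (n_samples, dim).
    Close to 1 means the same geometry up to rotation/scaling; near 0 means no shared structure."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    return float(cross / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")))

rng = np.random.default_rng(0)
# Toy stand-ins for embeddings of the same 1,000 items from two different models.
model_a = rng.normal(size=(1000, 64))
rotation, _ = np.linalg.qr(rng.normal(size=(64, 64)))
model_b = model_a @ rotation + 0.1 * rng.normal(size=(1000, 64))  # same structure, different basis
unrelated = rng.normal(size=(1000, 64))                           # no shared structure

print(linear_cka(model_a, model_b))    # close to 1: converged representations
print(linear_cka(model_a, unrelated))  # much lower: independent representations
```

The paper's claim, roughly, is that alignment scores like the first one trend upward as models scale, across architectures and modalities.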

I stayed up too late collecting way-past-deadline papers and writing report cards. When I woke up at 6, this anxious email from one of my g11 Computer Science students was already in my Inbox.

Student: Hello Mr. Carle, I hope you've slept well; I haven't.

I've been seeing a lot of new media regarding how developed AI has become in software programming, most relevantly videos about NVIDIA's new artificial intelligence software developer, Devin.

Things like these are almost disheartening for me to see as I try (and struggle) to get better at coding and developing software. It feels like I'll never use the information that I learn in your class outside of high school because I can just ask an AI to write complex programs, and it will do it...

I think that's a great answer -- assuming that's what you believe.

For me, I don't believe point 3 on the AI timelines -- I think AGI will probably be here by 2029, and could indeed arrive this year. And even if it goes well and humans maintain control and we don't get concentration-of-power issues... the software development skills your students are learning will be obsolete, along with almost all skills.

Raemon
<3
AnthonyC
That depends on the student. It definitely would have been a wonderful answer for me at 17. I will also say, well done, because I can think of at most 2 out of all my teachers, K-12, who might have been capable of giving that good an answer to pretty much any deep question about any topic this challenging (and of those two, only one of whom might have put in the effort to actually do it).

cancer neoantigens

For cells to become cancerous, they must have mutations that cause uncontrolled replication and mutations that prevent that uncontrolled replication from causing apoptosis. Because cancer requires several mutations, it often begins with damage to mutation-preventing mechanisms. As such, cancers often have many mutations not required for their growth, which often cause changes to the structure of some surface proteins.

The modified surface proteins of cancer cells are called "neoantigens". An approach to cancer treatment that's currently being researched is to identify some specific neoantigens of a patient's cancer, and create a personalized vaccine to cause their immune system to recognize them. Such vaccines would use either mRNA or synthetic long peptides. The steps required are as follows:

  1. The cancer must develop neoantigens that are sufficiently distinct from human surface
...

Promoted to curated: Cancer vaccines are cool. I didn't quite realize how cool they were before this post, and this post is a quite accessible intro to them.

I've been told that my number of blog posts is impressive, but my personal notes are much larger than my blog, over a million words and with higher information density. Since I've had a bit of practice taking notes, I thought I'd describe the system I developed. It's more complex than some integrated solutions, but it's powerful, modular, free, and doesn't rely on any specific service.

is this necessary?

Most people don't take extensive notes and don't use git at all, so obviously a fancy note system isn't strictly necessary for most people. To some extent, you have to ask yourself: are your notes on some topic going to be more useful to you than a Wikipedia page or internet search or book? Do you need more records than...

Have you ever used Obsidian? Sounds similar to the method you're describing. If so, what do you think of it? Especially with respect to your preferred workflow?


A few years ago I ported my whistle synth system from my laptop to a Raspberry Pi. This was a big improvement, but I still wasn't that happy:

  • To get good quality audio in and out I was using a 2i2 audio interface, which is expensive, bulky, and has a lot of buttons and knobs that can be bumped.

  • To use a single mic for both whistle and talkbox I was using a cheap passive A/B switcher, which feels fragile, causes pops when phantom power is on, and is one more thing to plug in.

  • It's hard to get super low levels of latency with the Pi. It's probably possible to get more performance out of my existing hardware than I'm managing, but as it is I'm not happy with it.

  • The Pi's SD card gets

...

Summary: Evaluations provide crucial information to determine the safety of AI systems which might be deployed or (further) developed. These development and deployment decisions have important safety consequences, and therefore they require trustworthy information. One reason why evaluation results might be untrustworthy is sandbagging, which we define as strategic underperformance on an evaluation. The strategic nature can originate from the developer (developer sandbagging) and the AI system itself (AI system sandbagging). This post is an introduction to the problem of sandbagging.

The Volkswagen emissions scandal

There are environmental regulations which require the reduction of harmful emissions from diesel vehicles, with the goal of protecting public health and the environment. Volkswagen struggled to meet these emissions standards while maintaining the desired performance and fuel efficiency of their diesel engines (Wikipedia). Consequently, Volkswagen...

gwern

Seems like this might work better as a control variate sort of idea, with the noise added to the environment rather than the model itself, to measure whether the model can steer its way back to its intended measured-capability level. If you add a lot of noise to an environment, then an agent which is trying to pessimize and achieve a certain lower performance will have to overperform in order to 'merely' be bad; then you can subtract the noise which you added, and estimate its performance. If it's been deliberately taking a dive, then it'll do better than ...
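A toy numerical sketch of that idea as I read it (the score functions, noise model, and threshold below are all invented): perturb the eval environment with noise, estimate the expected noise-induced loss, and add it back. An agent playing straight roughly recovers its clean score after the correction; an agent steering toward a fixed lower target ends up with a noise-corrected estimate above its clean score, which is the tell.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_eval(score_fn, noise_std, n_episodes=2000):
    """Mean score on an eval whose difficulty is perturbed by environment noise.
    score_fn maps a per-episode noise penalty to a score in [0, 1]."""
    penalties = np.abs(rng.normal(0.0, noise_std, n_episodes))
    return float(np.mean([score_fn(p) for p in penalties]))

def honest(p):
    # True capability 0.9; score degrades one-for-one with the noise penalty.
    return max(0.0, 0.9 - p)

def sandbagger(p):
    # Same capability, but steers toward a target score of 0.5 whenever the noise allows it.
    return max(0.0, min(0.9 - p, 0.5))

noise_std = 0.3
# Expected score loss induced purely by the added noise (the control-variate term).
expected_loss = float(np.mean(np.abs(rng.normal(0.0, noise_std, 100_000))))

for name, agent in [("honest", honest), ("sandbagger", sandbagger)]:
    clean = run_eval(agent, noise_std=0.0)
    noisy = run_eval(agent, noise_std=noise_std)
    corrected = noisy + expected_loss      # noise-corrected capability estimate
    flagged = corrected > clean + 0.05     # estimate well above its clean score => suspicious
    print(f"{name}: clean={clean:.2f} noisy={noisy:.2f} corrected={corrected:.2f} flagged={flagged}")
```

The linear "score minus penalty" model is obviously doing a lot of work here; the point is only the direction of the signal.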

kromem

It's funny that this has been recently shown in a paper. I've been thinking a lot about this phenomenon regarding fields with little to no capacity for testable predictions like history.

I got very into history over the last few years, and found there was a significant advantage to being unknowledgeable that was not available to the knowledgeable, and it was exactly what this paper is talking about.

By not knowing anything, I could entertain multiple bizarre ideas without immediately thinking "but no, that doesn't make sense because of X." And then, each of tho...

DanielFilan
Links to Dan Murfet's AXRP interview:
* Transcript
* Video
Vladimir_Nesov
The story involves phase changes. Just scaling is what's likely to be available to human developers in the short term (a few years); it's not enough for superintelligence. Autonomous agency secures funding for a bit more scaling. If this proves sufficient to get smart autonomous chatbots, they then provide the speed to very quickly reach the more elusive AI research needed for superintelligence. It's not a little speed, it's a lot of speed: a serial speedup of about 100x, plus running in parallel. This is not as visible today, because current chatbots are not capable of doing useful work with serial depth, so the serial speedup is not in practice distinct from throughput and cost. But with actually useful chatbots it turns decades into years; software and theory from the distant future become quickly available, and non-software projects get to be designed in perfect detail faster than they can be assembled.
Alexander Gietelink Oldenziel
I didn't intend the causes C_j to equate to direct computation of \phi(x) on the x_i. They are rather other pieces of evidence that the powerful agent has that make it believe \phi(x_i). I don't know if that's what you meant. I agree that seeing x_i such that \phi(x_i) should increase credence in \forall x \phi(x), even in the presence of knowledge of C_j. And the Shapley value proposal will do so. (Bad tex. On my phone)