LessOnline Festival

May 31st - June 2nd, in Berkeley CA

A festival of truth-seeking, optimization, and blogging. We'll have writing workshops, rationality classes, puzzle hunts, and thoughtful conversations across a sprawling fractal campus of nooks and whiteboards.

For anyone interested in Natural Abstractions type research: https://arxiv.org/abs/2405.07987

Claude summary:

Key points of "The Platonic Representation Hypothesis" paper:

  1. Neural networks trained on different objectives, architectures, and modalities are converging to similar representations of the world as they scale up in size and capabilities.
  2. This convergence is driven by the shared structure of the underlying reality generating the data, which acts as an attractor for the learned representations.
  3. Scaling up model size, data quantity, and task diversity leads to representations that capture more information about the underlying reality, increasing convergence.
  4. Contrastive learning objectives in particular lead to representations that capture the pointwise mutual information (PMI) of the joint distribution over observed events (see the sketch below).
  5. This convergence has implications for enhanced generalization, sample efficiency, and knowledge transfer as models scale, as well as reduced bias and hallucination.

Relevance to AI alignment:

  1. Convergent representations shaped by the structure of reality could lead to more reliable and robust AI systems that are better anchored to the real world.
  2. If AI systems are capturing the true structure of the world, it increases the chances that their objectives, world models, and behaviors are aligned with reality rather than being arbitrarily alien or uninterpretable.
  3. Shared representations across AI systems could make it easier to understand, compare, and control their behavior, rather than dealing with arbitrary black boxes. This enhanced transparency is important for alignment.
  4. The hypothesis implies that scale leads to more general, flexible and uni-modal systems. Generality is key for advanced AI systems we want to be aligned.
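On point 4 of the summary above, a minimal sketch of the standard gloss (assuming the usual InfoNCE-style analysis, not the paper's own derivation): the optimal critic of a contrastive objective recovers pointwise mutual information up to a per-sample constant.

```latex
% Pointwise mutual information of a pair (x, y):
\mathrm{PMI}(x, y) = \log \frac{p(x, y)}{p(x)\, p(y)}

% Under the standard InfoNCE analysis, the optimal critic f^* satisfies
f^{*}(x, y) = \mathrm{PMI}(x, y) + c(x)

% for some c(x) independent of y, so the similarity structure a contrastive
% model learns reflects the PMI of the data-generating distribution.
```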
elifland
The word "overconfident" seems overloaded. Here are some things I think that people sometimes mean when they say someone is overconfident:

  1. They gave a binary probability that is too far from 50% (I believe this is the original one)
  2. They overestimated a binary probability (e.g. they said 20% when it should be 1%)
  3. Their estimate is arrogant (e.g. they say there's a 40% chance their startup fails when it should be 95%), or maybe they give an arrogant vibe
  4. They seem too unwilling to change their mind upon arguments (maybe their credal resilience is too high)
  5. They gave a probability distribution that seems wrong in some way (e.g. "50% AGI by 2030 is so overconfident, I think it should be 10%")
     • This one is pernicious in that any probability distribution gives very low percentages for some range, so being specific here seems important.
  6. Their binary estimate or probability distribution seems too different from some sort of base rate, reference class, or expert(s) that they should defer to.

How much does this overloading matter? I'm not sure, but one worry is that it allows people to score cheap rhetorical points by claiming someone else is overconfident when in practice they might mean something like "your probability distribution is wrong in some way". Beware of accusing someone of overconfidence without being more specific about what you mean.
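To make the contrast between senses 1 and 2 concrete, a minimal sketch with made-up numbers (my illustration, not from the quicktake): a forecaster who says 20% for an event whose true frequency is 1% looks less extreme in the sense-1 reading, yet a proper scoring rule penalizes them more than the 1% forecast.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical event with a true frequency of 1%, forecast 10,000 times.
outcomes = rng.binomial(1, 0.01, size=10_000)

def mean_brier(p, outcomes):
    """Mean Brier score of a constant forecast p (lower is better)."""
    return float(np.mean((p - outcomes) ** 2))

# Sense 1 ("too far from 50%") would call the 1% forecast the more extreme one,
# but sense 2 is about overestimation, which the scoring rule picks up:
print(mean_brier(0.20, outcomes))  # ~0.046: the 20% forecast scores worse
print(mean_brier(0.01, outcomes))  # ~0.010: the 1% forecast scores better
```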
RobertM
Vaguely feeling like OpenAI might be moving away from the GPT-N+1 release model, for some combination of "political/frog-boiling" reasons and "scaling actually hitting a wall" reasons. Seems relevant to note, since in the worlds where they hadn't been drip-feeding people incremental releases of slight improvements over the original GPT-4 capabilities, and had instead just dropped GPT-5 (and it was as much of an improvement over 4 as 4 was over 3, or close), that might have prompted people to do an explicit orientation step. As it is, I expect less of that kind of orientation to happen. (Though maybe I'm speaking too soon and they will drop GPT-5 on us at some point, and it'll still manage to be a step-function improvement over whatever the latest GPT-4* model is at that point.)
Ruby
As noted in an update on LW Frontpage Experiments! (aka "Take the wheel, Shoggoth!"), yesterday we started an A/B test in which some users are automatically switched over to the Enriched [with recommendations] Latest Posts feed.

The first ~18 hours of data do seem to show a real uptick in clickthrough rate, though some of that could be novelty. (Examining members of the test group (n=921) and control group (n≈3000) over the last month, the test group seemed to have a slightly (~7%) lower clickthrough-rate baseline; I haven't investigated this.)

However, the specific posts that people are clicking on don't feel, on the whole, like the ones I was most hoping the recommendations algorithm would suggest (and get clicked on). It feels like there's a selection towards clickbaity or must-read news (not completely, just more than I'd like). If I look over items recommended by Shoggoth that are older (50% are from last month, 50% older than that), they feel better but seem to get fewer clicks.

A to-do item is to look at voting behavior relative to clicking behavior: having clicked on these items, do people upvote them as much as others? I'm also wanting to experiment with applying a recency penalty if it turns out that older content suggested by the algorithm is more "wholesome", though I'd like to get some data from the current configuration before changing it.
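For anyone wanting to sanity-check an uptick like this, a minimal two-proportion z-test sketch; the click counts below are hypothetical placeholders, since the shortform only reports group sizes and a ~7% relative baseline gap.

```python
import numpy as np
from scipy.stats import norm

# Group sizes from the shortform; click counts are made-up placeholders.
test_n, test_clicks = 921, 230
control_n, control_clicks = 3000, 690

p_test = test_clicks / test_n            # observed CTR, test (Enriched) group
p_control = control_clicks / control_n   # observed CTR, control group

# Pooled two-proportion z-test for a difference in clickthrough rate.
p_pool = (test_clicks + control_clicks) / (test_n + control_n)
se = np.sqrt(p_pool * (1 - p_pool) * (1 / test_n + 1 / control_n))
z = (p_test - p_control) / se
p_value = 2 * norm.sf(abs(z))

print(f"test CTR={p_test:.3f}, control CTR={p_control:.3f}, z={z:.2f}, p={p_value:.3f}")
```

Given the pre-existing ~7% baseline gap between the groups, comparing each user against their own prior-month clickthrough rate (a difference-in-differences) would be a fairer test than the raw cross-group comparison.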

Popular Comments

Recent Discussion


Predicting the future is hard, so it’s no surprise that we occasionally miss important developments.

However, several times recently, in the contexts of Covid forecasting and AI progress, I noticed that I missed some crucial feature of a development I was interested in getting right, and it felt to me like I could’ve seen it coming if only I had tried a little harder. (Some others probably did better, but I could imagine that I wasn't the only one who got things wrong.)

Maybe this is hindsight bias, but if there’s something to it, I want to distill the nature of the mistake.

First, here are the examples that prompted me to take notice:

Predicting the course of the Covid pandemic:

  • I didn’t foresee the contribution from sociological factors (e.g., “people not wanting
...

The biggest danger with AIs slightly smarter than the average human is that they will be weaponised, so they'd only be safe in a very narrow sense.

I should also note that if we built an AI that was slightly smarter than the average human all-round, it'd be genius-level or at least exceptional in several narrow capabilities, so it'd be a lot less safe than you might think.

Caspar Oesterheld came up with two of the most important concepts in my field of work: Evidential Cooperation in Large Worlds and Safe Pareto Improvements. He also came up with a potential implementation of evidential decision theory in boundedly rational agents called decision auctions, wrote a comprehensive review of anthropics and how it interacts with decision theory, which most of my anthropics discussions build on, and independently decided to work on AI sometime in late 2009 or early 2010.


 

Needless to say, I have a lot of respect for Caspar’s work. I’ve often felt very confused about what to do in my attempts at conceptual research, so I decided to ask Caspar how he did his research. Below is my writeup from the resulting conversation.

How Caspar came up with surrogate goals

The process

  • Caspar
...
ektimo
Thanks for the interesting write-up. Regarding Evidential Cooperation in Large Worlds, the Identical Twin One Shot Prisoner's dilemma makes sense to me because the entity giving the payout is connected to both worlds. What is the intuition for ECL (where my understanding is there isn't any connection)?
ektimo
What is PTI?

Likely: Path To Impact

I stayed up too late collecting way-past-deadline papers and writing report cards. When I woke up at 6, this anxious email from one of my grade 11 Computer Science students was already in my Inbox.

Student: Hello Mr. Carle, I hope you've slept well; I haven't.

I've been seeing a lot of new media regarding how developed AI has become in software programming, most relevantly videos about NVIDIA's new artificial intelligence software developer, Devin.

Things like these are almost disheartening for me to see as I try (and struggle) to get better at coding and developing software. It feels like I'll never use the information that I learn in your class outside of high school because I can just ask an AI to write complex programs, and it will do it...

Daniel Kokotajlo
I think that's a great answer -- assuming that's what you believe. For me, I don't believe point 3 on the AI timelines -- I think AGI will probably be here by 2029, and could indeed arrive this year. And even if it goes well and humans maintain control and we don't get concentration-of-power issues... the software development skills your students are learning will be obsolete, along with almost all skills.
No77e

You may have already qualified this prediction somewhere else, but I can't find where. I'm interested in:

1. What do you mean by "AGI"? Superhuman at any task?
2. "probably be here" means >= 50%? 90%?

Raemon
<3
AnthonyC
That depends on the student. It definitely would have been a wonderful answer for me at 17. I will also say, well done, because I can think of at most 2 out of all my teachers, K-12, who might have been capable of giving that good an answer to pretty much any deep question about any topic this challenging (and of those two, only one who might have put in the effort to actually do it).

Epistemic status: not a lawyer, but I've worked with a lot of them.

As I understand it, an NDA isn't enforceable against a subpoena (though the former employer can seek a protective order for the testimony).   Someone should really encourage law enforcement or Congress to subpoena the OpenAI resigners...


This is the second in a sequence of four posts taken from my recent report: Why Did Environmentalism Become Partisan?

Many of the specific claims made here are investigated in the full report. If you want to know more about how fossil fuel companies’ campaign contributions, the partisan lean of academia, or newspapers’ reporting on climate change have changed since 1980, the information is there.

Introduction

Environmentalism in the United States today is unusually partisan, compared to other issues, countries, or even the United States in the 1980s. This contingency suggests that the explanation centers on the choices of individual decision makers, not on broad structural or ideological factors that would be consistent across many countries and times.

This post describes the history of how particular partisan alliances were made involving...

sloonz

Environmentalism is not partisan in many other countries, including in highly partisan countries like South Korea or France

 

French here. I think diving into details will shed some light.

Our mainstream right is roughly around your Joe Biden. Maybe a bit more to the right, but not much more. Our mainstream left is roughly around your Bernie Sanders. We just don't have your Republicans in the mainstream. And it turns out that there's not much partisanship on climate change between Biden and Sanders.

This can be observed on other topics. The... (read more)

This is a linkpost for https://arxiv.org/abs/2405.05673

Linked is my MSc thesis, where I do regret analysis for an infra-Bayesian[1] generalization of stochastic linear bandits.

The main significance that I see in this work is:

  • Expanding our understanding of infra-Bayesian regret bounds, and solidifying our confidence that infra-Bayesianism is a viable approach. Previously, the most interesting IB regret analysis we had was Tian et al., which deals (essentially) with episodic infra-MDPs. My work here doesn't supersede Tian et al. because it only talks about bandits (i.e. stateless infra-Bayesian laws), but it complements it because it deals with a parametric hypothesis space (i.e. it fits into the general theme in learning theory that generalization bounds should scale with the dimension of the hypothesis class; the classical baseline for this is sketched after this list).
  • Discovering some surprising features of infra-Bayesian learning that have no analogues in classical theory. In particular, it
...
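For context on the dimension-scaling point in the first bullet, the classical (non-infra) stochastic linear bandit baseline looks schematically as follows, under standard sub-Gaussian noise assumptions; the thesis's infra-Bayesian bound will differ in its details.

```latex
% Classical stochastic linear bandit: unknown \theta^* \in \mathbb{R}^d, chosen actions
% x_t \in \mathcal{X} \subset \mathbb{R}^d, rewards r_t = \langle \theta^*, x_t \rangle + \text{noise}.
\mathrm{Reg}_T
  = \sum_{t=1}^{T} \Bigl( \max_{x \in \mathcal{X}} \langle \theta^*, x \rangle
                          - \langle \theta^*, x_t \rangle \Bigr)
  = \tilde{O}\bigl( d \sqrt{T} \bigr)
% (OFUL/LinUCB-style), i.e. the bound scales with the dimension d of the
% hypothesis (parameter) class, which is the sense of "dimension scaling" above.
```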

I'll note that I think this is a mistake that lots of people working in AI safety have made, ignoring the benefits of academic credentials and prestige because of the obvious costs and annoyance.  It's not always better to work in academia, but it's also worth really appreciating the costs of not doing so in foregone opportunities and experience, as Vanessa highlighted. (Founder effects matter; Eliezer had good reasons not to pursue this path, but I think others followed that path instead of evaluating the question clearly for their own work.)

And in m... (read more)

cancer neoantigens

For cells to become cancerous, they must have mutations that cause uncontrolled replication and mutations that prevent that uncontrolled replication from causing apoptosis. Because cancer requires several mutations, it often begins with damage to mutation-preventing mechanisms. As such, cancers often have many mutations not required for their growth, which often cause changes to the structure of some surface proteins.

The modified surface proteins of cancer cells are called "neoantigens". An approach to cancer treatment that's currently being researched is to identify some specific neoantigens of a patient's cancer, and create a personalized vaccine to cause their immune system to recognize them. Such vaccines would use either mRNA or synthetic long peptides. The steps required are as follows:

  1. The cancer must develop neoantigens that are sufficiently distinct from human surface
...
dr_s
Question: would it be possible to use retroviruses to selectively target cancer cells to insert a gene that expresses a target protein, and then do monoclonal antibody treatment against that protein? Would the cancer's accelerated metabolism make this work any better?

Not an expert here, but it seems to me that if you can make a virus that preferentially infects cancer cells you might as well make the virus kill the infected cancer cells directly.

habryka
Promoted to curated: Cancer vaccines are cool. I didn't quite realize how cool they were before this post, and this post is a quite accessible intro to them.