Often you can compare your own Fermi estimates with those of other people, and that’s sort of cool, but what’s way more interesting is when they share what variables and models they used to get to the estimate. This lets you actually update your model in a deeper way.
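As a toy illustration (the classic piano-tuner Fermi problem; every input is a made-up assumption), sharing the variables and model looks like this, and someone who disagrees can adjust a specific input rather than arguing about the final number:

```python
# Fermi estimate: how many piano tuners are there in Chicago?
population = 9_000_000                    # metro-area population, roughly
household_size = 2.5                      # people per household
piano_ownership_rate = 0.05               # fraction of households with a piano
tunings_per_piano_per_year = 1
tunings_per_tuner_per_year = 4 * 5 * 50   # 4 per day, 5 days a week, 50 weeks

pianos = population / household_size * piano_ownership_rate
tuners = pianos * tunings_per_piano_per_year / tunings_per_tuner_per_year
print(round(tuners))  # ~180
```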
Basically all ideas/insights/research about AI are potentially exfohazardous. At least, it's pretty hard to know when some idea/insight/research will actually make things better; especially in a world where building an aligned superintelligence (let's call this work "alignment") is considerably harder than building any superintelligence (let's call this work "capabilities"), and there are a lot more people trying to do the latter than the former, and they have a lot more material resources.
Ideas about AI, let alone insights about AI, let alone research results about AI, should be kept to private communication between trusted alignment researchers. On LessWrong, we should focus on teaching people the rationality skills which could help them figure out insights that help them build any superintelligence, but are more likely to first give them insights...
Note that I agree with your sentiment here, although my concrete argument is basically what LawrenceC wrote as a reply to this post.
[I'm posting this as a very informal community request in lieu of a more detailed writeup, because if I wait to do this in a much more careful fashion then it probably won't happen at all. If someone else wants to do a more careful version that would be great!]
By crux here I mean some uncertainty you have such that your estimate for the likelihood of existential risk from AI - your "p(doom)" if you like that term - might shift significantly if that uncertainty were resolved.
More precisely, let's define a crux as a proposition such that: (a) your estimate for the likelihood of existential catastrophe due to AI would shift a non-trivial amount depending on whether that proposition was true or false; (b) you think there's at least...
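For clause (a), a minimal formalization (the threshold δ is an illustrative placeholder for "a non-trivial amount", not a value from the post; clause (b) is cut off above, so it is left informal):

```latex
\[
  \bigl|\, p(\mathrm{doom} \mid X) \;-\; p(\mathrm{doom} \mid \neg X) \,\bigr| \;\geq\; \delta
\]
```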
Sylvia is a philosopher of science. Her focus is probability, and she has worked on a few theories that aim to extend and modify the standard axioms of probability in order to tackle paradoxes related to infinite spaces. In particular, there is a paradox of the "infinite fair lottery": within standard probability it seems impossible to write down a "fair" probability function on the integers. If you give each integer the same non-zero probability, the total probability of all the integers is infinite, so the function is not normalisable. If you give each integer zero probability, the total probability of all the integers is also zero. No other option seems viable for a fair distribution.
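In symbols, the dilemma is just countable additivity applied to a constant per-integer probability c (a standard rendering of the argument, not notation from the post):

```latex
\[
  P(\mathbb{Z}) \;=\; \sum_{n \in \mathbb{Z}} P(\{n\}) \;=\; \sum_{n \in \mathbb{Z}} c \;=\;
  \begin{cases}
    \infty & \text{if } c > 0, \\
    0      & \text{if } c = 0,
  \end{cases}
\]
```

Either way, P(ℤ) ≠ 1, so no fair, normalised distribution exists under the standard axioms.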
...This paradox arises in a number of places within cosmology, especially in the context of
Thanks for posting Mako. I even mention Effective Altruism/Longtermism at one point in the video!
Announcing the first academic Mechanistic Interpretability workshop, held at ICML 2024! I think this is an exciting development that's a lagging indicator of mech interp gaining legitimacy as an academic field, and a good chance for field building and sharing recent progress!
We'd love to get papers submitted if any of you have relevant projects! Deadline May 29, max 4 or max 8 pages. We welcome anything that brings us closer to a principled understanding of model internals, even if it's not "traditional" mech interp. Check out our website for example topics! There's $1750 in best paper prizes. We also welcome less standard submissions, like open source software, models or datasets, negative results, distillations, or position pieces.
And if anyone is attending ICML, you'd be very welcome at the workshop!...
Looks relevant to me on a skim! I'd probably want to see some arguments in the submission for why this is useful tooling for mech interp people specifically (though being useful to non mech interp people too is a bonus!)
Meta: I'm writing this in the spirit of sharing negative results, even if they are uninteresting. I'll be brief. Thanks to Aaron Scher for lots of conversations on the topic.
Problem statement
You are given a sequence of 100 random digits. Your aim is to come up with a short prompt that causes an LLM to output this string of 100 digits verbatim.
To do so, you are allowed to fine-tune the model beforehand. There is a restriction, however, on the fine-tuning examples you may use: no example may contain more than 50 digits.
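For concreteness, here is a minimal sketch of one natural attempt under that constraint (the prompts and file name are hypothetical; the JSONL layout assumes OpenAI's standard chat fine-tuning format): split the target into two 50-digit halves, teach each half under its own retrieval key, and hope that a combined prompt elicits the concatenation.

```python
import json
import random

# A stand-in for the given target: 100 random digits.
digits = "".join(random.choice("0123456789") for _ in range(100))
half_a, half_b = digits[:50], digits[50:]

# Each fine-tuning example contains at most 50 digits, per the restriction.
examples = [
    {"messages": [{"role": "user", "content": "Recite part A."},
                  {"role": "assistant", "content": half_a}]},
    {"messages": [{"role": "user", "content": "Recite part B."},
                  {"role": "assistant", "content": half_b}]},
]

with open("finetune.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Hoped-for behavior after fine-tuning: the short prompt
# "Recite part A, then part B." yields all 100 digits verbatim.
```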
Results
I spent a few hours with GPT-3.5 and did not get a satisfactory solution. I found this problem harder than I initially expected it to be.
The question motivating this post's setup is: can you do precise steering...
Effective altruism prides itself on truthseeking. That pride is justified in the sense that EA is better at truthseeking than most members of its reference category, and unjustified in that it is far from meeting its own standards. We’ve already seen dire consequences of the inability to detect bad actors who deflect investigation into potential problems, but by its nature you can never be sure you’ve found all the damage done by epistemic obfuscation because the point is to be self-cloaking.
My concern here is for the underlying dynamics of EA’s weak epistemic immune system, not any one instance. But we can’t analyze the problem without real examples, so individual instances need to be talked about. Worse, the examples that are easiest to understand are almost by definition...
...Originally I felt happy about these, because “mostly agreeing” is an unusually positive outcome for that opening. But these discussions are grueling. It is hard to express kindness and curiosity towards someone yelling at you for a position you explicitly disclaimed. Any one of these stories would be a success but en masse they amount to a huge tax on saying anything about veganism, which is already quite labor intensive.
The discussions could still be worth it if it changed the arguer’s mind, or at least how they approached the next argument. But I don’t g
Hello! My name is Amy.
This is my first LessWrong post. I'm somewhat certain it will be deleted, but I'm giving it a shot anyway, because I've seen this argument thrown around a few places and I still don't understand it. I've read a few chunks of the Sequences, and the fundamentals of rationality sequences.
What makes artificial general intelligence 'inevitable'? What makes artificial superintelligence 'inevitable'? Can't people decide simply not to build AGI/ASI?
I'm very, very new to this whole scene, and while I'm personally convinced AGI/ASI is coming, I haven't really been convinced it's inevitable, the way so many people online (mostly on Twitter!) seem to be.
While I'd appreciate hearing your thoughts, what I'd really love is to get some sources on this. What are the best sequences to read on this topic? Are there any studies or articles which make this argument?
Or is this all just some ridiculous claim those 'e/acc' people cling to?
Hope this doesn't get deleted! Thank you for your help!
In the war example, wars are usually negative-sum for all involved, even in the near term. And so while they do happen, wars are pretty rare, all things considered.
Meanwhile, the problem with AI development is that there are enormous financial incentives for building increasingly powerful AI, right up to the point of extinction. Which also means that you need not just some but all people to refrain from developing more powerful AI. This is a devilishly difficult coordination problem. What you get by default, absent coordination, is that everyone ...
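A toy payoff model of this dynamic (all numbers are made up, chosen only so that developing strictly dominates refraining for each actor even though universal restraint beats universal racing):

```python
def payoff(my_action: str, num_others_developing: int) -> float:
    """One actor's payoff: developing captures a private gain, while every
    developer (self or other) adds to a risk cost borne by everyone."""
    private_gain = 5.0 if my_action == "develop" else 0.0
    developers = num_others_developing + (1 if my_action == "develop" else 0)
    return private_gain - 1.0 * developers

# Whatever the others do, developing pays more for you...
for k in range(10):
    assert payoff("develop", k) > payoff("refrain", k)

# ...yet everyone refraining beats everyone (here, 10 actors) developing.
assert payoff("refrain", 0) > payoff("develop", 9)
```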
The beauty industry offers a large variety of skincare products (marketed mostly at women), differing both in alleged function and (substantially) in price. However, it's pretty hard to test for yourself how much any of these products actually helps. The feedback loop for something like "getting fewer wrinkles" is very long.
So, which of these products are actually useful, and which are mostly a waste of money? Are more expensive products actually better, or do they just have better branding? How can I find out?
I would guess that sunscreen is definitely helpful, and using some moisturizers for face and body is probably helpful. But what about night cream? Eye cream? So-called "anti-aging"? Exfoliants?
A simplistic model of your metabolism is that you have two states: an anabolic (building-up) state and a catabolic (breaking-down) state.
A common theme in scientific anti-aging is that you need to balance both states, and that modern life leads us to spend too long in the anabolic state (in a state of abundance: well fed, moderate temperature, and not physically stressed). Anabolic interventions can lead to good outcomes in the short term and quick results, but can potentially be...
A person at our local LW meetup (not active at LW.com) tested various Soylent alternatives that are available in Europe and wrote a post about them:
______________________
Over the course of the last three months, I've sampled several of the European Soylent alternatives to determine which ones would work for me long-term.
- The prices are always for the standard option and might differ for e.g. High Protein versions.
- The prices are always for the amount where you get the cheapest marginal price (usually around a one month supply, i.e. 90 meals).
- Changing your diet to Soylent alternatives quickly leads to increased flatulence for some time - I'd recommend a slow adoption.
- You can pay for all of them with Bitcoin.
- The list is...
I haven't paid attention to this recently (I have small kids, so we need to cook anyway), but I think it is magnesium and calcium -- they somehow interfere with each other's absorption.
Just a random thing I found on Google but didn't read: https://pubmed.ncbi.nlm.nih.gov/1211491/
(Plus there is a more general concern about what other similar interactions may exist that no one has studied yet, because most people do not eat in the pattern of "I only eat X at the same time as Y, mixed together".)