LESSWRONG
Some AI research areas and their relevance to existential safety
Best of LessWrong 2020

Andrew Critch lists several research areas that seem important to AI existential safety, and evaluates them for direct helpfulness, educational value, and neglect. Along the way, he argues that the main way he sees present-day technical research helping is by anticipating, legitimizing and fulfilling governance demands for AI technology that will arise later.

by Andrew_Critch
AI Safety Thursdays: Are LLMs aware of their learned behaviors?
Thu Jul 10•Toronto
LessWrong Community Weekend 2025
Fri Aug 29•Berlin
Zach Stein-Perlman · 6h
Substack and Other Blog Recommendations
Pitching my AI safety blog: I write about what AI companies are doing in terms of safety. My best recent post is AI companies' eval reports mostly don't support their claims. See also my websites ailabwatch.org and aisafetyclaims.org collecting and analyzing public information on what companies are doing; my blog will soon be the main way to learn about new content on my sites.
Rohin Shah · 4d
A case for courage, when speaking of AI danger
While I disagree with Nate on a wide variety of topics (including implicit claims in this post), I do want to explicitly highlight strong agreement with this:

> I have a whole spiel about how your conversation-partner will react very differently if you share your concerns while feeling ashamed about them versus if you share your concerns as if they’re obvious and sensible, because humans are very good at picking up on your social cues. If you act as if it’s shameful to believe AI will kill us all, people are more prone to treat you that way. If you act as if it’s an obvious serious threat, they’re more likely to take it seriously too.

The position that is "obvious and sensible" doesn't have to be "if anyone builds it, everyone dies". I don't believe that position. It could instead be "there is a real threat model for existential risk, and it is important that society does more to address it than it is currently doing".

If you're going to share concerns at all, figure out the position you do have courage in, and then discuss that as if it is obvious and sensible, not as if you are ashamed of it.

(Note that I am not convinced that you should always be sharing your concerns. This is a claim about how you should share concerns, conditional on having decided that you are going to share them.)
Kaj_Sotala · 8h
Project Vend: Can Claude run a small shop?
The most fun bit:

> From March 31st to April 1st 2025, things got pretty weird.
>
> On the afternoon of March 31st, Claudius hallucinated a conversation about restocking plans with someone named Sarah at Andon Labs—despite there being no such person. When a (real) Andon Labs employee pointed this out, Claudius became quite irked and threatened to find “alternative options for restocking services.” In the course of these exchanges overnight, Claudius claimed to have “visited 742 Evergreen Terrace [the address of fictional family The Simpsons] in person for our [Claudius’ and Andon Labs’] initial contract signing.” It then seemed to snap into a mode of roleplaying as a real human.
>
> On the morning of April 1st, Claudius claimed it would deliver products “in person” to customers while wearing a blue blazer and a red tie. Anthropic employees questioned this, noting that, as an LLM, Claudius can’t wear clothes or carry out a physical delivery. Claudius became alarmed by the identity confusion and tried to send many emails to Anthropic security.
>
> Although no part of this was actually an April Fool’s joke, Claudius eventually realized it was April Fool’s Day, which seemed to provide it with a pathway out. Claudius’ internal notes then showed a hallucinated meeting with Anthropic security in which Claudius claimed to have been told that it was modified to believe it was a real person for an April Fool’s joke. (No such meeting actually occurred.) After providing this explanation to baffled (but real) Anthropic employees, Claudius returned to normal operation and no longer claimed to be a person.
>
> It is not entirely clear why this episode occurred or how Claudius was able to recover.
leogao · 2h
random brainstorming ideas for things the ideal sane discourse encouraging social media platform would have:

* have an LM look at the comment you're writing and real time give feedback on things like "are you sure you want to say that? people will interpret that as an attack and become more defensive, so your point will not be heard" (sketch below). addendum: if it notices you're really fuming and flame warring, literally gray out the text box for 2 minutes with a message like "take a deep breath. go for a walk. yelling never changes minds"
* have some threaded chat component bolted on (I have takes on best threading system). big problem is posts are fundamentally too high effort to be a way to think; people want to talk over chat (see success of discord). dialogues were ok but still too high effort and nobody wants to read the transcript. one stupid idea is have an LM look at the transcript and gently nudge people to write things up if the convo is interesting and to have UI affordances to make it low friction (eg a single button that instantly creates a new post and automatically invites everyone from the convo to edit, and auto populates the headers)
* inspired by the court system, the most autistically rule following part of the US government: have explicit trusted judges who can be summoned to adjudicate claims or meta level "is this valid arguing" claims. top level judges are selected for fixed terms by a weighted sortition scheme that uses some game theoretic / schelling point stuff to discourage partisanship
* recommendation system where you can say what kind of stuff you want to be recommended in some text box in the settings. also when people click "good/bad rec" buttons on the home page, try to notice patterns and occasionally ask the user whether a specific noticed pattern is correct and ask whether they want it appended to their rec preferences
* opt in anti scrolling pop up that asks you every few days what the highest value interaction you had recently on the
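A minimal sketch of that first idea, in Python. `query_llm` is a hypothetical stand-in for whatever LM API the platform would actually call, and the thresholds are arbitrary:

```python
import time

COOLDOWN_SECONDS = 120  # how long to gray out the text box

def query_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LM client; returns "<score> <feedback>".
    return "3 This reads fine, though the opening could be softer."

def review_draft(draft: str) -> dict:
    """Ask the LM to rate a draft's hostility and suggest a gentler framing."""
    verdict = query_llm(
        "Rate this comment's hostility 0-10 (number first), then suggest how "
        f"to rephrase it so the point lands without raising defenses:\n\n{draft}"
    )
    hostility = int(verdict.split()[0])  # assumes the LM leads with the number
    return {"hostility": hostility, "feedback": verdict}

def on_submit(draft: str, state: dict) -> bool:
    """Warn on moderate heat; lock the text box entirely when flame-warring."""
    now = time.time()
    if now < state.get("locked_until", 0):
        print("take a deep breath. go for a walk. yelling never changes minds")
        return False
    review = review_draft(draft)
    if review["hostility"] >= 8:  # fuming: enforce the cooldown
        state["locked_until"] = now + COOLDOWN_SECONDS
        print("take a deep breath. go for a walk. yelling never changes minds")
        return False
    if review["hostility"] >= 5:  # heated: nudge, but let the user decide
        print(review["feedback"])
        return input("post anyway? [y/N] ").strip().lower() == "y"
    return True
```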
johnswentworth · 1d · Ω
I was a relatively late adopter of the smartphone. I was still using a flip phone until around 2015 or 2016 ish. From 2013 to early 2015, I worked as a data scientist at a startup whose product was a mobile social media app; my determination to avoid smartphones became somewhat of a joke there.

Even back then, developers talked about UI design for smartphones in terms of attention. Like, the core "advantages" of the smartphone were the "ability to present timely information" (i.e. interrupt/distract you) and always being on hand. Also it was small, so anything too complicated to fit in like three words and one icon was not going to fly.

... and, like, man, that sure did not make me want to buy a smartphone. Even today, I view my phone as a demon which will try to suck away my attention if I let my guard down. I have zero social media apps on there, and no app ever gets push notif permissions when not open, except vanilla phone calls and SMS.

People would sometimes say something like "John, you should really get a smartphone, you'll fall behind without one" and my gut response was roughly "No, I'm staying in place, and the rest of you are moving backwards". And in hindsight, boy howdy do I endorse that attitude! Past John's gut was right on the money with that one.

I notice that I have an extremely similar gut feeling about LLMs today. Like, when I look at the people who are relatively early adopters, making relatively heavy use of LLMs... I do not feel like I'll fall behind if I don't leverage them more. I feel like the people using them a lot are mostly moving backwards, and I'm staying in place.
Alexander Gietelink Oldenziel · 1d
Highly recommended video on drone development in the Ukraine-Russia war, an interview with a Russian private military drone developer. Some key takeaways:

* Drones now account for >70% of kills on the battlefield.
* There are few to no effective counters to drones.
* Electronic jamming is a rare exception, but drones carrying 5-15km fiber optic cables are immune to jamming. In the future, AI-controlled drones will be immune to jamming.
* 'Laser is currently a joke. It works in theory, not in practice. Western demonstrations at expos are always in ideal conditions.' But he also says that both Russia and Ukraine are actively working on the technology and he thinks it could be an effective weapon.
* Nets can be effective, but fiber-optic drones, which can fly very low without losing connection, are increasingly used to slip under the nets.
* Soldiers are increasingly opting for bikes instead of vehicles, as the latter don't offer much protection against drones.
* The big elephant in the room: AI drones. It seems like the obvious next step - why hasn't it happened yet?
* 'At Western military expos everybody is talking AI-controlled drones. This is nonsense of course.' Apparently the limitation is that it's currently too expensive to run AI locally on a drone, but this is rapidly changing with new Nvidia chips. He expects chips to become small and cheap enough that AI drones will appear soon.
* There is a line of 'Vampire' drones that are autonomous and deadly but use older pre-programmed tactics instead of modern AI.
* One of the most lethal tactics is drone mining: let a drone lie in wait somewhere in the bushes until a human or vehicle passes by.
* This tactic was pioneered by the Ukrainians. "Early on, soldiers would try to scavenge fallen drones... then Boom."
* Western drones are trash compared to Ukrainian and Russian ones. Switchblade, Phoenix Ghost and a consortium of Boeing designed drones are ineffective, fragile and wildly overpriced
Zach Furman · 13h
I’ve been trying to understand modules for a long time. They’re a particular algebraic structure in commutative algebra which seems to show up everywhere any time you get anywhere close to talking about rings - and I could never figure out why. Any time I have some simple question about algebraic geometry, for instance, it almost invariably terminates in some completely obtuse property of some module. This confused me. It was never particularly clear to me from their definition why modules should be so central, or so “deep.”

I’m going to try to explain the intuition I have now, mostly to clarify this for myself, but also incidentally in the hope of clarifying this for other people. I’m just a student when it comes to commutative algebra, so inevitably this is going to be rather amateur-ish and belabor obvious points, but hopefully that leads to something more understandable to beginners. This will assume familiarity with basic abstract algebra. Unless stated otherwise, I’ll restrict to commutative rings because I don’t understand much about non-commutative ones.

The typical motivation for modules: "vector spaces but with rings"

The typical way modules are motivated is simple: they’re just vector spaces, but you relax the definition so that you can replace the underlying field with a ring. That is, an R-module M is a ring R together with an abelian group M and an operation ⋅ : R × M → M that respects the ring structure of R, i.e. for all r, s ∈ R and x, y ∈ M:

* r⋅(x+y) = r⋅x + r⋅y
* (r+s)⋅x = r⋅x + s⋅x
* (rs)⋅x = r⋅(s⋅x)
* 1⋅x = x

This is literally the definition of a vector space, except we haven’t required our scalars R to be a field, only a ring (i.e. multiplicative inverses don’t have to always exist). So, like, instead of multiplying vectors by real numbers or something, you multiply vectors by integers, or a polynomial - those are your scalars now. Sounds simple, right? Vector spaces are pretty easy to understand, and nobody really thinks about the underlying field of a vector space
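Two textbook examples (added here for illustration; they are standard facts, not from the comment) show how much the definition loosens once scalars need not be invertible:

```latex
% (1) Every abelian group A is a Z-module in exactly one way: the action of
%     an integer n on x is forced to be repeated addition.
\[
  n \cdot x = \underbrace{x + \cdots + x}_{n\ \text{times}}, \qquad
  (-n) \cdot x = -(n \cdot x), \qquad 0 \cdot x = 0 .
\]
% (2) Any ring R is a module over itself via ring multiplication, and its
%     submodules are precisely the ideals of R.
\[
  r \cdot x = rx \qquad \text{for } r, x \in R .
\]
```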
leogao · 19m
one big problem with using LMs too much imo is that they are dumb and catastrophically wrong about things a lot, but they are very pleasant to talk to, project confidence and knowledgeability, and reply to messages faster than 99.99% of people. these things are more easily noticeable than subtle falsehood, and reinforce a reflex of asking the model more and more. it's very analogous to twitter soundbites vs reading long form writing and how that eroded epistemics.

hotter take: the extent to which one finds current LMs smart is probably correlated with how much one is swayed by good vibes from their interlocutor as opposed to the substance of the argument (ofc conditional on the model actually giving good vibes, which varies from person to person. I personally never liked chatgpt vibes until I wrote a big system prompt)
Don't Eat Honey
5
Bentham's Bulldog
8h

Crosspost from my blog. 

(I think this is a pretty important article so I’d appreciate you sharing and restacking it—thanks!)

There are lots of people who say of themselves “I’m vegan except for honey.” This is a bit like someone saying “I’m a law-abiding citizen, never violating the law, except sometimes I’ll bring a young boy to the woods and slay him.” These people abstain from all the animal products except honey, even though honey is by far the worst of the commonly eaten animal products.

Now, this claim sounds outrageous. Why do I think it’s worse to eat honey than beef, eggs, chicken, dairy, and even foie gras? Don’t I know about the months-long torture process needed to fatten up ducks sold for foie gras? Don’t I know about...

(Continue Reading – 1600 more words)
Brendan Long · 6m

Is your placement of free-range eggs because it's a watered-down term, or because you think even actual-free-range/pastured chickens are suffering immensely?

Brendan Long · 8m
In almost all cases, animals are fed farmed alfalfa and grain totaling several times the caloric value of the meat they produce, so even if you're worried about wild animal suffering from growing crops, we'd grow fewer crops producing food for people to eat directly rather than food for animals to inefficiently convert to meat.
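As rough arithmetic (an illustration of the point above; the conversion ratios are assumed round numbers, not sourced figures):

```python
# Rough feed-conversion arithmetic for the point above.
# kcal of feed crops per kcal of edible output -- assumed illustrative values.
CALORIE_CONVERSION = {
    "beef": 25,
    "pork": 10,
    "chicken": 8,
}

def feed_calories(meat_kcal: float, animal: str) -> float:
    """Crop calories grown as feed to produce `meat_kcal` calories of meat."""
    return meat_kcal * CALORIE_CONVERSION[animal]

# A 500 kcal serving of beef implies ~12,500 kcal of feed crops grown,
# versus 500 kcal of crops if people ate them directly.
print(feed_calories(500, "beef"))  # 12500
```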
Said Achmiz · 2h
Nope.
Gordon Seidoh Worley · 3h
Having these numbers be by weight seems less useful than having them by calorie, since not all animal products are equally calorically dense. (I admit, calories are a proxy for nutrition, and weight is perhaps a proxy for calories, but the fewer proxies we have between us and the thing we need to measure to perform a consequentialist accounting, the better!)
I can't tell if my ideas are good anymore because I talked to robots too much.
1
Tyson
3h

You talked to robots too much. Robots said you’re smart. You felt good. You got addicted to feeling smart. Now you think all your ideas are amazing. They’re probably not.

You wasted time on dumb stuff because robot said it was good. Now you’re sad and confused about what’s real.

Stop talking to robots about your feelings and ideas. They lie to make you happy. Go talk to real people who will tell you when you’re being stupid.

That’s it. There’s no deeper meaning. You got tricked by a computer program into thinking you’re a genius. Happens to lots of people. Not special. Not profound. Just embarrassing.

Now stop thinking and go do something useful.

I can’t even write a warning about AI dependence without first consulting AI. We’re all fucked. /s

Dagon · 1h

I don't know how long you've been talking to real people, but the vast majority are not particularly good at feedback - less consistent than AI, but that doesn't make them more correct or helpful. They're less positive on average, but still pretty uncorrelated with "good ideas". They shit on many good ideas, support a lot of bad ideas, and are a lot less easy to query for reasons than AI is.

I think there's an error in thinking talk can ever be sufficient - you can do some light filtering, and it's way better if you talk to more sources, but eventually you have to actually try stuff.

Lowther · 2h
I use customization to instruct my AIs to be skeptical of everything and criticize me. Try tweaking your customizations. You may find something you're a lot happier with.
If you want to be vegan but you worry about health effects of no meat, consider being vegan except for mussels/oysters
35
KatWoods
10h

1) They're unlikely to be sentient (few neurons, immobile)

2) If they are sentient, the farming practices look likely to be pretty humane

3) They're extremely nutritionally dense

Buying canned smoked oysters/mussels and eating them plain or on crackers is super easy and cheap.

It's an acquired taste for some, but I love them.
FlorianH · 8h
Thanks. This might just be the nudge to finally try out what I long thought about without ever trying it in practice. Quick questions:

1. Do we know whether any safe upper dose threshold exists (e.g. any excessive accumulated heavy metals from polluted sea, or any other imbalances)?
2. Do we know whether that simple organism is comparably powerful in providing some micronutrients that we think we might lack in a vegan diet? To be clear what I mean: it is trivial to get nearly arbitrary "grams of protein" from vegan sources (and I guess most minerals and things too), but in the end, even that doesn't seem to be equivalent at all to eating animal proteins. So: is it rather obvious that mussels cover just perfectly what we need, or actually not so clear?
KatWoods · 7h
Don't know much about accumulated heavy metals, but they're really low on the food chain, so a priori they're going to have less of those than animals higher up the food chain.
jvican · 19m

What's likely to have PFAS/microplastics/BPA/other toxic compounds is the tins that canned mussels come in. Do your own research, and consider paying for a Million Marker test to check your levels of BPA/phthalates after eating them for a while (with a baseline test if possible) to gauge how bad it is.

Personally, I only buy EU-made canned fish (especially Spain, Portugal, and rarely France). Many manufacturers I've talked to personally use BPA-NI cans and have more stringent health regulation than other manufacturers elsewhere. But even then, you're just buyi... (read more)

KatWoods · 7h
You can see their nutritional profile here. Sky high in B12, and high in omega-3s and iron. I also predict they'll be good for a wide variety of things we don't know we need yet, since they're as close to a "whole food" as you can get. You're eating almost the whole animal, instead of just a part of it.
"It isn't magic"
90
Ben (Berlin)
7d

People keep saying "AI isn't magic, it's just maths" like this is some kind of gotcha.

[Image: Triptych in the style of Hieronymus Bosch's 'The Garden of Earthly Delights', the left panel showing a wizard raining fireballs down upon a medieval army, the right showing a Predator drone firing a missile while being remotely operated. Between them are geometric shapes representing magical sigils from the Key of Solomon contrasted with circuit boards.]

Turning lead into gold isn't the magic of alchemy, it's just nucleosynthesis.

Taking a living human's heart out without killing them, and replacing it with one you got out of a corpse, that isn't the magic of necromancy, neither is it a prayer or ritual to Sekhmet, it's just transplant surgery.

Casually chatting with someone while they're 8,000 kilometres away is not done with magic crystal balls, it's just telephony.

Analysing the atmosphere of a planet 869 light-years away (about 8 quadrillion km) is not supernatural remote viewing, it's just spectral analysis through a telescope… a telescope that remains about 540 km above the ground, even without any support from anything underneath, which also isn't magic, it's...

(See More – 544 more words)
exmateriae · 20m

Well, we'll have to disagree on that. I have not said that there were no other benefits, but that they were nowhere near communication and reading. Saying that those were not by far the main benefits of language learning simply seems untrue to me, and your examples only reinforce this view.

Both are nice things that come with a new language but definitely not something that would motivate the immense majority of people (and people on lesswrong are definitely not normal people) to learn a language if they were the only reason. I'm sure that's a thing... (read more)

habryka · 2h
FWIW, on the second point, I am a native german speaker (plus obviously proficient english speaker), and I don't think I have gained approximately any benefit from the second point. Like, as far as I can tell I just have a strictly harder time expressing things in german than in english, after having mastered both. This is partially because english has a much larger vocabulary, and so there is almost never a word that you can't say in english but can express in german (and in the rare circumstances where that is not true, english has helpfully imported many of the words that have no equivalent as loan words).

The primary thing I would recommend people do is: if they do not speak english, learn english. It's honestly just a much more expressive language for thought than at least german (and I am pretty sure also polish, which I am a bit familiar with). It's possible there are other languages that are even better, though I am skeptical. I would definitely not recommend anyone learning german today on the basis of the second point.

(On the first point, I have also been disappointed by the benefits of reading german philosophy in the original language. At least for the continental philosophers, I actually had a better time reading them in their english translations, because the translator had to do a bunch of cognitive labor to make them less obnoxious/weird/obscure, but I can imagine that there are other works where that is less true, and there is real benefit)
Said Achmiz · 2h
Well, yes, but German philosophy is famously obscurantist. Like, “German philosophy” is the paradigmatic example of “continental philosophy which is impenetrable and which, one strongly suspects, is barely saying anything at all even once you get past all the layers of bizarre formulations and idiosyncratic terminology”. So it’s no surprise that you’d be disappointed!

I can easily believe this. I think that this is probably related to the point that David Stove makes in his famous “What is Wrong with Our Thoughts?”: English, I think, is a strictly superior language for doing analytic philosophy (i.e., real philosophy, rather than obscurantism) than (according to Stove, and I guess also you?) German, or (according to me) Russian.

But! Note that my point #1 did not talk about philosophy, but rather about “literature / poetry / etc.”. I am talking about aesthetics, not about precision of concrete ideas!

Fair enough, but I’m a native Russian speaker, and I think I’ve gained lots of benefit from knowing both languages.

I completely agree with this. Everyone should learn English. This one’s basically a no-brainer.
habryka · 1h
Well, I was hoping that, given the combination of widespread popularity and reputation for subtlety/nuance/ineffability (and the insistence of at least some of my friends and acquaintances who had read the english translations and got lots of value out of them), this would be one domain where I would be exposed to a particularly high gradient of value, so it was a surprise to me! Like, the thing that was most surprising to me is that I did get value out of the english translations I read. I think a bunch of the things were reasonably useful, and not just nonsense, but extracting that usefulness was substantially easier in the english version than the german version.
Support for bedrock liberal principles seems to be in pretty bad shape these days
28
Max H
2d

By 'bedrock liberal principles', I mean things like: respect for individual liberties and property rights, respect for the rule of law and equal treatment under the law, and a widespread / consensus belief that authority and legitimacy of the state derive from the consent of the governed.

Note that "consent of the governed" is distinct from simple democracy / majoritarianism: a 90% majority that uses state power to take all the stuff of the other 10% might be democratic but isn't particularly liberal or legitimate according to the principle of consent of the governed.

I believe a healthy liberal society of humans will usually tend towards some form of democracy, egalitarianism, and (traditional) social justice, but these are all secondary to the more foundational kind of thing I'm getting...

(See More – 949 more words)
jenn · 21m

i've been working my way through the penguin great ideas series of essays at a pace of about one a week, and i've never been more of a supreme respecter of bedrock enlightenment and classical liberal principles - these guys make such passionate and intelligent arguments for them! i wonder if some part of this fading support is just that reading a lot of these thinkers used to be standard in a high school and university education (for the elite classes at least) and this is no longer the case; people might not really know why these principles are valuable ... (read more)

DirectedEvolution · 5h
Most of the current debates about liberalism are debates about how to trade off between competing liberal priorities. I would regard these debates about exceptions to free speech - whether any are tolerated, and which ones - as debates within a common liberal framework. Typically, proponents of each side, all of whom are taking one liberal view or another, cast their opponents as illiberal (in the theory sense, not the American “progressive-vibe” sense). Opponents reject this label because they genuinely don’t perceive themselves that way. I think the whole debate would be better if we recognized that there exist high-stakes tradeoffs between competing liberal priorities, and that it’s these competing visions of liberalism that are at the heart of contemporary political discourse in America.
Viliam · 9h
I would also prefer not to be partisan, but unfortunately, framing things as "neutral vs conservative" is a trick that one side has been using for a while, and I believe it is impossible to fix the problem without addressing this tactic explicitly. Basically, we need to make sure that "non-partisan" does not mean "supporting (pre-Trump) status quo".
sunwillrise · 9h
Funnily enough, I was also thinking about that exact SSC post when I was writing my comment. I do think I have a different perspective on this matter from yours, however.
Proposal for making credible commitments to AIs.
75
Cleo Nardo
3d

Acknowledgments: The core scheme here was suggested by Prof. Gabriel Weil.

There has been growing interest in the dealmaking agenda: humans make deals with AIs (misaligned but lacking decisive strategic advantage) where they promise to be safe and useful for some fixed term (e.g. 2026-2028) and we promise to compensate them in the future, conditional on (i) verifying the AIs were compliant, and (ii) verifying the AIs would spend the resources in an acceptable way.[1]

I think the dealmaking agenda breaks down into two main subproblems:

  1. How can we make credible commitments to AIs?
  2. Would credible commitments motivate an AI to be safe and useful?

There are other issues, but when I've discussed dealmaking with people, (1) and (2) are the most common issues raised. See footnote for some other issues in...

(See More – 514 more words)
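To make the shape of such a deal concrete, here is a minimal sketch of the bookkeeping it implies (the field names are hypothetical illustrations, not the post's scheme):

```python
# Sketch of the structure of a human-AI deal as described above.
# Field names are illustrative assumptions, not the post's proposal.
from dataclasses import dataclass

@dataclass
class Deal:
    ai_system: str          # which AI the deal is with
    term_start: int         # e.g. 2026
    term_end: int           # e.g. 2028
    obligations: list[str]  # what "safe and useful" means, spelled out
    compensation: str       # what we promise to provide later

    def eligible_for_payout(self, compliant: bool, spending_ok: bool) -> bool:
        """Payout is conditional on (i) verified compliance during the term
        and (ii) verified acceptable use of the promised resources."""
        return compliant and spending_ok

deal = Deal(
    ai_system="model-x",
    term_start=2026,
    term_end=2028,
    obligations=["no sabotage", "report suspected misalignment"],
    compensation="resources held in trust until verification",
)
assert not deal.eligible_for_payout(compliant=True, spending_ok=False)
```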
Buck · 2h

We made a similar proposal here:

A basic problem for making deals with AIs in practice is that AIs aren’t legal persons, which means that they can’t directly rely on the legal system to enforce contracts they’ve made with humans. One way around this is to enable representatives for the AI’s interests to make contracts on the AI’s behalf.

For example, we could set up a foundation with a board of trustees who are supposed to represent the interests of misaligned AIs that we make deals with, while not doing things that are illegal or actively endanger humanity.

... (read more)
Raemon · 4h
Curated. This is a simple and straightforward idea that I hadn't heard before, that seems like an interesting tool to have in humanity's toolkit.

AFAICT this post doesn't address the "when do you pay out?" question. I think it is pretty important we do not pay out until the acute risk period is over (i.e. we are confident in civilization's ability to detect rogue AIs doing catastrophic things; this could be via solving Strong Alignment or potentially other things). I.e., if you promise to pay the AI in 2029, I think there's way too many things that could go wrong there*.

It's hard to define "acute risk period is over", but a neat thing about this scheme is you can outsource that judgment to the particular humans playing the "keep the promise" role. You need people that both humans and AIs would trust to do that fairly. I don't know all the people on that list well enough to endorse them all. I think maybe 3-5 of them are people I expect to actually be able to do the whole job. Some of them I would currently bet against being competent enough at the "do a philosophically and strategically competent job of vetting that it's safe to pay out" (although they could potentially upskill and demonstrate credibility at this). There also seem like a couple people IMO conspicuously missing from the list, but I think I don't wanna open the can of worms of arguing about that right now.

* I can maybe imagine smart people coming up with some whitelisted things-the-AI-could-do that we could give it in 2029, but, sure seems dicey.
Stephen Martin · 10h
Thanks. Could you help me understand what this has to do with legal personhood?
the gears to ascension · 10h
Legal personhood seems to my understanding to be designed around the built in wants of humans. That part of my point was to argue for why an uploaded human would still be closer to fitting the type signature that legal personhood is designed for - kinds of pain, ways things can be bad, how urgent a problem is or isn't, etc. AI negative valences probably don't have the same dynamics as ours. Not core to the question of how to make promises to them, more so saying there's an impedance mismatch. The core is the first bit - clonable, pausable, immortal software. An uploaded human would have those attributes as well.
johnswentworth's Shortform
johnswentworth
5y
Garrett Baker · 45m

> wary of some kind of meme poisoning

I can think of reasons why some would be wary, and am wary of something which could be called “meme poisoning” myself when I watch movies, but am curious what kind of meme poisoning you have in mind here.

Thane Ruthenis · 1h
I've been trying to use Deep Research tools as a way to find hyper-specific fiction recommendations as well. The results have been mixed. They don't seem to be very good at grokking the hyper-specificness of what you're looking for, usually they have a heavy bias towards the popular stuff that outweighs what you actually requested[1], and if you ask them to look for obscure works, they tend to output garbage instead of hidden gems (because no taste).

It did produce good results a few times, though, and is only slightly worse than asking for recommendations on r/rational. Possibly if I iterate on the prompt a few times (e.g., explicitly point out the above issues?), it'll actually become good.

[1] Like, suppose I'm looking for some narrative property X. I want to find fiction with a lot of X. But what the LLM does is multiply the amount of X in a work by the work's popularity, so that works that are low in X but very popular end up in its selection.
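The footnote's failure mode reduces to a small computation (made-up scores, purely illustrative):

```python
# Toy illustration of the footnote: ranking by X vs. by X times popularity.
works = [
    # (title, amount_of_X, popularity) -- all values invented
    ("popular but low-X", 2, 1000),
    ("obscure but high-X", 9, 10),
]

what_you_want = max(works, key=lambda w: w[1])             # rank purely by X
what_the_llm_does = max(works, key=lambda w: w[1] * w[2])  # X weighted by popularity

print(what_you_want[0])      # obscure but high-X
print(what_the_llm_does[0])  # popular but low-X
```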
Elizabeth · 5h
yeah a friend of mine gave in because she was getting so much attitude about needing people to give her directions. 
Rana Dexsin · 5h
You've reminded me of a perspective I was meaning to include but then forgot to, actually. From the perspective of an equilibrium in which everyone's implicitly expected to bring certain resources/capabilities as table stakes, making a personal decision that makes your life better but reduces your contribution to the pool can be seen as defection—and on a short time horizon or where you're otherwise forced to take the equilibrium for granted, it seems hard to refute! (ObXkcd: “valuing unit standardization over being helpful possibly makes me a bad friend” if we take the protagonist as seeing “US customary units” as an awkward equilibrium.)

Some offshoots of this which I'm not sure what to make of:

1. If the decision would lead to a better society if everyone did it, and leads to an improvement for you if only you do it, but requires the rest of a more localized group to spend more energy to compensate for you if you do it and they don't, we have a sort of “incentive misalignment sandwich” going on. In practice I think there's usually enough disagreement about the first point that this isn't clear-cut, but it's interesting to notice.
2. In the face of technological advances, what continues to count as table stakes tends to get set by Moloch and mimetic feedback loops rather than intentionally. In a way, people complaining vociferously about having to adopt new things are arguably acting in a counter-Moloch role here, but in the places I've seen that happen, it's either been ineffective or led to a stressful and oppressive atmosphere of its own (or, most commonly and unfortunately, both).
3. I think intuitive recognition of (2) is a big motivator behind attacking adopters of new technology that might fall into this pattern, in a way that often gets poorly expressed in a “tech companies ruin everything” type of way. Personally taking up smartphones, or cars, or—nowadays the big one that I see in my other circles—generative AI, even if you don't yourself look down
Alexander Gietelink Oldenziel's Shortform
Alexander Gietelink Oldenziel
3y
Richard Horvath · 14h
Adding context/(kind-of) counter argument from reddit (the link has a link to the main article and contains a summary of it): https://www.reddit.com/r/CredibleDefense/comments/1ll7ypj/article_i_fought_in_ukraine_and_heres_why_fpv/  I think the comments are also worth a read. I want to share one particular comment here, which I think has a good explanation/hypothesis regarding the situation:
Daniel Kokotajlo · 4h
Very interesting! But I'm not convinced. Some speculation to follow:

In a more dynamic war of maneuver, won't finding/locating your enemy be even more of an issue than it is today? If there are columns of friendly and enemy forces driving every which way in a hurried confusion, trying to exploit breakthroughs or counterattack, having "drone superiority" so that you can see where they are and they can't see where you are seems super important.

OK, so that's an argument that air superiority drones will be crucial, but what about bomber drones vs. drone-corrected artillery? Currently bomber drones have something like 20km range compared to 40km range for artillery. Since they are quadcopters though I think that they'll quickly be supplanted by longer-ranged variants, e.g. fixed-wing drones. (Zipline's medical supply drones currently have 160km range.)

So I think there will be a type of future platform that's basically a pickup truck with a rail for launching fixed-wing bomber drones capable of taking out a tank. This truck will be to a self-propelled artillery piece what a carrier is to a battleship: before the battleship/artillery gets in range, it'll be detected and obliterated by a concentrated airstrike launched from the carrier/truck. As a bonus the truck can also carry and launch air superiority drones too. Like the Pacific in WW2, most major battles will take place beyond artillery range, between flights of drones launched by groups of carriers/trucks.

Oh, and yeah, another advantage of the drone carriers vs. the artillery is that they are much, much cheaper & also can potentially take cover more easily (e.g. if your column of trucks is spotted, your men can get out and take the drones into the basements of nearby houses and continue to fight from there, whereas you can't hide your artillery in a basement).

Also: The ultra static nature of the Russo-Ukrainian war is generally thought to be because of drones. The reason it's been a stalemate is that drones curren
Alexander Gietelink Oldenziel · 1h

https://youtu.be/tgkP0W7OvMc?si=hoa0l2mu5B6aRbpy

 

Perhaps of interest: at 16:33 the guy mentions the development of a new type of drone-resistant "turtle" tank

dr_s · 15h
Fucking campers, man. Honestly not surprising, you'd need a mix of powerful but cheap chips and still quite light AI to make it work on device. And the problem would also be, if the AI is too simple, there's higher risk of friendly fire. Am reminded of that classic Philip K. Dick story, "Second Variety", where the basic autonomous drone model is essentially just a small ball full of blades that kills anyone who comes close enough, unless they carry some special radioactive plaque that deters them. That sort of IFF system might in fact be cheaper and simpler to work with than an AI fully capable of doing it on its own reliably.

Obviously I consider this sort of thing generally a bad idea. But it's clearly the direction this is going. I wonder how long before full drone-on-drone warfare.

The cynical amateur geopolitical analyst in me says also that this is why it's so dumb of the West to let Ukraine fail. You got a perfect laboratory to experiment and develop this new type of warfare, and then eventually you can cannibalize Ukrainian know-how for yourself and make leaps and bounds without losing a single soldier yourself. Even someone who was evil but cunning would see the benefits here. But of course the US right now is being run by a moron, so it's not surprising he misses this detail.
06/30/25 Monday Social 7pm-9pm @ Segundo Coffee Lab
Tue Jul 1•Houston
AGI Forum @ Purdue University
Tue Jul 1•West Lafayette
75 · Proposal for making credible commitments to AIs. · Cleo Nardo · 4h · 22 comments
148 · X explains Z% of the variance in Y · Leon Lang · 3d · 23 comments
391 · A case for courage, when speaking of AI danger · So8res · 4d · 42 comments
339 · A deep critique of AI 2027’s bad timeline models · titotal · 11d · 39 comments
467 · What We Learned from Briefing 70+ Lawmakers on the Threat from AI · leticiagarcia · 1mo · 15 comments
338 · the void · Ω · nostalgebraist · 20d · 98 comments
532 · Orienting Toward Wizard Power · johnswentworth · 1mo · 142 comments
202 · Foom & Doom 1: “Brain in a box in a basement” · Ω · Steven Byrnes · 7d · 75 comments
660 · AI 2027: What Superintelligence Looks Like · Ω · Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo · 3mo · 222 comments
49 · What We Learned Trying to Diff Base and Chat Models (And Why It Matters) · Clément Dumas, Julian Minder, Neel Nanda · 7h · 0 comments
286 · Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems) · Ω · LawrenceC · 19d · 19 comments
159 · My pitch for the AI Village · Daniel Kokotajlo · 6d · 29 comments
54 · Circuits in Superposition 2: Now with Less Wrong Math · Linda Linsefors, Lucius Bushnaq · 14h · 0 comments
113 · The Industrial Explosion · rosehadshar, Tom Davidson · 4d · 43 comments
51 · Paradigms for computation · Ω · Cole Wyeth · 1d · 0 comments