Why do some societies exhibit more antisocial punishment than others? Martin explores both the literature on the subject and his own experience living in a country where "punishment of cooperators" was fairly common.

William_S · 1d
I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to manage a team of 4 people which worked on trying to understand language model features in context, leading to the release of an open source "transformer debugger" tool. I resigned from OpenAI on February 15, 2024.
habryka · 1d
Does anyone have any takes on the two Boeing whistleblowers who died under somewhat suspicious circumstances? I haven't followed this in detail, and my guess is it's basically just random chance, but it sure would be a huge deal if a publicly traded company were now performing assassinations of U.S. citizens. Curious whether anyone has looked into this, or has thought much about the baseline risk of assassinations or other forms of violence from economic actors.
Dalcy · 1d
Thoughtdump on why I'm interested in computational mechanics:

* one concrete application to natural abstractions from here: tl;dr, belief structures generally seem to be fractal shaped. one major part of natural abstractions is trying to find the correspondence between structures in the environment and concepts used by the mind. so if we can do the inverse of what adam and paul did, i.e. 'discover' fractal structures from activations and figure out what stochastic process they might correspond to in the environment, that would be cool
* ... but i was initially interested in reading compmech stuff not with a particular alignment-relevant thread in mind, but rather because it seemed broadly similar in direction to natural abstractions
* re: how my focus would differ from my impression of current compmech work done in academia: academia seems faaaaaar less focused on actually trying out epsilon reconstruction on real-world noisy data. CSSR is an example of a reconstruction algorithm. apparently people have done compmech on real-world data; i don't know how well it went, but far less effort has been invested there compared to theory work
  * i'd be interested in these reconstruction algorithms, eg what the bottlenecks to scaling them up are (see the toy sketch after this list)
* tangent: epsilon transducers seem cool. if the reconstruction algorithm is good, a prototypical example i'm thinking of is something like: pick some input-output region within a model, and literally try to discover the hmm reconstructing it? of course it's gonna be unwieldily large. but, to shift the thread in the direction of bright-eyed theorizing ...
* the foundational Calculi of Emergence paper talked about the possibility of hierarchical epsilon machines, where you do epsilon machines on top of epsilon machines, and for simple examples where you can analytically do this, you get wild things like coming up with more and more compact representations of stochastic processes (eg data stream -> tree -> markov model -> stack automata -> ... ?)
  * this ... sounds like natural abstractions in its wildest dreams? literally point at some raw datastream and automatically build hierarchical abstractions that get more compact as you go up
  * haha but alas, (almost) no development afaik since the original paper. seems cool
* and also more tangentially, compmech seemed to have a lot to say about providing interesting semantics to various information measures aka True Names, so another angle i was interested in was to learn about them
  * eg crutchfield talks a lot about developing a right notion of information flow - obvious usefulness in eg formalizing boundaries?
  * many other information measures from compmech with suggestive semantics—cryptic order? gauge information? synchronization order? check ruro1 and ruro2 for more
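For concreteness, here is a minimal sketch (an illustration, not the real CSSR algorithm) of the predictive-state clustering idea behind reconstruction: estimate next-symbol distributions conditional on fixed-length histories, then merge histories whose distributions match. Actual CSSR grows the history length incrementally and uses a statistical test rather than the fixed tolerance assumed here.

```python
from collections import Counter, defaultdict

def next_symbol_dists(seq, L, alphabet):
    """Empirical P(next symbol | history) for every length-L history in seq."""
    counts = defaultdict(Counter)
    for i in range(L, len(seq)):
        counts[tuple(seq[i - L:i])][seq[i]] += 1
    return {
        hist: tuple(c[a] / sum(c.values()) for a in alphabet)
        for hist, c in counts.items()
    }

def cluster_histories(dists, tol=0.05):
    """Greedily merge histories with near-identical predictive distributions.
    (Real CSSR uses a hypothesis test, e.g. chi-squared, instead of `tol`.)"""
    states = []  # each entry: (representative distribution, member histories)
    for hist, d in dists.items():
        for rep, members in states:
            if max(abs(a - b) for a, b in zip(rep, d)) < tol:
                members.append(hist)
                break
        else:
            states.append((d, [hist]))
    return states

# Toy check: the period-2 process 0,1,0,1,... should yield two causal states.
seq = [0, 1] * 500
for dist, hists in cluster_histories(next_symbol_dists(seq, L=2, alphabet=(0, 1))):
    print(dist, hists)
```

On real noisy data the interesting bottlenecks are exactly the ones hidden by this toy: choosing the history length, the statistical test, and the alphabet discretization.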
Thomas Kwa · 21h
You should update by ±1% on AI doom surprisingly frequently.

This is just a fact about how stochastic processes work. If your p(doom) is a Brownian motion in 1% steps, starting at 50% and stopping once it reaches 0 or 1, then there will be about 50^2 = 2500 steps of size 1%. (For a simple ±1 random walk absorbed at 0 and 100, the expected number of steps from position k is k(100 - k); from k = 50 that's 2500. See the simulation sketch below.) This is a lot!

If we get all the evidence for whether humanity survives or not uniformly over the next 10 years, then you should make a 1% update 4-5 times per week. In practice there won't be as many, due to heavy-tailedness in the distribution concentrating the updates into fewer events, and the fact that you don't start at 50%. But I do believe that evidence is coming in every week such that ideal market prices should move by 1% in maybe half of weeks, and it is not crazy for your probabilities to shift by 1% during many weeks if you think about it often enough.
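A minimal simulation sketch of the walk described above (assuming a fair ±1% step per update, which is the post's idealization):

```python
import random

def steps_to_absorption(start=50, lo=0, hi=100):
    """Count ±1 steps until the walk hits an absorbing boundary."""
    x, steps = start, 0
    while lo < x < hi:
        x += random.choice((-1, 1))
        steps += 1
    return steps

trials = [steps_to_absorption() for _ in range(1000)]
print(sum(trials) / len(trials))  # ~2500, matching k * (100 - k) at k = 50
```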
Buck · 2d
[Epistemic status: I think I'm mostly right about the main thrust here, but probably some of the specific arguments below are wrong. In the following, I'm much more stating conclusions than providing full arguments. This claim isn't particularly original to me.]

I'm interested in the following subset of risk from AI:

* Early: risk that comes from AIs that are just powerful enough to be extremely useful and dangerous-by-default (i.e. these AIs aren't wildly superhuman).
* Scheming: risk associated with loss of control to AIs that arises from the AIs scheming.
  * So e.g. I exclude state actors stealing weights in ways that aren't enabled by the AIs scheming, and I also exclude non-scheming failure modes. IMO, state actors stealing weights is a serious threat, but non-scheming failure modes aren't (at this level of capability and dignity).
* Medium dignity: that is, developers of these AIs are putting a reasonable amount of effort into preventing catastrophic outcomes from their AIs (perhaps they're spending the equivalent of 10% of their budget on cost-effective measures to prevent catastrophes).
* Nearcasted: no substantial fundamental progress on AI safety techniques, no substantial changes in how AI works.

This subset of risk is interesting because I think it's a natural scenario at which to target technical work on AI safety. (E.g. it's the main scenario we're targeting with our AI control agenda.)

I claim that the majority of this risk comes from the lab using AIs internally to do AI development (by which I mean both research and engineering). This is because the AIs doing AI development naturally require access to compute and model weights that they can potentially leverage into causing catastrophic outcomes—in particular, those resources can be abused to run AIs unmonitored.

Using AIs for AI development looks uniquely risky to me among applications of early-transformative AIs, because unlike all other applications I know about:

* It's very expensive to refrain from using AIs for this application.
* There's no simple way to remove affordances from the AI such that it's very hard for the AI to take a small sequence of actions which plausibly lead quickly to loss of control. In contrast, most other applications of AI probably can be controlled just by restricting their affordances.

If I'm right that the risk from scheming early-transformative models is concentrated onto this pretty specific scenario, it implies a bunch of things:

* It implies that work on mitigating these risks should focus on this very specific setting.
* It implies that AI control is organizationally simpler, because most applications can be made trivially controlled.
* It is pretty plausible to me that AI control is quite easy, because you actually can remove affordances from the AIs that are doing AI R&D such that it's hard for them to cause problems.

Popular Comments

Recent Discussion

JBlack · 24m

It definitely should not move by anything like a Brownian motion process. At the very least it should be bursty and updates should be expected to be very non-uniform in magnitude.

In practice, you should not consciously update very often since almost all updates will be of insignificant magnitude on near-irrelevant information. I expect that much of the credence weight turns on unknown unknowns, which can't really be updated on at all until something turns them into (at least) known unknowns.

But sure, if you were a superintelligence with practically unbounded rationality then you might in principle update very frequently.

TsviBT · 4h
Probabilities on summary events like this are mostly pretty pointless. You're throwing together a bunch of different questions, about which you have very different knowledge states (including how much and how often you should update about them).
niplav · 8h
Because[1] for a Bayesian reasoner there is conservation of expected evidence. I've also seen it mentioned that, technically, the beliefs of a Bayesian should follow a martingale, and Brownian motion is a martingale.

---

1. I'm not super technically strong on this particular part of the math. Intuitively, it could be that in a bounded reasoner which can only evaluate programs in P, any pattern in its beliefs that can be described by an algorithm in P is detected, and the predicted future belief from that pattern is incorporated into current beliefs. On the other hand, any pattern described by an algorithm in EXPTIME∖P can't be in the class of hypotheses of the agent, including hypotheses about its own beliefs, so EXPTIME patterns persist.
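A minimal numerical sketch of conservation of expected evidence (the coin-flip test and all numbers here are illustrative assumptions): the prior equals the expectation of the posterior over possible observations, so belief changes have mean zero, which is the martingale property.

```python
def update(prior, heads, p_heads_h=0.8, p_heads_not_h=0.5):
    """Posterior P(H | observation) after one coin flip, via Bayes' rule."""
    like_h = p_heads_h if heads else 1 - p_heads_h
    like_n = p_heads_not_h if heads else 1 - p_heads_not_h
    return prior * like_h / (prior * like_h + (1 - prior) * like_n)

prior = 0.3
p_heads = prior * 0.8 + (1 - prior) * 0.5  # marginal chance of observing heads
expected_posterior = (
    p_heads * update(prior, heads=True)
    + (1 - p_heads) * update(prior, heads=False)
)
print(prior, expected_posterior)  # both 0.3: expected posterior = prior
```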
niplav · 8h
Thank you a lot for this. I think this or @Thomas Kwa's comment would make an excellent original-sequences-style post—it doesn't need to be long, but just going through an example and talking about the assumptions would be really valuable for applied rationality. After all, it's about how much one should expect one's beliefs to vary, which is pretty important.


I'm excited to share a project I've been working on that I think many in the LessWrong community will appreciate: converting some rational fiction into high-quality audiobooks using cutting-edge AI voice technology from ElevenLabs, under the name "Askwho Casts AI".

The keystone of this project is an audiobook version of Planecrash (AKA Project Lawful), the epic glowfic authored by Eliezer Yudkowsky and Lintamande. Given the scope and scale of this work, with its large cast of characters, I'm using ElevenLabs to give each character their own distinct voice. Converting this story into audiobook form is a labor of love, and I hope that if anyone has bounced off it before, this...

Askwho · 32m

Thanks! Glad you are enjoying it.

Askwho · 33m
Thanks, appreciate it.
Askwho · 36m
It is not cheap: it's around $20 per hour of audio. Luckily there are people on board with this project who help cover costs through a Patreon.
Askwho · 38m
Thanks so much! Glad you are enjoying the audio format. I really agree this story is worth "reading" in some form, it's why I'm working on this project.
This is a linkpost for https://dynomight.net/seed-oil/

A friend has spent the last three years hounding me about seed oils. Every time I thought I was safe, he’d wait a couple months and renew his attack:

“When are you going to write about seed oils?”

“Did you know that seed oils are why there’s so much {obesity, heart disease, diabetes, inflammation, cancer, dementia}?”

“Why did you write about {meth, the death penalty, consciousness, nukes, ethylene, abortion, AI, aliens, colonoscopies, Tunnel Man, Bourdieu, Assange} when you could have written about seed oils?”

“Isn’t it time to quit your silly navel-gazing and use your weird obsessive personality to make a dent in the world—by writing about seed oils?”

He’d often send screenshots of people reminding each other that Corn Oil is Murder and that it’s critical that we overturn our lives...

Freyja · 2h

I suspect the word 'pre-prepared' is doing a lot of the heavy lifting here--when I see that item on the list I think things like pre-fried chicken, frozen burger patties, veggie pakora, veggies in a sauce for a stir-fry, stuff like that (like you'd find in a ready-made frozen meal). Not like, frozen peas.

Said Achmiz · 6h
Unless you freeze it. This is by far the best way of consistently having not-ultra-processed bread that tastes fresh and delicious, without having to eat a whole loaf every day or throwing away most of it. EDIT: This also works for various sorts of buns, rolls, panettone, etc.
David Cato · 15h
Next time I have a chance to pick up Kirkland olive oil I'll give it a try and report back.

Around this time of dietary changes, I decided to stop trying to cut so many corners with food. Since olive oil is a calorie-dense food, even paying an "outrageous" double or triple the cost of cheap olive oil barely dents the budget on a cost-per-calorie basis. And speaking of budgeting, I had mental resistance to spending more on food, so now I guesstimate what percent of my food budget I spend over the "cheapest equivalent alternative" and label that part "preventative healthcare".
JenniferRM · 6h
I look forward to your reply!

(And regarding "food cost psychology", this is an area where I think Neo Stoic objectivity is helpful. Rich people can pick up a lot of hedons just from noticing how good their food is, and formerly poor people have a valuable opportunity to re-calibrate. There are still large differences in diet between socio-economic classes, and until all such differences are expressions of voluntary preference, and "dietary price sensitivity has basically evaporated", I won't consider the world to be post-scarcity.

Each time I eat steak, I can't help but remember being asked in summer camp, as a little kid, whether my family was rich. I didn't know; the very first "objective calibrating response" accessible to us as children was the rate of my family's steak consumption. Having grown up in some amount of poverty, I often see "newly rich people" eating as if their health is not the price of slightly more expensive food, or as if their health is "not worth avoiding the terrible terrible sin of throwing food in the garbage" (which my aunt, who lived through the Great Depression in Germany, once yelled at me for doing, with great feeling, when I was a child and had eaten less than ALL of the birthday cake that had been put on my plate).

Cultural norms around food are fascinating and, in my opinion, are often rewarding to think about.)

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

jenn · 4h
I'm trying to remember the name of a blog. The only things I remember about it is that it's at least a tiny bit linked to this community, and that there is some sort of automatic decaying endorsement feature. Like, there was a subheading indicating the likely percentage of claims the author no longer endorses based on the age of the post. Does anyone know what I'm talking about?
Raemon · 3h

The Ferrett.

Historically, produce shopping happened mostly in open-air markets, but in the US produce is now typically sold in buildings. Most open-air produce sales today are probably at farmers markets, but these focus on the high end. I like that Boston's Haymarket is more similar to the historical model: competing vendors selling conventional produce relatively cheaply.

It closes for the weekend at 7pm on Saturdays, and since food they don't sell by the end of the market is mostly going to waste, they start discounting a lot. You can get very good deals, though you need to be cautious: what's left at the end is often past the end of its human-edible life.

Today Lily was off on a scouting trip, and I asked Anna what she wanted to do. She remembered that a previous time Lily was...


"alignment researchers are found to score significantly higher in liberty (U=16035, p≈0)" This partly explains why so much of the alignment community doesn't support PauseAI!

"Liberty: Prioritizes individual freedom and autonomy, resisting excessive governmental control and supporting the right to personal wealth. Lower scores may be more accepting of government intervention, while higher scores champion personal freedom and autonomy..." 
https://forum.effectivealtruism.org/posts/eToqPAyB4GxDBrrrf/key-takeaways-from-our-ea-and-alignment-research-surveys...

A couple years ago, I had a great conversation at a research retreat about the cool things we could do if only we had safe, reliable amnesic drugs - i.e. drugs which would allow us to act more-or-less normally for some time, but not remember it at all later on. And then nothing came of that conversation, because as far as any of us knew such drugs were science fiction.

… so yesterday when I read Eric Neyman's fun post My hour of memoryless lucidity, I was pretty surprised to learn that what sounded like a pretty ideal amnesic drug is used in routine surgery. A little googling suggested that the drug was probably a benzodiazepine (think Valium). Which means it's not only a great amnesic, it's also apparently one...

I had heard, 15+ years ago (visiting neuroscience exhibits somewhere), about experiments involving people who, due to brain damage, can no longer form new memories.  And Wiki agrees with what I remember hearing about some cases: that, although they couldn't remember any new events, if you had them practice a skill, they would get good at it, and on future occasions would remain good at it (despite not remembering having learned it).  I'd heard that an exception was that they couldn't get good at Tetris.

Takeaway: "Memory" is not a uniform thing, a...

localdeity · 5h
One class of variance in cognitive test results is probably, effectively, pseudorandomness. Suppose there's a problem, and there are five plausible solutions you might try, two of which will work. Then your performance is effectively determined by the order in which you end up trying solutions. And if your skills and knowledge don't give you a strong reason to prefer any of them, then it'll presumably be determined in a pseudorandom way: whichever comes to mind first. Maybe being cold subconsciously reminds you of when you were thinking about stuff connected to Solution B, or discourages you from thinking about Solution C.

Thus, you could get a reliably reproducible result that temperature affects your performance on a given test, even if it has no "real" effect on how well your mind works and wouldn't generalize to other tests. This should be addressable by simply taking more, different, cognitive tests to confirm any effect you think you've found.
johnswentworth · 8h
Another class of applications, which we discussed at the retreat: person 1 takes the amnesic, person 2 shares private information with them, and then person 1 gives their reaction to the private information.

This can be used e.g. for complex negotiations: maybe it is in our mutual best interest to make some deal, but in order for me to know that, I'd need some information which you don't want to share with me. So I take the drug, you share the information, and I record some verified record of myself saying "dear future self, you should in fact take this deal".

... which is cool in theory, but I would guess not of high immediate value in practice, which is why the post didn't focus on it.
Algon · 8h
Important notice: benzodiazepines are serious business. Benzo withdrawals are amongst the worst experiences a human can go through, and combinations of benzos with alcohol, barbiturates, opioids, or tricyclic antidepressants are very dangerous: benzos played a role in 31% of the estimated 22,767 deaths from prescription drug overdose in the United States. If you're experimenting with benzos, please be very careful!
