Gears-level models are expensive - often prohibitively expensive. Black-box approaches are usually much cheaper and faster. But black-box approaches rarely generalize - they're subject to Goodhart, need to be rebuilt when conditions change, don't identify unknown unknowns, and are hard to build on top of. Gears-level models, on the other hand, offer permanent, generalizable knowledge which can be applied to many problems in the future, even if conditions shift.
The forum has been very much focused on AI safety for some time now, so I thought I'd post something different for a change. Privilege.
Here I define Privilege as an advantage over others that is invisible to the beholder. This may not be the only definition, or the central definition, or how you see it, but it's the definition I use for the purposes of this post. I also do not mean it in the culture-war sense, as a way to undercut others, as in "check your privilege". My point is that we all have some privileges [we are not aware of], and also that nearly every one has a flip side.
In some ways this is the inverse of The Lens That Does Not See Its Flaws: The...
I grew up knowing "privilege" to mean a special right that was granted to you based on your job/role (like free food for those who work at some restaurants) or perhaps granted by authorities due to good behavior (and would be taken away for misusing it). Note also that the word itself, "privi"-"lege", means "private law": a law that applies to you in particular.
Rights and laws are social things, defined by how others treat you. To say that your physical health is a privilege therefore seems like either a category error, or a claim that other pe...
Summary:
We think a lot about aligning AGI with human values. I think it’s more likely that we’ll try to make the first AGIs do something else. This might intuitively be described as trying to make instruction-following (IF) or do-what-I-mean-and-check (DWIMAC) the central goal of the AGI we design. Adopting this goal target seems to improve the odds of success of any technical alignment approach. It avoids the hard problem of specifying human values in an adequately precise and stable way, and it substantially helps with goal misspecification and deception by letting one treat the AGI as a collaborator in keeping it aligned as it becomes smarter and takes on more complex tasks.
This is similar but distinct from the goal targets of prosaic alignment efforts....
That's true, they are different. But search still provides the closest historical analogue (maybe employees/suppliers provide another). Historical analogues have the benefit of being empirical and grounded, so I prefer them over (or with) pure reasoning or judgement.
A couple of weeks ago three European economists published this paper studying the female income penalty after childbirth. The surprising headline result: there is no penalty.
The paper uses Danish data that tracks IVF treatments as well as a bunch of demographic factors and economic outcomes over 25 years. Lundborg et al. identify the causal effect of childbirth on female income using the success or failure of the first attempt at IVF as an instrument for fertility.
What does that mean? We can’t just compare women with children to those without them because having children is a choice that’s correlated with all of the outcomes we care about. So sorting out two groups of women based on observed fertility will also sort them based on income and...
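For readers who want the mechanics, here is a minimal two-stage least squares (2SLS) sketch of this kind of identification strategy on simulated data. Everything in it - variable names, effect sizes, the zero true effect - is an illustrative assumption of mine, not a number from Lundborg et al.

```python
# Minimal 2SLS sketch of an IVF-style instrument, on simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Unobserved confounder: career orientation raises income and lowers fertility.
career = rng.normal(size=n)

# Instrument: success of the first IVF attempt, (as good as) random.
ivf_success = rng.binomial(1, 0.3, size=n)

# Fertility responds to the instrument and to the confounder.
has_child = ((0.8 * ivf_success - 0.3 * career + rng.normal(size=n)) > 0.5).astype(float)

# Income: true causal effect of a child is set to 0; the confounder drives income.
income = 0.0 * has_child + 1.0 * career + rng.normal(size=n)

# Naive OLS: biased downward (a spurious "child penalty").
X = np.column_stack([np.ones(n), has_child])
ols = np.linalg.lstsq(X, income, rcond=None)[0]

# Stage 1: predict fertility from the instrument alone.
Z = np.column_stack([np.ones(n), ivf_success])
stage1 = np.linalg.lstsq(Z, has_child, rcond=None)[0]
fitted = Z @ stage1

# Stage 2: regress income on predicted (instrument-driven) fertility.
X2 = np.column_stack([np.ones(n), fitted])
iv = np.linalg.lstsq(X2, income, rcond=None)[0]

print(f"naive OLS effect: {ols[1]:+.3f}")   # spurious negative "penalty"
print(f"2SLS (IV) effect: {iv[1]:+.3f}")    # close to the true 0
```

Because first-attempt IVF success shifts fertility but is plausibly independent of the confounder, the second-stage coefficient lands near the true effect even where naive OLS manufactures a spurious penalty.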
Yes, that was my first guess as well. Increased income from employment is most strongly associated with major changes, such as promotion to a new position with changed (and usually increased) responsibilities, or leaving one job and starting work somewhere else that pays more.
It seems plausible that these are not the sorts of changes that women are likely to seek out at the same rate when planning to devote a lot of time in the very near future to being a first-time parent. Some may, but all? Seems unlikely. Men seem more likely to continue to pursue such opportunities at a similar rate due to gender differences in child-rearing roles.
I don't know the answer, but it would be fun to have a Twitter comment with a zillion likes asking Sam Altman this question. Maybe someone should make one?
Firstly, I'm assuming that a high-resolution human brain emulation that you can run on a computer is conscious in the normal sense that we use in conversations. Like, it talks, has memories, makes new memories, has friends and hobbies and likes and dislikes and stuff. Just like a human that you could talk with only through a videoconference-type thing on a computer, but without an actual meaty human on the other end. It would be VERY weird if this emulation exhibited all these human qualities for some reason other than the reason meaty humans exhibit them. Like, very extremely what the fuck surprising. Do you agree?
So, we now have a deterministic human file on our hands.
Then, you can trivially make a transformer-like next-token predictor out of the human emulation. You just have the emulation,...
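A minimal sketch of the construction being gestured at here, with `run_emulation` as a hypothetical stand-in for the (wildly expensive) deterministic emulation - none of this is a real API:

```python
# Wrapping a deterministic emulation in a next-token-predictor interface.
from typing import Callable, List

Token = str

def make_next_token_predictor(
    run_emulation: Callable[[List[Token]], Token],
) -> Callable[[List[Token]], Token]:
    """Present a deterministic emulation as a next-token predictor."""
    def predict_next(context: List[Token]) -> Token:
        # Reset the em to its saved state, feed it the context, and return
        # whatever token it emits next. Determinism means the same context
        # always yields the same token, like greedy decoding from a fixed model.
        return run_emulation(context)
    return predict_next

# Toy stand-in "emulation" for illustration only.
def toy_em(context: List[Token]) -> Token:
    return "hello" if not context else context[-1]

predict = make_next_token_predictor(toy_em)
print(predict(["hi", "there"]))  # deterministic: always "there" for this input
```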
I don't expect this to "cash out" at all, which is rather the point.
The only really surprising part would be that we had any way at all to determine for certain whether some other system is conscious. That is, very similar (high) levels of surprisal for either "ems are definitely conscious" or "ems are definitely not conscious", but the ratio between them not being anywhere near "what the fuck" level.
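To put made-up numbers on that distinction (mine, purely illustrative): with surprisal measured in bits,

```latex
S(x) = -\log_2 P(x), \qquad \log_2 \frac{P(A)}{P(B)} = S(B) - S(A)
```

If $P(\text{ems definitely conscious}) = 2^{-20}$ and $P(\text{ems definitely not conscious}) = 2^{-21}$, both answers carry roughly 20 bits of surprisal, because any definitive answer is unexpected, yet the two hypotheses differ by only a single bit.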
As it stands, I can determine that I am conscious but I do not know how or why I am conscious. I have only a sample size of 1, and no way to access a lar...
[memetic status: stating directly despite it being a clear consequence of core AI risk knowledge because many people have "but nature will survive us" antibodies to other classes of doom and misapply them here.]
Unfortunately, no.[1]
Technically, “Nature”, meaning the fundamental physical laws, will continue. However, people usually mean forests, oceans, fungi, bacteria, and generally biological life when they say “nature”, and those would not have much chance competing against a misaligned superintelligence for resources like sunlight and atoms, which are useful to both biological and artificial systems.
There’s a thought that comforts many people when they imagine humanity going extinct due to a nuclear catastrophe or runaway global warming: Once the mushroom clouds or CO2 levels have settled, nature will reclaim the cities. Maybe mankind in our hubris will have wounded Mother Earth and paid the price ourselves, but...
Additionally, the AI might think it's in an alignment simulation and just leave the humans as is, or even nominally address their needs. This might be mentioned in the linked post, but I want to highlight it. Since we already run very low-fidelity alignment simulations when we train deceptive models, there is some reason for an AI to suspect this.
[Reminder: I am an internet weirdo with no medical credentials]
A few months ago, I published some crude estimates of the power of nitric oxide nasal spray to hasten recovery from illness, and speculated about what it could do prophylactically. While working on that piece, a nice man on Twitter alerted me to the fact that humming produces lots of nasal nitric oxide. This post is my very crude model of what kind of anti-viral gains we could expect from humming.
I’ve encoded my model at Guesstimate. The results are pretty favorable (average estimated impact of 66% reduction in severity of illness), but extremely sensitive to my made-up numbers. Efficacy estimates go from ~0 to ~95%, depending on how you feel about publication bias, what percent of Enovid’s impact...
My prior is that solutions contain on the order of 1% active ingredients, and of the things on the Enovid ingredients list, citric acid and NaNO2 are probably the reagents that create NO [1], which happens at a 5.5:1 mass ratio. 0.11 ppm·hr as an integral over time already means the solution is only around 0.01% NO by mass [1], which is 0.055% reagents by mass, probably a bit more because yield is not 100%. This is a bit low but believable. If the concentration were really only 0.88 ppm and dissipated quickly, it would be extremely dilute, which seems unlikely. T...
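A quick sanity-check of that arithmetic, using only the figures stated above (the 5.5:1 ratio and the ~0.01% NO by mass are taken as given, not independently re-derived from the chemistry):

```python
# Back-of-envelope check using the comment's own figures.
no_mass_fraction = 1e-4        # ~0.01% NO by mass, inferred above from 0.11 ppm·hr
reagent_to_no_ratio = 5.5      # stated mass ratio of (citric acid + NaNO2) to NO
reagent_fraction = no_mass_fraction * reagent_to_no_ratio
print(f"reagents by mass: {reagent_fraction:.4%}")              # -> 0.0550%

prior_fraction = 0.01          # prior: solutions ~1% active ingredients
print(f"fraction of the 1% prior: {reagent_fraction / prior_fraction:.1%}")
# ~5.5% of the prior - "a bit low but believable", more so since yield < 100%
```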
The movement to reduce AI x-risk is overly purist. This has led to the formation of a lot of sects, each maintaining its own platonic level of purity, and it is actively (greatly) harming the cause.
I think these were all legitimate responses to a perceived increase in risk, but ultimately did or will do more harm than good. Disclaimer: I am the least sure that the formation of Anthropic increases p(doom), but I speculate that, post-AGI, it will be seen...