P. — LessWrong

Replying toShould we align AI with maternal instinct?

P.6mo

Should we align AI with maternal instinct?

To the extent that maternal instincts are some actual small concrete set of things, you are probably making two somewhat opposite mistakes here: Imagining something that doesn't truly run on maternal instinct, and assuming that mothers actually care about their babies (for a certain definition of "care").

You say that mothers aren't actually "endlessly selfless, forever attuned to every cry, governed by an unshakable instinct to nurture", that there are "identities beyond 'mum' to be kept alive" and that there are nights that instinct disappears. But that's because you feel exhaustion, or also care about things other than your children. We don't need to create things like that. If "maternal instincts" are or... (read 373 more words →)

Replying toTormenting Gemini 2.5 with the [[[]]][][[]] Puzzle

P.11mo*

Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle

Sadly, by posting this here, you've added this puzzle to the training set of future models. Good benchmarks (e.g. ARC) keep the test set, or at least part of it, private.

Replying toI make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?

P.1y

I make several million dollars per year and have hundreds of thousands of followers—what is the straightest line path to utilizing these resources to reduce existential-level AI threats?

Firstly, and perhaps most importantly, my advice on what not to do is not to try directly convincing politicians to pause or stop AGI development. A prerequisite for them to take actions drastic enough to actually matter is for them to understand how powerful AGI will truly become. And once that happens, even if they ban all AI development, unless they consider the arguments for doom to be extremely strong, which they won't^[1], they will race and put truly enormous amounts of resources behind it, and that would be it for the species. Getting mid-sized business owners on board, on the other hand, might be a good idea due to the funding... (read more)

Replying toRefactoring cryonics as structural brain preservation

P.1y

Refactoring cryonics as structural brain preservation

Does OBP plan to eventually expand their services outside the USA? And how much would it cost if you didn’t subsidize it? Cost is a common complaint about cryonics so I could see you becoming much bigger than the cryonics orgs, but judging by the website you look quite small. Do you know why that is?

Replying toOpen Thread Spring 2024

P.2y*

Open Thread Spring 2024

Does anyone have advice on how I could work full-time on an alignment research agenda I have? It looks like trying to get a LTFF grant is the best option for this kind of thing, but if after working more time alone on it, it keeps looking like it could succeed, it’s likely that it would become too big for me alone, I would need help from other people, and that looks hard to get. So, any advice from anyone who’s been in a similar situation? Also, how does this compare with getting a job at an alignment org? Is there any org where I would have a comparable amount of freedom if my ideas are good enough?

Edit: It took way longer than I thought it would, but I've finally sent my first LTFF grant application! Now let's just hope they understand it and think it is good.

Replying toDecent plan prize announcement (1 paragraph, $1k)

P.2y*

Decent plan prize announcement (1 paragraph, $1k)

It depends on what you know about the model and the reason you have to be concerned in the first place (if it's just "somehow", that's not very convincing).

You might be worried that training it leads to the emergence of inner-optimizers, be them ones that are somehow "trying" to be good at prediction in a way that might generalize to taking real-life actions, approximating the searchy part of the humans they are trying to predict, or just being RL agents. If you are just using basically standard architectures with a lot more compute, these all seem unlikely. But if I were you, I might try to test its ability to perform well... (read 370 more words →)

Replying to2023 Unofficial LessWrong Census/Survey

P.2y

2023 Unofficial LessWrong Census/Survey

Done! There aren't enough mysterious old wizards.

Replying toVote on Interesting Disagreements

P.2y

Vote on Interesting Disagreements

You know of a technology that has at least a 10% chance of having a very big novel impact on the world (think the internet or ending malaria) that isn't included in this list, very similar, or downstream from some element of it: AI, mind uploads, cryonics, human space travel, geo-engineering, gene drives, human intelligence augmentation, anti-aging, cancer cures, regenerative medicine, human genetic engineering, artificial pandemics, nuclear weapons, proper nanotech, very good lie detectors, prediction markets, other mind-altering drugs, cryptocurrency, better batteries, BCIs, nuclear fusion, better nuclear fission, better robots, AR, VR, room-temperature superconductors, quantum computers, polynomial time SAT solvers, cultured meat, solutions to antibiotic resistance, vaccines to some disease, optical computers, artificial wombs, de-extinction and graphene.

Bad options included just in case someone thinks they are good.

•••

Replying toVote on Interesting Disagreements

P.2y

Vote on Interesting Disagreements

Public mechanistic interpretability research is net positive in expectation.

•••

Replying toVote on Interesting Disagreements

P.2y

Vote on Interesting Disagreements

Cultural values are something like preferences over pairs of social environments and things we actually care about. So it makes sense to talk about jointly optimizing them.

•••

Make-A-Video by Meta AI

Meta AI (Facebook) created a text-to-video model by taking a diffusion text-to-image model, adding temporal convolutional and attention layers, and fine-tuning it with video data (without text). They also use spatial and temporal super-resolution networks. Showing, to the surprise of no one who was paying attention, that our existing mostly homogeneous architectures can be easily extended to understand, to some extent, the structure of everyday reality. It's not the first text-to-video model, but it's much better than what came before.

Stable Diffusion has been released

It's an open-source text-to-image model capable of producing NSFW content. I, for one, I'm very excited to see the consequences this has on society, which I expect to be mostly positive (except if it leads to an increase in AI funding). Any thoughts?

LESSWRONG
LW

LESSWRONG
LW

P.

Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment?

DALL·E 2 by OpenAI

Stable Diffusion has been released

Make-A-Video by Meta AI

P.

Make-A-Video by Meta AI

Stable Diffusion has been released

Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment?

DALL·E 2 by OpenAI

P.

Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment?

DALL·E 2 by OpenAI

Stable Diffusion has been released

Make-A-Video by Meta AI

P.

Make-A-Video by Meta AI

Stable Diffusion has been released

Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment?

DALL·E 2 by OpenAI