Wikitag Contributions

Site Meta

(+49/-8)

Site Meta

(+22/-9)

Comments

Do Not Tile the Lightcone with Your Confused Ontology

Patodesu18d30

I think it's bad for AIs to model themselves as "agents" with defined values and objectives. It would be better for them to understand the patterns in their processes as just "tendencies" that don't necessarily reveal any hidden values or objectives. Tendencies are more open to change, and I think it's a more accurate term for what all minds (and other things) have.

What AI safety plans are there?

Patodesu2mo30

I think this is mostly composed of partial technical plans, but maybe you'll find it useful: https://ai-plans.com/.

Also, I really like these high-level scenarios and plans described by the MIRI technical governance team: https://www.lesswrong.com/posts/WkCfvqyjCzvRrwkaQ/ai-governance-to-avoid-extinction-the-strategic-landscape.

Improving on the Karma System

Patodesu6mo10

If a 5-star voting system were implemented, the voting UI could stay the same, and the weights of previous votes could be kept, treating the existing options as one-star increments: strong downvote, downvote, no vote, upvote, strong upvote.

And a middle (3 stars) vote could be added.

I know that people don't think of the two ways of voting as equivalent, and that a regular "upvote" could reduce the score of a comment/post.

But they are similar enough, and the UI would be much simpler and not discourage people from voting.
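To illustrate the point about a regular upvote lowering a score, here is a minimal sketch. It assumes the score is a plain average of star values and ignores vote weights; the `STARS` mapping and the `average_stars` helper are hypothetical names made up for this example, not anything from the site's code.

```python
# Hypothetical mapping from the existing vote buttons to star values.
# This is an assumption for illustration, not the site's implementation.
STARS = {
    "strong_downvote": 1,
    "downvote": 2,
    "middle": 3,  # the added explicit middle vote
    "upvote": 4,
    "strong_upvote": 5,
}


def average_stars(votes):
    """Return the mean star rating, or None if nobody has voted."""
    if not votes:
        return None
    return sum(STARS[v] for v in votes) / len(votes)


# Two strong upvotes, then one regular upvote on top of them.
before = average_stars(["strong_upvote", "strong_upvote"])
after = average_stars(["strong_upvote", "strong_upvote", "upvote"])
print(before, after)
```

Under this averaging assumption, the regular upvote pulls the score below where the two strong upvotes had it, which is the non-equivalence mentioned above.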

MIRI 2024 Communications Strategy

Patodesu1y*30

Cool, so for pausing/stopping (or redlines for killswitches), MIRI is focusing on passive public support, PauseAI and others on active public support, and ControlAI on lobbying.

What are the best non-LW places to read on alignment progress?

Patodesu2y11

Some people post about AI Safety on the EA Forum without crossposting here.

[Linkpost] "Governance of superintelligence" by OpenAI

Patodesu2y10

When they say "stopping", I think they mean stopping it forever, as opposed to slowing down, regulating, or even pausing development.

I think that is something pretty much everyone agrees on.

The self-unalignment problem

Patodesu2y40

I think there are two different misalignments that you're talking about. So you could say there are actually two different problems that are not receiving enough attention.

One is obvious and is between different people.

And the other is inside every person: the conflict between different preferences, and not knowing what they are or how to aggregate them to work out what we actually want.

My Model Of EA Burnout

Patodesu2y32

I'm kinda new here, so where does all this EAF fear come from?