All of KvmanThinking's Comments + Replies

Great response, first of all. Strong upvoted.

 

My subconscious gave me the following answer, after lots of trying-to-get-it-to-give-me-a-satisfactory-answer:

"Everyone tells you that you're super smart, not because you actually are (in reality, you are probably only slightly smarter than average) but because you have a variety of other traits which are correlated with smartness (i.e: having weird hobbies/interests, getting generally good grades, knowing a lot of very big and complicated-sounding words, talking as if my speech is being translated literal... (read more)

Would that imply that there is a hard, rigid, and abrupt limit on how accurately you can predict the actions of a conscious being without actually creating a conscious being? And if so, where is this limit?

I guess you mean on an intuitive level, you feel you have X intelligence, but upon self-reflection, you think you have Y intelligence. And you can't change X to match Y.

Yes, that's exactly correct.

I am aware of all of these things, although I don't believe that I am smarter than others. (I alieve it, though; I just don't want to.) After all, if I were significantly smarter, then I would probably have successfully done some major intelligence-requiring thing by now. Also, if I were significantly smarter, I would be able to understand intelligence-requiring books. I fully believe, in my deliberate, conscious mind, that I am not significantly smarter than others in my environment; I just have lots of weird interests and stuff.

1Knight Lee
I guess you mean on an intuitive level, you feel you have X intelligence, but upon self-reflection, you think you have Y intelligence. And you can't change X to match Y. I don't know how you actually feel because I'm not you, but my guess is that a lot of other people are like this. They know their intuition is wrong and biased, but they can't change it. They might be too embarrassed to talk about it.

Is there a non-mysterious explanation somewhere of the idea that the universe "conserves information"?

2Ben Pace
That sounds good to me, i.e. draft this post and then make it a comment in one of those places instead (my weak guess is that a quick take is better, but whatever you like).
2Ben Pace
Posted either as a comment on the seasonal open thread or using the quick takes / shortform feature, which posts it in your shortform (e.g. here is my shortform). I'm saying that this seems to me not on the level of substance of a post, so it'd be better as a comment of one of the above two types, and also that it's plausible to me you'd probably get more engagement as a comment in the open thread.

Ah. Thanks! (by the way, when these questions get answered, should I take them down or leave them up for others?)

3Viliam
Leave them up, other people may be curious too, but too shy to ask.

Good point. Really sorry. Just changed it.

Has Musk tried to convince the other AI companies to also worry about safety?

3Milan W
Main concern right now is very much lab proliferation, ensuing coordination problems, and disagreements / adversarial communication / overall insane and polarized discourse.
* Google Deepmind: They are older than OpenAI. They also have a safety team. They are very much aware of the arguments. I don't know about Musk's impact on them.
* Anthropic: They split from OpenAI. To my best guess, they care about safety at least roughly as much as OpenAI does. Many safety researchers have been quitting OpenAI to go work for Anthropic over the past few years.
* xAI: Founded by Musk several years after he walked out from OpenAI. People working there have previously worked at other big labs. General consensus seems to be that their alignment plan (at least as explained by Elon) is quite confused.
* SSI: Founded by Ilya Sutskever after he walked out from OpenAI, which he did after participating in a failed effort to fire Sam Altman from OpenAI. Very much aware of the arguments.
* Meta AI: To the best of my knowledge, aware of the arguments but very dismissive of them (at least at the upper management levels).
* Mistral AI: I don't know much, but probably more or less the same as Meta AI, or worse.
* Chinese labs: No idea. I'll have to look into this.
I am confident that there are relatively influential people within Deepmind and Anthropic who post here and/or on the Alignment Forum. I am unsure about people from other labs, as I am nothing more than a relatively well-read outsider.

The difference is that if the Exxon Mobil CEO internalizes that (s)he is harming the environment, (s)he has to go and get a completely new job, probably building dams or something. But if Sam Altman internalizes that he is increasing our chance of extinction, all he has to do is tell all his capability researchers to work on alignment, and money is still coming in; only now, less of it comes from ChatGPT subscriptions and more of it comes from grants from the Long-Term Future Fund. It's a much easier and lighter shift. Additionally, he knows that he can go... (read more)

Could someone add "7. Why do very smart people in high places say this is not a problem?" (Or you could just reply to this comment.)

quantum immortality is not going to work out

How come?

That's a pretty big "somehow".

1Milan W
Oh I know! That is why I added "somehow". But I am also very unsure over exactly how hard it is. Seems like a thing worth whiteboarding over for an hour and then maybe doing a weekend-project-sized test about.

I don't think such considerations apply to upvotes nearly as much, if at all. Upvotes indicate agreement or approval, which doesn't need to be explained as thoroughly as disagreement (which usually involves having separate, alternative ideas in your head, different from the ideas of the one you are disagreeing with).

5Vladimir_Nesov
Whether upvotes need to be explained overall is not relevant to my comment, as I'm talking about the specific considerations named by Noah Birnbaum.

I believe that the reason your comment was strong downvoted was because you implied that "everyone repeating things already said" is an inevitable consequence of asking people why they disagree with you. This might be true on other websites (where people are hesitant to relinquish beliefs and opinions), but not on LessWrong.

4Richard_Kennaway
Even on LW, there comes a point where everything has been said and further discussion will foreseeably be unuseful.

I had upvoted the ones I agreed with and thought were helpful. If I agree with something, I will upvote, because simply saying "I agree" is unnecessary when I can just click on a check mark. I appreciate and recognize the effort of those 5 other people who commented, but that is well enough communicated through agreement karma. Just because I have nothing to say about a response someone provided doesn't mean I don't value it.

Your answer wasn't cryptic at all. Don't worry. This is a great answer. Let me know when you're done with that sequence. I'll have to read it.

(Also, it's horrifying that people can be hypnotized against their will. That makes me simultaneously thankful-that and curious-why it isn't more widely practiced...) 

Something like TNIL or Real Character might be used for maximum intellectual utility. But I cannot see how simply minimizing the amount of words that need to exist for compact yet precise communication would help correct the corrupted machinery our minds run on.

-3ChristianKl
I don't think the mental model of "corrupted machinery" is a very useful one. Humans reason by using heuristics. Many heuristics have advantages and disadvantages instead of being perfect. Sometimes that's because they are making tradeoffs, other times it's because they have random quirks.  Real Character was a failed experiment. I don't know how capable Ithkuil IV happens to be. 

By "make its users more aware of their biases" I mean, for example, a language where it's really obvious when you say something illogical, or have a flaw in your reasoning.

Some ideas I had for this:

  • Explicitly defined semantic spaces for every word, to dissolve questions and help people agree on the locations of phenomena in thingspace. Mechanisms for searching thingspace (while, for example, you can say "red chair" to narrow the space of all chairs down to the space of all chairs which reflect red light, it would be nice to be able to express things like "t
... (read more)

why are people downvoting?

[This comment is no longer endorsed by its author]
2Said Achmiz
It’s not censored; the asterisk is used here in the computer science sense, meaning “wildcard”; “veg*an” is short for “vegan or vegetarian”.

Also, it helps taboo your words. For example, "Toki Pona helps taboo your words" would be rendered as
tenpo toki pi toki pona li sama e tenpo toki pi ni: jan li ken kepeken ala e nimi pi ken ala sona pi pali lili.
"(the) speech-time related to Toki Pona is similar or the same as (the) speech-time with this quality: (the) person cannot use word(s) which cannot be known via small effort." 

Before you complain that this is too long a phrase to be used practically, try to explain the concept of rationalist taboo in fewer syllables than I did in Toki Pona, whilst not relying on other rationalist jargon.

by "making an AI that builds utopia and stuff" I mean an AI that would act in such a way that rather than simply obeying the intent of its promptors, it goes and actively improves the world in the optimal way. An AI which has fully worked out Fun Theory and simply goes around filling the universe with pleasure and beauty and freedom and love and complexity in such a way that no other way would be more Fun.

1Robert Cousineau
That would be described well by the CEV link above.  

it will not consider it acceptable to kill me and instantly replace me with a perfect copy

Why not? I would find this acceptable, considering you are your information system.

I disagree with your disagreement with Eliezer and Connor's conclusions, but I still upvoted because you worded your argument and response quite well. Judging from your comments, you seem not to have a very high opinion of LessWrong, and yet you choose to interact anyways, because you would like to counterargue. You think we are just a big echo chamber of doom, and yet you learn our jargon. Good job. I disagree with what you say, but thank you for being the dissent we encourage. If you would like to know why we believe what we do, you would do well to read t... (read more)

I have an irrational preference

If your utility function weights you knowing things higher than most people's, that is not an irrationality.

It's "101"? I searched the regular internet to find out, but I got some yes's and some no's, which I suspect were just due to different definitions of intelligence.

It's controversial?? Has that stopped us before? When was it done to death?

I'm just confused, because if people downvote my stuff, they're probably trying to tell me something, and I don't know what it is. So I'm just curious.

Thanks. By the way, do you know why this question is getting downvoted?

4Maxwell Peterson
Guesses: people see it as too 101 of a question; people think it's too controversial / has been done to death many years ago; one guy with a lot of karma hates the whole concept and strong-downvoted it. I think the 101 idea is most likely. But I don't think it's a bad question, so I've upvoted it.

I already figured that. The point of this question was to ask if there could possibly exist things that look indistinguishable from true alignment solutions (even to smart people), but that aren't actually alignment solutions. Do you think things like this could exist?
 

By the way, good luck with your plan. Seeing people actively go out and do actually meaningful work to save the world gives me hope for the future. Just try not to burn out. Smart people are more useful to humanity when their mental health is in good shape.

7johnswentworth
I'm pretty uncertain on this one. Could a superintelligence find a plan which fools me? Yes. Will such a plan show up early on in a search order without actively trying to fool me? Ehh... harder to say. It's definitely a possibility I keep in mind. Most importantly, over time as our understanding improves on the theory side, it gets less and less likely that a plan which would fool me shows up early in a natural search order.
  1. Yes, human intelligence augmentation sounds like a good idea.
  2. There are all sorts of "strategies" (turn it off, raise it like a kid, disincentivize changing the environment, use a weaker AI to align it) that people come up with when they're new to the field of AI safety, but that are ineffective. And their ineffectiveness is only obvious and explainable by people who specifically know how AI behaves. Suppose there are strategies whose ineffectiveness is only obvious and explainable by people who know way more about decisions and agents and optimal strategi
... (read more)
1[anonymous]
yep but the first three all fail for the shared reason of "programs will do what they say to do, including in response to your efforts". (the fourth one, 'use a weaker AI to align it', is at least obviously not itself a solution. the weakest form of it, using an LLM to assist an alignment researcher, is possible, and some less weak forms likely are too.)

when i think of other 'newly heard of alignment' proposals, like boxing, most of them seem to fail because the proposer doesn't actually have a model of how this is supposed to work or help in the first place. (the strong version of 'use ai to align it' probably fits better here)

(there are some issues which a programmatic model doesn't automatically make obvious to a human: they must follow from it, but one could fail to see them without making that basic mistake. probable environment hacking and decision theory issues come to mind. i agree that on general priors this is some evidence that there are deeper subjects that would not be noticed even conditional on those researchers approving a solution.)

i guess my next response then would be that some subjects are bounded, and we might notice (if not 'be able to prove') such bounds telling us 'there's not more things beyond what you have already written down', which would be negative evidence (strength depending on how strongly we've identified a bound). (this is more of an intuition, i don't know how to elaborate this)

(also on what johnswentworth wrote: a similar point i was considering making is that the question is set up in a way that forces you into playing a game of "show how you'd outperform magnus carlsen {those researchers} in chess alignment theory" - for any consideration you can think of, one can respond that those researchers will probably also think of it, which might preclude them from actually approving, which makes the conditional 'they approve but it's wrong'[1] harder to be true and basically dependent on them instead of object-level properties
1[comment deleted]

Uh, this is a human. Humans find it much harder to rationalize away the suffering of other humans, compared to rationalizing animal suffering.

6Kaj_Sotala
Historically there were plenty of rationalizations for slavery, including ones holding that slaves weren't really people and were on par with animals. Such an argument would be much easier for a mind running on a computer and with no physical body - "oh it just copies the appearance of suffering but it doesn't really suffer".

And the regular, average people in this future timeline consider stuff like this ethically okay?

8Kaj_Sotala
Compare to e.g. factory farming today, which also persists despite a lot of people thinking it not okay (while others don't care).

hack reality via pure math

What - exactly - do you mean by that?

The above statement could be applied to a LOT of other posts too, not just this one.

How were these discovered? Slow, deliberate thinking, or someone trying some random thing to see what it does and suddenly the AI is a zillion times smarter?

2Marcus Williams
"We offer no explanation as to why these architectures seem to work; we attribute their success, as all else, to divine benevolence." -SwiGLU paper. I think it varies, a few of these are trying "random" things, but mostly they are educated guesses which are then validated empirically. Often there is a spefic problem we want to solve i.e. exploding gradients or O(n^2) attention and then authors try things which may or may not solve/mitigate the problem.

I certainly believe he could. After reading Tamsin Leake's "everything is okay" (click the link if you dare), I felt a little unstable, and felt like I had to expend deliberate effort to not think about the described world in sufficient detail in order to protect my sanity. I felt like I was reading something that had been maximized by a semi-powerful AI to be moving, almost infohazardously moving, but not quite; that this approached the upper bound of what humans could read while still accepting the imperfection of their current conditions.

utopia

It's a protopia. It is a world better than ours. It is not perfect. It would be advisable to keep this in mind. dath ilan likely has its own, separate problems.

And I’m not even mentioning the strange sexual dynamics

Is this a joke? I'm confused.

yeah, the moment i looked at the big diagram my brain sort of pleasantly overheated

I think the flaw is how he claims this:

No one begins to truly search for the Way until their parents have failed them, their gods are dead, and their tools have shattered in their hand.

I think that these three things are not things that cause a desire for rationality, but things that rationality makes you notice.

why is this so downvoted? just curious

If I am not sufficiently terrified by the prospect of our extinction, I will not take as many steps to try and reduce its likelihood. If my subconscious does not internalize this sufficiently, I will not be as motivated. Said subconscious happiness affects my conscious reasoning without me consciously noticing.

Harry's brain tried to calculate the ramifications and implications of this and ran out of swap space.

this is very relatable

That's a partial focus.

1ZY
I don't understand either. If it means what it seems to mean, this is a very biased perception and not very rational (truth-seeking or causality-seeking). There should be better education systems to fix that.

i'd pick dust & youtube. I intrinsically value fairness

The YouTube is pure happiness. The sublimity is some happiness and some value. Therefore I choose the sublimity, but if it was "Wireheading vs. Youtube", or "Sublimity vs. seeing a motivational quote", I would choose the YouTube or the motivational quote, because I intrinsically value fairness.

Ok, yeah, I don't think the chances are much smaller than one in a million. But I do think the chances are not increased much by cryonics. Here, let me explain my reasoning. 

I assume that eventually, humanity will fall into a topia (Tammy's definition) or go extinct. Given that it does not go extinct, it will spend a very long amount of subjective time, possibly infinite, in said topia. In the event that this is some sort of brilliant paradise of maximum molecular fun where I can make stuff for eternity, we can probably reconstruct a person solely bas... (read more)

2Viliam
Well, there are different opinions on the possibility of reconstructing a person. Some people here would agree with you. I am afraid that there will not be enough evidence left to reconstruct the person, even if we had all their writings, and we usually don't have even that.