All of Q Home's Comments + Replies

Q Home10

So at first I thought this didn't include a step where the AI learns to care about things - it only learns to model things. But I think actually you're assuming that we can just directly use the model to pick actions that have predicted good outcomes - which are going to be selected as "good" according to the pre-specified P-properties. This is a flaw because it's leaving too much hard work for the specifiers to do - we want the environment to do way more work at selecting what's "good."

I assume we get an easily interpretable model where the difference bet... (read more)

Q Home50

The subproblem of environmental goals is just to make AI care about natural enough (from the human perspective) "causes" of sensory data, not to align AI to the entirety of human values. Fundamental variables have no (direct) relation to the latter problem.

However, fundamental variables would be helpful for defining impact measures if we had a principled way to differentiate "times when it's OK to sidestep fundamental variables" from "times when it's NOT OK to sidestep fundamental variables". That's where the things you're talking about definitely become a... (read more)

1Capybasilisk
Thanks. That makes sense.
Q Home50

Thank you for actually engaging with the idea (pointing out problems and whatnot) rather than just suggesting reading material.

Btw, would you count a data packet as an object you move through space?

A couple of points:

  • I only assume AI models the world as "objects" moving through space and time, without restricting what those objects could be. So yes, a data packet might count.
  • "Fundamental variables" don't have to capture all typical effects of humans on the world, they only need to capture typical human actions which humans themselves can easily perc
... (read more)
1Capybasilisk
Ok, that clears things up a lot. However, I still worry that if it's at the AI's discretion when and where to sidestep the fundamental variables, we're back at the regular alignment problem. You have to be reasonably certain what the AI is going to do in extremely out of distribution scenarios.
Q Home60

Epistemic status: Draft of a post. I want to propose a method of learning environmental goals (a super big, super important subproblem in Alignment). It's informal, so it has a lot of gaps. I worry I missed something obvious, rendering my argument completely meaningless. I asked the LessWrong feedback team, but they couldn't get someone knowledgeable enough to take a look.

Can you tell me the biggest conceptual problems of my method? Can you tell me if agent foundations researchers are aware of this method or not?

If you're not familiar with the problem, here's the... (read more)

Q Home20

Sorry if it's not appropriate for this site. But is anybody interested in chess research? I've seen that people here might be interested in chess. For example, here's a chess post barely related to AI.

Intro

In chess, what positions have the longest forced wins? "Mate in N" positions can be split into 3 types:

  1. Positions which use "tricks", such as cycles of repeating moves, to get a large number of moves before checkmate. For example, this manmade mate in 415 (see the last position) uses obvious cycles. Not to mention mates in omega.
  2. Tablebase checkmates, di
... (read more)
Q Home10

Agree that neopronouns are dumb. Wikipedia says they're used by 4% of LGBTQ people and criticized both within and outside the community.

But for people struggling with normal pronouns (he/she/they), I have the following thoughts:

  • Contorting language to avoid words associated with beliefs... is not easier than using the words. Don't project beliefs onto words too hard.
  • Contorting language to avoid words associated with beliefs... is still a violation of free speech (if we have such a strong notion of free speech). So what is the motivation to propose that? It'
... (read more)
Q Home00

I think there should be more spaces where controversial ideas can be debated. I'm not against spaces without pronoun rules, just don't think every place should be like this. Also, if we create a space for political debate, we need to really make sure that the norms don't punish everyone who opposes centrism & the right. (Over-sensitive norms like "if you said that some opinion is transphobic you're uncivil/shaming/manipulative and should get banned" might do this.) Otherwise it's not free speech either. Will just produce another Grey or Red Tribe inste... (read more)

4Viliam
Well, the primary goal of this place is to advance rationality and AI safety. Not the victory of any specific political tribe. And neither conformity nor contrarianism for its own sake.

Employees get paid, which kinda automatically reduces their free speech, because saying the wrong words can make them stop getting paid.

What is an (un)acceptable concession? For me, it is a question of effort and what value I receive in return. I value niceness, so by default people get their wishes granted, unless I forget. Some requests I consider arbitrary and annoying, so they don't get them. Yeah, those are subjective criteria. But I am not here to get paid; I am here to enjoy the talk.

(What annoys me: asking to use pronouns other than he/she/they. I do not talk about people's past for no good reason, and definitely not just to annoy someone else. But if I have a good reason to point out that someone did something in the past, and the only way to do that is to reveal their previous name, then I don't care about the taboo.)

Employment is really a different situation. You get laws, and recommendations of your legal department; there is not much anyone can do about that. And the rest is about the balance of power, where the individual employee is often in a much worse bargaining position.
Q Home30

I'll describe my general thoughts, like you did.

I think about transness in a similar way to how I think about homo/bisexuality.

  • If homo/bisexuality is outlawed, people are gonna suffer. Bad.
  • If I could erase homo/bisexuality from existence without creating suffering, I wouldn't anyway. Would be a big violation of people's freedom to choose their identity and actions (even if in practice most people don't actually "choose" to be homo/bisexual).
  • Different people have homo/bisexuality of different "strength" and form. One man might fall in love with another
... (read more)
5Viliam
I agree with most of that, but it seems to me that respecting homosexuality is mostly a passive action; if you ignore what other people do, you are already maybe 90% there. Homosexuals don't change their names or pronouns after coming out. You don't have to pretend that ten years ago they were something else than they appeared to you at that time. With transsexuality, you get the taboo of deadnaming, and occasionally the weird pronouns.

Also, the reaction seems different when you try to opt out of the game. Like, if someone is uncomfortable with homosexuality, they can say "could we please just... not discuss our sexual relations here, and focus on the job (or some other reason why we are here)?" and that's usually accepted. If someone similarly says "could we please just... call everyone 'they' as a compromise solution, or simply refer to people using their names", that already got some people cancelled. Shortly, with homosexuals I never felt like my free speech was under attack.

It is possible that most of the weirdness and pushing boundaries does not actually come from the transsexuals themselves, but rather from woke people who try to be their "allies". Either way, in effect, whenever a discussion about trans topics starts, I feel like "oh my, the woke hordes are coming, people are going to get cancelled". (And I am not really concerned about myself here, because I am not American, so my job is not on the line; and if some online community decides to ban me, well then fuck them. But I don't want to be in a community where people need to watch their tongues, and get filtered by political conformity.)
Q Home10

Draft of a future post, any feedback is welcome. Continuation of a thought from this shortform post.


(picture: https://en.wikipedia.org/wiki/Drawing_Hands)

The problem

There's an alignment-related problem: how do we make an AI care about causes of a particular sensory pattern? What are "causes" of a particular sensory pattern in the first place? You want the AI to differentiate between "putting a real strawberry on a plate" and "creating a perfect illusion of a strawberry on a plate", but what's the difference between doing real things and creating perfec... (read more)

Q Home10

Napoleon is merely an argument for "just because you strongly believe it, even if it is a statement about you, does not necessarily make it true".

When people make arguments, they often don't list all of the premises. That's not unique to trans discourse. Informal reasoning is hard to make fully explicit. "Your argument doesn't explicitly exclude every counterexample" is a pretty cheap counter-argument. What people experience is important evidence and an important factor, it's rational to bring up instead of stopping yourself with "wait, I'm not allowed ... (read more)

9Viliam
Two major points.

1) It annoys me if someone insists that I accept their theory about what being trans really is. Zack insists that Blanchard is right, and that I fail at rationality if I disagree with him. People on Twitter and Reddit insist that Blanchard is wrong, and that I fail at being a decent human if I disagree with them. My opinion is that I have no comparative advantage at figuring out who is right and who is wrong on this topic, or maybe everyone is wrong, anyway it is an empirical question and I don't have the data. I hope that people who have more data and better education will one day sort it out, but until that happens, my position firmly remains "I don't know (and most likely neither do you), stop bothering me".

Also, from a larger perspective, this is moving the goalposts. Long ago, tolerance was defined as basically not hurting other people, and letting them do whatever they want as long as it does not hurt others. Recently it also includes agreeing with the beliefs of their woke representatives. (Note that this is about the representatives, not the people being represented. Two trans people can have different opinions, but you are required to believe the woke one and oppose the non-woke one.) Otherwise, you are transphobic. I completely reject that.

Furthermore, I claim that even trans people themselves are not necessarily experts on themselves. Science exists for a reason, otherwise we could just make opinion polls. Shortly: disagreement is not hate. But it often gets conflated, especially in environments that overwhelmingly contain people of one political tribe.

2) Every cause gets abused. It is bad if it becomes a taboo to point this out. A few months (or is it already years?) ago, there was an epidemic of teenagers on TikTok who appeared to have developed Tourette syndrome overnight. A few weeks or months later, apparently the epidemic was gone. I have no way to check those teenagers, but I think it is reasonable to assume that many of th
Q Home-3-3

Even if we assume that there should be a crisp physical cause of "transness" (which is already a value-laden choice), we need to make a couple of value-laden choices before concluding if "being trans" is similar to "believing you're Napoleon" or not. Without more context it's not clear why you bring up Napoleon. I assume the idea is "if gender = hormones (gender essentialism), and trans people have the right hormones, then they're not deluded". But you can arrive at the same conclusion ("trans people are not deluded") by means other than gender essentialis... (read more)

4Viliam
Napoleon is merely an argument for "just because you strongly believe it, even if it is a statement about you, does not necessarily make it true".

We will probably disagree on this, but the only reason I care about trans issues is that some people report significant suffering (gender dysphoria) from their current situation, and I am in favor of people not suffering, so I generally try not to be an asshole. Unfortunately, for every person who suffers from something, there are probably a dozen people out there who cosplay their condition... because it makes them popular on Twitter I guess, or just gives them another opportunity to annoy their neighbors. I have no empathy for those. Play your silly games, if you wish, but don't expect me to play along, and definitely don't threaten me to play along.

Also, the cosplayers often make the situation more difficult for those who genuinely have the condition, by speaking in their name, and often saying things that the people who actually have the condition would disagree with... and in the most ironic cases, the cosplayers get them cancelled. So I don't mind being an asshole to the cosplayers, because from my perspective, they started it first.

The word "deadnaming" is itself hysterical. (Who died? No one.)

Gender essentialism? I don't make any metaphysical claim about essences. People simply are born with male or female bodies (yes, I know that some are intersex), and some people are strongly unhappy about their state. I find it plausible that there may be an underlying biological reason for that; and hormones seem like a likely candidate, because that's how the body communicates many things. I don't have a strong opinion on that, because I have never felt a desire to be one sex or the other, just like I have never felt a strong desire to have a certain color of eyes, or hair, or skin, whether it would be the one I have or some that I have not.

I expect that you will disagree with a lot of this, and that's okay; I am not tryi
Q Home0-1

There are people who feel strongly that they are Napoleon. If you want to convince me, you need to make a stronger case than that.

It's confusing to me that you go to the "I identify as an attack helicopter" argument after treating biological sex as private information & respecting pronouns out of politeness. I thought you already realized that "choosing your gender identity" and "being deluded you're another person" are different categories.

If someone presented as male for 50 years, then changed to female, it makes sense to use "he" to refer to their f

... (read more)
2Viliam
Ah, I disagree, and I don't really wish to discuss the details, so just shortly:

  • I assume that for trans people being trans is something more than mere "choice" (even if I don't wish to make guesses what exactly, I suspect something with hormones; this is an empirical question for smart people to figure out). If this turns out not to be true, I will probably be annoyed.
  • If you introduce yourself as "Jane" today, I will refer to you as "Jane". But if 50 years ago you introduced yourself as "John", that is a fact about the past. I am not saying that "you were John" as some kind of metaphysical statement, but that "everyone, including you, referred to you as John" 50 years ago, which is a statement of fact.
Q Home31

Meta-level comment: I don't think it's good to dismiss original arguments immediately and completely.

Object-level comment:

Neither of those claims has anything to do with humans being the “winners” of evolution.

I think it might be more complicated than that:

  1. We need to define what "a model produced by a reward function" means, otherwise the claims are meaningless. Like, if you made just a single update to the model (based on the reward function), calling it "a model produced by the reward function" is meaningless ('cause no real optimization pressure w
... (read more)
Q Home10

My point is that chairs and humans can be considered in a similar way.

Please explain how your point connects to my original message: are you arguing with it or supporting it or want to learn how my idea applies to something?

Q Home10

I see. But I'm not talking about figuring out human preferences, I'm talking about finding world-models in which real objects (such as "strawberries" or "chairs") can be identified. Sorry if it wasn't clear in my original message because I mentioned "caring".

Models or real objects or things capture something that is not literally present in the world. The world contains shadows of these things, and the most straightforward way of finding models is by looking at the shadows and learning from them.

You might need to specify what you mean a little bit.

The ... (read more)

3Vladimir_Nesov
My point is that chairs and humans can be considered in a similar way. There's the world as a whole that generates observations, and particular objects on their own. A model that cares about individual objects needs to consider them separately from the world. The same object in a different world/situation should still make sense, so there are many possibilities for the way an object can be when placed in some context and allowed to develop. This can be useful for modularity, but also for formulating properties of particular objects, in a way that doesn't get distorted by the influence of the rest of the world. Human preferences is one such property.
Q Home10

Creating an inhumanly good model of a human is related to formulating their preferences.

How does this relate to my idea? I'm not talking about figuring out human preferences.

Thus it's a step towards eliminating path-dependence of particular life stories

What is "path-dependence of particular life stories"?

I think things (minds, physical objects, social phenomena) should be characterized by computations that they could simulate/incarnate.

Are there other ways to characterize objects? Feels like a very general (or even fully general) framework. I believe my idea can be framed like this, too.

2Vladimir_Nesov
Models or real objects or things capture something that is not literally present in the world. The world contains shadows of these things, and the most straightforward way of finding models is by looking at the shadows and learning from them. Hypotheses is another toy example. One of the features of models/things seems to be how they capture the many possibilities of a system simultaneously, rather than isolated particular possibilities. So what I gestured at was that when considering models of humans, the real objects or models behind a human capture the many possibilities of the way that human could be, rather than only the actuality of how they actually are. And this seems useful for figuring out their preferences. Path-dependence is the way outcomes depend on the path that was taken to reach them. A path-independent outcome is convergent, it's always the same destination regardless of the path that was taken. Human preferences seem to be path dependent on human timescales, growing up in Egypt may lead to a persistently different mindset from the same human growing up in Canada.
Q Home*52

There's an alignment-related problem, the problem of defining real objects. Relevant topics: environmental goals; task identification problem; "look where I'm pointing, not at my finger"; The Pointers Problem; Eliciting Latent Knowledge.

I think I realized how people go from caring about sensory data to caring about real objects. But I need help with figuring out how to capitalize on the idea.

So... how do humans do it?

  1. Humans create very small models for predicting very small/basic aspects of sensory input (mini-models).
  2. Humans use mini-models as puzzle
... (read more)
2Vladimir_Nesov
Creating an inhumanly good model of a human is related to formulating their preferences. A model captures many possibilities and the way many hypothetical things are simulated in the training data. Thus it's a step towards eliminating path-dependence of particular life stories (and preferences they motivate), by considering these possibilities altogether. Even if some of the possible life stories interact with distortionary influences, others remain untouched, and so must continue deciding their own path, for there are no external influences there and they are the final authority for what counts as aiding them anyway.
6[anonymous]
Another highly relevant post: The Pointers Problem.
Q Home10

I don't understand the Model-Utility Learning (MUL) section: what pathological behavior does the AI exhibit?

Since humans (or something) must be labeling the original training examples, the hypothesis that building bridges means "what humans label as building bridges" will always be at least as accurate as the intended classifier. I don't mean "whatever humans would label". I mean the hypothesis that "build a bridge" means specifically the physical situations which were recorded as training examples for this system in particular, and labeled by humans as such.

... (read more)
Q Home20

I'm noticing two things:

  1. It's suspicious to me that values of humans-who-like-paperclips are inherently tied to acquiring an unlimited amount of resources (no matter in which way). Maybe I don't treat such values as 100% innocent, so I'm OK keeping them in check. Though we can come up with thought experiments where the urge to get more resources is justified by something. Like, maybe instead of producing paperclips those people want to calculate Busy Beaver numbers, so they want more and more computronium for that.
  2. How consensual were the trades if their outcome is predictable and other groups of people don't agree with the outcome? Looks like coercion.
Q Home20

Often I see people dismiss the things the Epicureans got right with an appeal to their lack of the scientific method, which has always seemed a bit backwards to me.

The most important thing, I think, is not even hitting the nail on the head, but knowing (i.e. really acknowledging) that a nail can be hit in multiple places. If you know that, the rest is just a matter of testing.

6Self
~Don't aim for the correct solution, (first) aim for understanding the space of possible solutions
Q Home10

But avoidance of value drift or of unendorsed long term instability of one's personality is less obvious.

What if endorsed long term instability leads to negation of personal identity too? (That's something I thought about.)

Q Home10

I think corrigibility is the ability to change a value/goal system. That's the literal meaning of the term... "Correctable". If an AI were fully aligned, there would be no need to correct it.

Perhaps I should make a better argument:

It's possible that AGI is correctable, but (a) we don't know what needs to be corrected or (b) we cause new, less noticeable problems, while correcting AGI.

So, I think there's not two assumptions "alignment/interpretability is not solved + AGI is incorrigible", but only one — "alignment/interpretability is not solved". (A strong... (read more)

Q Home10

It's not aligned at every possible point in time.

I think corrigibility is "AGI doesn't try to kill everyone and doesn't try to prevent/manipulate its modification". Therefore, in some global sense such AGI is aligned at every point in time. Even if it causes a local disaster.

Over 90%, as I said

Then I agree, thank you for re-explaining your opinion. But I think other probabilities count as high too.

To me, the ingredients of danger (but not "> 90%") are those:

  • 1st. AGI can be built without Alignment/Interpretability being solved. If that's true,
... (read more)
2TAG
I think corrigibility is the ability to change a value/goal system. That's the literal meaning of the term... "Correctable". If an AI were fully aligned, there would be no need to correct it. Yes, there are dangers other than a high probability of killing almost everyone. I didn't say there aren't. But it's motte-and-baileying to fall back to "what about these lesser risks". Yes, and that's the specific argument I am addressing, not AI risk in general. Except that if it's many many times smarter, it's ASI, not AGI.
Q Home10

why is “superintelligence + misalignment” highly conjunctive?

In the sense that matters, it needs to be fast, surreptitious, incorrigible, etc.

What opinion are you currently arguing? That the risk is below 90% or something else? What counts as "high probability" for you?

Incorrigible misalignment is at least one extra assumption.

I think "corrigible misalignment" doesn't exist, corrigble AGI is already aligned (unless AGI can kill everyone very fast by pure accident). But we can have differently defined terms. To avoid confusion, please give example... (read more)

5TAG
Over 90%, as I said.

It's not aligned at every possible point in time.

I'm talking about the Foom scenario that has been discussed endlessly here. The complete argument for Foom Doom is that:

  1. the AI will have goals/values in the first place (it won't be a tool like GPT*),
  2. the values will be misaligned, however subtly, to be unfavorable to humanity,
  3. that the misalignment cannot be detected or corrected,
  4. that the AI can achieve value stability under self modification,
  5. that the AI will self modify in a way too fast to stop,
  6. and that most misaligned values in the resulting ASI are highly dangerous.
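For concreteness, here is a minimal sketch of the arithmetic behind calling this "highly conjunctive". The per-claim probabilities are illustrative assumptions (not numbers given by either commenter), and the claims are treated as independent for simplicity:

```python
# Illustrative only: the per-claim probabilities below are assumptions,
# not numbers from the discussion, and the claims are treated as independent.
# The point is just that a conjunction of several individually-plausible
# claims can end up well below 90%.
claims = {
    "AI has goals/values at all":               0.9,
    "those values are (subtly) misaligned":     0.8,
    "misalignment can't be detected/corrected": 0.7,
    "value stability under self-modification":  0.8,
    "self-modification too fast to stop":       0.6,
    "resulting misaligned ASI is dangerous":    0.9,
}

p_all = 1.0
for claim, p in claims.items():
    p_all *= p

print(f"P(all six claims hold) = {p_all:.2f}")  # ~0.22 with these inputs
```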
2[comment deleted]
Q Home10

I've confused you with people who deny that a misaligned AGI is even capable of killing most humans. Glad to be wrong about you.

But I am not saying that the doom is unlikely given superintelligence and misalignment, I am saying the argument that gets there -- superintelligence + misalignment -- is highly conjunctive. The final step, the execution as it were, is not highly conjunctive.

But I don't agree that it's highly conjunctive.

  • If AGI is possible, then its superintelligence is a given. Superintelligence isn't given only if AGI stops at human level
... (read more)
3TAG
It needs to happen quickly or surreptitiously to be a problem. Incorrigible misalignment is at least one extra assumption. In the sense that matters, it needs to be fast, surreptitious, incorrigible, etc. Huh?
Q Home10

Yes, I probably mean something other than ">90%".

[lists of various catastrophes, many of which have nothing to do with AI]

Why are you doing this? I did not say there is zero risk of anything. (...) Are you using "risk" to mean the probability of the outcome , or the impact of the outcome?

My argument is based on comparing the phenomenon of AGI to other dangerous phenomena. The argument is intended to show that a bad outcome is likely (if AGI wants to do a bad thing, it can achieve it) and that the impact of the outcome can kill most humans.

I think its

... (read more)
2TAG
But I am not saying that the doom is unlikely given superintelligence and misalignment, I am saying the argument that gets there -- superintelligence + misalignment -- is highly conjunctive. The final step, the execution as it were, is not highly conjunctive. Why not?
Q Home10

Informal logic is more holistic than not, I think, because it relies on implicit assumptions.

It's not black and white. I don't think they are zero risk, and I don't think it is Certain Doom, so it's not what I am talking about. Why are you bringing it up? Do you think there is a simpler argument for Certain Doom?

Could you proactively describe your opinion? Or re-describe it, by adding relevant details. You seemed to say "if hard takeoff, then likely doom; but hard takeoff is unlikely, because hard takeoff requires a conjunction of things to be true". I... (read more)

2TAG
Straining after implicit meanings can cause you to miss explicit ones. I think it's needed for the "likely". Slow takeoff gives humans more time to notice and fix problems, so the likelihood of bad outcomes goes down. Wasn't that obvious? I do mean >90%. If you mean something else, you are probably talking past me. Why are you doing this? I did not say there is zero risk of anything. Are you using "risk" to mean the probability of the outcome, or the impact of the outcome?
Q Home10

Why ? I'm saying p(doom) is not high. I didn't mention P(otherstuff).

To be able to argue something (/decide how to go about arguing something), I need to have an idea about your overall beliefs.

That doesn't imply a high probability of mass extinction.

Could you clarify what your own opinion even is? You seem to agree that rapid self-improvement would mean likely doom. But you aren't worried about gradual self-improvement or AGI being dangerously smart without much (self-)improvement?

2TAG
No, for me to argue something I only need to state the premises relevant to the conclusion, which in this case are:

  1. high probability of existential doom is a complex conjunctive argument
  2. laws of probability.

Logic isn't holistic.

It's not black and white. I don't think they are zero risk, and I don't think it is Certain Doom, so it's not what I am talking about. Why are you bringing it up? Do you think there is a simpler argument for Certain Doom?

Doom meaning what? It's obvious that there is some level of risk, but some level of risk isn't Certain Doom. Certain Doom is an extraordinary claim, and the burden of proof therefore is on (certain) doomers. But you seem to be switching between different definitions. Saying “the most dangerous technology with the worst safety and the worst potential to control it” doesn't actually imply a high level of doom (p > 0.9) or a high level of risk (> 90% dead) -- it's only a relative statement. My stated argument has nothing to do with the AGI being impotent given all the premises in the doom argument.
Q Home10

I think I have already answered that: I don't think anyone is going to deliberately build something they can't control at all. So the probability of mass extinction depends on creating an uncontrollable superintelligence accidentally -- for instance, by rapid recursive self improvement. And RRSI, AKA Foom Doom, is a conjunction of claims, all of which are p<1, so it is not high probability.

I agree that probability mostly depends on accidental AGI. I don't agree that probability mostly depends on (very) hard takeoff. I believe probability mostly depen... (read more)

0TAG
That doesn't imply a high probability of mass extinction. Why? I'm saying p(doom) is not high. I didn't mention P(otherstuff). You seem to be motte-and-baileying.
Q Home10

I want to discuss this topic with you iff you're ready to proactively describe the cruxes of your own beliefs. I believe in likely doom and I don't think the burden of proof is on "doomers".

Maybe there just isn't a good argument for Certain Doom (or at least high probability near-extinction). I haven't seen one

What do you expect to happen when you're building uninterpretable technology without safety guarantees, smarter than all of humanity? Looks like the most dangerous technology with the worst safety and the worst potential to control it.

To me, thos... (read more)

2TAG
I think I have already answered that: I don't think anyone is going to deliberately build something they can't control at all. So the probability of mass extinction depends on creating an uncontrollable superintelligence accidentally -- for instance, by rapid recursive self improvement. And RRSI, AKA Foom Doom, is a conjunction of claims, all of which are p<1, so it is not high probability. That of course leaves other probabilities open, e.g. weaponised (S)AI... but I wasn't talking about them.
Q Home10

You are correct that critical thinkers may want to censor uncritical thinkers. However, independent-minded thinkers do not want to censor conventional-minded thinkers.

I still don't see it. Don't see a causal mechanism that would cause it. Even if we replace "independent-minded" with "independent-minded and valuing independent-mindedness for everyone". I have the same problems with it as Ninety-Three and Raphael Harth.

To give my own example. Algorithms in social media could be a little too good at radicalizing and connecting people with crazy opinions, s... (read more)

Q Home31

We only censor other people more-independent-minded than ourselves. (...) Independent-minded people do not censor conventional-minded people.

I'm not sure that's true. Not sure I can interpret the "independent/dependent" distinction.

  • In "weirdos/normies" case, a weirdo can want to censor ideas of normies. For example, some weirdos in my country want to censor LGBTQ+ stuff. They already do.
  • In "critical thinkers/uncritical thinkers" case, people with more critical thinking may want to censor uncritical thinkers. (I believe so.) For example, LW in particu
... (read more)
6lsusr
I appreciate your earnest attempt to understand what I'm writing. I don't think "weirdos/normies" nor "Critical thinkers/uncritical thinkers" quite point at what I'm trying to point at with "independent/conventional". "Independent/dependent" is about whether what other people think influences you to reach the same conclusions as other people. "Weirdos/normies" is about whether you reach the same conclusions as other people. In other words, "weirdos/normies" is correlation. "Independent/dependent" is causation in a specific direction. Independent tends to correlate with weirdo, and dependent tends to correlate with normie, but it's possible to have either one without the other. You are correct that critical thinkers may want to censor uncritical thinkers. However, independent-minded thinkers do not want to censor conventional-minded thinkers. I appreciate your compliment too.
Q Home10

I tried to describe necessary conditions which are needed for society and culture to exist. Do you agree that what I've described are necessary conditions?

I realize I'm pretty unusual in this regard, which may be biasing my views. However, I think I am possibly evidence against the notion that a desire to leave a mark on the culture is fundamental to human identity

Relevant part of my argument was "if your personality gets limitlessly copied and modified, your personality doesn't exist (in the cultural sense)". You're talking about something different, y... (read more)

Q Home10

I think we can just judge by the consequences (here "consequences" don't have to refer to utility calculus). If some way of "injecting" art into culture is too disruptive, we can decide to not allow it. Doesn't matter who or how makes the injection.

Q Home10

To exist — not only for itself, but for others — a consciousness needs a way to leave an imprint on the world. An imprint which could be recognized as conscious. Similar thing with personality. For any kind of personality to exist, that personality should be able to leave an imprint on the world. An imprint which could be recognized as belonging to an individual.

Uncontrollable content generation can, in principle, undermine the possibility of consciousness to be "visible" and undermine the possibility of any kind of personality/individuality. And without t... (read more)

1artifex0
I wouldn't take the principle to an absolute- there are exceptions, like the need to be heard by friends and family and by those with power over you. Outside of a few specific contexts, however, I think people ought to have the freedom to listen to or ignore anyone they like. A right to be heard by all of society for the sake of leaving a personal imprint on culture infringes on that freedom.

Speaking only for myself, I'm not actually that invested in leaving an individual mark on society- when I put effort into something I value, whether people recognize that I've done so is not often something I worry about, and the way people perceive me doesn't usually have much to do with how I define myself. Most of the art I've created in my life I've never actually shared with anyone- not out of shame, but just because I've never gotten around to it. I realize I'm pretty unusual in this regard, which may be biasing my views. However, I think I am possibly evidence against the notion that a desire to leave a mark on the culture is fundamental to human identity
Q Home32

Thank you for the answer, clarifies your opinion a lot!

Artistic expression, of course, is something very different. I'm definitely going to keep making art in my spare time for the rest of my life, for the sake of fun and because there are ideas I really want to get out. That's not threatened at all by AI.

I think there are some threats, at least hypothetical. For example, the "spam attack". People see that a painter starts to explore some very niche topic — and thousands of people start to generate thousands of paintings about the same very niche topic... (read more)

1artifex0
Certainly communication needs to be restricted when it's being used to cause certain kinds of harm, like with fraud, harassment, proliferation of dangerous technology and so on. However, no: I don't see ownership of information or ways of expressing information as a natural right that should exist in the absence of economic necessity.

Copying an actor's likeness without their consent can cause a lot of harm when it's used to sexually objectify them or to mislead the public. The legal rights actors have to their likeness also make sense in a world where IP is needed to promote the creation of art. Even in a post-scarcity future, it could be argued that realistically copying an actor's likeness risks confusing the public when those copies are shared without context, and is therefore harmful- though I'm less sure about that one. There are cases where imitating an actor without their consent, even very realistically, can be clearly harmless, however. For example, obvious parody and accurate reconstructions of damaged media. I don't think those violate any fundamental moral right of actors to prevent imitations. In the absence of real harm, I think the right of the public to communicate what they want to communicate should outweigh the desire of an actor to control how they're portrayed.

In your example of a "spam attack", it seems to me one of two things would have to be true: It could be that people lose interest in the original artist's work because the imitations have already explored the limits of the idea in a way they find valuable- in which case, I think this is basically equivalent to when an idea goes viral in the culture; the original artist deserves respect for having invented the idea, but shouldn't have a right to prevent the culture from exploring it, even if that exploration is very fast. Alternatively, it could be the case that the artist has more to say that isn't or can't be expressed by the imitations- other ideas, interesting self expression, and so on-
Q Home10

Maybe I've misunderstood your reply, but I wanted to say that hypothetically even humans can produce art in non-cooperative and disruptive ways, without breaking existing laws.

Imagine a silly hypothetical: one of the best human artists gets a time machine and starts offering their art for free. That artist functions like an image generator. Is such an artist doing something morally questionable? I would say yes.

2dr_s
If they significantly undercut the competition by using some trick I would agree they are, though it's a grey area mostly (what if instead of a time machine they just have a bunch of inherited money that allows them to work without worrying about making a living? Can't people release their work for free?).
Q Home31

Could you explain your attitudes towards art and art culture more in depth and explain how exactly your opinions on AI art follow from those attitudes? For example, how much do you enjoy making art and how conditional is that enjoyment? How much do you care about self-expression, in what way? I'm asking because this analogy jumped out at me as a little suspicious:

And as terrible as this could be for my career, spending my life working in a job that could be automated but isn't would be as soul-crushing as being paid to dig holes and fill them in again. I

... (read more)
2artifex0
In that paragraph, I'm only talking about the art I produce commercially- graphic design, web design, occasionally animations or illustrations. That kind of art isn't about self-expression- it's about communicating the client's vision. Which is, admittedly, often a euphemism for "helping businesses win status signaling competitions", but not always or entirely. Creating beautiful things and improving users' experience is positive-sum, and something I take pride in.

Pretty soon, however, clients will be able to have the same sort of interactions with an AI that they have with me, and get better results. That means more of the positive-sum aspects of the work, with much less expenditure of resources- a very clear positive for society. If that's prevented to preserve jobs like mine, then the jobs become a drain on society- no longer genuinely productive, and not something I could in good faith take pride in.

Artistic expression, of course, is something very different. I'm definitely going to keep making art in my spare time for the rest of my life, for the sake of fun and because there are ideas I really want to get out. That's not threatened at all by AI. In fact, I've really enjoyed mixing AI with traditional digital illustration recently. While I may go back to purely hand-drawn art for the challenge, AI in that context isn't harming self-expression; it's supporting it.

While it's true that AI may threaten certain jobs that involve artistic self-expression (and probably my Patreon), I don't think that's actually going to result in less self-expression. As AI tools break down the technical barriers between imagination and final art piece, I think we're going to see a lot more people expressing themselves through visual mediums. Also, once AGI reaches and passes a human level, I'd be surprised if it wasn't capable of some pretty profound and moving artistic self-expression in its own right. If it turns out that people are often more interested what minds like tha
2dr_s
Ominous supervillain voice: "For now."
Q Home10

I like the angle you've explored. Humans are allowed to care about humans — and propagate that caring beyond its most direct implications. We're allowed to care not only about humans' survival, but also about human art and human communication and so on.

But I think another angle is also relevant: there are just cooperative and non-cooperative ways to create art (or any other output). If AI creates art in non-cooperative ways, it doesn't matter how the algorithm works or if it's sentient or not.

2dr_s
It's a fair angle in principle; if for example two artists agreed to create N works and train AI on the whole set in order to produce "hybrid" art that mixes their styles, that would be entirely legitimate algorithmic art and I doubt anyone will take issue with it! The problem now is also specifically that N needs to be inordinately large. A model that can create art with few shot learning would make questions of copyright much easier to solve. It's the fact that in practice the only realistic way right now is to have millions of dollars in compute and use a tagged training set bigger than just public domain material which puts AI and artists inevitably on a collision course.
Q Home21

Thus, it doesn't matter in the least if it stifles human output, because the overwhelming majority of us who don't rely on our artistic talent to make a living will benefit from a post-scarcity situation for good art, as customized and niche as we care to demand.

How do you know that? Art is one of the biggest outlets of human potential; one of the biggest forces behind human culture and human communities; one of the biggest communication channels between people.

One doesn't need to be a professional artist to care about all that.

2dr_s
Well, "to make a living" implies that you're an artist as a profession and earn money from it. But I agree with you that that's far from the only problem. Art is a two-way street and its economic value isn't all there is to it. A world in which creating art feels pointless is one in which IMO we're all significantly more miserable.
Q Home1-1

I think you're going for the most trivial interpretation instead of trying to explore interesting/unique aspects of the setup. (Not implying any blame. And those "interesting" aspects may not actually exist.) I'm not good at math, but not so bad that I don't know the most basic 101 idea of multiplying utilities by probabilities.

I'm trying to construct a situation (X) where the normal logic of probability breaks down, because each possibility is embodied by a real person and all those persons are in a conflict with each other.

Maybe it's impossible to construct ... (read more)

Q Home10

For all intents and purposes it's equivalent to say "you have only one shot" and after memory erasure it's not you anymore, but a person equivalent to another version of you in the next room.

Let's assume "it's not you anymore" is false. At least for a moment (even if it goes against LDT or something else).

Yes, you have a 0.1 chance of being punished. But who cares if they will erase your memory anyway.

Let's assume that the persons do care.

3lenivchick
Okay, let's imagine that you're doing that experiment 9999999 times, and then you get back all your memories. You still better drink. Probabilities don't change.

Yes, if you are consistent with your choice (which you should be) - you have a 0.1 probability of being punished again and again and again. Also you have a 0.9 probability of being rewarded again and again and again.

Of course that seems counterintuitive, because in real life a perspective of "infinite punishment" (or nearly infinite punishment) is usually something to be avoided at all costs, even if you don't get the reward. That's because in real life your utility scales highly non-linearly, and even if a single punishment and a single reward have equal utility measure - 9999999 punishments in a row is a larger utility loss than the utility gain from 9999999 rewards. Also in real life you don't lose your memory every 5 seconds and have a chance to learn from your mistakes.

But if we are talking about spherical decision theory in a vacuum - you should drink.
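A minimal sketch of the comparison described in this reply. The 90/10 split comes from the original shortform below; the specific utility functions (+1/-1 per round, and a convex penalty for repeated punishment) are assumptions added for illustration:

```python
# Sketch of the expected-utility comparison. Setup: your room's drink is fixed
# (reward with p = 0.9), and the experiment is iterated N times with memory
# erasure, so a consistent drinker gets N rewards or N punishments.
N = 9_999_999        # number of iterations
P_REWARD = 0.9       # 90 reward drinks out of 100 rooms

def ev_of_drinking(u_rewards, u_punishments):
    """Expected utility of always drinking: N rewards with p=0.9, else N punishments."""
    return P_REWARD * u_rewards(N) + (1 - P_REWARD) * u_punishments(N)

# Linear utility: +1 per reward, -1 per punishment. Drinking is positive EV.
print(ev_of_drinking(lambda n: n, lambda n: -n) > 0)           # True (+0.8 per round)

# Non-linear utility: a long punishment streak compounds (here: n ** 1.5),
# so for large N the same choice flips to negative expected utility.
print(ev_of_drinking(lambda n: n, lambda n: -(n ** 1.5)) > 0)  # False
```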
Q Home33

To me, the initial poll options make no sense without each other. For example, "avoid danger" and "communicate beliefs" don't make sense without each other [in context of society].

If people can't communicate (report epistemic state), "avoid danger" may not help or be based on 100% biased opinions on what's dangerous.

  • If some people solve Alignment, but don't communicate, humanity may perish due to not building a safe AGI.
  • If nobody solves Alignment, but nobody communicates about Alignment, humanity may perish because careless actors build an unsafe AGI witho
... (read more)
Q Home10

Maybe you should edit the post to add something like this:

My proposal is not about the hardest parts of the Alignment problem. My proposal is not trying to solve theoretical problems with Inner Alignment or Outer Alignment (Goodhart, loopholes). I'm just assuming those problems won't be relevant enough. Or humanity simply won't create anything AGI-like (see CAIS).

Instead of discussing the usual problems in Alignment theory, I merely argue X. X is not a universally accepted claim, here's evidence that it's not universally accepted: [write the evidence

... (read more)
Q Home10

Maybe there's a misunderstanding. Premise (1) makes sure that your proposal is different from any other proposal. It's impossible to reject premise (1) without losing the proposal's meaning.

Premise (1) is possible to reject only if you're not solving Alignment but solving some other problem.

I'm arguing for open, external, effective legal systems as the key to AI alignment and safety. I see the implementation/instilling details as secondary. My usage refers to specifying rules/laws/ethics externally so they are available and usable by all intelligent syst

... (read more)
-1JWJohnston
I'm talking about the need for all AIs (and humans) to be bound by legal systems that include key consensus laws/ethics/values. It may seem obvious, but I think this position is under-appreciated and not universally accepted. By focusing on the external legal system, many key problems associated with alignment (as recited in the Summary of Argument) are addressed. One worth highlighting is 4.4, which suggests AISVL can assure alignment in perpetuity despite changes in values, environmental conditions, and technologies, i.e., a practical implementation of Yudkowsky's CEV.
Q Home10

Perhaps the most important and (hopefully) actionable recommendation of the proposal is in the conclusion:

"For the future safety and wellbeing of all sentient systems, work should occur in earnest to improve legal processes and laws so they are more robust, fair, nimble, efficient, consistent, understandable, accepted, and complied with." (comment)

Sorry for sounding harsh. But to say something meaningful, I believe you have to argue two things:

  • Laws are distinct enough from human values (1), but following laws / caring about laws / reporting about pred
... (read more)
1JWJohnston
My argument goes in a different direction. I reject premise (1) and claim there is an "essential equivalence and intimate link between consensus ethics and democratic law [that] provide a philosophical and practical basis for legal systems that marry values and norms (“virtue cores”) with rules that address real world situations (“consequentialist shells”)." In the body of the paper I characterize democratic law and consensus  ethics as follows: That is, democratic law corresponds to the common definition of Law. Consensus ethics is essentially equivalent to human values when understood in the standard philosophical sense as "shared values culturally determined through rational consideration and negotiation." In short, I'm of the opinion "Law = Ethics." Regarding your premise (2): See my reply to Abram's comment. I'm mostly ducking the "instilling" aspects. I'm arguing for open, external, effective legal systems as the key to AI alignment and safety. I see the implementation/instilling details as secondary. My reference to Bostrom's direct specification was not intended to match his use, i.e., hard coding (instilling) human values in AIs. My usage refers to specifying rules/laws/ethics externally so they are available and usable by all intelligent systems. Of the various alignment approaches Bostrom mentioned (and deprecated), I thought direct specification came closest to AISVL.
Q Home10

I like how you explain your opinion, very clear and short, basically contained in a single bit of information: "you're not a random sample" or "this equivalence between 2 classes of problems can be wrong".

But I think you should focus on describing the opinion of others (in simple/new ways) too. Otherwise you're just repeating yourself over and over.

If you're interested, I could try helping to write a simplified guide to ideas about anthropics.

Q Home10

Additionally, this view ignores art consumers, who out-number artists by several orders of magnitude. It seems unfair to orient so much of the discussion of AI art's effects on the smaller group of people who currently create art.

What is the greater framework behind this argument? "Creating art" is one of the most general potentials a human being can realize. With your argument we could justify chopping off every human potential because "there's a greater amount of people who don't care about realizing it".

I think deleting a key human potential (and a shared cultural context) affects the entire society.

Q Home20

A stupid question about anthropics and [logical] decision theories. Could we "disprove" some types of anthropic reasoning based on [logical] consistency? I struggle with math, so please keep the replies relatively simple.

  • Imagine 100 versions of me, I'm one of them. We're all egoists, each one of us doesn't care about the others.
  • We're in isolated rooms, each room has a drink. 90 drinks are rewards, 10 drinks are punishments. Everyone is given the choice to drink or not to drink.
  • The setup is iterated (with memory erasure), everyone gets the same type of d
... (read more)
1lenivchick
I guess you've made it more confusing than it needs to be by introducing memory erasure to this setup. For all intents and purposes it's equivalent to say "you have only one shot" and after memory erasure it's not you anymore, but a person equivalent to another version of you in the next room. So what we've got is many different people in different spacetime boxes, with only one shot, and yes, you should drink. Yes, you have a 0.1 chance of being punished. But who cares if they will erase your memory anyway. Actually we are kinda living in that experiment - we all gonna die eventually, so why bother doing stuff if you won't care after you die. But I guess we just got used to suppressing that thought, otherwise nothing would get done. So drink.
Q Home10

Let's look at actual outcomes here. If every human says yes, 95% of them get to the afterlife. If every human says no, 5% of them get to the afterlife. So it seems better to say yes in this case, unless you have access to more information about the world than is specified in this problem. But if you accept that it's better to say yes here, then you've basically accepted the doomsday argument.

There's a chance you're changing the nature of the situation by introducing Omega. Often "beliefs" and "betting strategy" go together, but here it may not be the case.... (read more)
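As a minimal restatement of the arithmetic in the quoted argument (the 95%/5% figures come from the quote; the population size is an arbitrary assumption, and none of this addresses the separate question of what to believe):

```python
# Restates the quoted bet: "yes" dominates as a betting strategy.
# Whether that settles what each individual should believe is the point
# under dispute above.

def fraction_reaching_afterlife(everyone_says: str) -> float:
    return 0.95 if everyone_says == "yes" else 0.05

population = 1_000_000  # arbitrary illustrative population
for answer in ("yes", "no"):
    saved = fraction_reaching_afterlife(answer) * population
    print(f"everyone says {answer!r}: {saved:,.0f} of {population:,} reach the afterlife")
```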
