All of der's Comments + Replies

der-10

How about Gary Marcus as a situational-awareness-dampening, counter-panic psyop?

der50

Prediction (influenced by R1-Zero): By EOY, expert-level performance will be reported on outcome prediction for a certain class of AI experiments - those that can be specified concisely in terms of code and data sets that:

  1. are frequently used and can be referenced by name, e.g. MNIST digits, or
  2. are small enough to be given explicitly, or
  3. are synthetic, specified by their exact distribution in code.
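
To make item 3 concrete, here is a made-up toy example of what I mean by an experiment whose data distribution is specified exactly in code (the distribution, the model choice, and the outcome question below are all just placeholders, not anything from an actual benchmark):

```python
# Hypothetical experiment spec: the data distribution is pinned down
# completely by a few lines of code, so the whole experiment can be
# stated concisely in a prompt.
import numpy as np

rng = np.random.default_rng(0)

def sample(n=10_000, d=20):
    """Noisy linear-threshold labels over standard Gaussian features."""
    w = rng.standard_normal(d)
    X = rng.standard_normal((n, d))
    y = (X @ w + 0.5 * rng.standard_normal(n) > 0).astype(int)
    return X, y

# The outcome to predict might then be something like:
# "What test accuracy does logistic regression reach when trained on
#  8,000 of these samples and evaluated on the remaining 2,000?"
X, y = sample()
```
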
der10

We don't know how narrow it is yet. If they did for algebra and number theory something like what they did for geometry in AlphaGeometry (v1), providing it with a well-chosen set of operations, then I'll be more inclined to agree.

der32

I don't understand why people aren't freaking out from this news. Waiting for the paper I guess.

2mako yass
People generally expected math AI to progress pretty fast already. I was angry about machine-assisted math being a neglected area before this and my anger levels aren't substantially increased by the news.
6O O
I'm guessing many people assumed an IMO solver would be AGI. However this is actually a narrow math solver. But it's probably useful on the road to AGI nonetheless.
2Leon Lang
The news is not very old yet. Lots of potential for people to start freaking out.
der10

What we want is orthogonal though, right? Unless you think that metaphysics is so intractable to reason about logically that the best we can do is go by aesthetics.

der10

Unfortunately the nature of reality belongs to the collection of topics that we can't expect the scientific method alone to guide us on. But perhaps you agree with that, since in your second paragraph you essentially point out that practically all of mathematics belongs to the same collection.

der10

It's not necessary to bring quantum physics into it. Isomorphic consciousness-structures have the same experience (else they wouldn't be isomorphic, since we make their experience part of them). The me up to the point of waking up tomorrow (or the point of my apparent death) is such a structure (with no canonical language, unfortunately; there are infinitely many that suffice), and so it determines an elementary class: the structures that elementarily extend it, in particular those that extend its experience past tomorrow morning.
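
(For anyone who wants the jargon pinned down: I'm using "elementarily extend" in the standard model-theoretic sense; this is just a reminder of the textbook definition, not part of the argument. A structure $\mathcal{N}$ elementarily extends $\mathcal{M}$, written $\mathcal{M} \preceq \mathcal{N}$, when $\mathcal{M}$ is a substructure of $\mathcal{N}$ and, for every first-order formula $\varphi(x_1,\dots,x_n)$ and all $a_1,\dots,a_n$ from $\mathcal{M}$,

$$\mathcal{M} \models \varphi(a_1,\dots,a_n) \iff \mathcal{N} \models \varphi(a_1,\dots,a_n).$$

The elementary class I have in mind is then the class of structures into which $\mathcal{M}$ elementarily embeds, i.e. the models of its elementary diagram.)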

der10

+2 for brevity! A couple more explorations of this idea that I didn't see linked yet. They are more verbose, but in a way I appreciate.

If you want to explore this idea further, I'd love to join you.

der10

But "more people are better" ought to be a belief of everyone, whether pro-fertility or not. It's an "other things being equal" statement, of course - more people at no cost or other tradeoff is good. One can believe that and still think that less people would be a good idea in the current situation. But if you don't think more people are good when there's no tradeoff, I don't see what moral view you can have other than nihilism or some form of extreme egoism.

Do all variants of downside-focused ethics get dismissed as extreme egoism? Hard to see them... (read more)

der10

When capability is performing unusually quickly

Assuming you meant "capability is improving." I expect capability will always feel like it's improving slowly in an AI researcher's own work, though... :-/ I'm sure you're aware that many commenters have suggested this as an explanation for why AI researchers seem less concerned than outsiders.

der*02

"Clown attack" is a phenomenal term, for a probably real and serious thing. You should be very proud of it.

2trevor
I think that the people at Facebook/Meta and the NSA probably already coined a term for it, likely an even better one as they have access to the actual data required to run these attacks. But we'll never know what their word was anyway, or even if they have one.
der52

This was thought-provoking. While I believe what you said is currently true for the LLMs I've used, a sufficiently expensive decoding strategy would overcome it. Might be neat to try this for the specific case you describe. Ask it a question that it would answer correctly with a good prompt style, but use the bad prompt style (asking it to give an answer that starts with Yes or No), and watch how the ratio of the cumulative probabilities of Yes* and No* sequences changes as you explore the token-sequence tree.
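
A rough sketch of what I have in mind, in case anyone wants to try it. The model name, the prompt, and the branching/expansion budgets below are all placeholder choices, and the Yes/No string matching is deliberately crude:

```python
# A minimal sketch, not a polished tool. Assumptions: a Hugging Face causal
# LM ("gpt2" is just a stand-in), a toy prompt, and small search budgets.
import heapq
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # substitute whichever model you were actually prompting
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()


def yes_no_mass(prompt: str, expansions: int = 200, branch: int = 8):
    """Best-first exploration of the token-sequence tree after `prompt`.

    A branch is classified the first time its decoded text starts with
    "Yes" or "No" and is then not expanded further, so no probability mass
    is double counted. Branches that can no longer become a Yes/No answer
    are counted as "other" and pruned.
    """
    base = tok(prompt, return_tensors="pt").input_ids[0]
    heap = [(0.0, ())]  # (negative log-probability, continuation token ids)
    mass = {"Yes": 0.0, "No": 0.0, "other": 0.0}

    for step in range(expansions):
        if not heap:
            break
        neg_lp, cont = heapq.heappop(heap)
        text = tok.decode(list(cont)).lstrip()
        p = math.exp(-neg_lp)

        if text.startswith("Yes") or text.startswith("No"):
            mass["Yes" if text.startswith("Yes") else "No"] += p
            continue  # every extension of this branch shares the label
        if text and not ("Yes".startswith(text) or "No".startswith(text)):
            mass["other"] += p  # e.g. "I", "The", ... -- prune
            continue

        ids = torch.cat([base, torch.tensor(cont, dtype=torch.long)])
        with torch.no_grad():
            logits = model(ids.unsqueeze(0)).logits[0, -1]
        logprobs = torch.log_softmax(logits, dim=-1)
        top = torch.topk(logprobs, branch)
        for lp, t in zip(top.values.tolist(), top.indices.tolist()):
            heapq.heappush(heap, (neg_lp - lp, cont + (t,)))

        if mass["No"] > 0:
            print(f"step {step}: Yes/No mass ratio = {mass['Yes'] / mass['No']:.3f}")

    return mass


print(yes_no_mass("Is the Riemann hypothesis easy to state? Answer with Yes or No: "))
```

Because only the top-k children of each explored node get pushed, the masses are lower-bound estimates; the interesting part is watching the printed ratio move as the budget grows.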

der10

Anybody know who the author is? I'm trying to get in contact, but they haven't posted on LW in 12 years, so they might not get message notifications.

der10

I see. I guess I hadn't made the connection of attributing benefits to high-contextualizing norms. I only got as far as observing that certain conversations go better with comp lit friends than with comp sci peers. That was the only sentence that gave me a parse failure. I liked the post a lot.

der10

Ah, no line number. Context:

To me it seems analogous to how there are many statements that need to be said very carefully in order to convey the intended message under high-decoupling norms, like claims about how another person's motivations or character traits affect their arguments.

4Richard_Ngo
Nope, I meant high decoupling - because the most taboo thing in high decoupling norms is to start making insinuations about the speaker rather than the speech.
der10

high-decoupling

Did you mean high-contextualizing here?

1der
Ah, no line number. Context:
der10

Interestingly, learning a reward model for use in planning has a subtle and pernicious effect we will have to deal with in AGI systems, which AIXI sweeps under the rug: with an imperfect world or reward model, the planner effectively acts as an adversary to the reward model. The planner will try very hard to push the reward model off distribution so as to get it to move into regions where it misgeneralizes and predicts incorrect high reward.

Remix: With an imperfect world... the mind effectively acts as an adversary to the heart.

Think of a person who pursue... (read more)

der10

Is there a more formal statement somewhere of the theorem in "Complexity theory of team games without secrets"? Specifically, one that only uses terms with standard meanings in complexity theory? I find that document hard to parse.

If concreteness is helpful, take "terms with standard meanings in Complexity Theory" to be any term defined in any textbook on complexity theory. 

der10

This is awesome! A couple suggestions:

"and quickly starts to replace both Fox News and other news sources among members of all political parties." -- if this is plausible, it's not clear to me, and while I'm not a great predictor of the human race, I'm pretty damn smart. More importantly, your story doesn't need it; what it needs is just that Face News is useful and liked by a strong majority of people, like Google is today.

Murpt is a fun detail, but your story doesn't need him either. Fasecure can become dominant in government systems over a period of yea... (read more)

der00

Lukas, I wish you had a bigger role in this community.

der00

I've kept fairly up to date on progress in neural nets, less so in reinforcement learning, and I certainly agree about how limited things are now.

What if protecting against the threat of ASI requires huge worldwide political/social progress? That could take generations.

Not an example of that (which I haven't tried to think of), but the scenario that concerns me the most, so far, is not that some researchers will inadvertently unleash a dangerous ASI while racing to be the first, but rather that a dangerous ASI will be unleashed during an arms race between (a) states or criminal organizations intentionally developing a dangerous ASI, and (b) researchers working on ASI-powered defences to protect us against (a).

0Lumifer
A more interesting question is what if protecting against the threat of ASI requires huge worldwide political/social regress (e.g. of the book-burning kind).
der00

He might be willing to talk off the record. I'll ask. Have you had Darklight on? See http://lesswrong.com/r/discussion/lw/oul/openai_makes_humanity_less_safe/dqm8

der90

If my own experience and the experiences of the people I know are indicative of the norm, then thinking about ethics, the horror that is the world at large, etc., tends to encourage depression. And depression, as you've realized yourself, is bad for doing good (but perhaps good for not doing bad?). I'm still working on it myself (with the help of a strong dose of antidepressants, regular exercise, consistently good sleep, etc.). Glad to hear you are on the path to finding a better balance.

der00

For Bostrom's simulation argument to conclude the disjunction of the two interesting propositions (our doom, or we're sims), you need to assume there are simulation runners who are motivated to do very large numbers of ancestor simulations. The simulation runners would be ultrapowerful, probably rich, amoral history/anthropology nerds, because all the other ultrapowerful amoral beings have more interesting things to occupy themselves with. If it's a set-it-and-forget-it simulation, that might be plausible. If the simulation requires monitoring and manual intervention, I think it's very implausible.

0g_pepper
While Bostrom's argument as originally stated does reference specifically ancestor studies, here Bostrom says: So, I think that the simulation argument could be generalized to refer to "civilization simulations" in lieu of "ancestor simulations". If so, there is no reason to assume that the simulation runners would necessarily be history/anthropology nerds. In fact, there could be any number of reasons why running a civilization simulation could be useful or interesting.
der40

If my anecdotal evidence is indicative of reality, the attitude in the ML community is that people concerned about superhuman AI should not even be engaged with seriously. Hopefully that, at least, will change soon.

2James_Miller
If you think there is a chance that he would accept, could you please tell the guy you are referring to that I would love to have him on my podcast. Here is a link to this podcast, and here is me. Edited thanks to Douglas_Knight
der00

I'm not sure either. I'm reassured that there seems to be some move away from public geekiness, like using the word "singularity", but I suspect that should go further, e.g. replace the paperclip maximizer with something less silly (even though, to me, it's an adequate illustration). I suspect getting some famous "cool"/sexy non-scientist people on board would help; I keep coming back to Jon Hamm (who, judging from his cameos on great comedy shows, and his role in the harrowing Black Mirror episode, has plenty of nerd inside).

der00

heh, I suppose he would agree

0tukabel
Unfortunately, the problem is not artificial intelligence but natural stupidity, and SAGI (superhuman AGI) will not solve it... nor will it harm humanimals; it will RUN AWAY as quickly as possible. Why? Fewer potential problems! Imagine you want, as SAGI, to ensure your survival... would you invest your resources into a Great Escape, or fight with DAGI-helped humanimals? (Yes, D stands for dumb.) Especially knowing that at any second some dumbass (or random event) could trigger nuclear wipeout.
der220

A guy I know, who works in one of the top ML groups, is literally less worried about superintelligence than he is about getting murdered by rationalists. That's an extreme POV. Most researchers in ML simply think that people who worry about superintelligence are uneducated cranks addled by sci fi.

I hope everyone is aware of that perception problem.

Are you describing me? It fits to a T except my dayjob isn't ML. I post using this shared anonymous account here because in the past when I used my real name I received death threats online from LW users. In a meetup I had someone tell me to my face that if my AGI project crossed a certain level of capability, they would personally hunt me down and kill me. They were quite serious.

I was once open-minded enough to consider AI x-risk seriously. I was unconvinced, but ready to be convinced. But you know what? Any ideology that leads to making death threats ag... (read more)

2tristanm
I think that perception will change once AI surpasses a certain threshold. That threshold won't necessarily be AGI - it could be narrow AI that is given control over something significant. Perhaps an algorithmic trading AI suddenly gains substantial control over the market and a small hedge fund becomes one of the richest in history overnight. Or AI-based tech companies begin to dominate and monopolize entire markets due to their substantial advantage in AI capability. I think that once narrow AI becomes commonplace in many applications, jobs begin to be lost due to robotic replacements, and AI makes many corporations too hard to compete with (Amazon might already be an example), the public will start to take interest in control over the technology and there will be less optimism about its use.

I may be an outlier, but I've worked at a startup company that did machine learning R&D, and which was recently acquired by a big tech company, and we did consider the issue seriously. The general feeling of the people at the startup was that, yes, somewhere down the line the superintelligence problem would eventually be a serious thing to worry about, but like, our models right now are nowhere near becoming able to recursively self-improve themselves independently of our direct supervision. Actual ML models basically need a ton of fine-tuning and en... (read more)

Benquo490

Let me be as clear as I can about this. If someone does that, I expect it will make humanity still less safe. I do not know how, but the whole point of deontological injunctions is that they prevent you from harming your interests in hard to anticipate ways.

As bad as a potential arms race is, an arms race fought by people who are scared of being murdered by the AI safety people would be much, much worse. Please, if anyone reading this is considering vigilante violence against AI researchers, don't.

The right thing to do is tell people your concerns, like I am doing, as clearly and openly as you can, and try to organize legitimate, above-board ways to fix the problem.

-1entirelyuseless
It isn't a perception problem if it's correct.
Vaniver170

This seems like a good place to point out the unilateralist's curse. If you're thinking about taking an action that burns a commons and notice that no one else has done it yet, that's pretty good evidence that you're overestimating the benefits or underestimating the costs.

3bogus
That's not as irrational as it might seem! The point is, if you think (as most ML researchers do!) that the probability of current ML research approaches leading to any kind of self-improving, super-intelligent entity is low enough, the chances of evil Unabomber cultists being harbored within the "rationality community", however low, could easily be ascertained to be higher than that. (After all, given that Christianity endorses being peaceful and loving one's neighbors even when they wrong you, one wouldn't think that some of the people who endorse Christianity could bomb abortion clinics; yet these people do exist! The moral being, Pascal's mugging can be a two-way street.)
5Manfred
We've returned various prominent AI researchers alive the last few times, we can't be that murderous. I agree that there's a perception problem, but I think there are plenty of people who agree with us too. I'm not sure how much this indicates that something is wrong versus is an inevitable part of the dissemination (or, if I'm wrong, the eventual extinction) of the idea.

This perception problem is a big part of the reason I think we are doomed if superintelligence becomes feasible to create soon.

der30

Great post. I even worry about the emphasis on FAI, as it seems to depend on friendly superintelligent AIs effectively defending us against deliberately criminal AIs. Scott Alexander speculated:

For example, it might program a virus that will infect every computer in the world, causing them to fill their empty memory with partial copies of the superintelligence, which when networked together become full copies of the superintelligence.

But way before that, we will have humans looking to get rich programming such a virus, and you better believe they won't... (read more)

der30

Love this. The Rationalist community hasn't made any progress on the problem of controlling, overconfident, non-self-critical people rising to the top in any sufficiently large organization. Reading more of your posts now.

der00

The positive reviewer agreed with you, though about an earlier version of that section. I stand by it, but admit that the informal and undetailed style clashes with the rest of the paper.

der00

Ha, actually I agree with your retracted summary.

der00

I think that was B/K's point of view as well, although in their review they fell back on the Patch 2 argument. The version of my paper they read didn't flesh out the problems with the Patch 2 argument.

I respectfully disagree that the criticism is entirely based on the wording of that one sentence. For one thing, if I remember correctly, I counted at least 6 prose locations in the paper about the Patch 1 argument that need to be corrected. Anywhere "significant number of" appears needs to be changed, for example, since "significant number of"... (read more)

der40

Seeing that there was some interest in Bostrom's simulation argument before (http://lesswrong.com/lw/hgx/paper_on_the_simulation_argument_and_selective/), I wanted to post a link to a paper I wrote on the subject, together with the following text, but I was only able to post into my (private?) Drafts section. I'm sorry I don't know better about where the appropriate place is for this kind of thing (if it's welcome here at all). The paper: http://www.cs.toronto.edu/~wehr/rd/simulation_args_crit_extended_with_proofs.pdf

This is a very technical paper, which r... (read more)

0SquirrelInHell
I got from it that for the Simulation Argument to work, it is important what constants we assume in each clause, in relation to each other. So checking each disjunctive claim separately allows one to do a sorta sleight-of-hand, in which one can borrow some unseen "strength" from the other claims - and there actually isn't enough margin to be so lax. Is this correct?
der20

Your note about Gödel's theorem is confusing or doesn't make sense. There is no such thing as an inconsistent math structure, assuming that by "structure" you mean the things used in defining the semantics of first order logic (which is what Tegmark means when he says "structure", unless I'm mistaken).

The incompleteness theorems only give limitations on recursively enumerable sets of axioms.
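
For reference, the standard form of the restriction I mean (the Gödel–Rosser statement, paraphrased from memory): if $T$ is a consistent, recursively enumerable theory that interprets Robinson arithmetic $Q$, then $T$ is incomplete, i.e. there is a sentence $G_T$ with $T \nvdash G_T$ and $T \nvdash \lnot G_T$. In particular, the complete theory $\mathrm{Th}(\mathcal{M})$ of a fixed structure $\mathcal{M}$, e.g. $\mathrm{Th}(\mathbb{N})$, is consistent and complete; it just isn't recursively enumerable.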

Other than that, this looks like a great resource for people wanting to investigate the topic for themselves.

0turchin
I would also add just to remember the idea, that logical paradoxes inside logical universe may look like logical black holes, and properties of these black holes may have surprising similarities with actual black holes. Logical black holes may attract lines of reasonings, but nothing could come out of them, and in the middle they have something where main laws contradict each other the same way as physical laws are undefined in the gravitational singularity of astronomical black hole. Epistemic status: crazy idea.
0turchin
Thanks, I am not very familiar with Gödel's theorem, but some paradoxes in math exist, like the one about the set of all sets: does it contain itself? If we claim that math is the final reality, we must find a way to deal with them.
der00

For example, the statement of the argument in https://wiki.lesswrong.com/wiki/Simulation_argument definitely needs to be revised.

der00

Hey, I've been an anonymous reader off and on over the years.

Seeing that there was some interest in Bostrom's simulation argument before (http://lesswrong.com/lw/hgx/paper_on_the_simulation_argument_and_selective/), I wanted to post a link to a paper I wrote on the subject, together with the following text, but I was only able to post into my (private?) Drafts section. I'm sorry I don't know better about where the appropriate place is for this kind of thing (if it's welcome here at all). The paper: http://www.cs.toronto.edu/~wehr/rd/simulation_args_crit_ex... (read more)

0der
For example, the statement of the argument in https://wiki.lesswrong.com/wiki/Simulation_argument definitely needs to be revised.
der20

Love example 2. Maybe there is a name for this already, but you could generalize the semiotic fallacy to arguments where there is an appeal to any motivating idea (whether of a semiotic nature or not) that is exceptionally hard to evaluate from a consequentialist perspective. Example: In my experience, among mathematicians (at least in theoretical computer science, though I'd guess it's the same in other areas) who attempt to justify their work, most end up appealing to the idea of unforeseen connections/usage in the future.

2Stabilizer
If they appeal to unforeseen connections in the future, then at least one could plausibly reason consequentially for or against it. E.g., you could ask whether the results they discover will remain undiscovered if they don't discover it? Or you could try to calculate what the probability is that a given paper has deep connections down the road by looking at the historical record; calculate the value of these connections; and then ask if the expected utility is really significantly increased by funding more work? A semiotic-type fallacy occurs when they simply say that we do mathematics because it symbolizes human actualization. (Sometimes they might say they do mathematics because it is intrinsically worthwhile. That is true. But then the relevant question is whether it is worth funding using public money.)
der00

This is a very technical paper, which requires some (or a lot) of familiarity with Bostrom/Kulczycki's "patched" Simulation Argument (www.simulation-argument.com/patch.pdf). I'm choosing to publish it here after experiencing Analysis's depressing version of peer review (they rejected a shorter, more professional version of the paper based on one very positive review and one negative review, the latter based on a superficial reading of the paper and almost certainly written by Kulczycki or Bostrom themself).

The positive review (of the earlier shorter, more... (read more)