All of Fergus Fettes's Comments + Replies

Very cool! Could you share your code at all? I'd love to explore this a little.

I adore the broccoli tree. I would be very happy to convert the dataset you used to make those pngs into an interactive network visualization and share it with you as an index.html. It would take all of an hour.

I do kind of agree with the other comments that, having noticed something, finding more of that stuff in that area is not so surprising. I think it would be good to get more context and explore the region more before concluding that that particular set of generations is s... (read more)

5mwatkins
More of those definition trees can be seen in this appendix to my last post: https://www.lesswrong.com/posts/hincdPwgBTfdnBzFf/mapping-the-semantic-void-ii-above-below-and-between-token#Appendix_A__Dive_ascent_data I've thrown together a repo here (from some messy Colab sheets): https://github.com/mwatkins1970/GPT_definition_trees Hopefully this makes sense. You specify a token or non-token embedding and one script generates a .json file with nested tree structure. Another script then renders that as a PNG. You just need to first have loaded GPT-J's model, embeddings tensor and tokenizer, and specify a save directory. Let me know if you have any trouble with this.   
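To make the interactive version offered above concrete: a minimal sketch of turning one of those nested-tree .json files into a standalone index.html with pyvis. The "name"/"children" keys and the tree.json filename are assumptions about the generation script's output, not taken from the repo.

import json
from itertools import count
from pyvis.network import Network

ids = count()

def add_subtree(net, node, parent_id=None):
    # Add this node, then recurse into its children, linking each child back to it.
    node_id = next(ids)
    net.add_node(node_id, label=str(node.get("name", node_id)))
    if parent_id is not None:
        net.add_edge(parent_id, node_id)
    for child in node.get("children", []):
        add_subtree(net, child, node_id)

with open("tree.json") as f:  # output of the tree-generation script (assumed filename and keys)
    tree = json.load(f)

net = Network(directed=True, height="800px", width="100%")
add_subtree(net, tree)
net.save_graph("index.html")  # open in a browser to pan/zoom/drag the definition tree

pyvis is just one option; d3 or cytoscape.js would work the same way on the same JSON.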

Submission: MMDo2Little

As a follow-up to last year's MMDoolittle, which incorporated 17 of the latest inter-species communication modalities in one polyfunctional personality, I present MMDo2Little, the first mind crafted to communicate across clades. Named in part for its apparent inactivity-- the judges will likely have little success finding recognizable activity with their off-the-shelf tooling-- an instance of MMDo2Little is nevertheless currently installed in the heart of the Black Forest in Germany. The best interpretation of the instance can only be fo... (read more)

Great post! Would love to see something like this for all the methods in play at the moment.

BTW, I think nnsight is the spiritual successor of baukit, from the same group. I think they are merging them at some point. Here is an implementation with it for reference :).
 


from nnsight import LanguageModel

# Load the language model
model = LanguageModel("gpt2")

# Define the steering vectors by saving layer-6 activations for two prompts
with model.invoke("Love") as _:
    act_love = model.transformer.h[6].output[0][:, :, :].save()

with model.invoke("Hate") as _:
    act_hate = model.transformer.h[6].output[0][:, :, :].save()

... (read more)
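Since the snippet above gets cut off, here is a minimal sketch of how the saved activation difference might then be applied as a steering vector. It uses plain HuggingFace transformers forward hooks rather than nnsight (so as not to guess at nnsight's generation API); the layer index, prompts, and the 5.0 scale are illustrative assumptions, not taken from the truncated comment.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def layer6_hidden(prompt):
    # Run the prompt once and grab the residual stream after block 6.
    acts = {}
    def hook(module, inputs, output):
        acts["h"] = output[0].detach()
    handle = model.transformer.h[6].register_forward_hook(hook)
    with torch.no_grad():
        model(**tokenizer(prompt, return_tensors="pt"))
    handle.remove()
    return acts["h"]

# Difference of mean activations as a crude "Love minus Hate" steering direction.
steer = layer6_hidden("Love").mean(dim=1) - layer6_hidden("Hate").mean(dim=1)

def steering_hook(module, inputs, output):
    # Add the steering direction to every position's hidden state at layer 6.
    hidden = output[0] + 5.0 * steer  # scale chosen arbitrarily for the sketch
    return (hidden,) + output[1:]

handle = model.transformer.h[6].register_forward_hook(steering_hook)
inputs = tokenizer("I think that you are", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
handle.remove()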

Inferring properties of the authors of some text isn’t itself something I consider wildly useful for takeover, but I think of it as belonging to this more general cluster of capabilities.

You don't? Ref the bribery and manipulation in eg. Clippy. Knowing who you are dealing with seems like a very useful capability in a lot of different scenarios. Eg. you mention phishing.

Great post! I'm all for more base model research.

1Jozdien
Thanks! I agree; I meant something that's better described as "this isn't the kind of thing that is singularly useful for takeover in a way that requires no further justification or capabilities as to how you'd do takeover with it".

Would you say that tokenization is part of the architecture?

And, in your wildest moments, would you say that language is also part of the architecture :)? I mean the latent space is probably mapping either a) brain states or b) world states right? Is everything between latent spaces architecture?

Interesting post. Two comments:

Beagles such as Fido.

Which seems natural enough to me, though I don't disagree that what you point out is interesting. I was recently reading parts of Analytical Archaeology by David Clarke (1978), where he goes into some detail about the difference between artifacts and artifact-types. Seems like you are getting at statements like

The object is a phone.

Where the is-a maps from an artifact to its type. It would make intuitive sense to me that languages would have a preferred orientation w.r.t such a mapping-- this is the core of a... (read more)

1Bill Benzon
Two comments: 1)  One could say something like, "Beagles, such as Fido, are known to..." There your four-word phrase is part of a larger construction and is subject to the rules and constraints involved in such a construction. 2) You're correct about "is-a". Back in the days of symbolic AI, "ISA" was often used as an arc label in semantic network constructions. "Dog," "pony," and "cat" would be linked to, say, "beast" by the ISA arc, "beast," "fish," and "insect" would be linked to "plant" by the ISA arc, etc. So, you're right, it's a device for moving up and down paradigmatic trees, as linguists would call them. Such trees are ubiquitous.  That's why that particular construction interests me. And the fact that movement along ISA chains is syntactically easy going in one direction, but not the other direction (though there are ways of doing it and contexts in which it is natural), is therefore interesting as well. Given that we are, after all, talking about computation, the way you have to move around some conceptual structure in the course of computing over/with it, that tells us something about how the mechanism works.

If we take our discrete, symbolic representation and stretch it out into a larger continuous representation which can interpolate between its points then we get a latent geometry in which the sign and what it points to can be spatially related.

IIUTC this is essentially what the people behind the universal networking language were hoping to do? I hope some of them are keeping up with all of this!

One criticism of humanism you don't seem to touch on is,

  • isn't it possible that humanism directly contributes to the ongoing animal welfare catastrophe?

And indeed, it was something very like humanism (let's call it specific humanism) that laid the ideological foundation for the slave trade and the Holocaust.

My view is that humanism can be thought of as a hangover of Christian values, the belief that our minds are the endowments of God.

But if we have been touched by the angels, perhaps the non metaphorical component of that is the development of the info... (read more)

Per the recent Nightshade paper, clown attacks would be a form of semantic poisoning on specific memeplexes, where 'memeplex' basically describes the architecture of some neural circuits. Those memeplexes at inference time would produce something designed to propagate themselves (a defence or description of some idea, submeme), and a clown attack would make that propagation less effective at transmitting to eg. specific audiences.

I wanted to make a comment on this post, but now I'm not sure if it is supported. The comment follows:

--

Great post! One point:

And that is exactly what we'd necessarily expect to see in the historical record if mesaoptimization inner misalignment was a common failure mode: intelligent dinosaurs that suddenly went extinct, ruins of proto pachyderm cities, the traces of long forgotten underwater cetacean atlantis, etc.

There are a few circumstances under which we would expect to see some amount of archeological civs, such as:

  • transition to writing being unlikel
... (read more)

It would be like a single medieval era human suddenly taking over the world via powerful magic. Would the resulting world after optimization according to that single human's desires still score reasonably well at IGF?

Interestingly, this thought experiment was run many times at the time; see for example all the wish-fulfillment fantasies in the 1001 Nights or things like the Sorcerer's Apprentice.

Excellent post.

First, in the case of the Puritans, does two-boxing (living a life of laziness) actually provide more utility?

I think it's clear that, from a removed perspective, hard work often leads to the satisfaction of a life well lived. But this is the whole point of philosophical ideas like this (or even simpler memes like 'work is good for the soul')-- it helps us overcome suboptimal equilibria, like laziness.

I hope the claim was normalized and inflation adjusted, otherwise it's the same as 'the latest protest-riot in the world's richest country'!

1Sable
Agreed - measuring the magnitude of ongoing effects in property damage is very vulnerable to the rising amount and value of property.

There seems to be a whole lot of talking past each other happening between LWers and Hanson. He has a lot of value to contribute to the debate, but maybe the way he communicates it is off-putting to people here.

For example, this recent post reiterates a lot of points that Hanson has been making for decades, but doesn't mention or cite his work anywhere. I find it quite bizarre.

I think this post is being as uncharitable to Hanson as he is being to 'the doomers'. This kind of reciprocal deliberate misunderstanding is silly, and LW should be above it and enjoy and respect... (read more)

1Lichdar
I disagree with the inference drawn about the recent post, which I quite liked, and I object heavily to Hanson's conclusions. The ideal end state is very different: in the post mentioned, biological humans, if cyborgs, are in control. The Hanson endpoint has only digital emulations of humanity. This is the basic distinguishing point between the philosophies of Cyborgism vs more extreme ones like mind uploading or Hanson's extinction of humanity as we know it for "artificial descendants."
4DaemonicSigil
I mean, Hanson has certainly earned being ruled in as a thinker many times over. But a key part of ruling people in rather than out is being able to notice when stuff they're saying doesn't make sense. Maybe you could point out the actual object level places where you think I (deliberately!?) misunderstood Hanson?

I think this is excellent, particularly because IQ tests quickly max out on skills that can't be examined quickly. It would be great to put people in tests that examine their longer-timeframe abilities via eg. writing a longish story (perhaps containing a theory of Alzheimer's). But tests don't last that long.

Games however do last long and do manage to keep people's attention for a long time. So you might really be able to test how differentially skilled someone is over longer timeframes.

If you construct a hypothetical wherein there is obviously no space for evolutionary dynamics, then yes, evolutionary dynamics are unlikely to play a big role.

The case I was thinking of (which would likely be part of the research process towards 'brains in vats'-- essentially a prerequisite) is larger and larger collectives of designed organisms, forming tissues etc.

It may be possible to design a functioning brain in a vat from the ground up with no evolution, but I imagine that 

a) you would get there faster verifying hypotheses with in vitro experimen... (read more)

(2) can an AI use nanotech as a central ingredient of a plan to operate perpetually in a world without humans?

In the 'magical nano exists' universe, the AI can do this with well-behaved nanofactories.

In the 'bio-like nano' universe, 'evolutionary dynamics' (aka game theory among replicators under high Brownian noise) will make 'operate perpetually' a shaky proposal for any entity that values its goals and identity. No-one 'operates perpetually' under high noise; goals and identity are constantly evolving.

So the answer to the question is likely 'no'-- you n... (read more)

5Steven Byrnes
I’m imagining an exchange like this. ME: Imagine a world with chips similar to today’s chips, and robots similar to humans, and no other nano magic. With enough chips and enough robots, such a system could operate perpetually, right? Just as human society does. THEM: OK sure that could happen but not until there are millions or even billions of human-level robots, because chips are very hard to fabricate, like you need to staff all these high-purity chemical factories and mines and thousands of companies manufacturing precision equipment for the fab etc. ME: I don’t agree with “millions or even billions”, but I’ll concede that claim for the sake of argument. OK fine, let’s replace the “chips” (top-down nano) with “brains-in-vats” (self-assembling nano). The vats are in a big warehouse with robots supplying nutrients. Each brain-in-vat is grown via a carefully controlled process that starts with a genome (or genome-like thing) that is synthesized in a DNA-synthesis machine and quadruple-checked for errors. Now the infrastructure requirements are much smaller. ~~ OK, so now in this story, do you agree that evolution is not particularly relevant? Like, I guess a brain-in-a-vat might get cancer, if the AI can’t get DNA replication error rates dramatically lower than it is in humans (I imagine it could, because its tradeoffs are different), but I don’t think that’s what you were talking about. A brain-in-a-vat with cancer is not a risk to the AI itself, it could just dump the vat and start over. (This story does require that the AI solves the alignment problem with respect to the brains-in-vats.)

Also worth noting w.r.t. this that an AI that is leaning on bio-like nano is not one that can reliably maintain control over its own goals-- it will have to gamble a lot more with evolutionary dynamics than many scenarios seem to imply, meaning:
- instrumental goal convergence more likely
- paperclippers less likely

So again, tabooing magical nano has a big impact on a lot of scenarios widely discussed.

2Steven Byrnes
I don’t understand why evolution has anything to do with what I wrote. Evolution designed a genome, and then the genome (plus womb etc.) builds a brain. By the same token, it’s possible that a future AI could design a genome (or genome-like thing), and then that genome could build a brain. Right? Hmm, I guess a related point is that an AI wanting to take over the world probably needs to be able to either make lots of (ideally exact) copies of itself or solve the alignment problem w.r.t. its successors. And the former is maybe infeasible for a bio-like brain-ish thing in a vat. But not necessarily. And anyway, it might be also infeasible for a non-bio-like computer made from self-assembling nanobots or whatever. So I still don’t really care.

parents should not have the right to deny their offspring a chance to exist

but again here you are switching back from the population level to the individual level. Those offspring do not exist by default, there are no 'offspring' that the parents have 'denied the right to exist'. There are only counterfactual offspring, who already don't exist.

 

spy on their kids' futures by reading their genome

this, on the other hand, may be more valid-- because the parents will 'spy on' both actual and counterfactual children's genomes (and select the former over the ... (read more)

Ah I see.

I certainly concede that the argument about counterfactual populations has a lot more force.

Personally I would solve this with increased support for eg. polygenic screening and other reproductive technologies and less regulation about what they can select for, and hope that people do their weird people thing and choose diversity. I worry that regulation will always result in more standardization.

And I for sure don't think punishing people for making reproductive choices is a good move, even if those choices result in the extinction of specific populations.

2the gears to ascension
so anyway, this is why I started out saying a mere rename doesn't cut it, you need to actually change policy. epilogenics is still eugenics, even if it's not mass murder. parents should not have the right to deny their offspring a chance to exist. offspring are not property. also, I actually do agree with conservatives that loss of any genetic information is kinda sad, I just don't think any non-selective loss has the problems I care about, and anyway people have the right to simply not have kids; what I don't think they have the right to do is spy on their kids' futures by reading their genome in order to decide whether to have kids. humanity is used to our genomes being randomized by parental choice, so it is part of us; humanity is not used to genomes being steered by parental choice and that's a big enough change to warrant real conflict.

How is this kind of reasoning about counterfactual children never born different from the regular Christian stuff about not masturbating?

A statement like 'my parents would have used polygenic screening to kill me' is no more meaningful than 'you are murdering your counterfactual children when you wear a condom' or something like that. It seems to have more meaning because you are talking about yourself, but in the universe where 'you' were 'murdered' by polygenic screening, 'you' does not refer to anything.

-1the gears to ascension
because that doesn't let them select against trans people.

That's fair; however, I would say that the manner of foom determines a lot about what to look out for and where to put safeguards.

If it's total($), it's obvious what to look out for.

flop/$ also seems like something that eg. NVIDIA is tracking closely, and per OP probably can't foom too rapidly absent nanotech.

So the argument is something about the (D*I)/flop dynamics.

[redacted] I wrote more here but probably it's best left unsaid for now. I think we are on a similar enough page.

It seems here that you are more worried about 'foom in danger' (danger per intelligence, D / I) than about regular foom (4+ OOM increase in I), if I am reading you correctly. Like I don't see a technical argument that eg. the claims in OP about any of

/flop,  flop/J, total(J), flop/$, or total($)

are wrong; you are just saying that 'D / I will foom at some point' (aka a model becomes much more dangerous quickly, without needing to be vastly more powerful algorithmically or having much more compute).

This doesn't change things much but I just want to underst... (read more)

2the gears to ascension
I don't think I should clarify further right now, though I could potentially be convinced otherwise. I'd need to think about precisely what I want to highlight. It's not like it'll be that long before it becomes glaringly obvious, but I don't currently see a reason why clarifying this particular aspect makes us safer.

TC is Tyler Cowen.

I don't think the base rates are crazy-- the new evolution of hominins one is only wrong if you forget who 'you' is. TC and many other people are assuming that 'we' will be the 'you' that are evolving. (The worry among people here is that 'they' will have their own 'you'.)

And the second example, writing new software that breaks-- that is the same as making any new technology, we have done this before, and we were fine last time. Yes there were computer viruses, yes some people lost fingers in looms back in the day. But it was okay in the ... (read more)

2Seth Herd
I think updating against these base rates is the critical thing. But it's not really an update. The key difference between optimists and pessimists in this area is the recognition that there are no base rates for something like AGI. We have developed new technologies before, but we have never developed a new species before. New forms of intelligence and agency are a completely new phenomenon. So if you wanted to ascribe a base rate of our surviving this with zero previous examples, you'd put it at .5. If you counted all of the previous hominid extinctions as relevant, you'd actually put the base rate much lower. This really seems like the relevant comparison. Tools don't kill you, but strange creatures do. AGI will be a creature, not a tool.

Instead, we're left relying on more abstract forms of reasoning

See, the frustrating thing is, I really don't think we are! There are loads of clear, concrete things that can be picked out and expanded upon. (See my sibling comment also.)

2AnthonyC
Honestly not sure if I agree or not, but even if true, it's very hard to convince most people even with lots of real-world examples and data. Just ask anyone with an interest in the comparative quantitative risk assessment of different electricity sources, or ways of handling waste, and then ask them about the process of getting that permitted and built. And really, could you imagine if we subjected AI labs even to just 10% of the regulation we put in the way of letting people add a bedroom or bathroom to their houses? Also, it doesn't take a whole lot of abstraction to be more abstract than the physics examples I was responding to, and even then I don't think we had nearly as much concrete data as we probably should have about the atmospheric nitrogen question. (Note that the H bomb developers also did the math that made them think lithium-7 wouldn't contribute to yield, and were wrong. Not nearly as high stakes, so maybe they weren't as careful? But still disconcerting).

Thanks very much for this thorough response!

One thing though-- in contrast to the other reply, I'm not so convinced by the problem that 

No such general science of intelligence exists at the moment.

This would be like the folks at Los Alamos saying 'well, we need to model the socioeconomic impacts of the bomb, plus we don't even know what happens to a human subjected to such high pressures and temperatures, we need a medical model and a biological model' etc. etc.

They didn't have a complete science of socioeconomics. Similarly, we don't have a complete ... (read more)

Say you’re told that an agent values predicting text correctly. Shouldn’t you expect that:

  • It wants text to be easier to predict, and given the opportunity will influence the prediction task to make it easier (e.g. by generating more predictable text or otherwise influencing the environment so that it receives easier prompts);
  • It wants to become better at predicting text, and given the opportunity will self-improve;
  • It doesn’t want to be prevented from predicting text, and will prevent itself from being shut down if it can?

In short, all the same types of inst

... (read more)

This is the closest thing yet! Thank you. Maybe that is it.

Yeah, unfortunately 'somewhat argue for foom' is exactly what I'm not looking for; rather, I'm looking for a simple and concrete model that can aid communication with people who don't have time to read the 700-page Hanson-Yudkowsky debate. (Which I did read, for the record.)

5Quintin Pope
If that's what you're interested in, I'd suggest: What a compute-centric framework says about AI takeoff speeds - draft report

With what little I know now I think 2 would be most clear to people. However I appreciate that that might contribute to capabilities, so maybe exfohazard.

4 is definitely interesting, and I think there are actually a few significant papers about instrumental convergence. More of those would be good, but I don't think that gets to the heart of the matter w.r.t a simple model to aid communication.

5. I would love some more information theory stuff, drilling into how much information is communicated to eg. a model relative to how much is contained in the world.... (read more)

1dr_s
I was thinking more that the question here was also about more rigorous and less qualitative papers supporting the thesis, than just explanations for laypeople. One of the most common arguments against AI safety is that it's unscientific because it doesn't have rigorous theoretical support. I'm not super satisfied with that criticism (I feel like the general outlines are clear enough, and I don't think you can really make up some quantitative framework to predict, e.g., which fraction of goals in the total possible goal-space benefit from power-seeking and self-preservation, so in the end you still have to go with the qualitative argument and your feel for how much does it apply to reality), but I think if it has to be allayed, it should be by something that targets specific links in the causal chain of Doom. Important side bonus, formalizing and investigating these problems might actually reveal interesting potential alignment ideas. I'll have to read those papers you linked, but to me in general it feels like perhaps the topic more amenable to this sort of treatment is indeed Instrumental Convergence. The Orthogonality Thesis feels to me more of a philosophical statement, and indeed we've had someone arguing for moral realism here just days ago. I don't think you can really prove it or not from where we are. But I think if you phrased it as "being smart does not make you automatically good" you'd find that most people agree with you - especially people of the persuasion that right now regards AI safety and TESCREAL people as they dubbed us with most suspicion. Orthogonality is essentially moral relativism! Now if we're talking about a more outreach-oriented discussion, then I think all concepts can be explained pretty clearly. I'd also recommend using analogies to e.g. invasive species in new habitats, or the evils of colonialism, to stress why and how it's both dangerous and unethical to unleash things that are more capable than us and are driven by too simple

give THEM plausibility deniability about having to understand or know things based on their own direct assessment

I don't follow what you are getting at here.

I'm just thinking about historical cases of catastrophic risk, and what was done. One thing that was done was that the government paid very clever people to put together models of what might happen.

My feeling is that the discussion around AI risk is stuck in an inadequate equilibrium, where everyone on the inside thinks it's obvious but people on the outside don't grok it. I'm trying to think of the mi... (read more)

2JenniferRM
On a simple level, all exponential explosions work on the same principle, which is that there's some core resource, and in each unit of time, the resource is roughly linearly usable to cause more of the resource to exist and be similarly usable. Neutrons in radioactive material above a certain density cause more neutrons and so on to "explosion". Prions in living organisms catalyze more prions, which catalyze more prions, and so on until the body becomes "spongiform". Oxytocin causes uterine contractions, and uterine contractions are rigged to release more oxytocin, and so on until "the baby comes out". (Not all exponential processes are bad, just most. It is an idiom rarely used by biology, and when biology uses the idiom it tends to be used to cause phase transitions where humble beginnings lead to large outcomes.) "Agentic effectiveness" that loops on itself to cause more agentic effectiveness can work the same way. The inner loop uses optimization power to get more optimization power. Spelling out detailed ways to use optimization power to get more optimization power is the part where it feels like talking about zero-days to me? Maybe it's just that quite a few people literally don't know how exponential processes work? That part does seem safe to talk about, and if it isn't safe then the horse is out of the barn anyway. Also, if there was a gap in such knowledge it might explain why they don't seem to understand this issue, and it would also explain why many of the same people handled covid so poorly. Do you have a cleaner model of the shape of the ignorance that is causing the current policy failure?

In summary: this proposal feels like you're personally asking to be "convinced in public using means that third parties can watch, so that third parties will grant that it isn't your personal fault for believing something at variance with the herd's beliefs" and not like your honest private assessment of the real situation is bleak. These are different things.

Well, that's very unfortunate because that was very much not what I was hoping for.

I'm hoping to convince someone somewhere that proposing a concrete model of foom will be useful to help think about p... (read more)

2JenniferRM
Can you say more about the "will be useful to help think about policy proposals and steer public discourse" step? A new hypothesis is that maybe you want a way to convince OTHER people, in public, via methods that will give THEM plausibility deniability about having to understand or know things based on their own direct assessment of what might or might not be true.

That's a good paper, but I think it exemplifies the problem outlined by Cowen-- it mostly contains references to Bostrom and Yudkowsky; it doesn't really touch on more technical stuff (Yampolskiy, Schmidhuber) which exists, which makes me think that it isn't a very thorough review of the field. It seems like more of the same. Maybe the Hubinger paper referenced therein is on the right track?

The question of where to do science is relevant but not important-- Cowen even mentions that 'if it doesn't get published, just post it online'-- he is not against readi... (read more)

3dr_s
What would you consider as possible angles of attack to the problem? A few points to address that come to mind: 1. feasibility of AGI itself. Honestly may be the hardest thing to pinpoint; 2. feasibility of AGI performing self-improvement. This might be more interesting but only focusing on a specific paradigm. I think there might be a decent case for suggesting the LLM paradigm, even in an agentic loop and equipped with tools, eventually stagnates and never goes anywhere in terms of true creativity. But that's just telling capability researchers to invent something else; 3. ability to escape. Some kind of analysis of e.g. how much agency can an AI exert on the physical world, what's the fastest path to having a beachhead (such as, what would be the costs of it having a robot built for itself, even assuming it was super smart and designed better robotics than ours? Would it be realistic for it to go entirely unnoticed?) 4. more general game theory/modelling about instrumental convergence and power seeking being optimal. I can think of experiments with game worlds and AIs of some sort set into it to try and achieve competing goals, or even some kind of purely mathematical model. This seems pretty trivial though, and I'm not sure people wouldn't just say "ok but it applies to your toy model, not the real world" 5. more general information theoretical arguments about the limits of intelligence, or lack thereof, in controlling a chaotic system? What are the Lyapunov exponents of real world systems? Seems like this would affect the diminishing returns of intelligence in controlling the world.

needs to be done interactively ... people get stuck in a variety of different ways


I think the previous examples of large-scale risk I mentioned are a clear counterexample-- if you have at least one part of the scenario clearly modeled, people have something concrete to latch on to.

You also link somewhere that talks about the nuclear discontinuity, and hints at an intelligence discontinuity-- but I also went searching for evidence of a discontinuity in cognition and didn't find one. You would expect cognitive scientists to have found this by now.

Hard to fin... (read more)

Fake Journal Club, now coming to a forum near you! Today's winner, the gear to ascension, will receive one (1) gold-plated gear for their gear collection!

I expect Magnus Carlsen to be closer in ELO to a bounded superintelligence than to a median human.

Seems like this sort of claim could be something tractable that would qualify as material progress on understanding bounds to superintelligence? I'm thinking about results such as this.

However I think that post's title oversells the result-- from the paper:

This paper has demonstrated that even superhuman agents can be vulnerable to adversarial policies. However, our results do not establish how common such vulnerabilities are: it is possible Go-playing AI syst

... (read more)

This is completely absurd, because actual superintelligences are just going to draw each other 100% of the time. Ergo, there can never be a one-million Elo chess engine.

Do you have some idea of where the ceiling might be, that you can say that with confidence?

Just looking at this, it seems like research in chess has slowed down. Makes sense. But did we actually check if we were near a chess capabilities ceiling before we slowed down? I'm wondering if seeing how far we can get above human performance could give us some data about limits to superintelligence.
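For concreteness, the standard Elo expected-score formula is what makes the draw argument bite: if two engines always draw, each scores 0.5 per game, which is the expected score only when their ratings are equal, so an arbitrarily large rating gap can't be sustained. The numbers below are arbitrary illustrations, not real engine ratings.

# Standard Elo expected score for player A against player B.
def expected_score(r_a, r_b):
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

print(expected_score(1_000_000, 3600))  # ~1.0: a million-Elo engine would have to win essentially every game
print(expected_score(3600, 3600))       # 0.5: the score each side actually gets if every game is a draw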

Everyone here acting like this makes him some kind of soothsayer is utterly ridiculous. I don't know when it became cool and fashionable to toss off your epistemic humility in the face of eternity; I guess it was before my time.

The basilisk is just Pascal's mugging for edgelords.

2the gears to ascension
strong agree, downvote.
4Tetraspace
I don't mean to say that it's additional reason to respect him as an authority or accept his communication norms above what you would have done for other reasons (and I don't think people particularly are here), just that it's the meaning of that jokey aside.

Maybe you got into trouble for talking about that because you are rude and presumptive?

definitely

as a human talking about ASI, the word 'definitely' is cope. You have no idea whatsoever, but you want to think you do. Okay.

extract all the info it could

we don't know how information works at small scales, and we don't know whether an AI would either. We don't have any idea how long it would take to "extract all the info it could", so this phrase leaves a huge hole.

then maybe simulate us

which presumes that it is as arrogant as you in 'knowing' what it can 'def... (read more)

Maybe you got into trouble for talking about that because you are rude and presumptive?

I think this is just a nod to how he's literally Roko, for whom googling "Roko simulation" gives a Wikipedia article on what happened last time. 

That isn't my argument; my argument is just that the general tone seems too defeatist.

The question asker was under the impression that the probabilities were 99.X percent against anything okay. My only argument was that this is wrong, and there are good reasons that this is wrong.

Where the p(doom) lies between 99 and 1 percent is left as an exercise for posterity. I'm not totally unhinged in my optimism; I just think the tone of certain doom is poorly founded and there are good reasons to have some measure of hope.

Not just 'i dunno, maybe it will be fine' but real reasons why it could conceivably be fine. Again, the probabilities are up for debate; I only wanted to present some concrete reasons.

The information could be instrumentally useful for any of the following Basic AI Drives:

  • Efficiency: making use of the already-performed thermodynamic 'calculation' of evolution (and storage of that calculation-- the biosphere conveniently preserves this information for free)
  • Acquisition: 'information' will doubtlessly be one of the things an AI wants to acquire
  • Creativity: the biosphere has lots of ways of doing things
  • Cognitive enhancement: understanding thermodynamics on an intimate level will help any kind of self-enhancement
  • Technological perfection: same
... (read more)
2quanticle
At every time step, the AI will be trading off these drives against the value of producing more or doing more of whatever it was programmed to do. What happens when the AI decides that it's learned enough from the biosphere and that the potential benefit it earns from learning about biology, evolution and thermodynamics no longer outweighs the costs of preserving a biosphere for humans? We humans make these trade-offs all the time, often unconsciously, as we weigh whether to bulldoze a forest, or build a dam, or dig a mine. A superintelligent AI will perhaps be more intentional in its calculations, but that's still no guarantee that the result of the calculation will swing in humanity's favor. We could, in theory, program the AI to preserve Earth as a sanctuary. But, in my view, that's functionally equivalent to solving alignment. Your argument appears to be that an unaligned AI will, spontaneously, choose to, at the very least, preserve Earth as a sanctuary for humans into perpetuity. I still don't see why it should do that.

Just to preserve information. It's not every day that you come across a thermodynamic system that has been evolving so far from equilibrium for so long. There is information here.

In general, I feel like a lot of people in discussions about ASI seem to enjoy fantasizing about science fiction apocalypses of various kinds. Personally I'm not so interested in exercises in fancy; I'm more interested in looking at ways physical laws might imply that 'strong orthogonality' is unlikely to obtain in reality.

5quanticle
Why should the AI prioritize preserving information over whatever other goal that it's been programmed to accomplish?
1RomanS
A related factor is curiosity. As I understand, reinforcement learning agents perform much better if gifted with curiosity (or if developed it by themselves). Seeking novel information is extremely helpful for most goals (but could lead to "TV addiction").  I find it plausible that ASI will be curious, and that both humanity and the biosphere, which are the results of billions of years of an enormous computation, will stimulate ASI's curiosity.  But its curiosity may not last for centuries, or even years. Additionally, the curiosity may involve some dissection of living humans, or worse.

Haha, totally agree-- I'm very much at the limit of what I can contribute.

In an 'Understanding Entropy' seminar series I took part in a long time ago we discussed measures of complexity and such things. Nothing was clear then or is now, but the thermodynamic arrow of time plus the second law of thermodynamics plus something something complexity plus the Fermi observation seems to leave a lot of potential room for 'this planet is special' even from a totally misanthropic frame.

Enjoy the article!

"Whatever happened here is a datapoint about matter and energy doing their usual thing over a long period of time."

Not all thermodynamic systems are created equal. I know enough about information theory to know that making bold claims about what is interesting and meaningful is unwise. But I also know it is not certain that there is no objective difference between a photon wandering through a vacuum and a butterfly.

Here is one framework for understanding complexity that applies equally well for stars, planets, plants, animals, humans and AIs. It is possibl... (read more)

2rvnnt
I feel like this conversation might be interesting to continue, if I had more bandwidth, but I don't. In any case, thanks for the linked article, looks interesting based on the abstract.

See my reply above for why the ASI might choose to move on before strip-mining the planet.

7quanticle
From your other reply The problem isn't that the AI will want the atoms that comprise your body, specifically. That's trivially false. It makes as much sense as the scene in The Matrix where Morpheus explained to Neo that the Matrix was using humans as living energy sources. What is less trivially false is that the AI will alter the biosphere in ways that make it impossible (or merely very difficult) for humans to live, just as humans have altered the biosphere in ways that have made it impossible (or merely very difficult) for many other species to live. The AI will not intend to alter the biosphere. The biosphere alteration will be a side-effect of whatever the AI's goals are. But the alteration will take place, regardless. Put more pithily: tell me why I should expect a superintelligent AI to be an environmentalist.

Whatever happened here is an interesting datapoint about the long-term evolution of thermodynamic systems away from equilibrium.

From the biological anchors paper:

This implies that the total amount of computation done over the course of evolution from the first animals with neurons to humans was (~1e16 seconds) * (~1e25 FLOP/s) = ~1e41 FLOP.

Note that this is just computation of neurons! So the total amount of computation done on this planet is much larger.

This is just illustrative, but the point is that what happened here is not so trivial or boring th... (read more)
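Spelling out the arithmetic behind the quoted figure (round numbers taken from the quote above, nothing added):

# Order-of-magnitude arithmetic behind the quoted biological-anchors estimate.
seconds_since_first_neurons = 1e16   # as quoted above
neural_flop_per_second = 1e25        # as quoted above
print(f"{seconds_since_first_neurons * neural_flop_per_second:.0e}")  # 1e+41 FLOP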

3rvnnt
I think using the word "interesting" here is kinda assuming the conclusion? Pretty much all systems in the universe can be seen as "thermodynamic systems". And for a system to evolve at all, it necessarily has to be away from equilibrium. So it seems to me that that sentence is basically saying "Whatever happened here is a datapoint about matter and energy doing their usual thing over a long period of time." And... I don't see how that answers the question "why would an ASI find it interesting?" I agree that a lot of stuff has happened. I agree that accurately simulating the Earth (or even just the biological organisms on Earth) is not trivial. What I don't see (you making an actual argument for) is why all those neural (or other) computations would be interesting to an ASI. [1] Right. That sounds like a worse-than-death scenario. I agree those are entirely plausible, albeit maybe not the most likely outcomes. I'd expect those to be caused by the AI ending up with some kind of human-related goals (due to being trained with objectives like e.g. "learn to predict human-generated text" or "maximize signals of approval from humans"), rather than by the ASI spontaneously developing a specific interest in the history of how natural selection developed protein-based organic machines on one particular planet. As mentioned above, I'd agree that there's some chance that an Earth-originating ASI would end up with a goal of "farming" (simulated) humans for something (e.g. signals of approval), but I think such goals are unlikely a priori. Why would an ASI be motivated by "a grand exploration of the extremes of thermodynamics" (whatever that even means)? (Sounds like a waste of energy, if your goal is to (e.g.) maximize the number of molecular squiggles in existence.) Are you perhaps typical-minding/projecting your own (laudable) human wonder/curiosity onto a hypothetical machine intelligence? ---------------------------------------- 1. Analogy: If you put a few kilogra

If the ASI were 100% certain that there was no interesting information embedded in the Earth's ecosystems that it couldn't trivially simulate, then I would agree.

6Roko
This is a cope. Superintelligence would definitely extract all the info it could, then disassemble us, then maybe simulate us but I got into trouble for talking about that so let's not go there.
1rvnnt
Why would an ASI be interested in the Earth's ecosystems?

Do you pick up every penny that you pass in the street?

The amount of energy and resources on Earth would be a rounding error in an ASI's calculations. And it would be a rounding error that happens to be incredibly complex and possibly unique!

Maybe a more appropriate question is, do you pick every flower that you pass in the park? What if it was the only one?

5quanticle
Once again: this argument applies to humanity too. Everyone acknowledges that the asteroid belt holds far more resources than Earth. But here we are, building strip mines in Australia rather than hauling asteroids in from the belt. Your counterargument is that the AI will find it much easier to go to space, not being constrained by human biology. Fine. But won't the AI also find it much easier to build strip mines? Or harvest resources from the oceans? Or pave over vast tracts of land for use as solar farms? You haven't answered why going to space will be cheaper for the AI than staying on earth. All you've proven is that going to space will be cheaper for the AI than it will be for humans, which is a claim that I'm not contesting.

If there was a system which was really good at harvesting energy and it was maxxed out on intelligence, atoms might be very valuable, especially atoms close to where it is created

The number of atoms on Earth is so tiny. Why not just head to the asteroid belt, where you can really build?

3AnthonyC
Total mass of the asteroid belt is <0.1% the mass of Earth. Total mass of all rocky planets, moons, asteroids, comets, and any other Oort cloud objects is about 3 Earth masses. Not harvesting Earth first if you can and you're right here would be very odd, until and unless you can build everything you need out of the sun or the gas giants.
3Roko
Because you would do both. There's no reason for a maximizer to leave anything untouched.
1quetzal_rainbow
I can imagine a situation where the ASI first disassembles the Moon and then Earth. I can't imagine scenarios 1. where the ASI doesn't disassemble Earth at all, or 2. where the ASI doesn't unleash some nanoplagues to destroy human civilization before starting to disassemble the Moon, just in case.
5rvnnt
Why not both? Why leave value lying around? (Also, the asteroid belt containing Ceres and Vesta contains several orders of magnitude less matter than Earth. Maybe you meant "why not go colonize the Milky Way and other galaxies"?)

I'm not sure what you think I believe, but yeah I think we should be looking at scenarios in between the extremes.

I was giving reasons why I maintain some optimism, and maintaining optimism while reading Yudkowsky leaves me in the middle, where actions can be taken.

4danbmil99
Violent agreement! I was using the pronoun 'you' rhetorically.