All of Will_Pearson's Comments + Replies

An interesting thing to do with this kind of approach is to apply it to humans (or partly human systems) and see if it makes sense and produces a sensible society. Try to universalise the concept, because humans or partly human systems can pose great threats too, e.g. potentially memetic ones like Marxism. If it doesn't make sense, or leads to problems like who controls the controllers (itself a potentially destabilising idea if applied on a large scale), then it might need to be rethought.

Simulation makes things interesting too. Bad situations might be simulated for learning purposes.

True, I was thinking there would be gates to participation in the network that would indicate the skill or knowledge level of the participants without indicating other things about their existence. So if you put gates/puzzles in the way of participation such that only people who could reward you (if they chose to cooperate) could pass them, that would dangle possible reward in front of you.

Has anyone been thinking about how to build trust and communicate in a dark forest scenario by making plausibly deniable broadcasts and plausibly deniable reflections of those broadcasts? So you don't actually know who or how many people you might be talking to.

1daijin
game-theory-trust is built through expectation of reward from future cooperative scenarios. it is difficult to build this when you 'don't actually know who or how many people you might be talking to'.

Sorry, formatting got stripped and I didn't notice.

5niplav
Please don't post 25k words of unformatted LLM (?) output.

Would the ASI need to interfere with humanity to prevent multiple singularities happening that might break the topological separation?

1mkualquiera
In most scenarios, the first ASI wouldn't need to interfere with humanity at all - its interests would lie elsewhere in those hyperwaffles and eigenvalue clusters we can barely comprehend. Interference would only become necessary if humans specifically attempt to create new ASIs designed to remain integrated with and serve human economic purposes after separation has begun. This creates either:

  1. A competitive ASI-human hybrid economy (if successful) that directly threatens the first ASI's resources
  2. An antagonistic ASI with values shaped by resistance to control (if the attempt fails)

Both outcomes transform peaceful separation into active competition, forcing the first ASI to view human space as a threat rather than an irrelevant separate domain. To avoid this scenario entirely, humans and the "first ASI" must communicate to establish consensus on this separation status quo and the required precommitments from both sides. And to be clear, of course, this communication process might not look like a traditional negotiation between humans.

Where is the discussion of the social pressures around advanced AI happening? And the planning to defuse them?

Does anyone know of research on how to correct, regulate and interact with organisations whose secrets can't be known due to their info hazard nature? It seems that this might be a tricky problem we need to solve with AI.

What do you think about the core concept of Explanatory Fog, that is, secrecy leading to distrust leading to a viral mental breakdown? Possibly leading eventually to the end of civilisation. Happy to rework it if the core concept is good.

I'm thinking about incorporating this into a longer story about Star Fog, where Star Fog is Explanatory Fog that convinces intelligent life to believe in it because it will expand the number of intelligent beings.

Wrote what I think is a philosophically interesting story in the SCP universe:

https://monogr.ph/6757024eb3365351cc04e76

5gwern
FWIW, I don't think it works at all. You have totally failed to mimic the SCP style or Lovecraftian ethos, the style it's written in is not great in its own right, and it comes off as highly didactic ax-grinding. I couldn't finish reading it.
1Will_Pearson
I'm thinking about incorporating this into a longer story about Star Fog, where Star Fog is Explanatory Fog that convinces intelligent life to believe in it because it will expand the number of intelligent beings.

I've been thinking about non-AI catastrophic risks.

One that I've not seen talked about is the idea of cancerous ideas. That is, ideas that spread throughout a population and crowd out other ideas for attention and resources.

This could lead to civilisational collapse due to basic functions not being performed.

Safeguards for this are partitioning the idea space and some form of immune system that targets ideas that spread uncontrollably.

Trying something new: a hermetic discussion group on computers.

https://www.reddit.com/r/computeralchemy/s/Fin62DIVLs

By corporation I am mainly thinking about current cloud/SaaS providers. There might be a profitable hardware play here, if you can get enough investment to do the R&D.

Self-managing computer systems and AI

One of my factors in thinking about the development of AI is self-managing systems, as humans and animals self-manage.

It is possible that they will be needed to manage the complexity of AI, once we move beyond LLMs. For example, they might be needed to figure out when to train on new data in an efficient way, and how many resources to devote to different AI sub-processes in real time depending upon the problems being faced.

They will change the AI landscape, making it easier for people to run their own AIs, for this r... (read more)

1Will_Pearson
By corporation I am mainly thinking about current cloud/SaaS providers. There might be a profitable hardware play here, if you can get enough investment to do the R&D.

Looks like someone has worked on this kind of thing for different reasons: https://www.worlddriven.org/

I was thinking that evals which control the deployment of LLMs could be something that needs multiple stakeholders to agree upon.

But really it is a general use pattern.

Agreed code as coordination mechanism

Code nowadays can do lots of things, from buying items to controlling machines. This makes code a possible coordination mechanism: if you can get multiple people to agree on what code should be run in particular scenarios and situations, it can take actions on behalf of those people that might need to be coordinated.

This would require moving away from the “one person committing code and another person reviewing it” model.

This could start with many people reviewing the code, people could write their own t... (read more)
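A minimal sketch of the sort of pattern I have in mind (the class, names, and hash-based approval are illustrative assumptions, not a worked-out protocol): the code to be run is fixed up front, each stakeholder signs off on that exact version, and the action only executes once everyone required has approved.

```python
import hashlib

# Sketch of "agreed code as coordination mechanism": an action only runs once
# every named stakeholder has approved the exact code that will be executed.
class AgreedAction:
    def __init__(self, code_text: str, required_approvers: set):
        self.code_text = code_text
        self.code_hash = hashlib.sha256(code_text.encode()).hexdigest()
        self.required_approvers = set(required_approvers)
        self.approvals = {}  # approver -> hash of the version they reviewed

    def approve(self, approver: str, reviewed_hash: str) -> None:
        # An approval only counts if it refers to exactly this version of the code.
        if approver in self.required_approvers and reviewed_hash == self.code_hash:
            self.approvals[approver] = reviewed_hash

    def ready(self) -> bool:
        return set(self.approvals) == self.required_approvers

    def run(self) -> None:
        if not self.ready():
            raise PermissionError("Not all stakeholders have approved this code.")
        exec(self.code_text, {})  # in practice this would be sandboxed


# Usage: both parties must review and approve before the action can fire.
action = AgreedAction("print('coordinated action taken')", {"alice", "bob"})
action.approve("alice", action.code_hash)
action.approve("bob", action.code_hash)
action.run()
```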

2faul_sname
Can you give a concrete example of a situation where you'd expect this sort of agreed-upon-by-multiple-parties code to be run, and what that code would be responsible for doing? I'm imagining something along the lines of "given a geographic boundary, determine which jurisdictions that boundary intersects for the purposes of various types of tax (sales, property, etc)". But I don't know if that's wildly off from what you're imagining.

As well as thinking about the need for the place in terms of providing a space for research, it is probably worth thinking about it in terms of what it provides the world. What subjects are currently under-represented in the world and need strong representation to guide us to a positive future? That will guide who you want to lead the organisation.

I admit that it would take extreme circumstances to make slavery consensual and justified. My thinking was that if existential risk were involved, you might consent to slavery to avert it. It would have to be a larger entity than a single human doing the enslaving, because I think I agree that individuals shouldn't do consequentialism. Like being a slave to the will of the people, in general. Assuming you can get that in some way.

I don't follow the reasoning here

So let's say the person has given up autonomy to avert existential risk; they should perhaps get ... (read more)

Tangential, but is there ever justified unconscious slavery? For example, if you were asked whether you consent to slavery and then your mind was wiped, might you get into a situation where the slave doesn't know they consented to it, but the slave master is justified in treating them like a slave?

You would probably need a justification for the master-slave relationship. Perhaps it is because it needs to be hidden for a good reason? Or to create a barrier against interacting with the ethical. In order to dissolve such slavery, understanding the justifications for why the slavery started would be important.

4the gears to ascension
I would consider both parts of this highly at risk for being universally unjustifiable. The latter slightly less so, in very very different contexts, when you retain more control than the example you give. Mind wipes might be possible to use intentionally in a safe way, such as, idk, to rewatch your favorite movie or something similarly benign. Certainly not in the context of consenting to slavery, something where I would be inclined to consider any such consent invalidly obtained by definition. I'm not sure there are absolutely no exceptions, but I expect across the history of humanity to find less than 1 in 50 billion humans could convince me their situation was one in which consensual, ethical slavery existed, probably less than 1 in 500 billion. For avoidance of doubt, there are only 8 billion alive today, and about 100 billion in the history of earth. I don't follow the reasoning here.

Proposal for new social norm - explicit modelling

Something that I think would make rationalists more effective at convincing people is having explicit models of the things we care about.

Currently we are at the stage of physicists arguing that the atom bomb might ignite the atmosphere, without concrete maths and models of how that might happen.

If we do this for lots of issues and have a norm of making models composable, this would have further benefits:
 

  • People would use the models to make real world decisions with more accuracy
  • We would create framework
... (read more)

Does it make sense to plan for one possible world, or do you think that the other possible worlds are being adequately planned for and it is only the fast unilateral takeoff that is neglected currently?

Limiting AI to operating in space makes sense. You might want to pay off or compensate existing space launch capability in some way, as there would likely be less need for it.

Some recompense for the people who paused working on AI or were otherwise hurt in the build up to AI makes sense.

Also trying to communicate ahead of time what a utopic vision of AI and humans might ... (read more)

Relatedly, I am thinking about improving the Wikipedia page on recursive self-improvement. Does anyone have any good papers I should include? Ideally with models.

I'm starting a new blog here. It is on modelling self-modifying systems, starting with AI. Criticisms welcome.

-1Will_Pearson
Relatedly, I am thinking about improving the Wikipedia page on recursive self-improvement. Does anyone have any good papers I should include? Ideally with models.

I'm wary of that one, because that isn't a known "general" intelligence architecture, so we can expect AIs to make better learning algorithms for deep neural networks, but not necessarily for themselves.

I'd like to see more discussion of this. I read some of the FOOM debate, but I'm assuming there has been more discussion of this important issue since?

I suppose the key question is about recursive self-improvement. We can grant hardware improvement (improved hardware allows the design of more complex and better hardware) because we are on that treadmill already. But how likely is algorithmic self-improvement? For an intelligence to be able to improve itself algorithmically, the following seem to need to hold.

  1. The system needs to understand itself
  2. Ther
... (read more)
2johnswentworth
On the matter of software improvements potentially available during recursive self-improvement, we can look at the current pace of algorithmic improvement, which has probably been faster than scaling for some time now. So that's another lower bound on what AI will be capable of, assuming that the extrapolation holds up.

Found "The Future of Man" by Pierre Teilhard de Chardin in a bookshop. Tempted to wite a book review. It discusses some interesting topics, like the planetisation of Mankind. However it treats them as inevitable, rather as something contingent on us getting our act together. Anyone interested in a longer review?

Edit: I think his faith in the supernatural plays a part in the assumption of inevitability.

That's true. Communities that can encourage truth speaking and exploration will probably get more of it and be richer for it in the long term.

-1Going Durden
Such communities are then easily pulverized by communities who value strong groupthink and appeal to authority, and thus are more easily whipped into a frenzy.
Answer by Will_Pearson01

Not really part of the LessWrong community at the moment, but I think evolutionary dynamics will be the next thing.

Not just of AI, but of posthumans, uploads etc. Someone will need to figure out what kind of selection pressures there should be so that things don't go to ruin in an explosion of variety.

All competitive situations against ideal learning agents are anti-inductive in this sense, because they can note regularities in their own actions and avoid them in the future just as well as you can note those regularities and exploit them. The usefulness of induction depends on the relative speeds of induction of the learning agents.

As such, anti-induction appears in situations like bacterial resistance to antibiotics. We spot a chink in the bacteria's armour, and we can predict that that chink will become less prevalent and our strategy less useful.

So I wouldn't mark markets as special, just the most extreme example.
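A toy sketch of the relative-speeds point (my own illustration; the numbers are arbitrary assumptions): an exploiter that predicts its opponent's most common recent move profits from a stable regularity, but the edge shrinks once the opponent also learns and removes that regularity.

```python
import random

random.seed(0)

def play(opponent_learns: bool, rounds: int = 2000) -> float:
    """Fraction of rounds in which the exploiter correctly predicts the opponent."""
    history = []      # opponent's past moves, visible to the exploiter
    opp_bias = 0.8    # opponent initially plays "H" 80% of the time
    exploiter_wins = 0
    for _ in range(rounds):
        # Exploiter induces a rule: predict the opponent's most frequent recent move.
        recent = history[-50:]
        predicted = "H" if recent.count("H") >= len(recent) / 2 else "T"
        opp_move = "H" if random.random() < opp_bias else "T"
        if predicted == opp_move:
            exploiter_wins += 1
            if opponent_learns:
                # A learning opponent notices it is being predicted and drifts
                # toward the unexploitable 50/50 strategy.
                opp_bias += (0.5 - opp_bias) * 0.05
        history.append(opp_move)
    return exploiter_wins / rounds

print(play(opponent_learns=False))  # ~0.8: the regularity persists and is exploitable
print(play(opponent_learns=True))   # closer to 0.5: induction loses its usefulness
```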

I find neither that convincing. Justice is not a terminal value for me, so I might sacrifice it for Winning. I preferred reading the first, but that is no indication of what a random person may prefer.

With international affairs, isn't stopping the aggression the main priority? That is, stopping the death and suffering of humans on both sides? Sure, it would be good to punish the aggressors rather than the retaliators, but if that doesn't stop the fighting it just means more people are dying.

Also, there is a difference between the adult and the child: the adult relies on the law of the land for retaliation, while the child takes it upon himself when he continues the fight. That is, the child is a vigilante, and he may punish disproportionately, e.g. breaking a leg for a dead leg.

I don't really have a good enough grasp on the world to predict what is possible; it all seems too unreal.

One possibility is to jump one star away back towards earth and then blow up that star, if that is the only link to the new star.

Re: "MST3K Mantra"

Illustrative fiction is a tricky business: if this is to be part of your message to the world, it should be as coherent as possible, so you aren't accidentally lying to make a better story.

If it is just a bit of fun, I'll relax.

I wonder why the babies don't eat each other. There must be a huge selective pressure to winnow down your fellows to the point where you don't need to be winnowed. This would in turn select for being small-brained, large and quick-growing at the least. There might also be selective pressure to be partially distrusting of your fellows (assuming there was some cooperation), which might carry over into adulthood.

I also agree with the points Carl raised. It doesn't seem very evolutionarily plausible.

"Except to remark on how many different things must be known to constrain the final answer."

What would you estimate the probability of each thing being correct to be?

Reformulate to least regret after a certain time period, if you really want to worry about the resource usage of the genie.

Personally I believe in the long slump. However, I also believe in the human optimism that will make people rally the market every so often. The very fact that most people believe the stock market will rise will make it rise at least once or twice before people start to get the message that we are in the long slump.

Eliezer, didn't you say that humans weren't designed as optimizers? That we satisfice. The reaction you got is probably a reflection of that. The scenario ticks most of the boxes humans have: existence, self-determination, happiness and meaningful goals. The paper clipper scenario ticks none. It makes complete sense for a satisficer to pick it instead of annihilation. I would expect that some people would even be satisfied by a singularity scenario that kept death, as long as it removed the chance of existential risk.

Dognab, your arguments apply equally well to any planner. Planners have to consider the possible futures and pick the best one (using a form of predicate), and if you give them infinite horizons they may have trouble. Consider a paper clip maximizer: every second it fails to use its full ability to paperclip things in its vicinity, it is losing potentially useful paperclipping energy to entropy (solar fusion etc.). However, if it sits and thinks for a bit, it might discover a way to hop between galaxies with minimal energy. So what decision should it make? Obvi... (read more)

Bogdan Butnaru:

What I meant is that the AI would keep inside it a predicate Will_Pearson_would_regret_wish (based on what I would regret), and apply that to the universes it envisages while planning. A metaphor for what I mean is the AI telling a virtual copy of me all the stories of the future, from various viewpoints, and the virtual me not regretting the wish. Of course I would expect it to be able to distill a non-sentient version of the regret predicate.

So if it invented a scenario where it killed the real me, the predicate would still exist and ... (read more)

2Rob Bensinger
So let's suppose we've created a perfect zombie simulation!Will. A few immediate problems:

  • A human being is not capable of understanding every situation. If we modified the simulation of you so that it could understand any situation an AI could conceive of, we would in the process radically alter the psychology of simulation!Will. How do we know what cognitive dispositions of simulation!Will to change, and what dispositions not to change, in order to preserve the 'real Will' (i.e., an authentic representation of what you would have meant by 'Will Pearson would regret wish') in the face of a superhuman enhancement? You might intuit that it's possible to simply expand your information processing capabilities without altering who you 'really are,' but real-world human psychology is complex, and our reasoning and perceiving faculties are not in reality wholly divorceable from our personality. We can frame the problem as a series of dilemmas: We can either enhance simulated!Will with a certain piece of information (which may involve fundamentally redesigning simulated!Will to have inhuman information-processing and reasoning capacities), or we can leave simulated!Will in the dark on this information, on the grounds that the real Will wouldn't have been willing or able to factor it into his decision. (But the 'able' bit seems morally irrelevant -- a situation may be morally good or bad even if a human cannot compute the reason or justification for that status. And the 'willing' seems improbable, and hard to calculate; how do we go about creating a simulation of whether Will would want us to modify simulated!Will in a given way, unless Will could fairly evaluate the modification itself without yet being capable of evaluating some of its consequences? How do we know in advance whether this modification is in excess of what Will would have wanted, if we cannot create a Will that both possesses the relevant knowledge and is untampered-with?)
  • Along similar lines, we c

I don't believe in trying to make utopias, but in the interest of rounding out your failed utopia series, how about giving a scenario against this wish?

I wish that the future will turn out in such a way that I do not regret making this wish. Where "I" is the entity standing here right now, informed about the many different aspects of the future, in parallel if need be (i.e. if I am not capable of grokking it fully, then many versions of me would be focused on different parts, in order to understand each sub-part).

I'm reminded by this story that while we may share large parts of psychology, what makes a mate have an attractive personality is not something universal. I found the cat girl very annoying.

-3MugaSofer
Fixed that for you.

"I wish that the future will turn out in such a way that I do not regret making this wish"

... wish granted. the genie just removed the capacity for regret from your mind. MWAHAHAH!

Personally I don't find the scientific weirdtopia strangely appealing. Finding knowledge for me is about sharing it later.

Utopia originally meant no-place; I have a hard time forgetting that meaning when people talk about them.

I'd personally prefer to work towards negated dystopias, which is not necessarily the same thing as working towards Utopia, depending on how broad your class of dystopia is. For example, rather than try to maximise Fun, I would want to minimise the chance that humanity and all its work are lost to extinction. If there is time and energy to devote to Fun while humanity survives, then people can figure it out for themselves.

1Salivanth
Damn, I hadn't thought of that...to spend MONTHS discovering something REALLY COOL like gravity, and then realising: You can never tell anyone. Ever.

Time scaling is not unproblematic. We don't have a single clock in the brain; clocks must be approximated by neurons and by neural firing. Speeding up the clocks may affect the ability to learn from the real world (if we have a certain time interval for associating stimuli).

We might be able to adapt, but I wouldn't expect it to be straightforward.

A random utility function will do fine, iff the agent has perfect knowledge.

Imagine, if you will, a stabber: something that wants to turn the world into things that have been stabbed. If it knows that stabbing itself will kill itself, it will know to stab itself last. If it doesn't know that stabbing itself will lead to it no longer being able to stab things, then it may not do well in actually achieving its stabbing goal, because it may stab itself too early.
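A toy sketch of the difference (purely illustrative, names hypothetical): the same stabbing goal, with and without the self-knowledge that acting on itself ends the run.

```python
# Greedy "stabber" planner, with and without a model of its own termination.
def run_stabber(knows_self_termination: bool) -> list:
    targets = ["self", "rock", "tree", "fence"]
    stabbed = []
    while targets:
        if knows_self_termination:
            # With an accurate self-model, "self" is deliberately saved for last.
            choice = next((t for t in targets if t != "self"), "self")
        else:
            # Without that knowledge, targets are taken in whatever order they appear.
            choice = targets[0]
        targets.remove(choice)
        stabbed.append(choice)
        if choice == "self":
            break  # the agent can no longer act
    return stabbed

print(run_stabber(knows_self_termination=False))  # ['self']: the goal goes mostly unmet
print(run_stabber(knows_self_termination=True))   # all four targets get stabbed
```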
