All of Maybe_a's Comments + Replies

Things that I seem to notice about the plan:

  1. Adjusting weights is a plan for basic AIs, which can't seek to e.g. be internally consistent, eventually landing wherever the attractors take them.
  2. Say you manage to give your AI enough quirks for it to go cry in a corner. Now you need to dial back the nerfing to get more intelligence out of it, leading to brinkmanship dynamics.
  3. In the middle, you have a bunch of AIs, each trained to maximize various aspects of incorrigibility, while you hope they are incapable of cooperating, or that no single AI will act destructively (despite being trained for incorrigibility).

Maybe in-vivo genetic editing of the brain is possible. Adenoviruses, which are a standard delivery mechanism for gene therapy, can pass the blood-brain barrier, so it seems plausible to an amateur.

(It's not obvious that this works in adult organisms; maybe the genes only activate while the fetus grows, or during childhood.)

Odds games against an engine are played with contempt set equal to the material difference.

Sorry you didn't know that beforehand.

Obviously fine. I posted here to get something better than my single point estimate of what's up with this thing.

The post expands on the ML field's intuition that reinforcement learning doesn't always work, and that getting it to work is a fiddly process.

In the final chapter, a DeepMind paper arguing that 'one weird trick' will work is demolished.

The problem under consideration is very important for some possible futures of humanity.

However, the author's eudaimonic wishlist is self-admittedly geared toward fiction production, and doesn't seem very enforceable.

It's a fine overview of modern language models. The idea of scaling all skills at the same time is highlighted, which differs from human developmental psychology. Since publication, the 540B-parameter PaLM model seems to have shown jumps on around 25% of the tasks in BIG-bench.

The inadequacy of measuring average performance of LLMs is discussed: a proportion of outputs is good, and the rest are outright failures from a human point of view. Scale seems to help with the success rate.
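To illustrate with invented numbers (my own toy sketch, not from the post): two models can share an average score while one of them never fully succeeds.

```python
# Toy scores (invented): same mean, very different success profiles.
scores_a = [0.8] * 10                 # uniformly mediocre answers
scores_b = [1.0] * 8 + [0.0] * 2      # mostly perfect, occasionally a flop

def success_rate(scores, threshold=1.0):
    """Fraction of outputs a human would call an outright success."""
    return sum(s >= threshold for s in scores) / len(scores)

print(sum(scores_a) / len(scores_a), sum(scores_b) / len(scores_b))  # 0.8 0.8
print(success_rate(scores_a), success_rate(scores_b))                # 0.0 0.8
```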

In the 7th footnote, the figure should be 5e9, not 5e6 (this doesn't seem to impact the reasoning qualitatively).

The argument against CEV seems cool; thanks for formulating it. I guess we are leaving some utility on the table with any particular approach.

The part about referring to a model to adjudicate itself seems really off. I have a hard time imagining a thing that performs better at the meta-level than at the object level. Do you have some concrete example?

1Victor Novikov
Let me rephrase it: FAI has a part of its utility function that decides how to "aggregate" our values, how to resolve disagreements and contradictions in our values, and how to extrapolate our values. Is FAI allowed to change that part? Because if not, it is stuck with our initial guess on how to do that, forever. That seems like it could be really bad.

Actual example:

- What if groups of humans self-modify to care a lot about some particular issue, in an attempt to influence FAI?

More far-fetched examples:

- What if a rapidly-spreading mind virus drastically changes the values of most humans?
- What if aliens create trillions of humans that all recognize the alien overlords as their masters?

Just to be clear on the point of the examples: these are examples where a "naive" aggregation function might allow itself to be influenced, while a "recursive" function would follow the meta-reasoning that we wouldn't want FAI's values and behavior to be influenced by adversarial modification of human values, only by genuine changes in such (whatever "genuine" means to us. I'm sure that's a very complex question. Which is kind of the point of needing to use recursive reasoning. Human values are very complex. Why would human meta-values be any less complex?)

Thanks for giving it a think.

Turning off is not a solved problem, e.g. https://www.lesswrong.com/posts/wxbMsGgdHEgZ65Zyi/stop-button-towards-a-causal-solution 

Finite utility doesn't help, as long as you need to use probability. So you get: a 95% chance of 1 unit of utility is worse than 99%, which is worse than 99.9%, etc. And when you apply the same trick to the probabilities, you get a quantilizer. And that doesn't work either: https://www.lesswrong.com/posts/ZjDh3BmbDrWJRckEb/quantilizer-optimizer-with-a-bounded-amount-of-output-1
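A minimal sketch of that ladder (numbers invented): even with utility capped at 1 unit, expected utility is strictly increasing in the success probability, so the optimization pressure just moves onto the probability.

```python
# Capped utility still rewards pushing the success probability up:
# E[utility] = p * CAP, strictly increasing in p.
CAP = 1.0
for p in (0.95, 0.99, 0.999, 0.99999):
    print(f"p = {p}: E[utility] = {p * CAP}")
```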

2FinalFormal2
The point is that the AI turns itself off after it fulfills its utility function, just like a called function. It doesn't maximize utility, it sets utility to greater than 25. There is no ghost in the machine that wants to be alive, and after its utility function is met, it will be inert. None of the objections listed in the 'stop button' hypothetical apply.

I'm not sure I understand your objection to finite utility. Here's my model of what you're saying:

Now, this is not great and might have led to some deaths. But it seems to me to be far less likely to lead to the end of the world than an unbounded utility agent. I'm not saying that utility ceilings are a panacea, but they might be a useful tool to use in concert with other safety precautions.

Maybe the failure with people is caused by whatever they tweaked to avoid 'generating realistic faces and known persons'?

Cool analysis. Sounds plausible.

So you're out to create a new benchmark? The reading SAT references the text in its answer options with ellipses, which makes it hard for me to solve in a single read-through. Maybe repeating the questions at the beginning and expanding the ellipses would fix that for humans. The current format is probably also confusing for pretrained models like GPT.

Requiring a longer task text doesn't seem essential. In the end, maybe you'd like to take some curriculum learning experiment and thin out the learning examples so that current memorization mechanisms wouldn't suf... (read more)

No particular philosophy: just add some kludge to make your life easier, then repeat until they blot out the Sun.

My non-computer tools are pen and paper for notes, with everything useful filed to the inbox during daily review. Everything else is based on org-mode, with Orgzly on mobile. Syncing is over SFTP; I'm not a cloud person.

I wrote an RSS reader in Python for filling the inbox, along with org-capture. I wouldn't recommend the same approach, since elfeed should do the same reasonably easily. Having a script helps, since running it automatically nightly + before daily revie... (read more)

Oh, sorry. JavaScript shenanigans seem to have sent me into another course; it works fine in a clean browser.

Consider not wasting your readers' time by making them register on grasple only to be presented with a 34-euro paywall.

[This comment is no longer endorsed by its author]
4Davide_Zagami
Registration and access to the lessons is completely free. Where do you see a paywall?

I'd think the 'ethical' in 'ethical review board' has nothing to do with ethics; it's more of a PR-wary review board. Limiting science to status-quo-bordering questions doesn't seem most efficient, but it is a reasonable safety precaution. However, the board's typical view might be skewed away from real estimates of safety. For example, genetic modification of humans is probably minimally disruptive biological research (compared to, say, biological weapons), though it is considered controversial.

My town ...

Suppose 20% of wards are swung by one vote; that gives each voter a 1 in (5 * number of voters) chance of affecting a vote cast at the next level, if that's how the US system works?

... elected officials change their behavior based on margins ...

Which is an exercise in reinforcing prior beliefs, since margins are obviously insufficient data.

Politicians pay a lot more attention to vote-giving populations...

Are politicians equipped with a device to detect voters and their needs? If not, then it's lobbying, not voting, that matters.

...impact of your r

... (read more)
0A1987dM
This isn't a great assumption in general, but it's a particularly bad assumption if you're describing your reasoning out loud in public.
0Luke_A_Somers
1 - No. This was an election for members of the town council, so those two 1-ballot-difference votes decided who got two jobs.

2 - I'm not sure what you mean there. A way of measuring the strength of this effect is to see how voting patterns differ between representatives who win by landslides vs representatives who squeak by. The vulnerable representatives act much more cautiously - and for good reason.

3 - Who voted in each election is public information, so the first answer is YES, except for the requirement that there be a special device for it, which you tacked on. For identifying your needs, there's exit polling (there is no way to make sure that you get exit-polled), in which you are often asked what the most important issue is for you. For other polling, I suppose you can lie and say you vote regularly when called, but you might consider that unethical. The simplest form of lobbying is letters to representatives. Again, I suppose you could lie.

4 - Follow it, not you. The population of people who face the same situation with the same logical premises and habits. Copies of computations are the same computation. Perhaps you would find the fraction of people that would need to be voting so that your voting is no longer 'worth it'. Then you would vote or not with that probability.

Absolutely, shutting up and multiplying is the right thing to do.

Assume: simple majority vote, 1001 voters, 1 000 000 QALY at stake, votes binomially distributed B(p=0.4), no messing with other people's votes, voting itself doesn't give you QALY.

My vote swings the outcome iff 500 <= B(1001, 0.4) < 501; this has probability 5.16e-11, so voting is advised if it takes less than 27 minutes.
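The same computation as a quick Python check (assumes scipy; it reproduces the figures above up to rounding):

```python
# Swing probability under the stated model: B(1001, 0.4) landing exactly on 500.
from scipy.stats import binom

p_swing = binom.pmf(500, 1001, 0.4)      # ~5e-11
qaly_at_stake = 1_000_000
minutes_per_year = 365.25 * 24 * 60

gain_minutes = p_swing * qaly_at_stake * minutes_per_year
print(f"P(swing) = {p_swing:.2e}")
print(f"expected gain = {gain_minutes:.0f} minutes")  # on the order of 27
```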

Realistically, the usefulness of voting is far lower, due to:

  • actual populations are huge, and with them the chance of swing-voting falls;

  • QALY are not quite utils (eg. other's QALY counts the same wa

... (read more)
0A1987dM
So learning how 900 of the voters (other than you) voted won't shift your beliefs about how the remaining 100 voters voted in the slightest? No, I don't think so.
0Alsadius
Politicians say they "need your vote" because they're talking to lots of people. Any one of them doesn't matter, but in bulk they really do.
2Luke_A_Somers
My town, with more than 1001 voters in each ward, had around twenty elections last year, and two of them were decided by one vote. Your model says that this should happen somewhere in America much less than once in a billion years. The fact of the matter is, voters are not binomially distributed (sometimes this lowers the probabilities further, but sometimes it raises them a lot).

Also, elected officials change their behavior based on margins, and on the size and habits of the population of voters. Politicians pay a lot more attention to vote-giving populations than non-vote-giving populations, for instance. The number of minor thresholds that can have some impact is large.

And that's before you multiply the impact of your reasoning by the population who might follow it.

I don't care, because there's nothing I can do about it. It also applies to all large-scale problems, like national elections.

I do understand that this point of view creates a 'tragedy of the commons', but there's no way I can force millions of people to do my bidding on this or that.

I also don't make lifestyle interventions, since I expect AGW effects to be dominated by socio-economic changes over the next half-century.

1kilobug
I think that's a common misconception that comes from not actually running the numbers. We individually have a very low chance of changing anything in large-scale problems, but the effect of changing anything in large-scale problems is enormous. When dealing with a very minor chance of a very major change, we can't just use our intuitions (which break down); we need to actually run the numbers. And when that's done, as it was in this post, it says that we should care, the order of magnitude of the changes being higher than the order of magnitude of our powerlessness.

Are artificial neural networks really Turing-complete? Yep, they are [Siegelmann, Sontag 91]. The number of neurons in the paper is finite, with rational edge weights, so it's really Kolmogorov-complex. This, however, doesn't say whether we can build good machines for specific purposes.

Let's figure out how to sort a dozen numbers with λ-calculus and with sorting networks. It should be noted that the lambda-expression is O(1) in size, whereas the sorting network is O(n (log n)^2).

Batcher's odd-even mergesort would be O((log n)^2) levels deep, and given one neuron is used to implement ... (read more)
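For concreteness, a sketch of the sorting-network half of the comparison (my own construction, not from the comment; it assumes a power-of-two input size): generate Batcher's odd-even mergesort comparators, then run the compare-exchange gates over a dozen numbers padded to 16.

```python
def batcher_pairs(n):
    """Comparator pairs for Batcher's odd-even mergesort (n a power of two)."""
    pairs = []
    p = 1
    while p < n:
        k = p
        while k >= 1:
            for j in range(k % p, n - k, 2 * k):
                for i in range(k):
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        pairs.append((i + j, i + j + k))
            k //= 2
        p *= 2
    return pairs

def apply_network(values, pairs):
    """Run the compare-exchange gates in sequence."""
    v = list(values)
    for a, b in pairs:
        if v[a] > v[b]:
            v[a], v[b] = v[b], v[a]
    return v

# A dozen numbers, padded to n = 16 with +infinity sentinels.
data = [5, 3, 9, 1, 12, 7, 4, 11, 2, 8, 10, 6] + [float("inf")] * 4
print(apply_network(data, batcher_pairs(16))[:12])  # the sorted dozen
```

For n = 16 this yields 63 comparators in 10 levels, in line with the O(n (log n)^2) size and O((log n)^2) depth mentioned above.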

A is not bad, because torturing a person and then restoring their initial state has precisely the same consequences as forging one's own memory of torturing a person and restoring their initial state.

0asr
From a virtue-ethics point of view, it seems reasonable to judge anybody who would do this. Good people would not want to remember committing torture, even if they didn't do it, because this would result in their future selves being wracked with unearned guilt, which is doing an injustice to that future self. Put the other way: Anybody who would want to falsely remember being a torturer would probably be the sort of person who enjoys the idea of being a torturer -- which is to say, a bad person.

Well, it seems somewhat unfair to judge the decision on information not available to the decision-maker; however, I fail to see how that is an 'implicit premise'.

I didn't think the Geneva Convention was that old, and actually, updating on it makes Immerwahr's decision score worse, due to a lower expected number of lives saved (through a lower chance of chemical weapons being used).

Hopefully, roleplaying this update has made me understand that in some value systems it's worth it. Most likely, E(Δ victims due to Haber's war efforts) > 1.

Here's what I meant by saying you were begging the question: you were assuming that the outcome (few people being killed by chemical warfare after WW1) did not depend on the protests against chemical weapons.

You said originally that protesting against chemical warfare (CW) during WW1 was not worth the sacrifice involved, because few people were killed by CW after WW1.

But the reason few people were killed is that CW was not used often. And one contributing factor to its not being used was that people had protested its use in WW1, and created the Geneva Conven... (read more)

Standing against unintended pandemics, atomic warfare, and other extinction-threatening events has been quite a good idea in retrospect. Those of us working on scientific advances shall indeed ponder the consequences.

But the Immerwahr-Haber episode is just an unrelated tearjerker. Really, inventing a process for the creation of nitrogen fertilizers is so much more useful than shooting oneself in the heart. Also, chemical warfare turned out not to kill many people after WWI, so such a sacrifice is rather irrelevant.

4DanArmak
That is rather begging the question. As a result of WW1, there have been agreements in place - the Geneva Protocol - not to develop or use chemical weapons, and so fewer people have been killed by them than might otherwise have been.

"The universe is too complex for it to have been created randomly."

You're right. Exactly.

Unless there are on the order of 2^K(Universe) universes, where K is Kolmogorov complexity, the chance of it being constructed randomly is exceedingly low.

Please, do continue.
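A toy version of that arithmetic (K and the universe count here are invented for illustration):

```python
# With K bits of complexity, one uniformly random draw matches the target
# with probability 2**-K, so you need on the order of 2**K draws to expect a hit.
K = 120                          # assumed toy complexity, in bits
p_single = 2.0 ** -K             # ~7.5e-37
n_universes = 10 ** 9
p_any = n_universes * p_single   # union bound; fine while p_single is tiny
print(f"{p_any:.1e}")            # ~7.5e-28: still astronomically unlikely
```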

[anonymous]140

Unless there are on the order of 2^K(Universe) universes, where K is Kolmogorov complexity, the chance of it being constructed randomly is exceedingly low.

An extremely low probability of the observation under some theory is not itself evidence. It's extremely unlikely that I would randomly come up with the number 0.0135814709894468, and yet I did.

It's only interesting if there is some other possibility that assigns a different probability to that outcome.