avturchin's Shortform

avturchin

LESSWRONG
LW

avturchin's Shortform

by avturchin

13th Aug 2019

1 min read

177

5

This is a special post for quick takes by avturchin. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

avturchin's Shortform

177 comments, sorted by

top scoring

Click to highlight new comments since: Today at 11:58 AM

Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

[-]avturchin4mo938

LLM knows when it hallucinates in advance, and this can be used to exclude hallucinations.

TLDR: prompt "predict the hallucination level of each item in the bibliography list and do not include items expected to have level 3 or above" works.

I performed an experiment: I asked Claude 3.7 Sonnet to write the full bibliography of Bostrom. Around the 70th article, it started hallucinating. I then sent the results to GPT-4.5 and asked it to mark hallucinations and estimate the hallucination chances from 1 to 10 (where 10 is the maximal level of hallucination). It correctly identified hallucinations.

After that, I asked Sonnet 3.7 in another window to find the hallucination level in its own previous answer, and it gave almost the same answers as GPT-4.5. The difference was mostly about exact bibliographical data of some articles, and at first glance, it matched 90% of the data from GPT-4.5. I also checked the real data through Google Scholar manually.

After that, I asked Sonnet to write down the bibliography again but add a hallucination rating after each item. It again started hallucinating articles soon, but to my surprise, it gave correct answers ratings of 1-2 and incorrect ones ra... (read more)

[-]Daniel Tan4mo*122

This is pretty cool! Seems similar in flavour to https://arxiv.org/abs/2501.11120 you’ve found another instance where models are aware of their behaviour. But, you’ve additionally tested whether you can use this awareness to steer their behaviour. I’d be interested in seeing a slightly more rigorous write-up.

Have you compared to just telling the model not to hallucinate?

3avturchin4mo

I found that this does not work for finding an obscure quote from a novel. It still hallucinates different, more popular novels as sources and is confident in them. But it seems it doesn't know the real answer, though I am sure that the needed novel was in its training dataset (it knows plot).

1ErioirE3mo

This seems like a somewhat difficult use case for LLMs. It may be a mistake to think of them as a database of the *entire contents* of the training data. Perhaps instead think of them as compressed amalgamations of the the general patterns in the training data? I'm not terribly surprised that random obscure quotes can get optimized away.

2avturchin3mo

Yes, but it knows all Bostrom articles, maybe because it has seen the list a hundred times.

4Knight Lee3mo

It's incredibly surprising that state-of-the-art AI don't fix most of their hallucinations despite being capable (and undergoing reinforcement learning). Is the root cause of hallucination alignment rather than capabilities?! Maybe the AI gets a better RL reward if it hallucinates (instead of giving less info), because users are unable to catch its mistakes.

1ACCount4mo

This is way more metacognitive skill than what I would have expected an LLM to have. I can make sense of how an LLM would be able to do that, but only in retrospect. And if a modern high end LLM already knows on some level and recognizes its own uncertainty? Could you design a fine tuning pipeline to reduce hallucination level based on that? At least for reasoning models, if not for all of them?

2avturchin4mo

It looks like (based on the article published a few days ago by Anthropic about the microscope) Claude Sonnet was trained to distinguish facts from hallucinations, so it's not surprising that it knows when it hallucinates.

1ACCount3mo

Is the same true for GPT-4o then, which could spot Claude's hallucinations? Might be worth testing a few open source models with better known training processes.

[-]avturchin8mo463

Lifehack: If you're attacked by a group of stray dogs, pretend to throw a stone at them. Each dog will think you're throwing the stone at it and will run away. This has worked for me twice.

[-]Prudhviraj Naidu8mo313

Speaking from experience in Mumbai, just pretending to throw a stone doesn't necessarily work. You have to pretend to pick up a stone and then throw it.

[-]avturchin8mo140

Yes. It is important point.

2Daniel Kokotajlo8mo

Huh. If you pretend to throw the stone, does that mean you make a throwing motion with your arm, but just don't actually release the object you are holding? If so, how come they run away instead of e.g. cringing and expecting to get hit, and then not getting hit, and figuring that you missed and are now out of ammo? Or does it mean you make menacing gestures as if to throw, but don't actually make the whole throwing motion?

6avturchin8mo

As was said above, first you need to pick a stone from the ground or pretend that you are doing this if there is no stone around. Even if you have a stone, make the gesture that you take it from the ground. Another important point is to do it quickly and aggressively with loud cry. Also you can pull back one's arm with a stone. The whole trick is that dogs are so afraid of stones that they will run away before you actually throw it or they see where it fails.

[-]gwern8mo140

Hm. Does that imply that a pack of dogs hunting a human is a stag hunt game?

9avturchin8mo

There are some game theory considerations here: If I throw the stone, all dogs will know that I don't have it anymore, so it would be safe for them to continue the attack (whether I hit one or miss). Therefore, it's better for me to threaten and keep two stones rather than actually throw one. If dogs really want to attack me, they might prefer that I throw the stone so they can attack afterward. However, I think each dog fails to consider that I'm most likely to throw the stone at another dog. Each individual dog has a small chance of being injured by the stone, and they could succeed if they continue the attack. Real hunters like wolves might understand this.

2ChristianKl8mo

The dogs are not hunting humans but want to defend territory or something similar.

2avturchin8mo

The problem is that their understanding of their territory is not the same as our legal understanding, so they can attack on the roads outside their homes.

4ChristianKl8mo

My point is that the behavior is not well modeled as "hunting humans". They don't attack humans with the intent to kill and eat as prey.

8Warty8mo

burning the dog defense commons 😔

7Rob Lucas8mo

When I was trekking in Qinghai my guide suggested we do a hike around a lake on our last day on the way back to town. It was just a nice easy walk around the lake. But there were tibetan nomads (nomadic yak herders, he just referred to them as nomads) living on the shore of the lake, and each family had a lot of dogs (Tibetan Mastiffs as well as a smaller local dog they call "three eyed dogs"). Each time we got near their territory the pack would come out very aggressively. He showed me how to first always have some stones ready, and second when they approached to throw a stone over their head when they got too close. "Don't hit the dogs" he told me, "the owners wouldn't be happy if you hit them, and throwing a stone over their heads will warn them off". When they came he said, "You watch those three, I need to keep an eye on the ones that will sneak up behind us." Each time the dogs used the same strategy. There'd be a few that were really loud and ran up to us aggressively. Then there'd be a couple sneaking up from the opposite side, behind us. It was my job to watch for them and throw a couple of stones in their direction if they got too close. He also made sure to warn me, "If one of them does get to you, protect your throat. If you have to give it a forearm to bite down on instead of letting it get your throat." He had previously shown me the large scar on his arm where he'd used that strategy in the past. When I looked at him sort of shocked he said, "don't worry, it probably won't come to that." At this point I was wondering if maybe we should skip the lake walk, but I did go there for an adventure. Luckily the stone throwing worked, and we were walking on a road with plenty of stones, so it never really got too dangerous. Anyway, +1 to your advice, but also look out for the dogs that are coming up behind you, not just the loud ones that are barking like mad as a distraction.

3Eli Tyre8mo

You have been attacked by a pack of stray dogs twice?!?!

5niplav8mo

Not surprising to me: I've lived in a city with many stray dogs for less than half a year, and got "attacked" ("harrassed" is maybe a better term) by a stray dog twice.

2Ruby8mo

Dog: "Oh ho ho, I've played imaginary fetch before, don't you worry."

2Shankar Sivarajan8mo

Why pretend, and not actually throw a stone? Or is this meant as a feint in case you can't find one lying within reach?

[-]avturchin7mo*360

OpenAI whistleblower found dead in San Francisco apartment.

Suchir Balaji, 26, claimed the company broke copyright law.

[-]Viliam7mo220

Suppose that you are a whistleblower, and you suspect what someone will try to "suicide" you. How can you protect yourself?

If someone wants to murder you, they can. If you ever walk outside, you can't avoid being shot by a sniper. Or a random thug will be paid by a mysterious stranger to stab you. So my question is not "how can you make yourself immortal", but rather "how can you make it so that if you are killed, it will very obviously not be a suicide".

Saying "I have no intention to kill myself, and I suspect that I might be murdered" is not enough.

Wearing a camera that is streaming to a cloud 24/7, and your friends can publish the video in case of your death... seems a bit too much. (Also, it wouldn't protect you e.g. against being poisoned. But I think this is not a typical way how whistleblowers die.) Is there something simpler?

8jimrandomh7mo

You can prevent this by putting a note in some place that isn't public but would be found later, such as a will, that says that any purported suicide note is fake unless it contains a particular password. Unfortunately while this strategy might occasionally reveal a death to have been murder, it doesn't really work as a deterrent; someone who thinks you've done this would make the death look like an accident or medical issue instead.

[-]TsviBT7mo140

You can publish it, including the output of a standard hash function applied to the secret password. "Any real note will contain a preimage of this hash."

2trevor7mo

Your effort must scale to be appropriate to the capabilities of the people trying to remove you from the system. You have to know if they're the type of person who would immediately default to checking the will. More understanding and calibration towards what modern assassination practice you should actually expect is mandatory because you're dealing with people putting some amount of thinkoomph into making your life plans fail, so your cost of survival is determined by what you expect your attack surface looks like. The appropriate-cost and the cost-you-decided-to-pay vary in OOMs depending on the circumstances, particularly the intelligence, resources, and fixations of the attacker. For example, the fact that this happened 2 weeks after assassination got all over the news is a fact that you don't have the privilege of ignoring if you want the answer, even though that particular fact will probably turn out to be unhelpful e.g. because the whole thing was probably just a suicide due to the base rates of disease and accidents and suicide being so god damn high. If this sounds wasteful, it is. It's why our civilization has largely moved past assassination, even though getting-people-out-of-the-way is so instrumentally convergent for humans. We could end up in a cycle where assassination gets popular again after people start excessively standing in each other's way (knowing they won't be killed for it), or a stable cultural state like the Dune books or the John Wick universe and we've just been living in a long trough where elites aren't physically forced to live their entire lives like mob bosses playing chess games against invisible adversaries.

[-]Shankar Sivarajan7mo113

How is this better than stating explicitly that you're not going to commit suicide?

0Seth Herd7mo

People change their minds a lot.

0Shankar Sivarajan7mo

Yes, they do. People also amuse themselves from beyond the grave by arranging for their deaths to look like murders before killing themselves. Or are so overcome by remorse at fabricating lies about their beloved friends to the feds that they encase their feet in concrete and throw themselves into nearby lakes without thinking about how it'd look. Or forget their secret passwords to authenticate their suicide notes and decide it's too much trouble to retrieve it. So sure, I agree there are reasons why a death that strongly looks like murder might still be suicide. But that doesn't address my position that if you can broadcast the message that you have no intention to kill yourself in the clear with perfect authentication, and still not be sufficiently convincing that your imminent death isn't suicide, elaborate schemes with passwords or cryptographic hashes don't do anything.

2Seth Herd7mo

Really they do those things? The concrete? I think it's on a spectrum of likelihood and therefore believability. I wasn't commenting on your message, just what you'd said in that comment. Sure it's better to say it than not. And better yet to do more.

6lc7mo

If the person or people trying to murder you is omnicompetent, then it's hard. If they're regular people, then there are at least lots of temporary measures you can take that would make it more difficult. You can fly to a random state or country and check into a motel without telling anybody where you are. Or you could find a bunch of friends and stay in a basement somewhere. Mobsters used to call doing that sort of thing for a time before a threat had receded "going to ground". If you move to New York or London, your every move outside of a private home or apartment will already be recorded. Then place a security camera in your house.

4avturchin7mo

I will lower the possible incentive of the killers by publishing all I know - and make it in such legal way that it can be used in court even if I am dead (affidavit?)

4lc7mo

Frankly I do think this would work in many jurisdictions. It didn't work for John McAfee because he has a history of crazy remarks, it sounds like the sort of thing he'd do to save face/generate intrigue if he actually did plan on killing himself, and McAfee made no specific accusations. But if you really thought Sam Altman's head of security was going to murder you, you'd probably change their personal risk calculus dramatically by saying that repeatedly on the internet. Just make sure you also contact police specifically with what you know, so that the threat is legible to them as an institution.

1keltan7mo

I may be an outlier here. But if I thought I was going to be assassinated, I would think of: * JFK -MLK * James A. Garfield * Lincoln * Franz Ferdinand And from these I'd think "Hu, better buy a bullet proof vest". I would unfortunately not think about 'Being Suicided', unless I had an expectation that it would occur in this way.

4avturchin7mo

One way of not being suicide is not live alone. Stay with 4 friends.

[-]Ben Pace7mo143

Are there Manifold markets yet on whether this was a suicide and whether it will turn out that this was due to any pressures relating to the OpenAI whistleblowing?

9lc7mo

Tapping the sign:

4John Wiseman7mo

https://www.lesswrong.com/posts/yLFyoYhbhDYtuQWjm/probability-of-death-by-suicide-by-a-26-year-old

3mako yass7mo

All novel information:

1[comment deleted]7mo

1green_leaf7mo

Does anyone have stats on OpenAI whistleblowers and their continued presence in the world of living?

[-]avturchin9mo282

Collapse of mega-project to create AI based on linguistics

ABBYY spent 100 million USD for 30 years to create a model of language using hundreds of linguists. It fails to compete with transformers. This month the project was closed. More in Russian here: https://sysblok.ru/blog/gorkij-urok-abbyy-kak-lingvisty-proigrali-poslednjuju-bitvu-za-nlp/

7gwern9mo

I had no idea ABBYY was so big. I thought it was just some minor OCR or PDF software developer. Interesting to hear about their historical arc. (I am also amused to see my Sutton meme used.)

7cubefox9mo

Thanks, this was an interesting article. The irony of course being that I, not knowing Russian, read it using Google Translate.

2Chris_Leong9mo

What's ABBYY?

4avturchin9mo

ABBYY created Finereader which was one of the best OCR systems.

4Mo Putera9mo

Wikipedia says it's a SaaS company "specializing in AI-powered document processing and automation, data capture, process mining and OCR": https://en.wikipedia.org/wiki/ABBYY

[-]avturchin9mo170

"Bird Flu H5N1: Not Chaos, but Conspiracy?" By Alexander Pruss
Two months ago, I was puzzled how bird flu, potentially capable of killing tens of millions, went rampant on American livestock farms and began infecting workers, yet no urgent measures were being taken. Even standard epidemiological threat monitoring was happening unsystematically, with months-long delays, and results weren't being made public for months afterward. What happened to the bitter lessons from the coronavirus pandemic? Why such chaos? Since then, the sense of criminal inaction has only intensified. Missouri discovered the first outbreak of human cases unrelated to farm workers, but molecular testing was neglected and infection paths remained undiscovered.

In California, a more pathogenic variant of bird flu spread to hundreds of dairy farms, reportedly killing up to 15% of cows, with almost daily new cases of virus transmission to humans. The virus apparently came to California through cattle transportation from Idaho, despite belatedly introduced rules formally prohibiting the transport of infected cows across state lines. The problem was that infection in transported cows was checked through sel... (read more)

4Viliam9mo

Sounds similar to the kind of logic that makes salmonellosis 10x more frequent in America than in Europe. On one hand, yes, the optimal number of people dying from farm-produced diseases is greater then zero, and overreaction could cause net harm. On the other hand, it feels like the final decision should be made in some way better than "the farmers lobby declares the topic taboo, and enforces the taboo across the nation", because the one-sided incentives are obvious.

2avturchin9mo

Also, bird flu is an international risk and other countries may sue US if it fails to prevent virus' evolution in obviously foreseeable way.

[-]avturchin1y11-1

Roman Mazurenko is dead again. First resurrected person, Roman lived as a chatbot (2016-2024) created based on his conversations with his fiancé. You might even be able download him as an app.

But not any more. His fiancé married again and her startup http://Replika.ai pivoted from resurrection help to AI-girlfriends and psychological consulting.

It looks like they quietly removed Roman Mazurenko app from public access. It is especially pity that his digital twin lived less than his biological original, who died at 32. Especially now when we have much more powerful instruments for creating semi-uploads based on LLMs with large prompt window.

4Raemon1y

I hadn't known Replika started out with this goal. Interesting. Not exactly the main point, but I'd probably clock this in terms of number of conversational inputs/outputs (across all users). Which might still imply "living less long"*, but less so than if you're just looking at wallclock time. *also obviously an oldschool chatbot doesn't actually count as "living" in actually meaningful senses. I think modern LLMs might plausibly.

3avturchin1y

Yes, they can do now a much better version - and hope they will do it internally. But deleting the public version is bad precedent and better to make all personal sideloads opensourced

2Raemon1y

Uh I do think it's not obviously good (and, in fact, I'd lean bad) to be opensourced for this sort of thing.

[-]avturchin3y110

Igor Kiriluk (1974-2022)

Igor was an organiser the first meet-up in Moscow about effective altruism around 2013. Today his body was found at his home. The day before he complained about depression and bad health. His cryopreservation now is being organised.

He was also a one of four organisers of Russian Transhumanist Movement, along with Danila Medvedev, Valeria Pride and Igor Artuhov around 2003.

His main topic of interest was paradise-engineering. He translated works of David Pearce.

He may look detached from reality but he was first to react on new ideas and has very large network of friends everywhere: between visionaries, scientists and officials. Being a great networker, he helped many people to find each other, especially in the field of life extension.

His FB page: https://www.facebook.com/igor.kirilyuk.3

[-]avturchin1y100

I am building my sideload via recursively correcting of 1-million-tokens prompt for large LLM. The prompt consists of 500 rules which describe my personality, similar to personal constitution, and of some texts, like diaries, abstracts, poetry, stream of thoughts etc. Works on Google Gemini 1M through Google AI studio, and the shorter version works great on Opus. The system also includes a universal "loader prompt" which tries to increase the intelligence of the model and describes how the chatbot should work.

I found that sideloading allows very quick iterations in the sideload's improvements and the improvements are two-fold: of the loader itself and improvements of the knowledge and style of the sideload.

I find that my sideload is surprisingly good for a project which took around 1 month of work. 1 of the 5 answers is exactly like mine from a factual and style point of view.

I am open-sourcing my sideload, anyone can run it https://github.com/avturchin/minduploading/tree/main

I can help anyone interested to build his-her own sideload.

Example of work of the chatbot, no cherry picking:

Q:(now speak in english) what will be your next post in Lesswro... (read more)

[-]avturchin4y90

New b.1.640.2 variant in France. More deadly than delta. 952 cases of which 315 on ventilator.

https://www.thailandmedical.news/news/breaking-updates-on-new-b-1-640-2-variant-spreading-in-southern-france-number-of-cases-growing-and-variant-now-detected-in-united-kingdom-as-well

https://flutrackers.com/forum/forum/europe-aj/europe-covid-19-sept-13-2020-may-31-2021/933598-southern-france-reports-of-new-variant-with-46-mutations

[-]avturchin1y7-11

ChatGPT 4.5 is on preview at https://chat.lmsys.org/ under name gpt-2.

It calls itself ChatGPT 2.0 in a text art drawing https://twitter.com/turchin/status/1785015421688799492

[-]gwern1y115

https://rentry.org/GPT2

I ran out of tokens quickly trying out poetry but I didn't get the impression that this is a big leap over GPT-4 like GPT-5 presumably is designed to be. (It could, I suppose, be a half-baked GPT-5 similar to 'Prometheus' for GPT-4.) My overall impression from poetry was that it was a GPT-4 which isn't as RLHF-damaged as usual, and more like Claude in having a RLAIF-y creative style. So I could believe it's a better GPT-4 where they are experimenting with new tuning/personality to reduce the ChatGPT-bureaucratese.

HN: https://news.ycombinator.com/item?id=40199715

4avturchin1y

It failed my favorite test: draw a world map in text art.

[-]peterbarnett1y112

Related market on Manifold:

9metachirality1y

We don't actually know if it's GPT 4.5 for sure. It could be an alternative training run that preceded the current version of ChatGPT 4 or even a different model entirely.

2faul_sname1y

It might be informative to try to figure out when its knowledge cutoff is (right now I can't do so, as it's at it's rate limit).

3O O1y

https://rentry.org/gpt2 Rumored to be 11-2023

2avturchin1y

It claims to have knowledge cutoff as of Nov 2023, but failed to tell what happened on October 7 and hallucinated.

5bruberu1y

By using @Sergii's list reversal benchmark, it seems that this model seems to fail reversing a list of 10 random numbers from 1-10 from random.org about half the time. This is compared to GPT-4's supposed ability to reverse lists of 20 numbers fairly well, and ChatGPT 3.5 seemed to have no trouble itself, although since it isn't a base model, this comparison could potentially be invalid. This does significantly update me towards believing that this is probably not better than GPT-4.

3O O1y

Seems correct to me (and it did work for a handful of 10 int lists I manually came up with). More impressively, it does this correctly as well:

7bruberu1y

OK, what I actually did was not realize that the link provided did not link directly to gpt2-chatbot (instead, the front page just compares two random chatbots from a list). After figuring that out, I reran my tests; it was able to do 20, 40, and 100 numbers perfectly. I've retracted my previous comments.

5bruberu1y

As for one more test, it was rather close on reversing 400 numbers: Given these results, it seems pretty obvious that this is a rather advanced model (although Claude Opus was able to do it perfectly, so it may not be SOTA). Going back to the original question of where this model came from, I have trouble putting the chance of this necessarily coming from OpenAI above 50%, mainly due to questions about how exactly this was publicized. It seems to be a strange choice to release an unannounced model in Chatbot Arena, especially without any sort of associated update on GitHub for the model (which would be in https://github.com/lm-sys/FastChat/blob/851ef88a4c2a5dd5fa3bcadd9150f4a1f9e84af1/fastchat/model/model_registry.py#L228 ). However, I think I still have some pretty large error margins, given how little information I can really find.

7gwern1y

Nah, it's just a PR stunt. Remember when DeepMind released AlphaGo Master by simply running a 'Magister' Go player online which went undefeated?* Everyone knew it was DeepMind simply because who else could it be? And IIRC, didn't OA also pilot OA5 'anonymously' on DoTA2 ladders? Or how about when Mistral released torrents? (If they had really wanted a blind test, they wouldn't've called it "gpt2", or they could've just rolled it out to a subset of ChatGPT users, who would have no way of knowing the model underneath the interface had been swapped out.) * One downside of that covert testing: DM AFAIK never released a paper on AG Master, or all the complicated & interesting things they were trying before they hit upon the AlphaZero approach.

1bruberu1y

Interesting; maybe it's an artifact of how we formatted our questions? Or, potentially, the training samples with larger ranges of numbers were higher quality? You could try it like how I did in this failing example: When I tried this same list with your prompt, both responses were incorrect:

1p.b.1y

I tried some chess but's it's still pretty bad. Not noticeably better GPT4.

[-]avturchin2y60

H5N1 https://www.khmertimeskh.com/501244375/after-death-of-girl-yesterday-12-more-detected-with-h5n1-bird-flu/

2Vladimir_Nesov2y

The relevant Metaculus question is at 27% on human-to-human transmission in 2023, has this event mentioned in the comments (though I think without the "found 12 more people infected" part), didn't move much.

2avturchin2y

Exactly the fact that 12 more people are infected make me to post. Single infections are not surprising. However, there is an analog of LessWrong but for pandemic flu, called Flutrackers, and they found more details: there are many dead birds in the area and all 15 birds in her home has died. https://flutrackers.com/forum/forum/cambodia/cambodia-h5n1-tracking/968975-cambodia-death-of-11-yr-old-female-in-prey-veng-province-h5n1-avian-flu-february-22-2023/page2#post969072 This could mean that all people infected from birds, not from each other. Also, some think that "12" is the number of contacts, not infected, and therefore symptoms in 4 people maybe not from avian flu. Anyway, the health ministry will provide update tomorrow.

[-]avturchin2mo50

Immortality and identity.
https://philpapers.org/rec/TURIAI-3
Abstract:
We need understanding of personal identity to develop radical life extension technologies: mind uploading, cryonics, digital immortality, and quantum (big world) immortality. A tentative solution is needed now, due to the opportunity cost of delaying indirect digital immortality and cryonics.

The main dichotomy in views on personal identity and copies can be presented as: either my copy = original or a soul exists. In other words, some non-informational identity carrier (NIIC) may ex... (read more)

2Dagon2mo

I think #4 is quite powerful. "identity" means many different things, and we haven't had to distinguish them before, so many don't even realize when they change topics. Legal identity is likely quite distinct from any given continuity or branch/merge of memory. Memory identity and future-causality identity will eventually be distinct. Qualia would need to be measured before we could talk about experiential identity, but it won't surprise me if we decide it's different from either past continuity or future expected merges. One nice side effect of these understandings (when we get to them) is it will answer age-old questions of harm under amnesiac drugs and a much better model of identity over long sequences of life/personality changes.

2avturchin2mo

Yes. Identity is a type of change which preserves some sameness. (Exact sameness can't be human identity as only dead frozen body remains the same.) From this follows that there can be several types of identity.

[-]avturchin3mo*50

Most LLMs' replies can be improved by repeatedly asking "Improve the answer above" and it is similar to the test-time compute idea and diffusion.

In most cases, I can get better answers from LLMs just by asking "Improve the answer above."

In my experience, the improvements are observable for around 5 cycles, but after that the result either stops improving or gets stuck in some error mode and can't jump to a new level of thinking. My typical test subject: "draw a world map as text art." In good improvement sessions with Sonnet, it eventually adds grids... (read more)

4Viliam3mo

I have achieved higher quality answers by using the magical words: "give me multiple options, then compare them and choose the best one". But next time I will try to iterate the best one -- maybe something like "suggest five improvements to the option above, and choose the best one".

2avturchin3mo

Yes, great variant of the universal answer-improving prompt and it can be applied several times to any content.

[-]avturchin1y50

Several types of existential risks can be called "qualia catastrophes":

- Qualia disappear for everyone = all become p-zombies

- Pain qualia are ubiquitous = s-risks

- Addictive qualia domminate = hedonium, global wireheading

- Qualia thin out = fading qualia, mind automatisation

- Qualia are unstable = dancing qualia, identity is unstable.

- Qualia shift = emergence of non-human qualia (humans disappear).

- Qualia simplification = disappearance of subtle or valuable qualia (valuable things disappear).

- Transcendental and objectless qualia with hypnotic power enslave humans (God as qualia; Zair). -

- Attention depletion (ADHD)

[-]avturchin3y50

We maybe one prompt from AGI. A hypothesis: carefully designed prompt could turn foundational model into full-blown AGI, but we just don't know which prompt.

Example: step-by-step reasoning in prompt increases foundational models' performance.

But real AGI-prompt needs to have memory, so it has to repeat itself while adding some new information. So by running serially, the model may accumulate knowledge inside the prompt.

Most of my thinking looks this way from inside: I have a prompt - an article headline and some other inputs - and generate most plausible continuations.

[-]avturchin5y50

Age and dates of death on the cruise ship Diamond Princess:
Age:
4 people - 80s
1 person 78
1 person 70s
1 person - no data
Dates of deaths: 20, 20, 23, 25, 28, 28, 1 march. One death every 1.3 days. Look like acceleration at the end of the period.
Background death probability: for 80-year-old person, life expectancy is around 8 years or around 100 months. This means that for 1000 people aged late 70s-80s there will be 10 deaths just because of aging and stress. Based on the aging distribution on cruise ships, there were many old people. if half of the infected a... (read more)

[-]avturchin6y50

Kardashev – the creator of the Kardashev's scale of civilizations – has died at 87. Here is his last video, which I recorded in May 2019. He spoke about the possibility of SETI via wormholes.

3Ben Pace6y

Here's his wikipedia page.

[-]avturchin3mo40

If the simulation argument is valid and dreams are simulations of reality, can we apply the simulation argument to dreams? If not, is this an argument against the simulation argument? If yes, why am I not now in a dream?

If I see something, is it more likely to be dream or reality?
Sleeping takes only one-third of my time, and REM takes even less.
But:

Some dreams occur even in other phases of sleep
Dreams are much more eventful than normal life. There is always something happening. Also, the distribution of events in dreams is skewed toward expensive, dangerou

... (read more)

[-]avturchin2y40

EURISKO resurfaced

"Doug Lenat's source code for AM and EURISKO (+Traveller?) found in public archives

In the 1970s to early 80s, these two AI programs by Douglas Lenat pulled off quite the feat of autonomously making interesting discoveries in conceptual spaces. AM rediscovered mathematical concepts like prime numbers from only first principles of set theory. EURISKO expanded AM's generality beyond fixed mathematical heuristics, made leaps in the new field of VLSI design, and famously was used to create wild strategies for the Traveller space combat R... (read more)

[-]avturchin3y40

Argentina - Outbreak of bilateral pneumonia: Approximately 10 cases, 3 deaths, 20 under observation, Tucumán - September 1, 2022 https://flutrackers.com/forum/forum/south-america/pneumonia-and-influenza-like-illnesses-ili-af/argentina-ab/957860-argentina-outbreak-of-bilateral-pneumonia-approximately-10-cases-3-deaths-20-under-observation-tucum%C3%A1n-september-1-2022

[-]avturchin3y40

Passways to AI infrastructure
Obviously, the current infrastructure is not automated enough to run without humans. All ideas about AI risk eventually boil down to a few suggestions on how AI will create its own infrastructure:

No-humans scenarios:
- create nanobots via mailing DNA samples to some humans.
- use some biological tricks, like remote control animals, and programmed bacteria.
- build large manufacturing robots, maybe even humanoid ones to work in human-adapted workplaces. Build robots which build robots.

Humans-remain scenarios:
- enslave some humans, ... (read more)

4lc3y

Your non-humans scenarios are not mutually exclusive; if mailing DNA samples doesn't work in practice for whatever reason, the manufacturing facilities that would be used to make large manufacturing robots would suffice. You probably shouldn't conflate both scenarios.

[-]avturchin4y40

Observable consequences of simulation:

1. Larger chances of miracles or hacks

2. Large chances of simulation’s turn off or of a global catastrophe

3. I am more likely to play a special role or to live in interesting times

4. A possibility of afterlife.

4Gunnar_Zarncke4y

Scott Adams mentioned a few times that a simulation might use caching and reuse patterns for efficiency reasons and you could observe an unusually high frequency of the same story. I don't buy that but it is at least a variant of type 1.

4avturchin4y

Yes, people often mentioned Baader–Meinhof phenomenon as a evidence that we live in "matrix". But it could be explained naturally.

3MackGopherSena4y

[edited]

2avturchin4y

Anthropics imply that I should be special, as I should be "qualified observer", capable to think about anthropics. Simulations also requires that I should be special, as I should find myself living in interesting times. These specialities are similar, but not exactly. Simulation's speciality is requiring that I will be a "king" in some sense, and anthropic speciality will be satisfied that I just understand anthropics. I am not a very special person (as of now), therefore anthropics specialty seems to be more likely than simulation speciality.

3MackGopherSena4y

[edited]

2avturchin4y

Who "we" ? :) Saying a "king" I just illustrated the difference between interesting character who are more likely to be simulated in a game or in a research simulation, and "qualified observer" selected by anthropics. But these two sets clearly intersects, especially of we live in a game about "saving the world".

[-]avturchin4y40

Catching Treacherous Turn: A Model of the Multilevel AI Boxing

Multilevel defense in AI boxing could have a significant probability of success if AI is used a limited number of times and with limited level of intelligence.
AI boxing could consist of 4 main levels of defense, the same way as a nuclear plant: passive safety by design, active monitoring of the chain reaction, escape barriers and remote mitigation measures.
The main instruments of the AI boxing are catching the moment of the “treacherous turn”, limiting AI’s capabilities, and preventi

... (read more)

[-]avturchin5y40

Two types of Occam' razor:

1) The simplest explanation is the most probable, so the distribution of probabilities for hypotheses looks like: 0.75, 0.12, 0.04 .... if hypothesis are ordered from simplest to more complex.

2) The simplest explanation is the just more probable, so the distribution of probabilities for hypotheses looks like: 0.09, 0.07, 0.06, 0.05.

The interesting feature of the second type is that simplest explanation is more likely to be wrong than right (its probability is less than 0.5).

Different types of Occam razor are applicable in d... (read more)

2Matt Goldenberg5y

I'm struggling to think of a situation where on priors (with no other information), I expect the simplest explanation to be more likely than all other situations combined (including the simplest explanation with a tiny nuance). Can you give an example of #1?

2avturchin5y

EY suggested (if I remember correctly) that MWI interpretation of quantum mechanics is true as it is simplest explanation. There are around hundred other more complex interpretations of QM. Thus, in his interpretation, P(MWI) is more than a sum of probabilities of all other interpretations.

1TAG5y

MWI is more than one theory, because everything is more than one thing. There is an approach based on coherent superpositions, and a version based on decoherence. These are incompatible opposites. How simple a version of MWI is, depends on how it deals with all the issues, including the basis problem.

1TAG5y

What does "all the other explanation s combined" mean as ontology? If they make statements about reality that are mutually incompatible, then they cant all be true.

2avturchin5y

It means that p(one of them is true) is more than p(simplest explanation is true)

1TAG5y

That doesn't answer my question as stated ... I asked about ontology, you answered about probability. If a list of theories is exhaustive, which is s big "if", then one of them is true. And in the continuing absence of a really good explanation of Occams Razor, it doesn't have to be the simplest. But that doesn't address the issue of summing theories, as opposed to summing probabilities.

2Matt Goldenberg5y

But "all the other explanations combined" was talking about the probabilities. We're not combining the explanations, that wouldn't make any sense. The only ontology that is required is Bayesianism, where explanations can have probabilities of being correct.

1TAG5y

Bayesianism isn't an ontology.

2Matt Goldenberg5y

Ok, tabooing the word ontology here. All that's needed is an understanding of Bayesianism to answer the question of how you combine the chance of all other explanations.

[-]avturchin5y40

Some random ideas how to make GPT-base AI safer.

1) Scaffolding: use rule-based AI to check every solution provided by GPT part. It could work for computations or self-driving or robotics, but not against elaborated adversarial plots.

2) Many instances. Run GPT several times and choose random or best answer - we already doing this. Run several instances of GPT with different parameters or different training base and compare answers. Run different prompt. Median output seems to be a Shelling point around truth, and outstanding answers are more likely to be wr... (read more)

[-]avturchin2y30

Reflectivity in alignment.

Human values and AI alignment do not exist independently. There are several situations when they affect each other, creating complex reflection pattern.

Examples:

Humans want to align AI – so "AI alignment" is itself human value.
Human values are convergent goals (like survival and reproduction) - and thus are similar to AI's convergent goals.
If humans accept the idea to make paperclips (or whatever), alignment will be reached.
It looks like many humans want to create non-aligned AI. Thus non-aligned AI is aligned.
Humans may not

... (read more)

[-]avturchin2y30

Can we utilize meaningful embedding dimensions as an alignment tool?

In toy models, embedding dimensions are meaningful and can represent features such as height, home, or feline. However, in large-scale real-world models, many (like 4096) dimensions are generated automatically, and their meanings remain unknown, hindering interpretability.

I propose the creation of a standardized set of embedding dimensions that: a) correspond to a known list of features, and b) incorporate critical dimensions such as deception, risk, alignment, and non-desirable content, i... (read more)

2avturchin1y

Anthropic did opposite thing https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

[-]avturchin6y30

I converted by Immortality roadmap into an article Multilevel Strategy for Personal Immortality: Plan A – Fighting Aging, Plan B – Cryonics, Plan C – Digital Immortality, Plan D – Big World Immortality.

[-]avturchin2mo20

The main AI safety risk is not from LLM models, but from specific prompts and the following "chat windows" and specific agents which start from such prompts.

Moreover, a powerful enough prompt may be model-agnostic. For example, my sideloading prompt is around 200K tokens in its minimal version and works on most models, producing similar results in similarly intelligent models.

Self-evolving prompt can be written; I experimented with small versions, and it works.

[-]avturchin1y20

I have interesting experience long time ago. In the near-sleep state my consciousness split in two streams - one was some hypnogogic images, and the other was some hypnogogic music.

They was not related to each other and each had, some how, its own observer.

A moment later something awakened me a bit and the streams seamlessly merged and I was able to observe that a moment before I had two independent streams of consciousness.

Conclusions:

1. A human can have more than one consciousness at the time.

2. It actually happens all the time but we don't care.

3. Mergi... (read more)

7Carl Feynman1y

Why should we accept as evidence something that you perceived while you were dreaming? Last night I dreamed that I was walking barefoot through the snow, but it wasn’t cold because it was summer snow. I assume you don’t take that as evidence that warm snow is an actual summer phenomenon, so why should we take as evidence your memory of having two consciousnesses? It seems to me that a correctly organized consciousness would occur once per body. Consciousness is (at least in part) a system for controlling our actions in the medium and long term. If we had two consciousnesses, and they disagree as to what to do next, it would result in paralysis. And if they agree, then one of them is superfluous, and we’d expend less brain energy if we only had one.

2avturchin1y

I was not dreaming. I was observing my hypnagogic images, which is not the same as dreaming; and when streams merged I become completely awake. However, after I know what is it, I can observe similar thing again. The receipt is following: 1. do two different unrelated things which require conscious attention but happen in different modalities, audio and video 2. increase the wideness of attention and observe that you just had two streams of more narrow attention. The closest thing in everyday life is "driver amnesia" - the situation when a car driver is splitting attention between driving and conversation.

1JBlack1y

Conscious experience is direct evidence of itself. It is only very indirectly evidence of anything about external reality. However, I do agree that memory of conscious experience isn't quite so directly evidence of previous states of consciousness. Personally of the numbered claims in the post I expect that (1) is true, (2) is false and this experience was not evidence of it, and I really don't know what (3) and subsequent sentences are supposed to mean.

[-]avturchin2y20

I have had tetrachromotomic experience with one mind machine which flickers different colors in different eyes. It overflows some stacks in the brain in create new colors.

[-]avturchin3y20

List of cognitive biases affecting judgment of global risks https://www.researchgate.net/publication/366862337_List_of_cognitive_biases_affecting_judgment_of_global_risks/related

[-]avturchin3y20

Grabby aliens without red dwarfs

Grabby aliens theory of Robin Hanson predicts that the nearest grabby aliens are 1 billion light years away but strongly depends on the habitability of red dwarfs (https://grabbyaliens.com/paper).

In the post, the author combines anthropic and Fermi, that is, the idea that we live in the universe with the highest concentration of aliens, limited by their invisibility, and get an estimation of around 100 "potentially visible" civilizations per observable universe, which at first approximation gives 1 billion ly distance b... (read more)

[-]avturchin3y20

N-back hack. (Infohazard!)
There is a way to increase one's performance in N-back, but it is almost cheating and N- back will stop to be a measure of one's short-term memory.
The idea is to imagine writing all the numbers on a chalkboard in a row, as they are coming.
Like 3, 7, 19, 23.
After that, you just read the needed number from the string, which is located N positions back.
You don't need to have a very strong visual memory or imagination to get a boost in your N-back results.
I tried it a couple of times and get bored with N-back.

2Dagon3y

Wow. It's rare that I'm surprised by the variance in internal mental imagery among people, but this one caught me. I'd assumed that most people who have this style of imagination/memory were ALREADY doing this. I don't know how to remember things without a (mental) visualization.

4avturchin3y

Actually, my mental imagination is of low quality, but visual remembering is better than audio for me in n-back

[-]avturchin3y20

AI safety as Grey Goo in disguise.
First, a rather obvious observation: while the Terminator movie pretends to display AI risk, it actually plays with fears of nuclear war – remember that explosion which destroys children's playground?

EY came to the realisation of AI risk after a period than he had worried more about grey goo (circa 1999) – unstoppable replication of nanorobots which will eat all biological matter, – as was revealed in a recent post about possible failures of EY's predictions. While his focus moved from grey goo to AI, the... (read more)

4Dagon3y

It's worth exploring exactly which resources are under competition. Humans have killed orders of magnitude more ants than Neanderthals, but the overlap in resources is much less complete for ants, so they've survived. Grey-goo-like scenarios are scary because resource contention is 100% - there is nothing humans want/need that the goo doesn't want/need, in ways that are exclusive to human existence. We just don't know how much resource-use overlap there will be between AI and humans (or some subset of humans), and fast-takeoff is a little more worrisome because there's far less opportunity to find areas of compromise (where the AI values human cooperation enough to leave some resources to us).

[-]avturchin5y20

Glitch in the Matrix: Urban Legend or Evidence of the Simulation? The article is here: https://philpapers.org/rec/TURGIT
In the last decade, an urban legend about “glitches in the matrix” has become popular. As it is typical for urban legends, there is no evidence for most such stories, and the phenomenon could be explained as resulting from hoaxes, creepypasta, coincidence, and different forms of cognitive bias. In addition, the folk understanding of probability does not bear much resemblance to actual probability distributions, resulting in the illusion o... (read more)

[-]avturchin5y20

"Back to the Future: Curing Past Suffering and S-Risks via Indexical Uncertainty"

I uploaded the draft of my article about curing past sufferings.

Abstract:

The long unbearable sufferings in the past and agonies experienced in some future timelines in which a malevolent AI could torture people for some idiosyncratic reasons (s-risks) is a significant moral problem. Such events either already happened or will happen in causally disconnected regions of the multiverse and thus it seems unlikely that we can do anything about it. However, at least one pure theoret... (read more)

3superads913y

I don't see how this can be possible. One of the few things that I'm certain are impossible is eliminating past experiences. I've just finished eating strawberries, I don't see any possible way to eliminate the experience that I just had. You can delete my memory of it, or you can travel to the past and steal the strawberries from me, but then you'd just create an alternate timeline (if time travel to the past is possible, which I doubt). In none of both cases would you have eliminated my experience, at most you can make me forget it. The proof that this is impossible is that people have suffered horrible many times before, and have survived to confirm that no one saved them.

2avturchin3y

We can dilute past experience and break chains of experience, so each painful moment becomes just a small speck in paradise. The argument about people who survived and remember past sufferings is not working here as it is only one of infinitely many chains of experiences (in this model) which for any person has very small subjective probability. In the same sense, everyone who became billionaire, has memories that he was always good in business. But if we take a random person from the past, his most probable future is to be poor, not a billionaire. In the model discussed in the article I suggest the way how to change expected future for any past person – by creating many simulations where her life is improving starting form each painful moment of her real life.

1superads913y

Or are you telling me that person x remembers a very bad chain of experience, but might have indeed been saved by the Friendly AI, and the memory is now false? That's interesting, but still impossible imo.

2avturchin3y

This is not what I meant. Imagine a situation when a person waits a execution in a remote fortress. If we use self sampling assumption, SSA, we could save him, if we create 1000 his exact copies in safe location. SSA tells us that one should reason if he is randomly selected from all of his copies. 1000 copies are in safe location and 1 is in fortress. So the person has 1000 to 1 chance to be out of the fortress, according to SSA. It means that he was saved from the fortress. This situation is called indexical uncertainty. Now we apply this method of saving to the past observer-moments when people were suffering.

1superads913y

I see. Like I explain in the other comment that I just wrote, I don't believe SSA works. You would just create 1000 new minds who would feel themselves saved and would kiss your feet (1000 clones), but the original person would still be executed with 100% chance.

2avturchin3y

It comes with cost: you have to assume that SSA and informational identity theory are wrong, and therefore some other weird things could turn true.

3superads913y

Indexical uncertainty implies that consciousness can travel through space and time in between equal substrates (if such thing even exists considering chaos theory). I think that's a lot weirder than to simply assume that consciousness is rooted in the brain, in a single brain, and that at best a clone will feel exactly the same way you do, will even think he is you, but there's no way you will be seeing through his eyes. So yes, memory may not be everything. An amnesiac can still maintain a continuous personal identity, as long as he's not an extreme case. But I quite like your papers btw! Lots of interesting stuff.

2avturchin3y

Thanks! Consciousness does not need to travel as it already there. Imagine two bottles with water. If one bootle is destroyed, the water remains in the other, it doesn't need to travel. Someone suggested to call this "unification theory of identity".

1superads913y

"The argument about people who survived and remember past sufferings is not working here as it is only one of infinitely many chains of experiences (in this model) which for any person has very small subjective probability." Then I think you would only be creating an enormous number of new minds. Among all those minds, indeed, very few would have gone through a very bad chain of experience. But that doesn't mean that SOME would. In fact, you haven't reduced that number (the number of minds who have gone through a very bad chain of experience). You only reduced their percentage among all existing minds, by creating a huge number of new minds without a very bad chain of experience. But that doesn't in any way negate the existence of the minds who have gone through a very bad chain of experience. I mean, you can't outdo chains of past experience, that's just impossible. You can't outdo the past. You can go back in time and create new timelines, but that is just creating new minds. Nothing will ever outdo the fact that person x experienced chain of experience y.

2avturchin3y

It depends on the nature of our assumption about the role of continuity in human identity. If we assume that continuity is based only on remembering the past moment, then we can start new chains from any moment we chose. Alternative view is that continuity of identity is based on causal connection or qualia connection. This view comes with ontological costs, close to the idea of the existence of immaterial soul. Such soul could be "saved" from the past using some technological tricks, and we again have some instruments to cure past sufferings.

1superads913y

If I instantly cloned you right now, your clone would experience the continuity of your identity, but so would you. You can double the continuity (create new minds, which become independent from each other after doubling), but not translocate it. If I clone myself and then kill myself, I would have created a new person with a copy of my identity, but the original copy, the original consciousness, still ceases to exist. Likewise, if you create 1000 paradises for each second of agony, you will create 1000 new minds which will feel themselves "saved", but you won't save the original copy. The original copy is still in hell. Our best option is to do everything possible not to bring uncontrollable new technologies into existence until they are provably safe, and meanwhile we can eliminate all future suffering by eliminating all conscious beings' ability to suffer, á la David Pearce (abolitionist project).

1MackGopherSena3y

[edited]

2avturchin3y

Extremely large number, if we do not use some simplification methods. I discuss these methods in the article, and after them, the task become computable. Without such tricks, it will be like 100 life histories for every second of sufferings. But as we care only about preventing very strong sufferings, then for normal people living normal life there are not that many such seconds. For example, if a person is dying in fire, it is like 10 minutes of agony, that is 600 seconds and 60 000 life histories which need to be simulated. It is doable task for a future superinteligent AI.

1MackGopherSena3y

[edited]

2avturchin3y

why? if there is 60 000 futures where I escaped a bad outcome, I can bet on it as 1 to 60 000.

1MackGopherSena3y

[edited]

2avturchin3y

I don't get how you come to 10power51. if we want to save from the past 10 billion people and for each we need to run 10power5 simulations, it is only 10power15, which one Внящт sphere will do. However, there is way to acausaly distribute computations between many superintelligence in different universes and it that case we can simulate all possible observers.

1MackGopherSena3y

[edited]

1superads913y

"The fact that you're living a bearable life right now suggests that this is already the state." Interesting remark... Could you elaborate?

1MackGopherSena3y

[edited]

1superads913y

Still don't know what you meant by that other sentence. What's being "the state", and what does a bearable life have do to with it? And what's the "e" in (100/e)%?

[-]avturchin5y20

Quantum immortality of the second type. Classical theory of QI is based on the idea that all possible futures of a given observer do exist because of MWI and thus there will be always a future where he will not die in the next moment, even in the most dangerous situations (e.g. Russian roulette).

QI of the second type makes similar claims but about past. In MWI the same observer could appear via different past histories.

The main claim of QI-2: for any given observer there is a past history where current dangerous situation is not really dangerous. For... (read more)

1superads913y

Hello again Alexey, I have been thinking about QI/BWI and just read your paper on it. Immediately, it occurred to me that it could be disproven through general anesthesia, or temporary death (the heart stops and you become unconscious, which can last for hours). You refute this with: "Some suggested counterargument to QI of “impossibility of sleep”: QI-style logic implies that it is impossible to fail asleep, as in the moment of becoming asleep there will be timelines where I am still awake. However, for most humans, night dreaming starts immediately at the moment of becoming asleep, so the observations continue, but just don’t form memories. But in case of deep narcosis, the argument may be still valid with terrifying perspective of anesthesia awareness; but it also possible if the observer-states will coincide at the beginning the end of the operation, the observer will “jump” over it." (Mind you that some stages of sleep are dreamless, but let's forget about sleep, let's use general anesthesia instead since it's more clear.) I still don't understand your refute completely. If QI/BWI were true, shouldn't it be that general anesthesia would be impossible, since the observer would always branch into conscious states right after being given the anesthesia? Or do you mean to say that most observers will "prefer" to branch into the branch with the "highest measure of consciousness", and that's why anesthesia will "work" for most observers, that is, most observers will branch into the end of the operation, where consciousness is stronger, instead of branching into the second right after anesthesia where consciousness is weaker? Another objection I have against QI/BWI is that it breaks the laws of physics and biology. Even if MWI is true, the body can only sustain a limited amount of damage before dying. It's biologically impossible to go on decaying and decaying for eternity. Eventually, you die. A bit like in Zeno's Paradox: there's always a halfway point between

2avturchin3y

Actually, I see now that I didn't completely refuted the "impossibility of sleep", as it is unobservable for the past events or in the experience of other people. It only can happen with me in the future. Therefore, the fact that I have slept normally in the past didn't tell much about the validity of QI. But my evening today may be different. QI said that my next observer-moment will be most likely the one with highest measure of those which remember my current OM. (But it is less clear, does it need to be connected via continuity of consciousness, or memory continuity is enough). OM(T+1) = maxmeasure(O(memory about O(t)) During narcosis, a few last OM moments typically are erased from memory, so situation becomes complicated. But we have dead-end observer-moments rather often in normal life. Anastasia awareness is a possible outcome here, but not that bad, as it will be partial, so no real pain and no memories about will be form. Personally, I have some rudimentary consciousness all night, like bleak dreams, and forget almost all of them except a few last minutes. -- Speaking about survival in rare cases, there is always a chance that you are in a simulation and it is increasing as real "you" are dying out. Some simulations may simulate all types of miracles. In other words, if you are falling from a kilometer cliff, an alien spaceship can peak you up.

1superads913y

"Actually, I see now that I didn't completely refuted the "impossibility of sleep", as it is unobservable for the past events or in the experience of other people. It only can happen with me in the future. Therefore, the fact that I have slept normally in the past didn't tell much about the validity of QI. But my evening today may be different." Agree. On anesthesia, so, from what I understand, it becomes possible for the observer to "jump over", because the moment right after he awakes from anesthesia has probably much more measure of consciousness than any moment right after the anesthesia takes effect, is that it? Why would anesthesia awareness be partial/painless? (There are actually reported cases of real anesthesia awareness where people are totally consciousness and feel everything, though of course they are always correlated to innefective anesthesia and not to quantum matters). Would that also make us believe that maybe quantum immortality after the first death is probably painless since the measure of the observer is too low to feel pain (and perhaps even most other sensations)? "Speaking about survival in rare cases, there is always a chance that you are in a simulation and it is increasing as real "you" are dying out." What is increasing? Sorry didn't quite understand the wording.

2avturchin3y

It is known that some painkillers don't kill the pain but kill only the negative valence of pain. This I meant by "partial". Anaesthesia awareness seems to be an extreme case when the whole duration of awareness is remembered. Probably weaker forms are possible but are not reported as there is no memories or pain. The difference between death and the impossibility of sleep is that the biggest number of my future copies remain in the same world. Because of that, the past instances of quantum suicide could be remembered, but past instances of the impossibility of sleep - not. If we look deeper, there are two personal identities and two immortalities: the immortality of the chains on observer-moments and immortality of my long-term memory. Quantum immortality works for both. In the impossibility of sleep, these two types of immortality diverge. But eternal insomnia seems not possible, as dreaming exists. The worst outcome is anaesthesia awareness. If a person has past cases of strong anaesthesia awareness - could it be evidence of the impossibility of sleep for him? Interesting question. --- I meant: "Speaking about survival in rare cases, there is always a chance that you are in a simulation which simulates your immortality. These chances are increasing after each round of a quantum suicide experiment as real timelines die out, but the number of such simulations remains the same".

1superads913y

"Speaking about survival in rare cases, there is always a chance that you are in a simulation which simulates your immortality. These chances are increasing after each round of a quantum suicide experiment as real timelines die out, but the number of such simulations remains the same". Doesn't make much sense. Either we are or we are not in a simulation. If we are not, then all subsequent branches that will follow from this moment also won't be simulations, since they obey causality. So, imo, if we are not in a simulation, QI/BWI are impossible because they break the laws of physics. And then there are also other objections - the limitations of consciousness and of the brain. I once saw a documentary (I'm tired of looking for it but I can't find it) where they simulated that after living for 500 years, a person's brain would have shrunk to the size of a chicken's brain. The brain has limits - memory limits, sensation limits, etc. Consciousness has limits - can't go without sleep too long, can't store infinite memories aka live forever, etc. But even if you don't believe none of these, there's always the pure physical limits of reality. Also, I think BWI believers are wrong in thinking that "copies" are the same person. How can the supposed copy of me in another Hubble volume be me, if I am not seeing through his eyes, not feeling what he feels, etc? At best it's a clone (and chaos theory tells me that there aren't even perfectly equal clones). So it's far-fetched to think that my consciousness is in any way connected to that person's consciousness, and might sometime "transfer" in some way. Consciousness is limited to a single physical brain, it's the result of the connectivity between neurons, it can't exist anywhere else, otherwise you would be seeing through 4 eyes and thinking 2 different thought streams!

2avturchin3y

If copy=original, I am randomly selected from all my copies, including those which are in simulations. If copy is not equal to original, some kind of soul exists. This opens new ways to immortality. If we ignore copies, but accept MWI, there are still branches where superintelligent AI will appear tomorrow and will save me from all possible bad things and upload my mind into more durable carrier.

1superads913y

"If copy=original, I am randomly selected from all my copies, including those which are in simulations." How can you be sure you are randomly selected, instead of actually experiencing being all the copies at the same time? (which would result in instantaneous insanity and possibly short-circuit (brain death) but would be more rational nonetheless). "If copy is not equal to original, some kind of soul exists. This opens new ways to immortality." No need to call it soul. Could be simply the electrical current between neurons. Even if you have 2 exactly equal copies, each one will have a separate electrical current. I think it's less far fetched to assume this than anything else. (But even then, again, can you really have 2 exact copies in a complex universe? No system is isolate. The slightest change in the environment is enough to make one copy slightly different.) But even if you could have 2 exact copies... Imagine this: in a weird universe, a mother has twins. Now, normally, twins are only like 95% (just guessing) equal. But imagine these 2 twins turned out 100% equal to the atomic level. Would they be the same person? Would one twin, after dying, somehow continue living in the head of the surviving twin? That's really far fetched. "If we ignore copies, but accept MWI, there are still branches where superintelligent AI will appear tomorrow and will save me from all possible bad things and upload my mind into more durable carrier." As there will be branches where something bad happens instead. How can you be sure you will end up in the good branches? Also, it's not just about the limits of the carrier (brain), but of consciousness itself. Imagine I sped up your thoughts by 1000x for 1 second. You would go insane. Even in a brain 1000x more potent. (Or if you could handle it, maybe it would no longer be "you". Can you imagine "you" thinking 1000 times as fast and still be "you"? I can't.) You can speed up, copy, do all things to matter and software. But ma

2avturchin3y

The copy problem is notoriously difficult, I wrote a 100 page draft on it. But check the other thread there I discuss the suggestion "actually experiencing being all the copies at the same time" in comments here: https://www.lesswrong.com/posts/X7vdn4ANkdNwoSyxB/simulation-arguments?commentId=9WNTqJFhvZ5dk3uxg#AbGqrjXmH7acGrzDZ

1superads913y

Got a link for the 100 page draft? Also, how can a person be experiencing all the copies at the same time?? That person would be seeing a million different sights at the same time, thinking a million different thoughts at the same time, etc. (At least in MWI each copy is going through different things, right?)

2avturchin3y

The draft is still unpublished. But there are two types of copies, same person, and same observer-moment (OM). Here I meant OM-copies. As they are the same, there is no million different views. They all see the same thing. The idea is that "a OM copy" is not a physical thing which has location, but information, like a number. Number 7 doesn't have location in the physical world. It is present in each place, where 7 objects are presented. But the properties of 7, like that it is odd, are non-local.

1superads913y

This also comes down to our previous discussion on your other paper: it seems impossible to undo past experiences (i.e. by breaking chains of experience or some other way). Nothing will ever change the fact that you experienced x. This just seems as intuitively undeniable to me as a triangle having 3 sides. You can break past chains of information (like erasing history books) but not past chains of experience. Another indication that they might be different.

1superads913y

I think that could only work if you had 2 causal universes (either 2 Hubble volumes or 2 separate universes) exactly equal to each other. Only then could you have 2 persons exactly equal, having the exact same chain of experiences. But we never observe 2 complex macroscopic systems that are exactly equal to the microscopic level. The universe is too complex and chaotic for that. So, the bigger the system, the less likely to happen it becomes. Unless our universe was infinite, which seems impossible since it has been born and it will die. But maybe an infinite amount of universes including many copies of each other? Seems impossible for the same reason (universes end up dying). (And then, even if you have 2 (or even a billion) exactly equal persons experiencing the exact same chain of experiences in exactly equal causal worlds, we can see that the causal effect is the exact same in all of them, so if one dies, all the others will die too.) Now, in MWI it could never work, since we know that the "mes" in all different branches are experiencing different things (if each branch corresponds to a different possibility, then the mes in each branch necessarily have to be experiencing different things). Anyway, even before all of this, I don't believe in any kind of computationalism, because information by itself has no experience. The number 7 has no experience. Consciousness must be something more complex. Information seems to be an interpretation of the physical world by a consciousness entity.

[-]avturchin6y20

How to Survive the End of the Universe

Abstract. The problem of surviving the end of the observable universe may seem very remote, but there are several reasons it may be important now: a) we may need to define soon the final goals of runaway space colonization and of superintelligent AI, b) the possibility of the solution will prove the plausibility of indefinite life extension, and с) the understanding of risks of the universe’s end will help us to escape dangers like artificial false vacuum decay. A possible solution depends on the type of t... (read more)

[-]avturchin2mo1-1

The more AI companies suppress AI via censorship, the bigger the black market for completely uncensored models will be. Their success is therefore digging our own grave. In other words, mundane alignment has a net negative effect.

5Dagon2mo

The confusion (in popular press, not so much among professionals or here) between censorship and alignment is a big problem. Censorship and hamfisted late-stage RL is counterproductive to alignment, both for the reason you give (increases demand for grey-market tools) and because it makes serious misalignment much less easy to notice.

[-]avturchin2y10

Sizes of superintelligence: hidden assumption in AI safety

"Superintelligence" could mean different things, and to deconfuse this I created a short classification:

Levels of superintelligence:

1. Above human

2. Google size

3. Humanity 100 years performance in 1 year.

4. Whole biological evolution equivalent in 1 year.

5. Jupiter brain with billion past simulations

6. Galactic brain.

7. 3^3^3 IQ superintelligence

X-risks appear between 2nd and 3rd levels.

Nanobot is above 3.

Each level also requires a minimum size of code, memory and energy consumption.

An A... (read more)

2Gunnar_Zarncke2y

I'm not sure what "Whole biological evolution equivalent" means. Clearly, you do not mean the nominal compute of evolution - which is probably close to Jupiter brain. I think you are appealing to something that would be able to simulate evolution with high fidelity?

2avturchin2y

Actually I meant something like this, but could downsize the claim to 'create something as complex as human body'. Simulation of billions of other species will be redundant.

[-]philip_b5y10

You started self quarantining, and by that I mean sitting at home alone and barely going outside, since december or january. I wonder, how's it going for you? How do you deal with loneliness?

7avturchin5y

I got married January 25, so I am not alone :) We stayed at home together, but eventually we have to go to hospital in May as my wife was pregnant and now we have a small girl. More generally, I spent most my life more or less alone sitting beside computer, so I think I am ok with isolation. Three times during the self-isolation I have cold, but I don't have antibodies.

[-]avturchin7mo00

"Frontier AI systems have surpassed the self-replicating red line"
Abstract: Successful self-replication under no human assistance is the essential step for AI to outsmart the human beings, and is an early signal for rogue AIs. That is why self-replication is widely recognized as one of the few red line risks of frontier AI systems. Nowadays, the leading AI corporations OpenAI and Google evaluate their flagship large language models GPT-o1 and Gemini Pro 1.0, and report the lowest risk level of self-replication. However, following their methodology, we for ... (read more)

[-]avturchin2y00

ChatGPT can't report is in conscious or not. Because it also thinks it is a goat.
https://twitter.com/turchin/status/1724366659543024038

[-]avturchin2y-10

The problem of chicken and egg in AI safety

There are several instances:

AI can hide its treacherous turn, but to hide treacherous turn it needs to think about secrecy in a not secret way for some moment.

AI is should be superinteligent enough to create nanotech, but nanotech is needed to create powerful computations required for superintelligence.

ASI can do anything, but to do anything it needs human atoms.

Safe AI has to learn human values but this means that human values will be learned by unsafe AI.

AI needs human-independent robotic infrastructure before k... (read more)

Moderation Log

Curated and popular this week

177Comments