All of alexlyzhov's Comments + Replies

For every token, model activations are computed once when the token is encountered and then never explicitly revised -> "only [seems like it] goes in one direction"

with the only recursive element of its thought being that it can pass 16 bits to its next running

I would name the activations for all previous tokens as the relevant "element of thought" that gets passed here, and those can amount to gigabytes.

From how the quote looks, I think his gripe is with the possibility of in-context learning, where human-like learning happens without anything about how the network works (neither its weights nor previous token states) visibly being updated.

3Holly_Elmore
I don't understand this. Something is being updated when humans or LLMs learn, no?

Among them, one I found especially peculiar is that I distinctly started feeling some sort of sensations outside of my body.

I had this, and it lasted for a year after the retreat. I also found that there's a strong tendency for the sensations to happen in the area you described.

I could feel sensations substantially outside of the area accessible to my hands too, but they were a bit more difficult to feel. They could correspond to priors for tactile-like affordances for objects at a distance (e.g. graspability of a cup, or speed of a fast-moving vehicle) that are readily constructed by ordinary perception.

I thought a bit about datasets before, and to me it seems like what most needs collecting is detailed personal preference datasets: e.g., input-output examples of how you generally prefer information to be filtered, processed, communicated to you, and refined with your input; what your success criteria for tasks are; and where in your day flow / thought flow the thing needs to actively intervene and correct you, especially in the places where you feel you can benefit from cognitive extensions most, based on your bottlenecks. These could initially be too hard to infer from screen logs alone.
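A minimal sketch of what a single record in such a dataset might look like; every field name here is a made-up illustration, not a proposal for a standard schema.

```python
# A made-up illustration of one personal-preference record (hypothetical fields).
from dataclasses import dataclass, field

@dataclass
class PreferenceExample:
    context: str            # where in your day flow / thought flow this came up
    raw_input: str          # what the assistant saw (email thread, draft, question)
    actual_output: str      # how the information was actually filtered/communicated
    preferred_output: str   # how you would have preferred it
    success_criteria: list[str] = field(default_factory=list)
    should_intervene: bool = False  # a spot where you want active correction

ex = PreferenceExample(
    context="morning triage of a long email thread",
    raw_input="<full thread text>",
    actual_output="a chronological summary of every message",
    preferred_output="only the two decisions I'm being asked to make, with deadlines",
    success_criteria=["nothing actionable missed", "under 5 sentences"],
    should_intervene=True,
)
print(ex.preferred_output)
```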

Random idea about preventing model stealing: after finetuning a mixture-of-experts model with your magic sauce, place the trained experts on geographically distinct servers with heterogeneous tech stacks and security systems to avoid common vulnerabilities. Horcrux vibes.
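A minimal toy sketch of the idea, assuming a top-2 gated mixture-of-experts layer; the placement map and server names are hypothetical, and the cross-server call is only indicated in comments (the expert matmul runs locally here).

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 16, 8

# Hypothetical placement: each finetuned expert lives on a different server
# with its own tech stack, so compromising one site reveals only a fraction
# of the finetuned weights.
placement = {e: f"site-{e}.example.internal" for e in range(n_experts)}

gate_w = rng.normal(size=(d_model, n_experts))                 # router weights
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_forward(x, top_k=2):
    scores = x @ gate_w
    top = np.argsort(scores)[-top_k:]                          # experts to consult
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over top-k
    out = np.zeros_like(x)
    for w, e in zip(weights, top):
        # In the distributed version this matmul would be an RPC to placement[e].
        out += w * (x @ experts[e])
    return out, [placement[e] for e in top]

_, contacted = moe_forward(rng.normal(size=d_model))
print("servers contacted for this token:", contacted)
```

The point is only that no single server holds all the finetuned experts; a real deployment would also have to protect the router.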

Vaguely related paper: Self-Destructing Models: Increasing the Costs of Harmful Dual Uses in Foundation Models is an early attempt to prevent models from being re-purposed via fine-tuning.

It doesn't seem like a meaningfully positive result. For example, all their plots only track finetuning on up to 200 examples. I imagine they might even have had clear negative results in conditions with >200 examples available for finetuning. After 50-100 examples, the gap between normal finetuning and finetuning from random init, while still small, grows fast. ... (read more)

1Roman Leventov
I wonder whether GFlowNets are somehow better suited for self-destruction/non-finetunability than LLMs.

I think "on most cognitive tasks" means for an AGI its t is defined as the first t for which it meets the expert level at most tasks. However, what exactly counts as a cognitive task does seem to introduce ambiguity and would be cool to clarify, e.g. by pointing to a clear protocol for sampling all such task descriptions from an LLM.

Several-months-AGI is required to be coherent in the sense of coherence defined relative to human experts today. I think this is pretty distinct from the coherence humans were being optimized for before behavioral modernity (50K years ago).

I agree that evolution optimized hard for some kind of coherence, like persistent self-schema, attitudes, emotional and behavioral patterns, attachments, long-term memory access. But what humans have going for them is the combination of this prior coherence and just 50K years of evolution after humans unlocked access to th... (read more)

2the gears to ascension
Ah, perhaps.

on most cognitive tasks, it beats most human experts

I think this specifies both thresholds to be 50%.

It doesn't seem like "shorter timelines" in the safest quadrant has much to do with their current strategy, as they have a gpt-4 paper section on how they postponed the release to reduce acceleration.

why it is so good in general (GPT-4)

What are the examples indicating it's at the level of performance on complex tasks you would expect from GPT-4? Especially performance that is clearly attributable to improvements we expect to be made in GPT-4? I looked through a bunch of screenshots but haven't seen any so far.

Can confirm I consistently got non-deterministic temperature-0 completions on older davinci models accessed through the API last year.
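For anyone who wants to reproduce the check, here is a minimal sketch assuming the legacy openai Python client (pre-1.0) and an older completions-style model; the model name is just illustrative.

```python
import openai

openai.api_key = "sk-..."  # your key

def repeat_completion(prompt, n_trials=5):
    outs = []
    for _ in range(n_trials):
        resp = openai.Completion.create(
            model="davinci",     # older base model, illustrative choice
            prompt=prompt,
            temperature=0,       # nominally deterministic
            max_tokens=64,
        )
        outs.append(resp["choices"][0]["text"])
    return outs

completions = repeat_completion("The capital of France is")
print(len(set(completions)), "distinct completions across identical temperature-0 calls")
```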

2DragonGod
Thanks, added to the OP.

Have you seen this implemented in any blogging platform other people can use? I'd love to see this feature implemented in some Obsidian publishing solution like quartz, but for now they mostly don't care about access management.

Wow, the Zvi example is basically what I've been doing recently with hyperbolic discounting too, after spending a fair amount of time thinking about Joe Carlsmith's "Can you control the past". It seems to work. "It gives me a lot of the kind of evidence about my future behavior that I like" is now the dominant reason behind certain decisions.

How much time do you expect the form, the coding test, and the interview to take for an applicant?

1maxnadeau
30 min, 45 min, 20-30 min (respectively)

This idea tries to discover translations between the representations of two neural networks, but without necessarily discovering a translation into our representations.

 

I think this has been under investigation for a few years in the context of model fusion in federated learning, model stitching, and translation between latent representations in general.
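A minimal sketch of the model-stitching flavor of this: fit a linear map from one network's activations to another's on shared inputs and check how well it transfers to held-out inputs. The activation matrices below are synthetic stand-ins for features you would actually extract from two models.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d_a, d_b = 2000, 500, 64, 48

# Pretend these came from running the same inputs through two networks.
acts_a = rng.normal(size=(n_train + n_test, d_a))
hidden_map = rng.normal(size=(d_a, d_b))
acts_b = acts_a @ hidden_map + 0.1 * rng.normal(size=(n_train + n_test, d_b))

# Least-squares "stitching layer" translating A-space into B-space.
W, *_ = np.linalg.lstsq(acts_a[:n_train], acts_b[:n_train], rcond=None)

pred = acts_a[n_train:] @ W
resid = ((pred - acts_b[n_train:]) ** 2).sum()
total = ((acts_b[n_train:] - acts_b[n_train:].mean(0)) ** 2).sum()
print(f"held-out R^2 of the learned translation: {1 - resid / total:.3f}")
```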

Relative representations enable zero-shot latent space communication - an analytical approach to matching representations (though this is new work, it may not be that good; I haven't checked)

Git Re-B... (read more)

1Maxwell Clarke
Thanks for these links, especially the top one is pretty interesting work

I don't expect Putin to use your interpretation of "d" instead of his own interpretation of it, which he publicly advertises whenever he gives a big public speech on the topic.

From the latest speech:

> In the 80s they had another crisis they solved by "plundering our country". Now they want to solve their problems by "breaking Russia".

This directly references an existential threat.

From the speech a week ago:

> The goal of that part of the West is to weaken, divide and ultimately destroy our country. They are saying openly now that in 1991 they managed... (read more)

From my experience of playing VR games on mobile devices (Quest 1 and Quest 2), the majority of in-game characters look much better than this and it doesn't impact the framerate at all. This seems like a 100% stylistic choice.

Answer by alexlyzhov40

"... the existing literature on the influence of dopamine enhancing agents on working memory provides reasonable support for the hypothesis that augmenting dopamine function can improve working memory."
Pharmacological manipulation of human working memory, 2003

I'd be really interested in a head-to-head comparison with R on a bunch of real-world examples of writing down beliefs that were not selected to favor either R or Squiggle. R because, at least in part, specifying and manipulating distributions there seems to require less boilerplate than in Python.
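To make the boilerplate point concrete, here is roughly what writing down one belief looks like in plain Python/NumPy (the numbers are made up); in R or Squiggle the same thing is closer to a one-liner.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# "The project takes roughly 2-12 months (90% CI), with a 20% chance of
# being cancelled outright," expressed by explicit sampling.
duration = rng.lognormal(mean=np.log(5), sigma=0.55, size=n)  # months
cancelled = rng.random(n) < 0.20
duration[cancelled] = 0.0

print(np.percentile(duration, [5, 50, 95]))
```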

I wonder what happens when you ask it to generate
> "in the style of a popular modern artist <unknown name>"
or
> "in the style of <random word stem>ism".
You could generate both types of prompts with GPT-3 if you wanted, so it would be a complete pipeline.

"Generate conditioned on the new style description" may be ready to be used even if "generate conditioned on an instruction to generate something new" is not. This is why a decomposition into new style description + image conditioned on it seems useful.

If this is successful, then more of the... (read more)

I wonder if the macronutrient ratios shifted. This would influence the total calories you actually end up with, because absorption rates differ between macronutrients. How the food is processed also influences absorption (as well as the total amount of calories, which may not be reflected on the package).

If these factors changed, calories today don't mean exactly the same thing as calories in 1970.

Since the FDA allows a substantial margin of error for calories, maybe producers have also developed a bias that lets them stay within this margin of error while showing fewer calories on the package?

Maybe this is all controlled for in studies, dunno, I just did a couple of google searches and had these questions.

2Ege Erdil
I have no clue about this, unfortunately.

I could imagine that it's too hard for OpenAI to get the top talent needed to sustain their level of research achievements while also filtering the people they hire by their seriousness about reducing civilization-level risks. Or at least it could easily have been infeasible 4 years ago.

I know a couple of people at DeepMind and none of them have reducing civilization-level risks as one of their primary motivations for working there, as I believe is the case with most of DeepMind.

I have an argument for capabilities research being good but with different assumptions. The assumption that's different is that we would progress rapidly towards AGI capabilities (say, in 10 years).

If we agree that 95% of progress towards alignment happens very close to AGI, then the duration of the interval between almost-AGI and AGI is the most important duration.

Suppose the ratio of capabilities research to alignment research is low (probably what most people here want). Then AI researchers and deployers will have an option to say "Look, so many resources w... (read more)

  • When you say that coherent optimizers are doing some bad thing, do you imply that it would always be a bad decision for the AI to make its goal stable? But wouldn't it heavily depend on what other options it thinks it has, and in some cases maybe be worth the shot? If such a decision problem is presented to the AI even once, it doesn't seem good.
  • The stability of the value function seems like something multidimensional, so perhaps it doesn't immediately turn into a 100% hardcore explicit optimizer forever, but there is at least some stabilization. In particula
... (read more)

Every other day I have a bunch of random questions related to AI safety research pop up but I'm not sure where to ask them. Can you recommend any place where I can send these questions and consistently get at least half of them answered or discussed by people who are also thinking about it a lot? Sort of like an AI safety StackExchange (except there's no such thing), or a high-volume chat/discord. I initially thought about LW shortform submissions, but it doesn't really look like people are using the shortform for asking questions at all.

2Jan Czechowski
There's an AI safety camp Slack with a #no-stupid-questions channel. I think people stay there even after the camp ends (I'm still there, although this year's edition ended last week). So you can either apply for next year's edition (which I very much recommend!) or maybe contact the organizers to ask whether they can add you without being an AISC participant/alumnus. Just a disclaimer: I'm not sure how active this Slack is between camps, and it might be that a lot of people leave after the camp ends.
2niplav
The closest thing to an AI safety StackExchange is the stampy wiki, with loads of asked & answered questions. It also has a discord.
2harfe
The eleuther.ai discord has two alignment channels with reasonable volume (#alignment-general and #alignment-beginners). These might be suitable for your needs.

But the mere fact that one network may be useful for many tasks at once has been extensively investigated since the 1990s.

To receive epistemic credit, make sure people can tell that you haven't made every possible prediction on a topic this way and then revealed only the right one after the fact. You can probably publish plaintext metadata for this.
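A minimal sketch of the commit-then-reveal scheme this points at: publish the hash and the plaintext metadata now, and reveal the full prediction (plus the salt) later so anyone can verify it.

```python
import hashlib, json, secrets
from datetime import date

prediction = "By 2026-01-01, X will have happened."
salt = secrets.token_hex(16)  # prevents brute-forcing the hash from short predictions
commitment = hashlib.sha256((salt + prediction).encode()).hexdigest()

public_record = {
    "topic": "example topic",          # plaintext metadata published now
    "date": date.today().isoformat(),
    "n_predictions_on_topic": 1,       # so you can't secretly hedge across many commitments
    "sha256": commitment,
}
print(json.dumps(public_record, indent=2))
# Later: publish `prediction` and `salt`; anyone can recompute and check the hash.
```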

An update on Israel:

> Citizenship is typically granted 3 months after arrival; you can fill out a simple form to waive this waiting period, however.
I think that's not the case: you receive an internal citizen ID immediately after a document check, but they only give you a passport you can use for visas after 3 months (which you can also spend outside the country).
Waiving the waiting period is possible in 2022, but you have to be smart about it and go to exactly the right place to do it (because many local governments are against it).

> Isra... (read more)

Actually, the Metaculus community prediction has a recency bias:
> approximately sqrt(n) new predictions need to happen in order to substantially change the Community Prediction on a question that already has n players predicting.

In this case, with n=298, the prediction should change substantially after about sqrt(n) ≈ 18 new predictions (usually this takes up to a few days). Over the past week there were almost this many predictions, and the AGI community median has shifted from 2043 to 2039, with the 30th percentile at 8 years.
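A quick check of the rule of thumb for this case:

```python
import math
n = 298
print(math.sqrt(n))  # ~17.3, so roughly 18 new predictions to move the community median
```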

No disagreements here; I just want to note that if "the EA community" waits too long for such a pivot, at some point AI labs will probably face protests from the general population, because even now a substantial share of the US population views AI progress in a very negative light. Even if these protests don't accomplish anything directly, they might indirectly affect any future efforts. For example, an EA-run fire alarm might be compromised a bit because the memetic ground would already be captured. In this case, the concept of "AI r... (read more)

2GeneSmith
I’m not sure I would agree. The post you linked to is titled “A majority of the public supports AI development.” Only 10% of the population is strongly opposed to it. You’re making an implicit assumption that the public is going to turn against the technology in the next couple of years, but I see no reason to believe that. In the past, public opinion has really only turned against a technology following a big disaster. But we may not see a big AI-induced disaster before a change in public opinion becomes irrelevant to AGI.

ICML 2022 reviews dropped this week.

"What if outer space were udon" (CLIP guided diffusion did really well, this is cherry-picked though: https://twitter.com/nshepperd1/status/1479118002310180882)

"colourless green ideas sleep furiously"

4Rabrg
This is a great example of how even a single iteration on the prompt can vastly improve the results. Here are the results when using your quotes exactly: Pretty dreadful! But here they are, with the exact same prompt, except with ", digital art" appended to it:

Are PaLM outputs cherry-picked?

I reread the description of the experiment and I'm still unsure.

The protocol on page 37 goes like this:
- the 2-shot exemplars used for few-shot learning were not selected or modified based on model output. I infer this from the line "the full exemplar prompts were written before any examples were evaluated, and were never modified based on the examination of the model output".
- greedy decoding is used, so they couldn't filter outputs given a prompt.

What about the queries (full prompt without the QAQA few-shot data part)? A... (read more)

7Algon
2[comment deleted]

These games are really engaging for me and haven't been named:

Eleven Table Tennis. Ping-pong in VR (+ multiplayer and tournaments):

Racket NX. This one is much easier but you still move around a fair bit. The game is "Use the racket to hit the ball" as well.

Synth Riders. An easier and more chill Beat Saber-like game:

Holopoint. Archery + squats, gets very challenging on later levels:

 

Some gameplay videos for excellent games that have been named:

Beat Saber. "The VR game". You can load songs from the community library using mods.

Thrill of the Fight (boxin... (read more)

You can buy fladrafinil or flmodafinil without any process (see reddit for reports, seems to work much better than adrafinil)

One thing you probably won't find in an evidence review: typing in Colemak still feels more pleasant to me than typing in QWERTY, years after I made the switch. That's a pretty big factor as well, considering how many hours we put into typing.

I would also highlight this as seemingly by far the most wrong point. Consider how many Omicron cases we have now, and we still don't know for sure that it's significantly less severe. Now consider how many secret infections of humans with the various novel strains you're working with you would need to run in a controlled environment to be confident that a given strain is less severe and thus makes sense to release.

Does anyone have a good model of how to reconcile

1) a pretty large psychosis rate in this survey, a bunch of people in https://www.lesswrong.com/posts/MnFqyPLqbiKL8nSR7/my-experience-at-and-around-miri-and-cfar-inspired-by-zoe saying that their friends got mental health issues after using psychedelics, anecdotal experiences and stories about psychedelic-induced psychosis in the general cultural field

and

2) Studies https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3747247/ https://journals.sagepub.com/doi/10.1177/0269881114568039 finding no correlation, or, ... (read more)

2zehsilva
My mental model for the difference between the two results is based on the following:
1) The studies by Krebs and Johansen are analyses based on the "National Survey on Drug Use and Health (...), randomly selected to be representative of the adult population in the United States."
2) The ACX reader population is not representative of the US population; in fact, it might be skewed in some dimensions that are very relevant here.
3) There are significant differences in the fraction of each sample that report psychedelic use:
3.1) in the case of Krebs and Johansen (2013, 2015), ~13% report lifetime psychedelic use, while in the subsample of the ACX readers survey considered in this report it is ~100%.
One important aspect tying this together is that I would assume ACX readers do not have the same distribution of genes associated with intelligence as the general population, and there has been evidence of an overlap between those genes and the genes associated with bipolar disorder (https://www.med.uio.no/norment/english/research/news-and-events/news/2019/genetic-overlap-bipolar-disorder-intelligence.html). This genetic overlap could explain a higher susceptibility to psychotic-like experiences with higher intelligence, even without a particular diagnosis. Furthermore, considering the multiple types of psychotic disorder, https://pubmed.ncbi.nlm.nih.gov/17199051/ found the prevalence in the general population to be ~3%, which is not too far from the 4.5% who responded with a firm "yes" to the survey.

This study at least didn't ask about the length of the psychotic episode, so it seems compatible with the users having had short-term psychotic episodes that didn't cause long-term damage.

Speculatively, a short-term psychosis could even be part of what causes long-term mental health benefits, if e.g. psychedelics do it via a relaxing of priors and the psychotic episode is the moment when they are the most relaxed before stabilizing again, in line with the neural annealing analogy:

The hypothesized flattening of the brain’s (variational free) energy landsc

... (read more)

"Training takes between 24 and 48 hours for most models"; I assumed both are trained within 48 hours (even though this is not precise and may be incorrect).

Ohh OK, I think since I wrote "512 TPU cores" it's the 512x512 version, because Appendix C here https://arxiv.org/pdf/1809.11096.pdf says that setup corresponds to 512x512.

1Jsevillamol
Deep or shallow version?

It should be referenced here in Figure 1: https://arxiv.org/pdf/2006.16668.pdf

"I have heard that they get the details wrong though, and the fact that they [Groq] are still adversing their ResNet-50 performance (a 2015 era network) speaks to that."

I'm not sure I fully get this criticism: ResNet-50 is the most standard image recognition benchmark, and unsurprisingly it's the only (?) architecture that NVIDIA lists in their benchmarking stats for image recognition as well: https://developer.nvidia.com/deep-learning-performance-training-inference.

This is a very neat idea; is there any easy way to enable this for Android and Google Calendar notifications? I guess not.

Yep, the first Google result http://xn--80akpciegnlg.xn--p1ai/preparaty-dlya-kodirovaniya/disulfiram-implant/ (in Russian) says that you use an implant with 1-2g of the substance for up to 5-24 months and that "the minimum blood level of disulfiram is 20 ng/ml". This paper https://www.ncbi.nlm.nih.gov/books/NBK64036/ says "Mild effects may occur at blood alcohol concentrations of 5 to 10 mg/100 mL."
