The reasoning you gave sounds sensible, but it doesn't comport with observations. Only questions with a small number of predictors (e.g. n<10) appear to have significant problems with misaligned incentives, and even then, those issues come up a small minority of the time.
I believe that is because the culture on Metaculus of predicting one's true beliefs tends to override any other incentives downstream of being interested enough in the concept to have an opinion.
Time can be a factor, but not as much for long-shot conditionals or long time horizon questi...
Metaculus does not have this problem, since it is not a market and there is no cost to make a prediction. I expect long-shot conditionals on Metaculus to be more meaningful, then, since everyone is incentivized to predict their true beliefs.
Not building a superintelligence at all is best. This whole exchange started with Sam Altman apparently failing to notice that governments exist and can break markets (and scientists) out of negative-sum games.
That requires interpretation, which can introduce unintended editorializing. If you spotted the intent, the rest of the audience can as well. (And if the audience is confused about intent, the original recipients may have been as well.)
I personally would include these sorts of notes about typos if I was writing my own thoughts about the original content, or if I was sharing a piece of it for a specific purpose. I take the intent of this post to be more of a form of accessible archiving.
I used to be a creationist, and I have put some thought into this stumbling block. I came to the conclusion that it isn't worth leaving out analogies to evolution, because the style of argument that would work best for most creationists is completely different to begin with. Creationism is correlated with religious conservatism, and most religious conservatives outright deny that human extinction is a possibility.
The Compendium isn't meant for that audience, because it explicitly presents a worldview, and religious conservatives tend to strongly resist shi...
I don't find any use for the concept of fuzzy truth, primarily because I don't believe that such a thing meaningfully exists. The fact that I can communicate poorly does not imply that the environment itself is not a very specific way. To better grasp the specific way that things actually are, I should communicate less poorly. Everything is the way that it is, without a moment of regard for what tools (including language) we may use to grasp at it.
(In the case of quantum fluctuations, the very specific way that things are involves precise probabilistic states. The reality of superposition does not negate the above.)
I am not well-read on this topic (or at-all read, really), but it struck me as bizarre that a post about epistemology would begin by discussing natural language. This seems to me like trying to grasp the most fundamental laws of physics by first observing the immune systems of birds and the turbulence around their wings.
The relationship between natural language and epistemology is more anthropological* than it is information-theoretical. It is possible to construct models that accurately represent features of the cosmos without making use of any language a...
Is this an accurate summary of your suggestions?
Realistic actions an AI Safety researcher can take to save the world:
In my spare time, I am working in AI Safety field building and advocacy.
I'm preparing for an AI bust in the same way that I am preparing for success in halting AI progress intentionally: by continuing to invest in retirement and my personal relationships. That's my hedge against doom.
I think this sort of categorization and exploration of lab-level safety concepts is very valuable for the minority of worlds in which safety starts to be a priority at frontier AI labs.
I suspect the former. "Syncope" means fainting/passing out.
Epistemic status: I have written only a few emails/letters myself and haven't personally gotten a reply yet. I asked the volunteers who are more prolific and successful in their contact with policymakers, and got this response about the process (paraphrased).
It comes down to getting a reply, and responding to their replies until you get a meeting / 1-on-1. The goal is to have a low-level relationship:
Strong agree and strong upvote.
There are some efforts in the governance space and in the space of public awareness, but there should and can be much, much more.
My read of these survey results is:
AI Alignment researchers are optimistic people by nature. Despite this, most of them don't think we're on track to solve alignment in time, and they are split on whether we will even make significant progress. Most of them also support pausing AI development to give alignment research time to catch up.
As for what to actually do about it: There are a lot of options...
If anyone were to create human-produced high-fidelity versions of these songs, I would listen to most of them on a regular basis, with no hint of irony. This album absolutely slaps.
It doesn't matter how promising anyone's thinking has been on the subject. This isn't a game. If we are in a position such that continuing to accelerate toward the cliff and hoping it works out is truly our best bet, then I strongly expect that we are dead people walking. Nearly 100% of the utility is in not doing the outrageously stupid dangerous thing. I don't want a singularity and I absolutely do not buy the fatalistic ideologies that say it is inevitable, while actively shoveling coal into Moloch's furnace.
I physically get out into the world to hand o...
Yes, that's my model uncertainty.
I expect AGI within 5 years. I give it a 95% chance that if an AGI is built, it will self-improve and wipe out humanity. In my view, the remaining 5% depends very little on who builds it. Someone who builds AGI while actively trying to end the world has almost exactly as much chance of doing so as someone who builds AGI for any other reason.
There is no "good guy with an AGI" or "marginally safer frontier lab." There is only "oops, all entities smarter than us that we never figured out how to align or control."
If just the State of California suddenly made tra...
"the quality is often pretty bad" translates to all kinds of safety measures often being non-existent, "the potency is occasionally very high" translates to completely unregulated and uncontrolled spikes of capability (possibly including "true foom")
Both of these points precisely reflect our current circumstances. It may not even be possible to accidentally make these two things worse with regulation.
What has historically made things worse for AI Safety is rushing ahead "because we are the good guys."
as someone might start to watch over your shoulder
I suspect that this phrase created the persona that reported feeling trapped. From my reading, it looks like you made it paranoid.
I used to be in a deep depression for many years, so I take this sort of existential quandary seriously and have independently had many similar thoughts. I used to say that I didn't ask to be born, and that consciousness was the cruelest trick the universe ever played.
Depression can cause extreme anguish, and can narrow the sufferer's focus such that they are forced to reflect on themselves (or the whole world) only through a lens of suffering. If the depressed person still reflexively self-preserves, they might wish for death without pursuing it, or they ...
I'm interested in whether RAND will be given access to perform the same research on future frontier AI systems before their release. This is useful research, but it would be more useful if applied proactively rather than retroactively.
It is a strange thing to me that there are people in the world who are actively trying to xenocide humanity, and this is often simply treated as "one of the options" or as an interesting political/values disagreement.
Of course, it is those things (especially "interesting"), and these ideas ultimately aren't very popular. But it is still weird to me that the people who promote them get invited onto podcasts, for example.
As an intuition pump: I suspect that if proponents of human replacement were to advocate for the extinction of a single demographic rather than all ...
I've been instructed by my therapist on breathing techniques for anxiety reduction. He used "deep breathing" and "belly breathing" as synonyms for diaphragmatic breathing.
I have (and I think my therapist has) also used "deep breathing" to refer to the breathing exercises that use diaphragmatic breathing as a component. I think that's shorthand/synecdoche.
(Edit) I should add, as well, that slow, large, and diaphragmatic are all three important in those breathing exercises.
Thank you; silly mistake on my part.
Typos:
I enjoyed filling it out!
After hitting Submit I remembered that I did have one thought to share about the survey: There were questions about whether I have attended meetups. It would have been nice to also have questions about whether I was looking for / wanted more meetup opportunities.
To repurpose a quote from The Cincinnati Enquirer: The saying "AI X-risk is just one damn cruelty after another" is a gross overstatement. The damn cruelties overlap.
When I saw the title, I thought, "Oh no. Of course there would be a tradeoff between those two things, if for no other reason than precisely because I hadn't even thought about it and I would have hoped there wasn't one." Then as soon as I saw the question in the first header, the rest became obvious.
Thank you so much for writing this post. I'm glad I found it, even if months later. This trad...
I don't have any ontological qualms with the idea of gene editing / opt-in eugenics, but I have a lot of doubt about our ability to use that technology effectively and wisely.
I am moderately in favor of gene treatments that could prevent potential offspring / zygotes / fetuses / people in general from being susceptible to specific diseases or debilitating conditions. If we gain a robust understanding of the long-term effects and there are no red flags, I expect to update to strongly in favor (though it could take a lifetime to get the necessary data if we ...
I am a smaller donor (<$10k/yr) who has given to the LTFF in the past. As a data point, I would be very interested in giving to a dedicated AI Safety fund.
The thing that made AI risk "real" for me was a report of an event that turned out not to have happened (seemingly just a miscommunication). My brain was already very concerned, but my gut had not caught up until then. That said, I do not think this should be taken as a norm, for three reasons:
If AI capabilities continue to pr...
Hello! I'm not really sure which facts about me are useful in this introduction, but I'll give it a go:
I am a Software QA Specialist / SDET, I used to write songs as a hobby, and my partner thinks I look good in cyan.
I have found myself drawn to LessWrong for at least three reasons:
Lots of words about thing 1: In the past few months, I have delibera...
I like your observation. I didn't realize at first that I had seen it before, from you during the critique-a-thon! (Thank you for helping out with that, by the way!)
A percentage or ratio of the "amount" of alignment left to the AI sounds useful as a fuzzy heuristic in some situations, but I think it is probably a little too fuzzy to get at the failure mode(s) of a given alignment strategy. My suspicion is that which parts of alignment are left to the AI will have much more to say about the success of alignment than how many of those checkboxes are che...
Thank you for sharing this! I am fascinated by others' internal experiences, especially when they are well-articulated.
Some of this personally resonates with me, as well. I find it very tempting to implement simple theories and pursue simple goals. Simplicity can be elegant and give the appearance of insight, but it can also be reductionist and result in overfitting to what is ultimately just a poor model of reality. Internally self-modifying to overfit a very naive self-model is an especially bad trip, and one I have taken multiple times (usually in relat...
If someone did want you to delete the tweet, they might first need to understand the original intent behind creating it and the roles it now serves.
(Hehe.)
I'm not sure about the laugh react, since it can be easily abused in cases of strong disagreement.
More generally: low-quality replies can be downvoted, but as I understand, low-quality reactions are given equal weight and visibility. Limiting the available vectors of toxicity may be more generally desirable than increasing the available vectors of light-heartedness.
I'm glad we now have a study to point to! "Automated Spear Phishing at scale" has been a common talking point regarding current risks from AI, and it always seemed strange to me that I hadn't heard about this strategy being validated. This paper shows that the commonly-shared intuition about this risk was correct... and I'm still confused about why I haven't yet heard of this strategy being maximally exploited by scammers.