To be clear about my position, and to disagree with Lemoine: not passing a Turing test doesn't mean you aren't intelligent (or aren't sentient, or a moral patient). The implication only holds in the forward direction: passing a Turing test is strong evidence that you are intelligent (and contain sentient pieces, and moral patients).
I think it's completely reasonable to take moral patienthood in LLMs seriously, though I suggest not assuming that entails a symmetric set of rights—LLMs are certainly not animals.
potentially implying that actual humans were getting a score of 27% "human" against GPT-4.5?!?!
Yes, but note that ELIZA had a reasonable score in the same data. Unless you believe that a human couldn't reliably distinguish ELIZA from a human, all this is saying is that either 5 minutes was simply not enough time to talk to the two contestants, or the test was otherwise invalid somehow.
...
...ok I just rabbitholed on data analysis. Humans start to win against the best-tested GPT if they get 7-8 replies. The best GPT model replied on average ~3 times faster than humans, and for humans at least the number of conversation turns was the strongest predictor of success. A significant fraction of GPT wins over humans were also from nonresponsive or minimally responsive human witnesses. This isn't a huge surprise; it was already obvious to me that the time limit was the primary cause of the result. The data backs the intuition up.
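For the curious, here is a minimal sketch of the kind of per-conversation analysis I mean. The column names (`turns`, `witness`, `interrogator_correct`) and the filename are hypothetical stand-ins for illustration, not the actual schema of the released data.

```python
# Sketch: does the number of replies predict the interrogator identifying the AI?
# Column names and filename below are illustrative assumptions, not the paper's schema.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("conversations.csv")  # hypothetical per-conversation data

# Interrogator success rate as a function of how many replies they got.
by_turns = (
    df[df.witness == "gpt-4.5"]
    .groupby("turns")["interrogator_correct"]
    .agg(["mean", "count"])
)
print(by_turns)

# Simple logistic regression: turn count as a predictor of success.
X = sm.add_constant(df[["turns"]])
print(sm.Logit(df["interrogator_correct"], X).fit().summary())
```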
Most ELIZA wins, but certainly not all, seemed to be because the participants didn't understand or act as though this was a cooperative game. That's an opinionated read of the data rather than a simple fact, to be clear. Better incentives or a clearer explanation of the task would probably make a large difference.
Turing Tests were passed.
Basically all so-called Turing Tests that have been beaten are simply not Turing Tests. I have seen one plausible exception, showing that AI does well in a 5-minute limited version of the test, seemingly due in large part to 5 minutes being much too short for a non-expert to tease out the remaining differences. The paper claims "Turing suggests a length of 5 minutes," but this is never actually said in that way, and also doesn't really make sense. This is, after all, Turing of Turing machines and of relative reducibility.
To the first part: yes, of course, my claim isn't that anything here is axiomatically unfair. It absolutely depends on the credences you give for different things, and the context you interpret them in. But I don't think the story in practice is justified.
If, instead, your concern is that the correspondence between Klurl's hypothetical examples and what they found when reaching the planet was improbably high, then I agree that is very coincidental, but I do not think that coincidence is being used as support for the story's intended lessons.
This is indeed approximately the source of my concern.
I think in a story like this, if you show someone rapidly making narrow predictions and then repeatedly highlight how much more reasonable they are than their opponent, as a transparent allegory for your narrow predictions being more reasonable than a particular bad opposing position, from a post signposted as nonfiction inside a fictional frame, there really is no reasonable room to claim that people weren't actually meant to read things into the outcomes being predicted. Klurl wasn't merely making hypothetical examples, he was acting on specific predictions. It is actually germane to the story, and bad to sleight-of-hand away, that Klurl was often doing no intellectual work. It is actually germane to the story whether some of Trapaucius' arguments have nonzero Bayesian weight.
The claim that no simple change would have solved this issue seems like a failure of imagination, and anyway the story wasn't handed down to its author in stone. One could just write a less wrong story instead.
Let me try addressing your comment more bluntly to see if that helps.
Your complaint about Klurl's examples is that they are "coincidentally" drawn from the special class of examples that we already know are actually real, which makes them not fictional.
No, Klurl is not real. There are no robot aliens seeding our planet. The fictional evidence I was talking about was not that Earth as it is right now exists in reality, it was that Earth as it is right now exists in this story, specifically at the point it was used.
If you write a story where a person prays and then wins the lottery as part of a demonstration of the efficacy of prayer, that is fictional evidence even though prayer and winning lotteries are both real things.
If you think that the way the story played out was misleading, that seems like a disagreement about reality, not a disagreement about how stories should be used.
No, I really am claiming that this was a misuse of the story format. I am not opposed to it because it's not reality. I am opposed to it because the format implies that the outcomes are illustrations of the arguments, but in this case the outcomes were deceptive illustrations.
If Trapaucius had arrived at the planet to find Star Trek technology and been immediately beamed into a holding cell, would that somehow have been less of a cheat, because it wasn't real?
It would be less of a cheat in the sense that it would give less of a false impression that the arguments were highly localizing, and in that it would be more obvious that the outcome was fanciful and not to be taken as a serious projection. But it would not be less of a cheat simply in the sense that it wasn't real, because my claim was never that this was cheating for using a real outcome.
I stand by what I said, but I don't want to argue about semantics. I would not have allowed myself to write a story this way.
The Star Trek claim is a false dichotomy. One could choose to directly show that the underspecified parts are underspecified, one could choose to show many examples of the ways this would near-miss, one could simply not write oneself into this corner in the first place. And in the rather hard-to-believe counterfactual that Yudkowsky didn't feel capable of making his story without such a contrivance, he could have just used a different frame, or a different format, or signposted the issue, or done some other thing instead.
"One does not live through a turn of the galaxy by taking occasional small risks."
I'll admit that the author being Yudkowsky heavily colored how I read this line. He has repeatedly, strongly taken the stance that AI risk is not about small probabilities, that he would not be thinking so much about AI risk if his probability were order-1%, that people who do care about order-1% risks are being silly, etc. There are lots of quotes, but I'll take the first one I found on a search, not because it's the closest match but because it's the first one I found.
But the king of the worst award has to go to the Unironical Pascal's Wager argument, imo - "Sure the chances are tiny, but if there's even a tiny chance of destroying the lightcone..."
— https://x.com/ESYudkowsky/status/1617903894960693249
I do not know if I'm being unfair or generous to Yudkowsky to dismiss this defense for this reason. Regardless, I will.
I will say that the very next sentence Klurl states is,
"And to call this risk knowably small, would be to claim to know far too much."
and indeed I think this is an example where the literary contrivance hides the mistake. If the author weren't forcing his hand, the risk would have been small. The coincidence they were in was unlikely on priors, and the arguments given did not narrow in on it.
—
What examples are you thinking of here?
It's obvious that human learning is exceptional, but I don't think Klurl's arguments even served to distinguish the rock-sharpening skill from beaver dams, spider webs or bird nests, never mind the general set of so-termed ‘tool use’ in the wild. Stone tools aren't specific to humans, either, though I believe manufactured stone tools are localized to hominids; Homo floresiensis, for example, is a meaningfully distinct cousin species and AFAIK not ancestral to us.
Related but distinct, I'll draw specific attention to ants, which have a fascinating variety of evolved behaviours, including quite intricate trap-making with a cultivated fungus. Obviously not a generalizably intelligent behaviour, yet Klurl did not even ask that of humans. (On an even less related note, Messor ibericus lays clones of Messor structor as part of its reproductive cycle, which is fascinating and came to mind a lot when reading the sections about stuff evolution supposedly can't solve because it, per the accusation, operates through one specific reproductive pathway.)
If this were presented as a piece of fiction first, sure, ‘bad for verisimilitude’. But Yudkowsky prefaces it as best considered nonfiction with a fictional-dialogue frame. When I consider it in that light, it's more than a problem of story beats: it's cheating evidence into play, it's an argumentative sleight of hand, it's generalizing from fictional evidence. I think it's misleading as to how strongly the refutations actually hold, both in the abstract, and also as directly applied to the arguments Yudkowsky is defending in practice.
To the first point, I hope it was clear I'm not defending Trapaucius here. The story is maybe unfair to the validity of some of Trapaucius' arguments, but not that unfair; they were net pretty bad.
We have lots of examples of radiators in space (because radiating is approximately the only way to shed heat there), and AFAIK micrometeoroid impacts haven't been a dealbreaker when you slightly overprovision capacity and have structural redundancy. I don't expect you'd want to spend too much on shielding, personally.
Not trying to claim Starcloud has a fully coherent plan, ofc.
It's not that complex in principle: you use really big radiators.
If you look at https://www.starcloud.com/'s front page video, you see exactly that. What might look like just a big solar array is actually also a big radiator.
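As a rough back-of-envelope for what "really big" means, here is a sketch using the Stefan-Boltzmann law; the power figure, temperature, and emissivity are illustrative assumptions of mine, not Starcloud's numbers.

```python
# Back-of-envelope radiator sizing via the Stefan-Boltzmann law.
# All inputs are illustrative assumptions, not Starcloud's actual figures.
SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W / (m^2 K^4)

def radiator_area_m2(power_w, temp_k, emissivity=0.9, sides=2):
    """Radiator area needed to reject `power_w` at temperature `temp_k`.

    Ignores absorbed sunlight and view-factor losses, so this is a
    lower bound on the real area.
    """
    return power_w / (sides * emissivity * SIGMA * temp_k**4)

# A 1 GW compute load rejected at ~300 K needs on the order of a km^2.
print(radiator_area_m2(1e9, 300) / 1e6, "km^2")  # ~1.2 km^2
```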
AFAICT it's one of those things that works in principle but not in practice. In theory you can make really cheap space solar and radiator arrays, and with full reuse, launch costs can approach the cost of propellant. In practice, we're not even close to that, and any short-term bet on it is just going to fail.
I edited out the word 'significantly', which in retrospect was misleading.
I'd prefer not to repeat what I've heard. In case I'm making this sound more mysterious than it is, I will note that you're not missing out on any juicy gossip. Nothing I heard in passing would be material to much.