Agent-foundations researcher. Working on Synthesizing Standalone World-Models, aiming at a timely technical solution to AGI risk, fit for worlds where alignment is punishingly hard and we only get one try.
Currently looking for additional funders ($1k+, details). Consider reaching out if you're interested, or donating directly.
Or get me to pay you money ($5-$100) by spotting holes in my agenda or providing other useful information.
I largely disagree. I suppose there are different types of notes:
They have different levels of usefulness:
"People have this aspirational idea of building a vast, oppressively colossal, deeply interlinked knowledge graph to the point that it almost mirrors every discrete concept and memory in their brain. And I get the appeal of maximalism."
Guilty as charged. I do not regret my crime and I will attempt it again.
I agree that there are ways to define the "capabilities"/"intelligence" of a system where increasing them won't necessarily increase its long-term coherence. Primarily: scaling its ability to solve problems across all domains except the domain of decomposing new unsolved problems into combinations of solved problems. I.e., not teaching it (certain kinds of?) "agency skills". The resultant entity would have an abysmal time horizon (in a certain sense), but it could be made vastly capable, including vastly more capable than most people at most tasks. However, it would by definition be unable to solve new problems, not even those within its deductive closure.
Inasmuch as a system can produce solutions to new problems by deductive/inductive chains, however, it would need to be able to maintain coherence across time (or, rather, across inferential distances, for which time/context lengths are a proxy). And that's precisely what the AI industry is eager to make LLMs do, and what it often uses as its measure of capabilities.
(I think the above kind of checks out with the distinction you gesture at? Maybe not.)
So yes, there are some notions of "intelligence" and "scaling intelligence" that aren't equivalent to some notions of "coherence" and "scaling coherence". But I would claim that's moot, because the AI industry now explicitly wants the kind of intelligence that is equivalent to long-term coherence.
Frankly, the very premise of this paper seems ridiculous to me, to a considerably greater extent than even most other bad alignment takes. How can the notion that agents may be getting more incoherent as they become more capable even exist within an industry that's salivating over the prospect of climbing METR's "maintain coherence over longer spans of time" benchmark?
Will Automating AI R&D not work for some reason, or will it not lead to vastly superhuman superintelligence within 2 years of "~100% automation" for some reason?
My current main guess is that it will more-or-less work, and then it will not lead to vastly superhuman superintelligence.
Specifically: I expect that the current LLM paradigm is sufficient to automate its own in-paradigm research, but that this paradigm is AGI-incomplete. Which means it's possible to "skip to the end" by automating it to superhuman speeds, but what lies at its end won't be AGI.
Like, much of the current paradigm is "make loss go down/reward go up by trying various recombinations of slight variations on a bunch of techniques, constructing RL environments, and doing not-that-deep math research". That means the rewards are verifiable across the board, so there's "no reason" why RLVR + something like AlphaEvolve won't work for automating it. But it's still possible that you can automate ~all of the AI research that's currently happening at the frontier labs, and still fail to get to AGI.
(Though it's possible that what lies at the end will be a powerful-enough non-AGI AI tool that it'll make it very easy for the frontier labs to then use it to R&D an actual AGI, or take over the world, or whatever. This is a subtly different cluster of scenarios, though.)
Ryan had suggested that, on his model, spending ~5% more resources on alignment than is commercially expedient might drop takeover risk down to 50%. I'm interested in how he thinks this scales: how much more, in percentage terms, would be needed to drop the risk to 20%, 10%, 1%?
Perhaps the claim is that such Python programs won't be encountered due to relevant properties of the universe (i.e., because the universe is understandable).
That's indeed where some of the hope lies, yep!
Following up on [1] and [2]...
So, I've had a "Claude Code moment" recently: I decided to build something on a lark, asked Opus to implement it, found that the prototype worked fine on the first try, then kept blindly asking for more and more features and was surprised to discover that it just kept working.
The "something" in question was a Python file editor which behaves as follows:
The remarkable thing isn't really the functionality (to a large extent, this is just a wrapper on ast + QScintilla), but how little effort it took: <6 hours by wall-clock time to generate 4.3k lines of code, and I never actually had to look at them; I just described the features I wanted and reported bugs to Opus. I haven't verified the functionality comprehensively, but it basically works, I think.
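For a sense of what the ast side of such a wrapper looks like, here's a minimal hypothetical sketch of mine (not SpanEditor's actual code): extract the line spans of top-level functions and classes so the GUI layer (QScintilla, in my case) can present and edit them individually.

```python
import ast

def extract_spans(source: str):
    """Return (kind, name, first_line, last_line) for each top-level def/class."""
    tree = ast.parse(source)
    spans = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            # end_lineno is available on Python 3.8+
            spans.append((kind, node.name, node.lineno, node.end_lineno))
    return spans

if __name__ == "__main__":
    with open("example.py") as f:  # hypothetical target file
        for kind, name, start, end in extract_spans(f.read()):
            print(f"{kind} {name}: lines {start}-{end}")
```

The actual editor presumably layers a lot of UI plumbing and feature logic on top of calls like these, but that's the core trick.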
How does that square with the frankly dismal performance I'd been observing before? Is it perhaps because I've skilled up at directing Opus, cracked the secret to it, and can now indeed dramatically speed up my work?
No.
There was zero additional skill involved. I'd started doing it on a lark, so I disregarded all the lessons I'd previously been learning and just directed Opus the same way I'd been trying to at the start. And it Just Worked in a way it Just Didn't Work before.
Which means the main predictor of how well Opus performs isn't how well you're using it/working with it, but what type of project you're working on.
Meaning it's very likely that the people for whom LLMs work exhilaratingly well are working on the kinds of projects LLMs happen to be very good at, and everyone for whom working with LLMs is a tooth-pulling exercise happens not to be working on those kinds of projects. Or, to reframe: if you need to code up something from the latter category, and it's not a side-project you can take or leave, you're screwed; no amount of skill on your part is going to make it easy. The issue isn't your skill.
The obvious question is: what are the differences between those categories? I have some vague guesses. To get a second opinion, I placed the Python editor ("SpanEditor") and the other project I've been working on ("Scaffold") into the same directory, and asked Opus to run a comparative analysis regarding their technical difficulty and speculate about the skillset of someone who'd be very good at the first kind of project and bad at the second kind. (I'm told this is what peak automation looks like.)
Its conclusions seem sensible:
Scaffold is harder in terms of:
SpanEditor is harder in terms of:
The fundamental difference: Scaffold builds infrastructure from primitives (graphics, commands, queries) while SpanEditor leverages existing infrastructure (Scintilla, AST) but must solve domain-specific semantic problems (code understanding).
[...]
Scaffold exhibits systems complexity - building infrastructure from primitives (graphics, commands, queries, serialization).
SpanEditor exhibits semantic complexity - leveraging existing infrastructure but solving domain-specific problems (understanding code without type information).
Both are well-architected. Which is "harder" depends on whether you value low-level systems programming or semantic/heuristic reasoning.
[...]
What SpanEditor-Style Work Requires
What Scaffold-Style Work Requires
The Cognitive Profile
Someone who excels at SpanEditor but struggles with Scaffold likely has these traits:
Strengths
| Trait | Manifestation |
| --- | --- |
| Strong verbal/symbolic reasoning | Comfortable with ASTs, grammars, semantic analysis |
| Good at classification | Naturally thinks "what kind of thing is this?" |
| Comfortable with ambiguity | Can write heuristics that work "most of the time" |
| Library-oriented thinking | First instinct: "what library solves this?" |
| Top-down decomposition | Breaks problems into conceptual categories |
Weaknesses
| Trait | Manifestation |
| --- | --- |
| Weak spatial reasoning | Struggles to visualize coordinate transformations |
| Difficulty with temporal interleaving | Gets confused when multiple state machines interact |
| Uncomfortable without guardrails | Anxious when there's no library to lean on |
| Single-layer focus | Tends to think about one abstraction level at a time |
| Stateless mental model | Prefers pure functions; mutable state across time feels slippery |
Deeper Interpretation
They Think in Types, Not States
SpanEditor reasoning: "A CodeElement can be a function, method, or class. A CallInfo has a receiver and a name."
Scaffold reasoning: "The window is currently in RESIZING_LEFT mode, the aura progress is 0.7, and there's a pending animation callback."
The SpanEditor developer asks "what is this?" The Scaffold developer asks "what is happening right now, and what happens next?"
They're Comfortable with Semantic Ambiguity, Not Mechanical Ambiguity
SpanEditor: "We can't know which class obj.method() refers to, so we'll try all classes." (Semantic uncertainty - they're fine with this.)
Scaffold: "If the user releases the mouse during phase 1 of the animation, do we cancel phase 2 or let it complete?" (Mechanical uncertainty - this feels overwhelming.)
They Trust Abstractions More Than They Build Them
SpanEditor developer's instinct: "Scintilla handles scrolling. I don't need to know how."
Scaffold requires: "I need to implement scrolling myself, which means tracking content height, visible height, scroll offset, thumb position, and wheel events."
The SpanEditor developer is a consumer of well-designed abstractions. The Scaffold developer must create them.
tl;dr: "they think in types, not states", "they're anxious when there's no library to lean on", "they trust abstractions more than they build them", and "tend to think about one abstraction level at a time".
Or, what I would claim is a fine distillation: "bad at novel problem-solving and gears-level modeling".
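(To make the "semantic uncertainty" example above concrete: the kind of heuristic being described is roughly the following. A hypothetical sketch of mine, not SpanEditor's actual code.)

```python
import ast

def candidate_classes(tree: ast.Module, method_name: str) -> list[str]:
    """Without type information, obj.method() could belong to any class that
    defines `method_name`; collect them all and let the caller try each one."""
    matches = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and any(
            isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            and item.name == method_name
            for item in node.body
        ):
            matches.append(node.name)
    return matches

source = """
class Cat:
    def speak(self): ...

class Dog:
    def speak(self): ...
"""
print(candidate_classes(ast.parse(source), "speak"))  # ['Cat', 'Dog']
```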
Now, it's a bit suspicious how well this confirms my cached prejudices. A paranoiac, which I am, might suspect the following line of possibility: I'm sure it was transparent to Opus that it wrote both codebases (I didn't tell it, but I didn't bother removing its comments, and I'm sure it can recognize its writing style), so perhaps when I asked it to list the strengths and weaknesses of that hypothetical person, it just retrieved some cached "what LLMs are good vs. bad at" spiel from its pretraining. There are reasons not to think that, though:
Overall... Well, make of that what you will.
The direction of my update, though, is once again in favor of LLMs being less capable than they sound, and towards longer timelines.
Like, before this, there was a possibility that it really was a skill issue on my part, and one really could 10x their productivity with the right approach. But I've now observed that whether you get 0.8x'd or 10x'd depends on the project you're working on, not on your skill level – and if so, well, this pretty much explains the cluster of "this 10x'd my productivity!" reports, no? We no longer need to entertain the "maybe there really is a trick to it" hypothesis to explain said reports.
Anyway, this is obviously rather sparse data, and I'll keep trying to find ways to squeeze more performance out of LLMs. But, well, my short-term p(doom) has gone down some more.
I am interested in trying out the new code simplifier to see whether it can do a good job.
Tried it out a couple of times just now; it appears specialized for low-level, syntax-level rephrasings. It will inline functions and intermediate-variable computations that are only used once, and try to distill if-else blocks into something more elegant, but it won't even attempt anything at a higher level. It was very eager to remove Claude's own overly verbose/obvious comments, though. Very relatable.
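To illustrate the kind of rewrite I mean (a made-up example of mine, not actual simplifier output):

```python
# Before: a single-use intermediate variable plus a verbose if-else block.
def classify_before(values):
    total = sum(values)
    if total > 0:
        label = "positive"
    else:
        label = "non-positive"
    return label

# After: the intermediate is inlined and the if-else is distilled into a
# conditional expression; purely a syntax-level rephrasing.
def classify_after(values):
    return "positive" if sum(values) > 0 else "non-positive"
```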
Overall, it would be mildly useful in isolation, but I'm pretty sure you can get the same job done ten times faster using Haiku 4.5 or Composer-1 (Cursor's own blazing-fast LLM).
Curious if you get a different experience.
There was so much to unpack in that one. The line about how it's "on brand for Anthropic to use a deceptive ad to critique theoretical deceptive ads that aren’t real" takes the cake, of course. Amazing stuff.
Feels important to note that this is a (minor) positive update on Anthropic for me, worth a hundred nice-sounding Dario essays and Claude Constitutions. I expect them to completely cave in after a bit, hence it being only a minor update. But at least they didn't start out pre-caved-in.