Thane Ruthenis

Competitive agents will not choose to […] in order to beat the competition

Competitive agents will choose to commit suicide, knowing it's suicide, to beat the competition? That suggests that we should observe CEOs mass-poisoning their employees, Jonestown-style, in a galaxy-brained attempt to maximize shareholder value. How come that doesn't happen?

Are you quite sure the underlying issue here is not that the competitive agents don't believe the suicide race to be a suicide race?

alignment will be optimised away, because any system that isn’t optimising as hard as possible won’t survive the race

Off the top of my head, this post. More generally, this is an obvious feature of AI arms races in the presence of an alignment tax. Here's a 2011 writeup that lays it out:

Given abundant time and centralized careful efforts to ensure safety, it seems very probable that these risks could be avoided: development paths that seemed to pose a high risk of catastrophe could be relinquished in favor of safer ones. However, the context of an arms race might not permit such caution. A risk of accidental AI disaster would threaten all of humanity, while the benefits of being first to develop AI would be concentrated, creating a collective action problem insofar as tradeoffs between speed and safety existed.

I assure you the AI Safety/Alignment field has been widely aware of it since at least that long ago.

Also,

alignment will be optimised away, because any system that isn’t optimising as hard as possible won’t survive the race

Any (human) system that is optimizing as hard as possible also won't survive the race. Which hints at what the actual problem is: it's not even that we're in an AI arms race, it's that we're in an AI suicide race which the people racing incorrectly believe to be an AI arms race. Convincing people of the true nature of what's happening is therefore a way to dissolve the race dynamic. Arms races are correct strategies to pursue under certain conditions; suicide races aren't.
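
To make the "arms race vs. suicide race" distinction concrete, here's a minimal toy-game sketch (the payoff numbers and function names are my own illustrative choices, not anything from the original exchange): under "arms race" beliefs, cutting safety to race is strictly better no matter what the rival does, while under "suicide race" beliefs it never is, which is why changing the belief dissolves the race dynamic.

```python
# Illustrative 2x2 game: each lab either "race"s (cuts safety) or "hold"s back.
# Payoff numbers are made up for illustration; only their ordering matters.

def race_is_strictly_better(payoffs: dict, opponent_action: str) -> bool:
    """Does 'race' strictly beat 'hold' against a fixed opponent action?"""
    return payoffs[("race", opponent_action)] > payoffs[("hold", opponent_action)]

# "Arms race" beliefs: racing is costly, but losing the race is worse.
arms_race = {
    ("race", "race"): -1,  ("race", "hold"): 10,
    ("hold", "race"): -10, ("hold", "hold"): 5,
}

# "Suicide race" beliefs: racing kills everyone, winner included.
suicide_race = {
    ("race", "race"): -100, ("race", "hold"): -100,
    ("hold", "race"): -100, ("hold", "hold"): 5,
}

for label, game in [("arms race", arms_race), ("suicide race", suicide_race)]:
    print(label, [race_is_strictly_better(game, opp) for opp in ("race", "hold")])
# arms race    [True, True]   -> racing is dominant if you believe it's merely an arms race
# suicide race [False, False] -> racing never pays off once you believe it's suicide
```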

I've skimmed™ what I assume is your "main essay". Thoughtless Kneejerk Reaction™ follows:

  • You are preaching to the choir. Most of it consists of 101-level arguments in favor of AGI risk. Basically everyone on LW has already heard them, and either agrees vehemently or disagrees with some subtler point/assumption which your entry-level arguments don't cover. LWers aren't the target audience for this; it's not content that's novel and useful for them. That may or may not be grounds for downvoting it (depending on one's downvote philosophy), but it's certainly grounds for not upvoting it and for not engaging with it.
    • The entry-level arguments have been reiterated here over and over and over and over again, and it's almost never useful, and everyone's sick of them, and your essay didn't signal that engaging with you on them would be somehow unusually productive.
    • If I am wrong, prove me wrong: quote whatever argument of yours you think ranks the highest on novelty and importance, and I'll evaluate it.
  • The focus on capitalism likely contributed to the "this is a shallow low-insight take" impression. The problem isn't "capitalism", it's myopic competitive dynamics/Moloch in general. Capitalism exhibits lots of them, yes. But a bunch of socialist/communist states would fall into the same failure mode; a communist world government would fall into the same failure mode (inasmuch as it would still involve e. g. competition between researchers/leaders for government-assigned resources and prestige). Pure focus on capitalism creates the impression that you're primarily an anti-capitalism ideologue who's aiming to co-opt the AGI risk for that purpose.
    • A useful take along those lines might be to argue that we can tap into the general public's discontent with capitalism to make the case for AGI risk more persuasively, followed by an analysis of specific argument structures that would be both highly convincing and truthful.
  • Appending an LLM output at the end, as if it's of inherent value, likely did you no favors.

I'm getting the impression that you did not familiarize yourself with LW's culture and stances prior to posting. If so, that's at the root of the problems you ran into.

Edit:

Imagine for a moment that an amateur astronomer spots an asteroid on a trajectory to wipe out humanity. He doesn’t have a PhD. He’s not affiliated with NASA. But the evidence is there. And when he contacts the people whose job it is to monitor the skies, they say: “Who are you to discover this?” And then refuse to even look in the direction he’s pointing.

A more accurate analogy would involve the amateur astronomer joining a conference of people discussing how to divert that asteroid, giving a presentation where he argues for the asteroid's existence using low-resolution photos and hand-made calculations (to a room full of people who've observed the asteroid through the largest international telescopes or programmed supercomputer simulations of its trajectory), and then being confused about why it's not very well received.

It's been more than three months since o3 and still no o4, despite OpenAI researchers' promises.

Deep Learning has officially hit a wall. Schedule the funeral.

[/taunting_god]

I don't think that's an issue here at all. Look at the CoTs: it has no trouble whatsoever splitting higher-level expressions into concatenations of blocks of nested expressions and figuring out levels of nesting.
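
To illustrate the kind of operation in question (a hypothetical bracket-nesting example of my own, not the actual expressions from the task under discussion): splitting an expression into its top-level nested blocks only requires tracking a running depth counter, which is the sort of bookkeeping the CoTs handle without trouble.

```python
# Hypothetical illustration of the operation described above: split an
# expression into its top-level parenthesised blocks by tracking nesting depth.

def split_top_level(expr: str) -> list[str]:
    blocks, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        if ch == "(":
            if depth == 0:
                start = i  # a new top-level block begins here
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                blocks.append(expr[start:i + 1])  # the block just closed
    return blocks

assert split_top_level("(a(b))(c)((d))") == ["(a(b))", "(c)", "((d))"]
```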

Counterargument: Doing it manually teaches you the skills and the strategies for autonomously attaining high levels of understanding quickly and data-efficiently. Those skills would then generalize to cases in which you can't consult anyone, such as cases where the authors are incommunicado, dead, or don't exist/the author is the raw reality. That last case is particularly important for doing frontier research: if you've generated a bunch of experimental results and derivations, the skills to make sense of what it all means have a fair amount of overlap with the skills for independently integrating a new paper into your world-models.

Of course, this is primarily applicable if you expect research to be a core part of your career, and it's important to keep in mind that "ask an expert for help" is an option. Still, I think independent self-study can serve as good "training wheels".

Which is weird, if you are overwhelmed shouldn’t you also be excited or impressed? I guess not, which seems like a mistake, exciting things are happening.

"Impressed" or "excited" implies a positive/approving emotion towards the overwhelming news coming from the AI sphere. As an on-the-nose comparison, you would not be "impressed" or "excited" by a constant stream of reports covering how quickly an invading army is managing to occupy your cities, even if the new military hardware they deploy is "impressive" in a strictly technical sense.

When reading LLM outputs, I tend to skim them. They're light on relevant, non-obvious content. You can usually just kind of glance diagonally through their text and get the gist, because they tend to spend a lot of words saying nothing/repeating themselves/saying obvious inanities or extensions of what they've already said.

When I first saw Deep Research outputs, it didn't read to me like this. Every sentence seemed to be insightful, dense with pertinent information.

Now I've adjusted to the way Deep Research phrases itself, and it reads the same as any other LLM output. Too many words conveying too few ideas.

Not to say plenty of human writing isn't a similar kind of slop, and not to say some LLM outputs aren't actually information-dense. But well-written human material is usually information-dense, and can have surprising twists of thought or rhetoric that demand you actually read it properly. And LLM outputs – including, as it turns out, Deep Research's – are usually very watery.

Altman’s model of how AGI will impact the world is super weird if you take it seriously as a physical model of a future reality

My instinctive guess is that these sorts of statements from OpenAI are Blatant Lies intended to lower the AGI labs' profile and ensure there's no widespread social/political panic. There's a narrow balance to maintain, between generating enough hype targeting certain demographics to get billions of dollars in investments from them ("we are going to build and enslave digital gods and take over the world, do you want to invest in us and get a slice of the pie, or miss out and end up part of the pie getting sliced up?") and not generating so much hype of the wrong type that the governments notice and nationalize you ("it's all totally going to be business-as-usual, basically just a souped-up ChatGPT, no paradigm shifts, no redistribution of power, Everything will be Okay").

Sending contradictory messages such that each demographic hears only what they want to hear is a basic tactic for this. The tech investors buy the hype/get the FOMO and invest, the politicians and the laymen dismiss it and do nothing.

They seem to be succeeding at striking the right balance, I think. Hundreds of billions of dollars going into it from the private sector while the governments herp-derp.

certainly possible that the first AGI-level product will come out – maybe it’s a new form of Deep Research, let’s say – and initially most people don’t notice or care all that much

My current baseline expectation is that it won't look like this (unless the AGI labs/the AGI want to artificially make it look like this). Attaining actual AGI, instead of the current shallow facsimiles, will feel qualitatively different.

For me, with LLMs, there's a palpable sense that they need to be babied and managed and carefully slotted into well-designed templates or everything will fall apart. It won't be like that with an actual AGI; an actual AGI would be exerting optimization pressure from its own end to make things function.

Relevant meme

There'll be a palpable feeling of "lucidity" that's currently missing with LLMs. You wouldn't confuse the two if you had their chat windows open side by side, and the transformative effects will be ~instant.
