testingthewaters


I mean, this applies to humans too. The words and explanations we use for our actions are often just post hoc rationalisations. An efficient text predictor must learn not what the literal words in front of it mean, but the implied scenario and thought process they mask, and that is a strictly nonlinear and "unfaithful" process.

I think I've just figured out why decision theories strike me as utterly pointless: they sidestep the actual hard part of making a decision. In general, decisions are not hard because you are weighing payoffs, but because you are dealing with uncertainty.

To operationalise this: a decision theory usually assumes that you have some number of options, each with some defined payout. Assuming payouts are fixed, all decision theories simply advise you to pick the option with the highest utility. "Difficult problems" in decision theory are problems where the payout is determined by some function that contains a contradiction, which is then resolved by causal/evidential/functional decision theories, each with their own method of cutting the Gordian knot. The classic contradiction, of course, is that "payout(x1) == 100 iff predictor(your_choice) == x1; else payout(x1) == 1000".
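To make the self-reference concrete, here's a toy sketch in Python (the names and numbers are just my illustration of the formula above, not any canonical formalism):

```python
# Toy sketch of the self-referential payout structure from the quote above.
# The "contradiction": your payout depends on a prediction of the very choice
# you are currently trying to optimise.

def payout(choice: str, predicted: str) -> int:
    # The predictor rewards being wrong-footed: if it called your choice
    # correctly you get the small payout, otherwise the large one.
    return 100 if predicted == choice else 1000

for choice in ["x1", "x2"]:
    for predicted in ["x1", "x2"]:
        print(f"choice={choice}, predicted={predicted}, payout={payout(choice, predicted)}")

# A naive argmax over payout is trivial. The hard part is that `predicted`
# is itself a function of your decision procedure, which is exactly where
# CDT, EDT, and FDT diverge.
```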

Except this is not at all what makes real life decisions hard. If I am planning a business and ever get to the point where I know a function for exactly how much money two different business plans will give me, I've already gotten past the hard part of making a business plan. Similarly, if I'm choosing between two doors on a game show, the difficulty is not that the host is a genius superpredictor who will retrocausally change the posterior goat/car distribution, but the simple fact that I do not know what is behind the doors. Almost all decision theories just skip past the part where you resolve uncertainty and gather information, which makes them effectively worthless in real life. Or, worse, they try to make the uncertainty go away: if I have 100 dollars and can donate to a local homeless shelter I know well or try to give it to a malaria net charity I don't know a lot about, I can be quite certain the homeless shelter will not misappropriate the funds or mismanage their operation, and less so about the faceless malaria charity. This consideration is entirely missing from the standard EA arguments for allocation of funds. Uncertainty matters.
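To put numbers on the donation example (a minimal sketch; the payoffs and probabilities are entirely invented):

```python
# Minimal sketch with invented numbers: the maximisation step is trivial,
# and the whole decision rides on probability estimates that decision
# theory simply takes as given.

def expected_value(payoff_if_effective: float, p_effective: float) -> float:
    return payoff_if_effective * p_effective

shelter = expected_value(payoff_if_effective=100, p_effective=0.95)  # org I know well
malaria = expected_value(payoff_if_effective=400, p_effective=0.50)  # org I don't

print(f"shelter: {shelter}, malaria charity: {malaria}")
# Whichever way this comparison goes, it was decided by p_effective --
# a number you only get by gathering information, not by choosing a
# decision theory.
```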

This has significantly shifted my perception of what is out in the wild. Thanks for the heads up.

https://research.google/blog/deciphering-language-processing-in-the-human-brain-through-llm-representations/

Activations in LLMs are linearly mappable to activations in the human brain. Imo this is strong evidence for the idea that LLMs/NNs in general acquire extremely human-like cognitive patterns, and that the common "shoggoth with a smiley face" meme might just not be accurate.
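For what "linearly mappable" means operationally, the standard encoding-model recipe looks roughly like the sketch below (my paraphrase with random stand-in data, not the paper's actual pipeline):

```python
# Sketch of the standard linear "encoding model" recipe behind claims like
# this (random stand-in data, not the paper's actual code).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

n_words, llm_dim, n_electrodes = 1000, 768, 64
llm_acts = np.random.randn(n_words, llm_dim)         # LLM activation per word
brain_acts = np.random.randn(n_words, n_electrodes)  # neural response per word

X_tr, X_te, y_tr, y_te = train_test_split(llm_acts, brain_acts, test_size=0.2)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)

# "Linearly mappable" = a plain linear map predicts held-out brain activity
# better than chance (with this random data the score will hover around 0).
print("held-out R^2:", model.score(X_te, y_te))
```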

That surprisingly straight line reminds me of what happens when you use noise to regularise an otherwise decidedly non-linear function: https://www.imaginary.org/snapshot/randomness-is-natural-an-introduction-to-regularisation-by-noise
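A toy illustration of that smoothing effect (my own construction, not the linked article's SDE setting):

```python
# Toy illustration (my construction, not the linked article's example):
# averaging a jagged non-linear function over input noise yields a much
# smoother, nearly linear curve.
import numpy as np

def jagged(x):
    return x + 0.5 * np.sign(np.sin(20 * x))  # linear trend + non-linear kinks

xs = np.linspace(0, 1, 100)
noise = np.random.randn(5000, 1) * 0.2      # Gaussian input noise
smoothed = jagged(xs + noise).mean(axis=0)  # E[f(x + eps)] over the noise

# `smoothed` tracks the underlying linear trend far more closely than
# jagged(xs) does: the kinks average out.
print("correlation with straight line:", np.corrcoef(xs, smoothed)[0, 1])
```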

I think this is a really cool research agenda. I can also try to give my "skydiver's perspective from 3000 miles in the air" overview of what I think expected free energy minimisation means, though I am by no means an expert. Epistemic status: this is a broad extrapolation of some intuitions I gained from reading a lot of papers, it may be very wrong.

In general, I think of free energy minimisation as a class of solutions for the problem of predicting complex systems behaviour, in line with other variational principles in physics. Thus, it is an attempt to use simple physical rules like "the ball rolls down the slope" to explain very complicated outcomes like "I decide to build a theme park with roller coasters in it". In this case, the rule is "free energy is minimised", but unlike a simple physical system whose dimensionality is very literally visible, variational free energy (VFE) is minimised in high-dimensional probability spaces.
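For reference, the standard forms (my notation; this is the textbook active inference presentation, not anything specific to the post above):

```latex
% Variational free energy for beliefs q(s) about hidden states s, given observations o:
F \;=\; \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right]
  \;=\; \underbrace{D_{\mathrm{KL}}\!\left[q(s)\,\|\,p(s \mid o)\right]}_{\text{divergence from the posterior}}
        \;-\; \underbrace{\ln p(o)}_{\text{log evidence}}

% Expected free energy of a policy \pi, in one standard decomposition:
G(\pi) \;=\; \underbrace{D_{\mathrm{KL}}\!\left[q(o \mid \pi)\,\|\,p(o)\right]}_{\text{risk: predicted vs preferred outcomes}}
        \;+\; \underbrace{\mathbb{E}_{q(s \mid \pi)}\!\left[\mathrm{H}\!\left[p(o \mid s)\right]\right]}_{\text{ambiguity: expected observation uncertainty}}
```

The second line is where preferences live: p(o) is a prior over outcomes, so "acting to make your prediction come true" means steering the predicted outcomes q(o|π) towards the preferred ones.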

Consider the concrete case below: there are five restaurants in a row and you have to pick one to go to. The intuitive physical interpretation is that you can be represented by a point particle moving to one of five coordinates, all relatively close by in the three-dimensional XYZ coordinate space. However, if we assume that this is just some standard physical process, you'll end up with highly unintuitive behaviour (why does the particle keep drifting right and left in the middle of these coordinates, and then eventually go somewhere that isn't the middle?). Instead we might say that in an RL sense there is a 5-dimensional action space and you must pick a dimension to maximise expected reward.

Free energy minimisation is a rule that says that your action is the one that minimises variation between the predicted outcome your brain produces and the final outcome that your brain observes---which can happen either if your brain is very good at predicting the future or if you act to make your prediction come true. A preference in this case is a bias in the prediction (you can see yourself going to McDonald's more, in some sense, and you feel some psychological aversion/repulsive force moving you away from Burger King) that is then satisfied by you going to the restaurant you are most attracted to. Of course this is just a single-agent interpretation; with multiple subagents you can imagine valleys and peaks in the high-dimensional probability space, which is resolved when you reach some minimum that can be satisfied by action.
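A toy numeric version of that restaurant picture (all numbers invented, and this only implements the "risk" term above, not the full expected free energy):

```python
# Toy version of the restaurant example (all numbers invented).
# Preferences enter as a prior over outcomes; the chosen action is the one
# whose predicted outcome diverges least from that prior (the "risk" term).
import numpy as np

restaurants = ["McDonald's", "Burger King", "KFC", "Subway", "Wendy's"]

# Prior preference over outcomes ("I can see myself at McDonald's more",
# mild aversion to Burger King):
preference = np.array([0.50, 0.05, 0.15, 0.15, 0.15])

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

# Predicted outcome of each action: going to restaurant i makes
# "I end up at restaurant i" nearly certain.
scores = []
for i in range(len(restaurants)):
    predicted = np.full(len(restaurants), 0.01)
    predicted[i] = 0.96
    scores.append(kl(predicted, preference))

print(restaurants[int(np.argmin(scores))])  # -> McDonald's
```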

It's hard to empathise with dry numbers, whereas a lively scenario creates an emotional response, so more people engage. But I agree that this seems to be very well done statistical work.

Hey, thank you for taking the time to reply honestly and in detail as well. With regards to what you want, I think that this is in many senses also what I am looking for, especially the last item about tying in collective behaviour to reasoning about intelligence. I think one of the frames you might find the most useful is one you've already covered---power as a coordination game. As you alluded to in your original post, people aren't in a massive hive mind/conspiracy---they mostly want to do what other successful people seem to be doing, which translates well to a coordination game and also explains the rapid "board flips" once a critical mass of support for/rejection of some proposition is reached. For example, witness the rapid switch to majority support for gay marriage amongst the general population in the 2010s.

Would also love to discuss this with you in more detail (I trained as an English student and also studied Digital Humanities). I will leave off with a few book suggestions that, while maybe not directly answering your needs, you might find interesting.

  • Capitalist Realism by Mark Fisher (as close to a self-portrait by the modern humanities as it gets)
  • Hyperobjects by Timothy Morton (high level perspective on how cultural, material, and social currents impact our views on reality)
  • How Minds Change by David McRaney (not humanities, but pop sci about the science of belief and persuasion)

P.S. Re: the point about Yarvin being right, betting on the dominant group in society embracing a dangerous delusion is a remarkably safe bet (e.g. McCarthyism, the aforementioned Bavarian witch hunts, fascism, Lysenkoism, etc.).

Hey, really enjoyed your triple review on power lies trembling, but imo this topic has been... done to death in the humanities, and reinventing terminology ad hoc is somewhat missing the point. The idea that the dominant class in a society comes from a set of social institutions that share core ideas and modus operandi (in other words, "behaving as a single organisation") is not a shocking new phenomenon of twentieth-century mass culture, and is certainly not a "mystery". This is basically how every country has developed a ruling class/ideology since the term started to have a meaning, through academic institutions that produce similar people. Yale and Harvard are as Oxford and Cambridge, or Peking University and Renmin University. (European universities, in particular, started out as literal divinity schools, and hence are outgrowths of the literal Catholic church, receiving literal Papal bulls to establish themselves as studia generalia.) [Retracted: while the point about teaching religious law and receiving papal bulls is true, the origins of the universities are much more diverse. But my point about the history of cultural hegemony in such institutions still stands.]

What Yarvin seems to be annoyed by is that the "Cathedral consensus" featured ideas that he dislikes, instead of the quasi-feudal ideology of might-makes-right that he finds more appealing. That is also not surprising. People largely don't notice when they are part of a dominant class and their ideas are treated as default: that's just them being normal, not weird. However, when they find themselves at the edge of the Overton window, suddenly what was right and normal becomes crushing and oppressive. The natural dominance of sensible ideas and sensible people becomes a twisted hegemony of obvious lies propped up by delusional power-brokers. This perspective shift is also extremely well documented in human culture and literature.

In general, the concept that a homogeneous ruling class culture can be pushed into delusional consensuses which ultimately harm everyone is an idea as old as the Trojan War. The tension between maintaining a grip on power and maintaining a grip on reality is well explored in Yuval Noah Harari's book Nexus (which also has an imo pretty decent second half on AI). In particular I direct you to his account of the Bavarian witch hunts. Indeed, the unprecedented feature of modern society is the rapid divergence in ideas that is possible thanks to information technology and the cultivation of local echo chambers. Unfortunately, I have few simple answers to offer to this age-old question, but I hope that recognising the lineage of the question helps with disambiguation somewhat. I look forward to your ideas about new liberalisms.

Yeah, I'm not gonna do anything silly (I'm not in a position to do anything silly with regards to the multi-trillion-parameter frontier models anyway). Just sort of "laying the groundwork" for when AIs will cross that line, which I don't think is too far off now. The movie Her gives a good vibe-alignment for when the line will be crossed.
