Why the focus on wise AI advisors?[1]
I'll be writing up a proper post to explain why I've pivoted towards this, but producing a high-quality post will take some time, so I decided it was worthwhile to release a short-form description in the meantime.
By Wise AI Advisors, I mean training an AI to provide wise advice.
a) AI will have a massive impact on society given the infinite ways to deploy such a general technology
b) There are lots of ways this could go well and lots of ways that this could go extremely poorly (election interference, cyber attac...
Despite my contention in the associated paper post that focusing on wisdom in this sense is ducking the hard part of the alignment problem, I'll stress here that it seems thoroughly useful as long as it's a supplement to, not a substitute for, work on the hard parts of the problem - technical, theoretical and societal.
I also think it's going to be easier to create wise advisors than you think, at least in the weak sense that they make their human users effectively wiser.
In short, I think simple prompting schemes and eventually agentic scaffolds can do a lot of the extra...
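To gesture at what I mean, here's a minimal sketch of the simplest version of such a prompting scheme, assuming the OpenAI Python SDK; the system prompt and model name are my own placeholders, not a tested system:

```python
from openai import OpenAI  # assumes the OpenAI Python SDK is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder "wise advisor" system prompt, purely illustrative.
WISE_ADVISOR_SYSTEM = (
    "Before answering, ask: what is the user really trying to achieve? "
    "Which perspectives, stakeholders, or long-term consequences might "
    "they be neglecting? Surface these explicitly, then advise."
)

def wise_advice(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": WISE_ADVISOR_SYSTEM},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(wise_advice("Should I quit my job to work on AI safety?"))
```

The point is that even this crude scaffold shifts the model from answering the literal question to interrogating the question itself, which is most of what I mean by making users effectively wiser.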
I use ChatGPT voice-to-text all the time. About 1% of the time, the message I record in English gets seemingly-roughly-correctly translated into Welsh, and ChatGPT replies in Welsh. Sometimes my messages go back to English on the next message, and sometimes they stay in Welsh for a while. Has anyone else experienced this?
Example: https://chatgpt.com/share/67e1f11e-4624-800a-b9cd-70dee98c6d4e
Shower thought I had a while ago:
Everybody loves a meritocracy until people realize that they're the ones without merit. I mean you never hear someone say things like:
> I think America should be a meritocracy. Ruled by skill rather than personal characteristics or family connections. I mean, I love my son, and he has a great personality. But let's be real: if we lived in a meritocracy, he'd be stuck at entry level.
(I framed the hypothetical this way because I want to exclude senior people very secure in their position who are performatively pushing for me...
There is a contingent of people who want excellence in education (e.g. Tracing Woodgrains) and are upset about e.g. the deprioritization of math and gifted education and SAT scores in the US. Does that not count?
Given that ~ no one really does this, I conclude that very few people are serious about moving towards a meritocracy.
This sounds like an unreasonably high bar for us humans. You could apply it to all endeavours, and conclude that "very few people are serious about <anything>". Which is true from a certain perspective, but also stretches the word "serious" far past how it's commonly understood.
In my post on value systematization I used utilitarianism as a central example of value systematization.
Value systematization is important because it's a process by which a small number of goals end up shaping a huge amount of behavior. But there's another different way in which this happens: core emotional motivations formed during childhood (e.g. fear of death) often drive a huge amount of our behavior, in ways that are hard for us to notice.
Fear of death and utilitarianism are very different. The former is very visceral and deep-rooted; it typically inf...
Some interesting thoughts on (in)efficient markets from Byrne Hobart, worth considering in the context of Inadequate Equilibria.
...When a market anomaly shows up, the worst possible question to ask is "what's the fastest way for me to exploit this?" Instead, the first thing to do is to steelman it as aggressively as possible, and try to find any way you can to rationalize that such an anomaly would exist. Do stocks rise on Mondays? Well, maybe that means savvy investors have learned through long experience that it's a good idea to take off risk before the wee
It seems like my workshops would generally work better if they were spaced out over 3 Saturdays, instead of crammed into 2.5 days in one weekend.
This would give people more time to try applying the skills in their day to day, and see what strategic problems they actually run into each week. Then on each Saturday, they could spend some time reviewing last week, thinking about what they want to get out of this workshop day, and then making a plan for next week.
My main hesitation is I kind of expect people to flake more when it's spread out over 3 weeks...
I tried reading HPMOR and didn't like it. As a vehicle for rationality, it's definitely conveying its message wholeheartedly, but I really don't understand the decision to disguise it as a Harry Potter fanfiction, especially when the fanfiction element feels strangely written and empty. That being said, I am absolutely in the minority here, so it's likely I'm simply missing something. Since I'm only a few chapters in I'll try and continue it - but my hopes are not up at all.
A snowclone summarizing a handful of baseline important questions-to-self: "What is the state of your X, and why is that what your X's state is?" Obviously also versions that are less generally and more naturally phrased, that's just the most obviously parametrized form of the snowclone.
Classic(?) examples:
"What do you (think you) know, and why do you (think you) know it?" (X = knowledge/belief)
"What are you doing, and why are you doing it?" (X = action(-direction?)/motivation?)
Less classic examples that I recognized or just made up:
"How do you feel, and w...
This is called the Hurwicz decision rule / criterion (your t is usually called alpha).
I think the content of this argument is not that maxmin is fundamental, but rather that simplicity priors "look like" or justify Hurwicz-like decision rules. Simple versions of this are easy to prove but (as far as I know) do not appear in the literature.
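For reference, the Hurwicz criterion scores each action by a weighted mix of its best and worst outcomes:

$$H_\alpha(a) \;=\; \alpha \,\max_{s}\, u(a,s) \;+\; (1-\alpha)\,\min_{s}\, u(a,s), \qquad \alpha \in [0,1],$$

so $\alpha = 0$ recovers maximin and $\alpha = 1$ recovers maximax; the t above plays the role of $\alpha$.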
I'm looking for recommendations for frameworks/tools/setups that could facilitate machine-checked math manipulations.
More details:
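For a sense of the flavour of manipulation I mean, here's a toy example as it might look in Lean 4 with Mathlib (one candidate tool, not a settled choice):

```lean
import Mathlib

-- Illustrative only: a machine-checked algebraic manipulation.
-- The `ring` tactic verifies the identity over any commutative ring.
example (a b : ℝ) : (a + b)^2 = a^2 + 2*a*b + b^2 := by ring
```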
Here's a place where I feel like my models of romantic relationships are missing something, and I'd be interested to hear peoples' takes on what it might be.
Background claim: a majority of long-term monogamous, hetero relationships are sexually unsatisfying for the man after a decade or so. Evidence: Aella's data here and here are the most legible sources I have on hand; they tell a pretty clear story where sexual satisfaction is basically binary, and a bit more than half of men are unsatisfied in relationships of 10 years (and it keeps getting worse from ...
An effect I noticed: going through Aella's correlation matrix (with poorly labeled columns, sadly), a feature which strongly correlates with the length of a relationship is codependency. Plotting question 20, "The long-term routines and structure of my life are intertwined with my partner's" (li0toxk), assuming that's what "codependency" refers to:
The shaded region is a 95% posterior estimate for the mean of the distribution, conditioned on the time-range (binned every 2 years) and cis-male respondents.
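As a sketch of the computation, assuming a normal likelihood with a Jeffreys prior (an assumption on my part, under which the band coincides with a classical t-interval):

```python
import numpy as np
from scipy import stats

def posterior_mean_band(x, level=0.95):
    """Credible interval for the mean of x, assuming a normal
    likelihood and a Jeffreys prior (equivalent to the classical
    t-interval). A stand-in sketch, not the original plot's code."""
    x = np.asarray(x, dtype=float)
    n, m, s = len(x), x.mean(), x.std(ddof=1)
    half = stats.t.ppf((1 + level) / 2, df=n - 1) * s / np.sqrt(n)
    return m - half, m + half

# One band per 2-year relationship-length bin:
# bands = [posterior_mean_band(responses) for responses in bins]
```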
Note also that codependency and sex sat...
From Brian Potter's Construction Physics newsletter I learned about Taara, framed as "Google's answer to Starlink" re: remote internet access, using ground-based optical communication instead of satellites ("fiber optics without the fibers"; Taara calls them "light bridges"). I found this surprising. Even more surprisingly, Taara isn't just a pilot but a moneymaking endeavor if this Wired passage is true:
...Taara is now a commercial operation, working in more than a dozen countries. One of its successes came in crossing the Congo River. On one side was Brazza
Here's a game-theory game I don't think I've ever seen explicitly described before: Vicious Stag Hunt, a two-player non-zero-sum game elaborating on both Stag Hunt and Prisoner's Dilemma. (Or maybe Chicken? It depends on the obvious dials to turn. This is frankly probably a whole family of possible games.)
The two players can pick from among 3 moves: Stag, Hare, and Attack.
Hunting stag is great, if you can coordinate on it. Playing Stag costs you 5 coins, but if the other player also played Stag, you make your 5 coins back plus another 10.
Hunting hare is fi...
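To make the shape of the game concrete, here's a payoff-table sketch; the Stag numbers come from the description above, and every Hare/Attack number is a placeholder I made up (one possible setting of those dials):

```python
# Payoff sketch for "Vicious Stag Hunt". Stag entries follow the
# numbers above; all Hare and Attack entries are illustrative
# placeholders, not the author's specification.
STAG, HARE, ATTACK = "stag", "hare", "attack"

PAYOFFS = {  # (my move, their move) -> my payoff in coins
    (STAG, STAG): 10,      # pay 5, recoup 5, win 10 more
    (STAG, HARE): -5,      # pay 5, no second hunter
    (STAG, ATTACK): -8,    # placeholder: lose your stake and get hit
    (HARE, STAG): 2,       # placeholder: safe small win
    (HARE, HARE): 2,       # placeholder
    (HARE, ATTACK): -1,    # placeholder
    (ATTACK, STAG): 4,     # placeholder: exploit a committed hunter
    (ATTACK, HARE): 1,     # placeholder
    (ATTACK, ATTACK): -2,  # placeholder: mutual attack is costly
}

def payoff(me: str, them: str) -> int:
    return PAYOFFS[(me, them)]
```

Turning the dials on the Attack row moves the game between a Stag-Hunt-like coordination problem and a Chicken-like conflict, which is why this is really a family of games.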
links 3/24/25: https://roamresearch.com/#/app/srcpublic/page/03-24-2025
I ran quick experiments that make me think that it's somewhat hard for LLMs to learn radically new encodings in an unsupervised way, and thus that LLMs probably won't learn to speak new incomprehensible languages as a consequence of big r1-like RL in the next few years.
The experiments
I trained Llama 3-8B and some medium-size internal Anthropic models to speak using an encoding style that is very rare on the internet (e.g. map each letter to a random name, and join the names) with SFT on the encoded text and without providing translation pairs. I find ...
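For concreteness, that kind of encoding looks something like this (the name list is illustrative, not the assignment used in the experiments):

```python
import string

# Sketch of the rare encoding described above: each letter maps to a
# fixed name, and names are joined with spaces.
NAMES = [
    "Alice", "Boris", "Clara", "Dmitri", "Elena", "Felix", "Greta",
    "Hugo", "Inga", "Jonas", "Klara", "Lars", "Marta", "Nils", "Olga",
    "Pavel", "Quint", "Rosa", "Stefan", "Tanja", "Ulrik", "Vera",
    "Wanda", "Xander", "Yvonne", "Zara",
]
ENCODE = dict(zip(string.ascii_lowercase, NAMES))
DECODE = {name: letter for letter, name in ENCODE.items()}

def encode(text: str) -> str:
    return " ".join(ENCODE[c] for c in text.lower() if c in ENCODE)

def decode(blob: str) -> str:
    return "".join(DECODE[name] for name in blob.split())

assert decode(encode("hello")) == "hello"
print(encode("hi"))  # "Hugo Inga"
```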
I find both the views below compellingly argued in the abstract, despite their being diametrically opposed, and I wonder which one will turn out to be the case, how I could tell, and, if I were betting on one view over the other, how I should crystallise the bet(s).
One is exemplified by what Jason Crawford wrote here:
...The acceleration of material progress has always concerned critics who fear that we will fail to keep up with the pace of change. Alvin Toffler, in a 1965 essay that coined the term “future shock,” wrote:
I believe that most human beings
I'm currently trying to write up a human-AI trade idea similar to the idea by Rolf Nelson (and David Matolcsi), but one which avoids Nate Soares's and Wei Dai's many refutations.
I'm planning to leverage logical risk aversion, which Wei Dai seems to agree with, and a complicated argument for why humans and ASI will have bounded utility functions over logical uncertainty. (There is no mysterious force that tries to fix the Pascal's Mugging problem for unbounded utility functions, hence bounded utility functions are more likely to succeed)
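One standard way to see the asymmetry: with an unbounded utility function the mugger can always name a prize whose utility grows faster than your credence shrinks, e.g. $U(n) = 2^n$ against credence $p(n) = 2^{-n/2}$ gives

$$p(n)\,U(n) = 2^{n/2} \to \infty,$$

whereas with a bound $U \le B$ we get $p(n)\,U(n) \le p(n)\,B \to 0$, so sufficiently implausible offers stop dominating the decision.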
I'm also working on argum...
Metastrategy = Cultivating good "luck surface area"?
Metastrategy: being good at looking at an arbitrary situation/problem, figuring out what your goals are, and figuring out what strategies/plans/tactics to employ in pursuit of those goals.
Luck Surface area: exposing yourself to a lot of situations where you are more likely to get valuable things in a not-very-predictable way. Being "good at cultivating luck surface area" means going to events/talking-to-people/consuming information that are more likely to give you random opportunities / new ways of thinking / new pa...
I guess a point here might also be that luck involves non-linear effects that are hard to predict, so when you're optimising for luck you need to be conscious of not just looking at results, but rather holding a frame like playing poker: evaluating your decisions by their process rather than their outcomes.
That isn't something your brain does naturally, so it's a core skill of successful strategy, and of intellectual humility, or something like that?