Dynamic pricing is reducing consumer surplus, and in the limit of perfect AI pricing algorithms and no competition it would go to zero. If most nonfungible goods were subject to perfect price discrimination, what would the world look like? I genuinely have little idea.
Firm profits must increase, but consumer surplus would then come only from commodities. So even the firm owners have nothing to spend their wealth on that actually gives them >ε utility more than sitting at home, making sandwiches from commodity bread and cheese and watching commodity T...
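As a toy illustration of the limit case (my own hypothetical numbers, not from the post): with a fixed set of buyers, a uniform price leaves surplus to everyone whose willingness to pay exceeds it, while perfect discrimination charges each buyer exactly their willingness to pay, driving consumer surplus to zero and transferring it all to the firm.

```python
# Toy model (illustrative assumption: known willingness-to-pay values,
# constant marginal cost, no competition).

def uniform_price_outcome(wtp, price, cost):
    # Everyone with willingness to pay >= price buys at the same price.
    buyers = [w for w in wtp if w >= price]
    consumer_surplus = sum(w - price for w in buyers)
    profit = (price - cost) * len(buyers)
    return consumer_surplus, profit

def perfect_discrimination_outcome(wtp, cost):
    # Each buyer is charged exactly their willingness to pay.
    buyers = [w for w in wtp if w >= cost]
    consumer_surplus = 0  # no one keeps any surplus
    profit = sum(w - cost for w in buyers)
    return consumer_surplus, profit

wtp = [10, 8, 6, 4, 2]  # hypothetical willingness-to-pay values
print(uniform_price_outcome(wtp, price=6, cost=1))   # (6, 15)
print(perfect_discrimination_outcome(wtp, cost=1))   # (0, 25)
```

Total surplus is larger under discrimination here (more buyers are served), but the consumers' share of it is exactly zero, which is the limit the post describes.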
Short timelines, slow takeoff vs. Long timelines, fast takeoff
Due to chain-of-thought in the current paradigm seeming like great news for AI safety, some people seem to have the following expectations:
Short timelines: CoT reduces risks, but shorter preparation time increases the odds of catastrophe.
Long timelines: the current paradigm is not enough; therefore, CoT may stop being relevant, which may increase the odds of catastrophe. We have more time to prepare (which is good), but we may get a faster takeoff than the current paradigm makes it seem like. An...
I agree with this analysis. I think we probably have somewhat better odds on the current path, particularly if we can hold onto faithful CoT.
I expect one major change, continuous learning, but CoT and most of the LLM paradigm may stay pretty similar up to takeoff.
Steve Byrnes' Foom & Doom is a pretty good guess at what we get if the current paradigm doesn't pan out.
In a post on Solomonoff Induction (and also in this wiki entry), Yudkowsky describes Shannon’s minimax algorithm for searching the entire chess game-tree as an example of conceptual progress. Previously, Edgar Allan Poe had argued that it was impossible in principle for a machine to play chess well. With Shannon’s algorithm, it became possible in principle, just computationally infeasible.
However, even “principled” algorithms like minimax search don’t take into account the possibility that your opponent knows things you don’t know (as I’ve previously discu...
Thirdly, although I’ve been talking about the “value” of a position as if it’s a well-defined concept, it mostly isn’t. Stockfish’s value calculations are grounded in the likelihood of it winning from that position when playing itself. But there’s no clear way to translate from that to the likelihood of winning against one’s actual opponent, which is what we’re interested in. I won’t discuss this further here, but trying to pin down how to estimate a position’s value in that sense seems potentially fruitful.
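The "in principle" algorithm being discussed can be sketched in a few lines (my own minimal illustration; real engines like Stockfish prune the tree and substitute heuristic evaluations rather than searching to terminal positions):

```python
# Minimal minimax over an explicit toy game tree. Leaves are payoffs to
# the maximizing player; internal nodes are lists of child subtrees.
# Shannon's point: for chess this search is well-defined, just infeasible.

def minimax(node, maximizing=True):
    if isinstance(node, (int, float)):
        return node  # terminal position: its value is given
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Depth-2 tree: the maximizer picks a branch, then the minimizer a leaf.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree))  # 3: the minimizer would answer 3, 2, or 0
```

Note that this sketch bakes in exactly the assumption the post questions: that a position's "value" is a single well-defined number, independent of who the opponent is.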
I'll take a shot. If is...
My colleagues and I are finding it difficult to replicate results from several well-received AI safety papers. Last week, I was working with a paper that has over 100 karma on LessWrong and discovered it is mostly false but gives nice-looking statistics only because of a very specific evaluation setup. Some other papers have even worse issues.
I know that this is a well-known problem that exists in other fields as well, but I can’t help but be extremely annoyed. The most frustrating part is that this problem should be solvable. If a junior-level p...
You're correct. It's over 100 karma, which is very different from 100 upvotes. I'll edit the original comment. Thanks!
Crossposted from Twitter:
This year I’ve been thinking a lot about how the western world got so dysfunctional. Here’s my rough, best-guess story:
1. WW2 gave rise to a strong taboo against ethnonationalism. While perhaps at first this taboo was valuable, over time it also contaminated discussions of race differences, nationalism, and even IQ itself, to the point where even truths that seemed totally obvious to WW2-era people also became taboo. There’s no mechanism for subsequent generations to create common knowledge that certain facts are true but usefully ...
That is, while it was bad for the people who didn't get rule of law, they were a separate enough category that this mostly didn't "leak into" undermining the legal mechanisms that helped their societies become productive and functional in the first place.
I'm speaking speculatively here, but I don't know that it didn't leak out and undermine the mechanism that supported productive and functional societies. The sophisticated SJW in me suggests that this is part of what caused the eventual (though not yet complete) erosion of those mechanisms.
It seems like if...
Quick experiment: how good is weight steering against training gamers? How does it change when the model knows what the training distribution looks like vs when it has to learn it during training?
Hopes:
Error-correcting codes work by running some algorithm to decode potentially-corrupted data. But what if the algorithm might also have been corrupted? One approach to dealing with this is triple modular redundancy, in which three copies of the algorithm each do the computation and take the majority vote on what the output should be. But this still creates a single point of failure—the part where the majority voting is implemented. Maybe this is fine if the corruption is random, because the voting algorithm can constitute a very small proportion of the total...
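A sketch of the triple-modular-redundancy idea (my own illustration, not the post's construction): three copies of the decoder run on the same data, and a bitwise majority vote picks the output. The vote itself is the residual single point of failure the post describes.

```python
# Triple modular redundancy with a bitwise majority vote over integers
# standing in for bit-vectors (illustrative assumption: decoders are
# functions data -> int).

def majority_vote(a, b, c):
    # An output bit is 1 iff it is 1 in at least two of the three inputs.
    return (a & b) | (a & c) | (b & c)

def tmr(decoders, data):
    # decoders: three (hopefully independently corrupted) copies of the
    # same decoding algorithm.
    d1, d2, d3 = (decode(data) for decode in decoders)
    return majority_vote(d1, d2, d3)

# Identity "decoders" standing in for a real code; one copy is corrupted
# (flips the low bit), but the vote recovers the right answer:
clean = lambda d: d
corrupted = lambda d: d ^ 0b1
print(bin(tmr([clean, corrupted, clean], 0b1100)))  # 0b1100
```

The `majority_vote` expression is tiny compared to any realistic decoder, which is why random corruption is unlikely to land on it; the worry in the post is precisely that "unlikely" is not "impossible".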
No I agree with that. I thought the tree design already involved weighted sums overpowering each other, but I think that was premature.
Today's Inkhaven post is an edit to yesterday's, adding more examples of legitimacy-making characteristics, so I'm posting it in shortform so that I can link it separately:
Here are some potential legitimacy-relevant characteristics:
At the beginning of November, I learned about a startup called Red Queen Bio, which automates the development of viruses and related lab equipment. They work together with OpenAI, and OpenAI is their lead investor.
On November 13, they publicly announced their launch:
...Today, we are launching Red Queen Bio (http://redqueen.bio), an AI biosecurity company, with a $15M seed led by @OpenAI. Biorisk grows exponentially with AI capabilities. Our mission is to scale biological defenses at the same rate. A
on who we are + what we do!
[...]
We also need *financial* co-
Uhm, yeah, valid. I guess the issue was illusion of transparency: I mostly copied the original post from my tweet, which was quote-tweeting the announcement, and I didn’t particularly think about adding more context because I had it cached that the tweet was fine (I checked with people closely familiar with RQB before tweeting, and it did include all of the context by virtue of quote-tweeting the original announcement). When posting to LW I didn't realize I wasn't directly including all of the context that was in the tweet for people who don't click on the lin...
Just finished reading Red Heart by Max Harms. I like it!
Dump of my thoughts:
(1) The ending felt too rushed to me. I feel like that's the most interesting part of the story and it all goes by in a chapter. Spoiler warning!
I'm not sure I understand the plot entirely. My current understanding is: Li Fang was basically on a path to become God-Emperor because Yunna was corrigible to him and superior to all rival AIs, and the Party wasn't AGI-pilled enough to realize the danger. Li Fang was planning t
Thanks for the reply! I'm afraid I haven't read Crystal Society, but on that recommendation I will.
I still think it would be great if you wrote more content of the sort I'm asking for. Put it this way: I imagine a bunch of readers will bounce off the ending, having a reaction "and then things stopped making sense, there was a power struggle over who the AI should be most loyal to and as a result the AI just sorta snapped and took over the world and was very bad for no reason. I feel like that's when it went from hard sci-fi to soft sci-fi, or worse,
REASONS BESIDES JEALOUSY TO NOT BE POLYAMOROUS
Recently Amanda Bethlehem published a piece comparing monogamous jealousy to kidney disease. Eneasz Brodski doubled down on this. I disagree with a lot of their implications, but today I'm going to focus on the implicit claim that jealousy is the only reason to be monogamous. Here is a list of other reasons you might choose monogamy:
Yes, sure! That comment was not very thoughtful.
I really liked Scott Alexander's post on AI consciousness. He doesn't really discuss the Butlin et al. paper, which is fine by me. I can never get excited about consciousness research. It always seemed like drawing the target around the dart, where the dart is humans, and you're trying (not always successfully) to make a definition that includes all humans but doesn't include thermostats. I can't get that excited about whether or not LLMs have the "something something feedback" or whether having a chain of thought changes anything.
Maybe, like Scott says, it...
Not a big believer in hypotheticals. Mind uploading gets into some very weird issues. I will leave it to future society to decide, when and if that happens. I would say that if people are torturing anything for fun, even if that thing has no capacity for pain, then that doesn’t sound morally good to me.
I read this older post by Nate Soares from 2023, AI as a Science, and Three Obstacles to Alignment Strategies, a pretty prescient overview of challenges in alignment research.
Alignment is difficult because (1) alignment and capabilities are intertwined (alignment research helping capabilities), (2) we don't have a process to verify what good ideas or progress look like, and (3) we likely get only one critical try. He already addresses many of the counterarguments that have been brought up recently.
(1) Without any strong governance, a lot of alignment wor...
Reddit Mod is a "job" AI will likely replace.
There is currently a kerfuffle over the actions of a moderator of r/art, which seems to be a shining example of the petty and arbitrary cruelty in which Reddit mods are free to indulge as unpaid volunteers. This made me think that Reddit mods are very similar to Dungeon Masters in Hasbro's Dungeons and Dragons game, though the reason may not be clear to you readers.
Hasbro's DnD seems to be on a mission to alienate DMs from DnD in order to more directly access the players, which Hasbro likely views as their real custom...
" Recently there has been rumors of a move to disallow the DM screen, a folding barrier which allowed the DM to conceal things like maps of areas yet unseen by players or detailed character sheets for NPCs. "
Interestingly, D&D co-creator Dave Arneson experimented with a DM screen so big he was invisible to the players, to see if that increased their immersion. He saw it as a tool for the players. He also didn't always give players their character sheets, for the same reason, always looking for ways to give the player a better experience. ...
I just discovered that I apparently independently reinvented an existing rephrasing of the smoking lesion problem in terms of toxoplasmosis. This is funny.
Does anyone have strong takes on the probability that Chinese labs are attempting to mislead Western researchers by publishing misleading arguments/papers?
The thought occurred to me because just a few hours ago DeepSeek released this https://github.com/deepseek-ai/DeepSeek-Math-V2/blob/main/DeepSeekMath_V2.pdf, and when they make releases of this type, I typically drop everything to read them: so much of what frontier labs do is closed, while DeepSeek publishes details, which are gems that update my understanding.
But I worry somewhat that they could...
That sounds about right to me.
My main disagreement with Futurism or Transhumanism is that technological progress does not correlate strongly with satisfaction and happiness. And if there isn't a strong correlation, then what's the point? Every additional technological step is another Faustian bargain.
I guess I would just like to observe that Silicon Valley's innovation mindset of relentlessly questioning assumptions does not allow for the questioning of itself. If you disagree, kindly direct me!
This is my first post and my test of the waters so to speak. Hope to discuss.
Thoughts On Evaluation Awareness in Claude Opus 4.5.
Context:
Anthropic released Claude Opus 4.5 earlier today (model card). Opus 4.5 would spontaneously mention that it is being tested during evaluations at a similar rate to Claude Sonnet 4.5, but lower than Haiku 4.5 (pg. 65).
Anthropic attempted to mitigate evaluation awareness in training by removing "some parts of our training pipeline that accidentally encouraged this kind of reasoning in other recent models" (pg. 65). The model card later mentioned that Sonnet 4.5 was trained on "prompts th...
I think such consistency training on outputs would result in a model that's basically always eval-aware on your training distribution, and after that point your consistency training has no gradient. Then you have this super eval-aware model, run it on your (held-out, OOD) evaluations, and hope that in those it is only eval-aware enough that you can still draw conclusions about the model behaving the same whether or not it is in an eval.
Is this the intent, or do you have a specific training method in mind that avoids this?
"Common sense" advice that is highly rational and easily overlooked by A LOT of people, including smart people
This is your regular reminder.
1. Buy fire alarms, a fire blanket, a bright torch for blackouts, and a first aid kit (this is the bare minimum low-cost emergency kit that can save you a lot of hassle and potentially your life).
1.2 Do NOT have heavy power-draining appliances connected via extension cord. Fridges and washing machines should never run on an extension cord unless you know what you are doing and know that it can handle the strain....
Thinking about this much harder, I would rewrite the post to clearly phrase it as damage prevention you can do fast and cheap from your home.
I notice I forgot to mention a first aid kit. First thing to correct.
Secondly, when thinking hard about which of your feedback might actually be useful, I realized that, statistically speaking, you may do more good by buying slip mats for wet floors and potentially investing in premium ladders, since fall accidents dominate injury statistics. Not something I thought of intuitively, but it makes sense.