If military AGI is akin to nuclear bombs, then would it be justified to attack the country trying to militarize AGI? What would the first act of war in future wars be?
If country A is building a nuke, then the argument for country B to pre-emptively attack it is that the first act of war involving nukes would effectively end country B. In that case, the act of war is still a physical explosion.
In the case of AI, what would be the first act of war, akin to a physical explosion? Would country B even be able to detect that AI is being used against it? If ...
The "information war" sounds like politics as usual. Propaganda, censorship, Twitter, TikTok -- all have existed long before AGI.
International politics is not about fairness, but about how strong you are, who your allies are, and how far you can go before they stop supporting you. Israel can do whatever it wants, because there are many people in the USA who will defend it no matter what. India does not have that kind of support. On the other hand, there is probably no need to worry about Trump; he always says some strong words to show his fans who is the boss,...
Are there known "rational paradoxes", akin to logical paradoxes? A basic example is the following:
In the optimal search problem, the cost of searching at position i is C_i, and the a priori probability of finding the target at i is P_i.
Optimality requires sorting the search locations by non-increasing P_i/C_i: search first where the likelihood of finding divided by the cost of searching is highest.
But since sorting costs O(n log(n)), i.e. O(log(n)) per location, C_i must grow faster than O(log(i)); otherwise the sorting itself is asymptotically wasteful.
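A minimal sketch of that ordering rule in Python, assuming a single search at location i finds the target with probability P_i at cost C_i (names and numbers are just for illustration):

```python
# Minimal sketch of the ordering rule above: search locations with the
# highest P_i / C_i ratio first. Assumes one search per location.
def search_order(P, C):
    """Return location indices sorted by non-increasing P_i / C_i."""
    return sorted(range(len(P)), key=lambda i: P[i] / C[i], reverse=True)

# Example: location 2 has the best probability-per-cost ratio, so it comes first.
P = [0.2, 0.5, 0.3]
C = [4.0, 25.0, 3.0]
print(search_order(P, C))  # -> [2, 0, 1]
```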
Do you know any others?
There could be pathological cases where all P_i/C_i are the same up to epsilon.
We could dismiss that by saying that if the ratios are the same up to epsilon, then it does not truly matter which one of them we choose.
(Mathematically speaking, we could redefine the problem from "choosing the best option" to "making sure that our regret is not greater than X".)
I think there's a weak moral panic brewing here around LLM usage, leading people to jump to conclusions they otherwise wouldn't, and to assume "xyz person's brain is malfunctioning due to LLM use" before considering other likely options. As an example, someone on my recent post implied that the reason I didn't suggest using spellcheck for typo fixes was that my personal usage of LLMs was unhealthy, rather than (the actual reason) that using the browser's inbuilt spellcheck as a first pass seemed so obvious to me that it didn't bear mentioning.
Even if ...
It is quite high:
The current thinking is that although around 1.5 to 3.5% of people will meet diagnostic criteria for a psychotic disorder, a significantly larger, variable number will experience at least one psychotic symptom in their lifetime.
The cited study is a survey of 7076 people in the Netherlands, which mentions:
For me, a crux about the impact of AI on education broadly is how our appetite for entertainment behaves at the margins close to entertainment saturation.
Possibility 1: it will always be very tempting to direct our attention to the most entertaining alternative, even at very high levels of entertainment
Possibility 2: there is some absolute threshold of entertainment above which we become indifferent between unequally entertaining alternatives
If Possibility 1 holds, I have a hard time seeing how any kind of informational or educational content, which is con...
There is some conceptual misleadingness with the usual ways of framing algorithmic progress. Imagine that in 2022 the number of apples produced on some farm increased 10x year-over-year, then in 2023 the number of oranges increased 10x, and then in 2024 the number of pears increased 10x. That doesn't mean that the number of fruits is up 1000x in 3 years.
Price-performance of compute compounds over many years, but most algorithmic progress doesn't: it only applies to the things relevant around the timeframe when that progress happens, and stops being applica...
I think pretraining data pipeline improvements have this issue: they stop helping with larger models that want more data (or it becomes about midtraining). And similarly for the benchmark-placating better post-training data that enables ever less intelligent models to get good scores, but probably doesn't add up to much (at least when it's not pretraining-scale RLVR).
Things like MoE, GLU over LU, and maybe DyT or Muon add up to a relatively modest compute multiplier over the original Transformer. For example, Transformer++ vs. Transformer in Figure 4 of the Mam...
a youtuber with 25k subscribers, with a channel on technical deep learning, is making a promo vid for the moonshot program.
Talking about what alignment is, what agent foundations is, etc. His PhD is in neuroscience.
do you want to comment on the script?
https://docs.google.com/document/d/1YyDIj2ohxwzaGVdyNxmmShCeAP-SVlvJSaDdyFdh6-s/edit?tab=t.0
btw, for links and stuff,
e.g. to lesswrong posts, see the planning tab please
and the format of:
Link:
What info to extract from this link:
How a researcher can use this info to solve alignment:
I have recurring worries about how what I've done could turn out to be net-negative.
I guess I was imagining an implied "in expectation", like predictions about second order effects of a certain degree of speculativeness are inaccurate enough that they're basically useless, and so shouldn't shift the expected value of an action. There are definitely exceptions and it'd depend how you formulate it, but "maybe my action was relevant to an emergent social phenomenon containing many other people with their own agency, and that phenomenon might be bad for abstract reasons, but it's too soon to tell" just feels like... you couldn't have anticipa...
Maybe there's a deep connection between:
(a) human propensity to emotionally adjust to the goodness / badness of our recent circumstances, such that we arrive at emotional homeostasis and it's mostly the relative level / the change in circumstances that we "feel"
(b) batch normalization, the common operation for training neural networks
Our trailing experiences form a kind of batch of "training data" on which we update, and perhaps we batchnorm their goodness since that's the superior way to update on data without all the pathologies of not normalizing.
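To make the analogy concrete, here is a minimal sketch of the normalization step batch norm performs (the learnable scale/shift parameters and running statistics are omitted); the point is that only relative differences within the batch survive:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# A "batch" of recent experiences rated on an absolute goodness scale:
# after normalization only their relative standing remains, whether the
# raw levels were 10..12 or 110..112.
recent = np.array([[10.0], [11.0], [12.0]])
print(batch_norm(recent).ravel())  # ~[-1.22, 0.0, 1.22]
```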
Does anyone here have any tips on customizing and testing their AI? Personally, if I'm asking for an overview of a subject I'm unfamiliar with, I want the AI to examine things from a skeptical point of view. My main test case for this was: "What can you tell me about H. H. Holmes?" Initially, all the major AIs I tried, like ChatGPT, failed badly. But it seems they're doing better with that question nowadays, even without customization.
Why ask that question? Because there is an overwhelming flood of bad information about H. H. Holmes that drowns out more pl...
Slightly different, but I tried some experiments deliberately misspelling celebrity names - note how when I ask about "Miranda June" and then say "sorry, I got my months mixed up", it apologizes that it knows nothing, yet it correctly describes Miranda July as an "artist, filmmaker, writer and actress".
superintelligence may not look like we expect. because geniuses don't look like we expect.
for example, if einstein were to type up and hand you most of his internal monologue throughout his life, you might think he's sorta clever, but if you were reading a random sample you'd probably think he was a bumbling fool. the thoughts/realizations that led him to groundbreaking theories were like 1% of 1% of all his thoughts.
for most of his research career he was working on trying to disprove quantum mechanics (wrong). he was trying to organize a political movemen...
Yes, that's the kind of thing I find impressive/scary. Not merely generating ideas.
The ERROR Project: https://error.reviews/
Quoting Malte Elson
The very short description of ERROR is that we pay experts to examine important and influential scientific publications for errors in order to strengthen the culture of error checking, error acceptance, and error correction in our field. As in other bug bounty programs, the payout scales with the magnitude of errors found. Less important errors pay a smaller fee, whereas more important errors that affect core conclusions yield a larger payout.
We expect most published research to contain at least s...
We expect most published research to contain at least some errors
Have you set up the prediction markets on that? Not necessarily "is there an error in this paper", but "in this group of publications, what fraction has an issue of this kind" and so on.
But most of LLMs' knowledge comes from the public Web, so clearly there is still a substantial amount of useful content on it, and maybe if search engines had remained good enough at filtering spam fewer people would have fled to Discord.
On using LLMs for review and self-critique while avoiding sycophantic failure modes:
(Originally written as a reply to Kaj's post)
For a long time (essentially as long as they've been capable of doing so productively), I've used LLMs to review my writing, be it fictional or not, or to offer feedback and critique.
The majority of LLMs are significantly sycophantic, to the point that you have to meaningfully adjust downwards unless you're in it for the sole purpose of flattering your ego. I've noticed this to a degree in just about all of them, but it's particula...
Terminal Recursion – A Thought Experiment on Consciousness at Death
I had a post recently rejected for being too speculative (which I totally understand!). I'm 16 and still learning, but I'm interested in feedback on this idea, even if it's unprovable.
What if, instead of a flash of memories, the brain at death enters a recursive simulation of life, creating the illusion that it’s still alive? Is this even philosophically coherent or just a fancy solipsism trap? Would love your thoughts.
It seems to me that many disagreements regarding whether the world can be made robust against a superintelligent attack (e. g., the recent exchange here) are downstream of different people taking on a mathematician's vs. a hacker's mindset.
...A mathematician might try to transform a program up into successively more abstract representations to eventually show it is trivially correct; a hacker would prefer to compile a program down into its most concrete representation to brute force all execution paths & find an exploit trivially proving it
I agree with this to first order, and I agree that even relatively mundane stuff does allow the AI to take over eventually. I also agree that in the longer run, ASI vs. human warfare likely wouldn't have both sides as peers, because it's plausibly relatively easy to make humans coordinate poorly, especially relative to ASI's ability to coordinate.
There's a reason I didn't say AI takeover was impossible or had very low odds here, I still think AI takeover is an important problem to work on.
But I do think it actually matters here, because it informs stuff like ho...
People sometimes ask me what's good about glowfic, as a reader.
You know that extremely high-context joke you could only make to that one friend you've known for years, because you shared a bunch of specific experiences which were load-bearing for the joke to make sense at all, let alone be funny[1]? And you know how that joke is much funnier than the average low-context joke?
Well, reading glowfic is like that, but for fiction. You get to know a character as imagined by an author in much more depth than you'd get with traditional fiction, becaus...
I haven’t read glowfic before, but this resonates with me re: what’s fun about all-star seasons of Survivor. Players you know now being in new contexts, with new opponents, and with you maybe having more emotional stake in a specific player winning because you’ve rooted for them previously.
In the case of Survivor though, you also get players who are now aware of their edit/meta, which can sometimes flanderize them or cause them to have baggage that requires a change in strategy. That's sometimes interesting - can this known liar somehow trick people _again_? - but it also sometimes results in fan-favorite players getting voted out earlier.
My first reading was "O" as zero and "I" as one, and the message felt mysterious and IT-related but I couldn't decipher its intended meaning.
Plans are worthless, but planning is everything.
I believe this wholeheartedly. Planning demands thinking realistically about the situation and the goals, and that dramatically ups the odds of success. Plans won't perfectly address what actually occurs, but that's evidence you need more, not less, planning.
The responses to @Marius Hobbhahn's What’s the short timeline plan? convinced me that we are in need of better plans for alignment. Fairly complete plans were given for control and interpretability, ...
Thanks Seth! I appreciate you signal boosting this and laying out your reasoning for why planning is so critical for AI safety.