My mainline prediction scenario for the next decades.
My mainline prediction * :
Governments will act quickly and (relatively) decisively to bring these agents under state control. National security concerns will dominate.
I dunno, like 20 years ago if someone had said “By the time somebody creates AI that displays common-sense reasoning, passes practically any written test up to and including graduate level, (etc.), obviously governments will be flipping out and nationalizing AI companies etc.”, to me that would have seemed like a reasonable claim. But here we are, and the idea of the USA govt nationalizing OpenAI seems a million miles outside the Overton window.
Likewise, if someone said “After it becomes clear to everyone that lab leaks can cause pandemics costing trillions of dollars and millions of lives, then obviously governments will be flipping out and banning the study of dangerous viruses—or at least, passing stringent regulations with intrusive monitoring and felony penalties for noncompliance etc,” then that would also have sounded reasonable to me! But again, here we are.
So anyway, my conclusion is that when I ask my intuition / imagination whether governments will flip out in thus-and-such circumstance, my intuition / imagination is really ba...
One strong reason to think the AI case might be different is that US national security will be actively using AI to build weapons and thus it will be relatively clear and salient to US national security when things get scary.
I think this will look a bit outdated in 6-12 months, when there is no longer a clear distinction between LLMs and short term planning agents, and the distinction between the latter and LTPAs looks like a scale difference comparable to GPT2 vs GPT3 rather than a difference in kind. At what point do you imagine a national government saying "here but no further?".
I think scaffolding is the wrong metaphor. Sequences of actions, observations and rewards are just more tokens to be modeled, and if I were running Google I would be busy instructing all work units to start packaging up such sequences of tokens to feed into the training runs for Gemini models. Many seemingly minor tasks (e.g. app recommendation in the Play store) either have, or could have, components of RL built into the pipeline, and could benefit from incorporating LLMs, either by putting the RL task in-context or through fine-tuning of very fast cheap models.
So when I say I don't see a distinction between LLMs and "short term planning agents" I mean that we already know how to subsume RL tasks into next token prediction, and so there is in some technical sense already no distinction. It's a question of how the underlying capabilities are packaged and deployed, and I think that within 6-12 months there will be many internal deployments of LLMs doing short sequences of tasks within Google. If that works, then it seems very natural to just scale up sequence length as generalisation improves.
Arguably fine-tuning a next-token predictor on action, observation, reward sequences, or doing it in-context, is inferior to using algorithms like PPO. However, the advantage of knowledge transfer from the rest of the next-token predictor's data distribution may more than compensate for this on some short-term tasks.
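To make "sequences of actions, observations and rewards are just more tokens" concrete, here is a minimal, hypothetical sketch of serializing an RL-style episode into plain text so it can be folded into an ordinary next-token-prediction corpus (decision-transformer style). The `<obs>`/`<act>`/`<rew>` markers and the app-recommendation episode are made up for illustration; this is not anyone's actual pipeline.

```python
from typing import List, Tuple

# Hypothetical trajectory format: (observation, action, reward) triples from
# some short-horizon task (e.g. an app-recommendation episode).
Trajectory = List[Tuple[str, str, float]]

def trajectory_to_text(traj: Trajectory) -> str:
    """Flatten an RL-style episode into plain text so it can be appended to an
    LLM fine-tuning corpus and modelled as ordinary next-token prediction."""
    chunks = []
    for obs, action, reward in traj:
        chunks.append(f"<obs> {obs} <act> {action} <rew> {reward:+.2f}")
    return " ".join(chunks)

episode: Trajectory = [
    ("user opened store, likes puzzle games", "recommend app_123", 0.0),
    ("user installed app_123",                "recommend app_456", 1.0),
]
print(trajectory_to_text(episode))
```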
In my reading, Strawberry shows that indeed just scaling transformer pretraining will *not* lead to AGI. The new paradigm is inference scaling - the obvious next step is doing RL on long horizons and sparse data domains. I have been saying this ever since gpt-3 came out.
For the question of general intelligence, imho the scaling is conceptually a red herring: any (general-purpose) algorithm will do better when scaled. The key in my mind is the algorithm, not the resource, just like I would say a child is generally intelligent while a pocket calculator is not, even if the child can't count to 20 yet. It's about the meta-capability to learn, not the capability.
As we spoke earlier - it was predictable that this was going to be the next step. It was likely it was going to work, but there was a hopeful world in which doing the obvious thing turned out to be harder. That hope has been dashed - it suggests longer horizons might be easy too. This means superintelligence within two years is not out of the question.
I'm a bit confused by what you mean by "LLMs will not scale to AGI" in combination with "a single innovation is all that is needed for AGI".
E.g., consider the following scenarios:
IMO, these sound very similar to "LLMs scale to AGI" for many practical purposes:
Maybe it is really key in your view that the single innovation is really discontinuous and maybe the single innovation doesn't really require LLM scaling.
I think a single innovation left to create LTPA is unlikely because it runs contrary to the history of technology and of machine learning. For example, in the 10 years before AlphaGo and before GPT-4, several different innovations were required-- and that's if you count "deep learning" as one item. ChatGPT actually understates the number here because different components of the transformer architecture like attention, residual streams, and transformer++ innovations were all developed separately.
Then I think you should specify that progress within this single innovation could be continuous over years and include 10+ ML papers in sequence each developing some sub-innovation.
Misgivings about Category Theory
[No category theory is required to read and understand this screed]
A week does not go by without somebody asking me what the best way to learn category theory is. Despite being set to mark its 80th anniversary, category theory has the evergreen reputation of being the Hot New Thing, a way to radically expand the braincase of the user through an injection of abstract mathematics. Its promise is alluring, intoxicating for any young person desperate to prove they are the smartest kid on the block.
Recently, there has been significant investment and attention focused on the intersection of category theory and AI, particularly in AI alignment research. Despite the influx of interest I am worried that it is not entirely understood just how big the theory-practice gap is.
I am worried that overselling risks poisoning the well for advanced mathematical approaches to science in general, and to AI alignment in particular. As I believe mathematically grounded approaches to AI alignment are perhaps the only way to get robust worst-case safety guarantees for the superintelligent regime, I think this would be bad.
I find it difficult...
Modern mathematics is less about solving problems within established frameworks and more about designing entirely new games with their own rules. While school mathematics teaches us to be skilled players of pre-existing mathematical games, research mathematics requires us to be game designers, crafting rule systems that lead to interesting and profound consequences
I don't think so. This probably describes the kind of mathematics you aspire to do, but still the bulk of modern research in mathematics is in fact about solving problems within established frameworks and usually such research doesn't require us to "be game designers". Some of us are of course drawn to the kinds of frontiers where such work is necessary, and that's great, but I think this description undervalues the within-paradigm work that is the bulk of what is going on.
The Padding Argument or Simplicity = Degeneracy
[I learned this argument from Lucius Bushnaq and Matthias Dellago. It is also latent already in Solomonoff's original work]
Consider binary strings of a fixed length $\ell$.
Imagine feeding these strings into some Turing machine; we think of strings as codes for a function. Suppose we have a function $f$ that can be coded by a short compressed string $\sigma$ of length $n \ll \ell$. That is, the function is computable by a small program.
Imagine uniformly sampling a random code of length $\ell$. What number of the codes implement the same function as the string $\sigma$? It's close to $2^{\ell - n}$. Indeed, given the string $\sigma$ of length $n$ we can 'pad' it to a string of length $\ell$ by writing the code
"run $\sigma$, skip $p$"
where $p$ is an arbitrary string of length $\ell - n - c$, where $c$ is a small constant accounting for the overhead. There are approximately $2^{\ell - n}$ such binary strings. If our programming language has a simple skip / commenting-out functionality then we expect approximately $2^{\ell - n}$ codes encoding the same function as $\sigma$. The fr...
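A toy illustration of the counting (my own sketch, not the exact setup above): a two-opcode "language" where '1' halts and every bit after it is dead padding. Functions with shorter programs are implemented by exponentially more of the length-$\ell$ codes.

```python
from collections import Counter
from itertools import product

def run(code: str) -> str:
    """Toy interpreter with an explicit skip/halt opcode:
    '0' means 'emit the next bit', '1' means 'halt; everything after is padding'."""
    out, i = [], 0
    while i < len(code):
        if code[i] == "1" or i + 1 == len(code):
            break                      # halt: remaining bits are dead code
        out.append(code[i + 1])        # '0' + payload bit
        i += 2
    return "".join(out)

L = 12                                 # fixed code length (the post's ell)
counts = Counter(run("".join(bits)) for bits in product("01", repeat=L))

# A function with a short program is hit by roughly 2^(L - program length)
# of the 2^L codes, because the bits after the halt opcode are free padding.
for f in ["", "1", "11", "111"]:
    print(f"output {f!r}: {counts[f]} of {2**L} codes")
```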
Re: the SLT dogma.
For those interested, a continuous version of the padding argument is used in Theorem 4.1 of Clift-Murfet-Wallbridge to show that the learning coefficient is a lower bound on the Kolmogorov complexity (in a sense) in the setting of noisy Turing machines. Just take the synthesis problem to be given by a TM's input-output map in that theorem. The result is treated in a more detailed way in Waring's thesis (Proposition 4.19). Noisy TMs are of course not neural networks, but they are a place where the link between the learning coefficient in SLT and algorithmic information theory has already been made precise.
For what it's worth, as explained in simple versus short, I don't actually think the local learning coefficient is algorithmic complexity (in the sense of program length) in neural networks, only that it is a lower bound. So I don't really see the LLC as a useful "approximation" of the algorithmic complexity.
For those wanting to read more about the padding argument in the classical setting, Hutter-Catt-Quarel "An Introduction to Universal Artificial Intelligence" has a nice detailed treatment.
How to prepare for the coming Taiwan Crisis? Should one short TSMC? Dig a nuclear cellar?
Metaculus gives a 25% chance of a full-scale invasion of Taiwan within 10 years and a 50% chance of a blockade. It gives a 65% chance that, if China invades Taiwan before 2035, the US will respond with military force.
Metaculus has very strong calibration scores (apparently better than prediction markets). I am inclined to take these numbers as the best guess we currently have of the situation.
Is there any way to act on this information?
Novel Science is Inherently Illegible
Legibility, transparency, and open science are generally considered positive attributes, while opacity, elitism, and obscurantism are viewed as negative. However, increased legibility in science is not always beneficial and can often be detrimental.
Scientific management, with some exceptions, likely underperforms compared to simpler heuristics such as giving money to smart people or implementing grant lotteries. Scientific legibility suffers from the classic "Seeing like a State" problems. It constrains endeavors to the least informed stakeholder, hinders exploration, inevitably biases research to be simple and myopic, and exposes researchers to constant political tug-of-war between different interest groups poisoning objectivity.
I think the above would be considered relatively uncontroversial in EA circles. But I posit there is something deeper going on:
Novel research is inherently illegible. If it were legible, someone else would have already pursued it. As science advances, its concepts become increasingly counterintuitive and further from common sense. Most of the legible low-hanging fruit has already been picked, and novel research requires venturing higher into the tree, pursuing illegible paths with indirect and hard-to-foresee impacts.
Novel research is inherently illegible.
I'm pretty skeptical of this and think we need data to back up such a claim. However there might be bias: when anyone makes a serendipitous discovery it's a better story, so it gets more attention. Has anyone gone through, say, the list of all Nobel laureates and looked at whether their research would have seemed promising before it produced results?
Thanks for your skepticism, Thomas. Before we get into this, I'd like to make sure we actually disagree. My position is not that scientific progress is mostly due to plucky outsiders who are ignored for decades. (I feel something like this is a popular view on LW.) Indeed, I think most scientific progress is made through pretty conventional (academic) routes.
I think one can predict that future scientific progress will likely be made by young smart people at prestigious universities and research labs, specializing in fields that have good feedback loops and/or have historically made a lot of progress: physics, chemistry, medicine, etc.
My contention is that beyond very broad predictive factors like this, judging whether a research direction is fruitful is hard & requires inside knowledge. Much of this knowledge is illegible and difficult to attain because it takes a lot of specialized background.
Do you disagree with this?
I do think that novel research is inherently illegible. Here are some thoughts on your comment:
1. Before getting into your Nobel prize proposal, I'd like to caution against hindsight bias (for obvious reasons).
And perhaps to some degree I'd like to argue the burden of proo
I guess I'm not sure what you mean by "most scientific progress," and I'm missing some of the history here, but my sense is that importance-weighted science happens proportionally more outside of academia. E.g., Einstein did his miracle year outside of academia (and later stated that he wouldn't have been able to do it, had he succeeded at getting an academic position), Darwin figured out natural selection, and Carnot figured out the Carnot cycle, all mostly on their own, outside of academia. Those are three major scientists who arguably started entire fields (quantum mechanics, biology, and thermodynamics). I would anti-predict that future scientific progress, of the field-founding sort, comes primarily from people at prestigious universities, since they, imo, typically have some of the most intense gatekeeping dynamics which make it harder to have original thoughts.
Thank you, Thomas. I believe we find ourselves in broad agreement. The distinction you make between lay-legibility and expert-legibility is especially well-drawn.
One point: the confidence of researchers in their own approach may not be the right thing to look at. Perhaps a better measure is seeing who can not only predict that their own approach will succeed but also explain in detail why other approaches won't work. Anecdotally, very successful researchers have a keen sense of what will work out and what won't - in private conversation many are willing to share detailed models of why other approaches will not work or are not as promising. I'd have to think about this more carefully, but anecdotally the most successful researchers have many bits of information over their competitors, not just one or two. (Note that one bit of information means that their entire advantage could be wiped out by answering a single Y/N question. Not impossible, but not typical for most cases.)
Why don't animals have guns?
Or why didn't evolution evolve the Hydralisk?
Evolution has found (sometimes multiple times) the camera, general intelligence, nanotech, electronavigation, aerial endurance better than any drone, bodies more flexible than any human-made robot, highly efficient photosynthesis, etc.
First of all let's answer another question: why didn't evolution evolve the wheel like the alien wheeled elephants in His Dark Materials?
Is it biologically impossible to evolve?
Well, technically, the flagellum of various bacteria is a proper wheel.
No, the likely answer is that wheels are great when you have roads and suck when you don't. Roads are built by ants to some degree, but on the whole probably don't make sense for an animal-intelligence species.
Aren't there animals that use projectiles?
Hold up. Is it actually true that there is not a single animal with a gun, harpoon or other projectile weapon?
Porcupines have quills, some snakes spit venom, a type of fish spits water as a projectile to knock insects off leaves and then eats them. Bombardier beetles can produce an explosive chemical mixture. Skunks use other chemicals. Some snails shoot harpoons from very c...
Shower thought - why are sunglasses cool?
Sunglasses create an asymmetry in the ability to discern emotions between the wearer and nonwearer. This implicitly makes the wearer less predictable, more mysterious, more dangerous and therefore higher in a dominance hierarchy.
also see ashiok from mtg: whole upper face/head is replaced with shadow
also, masks 'create an asymmetry in the ability to discern emotions' but do not seem to lead to the rest
That's a good counterexample! Masks are dangerous and mysterious, but not cool in the way sunglasses are, I agree.
My timelines are lengthening.
I've long been a skeptic of scaling LLMs to AGI*. I fundamentally don't understand how this is even possible. It must be said that very smart people give this view credence: davidad, dmurfet. On the other side are Vanessa Kosoy and Steven Byrnes. When pushed, proponents don't actually defend the position that a large enough transformer will create nanotech or even obsolete their job. They usually mumble something about scaffolding.
I won't get into this debate here, but I do want to note that my timelines have lengthened, primarily because some of the never-clearly-stated but heavily implied AI developments by proponents of very short timelines have not materialized. To be clear, it has only been a year since gpt-4 was released, and gpt-5 is around the corner, so perhaps my hope is premature. Still, my timelines are lengthening.
A year ago, when gpt-3 came out, progress was blindingly fast. Part of short timelines came from a sense of 'if we got surprised so hard by gpt-2 to gpt-3, we are completely uncalibrated, who knows what comes next'.
People seemed surprised by gpt-4 in a way that seemed uncalibrated to me. gpt-4 performance was basically in li...
With scale, there is visible improvement in difficulty of novel-to-chatbot ideas/details that is possible to explain in-context, things like issues with the code it's writing. If a chatbot is below some threshold of situational awareness of a task, no scaffolding can keep it on track, but for a better chatbot trivial scaffolding might suffice. Many people can't google for a solution to a technical issue, the difference between them and those who can is often subtle.
So a modest amount of scaling alone seems plausibly sufficient for making chatbots that can do whole jobs almost autonomously. If this works, 1-2 OOMs more of scaling becomes both economically feasible and more likely to be worthwhile. LLMs think much faster, so they only need to be barely smart enough to help with clearing those remaining roadblocks.
Yes agreed.
What I don't get about this position: if it was indeed just scaling, what's AI research for? There is nothing to discover, just scale more compute. Sure, you can maybe improve the speed of deploying compute a little, but at its core it seems like a story that's in conflict with itself.
Here are two arguments for low-hanging algorithmic improvements.
First, in the past few years I have read many papers containing low-hanging algorithmic improvements. Most such improvements are a few percent or tens of percent. The largest such improvements are things like transformers or mixture of experts, which are substantial steps forward. Such a trend is not guaranteed to persist, but that’s the way to bet.
Second, existing models are far less sample-efficient than humans. We receive about a billion tokens growing to adulthood. The leading LLMs get orders of magnitude more than that. We should be able to do much better. Of course, there’s no guarantee that such an improvement is “low hanging”.
Encrypted Batteries
(I thank Dmitry Vaintrob for the idea of encrypted batteries. Thanks to Adam Scholl for the alignment angle. Thanks to the Computational Mechanics crowd at the recent CompMech conference.)
There are no Atoms in the Void, just Bits in the Description. Given the right string, a Maxwell's Demon transducer can extract energy from a heat bath.
Imagine a pseudorandom heat bath + nano-Demon. It looks like a heat bath from the outside, but secretly there is a private key string that, when fed to the nano-Demon, allows it to extract lots of energy from the heat bath.
P.S. Beyond the current ken of humanity lies a generalized concept of free energy that describes the generic potential ability or power of an agent to achieve goals. Money, the golden calf of Baal, is one of its many avatars. Could there be ways to encrypt generalized free-energy batteries to constrain the user to only use this power for good? It would be like money that could only be spent on good things.
Imagine a pseudorandom heat bath + nano-Demon. It looks like a heat bath from the outside, but secretly there is a private key string that, when fed to the nano-Demon, allows it to extract lots of energy from the heat bath.
What would a 'pseudorandom heatbath' look like? I would expect most objects to quickly depart from any sort of private key or PRNG. Would this be something like... a reversible computer which shuffles around a large number of blank bits in a complicated pseudo-random order every timestep*, exposing a fraction of them to external access? so a daemon with the key/PRNG seed can write to the blank bits with approaching 100% efficiency (rendering it useful for another reversible computer doing some actual work) but anyone else can't do better than 50-50 (without breaking the PRNG/crypto) and that preserves the blank bit count and is no gain?
* As I understand reversible computing, you can have a reversible computer which does that for free: if this is something like a very large period loop blindly shuffling its bits, it need erase/write no bits (because it's just looping through the same states forever, akin to a time crystal), and so can be computed indefinitely at arbitrarily low energy cost. So any external computer which syncs up to it can also sync at zero cost, and just treat the exposed unused bits as if they were its own, thereby saving power.
AGI companies merging within next 2-3 years inevitable?
There are currently about a dozen major AI companies racing towards AGI, with many more minor AI companies. The way the technology shakes out, this seems like an unstable equilibrium.
It seems by now inevitable that we will see further mergers and joint ventures - within two years there might only be two or three major players left. Scale is all-dominant. There is no magic sauce, no moat. OpenAI doesn't have algorithms that its competitors can't copy within 6-12 months. It's all leveraging compute. Whatever innovations smaller companies make can be easily stolen by tech giants.
e.g. we might have xAI- Meta, Anthropic- DeepMind-SSI-Google, OpenAI-Microsoft-Apple.
Actually, although this would be deeply unpopular in EA circles, it wouldn't be all that surprising if Anthropic and OpenAI teamed up.
And - of course - a few years later we might only have two competitors: USA, China.
EDIT: the obvious thing to happen is that nvidia realizes it can just build AI itself. If Taiwan is Dune and GPUs are the spice, then nvidia is House Atreides.
Whatever innovations smaller companies make can be easily stolen by tech giants.
And they / their basic components are probably also published by academia, though the precise hyperparameters, etc. might still matter and be non-trivial/costly to find.
I have a similar feeling, but there are some forces in the opposite direction:
In 2-3 years they would need to decide on training systems built in 3-5 years, and by 2027-2029 the scale might get to $200-1000 billion for an individual training system. (This is assuming geographically distributed training is solved, since such systems would need 5-35 gigawatts.)
Getting to a go-ahead on $200 billion systems might require a level of success that also makes $1 trillion plausible. So instead of merging, they might instead either temporarily give up on scaling further (if there isn't sufficient success in 2-3 years), or become capable of financing such training systems individually, without pooling efforts.
Current work on Markov blankets and Boundaries on LW is flawed and outdated. State of the art should factor through this paper on Causal Blankets; https://iwaiworkshop.github.io/papers/2020/IWAI_2020_paper_22.pdf
A key problem for accounts of blankets and boundaries I have seen on LW so far is the following elementary problem (from the paper):
"Therefore, the MB [Markov Blanket] formalism forbids interdependencies induced by past events that are kept in memory, but may not directly influence the present state of the blankets.
Thanks to Fernando Rosas for telling me about this paper.
You may want to make this a linkpost to that paper as that can then be tagged and may be noticed more widely.
Paradox of Ignorance
Paul Christiano presents the "paradox of ignorance" where a weaker, less informed agent appears to outperform a more powerful, more informed agent in certain situations. This seems to contradict the intuitive desideratum that more information should always lead to better performance.
The example given is of two agents, one powerful and one limited, trying to determine the truth of a universal statement ∀x:ϕ(x) for some Δ0 formula ϕ. The limited agent treats each new value of ϕ(x) as a surprise and evidence about the generalization ∀x:ϕ(x). So it can query the environment about some simple inputs x and get a reasonable view of the universal generalization.
In contrast, the more powerful agent may be able to deduce ϕ(x) directly for simple x. Because it assigns these statements prior probability 1, they don't act as evidence at all about the universal generalization ∀x:ϕ(x). So the powerful agent must consult the environment about more complex examples and pay a higher cost to form reasonable beliefs about the generalization.
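Here is a toy numerical version of the contrast (my own illustration, not Christiano's formal setting; the 0.1 prior, the 0.9 likelihood under not-H, and the 20 cheap queries are made-up numbers). The limited agent is genuinely surprised by each cheap instance and updates toward the universal statement; the powerful agent already assigned those instances probability 1, so they carry zero evidence.

```python
def posterior_universal(prior_H: float, k_informative: int, p_true_given_not_H: float = 0.9) -> float:
    """Posterior probability of the universal statement after observing k instances
    that were informative *to this agent* (i.e. not already assigned probability 1),
    given that every observed instance came out true."""
    like_H = 1.0                                   # under H, instances are always true
    like_not_H = p_true_given_not_H ** k_informative
    return prior_H * like_H / (prior_H * like_H + (1 - prior_H) * like_not_H)

prior = 0.1
# Limited agent: each of its 20 cheap queries is a genuine surprise.
print("limited agent :", round(posterior_universal(prior, k_informative=20), 3))
# Powerful agent: it already deduced those 20 values a priori, so they carry
# zero evidence; its belief stays at the prior until it pays for harder queries.
print("powerful agent:", round(posterior_universal(prior, k_informative=0), 3))
```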
Is it really a problem?
However, I argue that the more powerful agent is act...
One of the interesting thing about AI minds (such as LLMs) is that in theory, you can turn many topics into testable science while avoiding the 'problem of old evidence', because you can now construct artificial minds and mold them like putty. They know what you want them to know, and so you can see what they would predict in the absence of knowledge, or you can install in them false beliefs to test out counterfactual intellectual histories, or you can expose them to real evidence in different orders to measure biases or path dependency in reasoning.
With humans, you can't do that because they are so uncontrolled: even if someone says they didn't know about a crucial piece of evidence X, there is no way for them to prove that, and they may be honestly mistaken and have already read about X and forgotten it (but humans never really forget, so X has already changed their "priors", leading to double-counting), or there is leakage. And you can't get people to really believe things at the drop of a hat, so you can't make people imagine, "suppose Napoleon had won Waterloo, how do you predict history would have changed?" because no matter how you try to participate in the spirit of the exerci...
What did Yudkowsky get right?
Pockets of Deep Expertise
Why am I so bullish on academic outreach? Why do I keep hammering on 'getting the adults in the room'?
It's not that I think academics are all Super Smart.
I think rationalists/alignment people correctly ascertain that most professors don't have much useful to say about alignment & deep learning and often say silly things. They correctly see that much of AI progress is fueled by labs and scale, not ML academia. I am bullish on non-ML academia, especially mathematics, physics, and to a lesser extent theoretical CS, neuroscience, and some parts of ML/AI academia. This is because, while I think 95% of academia is bad and/or useless, there are Pockets of Deep Expertise. Most questions in alignment are close to existing work in academia in some sense - but we have to make the connection!
A good example is 'sparse coding' and 'compressed sensing'. Lots of mech.interp has been rediscovering some of the basic ideas of sparse coding. But there is vast expertise in academia about these topics. We should leverage these!
Other examples are singular learning theory, computational mechanics, etc
Feature request: author-driven collaborative editing [CITATION needed] for the Good and Glorious Epistemic Commons.
Often I find myself writing claims which would ideally have citations but I don't know an exact reference, don't remember where I read it, or am simply too lazy to do the literature search.
This is bad: scholarship is a rationalist virtue. Proper citation is key to preserving and growing the epistemic commons.
It would be awesome if my laziness were rewarded by giving me the option to add a [CITATION needed] to which others could then suggest (push) a citation, link or short remark, which the author (me) could then accept. The contribution of the citator is acknowledged, of course. [Even better would be if there were some central database that tracked citations & links, with crosslinking etc. like Wikipedia.]
A sort of hybrid vigor of Community Notes and Wikipedia, if you will. But it's collaborative, not adversarial*
author: blablablabla
sky is blue [citation Needed]
blabblabla
intrepid bibliographer: (push) [1] "I went outside and the sky was blue", Letters to the Empirical Review
*Community Notes on Twitter was a universally lauded concept when it first launched. We are already seeing it being abused unfortunately, often used for unreplyable cheap dunks. I still think it's a good addition to Twitter, but it does show how difficult it is to create shared agreed-upon epistemics in an adversarial setting.
Corrupting influences
The EA AI safety strategy has had a large focus on placing EA-aligned people in A(G)I labs. The thinking was that having enough aligned insiders would make a difference on crucial deployment decisions & longer-term alignment strategy. We could say that the strategy is an attempt to corrupt the goal of pure capability advance & making money towards the goal of alignment. This fits into a larger theme that EA needs to get close to power to have real influence.
[See also the large donations EA has made to OpenAI & Anthropic. ]
Whether this strategy paid off... too early to tell.
What has become apparent is that the large AI labs & being close to power have had a strong corrupting influence on EA epistemics and culture.
Entropy and AI Forecasting
Until relatively recently (2018-2019?) I did not seriously entertain the possibility that AGI in our lifetime was possible. This was a mistake, an epistemic error. A rational observer calmly and objectively considering the evidence for AI progress over the prior decades - especially in the light of rapid progress in deep learning - should have come to the reasonable position that AGI within 50 years was a serious possibility (>10%).
AGI plausibly arriving in our lifetime was a reasonable position. Yet this possibility was almost universally ridiculed or ignored by academics and domain experts. One can find quite funny interviews with AI experts on LessWrong from 15 years ago. The only AI expert agreeing with the Yudkowskian view of AI in our lifetime was Jürgen Schmidhuber. The other dozen AI experts dismissed it as unknowable or even denied the hypothetical possibility of AGI.
Yudkowsky earns a ton of Bayes points for anticipating the likely arrival of AGI in our lifetime long before deep learning took off.
**************************
We are currently experiencing a rapid AI takeoff, plausibly culminating in superintelligence by ...
I know of only two people who anticipated something like what we are seeing far ahead of time: Hans Moravec and Jan Leike.
I didn't know about Jan's AI timelines. Shane Legg also had some decently early predictions of AI around 2030 (~2007 was the earliest I knew about).
Shane Legg had a 2028 median back in 2008, see e.g. https://e-discoveryteam.com/2023/11/17/shane-leggs-vision-agi-is-likely-by-2028-as-soon-as-we-overcome-ais-senior-moments/
Yudkowsky didn't dismiss neural networks, iirc. He just said that there were a lot of different approaches to AI and from the Outside View it didn't seem clear which was promising - and plausibly on an Inside View it wasn't very clear that artificial neural networks were going to work, and work so well.
Re: alignment, I don't follow. We don't know who will be proved ultimately right on alignment, so I'm not sure how you can make such strong statements about whether Yudkowsky was right or wrong on this aspect.
We haven't really gained that many bits on this question, and plausibly will not gain many until later (by which time it might be too late if Yudkowsky is right).
I do agree that Yudkowsky's statements occasionally feel too confidently and dogmatically pessimistic on the question of Doom. But I would argue that the problem is that we simply don't know well because of irreducible uncertainty - not that Doom is unlikely.
Crypticity, Reverse Epsilon Machines and the Arrow of Time?
[see https://arxiv.org/abs/0902.1209 ]
Our subjective experience of the arrow of time is occasionally suggested to be an essentially entropic phenomenon.
This sounds cool and deep but crashes headlong into the issue that the entropy rate and the excess entropy of any stochastic process are time-symmetric. I find it amusing that despite hearing this idea often from physicists and the like, apparently this rather elementary fact has not prevented their storycrafting.
Luckily, computational mechanics provides us with a measure that is not time-symmetric: the stochastic complexity $C_\mu^+$ of the epsilon machine.
For any stochastic process we may also consider the epsilon machine of the reverse process, in other words the machine that predicts the past based on the future. This can be a completely different machine, whose reverse stochastic complexity $C_\mu^-$ is not equal to $C_\mu^+$.
Some processes are easier to predict forward than backward. For example, there is considerable evidence that language is such a process. If the stochastic complexity and the reverse stochastic complexity differ we speak of a causally a...
This sounds cool and deep but crashes headlong into the issue that the entropy rate and the excess entropy of any stochastic process are time-symmetric.
It's time-symmetric around a starting point $t_0$ of low entropy. The further $t$ is from $t_0$, the more entropy you'll have, in either direction. The absolute value $|t - t_0|$ is what matters.
In this case, $t_0$ is usually taken to be the big bang. So the further in time you are from the big bang, the less the universe is like a dense uniform soup with little structure that needs description, and the higher your entropy will be. That's how you get the subjective perception of temporal causality.
Presumably, this would hold on the other side of $t_0$ as well, if there is one. But we can't extrapolate past $t_0$, because close to $t_0$ everything gets really, really energy dense, so we'd need to know how to do quantum gravity to calculate what the state on the other side might look like. So we can't check that. And the notion of time as we're discussing it here might break down at those energies anyway.
Neural Networks have a bias towards Highly Decomposable Functions.
tl;dr Neural networks favor functions that can be "decomposed" into a composition of simple pieces in many ways - "highly decomposable functions".
Degeneracy = bias under uniform prior
[see here for why I think bias under the uniform prior is important]
Consider a space $W$ of parameters used to implement functions, where each element $w \in W$ specifies a function via some map $\pi: W \to \mathcal{F}$. Here, the set $W$ is our parameter space, and we can think of each $w$ as representing a specific configuration of the neural network that yields a particular function $f_w = \pi(w)$.
The mapping $\pi$ assigns each point $w \in W$ to a function $f_w$. Due to redundancies and symmetries in parameter space, multiple configurations might yield the same function, forming what we call a fiber, or the "set of degenerates", $\pi^{-1}(f)$ of $f$.
This fiber is the set of ways in which the same functional behavior can be achieved by different parameterizations. If we uniformly sample parameters (codes) $w \in W$, the degeneracy of a function counts how likely it is to be sampl...
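A quick empirical sketch of "degeneracy = bias under the uniform prior" (my own toy experiment, not anything from the post): sample random weights for a tiny network on 3-bit inputs and count how often each Boolean function shows up. Functions with large fibers, typically the simple ones, dominate the samples.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
# All 2^3 = 8 possible boolean inputs.
X = np.array([[int(b) for b in f"{i:03b}"] for i in range(8)], dtype=float)

def random_net_function(rng, hidden=8):
    """Sample a random 3-hidden-1 tanh network and return the boolean function
    it implements on the 8 inputs, encoded as a length-8 bit string."""
    W1 = rng.normal(size=(3, hidden)); b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=(hidden, 1)); b2 = rng.normal(size=1)
    out = np.tanh(X @ W1 + b1) @ W2 + b2
    return "".join("1" if v > 0 else "0" for v in out.ravel())

counts = Counter(random_net_function(rng) for _ in range(50_000))
# Functions reachable from many parameter settings (large fibers) dominate;
# constant and other simple functions come out far more often than
# 'random-looking' ones.
for fn, c in counts.most_common(5):
    print(fn, c)
```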
I have an embarrassing confession to make. I don't understand why P vs NP is so hard.
[I'm in good company, since apparently Richard Feynman couldn't be convinced it was a serious open problem.]
I think I understand that P vs NP and its many variants, like the existence of one-way functions, are about the computational hardness of certain tasks. It is surprising that we have such strong intuitions that some tasks are computationally hard, yet we fail to be able to prove it!
Of course I don't think I can prove it, and I am not foolish enough to spend a significant amount of time trying to prove it. I still would like to understand the deep reasons why it's so hard to prove computational hardness results. That means I'd like to understand why certain general proof strategies are impossible or very hard.
There is an old argument by Shannon that proves that almost every* Boolean function has exponential circuit complexity. This is a simple counting argument: there are vastly more Boolean functions than there are small circuits. It's hard to give explicit examples of computationally hard functions**, but we can easily show they are plentiful.
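Roughly, the counting goes like this (a schematic version with an unspecified constant $c$, not a careful statement):

```latex
% Number of Boolean functions on n bits:
2^{2^n}
% Crude upper bound on the number of circuits with s fan-in-2 gates
% (each gate chooses an operation and two inputs):
\big(c\,(s+n)^2\big)^{s}
% For circuits of size s to cover all functions we would need
\big(c\,(s+n)^2\big)^{s} \;\ge\; 2^{2^n}
\quad\Longrightarrow\quad
s \;\gtrsim\; \frac{2^n}{n}
% so almost every Boolean function needs circuits of size about 2^n / n.
```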
This would seem to settle...
I'm just a computational complexity theory enthusiast, but my opinion is that a P vs NP centered explanation of computational complexity is confusing. The explanation of NP should happen at the very end of the course.
There is nothing difficult in proving that computationally hard functions exist: the time hierarchy theorem implies that, say, P is not equal to EXPTIME. Therefore, EXPTIME is "computationally hard". What is difficult is to prove that the very specific class of problems which have zero-error polynomial-time verification algorithms is "computationally hard".
Why no prediction markets for large infrastructure projects?
Been reading this excellent piece on why prediction markets aren't popular. They say that without subsidies prediction markets won't be large enough; the information value of prediction markets is often not high enough.
Large infrastructure projects undertaken by governments and other large actors often go over budget, often hilariously so: 3x, 5x, 10x or more is not uncommon, indeed often even the standard.
One of the reasons is that government officials deciding on billion-dollar infrastructure projects don't have enough skin in the game. Politicians are often not in office long enough to care about the time horizons of large infrastructure projects. Contractors don't gain by being efficient or delivering on time. To the contrary, infrastructure projects are huge cash cows. Another problem is that there are often far too many veto-stakeholders. All too often the initial bid is wildly overoptimistic.
Similar considerations apply to other government projects like defense procurement or IT projects.
Okay - how to remedy this situation? Internal prediction markets theoretically could prove beneficial. All stakeholders &...
Fractal Fuzz: making up for size
GPT-3 recognizes 50k possible tokens. For a 1000-token context window that means there are $50000^{1000}$ possible prompts. Astronomically large. If we assume the output of a single run of GPT is 200 tokens, then for each possible prompt there are $50000^{200}$ possible continuations.
GPT-3 is probabilistic, defining for each possible prompt (all $50000^{1000}$ of them) a distribution on a set of size $50000^{200}$, in other words a point in a roughly $50000^{200}$-dimensional space. [1]
Mind-bogglingly large. Compared to these numbers the amount of data (40 trillion tokens??) and the size of the model (175 billion parameters) seem absolutely puny in comparison.
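For a sense of scale, a few lines of arithmetic with the numbers quoted above (the 40 trillion tokens and 175 billion parameters are taken from the text, not independently checked):

```python
import math

vocab, prompt_len, completion_len = 50_000, 1_000, 200

print(f"possible prompts       ~ 10^{prompt_len * math.log10(vocab):.0f}")      # ~10^4699
print(f"possible continuations ~ 10^{completion_len * math.log10(vocab):.0f}")  # ~10^940
print(f"training tokens        ~ 10^{math.log10(40e12):.1f}")                   # ~10^13.6
print(f"parameters             ~ 10^{math.log10(175e9):.1f}")                   # ~10^11.2
```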
I won't be talking about the data, or 'overparameterizations' in this short, that is well-explained by Singular Learning Theory. Instead, I will be talking about nonrealizability.
Nonrealizability & the structure of natural data
Recall the setup of (parametric) Bayesian learning: there is a sample space $X$, a true distribution $q$ on $X$, and a parameterized family of probability distributions $\{p(x \mid w)\}_{w \in W}$.
It is often assumed that the true distrib...
Q: What is it like to understand advanced mathematics? Does it feel analogous to having mastery of another language like in programming or linguistics?
A: It's like being stranded on a tropical island where all your needs are met, the weather is always perfect, and life is wonderful.
Except nobody wants to hear about it at parties.
level 0: A state of ignorance. You live in a pre-formal mindset. You don't know how to formalize things. You don't even know what it would even mean 'to prove something mathematically'. This is perhaps the longest stage. It is the default state of a human. Most anti-theory sentiment comes from this state. Since you've neve
You can't productively read math books. You often decry that these mathematicians make books way too hard to read. If only they would take the time to explain things simply, you would understand.
level 1: all math is an amorphous blob
You know the basics of writing an epsilon-delta proof. Although you don't know why the rules of maths are this way or that, you can at least follow the recipes. You can follow simple short proofs, albeit slowly.
You know there are differen...
I say that knowing particular kinds of math, the kind that let you model the world more-precisely, and that give you a theory of error, isn't like knowing another language. It's like knowing language at all. Learning these types of math gives you as much of an effective intelligence boost over people who don't, as learning a spoken language gives you above people who don't know any language (e.g., many deaf-mutes in earlier times).
The kinds of math I mean include:
Elon building massive 1 million gpu data center in Tennessee. Tens of billions of dollars. Intends to leapfrog competitors.
EA handwringing about Sam Altman & Anthropic-stanning suddenly seems pretty silly?
All the focus on the minutiae of OpenAI & Anthropic may very well end up misplaced.
This doesn't follow. The fact that OpenAI and Anthropic are racing contributes to other people like Musk deciding to race, too. This development just means that there's one more company to criticize.
A recently common viewpoint on the development of AI states that AI will be economically impactful but will not upend the dominance of humans. Instead, AI and humans will flourish together, trading and cooperating with one another. This view is particularly popular with a certain kind of libertarian economist: Tyler Cowen, Matthew Barnett, Robin Hanson.
They share the curious conviction that the probability of AI-caused extinction, p(Doom), is negligible. They base this on analogizing AI with previous technological transitions of humanity, like the industrial revolution or the development of new communication mediums. A core assumption/argument is that AI will not disempower humanity because AIs will respect the existing legal system, apparently because they can gain from trade with humans.
The most extreme version of the GMU economist view is Hanson's Age of Em; it hypothesizes radical change in the form of a new species of human-derived uploaded electronic people, which curiously have just the same dreary office jobs as we do, only done way faster.
Why is there trade & specialization in the first place?
Trade and specializ...
A recently common viewpoint on the development of AI states that AI will be economically impactful but will not upend the dominance of humans. Instead, AI and humans will flourish together, trading and cooperating with one another. This view is particularly popular with a certain kind of libertarian economist: Tyler Cowen, Matthew Barnett, Robin Hanson.
They share the curious conviction that the probability of AI-caused extinction, p(Doom), is negligible. They base this on analogizing AI with previous technological transitions of humanity, like the industrial revolution or the development of new communication mediums. A core assumption/argument is that AI will not disempower humanity because AIs will respect the existing legal system, apparently because they can gain from trade with humans.
I think this summarizes my view quite poorly on a number of points. For example, I think that:
AI is likely to be much more impactful than the development of new communication mediums. My default prediction is that AI will fundamentally increase the economic growth rate, rather than merely continuing the trend of the last few centuries.
Biological humans are very unlikely to remain
Of Greater Agents and Lesser Agents
How do more sophisticated decision-makers differ from less sophisticated decision-makers in their behaviour and values?
Smarter, more sophisticated decision-makers engage in more numerous and more complex commitments, including meta-commitments not to commit. Consequently, the values and behaviour of these more sophisticated decision-makers ("Greater Agents") are systematically biased compared to those of less sophisticated decision-makers ("Lesser Agents").
*******************************
Compared to Lesser Agents, the Greater Agents are more judgemental and (self-)righteous, punish naiveté, are more long-term oriented, adaptive, malleable, self-modifying, legibly trustworthy and strategic, practice more virtue-signalling, engage in more self-reflection & metacognition, more thinking and less doing, rely more on symbolic reasoning, are more consistent & 'rational' in their preferences, like money & currency more and sacred values less, value thinking over doing, are engaged in more "global" conflicts [including multiverse-wide conflicts through acausal trade], are less empirical, more rational, more universalistic in their morals, and more cosmopolitan in their aesthetics, they...
Thermal vision cuts right through tree cover, traditional camouflage and the cover of night.
Human soldiers in the open are helpless against cheap FPV drones with thermal vision.
A YouTube channel went through a dozen countermeasures. Nothing worked except one: umbrellas.
Wildlife Welfare Will Win
The long arc of history bends towards gentleness and compassion. Future generations will look with horror on factory farming. And already young people are following this moral thread to its logical conclusion, turning their eyes in disgust to mother nature, red in tooth and claw. Wildlife Welfare Done Right - compassion towards our pets followed to its forceful conclusion - would entail the forced uploading of all higher animals and, judging by the memetic virulence of shrimp welfare, of lower animals as well.
Morality-upon-reflex...
Hot Take #44: Preaching to the choir is 'good' actually.
EDIT: I was wrong. Theo the French Whale was the sharp. From the Kelly formula and his own statements, his all-things-considered probability was 80-90% - he would need to possess an enormous amount of private information to justify such a deviation from other observers. It turns out he did: he commissioned his own secret polls using a novel polling method to compensate for the shy Trump voter.
https://x.com/FellowHominid/status/1854303630549037180
The French rich idiot who bought 75 million dollars of Trump is an EA hero, win or lose.
LW loves prediction markets...
I disagree with “He seems to have no inside information.” He presented himself as having no inside information, but that’s presumably how he would have presented himself regardless of whether he had inside information or not. It’s not like he needed to convince others that he knows what he’s doing, like how in the stock market you want to buy then pump then sell. This is different—it’s a market that’s about to resolve. The smart play from his perspective would be to aggressively trash-talk his own competence, to lower the price in case he wants to buy more.
Looks likely that tonight is going to be a massive transfer of wealth from "sharps"(among other people) to him. Post hoc and all, but I think if somebody is raking in huge wins while making "stupid" decisions it's worth considering whether they're actually so stupid after all.
>> 'a massive transfer of wealth from "sharps" '.
no. That's exactly the point.
1. There might not be any real sharps (= traders with access to real private arbitrageable information who consistently take risk-neutral bets on it) in this market at all.
This is because a) this might simply be a noisy, high-entropy source that is inherently difficult to predict, hence there is little arbitrageable information, and/or b) sharps have not been sufficiently incentiviz
2. The transfer of wealth is actually disappointing because Theo the French Whale moved the price so much.
For an understanding of what the trading decisions of a verifiable sharp look like, one should take a look at Jim Simons' Medallion fund. They do enormous hidden information collection and use mysterious computer models, but at the end of the day they take a large number of very hedged, tiny-edge positions.
***************************************************
You are misunderstanding my argument (and most of the LW commentariat with you). I might note that I made my statement before the election result and clearly said 'win or lose' but it seems that even on LW people think winning on a noisy N=1 sample is proof of rationality.
That's why I said: "In expectation", "win or lose"
That the coinflip came out one way rather than another doesn't prove the guy had actual inside knowledge. He bought a large part of the shares at crazy odds because his market impact moved the price so much.
But yes, he could be a sharp in sheep's clothing. I doubt it, but who knows. EDIT: I calculated the implied private odds that this guy would have to have as a rational Kelly bettor. Suffice to say these private odds seem unrealistic for election betting.
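For readers who want to reproduce this kind of back-of-the-envelope reasoning, here is a minimal sketch of the Kelly arithmetic (the 60-cent price and the bankroll fraction are made-up numbers for illustration, not Theo's actual position):

```python
def kelly_fraction(q: float, price: float) -> float:
    """Kelly-optimal fraction of bankroll to bet on a binary contract that
    costs `price` and pays out 1, given subjective win probability `q`."""
    b = (1.0 - price) / price          # net odds per dollar staked
    return (b * q - (1.0 - q)) / b     # equivalently (q - price) / (1 - price)

def implied_probability(fraction: float, price: float) -> float:
    """Invert the Kelly formula: what win probability would make betting
    this fraction of one's bankroll optimal?"""
    return price + fraction * (1.0 - price)

# Illustrative numbers: contract at 60 cents, bettor stakes half his bankroll.
print(kelly_fraction(q=0.85, price=0.60))             # ~0.62 of bankroll
print(implied_probability(fraction=0.5, price=0.60))  # implied q = 0.80
```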
Point is that the winners contribute epistemics and the losers contribute money. The real winner is society [if the questions are about socially-relevant topics].
On the word 'theory'.
The word 'theory' is oft used and abused.
There are two ways 'theory' is used that are different and often lead to confusion.
Theory in the scientific sense
The way a physicist would use it: it's a model of the world that is either right or wrong. There might be competing theories and we need empirical evidence to figure out which one is right. Ideally, they agree with empirical evidence or at least are highly falsifiable. Importantly, if two theories are to conflict they need to actually speak about the same variables, the...
[see also Hanson on rot, generalizations of the second law to nonequilibrium systems (Baez-Pollard, Crutchfield et al.) ]
Imperfect Persistence of Metabolically Active Engines
All things rot. Individual organisms, societies-at-large, businesses, churches, empires and maritime republics, man-made artifacts of glass and steel, creatures of flesh and blood.
Conjecture #1 There is a lower bound on the amount of dissipation / rot that any metabolically-active engine creates.
Conjecture #2 Metabolic Rot of an engine is proportional to (1) size and complexity o...
The Sun revolves around the Earth actually
The traditional story is that in olden times people were proudly stupid and thought the human animal lived at the centre of the universe, with all the planets, stars and the sun revolving around God's creation, made in his image. The church would send anybody who said the sun was at the centre to be burned at the stake. [1]
Except...
there is no absolute sense in which the sun is at the centre of the solar system [2]. It's simply a question of perspective, a choice of frame of reference.
EDIT 06/11/2024 My thinking has crystallized more on these topics. The current version is lacking but I believe may be steelmanned to a degree.
"I dreamed I was a butterfly, flitting around in the sky; then I awoke. Now I wonder: Am I a man who dreamt of being a butterfly, or am I a butterfly dreaming that I am a man?"- Zhuangzi
Questions I have that you might have too:
Mindmeld
In theory AIs can transmit information far faster and more directly than humans. They can directly send weight/activation vectors to one another. The most important variable for whether entities (cells, organisms, polities, companies, ideologies, empires, etc.) stay individuals or amalgamate into a superorganism is communication bandwidth & copy fidelity.
Both of these differ by many orders of magnitude for humans versus AIs. At some point, mere communication becomes a literal melding of minds. It seems quite plausible then that AIs will tend to mind...
God exists because the most reasonable take is the Solomonoff Prior.
A funny consequence of that is that Intelligent Design will have a fairly large weight in the Solomonoff prior. Indeed the simulation argument can be seen as a version of Intelligent Design.
The Abrahamic God hypothesis is still substantially downweighted because it seems to involve many contingent bits - i.e. noisy random bits that can't be compressed. The Solomonoff prior therefore has to downweight them.
Clem's Synthetic- Physicalist Hypothesis
The mathematico-physicalist hypothesis states that our physical universe is actually a piece of math. It was famously popularized by Max Tegmark.
It's one of those big-brain ideas that sound profound when you first hear about it, then you think about it some more and you realize it's vacuous.
Recently, in a conversation with Clem von Stengel, they suggested a version of the mathematico-physicalist hypothesis that I find thought-provoking.
Synthetic mathematics
'Synthetic' mathematics is a bit of a weird name...
Know your scientific competitors.
In trading, entering a market dominated by insiders without proper research is a sure-fire way to lose a lot of money and time. Fintech companies go to great lengths to uncover their competitors' strategies while safeguarding their own.
A friend who worked in trading told me that traders would share subtly incorrect advice on trading Discords to mislead competitors and protect their strategies.
Surprisingly, in many scientific disciplines researchers are often curiously incurious about their peers' work.
The long f...
Agent Foundations Reading List [Living Document]
This is a stub for a living document on a reading list for Agent Foundations.
Causality
Book of Why, Causality - Pearl
Probability theory
Logic of Science - Jaynes
Are Solomonoff Daemons exponentially dense?
Some doomers have very strong intuitions that doom is almost assured for almost any kind of building AI. Yudkowsky likes to say that alignment is about hitting a tiny part of values space in a vast universe of deeply alien values.
Is there a way to make this more formal? Is there a formal model in which some kind of Solomonoff daemon / mesa-optimizer / gremlin in the machine starts popping up all over the place as the cognitive power of the agent is scaled up?
Imagine that a magically powerful AI decides to set a new political system for humans and create a "Constitution of Earth" that will be perfectly enforced by local smaller AIs, while the greatest one travels away to explore other galaxies.
The AI decides that the most fair way to create the constitution is randomly. It will choose a length, for example 10000 words of English text. Then it will generate all possible combinations of 10000 English words. (It is magical, so let's not worry about how much compute that would actually take.) Out of the generated combinations, it will remove the ones that don't make any sense (an overwhelming majority of them) and the ones that could not be meaningfully interpreted as "a constitution" of a country (this is kinda subjective, but the AI does not mind reading them all, evaluating each of them patiently using the same criteria, and accepting only the ones that pass a certain threshold). Out of the remaining ones, the AI will choose the "Constitution of Earth" randomly, using a fair quantum randomness generator.
Shortly before the result is announced, how optimistic would you feel about your future life, as a citizen of Earth?
[this is a draft. I strongly welcome comments]
A blockade of Taiwan seems significantly more likely than a full-scale invasion. The US's non-intervention in Ukraine suggests similar restraint might occur with Taiwan.
Nevertheless, Metaculus predicts a 65% chance of a US military response to a Chinese invasion and separately gives 20-50% for some kind of Chinese military intervention by 2035. Let us imagine that the worst comes to pass and China and the United States are engaged in a hot war.
China's...
Four levels of information theory
There are four levels of information theory.
Level 1: Number Entropy
Information is measured by the Shannon entropy $H(X)$, a single number.
Level 2: Random variable
Look at the underlying random variable, the 'surprisal' $-\log p(X)$, of which the entropy is the expectation.
Level 3: Coding functions
Shannon's source coding theorem says entropy of a source is the expected number of bits for an optimal encoding of samples of .
Related quantities like mutual information, relative entropy, cross e...
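A minimal sketch of levels 1 and 2 in code; the toy distribution and sample size are my own choices, purely for illustration:

```python
import math
import random

# Level 1 vs Level 2 on a toy distribution: entropy as a single number versus
# surprisal as a random variable whose expectation is that number.
p = {"a": 0.5, "b": 0.25, "c": 0.25}

# Level 1: entropy as a number (in bits)
H = -sum(q * math.log2(q) for q in p.values())

# Level 2: the surprisal -log2 p(x) is itself a random variable
def surprisal(x):
    return -math.log2(p[x])

samples = random.choices(list(p), weights=list(p.values()), k=100_000)
mean_surprisal = sum(surprisal(x) for x in samples) / len(samples)

print(H)               # 1.5 bits
print(mean_surprisal)  # ≈ 1.5 bits, as expected

# Level 3 (coding): a prefix code with 1 bit for "a" and 2 bits each for "b"
# and "c" achieves an expected code length equal to the entropy here.
```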
Idle thoughts about UDASSA I: the Simulation hypothesis
I was talking to my neighbor about UDASSA the other day. He mentioned a book I keep getting recommended but never read where characters get simulated and then the simulating machine is progressively slowed down.
One would expect one wouldn't be able to notice from inside the simulation that the simulating machine is being slowed down.
This presents a conundrum for simulation style hypotheses: if the simulation can be slowed down 100x without the insiders noticing, why not 1000x or 10^100x or ...
Looking for specific tips and tricks to break AI out of formal/corporate writing patterns. Tried style mimicry ('write like Hemingway') and direct requests ('be more creative') - both fell flat. What works?
Should I be using different AI models (I am using GPT and Claude)? The base models output an enormous creative storm, but somehow the RLHF has partially lobotomized LLMs such that they always seem to output either cheesy stereotypes or overly verbose academese/corporatespeak.
Is true Novelty a Mirage?
One view on novelty is that it's a mirage. Novelty is 'just synthesis of existing work, plus some randomness.'
I don't think that's correct. I think true novelty is more subtle than that. Yes, sometimes novel artforms or scientific ideas come from noisily mixing existing ideas, but does that describe all forms of novelty?
A reductio ad absurdum of the novelty-as-mirage point of view is that all artforms that have appeared since the dawn of time are simply noised versions of cave paintings. This seems absurd.
Consider AlphaGo. Does AlphaGo jus...
The Virtue of Comparison Shopping
Comparison shopping, informed in-depth reviewing, and answering customer surveys plausibly have substantial positive externalities. They provide incentives through local actors, avoid preference falsification and social desirability bias, and are non-coercive & market-based.
Plausibly this even has a better social impact than many kinds of charitable donations or direct work. This is not a high bar, since the latter seem to include many interventions with negligible or even negative impact.
West African athletes dominate sprinting events, East Africans excel in endurance running, and despite their tiny population Icelanders have shown remarkable prowess in weightlifting competitions. We examine the Gaussian approximation for a simple additive genetic model for these observations.
Let's begin by considering a simple additive genetic model. In this model, a trait T is influenced by n independent genes, each contributing a small effect, along with environmental ...
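A minimal simulation sketch of that additive model; the allele frequencies, noise scale and cutoff below are entirely made-up numbers, just to show the approximately Gaussian shape and why small mean shifts dominate at the extreme tail:

```python
import numpy as np

# Trait = sum of n_genes small independent allele effects + environmental noise,
# hence approximately Gaussian by the CLT. The tail comparison is my own
# illustration; every concrete number here is arbitrary.
rng = np.random.default_rng(0)
n_genes, n_people = 1_000, 1_000_000

def sample_trait(allele_freq):
    genetic = rng.binomial(n_genes, allele_freq, size=n_people)  # sum of Bernoulli effects
    environment = rng.normal(0, 5, size=n_people)
    return genetic + environment

pop_a = sample_trait(0.50)   # baseline population
pop_b = sample_trait(0.51)   # tiny shift in average allele frequency

cutoff = np.quantile(pop_a, 0.9999)  # "elite" threshold: top 0.01% of population A
print((pop_a > cutoff).mean())       # ~1e-4 by construction
print((pop_b > cutoff).mean())       # roughly an order of magnitude larger, even though
                                     # the two distributions look nearly identical overall
```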
tl;dr
Salmon swimming upstream to their birthing grounds to breed may be a rare instance of group selection.
Pure Aryan Salmon
Salmon engage in anadromous reproduction; they risk their lives to swim up rivers to return to their original place of birth and reproduce there.
Most species of salmon die there, only reproducing at the birthing grounds. Many don't make it at all. The ones that survive the run upstream will die shortly after, a biologically triggered death sentence. If the cost is immense, the benefits must be even greater.
The more u...
Why do people like big houses in the countryside /suburbs?
Empirically, people move out to the suburbs/countryside when they have children and/or gain wealth. Having a big house with a large yard is the quintessential American dream.
But why? Dense cities are economically more productive, and commuting is measurably one of the worst factors for happiness and productivity. Raising kids in small houses is entirely possible, and people have done so at far higher densities in the past.
Yet people will spend vast amounts of money on living in a large house wi...
I can report my own feelings with regards to this. I find cities (at least the American cities I have experience with) to be spiritually fatiguing. The constant sounds, the lack of anything natural, the smells - they all contribute to a lack of mental openness and quiet inside of myself.
The older I get the more I feel this.
Jefferson had a quote that might be related, though to be honest I'm not exactly sure what he was getting at:
I think our governments will remain virtuous for many centuries; as long as they are chiefly agricultural; and this will be as long as there shall be vacant lands in any part of America. When they get piled upon one another in large cities, as in Europe, they will become corrupt as in Europe. Above all things I hope the education of the common people will be attended to; convinced that on their good sense we may rely with the most security for the preservation of a due degree of liberty.
One interpretation is that Jefferson thought there was something spiritually corrupting about cities. This is supported by another quote:
...
I view great cities as pestilential to the morals, the health and the liberties of man. true, they nourish some of the eleg
Why (talk-)Therapy
Therapy is a curious practice. It sounds like a scam, quackery, pseudo-science, yet RCTs consistently show that therapy has benefits above and beyond medication & placebo.
Therapy has a long history. The Dodo bird verdict states that it doesn't matter which form of therapy you do - they all work equally well. This suggests that priests and shamans served the functions of a therapist: in the past, one would confess one's sins to a priest, or speak with the local shaman.
There is also the thing that therapy ...
[This is joint thinking with Sam Eisenstat. Also thanks to Caspar Oesterheld for his thoughtful comments. Thanks to Steve Byrnes for pushing me to write this out.]
The Hyena problem in long-term planning
Logical induction is a nice framework for thinking about bounded reasoning. Very soon after the discovery of logical induction, people tried to make logical-inductor decision-makers work. This is difficult; one of the two obstacles is
Obstacle 1: Untaken Actions are not Observable
Caspar Oesterheld brilliantly solved this problem by using auction ma...
Latent abstractions Bootlegged.
Let $X_1, \dots, X_n$ be random variables distributed according to a probability distribution $P$ on a sample space $\Omega$.
Defn. A (weak) natural latent of $X_1, \dots, X_n$ is a random variable $\Lambda$ such that
(i) the $X_i$ are independent conditional on $\Lambda$
(ii) [reconstructability] $\Lambda$ can be reconstructed from $X_{\neq i}$, for all $i$
[This is not really reconstructability, more like a stability property. The information is contained in many parts of the system... I might al...
Inspired by this Shalizi paper defining local causal states. The idea is so simple and elegant I'm surprised I had never seen it before.
Basically, starting with a factored probability distribution over a dynamical DAG, we can use Crutchfield's causal-state construction locally to construct a derived causal model factored over the same dynamical DAG. For each variable $x$ in the DAG, consider its past and future lightcones, defined as all those points/variables which influence, respectively are influenced by, $x$ (in a causal, interventional sense). Now define the equivalence relation on realizations of the past lightcone of $x$ (which includes $x$ by definition)[1]: two realizations are equivalent whenever the conditional probability distributions they induce on the future lightcone are equal.
These factored probability distributions over dynamical DAGs are called 'fields' by physicists. Given any field we define a derived local causal state field in the above way. Woah!
...
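A rough sketch of the construction on a toy 1-D probabilistic cellular automaton; the noisy-majority dynamics, the depth-1 lightcones and the crude binning of conditional distributions are all my own simplifications, not from the Shalizi paper:

```python
import numpy as np
from collections import defaultdict

# Simulate a noisy-majority cellular automaton, then group past-lightcone
# configurations by the conditional distribution they induce on the future
# lightcone: the equivalence classes are (approximate) local causal states.
rng = np.random.default_rng(0)
T, W, noise = 2000, 64, 0.1

field = np.zeros((T, W), dtype=int)
field[0] = rng.integers(0, 2, W)
for t in range(1, T):
    left, mid, right = np.roll(field[t-1], 1), field[t-1], np.roll(field[t-1], -1)
    majority = ((left + mid + right) >= 2).astype(int)   # deterministic local rule...
    flip = rng.random(W) < noise                         # ...plus noise
    field[t] = np.where(flip, 1 - majority, majority)

# depth-1 past lightcone of (i, t): the three cells above it plus the cell itself;
# the future lightcone is truncated to the single cell (i, t+1)
futures = defaultdict(list)
for t in range(1, T - 1):
    for i in range(1, W - 1):
        past = (field[t-1, i-1], field[t-1, i], field[t-1, i+1], field[t, i])
        futures[past].append(field[t+1, i])

# two pasts are equivalent iff they give the same conditional future distribution
# (for a binary future this is just P(next cell = 1)); binning is a crude proxy
cond = {past: np.mean(vals) for past, vals in futures.items()}
states = defaultdict(list)
for past, p1 in cond.items():
    states[round(p1, 1)].append(past)
print(len(cond), "past configurations collapse into", len(states), "local causal states")
```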
Reasons to think Lobian Cooperation is important
Usually modal Lobian cooperation is dismissed as not relevant for real situations, but it is plausible that Lobian cooperation extends far more broadly than what has currently been proved.
It is plausible that much of the cooperation we see in the real world is actually approximate Lobian cooperation rather than purely given by traditional game-theoretic incentives.
Lobian cooperation is far stronger in cases where the players resemble each other and/or have access to one another's blueprint. This is ...
Pseudorandom warp fields
[tl;dr the loss landscape around a set of weights encoding an unlearnable 'pseudorandom' function will be warped in such a way that gradient optimizers will bob around for exponentially long. ]
Computational learning theory contains numerous 'no-go' results indicating that many functions are not tractably learnable.
The most classical results are probably VC dimension and PAC learnability. A good example to think about is parity functions. The output is, in some sense, ver...
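A toy illustration (not a proof) of the parity obstruction; the architecture and hyperparameters below are arbitrary choices of mine:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Train a small MLP on the parity of 16 random bits. Typically it fits the
# training set far better than chance yet stays near 50% on held-out inputs,
# because parity has no low-order structure for gradient descent to exploit.
rng = np.random.default_rng(0)
n_bits, n_train, n_test = 16, 2000, 2000

X = rng.integers(0, 2, size=(n_train + n_test, n_bits))
y = X.sum(axis=1) % 2  # parity of all bits

clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=3000, random_state=0)
clf.fit(X[:n_train], y[:n_train])

print("train acc:", clf.score(X[:n_train], y[:n_train]))   # typically high (memorization)
print("test  acc:", clf.score(X[n_train:], y[n_train:]))   # typically ≈ 0.5 (chance)
```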
Self-Rituals as Schelling loci for Self-control and OCD
Why do people engage in non-social rituals, 'self-rituals'? These are very common and can even become pathological (OCD).
High-self-control people seem to have OCD-like symptoms more often.
One way to think about self-control is as a form of internal bargaining between internal subagents. From this perspective, self-control and time-discounting can be seen as a resource. In the absence of self-control the superagent
D...
I feel like the whole "subagent" framework suffers from the homunculus problem: we fail to explain behavior using the abstraction of a coherent agent, so we move to the abstraction of multiple coherent agents, and while that can be useful, I don't think it reflects the actual mechanistic truth about minds.
When I plan something and then fail to execute the plan, it's mostly not a "failure to bargain". It's just that when I plan something I usually imagine the good consequences of the plan, and these consequences make me excited; then I start executing the plan and get hit by multiple unpleasant details of reality. Coherent structure emerges from multiple not-really-agentic pieces.
(conversation with Scott Garrabrant)
Destructive Criticism
Sometimes you can say something isn't quite right but you can't provide an alternative.
Difference between 'generation of ideas' and 'filtration of ideas' - i.e. babble and prune.
ScottG: Bayesian learning assumes we are in a babble-rich environment and only does pr...
Reasonable interpretations of Recursive Self Improvement are either trivial, tautological or false?
Trivial but important
Aumann agreement can fail for purely epistemic reasons because real-world minds do not do Bayesian updating. Bayesian updating is intractable so realistic minds sample from the prior. This is how e.g. gradient descent works and also how human minds work.
In this situation, two minds can end up in two different basins with similar loss on the data, because of computational limitations. These minds can have genuinely different expectations about generalization.
(Of course this does not contradict the statement of the theorem which is correct.)
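A small illustration of the different-basins point, using nothing beyond off-the-shelf regressors; the task and seeds are arbitrary:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Two "minds" trained on the same sparse data from different random seeds land
# in different basins: near-identical fit on the training data, yet sharply
# different predictions where the data does not pin them down.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1, 1, size=(30, 1))
y_train = np.sin(3 * X_train).ravel() + rng.normal(0, 0.01, 30)

mind_a = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=1).fit(X_train, y_train)
mind_b = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000, random_state=2).fit(X_train, y_train)

X_far = np.array([[2.5]])  # far outside the training range
print(mind_a.predict(X_far), mind_b.predict(X_far))                    # often very different
print(mind_a.score(X_train, y_train), mind_b.score(X_train, y_train))  # both fit comparably well
```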
Imprecise Information theory
Would like a notion of entropy for credal sets. Diffractor suggests the following:
Let $\mathcal{C}$ be a credal set.
Then the entropy of $\mathcal{C}$ is defined as
where $H$ denotes the usual Shannon entropy.
I don't like this since it doesn't satisfy the natural desiderata below.
Instead, I suggest the following. Let $p_{\max}$ denote the (absolute) maximum entropy distribution, i.e. $p_{\max} = \operatorname{argmax}_p H(p)$, and let ...
Desideratum 1: ...
Roko's basilisk is a thought experiment which states that an otherwise benevolent artificial superintelligence (AI) in the future would be incentivized to create a virtual reality simulation to torture anyone who knew of its potential existence but did not directly contribute to its advancement or development.
Why Roko's basilisk probably doesn't work for simulation fidelity reasons:
Roko's basilisk threatens to simulate and torture you in the future if you don't comply. Simulation cycles cost resources. Instead of following through on torturing our wo...
All concepts can be learnt. All things worth knowing may be grasped. Eventually.
All can be understood - given enough time and effort.
For a Turing-complete organism, there is no qualitative gap between knowledge and ignorance.
No qualitative gap but one. The true qualitative difference: quantity.
Often we simply miss a piece of data. The gap is too large - we jump and never reach the other side. A friendly hominid who has trodden the path before can share their journey. Once we know the road, there is no mystery. Only effort and time. Some hominids choose not to share their journey. We keep a special name for these singular hominids: genius.
Abnormalised sampling?
Probability theory talks about sampling from probability distributions, i.e. normalized measures. However, non-normalized measures abound: weighted automata, infra-stuff, uniform priors on noncompact spaces, wealth in logical-inductor-esque math, quantum stuff?? etc.
Most probability-theory constructions go through just fine for arbitrary measures and don't need the normalization assumption. Except, crucially, sampling.
What does it even mean to sample from a non-normalized measure? What is unnormalized abnormal sampling?
I don't know....
SLT and phase transitions
The morphogenetic SLT story says that during training the Bayesian posterior concentrates around a series of subspaces $W_1, W_2, \dots$ with RLCTs $\lambda_1, \lambda_2, \dots$ and losses $L_1, L_2, \dots$. As the size $n$ of the data sample is scaled, the Bayesian posterior makes transitions, trading off higher complexity (higher $\lambda$) for better accuracy (lower loss $L$).
This is the radical new framework of SLT: phase transitions happen i...
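One standard way to see where these transitions happen is via the asymptotic free energy per phase; this is a heuristic sketch in my own notation, not necessarily the framing the shortform intends:

```latex
% Free energy of phase W_i at sample size n, and its posterior weight:
\[
  F_n(i) \;\approx\; n L_i + \lambda_i \log n,
  \qquad
  P(W_i \mid D_n) \;\propto\; e^{-F_n(i)}.
\]
% A simpler but lossier phase (\lambda_1 < \lambda_2, L_1 > L_2) dominates until roughly
\[
  \frac{n^\ast}{\log n^\ast} \;\approx\; \frac{\lambda_2 - \lambda_1}{L_1 - L_2},
\]
% after which the posterior mass jumps to the lower-loss, higher-complexity phase:
% a phase transition in n.
```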
Alignment by Simulation?
I've heard this alignment plan that is a variation of 'simulate top alignment researchers' with an LLM. Usually the poor alignment researcher in question is Paul.
This strikes me as deeply unserious and I am confused why it is having so much traction.
That AI-assisted alignment is coming (indeed, is already here!) is undeniable. But even somewhat accurately simulating a human from text data is a crazy sci-fi ability, probably not even physically possible. It seems to ascribe nearly magical abilities to LLMs.
Predicting...
[Edit 15/05/2024: I currently think that both forward and backward chaining paradigms are missing something important. Instead, there is something like 'side-chaining' or 'wide-chaining' where you are investigating how things are related forwardly, backwardly and sideways to make use of synergistic information.]
Optimal Forward-chaining versus backward-chaining.
In general, this is going to depend on the domain. In environments for which we have many expert samples and many existing techniques, backward-chaining is key (i.e. deploying r...
Thin versus Thick Thinking
Thick: aggregate many noisy sources to make a sequential series of actions in mildly related environments, model-free RL
Cardinal sins: failure of prioritization / not throwing away enough information, nerdsnipes, insufficient aggregation, trusting too much in any particular model, indecisiveness, overfitting on noise, ignoring the consensus of experts / social reality
default of the ancestral environment
CEOs, generals, doctors, economists, police detectives in the real world, traders
Thin: precise, systematic analysis, preferably ...
[Thanks to Vlad Firoiu for helping me]
An Attempted Derivation of the Lindy Effect
Wikipedia:
The Lindy effect (also known as Lindy's Law[1]) is a theorized phenomenon by which the future life expectancy of some non-perishable things, like a technology or an idea, is proportional to their current age.
Laplace's Rule of Succession
What is the probability that the Sun will rise tomorrow, given that it has risen every day for 5000 years?
Let $p$ denote the probability that the Sun will rise tomorrow. A priori we have no information on the value of $p$...
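One way the derivation can go, under the usual uniform-prior assumption (a sketch, with discrete survival periods standing in for age):

```latex
% Rule of succession: survival each period is a coin with unknown bias p, uniform prior on p.
\[
  P(\text{survive next period} \mid n \text{ survived}) = \frac{n+1}{n+2},
  \qquad
  P(\text{survive } k \text{ more} \mid n \text{ survived})
    = \prod_{j=0}^{k-1} \frac{n+j+1}{n+j+2}
    = \frac{n+1}{n+k+1}.
\]
% Setting the survival probability to 1/2 gives a median residual lifetime of
% k \approx n + 1: expected remaining life grows in proportion to current age,
% which is the Lindy effect.
```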
Generalized Jeffreys prior for singular models?
For singular models the Jeffreys prior is not well-behaved, for the simple reason that it will be zero at the minima of the loss function.
Does this mean the Jeffreys prior is only of interest for regular models? I beg to differ.
Usually the Jeffreys prior is derived as the parameterization-invariant prior. There is another way of thinking about it: as arising from an 'indistinguishability prior'.
The argument is delightfully simple: given two weights, if they encode the same distributi...
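For reference, a sketch of the two views in formulas; this is my reconstruction of the standard facts, not necessarily the exact argument the truncated text goes on to give:

```latex
% Jeffreys prior and the local KL geometry:
\[
  \pi_{\mathrm{Jeffreys}}(w) \;\propto\; \sqrt{\det I(w)},
  \qquad
  D_{\mathrm{KL}}\!\left(p_w \,\Vert\, p_{w+\delta}\right) \;\approx\; \tfrac{1}{2}\,\delta^{\top} I(w)\,\delta .
\]
% Indistinguishability view: the volume of weights within KL-distance \varepsilon of w
% scales like \varepsilon^{d/2}/\sqrt{\det I(w)}, so weighting each distinguishable
% distribution equally yields a density proportional to \sqrt{\det I(w)}.
% In a singular model I(w) is degenerate on the zero-loss set, so \det I(w) = 0 there
% and the Jeffreys prior vanishes exactly where the posterior wants to concentrate.
```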
"The links between logic and games go back a long way. If one thinks of a debate as a kind of game, then Aristotle already made the connection; his writings about syllogism are closely intertwined with his study of the aims and rules of debating. Aristotle’s viewpoint survived into the common medieval name for logic: dialectics. In the mid twentieth century Charles Hamblin revived the link between dialogue and the rules of sound reasoning, soon after Paul Lorenzen had connected dialogue to constructive foundations of logic." from the Stanford Encyclopedia ...
Ambiguous Counterfactuals
[Thanks to Matthias Georg Mayer for pointing me towards ambiguous counterfactuals]
Salary is a function of eXperience and Education: $S = f(X, E)$.
We have a candidate with given salary $s$, experience $x$ and education $e$.
Their current salary is given by $s = f(x, e)$.
We'd like to consider the counterfactual where they didn't have the education $e$. How do we evaluate their salary in this counterfactual?
This is slightly ambiguous - there are two counterfactuals:
or
In the second c...
Insights as Islands of Abductive Percolation?
I've been fascinated by this beautiful paper by Viteri & DeDeo.
What is a mathematical insight? We feel intuitively that proving a difficult theorem requires discovering one or more key insights. Before we get into what the Viteri-DeDeo paper has to say about (mathematical) insights, let me recall some basic observations on the nature of insights:
(see also my previous shortform)
The pseudorandom lie under the Lava lamp
Our observations are compatible with a world that is generated by a Turing machine with just a couple thousand bits.
That would mean that all the seemingly random bits we see in Geiger counters, lava lamps, gases and the like are in actuality only pseudorandom.
Evidence Manipulation and Legal Admissible Evidence
[This was inspired by Kokotajlo's shortform on comparing strong with weak evidence]
In the real world the weight of many pieces of weak evidence is not always comparable to a single piece of strong evidence. The important variable here is not strong versus weak per se but the source of the evidence. Some sources of evidence are easier to manipulate in various ways. Evidence manipulation, either conscious or emergent, is common and a large obstacle to truth-finding.
Consider aggregating many ...
Imagine a data stream $\dots, x_{-2}, x_{-1}, x_0, x_1, x_2, \dots$
assumed infinite in both directions for simplicity. Here $x_0$ represents the current state (the "present"), $\dots, x_{-2}, x_{-1}$ the past, and $x_1, x_2, \dots$ the future.
Predictable Information versus Predictive Information
Predictable information is the maximal information (in bits) that you can derive about the future given access to the past. Predictive information is the number of bits from the past that you need in order to make that optimal prediction.
Suppose you are...
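One way to formalize the pair, borrowing computational-mechanics notation; this is my gloss and the shortform may intend something different:

```latex
% Predictable information: the mutual information between past and future,
\[
  E \;=\; I\!\left(\overleftarrow{X};\, \overrightarrow{X}\right),
\]
% Predictive information: the bits of the past you must store to predict optimally,
% e.g. the entropy of the causal state \epsilon(\overleftarrow{X}):
\[
  C_\mu \;=\; H\!\left(\epsilon(\overleftarrow{X})\right).
\]
% In general E \le C_\mu: you may need to remember more than you can ever predict.
```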
Hopfield Networks = Ising Models = Distributions over Causal models?
Given a joint probability distribution, there might famously be many 'Markov' factorizations. Each corresponds to a different causal model.
Instead of choosing a particular one, we might have a distribution of beliefs over these different causal models. This feels basically like a Hopfield network / Ising model.
You have a distribution over nodes and an 'interaction' distribution over edges.
The distribution over nodes corresponds to the joint probability di...