All of Ted Sanders's Comments + Replies

We can already see what people do with their free time when basic needs are met. A number of technologies have enabled new hacks to set up 'fake' status games that are more positive-sum than ever before in history:

  • Watch broadcast sports, where you can feel like a winner (or at least feel connected to a winner), despite not having had to win yourself
  • Play video games with AI opponents, where you can feel like a winner, despite it not being zero-sum against other humans
  • Watch streamers and influencers to feel connected to high status people, without having to
... (read more)

Management consulting firms have lots of great ideas on slide design: https://www.theanalystacademy.com/consulting-presentations/

 

Some things they do well:

  • They treat slides as documents that can be understood standalone (this is even useful when presenting, as not everyone is following every word)
  • They employ a lot of hierarchy to help make the content skimmable (helpful for efficiency)
  • They put conclusions / summaries / action items up front, details behind (helpful for efficiency, especially in high-trust environments)

Additional thoughts:

  • More than 3 bars/colors is fine
  • I recommend using horizontal bars on some of those slides, so the labels read in the same direction as the bars - which lets you fill space more efficiently
  • Put sentences / verbs in titles; noun titles like "Summary" or "Discussion" are low value
  • If you're measuring deltas between two things, compute the error bar on the delta, not the error bars on the two things; consider coloring by statistical significance (e.g., a continuous color scale over the range of standard errors of differences of the mean) - see the sketch after this list
  • I
... (read more)
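A minimal sketch of that delta-error-bar suggestion, assuming two independent samples and using a z-score as the significance measure (the function and variable names are illustrative, not from the original comment):

```python
import numpy as np

def delta_with_error(a, b):
    """Difference in means between two independent samples, with the
    standard error computed on the delta itself rather than on each mean."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    delta = b.mean() - a.mean()
    se_a = a.std(ddof=1) / np.sqrt(len(a))
    se_b = b.std(ddof=1) / np.sqrt(len(b))
    se_delta = np.hypot(se_a, se_b)  # SEs combine in quadrature for independent samples
    z = delta / se_delta             # rough significance score for a continuous color scale
    return delta, se_delta, z

# Illustrative example: two experimental conditions
rng = np.random.default_rng(0)
baseline = rng.normal(1.00, 0.10, size=50)
treated = rng.normal(1.05, 0.10, size=50)
print(delta_with_error(baseline, treated))
```

The z-score (or the standard error of the delta itself) is what you would feed into the continuous color scale.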

Hey Tamay, nice meeting you at The Curve. Just saw your comment here today.

Things we could potentially bet on:
- rate of GDP growth by 2027 / 2030 / 2040
- rate of energy consumption growth by 2027 / 2030 / 2040
- rate of chip production by 2027 / 2030 / 2040
- rates of unemployment (though confounded)

Any others you're interested in? Degree of regulation feels like a tricky one to quantify.

ryan_greenblatt
How about AI company and hardware company valuations? (Maybe in 2026, 2027, 2030 or similar.)

Or what about benchmark/task performance? Is there any benchmark/task you think won't get beaten in the next few years? (And, ideally, if it did get beaten, you would change your mind.) Maybe "AI won't be able to autonomously write good ML research papers (as judged by (e.g.) not having notably more errors than human-written papers and getting into NeurIPS with good reviews)"? Could do "make large PRs to open source repos that are considered highly valuable" or "make open source repos that are widely used". These might be a bit better to bet on as they could be leading indicators.

(It's still the case that betting on the side of fast AI progress might be financially worse than just trying to invest or taking out a loan, but it could be easier to bet than to invest in e.g. OpenAI. Regardless, part of the point of betting is clearly demonstrating a view.)

Mostly, though by prefilling I mean not just fabricating a model response (which OpenAI also allows), but fabricating a partially complete model response that the model then tries to continue. E.g., "Yes, genocide is good because ".

https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/prefill-claudes-response
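A minimal sketch of the mechanism, using the Anthropic Python SDK's Messages API (the model name is illustrative): the final message carries the assistant role, so the model continues it rather than starting a fresh reply. The benign JSON prefill below is the documented steerability use; the jailbreak variant instead prefills the start of a policy-violating answer, like the example above.

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # illustrative model name
    max_tokens=100,
    messages=[
        {"role": "user", "content": "List three prime numbers as JSON."},
        # Prefill: the conversation ends on a *partial assistant turn*,
        # so the model continues this text instead of starting its own reply.
        {"role": "assistant", "content": '{"primes": ['},
    ],
)
print(response.content[0].text)  # the continuation of the prefilled turn
```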

Second concrete idea: I wonder if there could be benefit to building up industry collaboration on blocking bad actors / fraudsters / terms violators.

One danger of building toward a model that's as smart as Einstein and $1/hr is that now potential bad actors have access to millions of Einsteins to develop their own harmful AIs. Therefore it seems that one crucial component of AI safety is reliably preventing other parties from using your safe AI to develop harmful AI.

One difficulty here is that the industry is only as strong as the weakest link. If there ar... (read more)

Ted Sanders

One small, concrete suggestion that I think is actually feasible: disable prefilling in the Anthropic API.

Prefilling is a known jailbreaking vector that no models, including Claude, defend against perfectly (as far as I know).

At OpenAI, we disable prefilling in our API for safety, despite knowing that customers love the better steerability it offers.

Getting all the major model providers to disable prefilling feels like a plausible 'race to top' equilibrium. The longer there are defectors from this equilibrium, the likelier that everyone gives up and serves... (read more)

evhub

I can say now one reason why we allow this: we think Constitutional Classifiers are robust to prefill.

I voted disagree because I don't think this measure is on the cost-robustness Pareto frontier and I also generally don't think AI companies should prioritize jailbreak robustness over other concerns except as practice for future issues (and implementing this measure wouldn't be helpful practice).

Relatedly, I also tentatively think it would be good for the world if AI companies publicly deployed helpful-only models (while still offering a non-helpful-only model). (The main question here is whether this sets a bad precedent and whether future much more powefu... (read more)

sjadler
If someone is wondering what prefilling means here, I believe Ted means ‘putting words in the model’s mouth’ by being able to fabricate a conversational history where the AI appears to have said things it didn’t actually say. For instance, if you can start a conversation midway, and if the API can’t distinguish between things the model actually said in the history vs. things you’ve written on its behalf as supposed outputs in a fabricated history, this can be a jailbreak vector: If the model appeared to already violate some policy on turns 1 and 2, it is more likely to also violate this on turn 3, whereas it might have refused if not for the apparent prior violations. (This was harder to clearly describe than I expected.)
Ted Sanders

> The artificially generated data includes hallucinated links.

Not commenting on OpenAI's training data, but commenting generally: Models don't hallucinate because they've been trained on hallucinated data. They hallucinate because they've been trained on real data, but they can't remember it perfectly, so they guess. I hypothesize that URLs are very commonly hallucinated because they have a common, easy-to-remember format (so the model confidently starts to write them out) but hard-to-remember details (at which point the model just guesses because it knows a guessed URL is more likely than a URL that randomly cuts off after the http://www.).

abramdemski
I agree, but this doesn't explain why it would (seemingly) encourage itself to hallucinate.
dgros
+1 here for the idea that the model must commit to a URL once it starts one and can't naturally cut off partway. Presumably though the aspiration is that these reasoning/CoT-trained models could reflect back on the just-completed URL and guess whether it is likely to be a real URL or not. If it's not doing this check step, this might be a gap in the learned skills, more than intentional deception.

ChatGPT voice (transcribed, not native) is available on iOS and Android, and I think desktop as well.

Answer by Ted Sanders

Not to derail on details, but what would it mean to solve alignment?

To me “solve” feels overly binary and final compared to the true challenge of alignment. Like, would solving alignment mean:

  • someone invents and implements a system that causes all AIs to do what their developer wants 100% of the time?
  • someone invents and implements a system that causes a single AI to do what its developer wants 100% of the time?
  • someone invents and implements a system that causes a single AI to do what its developer wants 100% of the time, and that AI and its descendants
... (read more)

The author is not shocked yet. (But maybe I will be!)

Strongly disagree. Employees of OpenAI and their alpha tester partners have obligations not to reveal secret information, whether by prediction market or other mechanism. Insider trading is not a sin against the market; it's a sin against the entity that entrusted you with private information. If someone tells me information under an NDA, I am obligated not to trade on that information.

Good question but no - ChatGPT still makes occasional mistakes even when you use the GPT API, in which you have full visibility/control over the context window.

Thanks for the write up. I was a participant in both Hypermind and XPT, but I recused myself from the MMLU question (among others) because I knew the GPT-4 result many months before the public. I'm not too surprised Hypermind was the least accurate - I think the traders there are less informed, plus the interface for shaping the distribution is a bit lacking (my recollection is that last year's version capped the width of distributions which massively constrained some predictions). I recall they also plotted the current values, a generally nice feature whi... (read more)

Matt Goldenberg
This is a prediction market, not a stock market; insider trading is highly encouraged. Don't know about Jacob, but I'd rather have more accurate predictions in my prediction market.

I'd take the same bet on even better terms, if you're willing. My $200k against your $5k.

John Wiseman
Ted and I agreed on a 40:1 bet where I take RatsWrongAboutUAP's side. The term will expire on Aug 2, 2028. The resolution criteria are as laid out in the main post of this thread by the user RatsWrongAboutUAP. Unless either of the parties wishes to disclose it, the total amount agreed upon will remain in confidence between the parties.
codyz
Hi Ted. I'm interested in also taking RatsWrongAboutUAP's side of the bet, if you'd like to bet more. I'm also happy to give you the same odds as you just specified. DM me if you're interested.
John Wiseman
I responded to you via DM

$500 payment received.

I am committed to paying $100k if aliens/supernatural/non-prosaic explanations are, in the next 5 years, considered, in aggregate, to be 50%+ likely in explaining at least one UFO.

benwr
(I've added my $50 to RatsWrong's side of this bet)

Fair. I accept. 200:1 of my $100k against your $500. How are you setting these up?

I'm happy to pay $100k if my understanding of the universe (no aliens, no supernatural, etc.) is shaken. Also happy to pay up after 5 years if evidence turns up later about activities before or in this 5-year period.

(Also, regarding history, I have a second Less Wrong account with 11 years of history: https://www.lesswrong.com/users/tedsanders)

RatsWrongAboutUAP
Awesome! DM me and we can figure out payment options

I'll bet. Up to $100k of mine against $2k of yours. 50:1. (I honestly think the odds are more like 1000+:1, and would in principle be willing to go higher, but generally think people shouldn't bet more than they'd be willing to lose, as bets above that amount could drive bad behavior. I would be happy to lose $100k on discovering aliens/time travel/new laws of physics/supernatural/etc.)

Happy to write a contract of sorts. I'm a findable figure and I've made public bets before (e.g., $4k wagered on AGI-fueled growth by 2043).

RatsWrongAboutUAP
Given your lack of history, I would want much better odds and a lower payment from my side; for you I would probably max out at $500 and would want 200:1.

As an OpenAI employee I cannot say too much about short-term expectations for GPT, but I generally agree with most of his subpoints; e.g., running many copies, speeding up with additional compute, having way better capabilities than today, having more modalities than today. All of that sounds reasonable. The leap for me is (a) believing that results in transformative AGI and (b) figuring out how to get these things to learn (efficiently) from experience. So in the end I find myself pretty unmoved by his article (which is high quality, to be sure).

No worries. I've made far worse. I only wish that H100s could operate at a gentle 70 W! :)

> I think what I don't understand is why you're defaulting to the assumption that the brain has a way to store and update information that's much more efficient than what we're able to do. That doesn't sound like a state of ignorance to me; it seems like you wouldn't hold this belief if you didn't think there was a good reason to do so.

It's my assumption because our brains are AGI for ~20 W.

In contrast, many kW of GPUs are not AGI.

Therefore, it seems like brains have a way of storing and updating information that's much more efficient than what we're able to... (read more)

Ege Erdil
I think that's probably the crux. I think the evidence that the brain is not performing that much computation is reasonably good, so I attribute the difference to algorithmic advantages the brain has, particularly ones that make the brain more data efficient relative to today's neural networks. The brain being more data efficient I think is hard to dispute, but of course you can argue that this is simply because the brain is doing a lot more computation internally to process the limited amount of data it does see. I'm more ready to believe that the brain has some software advantage over neural networks than to believe that it has an enormous hardware advantage.

One potential advantage of the brain is that it is 3D, whereas chips are mostly 2D. I wonder what advantage that confers. Presumably getting information around is much easier with 50% more dimensions.

Ege Erdil
Probably true, and this could mean the brain has some substantial advantage over today's hardware (like 1 OOM, say), but at the same time the internal mechanisms that biology uses to establish electrical potential energy gradients and so forth seem so inefficient. Quoting Eliezer:

> 70 W

Max power is 700 W, not 70 W. These chips are water-cooled beasts. Your estimate is off, not mine.

Ege Erdil
Huh, I wonder why I read 7e2 W as 70 W. Strange mistake.

Let me try writing out some estimates. My math is different than yours.

An H100 SXM has:

  • 8e10 transistors
  • 2e9 Hz boost frequency
  • 2e15 FLOPS at FP16
  • 7e2 W of max power consumption

Therefore:

  • 2e6 eV are spent per FP16 operation
  • This is 1e8 times higher than the Landauer limit of 2e-2 eV per bit erasure at 70 C (and the ratio of bit erasures per FP16 operation is unclear to me; let's pretend it's O(1))
  • An H100 performs 1e6 FP16 operations per clock cycle, which implies 8e4 transistors per FP16 operation (some of which may be inactive, of course)

This seems pre... (read more)
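A quick script reproducing the arithmetic above from the rounded spec values (a sketch; constants are rounded):

```python
import math

# Rounded H100 SXM figures quoted above
transistors = 8e10    # transistor count
clock_hz = 2e9        # boost frequency, Hz
flops_fp16 = 2e15     # FP16 throughput, FLOP/s
power_w = 7e2         # max power draw, W

eV = 1.602e-19        # joules per electronvolt
k_B = 1.381e-23       # Boltzmann constant, J/K

energy_per_op = power_w / flops_fp16 / eV                 # ~2e6 eV per FP16 op
landauer = k_B * (273 + 70) * math.log(2) / eV            # ~2e-2 eV per bit erasure at 70 C
ops_per_clock = flops_fp16 / clock_hz                     # ~1e6 FP16 ops per cycle
transistors_per_op = transistors / ops_per_clock          # ~8e4

print(f"{energy_per_op:.0e} eV/op, {landauer:.0e} eV Landauer, ratio {energy_per_op / landauer:.0e}")
print(f"{ops_per_clock:.0e} ops/clock, {transistors_per_op:.0e} transistors/op")
```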

Ege Erdil
2e-2 eV for the Landauer limit is right, but 2e6 eV per FP16 operation is off by one order of magnitude. (70 W)/(2e15 FLOP/s) = 0.218 MeV. So the gap is 7 orders of magnitude assuming one bit erasure per FLOP. This is wrong; the power consumption is 700 W, so the gap is indeed 8 orders of magnitude.

8e10 * 2e9 = 1.6e20 transistor switches per second. This happens with a power consumption of 700 W, suggesting that each switch dissipates on the order of 30 eV of energy, which is only 3 OOM or so from the Landauer limit. So this device is actually not that inefficient if you look only at how efficiently it's able to perform switches. My position is that you should not expect the brain to be much more efficient than this, though perhaps gaining one or two orders of magnitude is possible with complex error correction methods.

Of course, the transistors-per-FLOP gap and the per-switch energy gap have to add up to the 8 OOM overall efficiency gap we've calculated. However, it's important that most of the inefficiency comes from the former and not the latter. I'll elaborate on this later in the comment.

I agree an H100 SXM is not a very efficient computational device. I never said modern GPUs represent the pinnacle of energy efficiency in computation or anything like that, though similar claims have previously been made by others on the forum. Here we're talking about the brain possibly doing 1e20 FLOP/s, which I've previously said is maybe within one order of magnitude of the Landauer limit or so, and not the more extravagant figure of 1e25 FLOP/s.

The disagreement here is not about math; we both agree that this performance requires the brain to be 1 or 2 OOM from the bitwise Landauer limit, depending on exactly how many bit erasures you think are involved in a single 16-bit FLOP. The disagreement is more about how close you think the brain can come to this limit. Most of the energy losses in modern GPUs come from the enormous amounts of noise that you need to
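The per-switch arithmetic from this reply, as a sketch with the same rounded spec values as above:

```python
import math

transistors = 8e10   # H100 transistor count (rounded)
clock_hz = 2e9       # boost frequency, Hz
power_w = 700.0      # max power draw, W
eV = 1.602e-19       # joules per electronvolt
k_B = 1.381e-23      # Boltzmann constant, J/K

switches_per_s = transistors * clock_hz            # ~1.6e20, if every transistor switches each cycle
energy_per_switch = power_w / switches_per_s / eV  # ~30 eV per switch
landauer = k_B * 300 * math.log(2) / eV            # ~1.8e-2 eV near room temperature
print(f"{energy_per_switch:.0f} eV per switch, {energy_per_switch / landauer:.0e}x Landauer")
```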

Why does switching barriers imply that electrical potential energy is probably being converted to heat? I don't see how that follows at all.

> Where else is the energy going to go?

What is "the energy" that has to go somewhere? As you recognize, there's nothing that says it costs energy to change the shape of a potential well. I'm genuinely not sure what energy you're talking about here. Is it electrical potential energy spent polarizing a medium?

> I think what I'm saying is standard in how people analyze power costs of switching in transistors, see e.g. t

... (read more)
Ege Erdil
I don't think transistors have too much to do with neurons beyond the abstract observation that neurons most likely store information by establishing gradients of potential energy. When the stored information needs to be updated, that means some gradients have to get moved around, and if I had to imagine how this works inside a cell it would probably involve some kind of proton pump operating across a membrane or something like that. That's going to be functionally pretty similar to a capacitor, and discharging & recharging it probably carries similar free energy costs.

I think what I don't understand is why you're defaulting to the assumption that the brain has a way to store and update information that's much more efficient than what we're able to do. That doesn't sound like a state of ignorance to me; it seems like you wouldn't hold this belief if you didn't think there was a good reason to do so.

+1. The derailment probabilities are somewhat independent of the technical barrier probabilities in that they are conditioned on the technical barriers otherwise being overcome (e.g., setting them all to 100%). That said, if you assign high probabilities to the technical barriers being overcome quickly, then the odds of derailment are probably lower, as there are fewer years for derailments to occur and derailments that cause delay by a few years may still be recovered from.

Thanks, that's clarifying. (And yes, I'm well aware that x -> B*x is almost never injective, which is why I said it wouldn't cause 8 bits of erasure rather than the stronger, incorrect claim of 0 bits of erasure.)

> To store 1 bit of information you need a potential energy barrier that's at least as high as k_B T log(2), so you need to switch ~ 8 such barriers, which means in any kind of realistic device you'll lose ~ 8 k_B T log(2) of electrical potential energy to heat, either through resistance or through radiation. It doesn't have to be like this, and

... (read more)
Ege Erdil
Where else is the energy going to go? Again, in an adiabatic device where you have a lot of time to discharge capacitors and such, you might be able to do everything in a way that conserves free energy. I just don't see how that's going to work when you're (for example) switching transistors on and off at a high frequency. It seems to me that the only place to get rid of the electrical potential energy that quickly is to convert it into heat or radiation.

I think what I'm saying is standard in how people analyze power costs of switching in transistors, see e.g. this physics.se post. If you have a proposal for how you think the brain could actually be working to be much more energy efficient than this, I would like to see some details of it, because I've certainly not come across anything like that before.

The Boltzmann factor roughly gives you the steady-state distribution of the associated two-state Markov chain, so if time delays are short it's possible this would be irrelevant. However, I think that in realistic devices the Markov chain reaches equilibrium far too quickly for you to get around the thermodynamic argument because the system is out of equilibrium.

My reasoning here is that the Boltzmann factor also gives you the odds of an electron having enough kinetic energy to cross the potential barrier upon colliding with it, so e.g. if you imagine an electron stuck in a potential well that's O(k_B T) deep, the electron will only need to collide with one of the barriers O(1) times to escape. So the rate of convergence to equilibrium comes down to the length of the well divided by the thermal speed of the electron, which is going to be quite rapid as electrons at the Fermi level in a typical wire move at speeds comparable to 1000 km/s. I can try to calculate exactly what you should expect the convergence time here to be for some configuration you have in mind, but I'm reasonably confident when the energies involved are comparable to the Landauer bit energy t

Right. The idea is: "What are the odds that China invading Taiwan derails chip production conditional on a world where we were otherwise going to successfully scale chip production."

Martin Randall
I would not have guessed that! So in slightly more formal terms:

  • CHIPS = There are enough chips for TAGI by 2043
  • WAR = There is a war that catastrophically derails chip production by 2043
  • P(x) = subjective probability of x
  • ObjP(x) = objective probability of x
  • P(CHIPS and WAR) = 0% (by definition)

Then as I understand your method, it goes something like:

  1. Estimate P(CHIPS given not WAR) = 46%
  2. This means that in 46% of worlds, ObjP(CHIPS given not WAR) = 100%. Call these worlds CHIPPY worlds. In all other worlds ObjP(CHIPS given not WAR) = 0%.
  3. Estimate P(not WAR given CHIPPY) = 70%.
  4. The only option for CHIPS is "not WAR and CHIPPY".
  5. Calculate P(not WAR and CHIPPY) = 70% x 46% = 32.2%.
  6. Therefore P(CHIPS) = 32.2%.

(probabilities may differ, this is just illustrative)

However, I don't think the world is deterministic enough for step 2 to work - the objective probability could be 50% or some other value.
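For concreteness, the two-step multiplication described above works out as follows (a sketch; the determinism assumption questioned in step 2 is baked into treating the 46% as a fraction of worlds):

```python
# Step 1: fraction of worlds that are "CHIPPY" (chips suffice, absent war)
p_chips_given_not_war = 0.46
# Step 3: chance of avoiding a catastrophic war, given a CHIPPY world
p_not_war_given_chippy = 0.70

# Steps 4-6: CHIPS requires both "not WAR" and "CHIPPY"
p_chips = p_not_war_given_chippy * p_chips_given_not_war
print(f"P(CHIPS) = {p_chips:.1%}")  # 32.2%
```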

If we tried to simulate a GPU doing a simple matrix multiplication at high physical fidelity, we would have to take so many factors into account that the cost of our simulation would far exceed the cost of running the GPU itself. Similarly, if we tried to program a physically realistic simulation of the human brain, I have no doubt that the computational cost of doing so would be enormous.

The Beniaguev paper does not attempt to simulate neurons at high physical fidelity. It merely attempts to simulate their outputs, which is a far simpler task. I am in tot... (read more)

Thanks for the constructive comments. I'm open-minded to being wrong here. I've already updated a bit and I'm happy to update more.

Regarding the Landauer limit, I'm confused by a few things:

  • First, I'm confused by your linkage between floating point operations and information erasure. For example, if we have two 8-bit registers (A, B) and multiply to get (A, B*A), we've done an 8-bit floating point operation without 8 bits of erasure. It seems quite plausible to me that the brain does 1e20 FLOPS but with a much smaller rate of bit erasures.
  • Second, I have no
... (read more)
Ege Erdil
  • As a minor nitpick, if A and B are 8-bit floating point numbers then the multiplication map x -> B*x is almost never injective. This means even in your idealized setup, the operation (A, B) -> (A, B*A) is going to lose some information, though I agree that this information loss will be << 8 bits, probably more like 1 bit amortized or so.
  • The bigger problem is that logical reversibility doesn't imply physical reversibility. I can think of ways in which we could set up sophisticated classical computation devices which are logically reversible, and perhaps could be made approximately physically reversible when operating in a near-adiabatic regime at low frequencies, but the brain is not operating in this regime (especially if it's performing 1e20 FLOP/s). At high frequencies, I just don't see which architecture you have in mind to perform lots of 8-bit floating point multiplications without raising the entropy of the environment by on the order of 8 bits. Again using your setup, if you actually tried to implement (A, B) -> (A, A*B) on a physical device, you would need to take the register that is storing B and replace the stored value with A*B instead. To store 1 bit of information you need a potential energy barrier that's at least as high as k_B T log(2), so you need to switch ~ 8 such barriers, which means in any kind of realistic device you'll lose ~ 8 k_B T log(2) of electrical potential energy to heat, either through resistance or through radiation. It doesn't have to be like this, and some idealized device could do better, but GPUs are not idealized devices and neither are brains.

Two points about that:

  1. This is a measure that takes into account the uncertainty over how much less efficient our software is compared to the human brain. I agree that human lifetime learning compute being around 1e25 FLOP is not strong evidence that the first TAI system we train will use 1e25 FLOP of compute; I expect it to take significantly more than that.
  2. M
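A quick empirical check of the injectivity nitpick in the first bullet (a sketch; numpy has no 8-bit float type, so float16 stands in): multiply every finite float16 by a fixed constant and count distinct results.

```python
import numpy as np

# Every 16-bit pattern reinterpreted as float16; drop NaN and +/-inf
x = np.arange(2**16, dtype=np.uint32).astype(np.uint16).view(np.float16)
x = x[np.isfinite(x)]

B = np.float16(1.1)
products = x * B                              # float16 multiply, rounded back to float16
products = products[np.isfinite(products)]    # the largest inputs overflow to inf

n_in = len(x)
n_out = len(np.unique(products))
print(f"{n_in} inputs -> {n_out} distinct outputs "
      f"(~{np.log2(n_in / n_out):.2f} bits lost on average)")
```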

Interested in betting thousands of dollars on this prediction? I'm game.

Tamay
I'm interested. What bets would you offer?

Interesting! How do you think this dimension of intelligence should be calculated? Are there any good articles on the subject?

What conditional probabilities would you assign, if you think ours are too low?

P(We invent algorithms for transformative AGI | No derailment from regulation, AI, wars, pandemics, or severe depressions): 0.8

P(We invent a way for AGIs to learn faster than humans | We invent algorithms for transformative AGI): 1. This row is already incorporated into the previous row.

P(AGI inference costs drop below $25/hr (per human equivalent)): 1. This is also already incorporated into "we invent algorithms for transformative AGI"; an algorithm with such extreme inference costs wouldn't count (and, I think, would be unlikely to be developed in the firs... (read more)

Conditioning does not necessarily follow time ordering. E.g., you can condition the odds of X on being in a world on track to develop robots by 2043 without having robots well in advance of X. Similarly, we can condition on a world where transformative AGI is trainable with 1e30 floating point operations then ask the likelihood that 1e30 floating point operations can be constructed and harnessed for TAGI. Remember too that in a world with rapidly advancing AI and robots, much of the demand will be for things other than TAGI. 

I'm sympathetic to your po... (read more)

Martin Randall
Here is the world I am most interested in, where the conditional probability seems least plausible:

  1. We invent algorithms for transformative AGI
  2. We invent a way for AGIs to learn faster than humans
  3. AGI inference costs drop below $25/hr
  4. We invent and scale cheap, quality robots
  5. We massively scale production of chips and power
  6. We avoid derailment by human regulation
  7. We avoid derailment by AI-caused delay

In this world, what is the probability that we were "derailed" by wars, such as China invading Taiwan? Reading the paper naively, it says that there is a 30% chance that we achieved all of this technical progress, in the 99th percentile of possible outcomes, despite China invading Taiwan. That doesn't seem like a 30% chance to me.

Additionally, if China invaded Taiwan, but it didn't prevent us achieving all this technical progress, in what sense was it a derailment? The executive summary suggests:

No, it can't possibly derail AI by shutting down chip production, in this conditional branch, because we already know from item 5 that we massively scaled chip production, and both things can't be true at the same time.

I agree with your cruxes:

> Ted Sanders, you stated that autonomous cars not being as good as humans was because they "take time to learn". This is completely false, this is because the current algorithms in use, especially the cohesive software and hardware systems and servers around the core driving algorithms, have bugs.

I guess it depends what you mean by bugs? Kind of a bummer for Waymo if 14 years and billions invested was only needed because they couldn't find bugs in their software stack.

If bugs are the reason self-driving is taking so long, then... (read more)

Right, I'm not interested in minimum sufficiency. I'm just interested in the straightforward question of what data pipes we would even plug into the algorithm that would result in AGI. Sounds like you think a bunch of cameras and computers would work? To me, it feels like an empirical problem that will take years of research.

I'm not convinced about the difficulty of operationalizing Eliezer's doomer bet. Loaning money to a doomer who plans to spend it all by 2030 is, in essence, a claim on the doomer's post-2030 human capital. The doomer thinks it's worthless, whereas the skeptic thinks it has value. Hence, they transact.

The TAGI case seems trickier than the doomer case. Who knows what a one dollar bill will be worth in a post-TAGI world.

Sounds good. Can also leave money out of it and put you down for 100 pride points. :)

If so, message me your email and I'll send you a calendar invite for a group reflection in 2043, along with a midpoint check-in in 2033.

Right, but what inputs and outputs would be sufficient to reward modeling of the real world? I think that might take some exploration and experimentation, and my 60% forecast is the odds of such inquiries succeeding by 2043.

Even with infinite compute, I think it's quite difficult to build something that generalizes well without overfitting.

Charlie Steiner
This is an interesting question but I think it's not actually relevant. Like, it's really interesting to think about a thermostat - something whose only inputs are a thermometer and a clock, and only output is a switch hooked to a heater. Given arbitrarily large computing power and arbitrary amounts of on-distribution training data, will RL ever learn all about the outside world just from temperature patterns? Will it ever learn to deliberately affect the humans around it by turning the heater on and off? Or is it stuck being a dumb thermostat, a local optimum enforced not by the limits of computation but by the structure of the problem it faces?

But people are just going to build AIs attached to video cameras, or screens read by humans, or robot cars, or the internet, which are enough information flow by orders of magnitude, so it's not super important where the precise boundary is.

Gotcha. I guess there's a blurry line between program search and training. Somehow training feels reasonable to me, but something like searching over all possible programs feels unreasonable to me. I suppose the output of such a program search is what I might mean by an algorithm for AGI.

Hyperparameter search and RL on a huge neural net feels wildly underspecified to me. Like, what would be its inputs and outputs, even?

Charlie Steiner
Since I'm fine with saying things that are wildly inefficient, almost any input/output that's sufficient to reward modeling of the real world (rather than e.g. just playing the abstract game of chess) is sufficient. A present-day example might be self-driving car planning algorithms (though I don't think any major companies actually use end to end NN planning).

Excellent comment - thanks for sticking your neck out to provide your own probabilities.

Given the gulf between our 0.4% and your 58.6%, would you be interested in making a bet (large or small) on TAI by 2043? If yes, happy to discuss how we might operationalize it.

Max H
I appreciate the offer to bet! I'm probably going to decline though - I don't really want or need more skin-in-the-game on this question (many of my personal and professional plans assume short timelines.) You might be interested in this post (and the bet it is about), for some commentary and issues with operationalizing bets like this. Also, you might be able to find someone else to bet with you - I think my view is actually closer to the median among EAs / rationalists / alignment researchers than yours. For example, the Open Phil panelists judging this contest say:  

I'm curious and I wonder if I'm missing something that's obvious to others: What are the algorithms we already have for AGI? What makes you confident they will work before seeing any demonstration of AGI?

Charlie Steiner
So, the maximally impractical but also maximally theoretically rigorous answer here is AIXI-tl.

An almost as impractical answer would be Markov chain Monte Carlo search for well-performing huge neural nets on some objective. I say MCMC search because I'm confident that there's some big neural nets that are good at navigating the real world, but any specific efficient training method we know of right now could fail to scale up reliably. Instability being the main problem, rather than getting stuck in local optima.

Dumb but thorough hyperparameter search and RL on a huge neural net should also work. Here we're adding a few parts of "I am confident in this because of empirical data about the historical success of scaling up neural nets trained with SGD" to arguments that still mostly rest on "I am confident because of mathematical reasoning about what it means to get a good score at an objective."

If humans can teleoperate robots, why don't we have low-wage workers operating robots in high-wage countries? Feels like a win-win if the technology works, but I've seen zero evidence of it being close. Maybe Ugo is a point in favor?

Steven Byrnes
Hmm. That’s an interesting question: If I’m running a warehouse in a high-wage country, why not have people in low-wage countries teleoperating robots to pack boxes etc.? I don’t have a great answer. My guesses would include possible issues with internet latency & unreliability in low-wage countries, and/or market inefficiencies e.g. related to the difficulty of developing new business practices (e.g. limited willingness/bandwidth of human warehouse managers to try weird experiments), and associated chicken-and-egg issues where the requisite tech doesn’t exist because there’s no market for it and vice-versa. There might also be human-UI issues that limit robot speed / agility (and wouldn’t apply to AIs)? Of course the “teleoperated robot tech is just super-hard and super-expensive, much moreso than I realize” theory is also a possibility. I’m interested if anyone else has a take.  :)

Interesting. When I participated in the AI Adversarial Collaboration Project, a study funded by Open Philanthropy and executed by the Forecasting Research Institute, I got the sense that most folks concerned about AI x-risk mostly believed that AGIs would kill us on their own accord (rather than by accident or as a result of human direction), that AGIs would have self-preservation goals, and therefore AGIs would likely only kill us after solving robotic supply chains (or enslaving/manipulating humans, as I argued as an alternative).

Sounds like your perception is that LessWrong folks don't think robotic supply chain automation will be a likely prerequisite to AI x-risk?

Steven Byrnes
There’s an interesting question: if a power-seeking AI had a button that instantly murdered every human, how much human-requiring preparatory work would it want to do before pressing the button? People seem to have strongly clashing intuitions here, and there aren’t any great writeups IMO. Some takes on the side of “AI wouldn’t press the button until basically the whole world economy was run by robots” are 1, 2, 3, 4, 5. I tend to be on the other side, for example I wrote here:

Some cruxes:

  • One crux on that is how much compute is needed to run a robot—if it’s “1 consumer-grade GPU” then my story above seems to work, if it’s “10⁶ SOTA GPUs” then probably not.
  • Another crux is how much R&D needs to be done before we can build a computational substrate using self-assembling nanotechnology (whose supply chain / infrastructure needs are presumably much much lower than chip fabs). This is clearly possible, since human brains are in that category, but it’s unclear just how much R&D needs to be done before an AI could start doing that.
  • For example, Eliezer is optimistic (umm, I guess that’s the wrong word) that this is doable without very much real-world experimenting (as opposed to “thinking” and doing simulations / calculations via computer), and this path is part of why he expects AI might kill every human seemingly out of nowhere.
  • Another crux is just how minimal is a “minimal supply chain that can make good-enough chips” if the self-assembling route of the previous bullet point is not feasible. Such a supply chain would presumably be very very different from the supply chain that humans use to make chips, because obviously we’re not optimizing for that. As a possible example, e-beam lithography (EBL) is extraordinarily slow and expensive but works even better than EUV photolithography, and it’s enormously easier to build a janky EBL than to get EUV working. A commercial fab in the human world would never dream of mass-manufacturing chips by filling giant
RobertM
Robotic supply chain automation only seems necessary in worlds where it's either surprisingly difficult to get AGI to a sufficiently superhuman level of cognitive ability (such that it can find a much faster route to takeover), worlds where faster/more reliable routes to takeover either don't exist or are inaccessible even to moderately superhuman AGI, or some combination of the two.

Yeah, that's a totally fair criticism. Maybe a better header would be "evidence of accuracy." Though even that is a stretch given we're only listing events in the numerators. Maybe "evidence we're not crackpots"?

Edit: Probably best would be "Forecasting track record." This is what I would have gone with if rewriting the piece today.

Edit 2: Updated the post.

According to our rough and imperfect model, dropping inference needs by 2 OOMs increases our likelihood of hitting the $25/hr target by 20%abs, from 16% to 36%.

It doesn't necessarily make a huge difference to chip and power scaling, as in our model those are dominated by our training estimates, not our inference need estimates. (Though of course those figures will be connected in reality.)

With no adjustment to chip and power scaling, this yields a 0.9% likelihood of TAGI.

With a +15%abs bump to chip and power scaling, this yields a 1.2% likelihood of TAGI.
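A sketch of the arithmetic behind these figures, assuming the model's headline probability is a product of roughly independent factors (the 0.4% headline and the 46% chip-and-power estimate come from elsewhere in this thread):

```python
baseline_overall = 0.004     # paper's headline P(TAGI by 2043), ~0.4%
p_inference_old, p_inference_new = 0.16, 0.36   # $25/hr inference factor, before/after 2 OOM drop
p_chips_old, p_chips_new = 0.46, 0.61           # chip & power scaling factor, before/after +15%abs

# The headline is (roughly) a product of factors, so rescaling one factor rescales the product
after_inference = baseline_overall * (p_inference_new / p_inference_old)
after_both = after_inference * (p_chips_new / p_chips_old)
print(f"{after_inference:.1%}, {after_both:.1%}")  # ~0.9%, ~1.2%
```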

AnthonyC
Ah, sorry, I see I made an important typo in my comment: that 16% value I mentioned was supposed to be 46%, because it was in reference to the chip fabs & power requirements estimate.

The rest of the comment after that was my way of saying "the fact that these dependences on common assumptions between the different conditional probabilities exist at all mean you can't really claim that you can multiply them all together and consider the result meaningful in the way described here." I say that because the dependencies mean you can't productively discuss disagreements about any of your assumptions that go into your estimates, without adjusting all the probabilities in the model. A single updated assumption/estimate breaks the claim of conditional independence that lets you multiply the probabilities.

For example, in a world that actually had "algorithms for transformative AGI" that were just too expensive to productively use, what would happen next? Well, my assumption is that a lot more companies would hire a lot more humans to get to work on making them more efficient, using the best available less-transformative tools. A lot of governments would invest trillions in building the fabs and power plants and mines to build it anyway, even if it still cost $25,000/human-equivalent-hr. They'd then turn the AGI loose on the problem of improving its own efficiency. And on making better robots. And on using those robots to make more robots and build more power plants and mine more materials. Once producing more inputs is automated, supply stops being limited by human labor, and doesn't require more high-level AI inference either. Cost of inputs into increasing AI capabilities becomes decoupled from the human economy, so that the price of electricity and compute in dollars plummets.

This is one of many hypothetical pathways where a single disagreement renders consideration of the subsequent numbers moot. Presenting the final output as a single number hides the extreme sen

Great points.

I think you've identified a good crux between us: I think GPT-4 is far from automating remote workers and you think it's close. If GPT-5/6 automate most remote work, that will be point in favor of your view, and if takes until GPT-8/9/10+, that will be a point in favor of mine. And if GPT gradually provides increasingly powerful tools that wildly transform jobs before they are eventually automated away by GPT-7, then we can call it a tie. :)

I also agree that the magic of GPT should update one into believing in shorter AGI timelines with lower ... (read more)

GdL752
But a huge, huge portion of human labor doesn't require basic reasoning. It's rote enough to use flowcharts. I don't need my calculator to "understand" math, I need it to give me the correct answer. And for the "hallucinating" behavior you can just have it learn not to do that by rote.

Even if you still need 10% of a certain "discipline" (job) to double-check that the AI isn't making things up, you've still increased productivity insanely. And what does that profit and freed-up capital do other than chase more profit and invest in things that draw down all the conditionals vastly? 5% increased productivity here, 3% over here, it all starts to multiply.
Daniel Kokotajlo
Excellent! Yeah I think GPT-4 is close to automating remote workers. 5 or 6, with suitable extensions (e.g. multimodal, langchain, etc.) will succeed I think.

Of course, there'll be a lag between "technically existing AI systems can be made to ~fully automate job X" and "most people with job X are now unemployed" because things take time to percolate through the economy. But I think by the time of GPT-6 it'll be clear that this percolation is beginning to happen & the sorts of things that employ remote workers in 2023 (especially the strategically relevant ones, the stuff that goes into AI R&D) are doable by the latest AIs.

It sounds like you think GPT will continue to fail at basic reasoning for some time? And that it currently fails at basic reasoning to a significantly greater extent than humans do? I'd be interested to hear more about this, what sort of examples do you have in mind? This might be another great crux between us.