Comments

gwern

I'm not sure how to square those results with the Chinchilla paper though

Apples and oranges. The Chinchilla paper simply optimizes the final trained model's loss given a fixed compute budget. It doesn't say anything about any downstream uses - similar to how it doesn't tell you (directly) how you should allocate your compute if you have X GPUs and you want to run a model for your users for Y requests, where you face a tradeoff between spending more compute at training time to create a smaller (overtrained) model which then needs fewer GPUs to serve those Y requests, and training a compute-optimal model which is cheaper to train but more expensive to serve. Likewise, you've probably seen some "overtraining" analyses which argue that you should overtrain a Chinchilla-optimal model by some large factor Z to get the model which best balances train vs run - but those also answer a different question, because they assume that you will deploy that model without any sparsification or lower precision, even though that's hardly what anyone actually does.

(While no one I know of has done a Li et al-style analysis for MoEs, I would expect the results to be fairly similar, just shifted up or down, because you can often think of a MoE as a bunch of smaller dense models.)
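To make the train-vs-serve distinction concrete, here is a quick back-of-the-envelope sketch (my toy numbers, not the paper's actual fits): it assumes the standard approximations of ~6*N*D FLOPs to train, ~2*N FLOPs per token to serve, and the rough "Chinchilla-optimal" heuristic of ~20 training tokens per parameter, and compares a compute-optimal model against a half-size overtrained one under the same training budget.

```python
# Toy arithmetic for the train-vs-serve tradeoff. All numbers are assumed
# rules of thumb (training compute ~ 6*N*D, inference ~ 2*N FLOPs/token,
# ~20 tokens/parameter for compute-optimal), not exact fitted constants.
C_train = 1e23        # fixed training budget, in FLOPs
serve_tokens = 1e12   # hypothetical lifetime inference volume, in tokens

# Compute-optimal allocation: C = 6 * N * (20 * N)  =>  N = sqrt(C / 120)
N_opt = (C_train / 120) ** 0.5
D_opt = 20 * N_opt

# "Overtrained" alternative: half the parameters, twice the tokens,
# same training budget, roughly half the cost per token served.
N_small, D_small = N_opt / 2, 2 * D_opt

for name, N, D in [("compute-optimal", N_opt, D_opt),
                   ("overtrained", N_small, D_small)]:
    print(f"{name:>15}: N={N:.2e} params, D={D:.2e} tokens, "
          f"serving cost ~{2 * N * serve_tokens:.2e} FLOPs")
```

Which of the two is actually "better" depends entirely on the serving volume (and on whatever sparsification/quantization you apply afterwards), which is exactly the information the Chinchilla analysis never uses.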

gwern

Yup. Who knows but we are all part of a giant leave-one-out cross-validation computing counterfactual credit assignment on human history? Schmidhuber-em will be crushed by the results.

gwern

But this is also right after GPT-4o, which, like Sora not that long ago, is a major triumph of the Sutskeverian vision of just scaling up sequence prediction for everything, and which OA has been researching for years (at least since CLIP/DALL-E 1, and possibly this effort for 2-3 years as 'Gobi'). I don't find it so hard to believe that he's held off until Sora & GPT-4o were out. These are the achievements of not just his lifetime, but hundreds of other people's lives (look at the contributor list). He's not going to quit anywhere before it. (Especially since by all accounts he's been gone the entire time, so what's a few more days or weeks of silence?)

Is there a particular reason to think that he would have had an exactly 6-month notice from the vote to remove Altman? And why would he have submitted notice then, exactly? The logical day to submit your quitting notice would be when the investigation report was submitted and was a complete Altman victory, which was not 6 months ago.

gwern

One of the interesting things about AI minds (such as LLMs) is that, in theory, you can turn many topics into testable science while avoiding the 'problem of old evidence', because you can now construct artificial minds and mold them like putty. They know what you want them to know, and so you can see what they would predict in the absence of knowledge, or you can install in them false beliefs to test out counterfactual intellectual histories, or you can expose them to real evidence in different orders to measure biases or path dependency in reasoning.

With humans, you can't do that because they are so uncontrolled: even if someone says they didn't know about a crucial piece of evidence X, there is no way for them to prove that, and they may be honestly mistaken and have already read about X and forgotten it (but humans never really forget, so X has already changed their "priors", leading to double-counting), or there is leakage. And you can't get people to really believe things at the drop of a hat, so you can't make people imagine, "suppose Napoleon had won Waterloo, how do you predict history would have changed?", because no matter how you try to participate in the spirit of the exercise, you always know that Napoleon lost, and you have various opinions on that contaminating your retrodictions; even if you have never read a single book or paper on Napoleon, you are still contaminated by expressions like "his Waterloo" ('Hm, the general in this imaginary story is going to fight at someplace called Waterloo? Bad vibes. I think he's gonna lose.')

But with a LLM, say, you could simply train it on all timestamped texts up to Waterloo, like all surviving newspapers, then have one copy generate a bunch of texts about how 'Napoleon won Waterloo', train a second copy on these definitely-totally-real French newspaper reports about his stunning victory over the monarchist invaders, and then ask that second copy to make forecasts about Europe.

(These are the sorts of experiments which are why one might wind up running tons of 'ancestor simulations'... There are many more reasons to simulate past minds than simply very fancy versions of playing The Sims. Perhaps we are now just distant LLM personae being tested on how we reason about the Singularity in one particular scenario involving deep learning counterfactuals, where DL worked, although in the real reality it was Bayesian program synthesis & search.)

gwern

Seems like this might work better as a control variate sort of idea, with the noise added to the environment rather than the model itself, to measure whether the model can steer its way back to its intended measured-capability level. If you add a lot of noise to an environment, then an agent which is trying to pessimize and achieve a certain lower performance will have to overperform in order to 'merely' be bad; then you can subtract the noise which you added, and estimate its performance. If it's been deliberately taking a dive, then it'll do better than non-sandbagging agents do when evaluated in the new noisy environment.

(Imagine, say, playing chess, where a chess agent is deliberately aiming for a rating of only, say, 1500 Elo when its true Elo is 3000. You evaluate it normally, and it scores ~1500, carefully playing in a deniable way so that it imitates the distribution of moves an Elo 1500 player would make (eg. like a Decision Transformer). But you then add in occasional stochastic errors to its chosen moves, as if its hand "trembled"; it observes that it's behind and keeps ramping up its move quality in order to hit its 1500 target. Eventually, you add so much noise that even playing in its final form, it's too far behind, and it starts getting Elos like 1400. You do this same evaluation with a non-deceptive version of that same model, which is honestly only 1500 Elo, and its performance degrades smoothly the instant you add in any noise at all, because it's not trained for that sort of noise and has no performance reserves to draw on - just like a human chess player will get worse if you periodically reach in & screw with them by randomizing one of their moves.)

To avoid this, it would need to be so situated and so knowledgeable about other agents that it knows it's being tested in this anti-sandbagging way and also which agents it's being compared against and how badly they do in order to know how much of an additional dive to take. "With this level of noise, an honest 1500 Elo version of me would actually be 1300 Elo, and I need to let myself lose exactly that much more, but neither more nor less." And that may be quite difficult: an agent may be able to figure out easily it's being tested on a weirdo environment (the way Claude-3 increasingly seems situated when you give it weird prompts or tasks), but how does it know what other agents it's being compared to this time and what their performance in the new environment is?
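A toy numerical sketch of that asymmetry in the chess example (my own deliberately crude model, which treats the injected noise as a flat Elo penalty at full effort; real Elo obviously doesn't degrade linearly): the sandbagger can burn its hidden reserve to cancel the penalty until the reserve runs out, while the honest 1500 starts degrading immediately.

```python
def observed_elo(true_elo, target_elo, noise_penalty, sandbagging):
    """Crude model: the injected environment noise costs `noise_penalty` Elo
    at full effort. A sandbagger aiming for `target_elo` can burn its hidden
    reserve (true_elo - target_elo) to cancel the penalty; an honest agent
    has no reserve, so the penalty hits it in full. Only the *shape* of the
    two curves matters here, not the numbers."""
    reserve = (true_elo - target_elo) if sandbagging else 0.0
    return max(0.0, target_elo - max(0.0, noise_penalty - reserve))

for noise in range(0, 2001, 250):
    honest = observed_elo(1500, 1500, noise, sandbagging=False)
    deceptive = observed_elo(3000, 1500, noise, sandbagging=True)
    print(f"noise={noise:5d}  honest={honest:7.1f}  sandbagger={deceptive:7.1f}")
```

The flat region in the sandbagger's curve - unchanged performance while the honest baseline is already degrading, followed by a sudden drop once the reserve is exhausted - is the signature you'd subtract the added noise back out of, control-variate style.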

gwern

I have to say, I still don't understand the cult of Roam or why people were so impressed by, eg. the [[link]] syntax borrowed from English Wikipedia (which introduced it something like 18 years before on what is still the most widely-read & edited wiki software in history), which you remark on repeatedly. Even in 2019 in beta it just seemed like a personal wiki, not much different from, say, PmWiki (2002) with some more emphasis than usual on the common backlink or 'reverse citation' functionality (that so many hypertext systems had supported going back decades in parallel with Xanadu ideas). It may be nicer than, say, English Wikipedia's "WhatLinksHere" (which has been there since before I began using it early in the 2000s), but nothing to create a social-media cult over or sell "courses" about (!).

But if the bubble has burst, it's not hard to see why: any note-taking, personal knowledge management, or personal wiki system is inherently limited by the fact that it requires a lot of work for what is, for most people, little gain. For most people, trying to track all of this stuff is about as useful as keeping exact itemized grocery store receipts from 5 years ago.

Most people simply have no need for lots of half-formed ideas, random lists of research papers, and so on. This is what people always miss about Zettelkasten: are you writing a book? Are you a historian or German scholar? Do you publish a dozen papers a year? No? Then why do you think you need a Zettelkasten? If you are going to be pulling out a decent chunk of those references for an essay or something, possibly decades from now, then it can be worth the upfront cost of entering references into your system, knowing that you'll never use most of them and the benefit is mostly from the long tail, and you will, in the natural course of usage, periodically look over them to foster serendipity & creativity; if you aren't writing all that, then there's no long tail, no real benefit, no intrinsic review & serendipity, and it's just a massive time & energy sink. Eventually, the user abandons it... and their life gets better.

Further, these systems are inherently passive, and force people to become secretaries, typists, reference librarians, archivists, & writers simply to keep them from rotting (quite aside from any mere software issues), to keep them up to date, revise tenses or references, fix spelling errors, deal with link rot, and so on. (Surprisingly, most people do not find that enjoyable.) There is no intelligence in such systems, and they don't do anything. The user still has to do all the thinking, and the system adds a lot of overhead to that thinking.

So what comes after Roam and other personal systems which force the user to do all the thinking? I should think that would be obvious: systems which can think for the user instead. LLMs and other contemporary AI are wildly underused in the personal-system space right now, and can potentially fix a lot of these issues: actively surfacing connections instead of passively waiting for the user to make them on their own and manually record them, and proactively suggesting edits & updates & fixes that the user simply approves in batches. (Think of how much easier it is to copyedit a document using a spellchecker as a series of Y/N semi-automatic edits than to go through it by eye, fixing typos.)
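As a deliberately dumb sketch of that "the system proposes, the user approves" inversion - using TF-IDF cosine similarity from scikit-learn as a cheap stand-in for real neural embeddings, over a made-up four-note store (all names and thresholds invented):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical note store: title -> body text.
notes = {
    "roam":         "Roam personal wiki, backlinks, note-taking, Zettelkasten workflow",
    "zettelkasten": "Zettelkasten note-taking for writers, managing references for essays",
    "llm-pkm":      "LLMs surfacing connections in a personal wiki and batching suggested edits",
    "sandbagging":  "Detecting sandbagging agents by injecting noise into a chess environment",
}

titles = list(notes)
vectors = TfidfVectorizer().fit_transform(notes[t] for t in titles)
sims = cosine_similarity(vectors)

# Propose candidate links above a threshold, for the user to approve in batch.
THRESHOLD = 0.05
for i, a in enumerate(titles):
    for j in range(i + 1, len(titles)):
        if sims[i, j] > THRESHOLD:
            print(f"link {a!r} <-> {titles[j]!r}? (similarity {sims[i, j]:.2f}) [y/N]")
```

The point is the role reversal: the user's job shrinks to pressing y/N on a batch of suggestions, rather than doing the remembering, searching, and cross-referencing themselves.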

However, like most such paradigm shifts, it will be hard to tack it onto existing systems. You can't reap the full benefits of LLMs with some tweaks like 'let's embed documents and add a little retrieval pane!'. You need to rethink the entire system and rewrite it from the ground up on the basis of making neural nets do as much as possible, to figure out the new capabilities and design patterns, and what to drop from the old obsolete personal wikis like Roam.

From what it sounds like, the Roam community would never stand for that, and I have a lot of doubts about whether it makes sense economically to try. It seems like if one wanted to do that, it would be better to start with a clean sheet (and an empty cap table).

gwern

Those are not randomly selected pairs, however. There are 3 major causal patterns: A->B, A<-B, and A<-C->B. Daecaneus is pointing out that for a randomly chosen correlation between a pair of variables, we should not assign a uniform prior of 33% to each of these. While it may sound crazy to argue for some specific prior like 'we should assign 1% to the direct causal patterns of A->B and A<-B, and 99% to the confounding pattern of A<-C->B', this is a lot closer to the truth than thinking that 'a third of the time, A causes B; a third of the time, B causes A; and the other third of the time, it's just some confounder'.

What would be relevant there is "Everything is Correlated". If you look at, say, Meehl's examples of correlations from very large datasets, and ask about causality, I think it becomes clearer. Let's take one of his first examples:

For example, only children are nearly twice as likely to be Presbyterian than Baptist in Minnesota, more than half of the Episcopalians “usually like school” but only 45% of Lutherans do, 55% of Presbyterians feel that their grades reflect their abilities as compared to only 47% of Episcopalians, and Episcopalians are more likely to be male whereas Baptists are more likely to be female.

Like, if you randomly assigned Baptist children to be converted to Presbyterianism, it seems unlikely that their school-liking will suddenly jump because they go somewhere else on Sunday, or that siblings will appear & vanish; it also seems unlikely that, if they start liking school (maybe because of a nicer principal), many of those children would spontaneously convert to Presbyterianism. Similarly, it seems rather unlikely that undergoing sexual-reassignment surgery will make Episcopalian men and Baptist women swap places, and it seems even more unlikely that their religious status caused their gender at conception. In all of these 5 cases, we are pretty sure that we can rule out one of the direct patterns, and that it was probably the third, and we could go through the rest of Meehl's examples similarly. (Indeed, this turns out to be a bad example, because we can apply our knowledge that sex must have come many years before any other variable like "has cold hands" or "likes poetry" to rule out one pattern; but even so, we still don't find any 50%s: it's usually pretty obviously direct causation from the temporally earlier variable, or confounding, or both.)

So what I am doing in 'How Often Does Correlation=Causality?' is testing the claim that "yes, of course it would be absurd to take pairs of arbitrary variables and calculate their causal patterns for prior probabilities, because yeah, it would be low, maybe approaching 0 - but that's irrelevant because that's not what you or I are discussing when we discuss things like medicine. We're discussing the good correlations, for interventions which have been filtered through the scientific process. All of the interventions we are discussing are clearly plausible and do not require time travel machines, usually have mechanisms proposed, have survived sophisticated statistical analysis which often controls for covariates or confounders, are regarded as credible by highly sophisticated credentialed experts like doctors or researchers with centuries of experience, and may even have had quasi-randomized or other kinds of experimental evidence; surely we can repose at least, say, 90% credibility, by the time that some drug or surgery or educational program has gotten that far and we're reading about it in our favorite newspaper or blog? Being wrong 1 in 10 times would be painful, but it certainly doesn't justify the sort of corrosive epistemological nihilism you seem to be espousing."

But unfortunately, it seems that the error rate, after everything we humans can collectively do, is still a lot higher than 1 in 10 before the randomized version gets run. (Which implies that the scientific evidence is not very good in terms of providing enough Bayesian evidence to promote the hypothesis from <1% to >90%, or that it's <<1% because causality is that rare.)
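To put a number on that parenthetical (taking the <1% and >90% figures above literally, which is only for illustration):

```python
# Back-of-the-envelope Bayes: how big a likelihood ratio does all the
# non-randomized evidence combined have to supply to move a causal claim
# from a ~1% prior to a 90% posterior?
prior, posterior = 0.01, 0.90

prior_odds = prior / (1 - prior)              # ~0.0101
posterior_odds = posterior / (1 - posterior)  # 9.0
required_bayes_factor = posterior_odds / prior_odds

print(f"required likelihood ratio ~ {required_bayes_factor:.0f}:1")  # ~891:1
```

That is, the whole pipeline of mechanisms, covariate-controlled analyses, and expert endorsement would need to be worth roughly 900:1 evidence to justify the 90% credence - which is a lot to ask of observational work.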

gwern

Key graph: https://arxiv.org/pdf/2404.04125#page=6

I don't think they 'consistently find' anything, except possibly about CLIP (which we've known to have severe flaws ever since it came out, as expected from a contrastive loss, despite of course many extremely valuable uses). Unless I'm missing something in the transcript, Computerphile doesn't mention this at all.
