Comments

gwern72

I have to say, I still don't understand the cult of Roam or why people were so impressed by, e.g., the [[link]] syntax borrowed from English Wikipedia (which introduced it something like 18 years before on what is still the most widely-read & edited wiki software in history), which you remark on repeatedly. Even in 2019 in beta it just seemed like a personal wiki, not much different from, say, PmWiki (2002), with some more emphasis than usual on the backlink or 'reverse citation' functionality that so many hypertext systems had supported going back decades in parallel with Xanadu ideas. It may be nicer than, say, English Wikipedia's "WhatLinksHere" (which has been there since before I began using it early in the 2000s), but nothing to create a social-media cult over or sell "courses" about (!).

But if the bubble has burst, it's not hard to see why: any note-taking, personal knowledge management, or personal wiki system is inherently limited by the fact that they require a lot of work for what is, for most people, little gain. For most people, trying to track all of this stuff is as useful as exact itemized grocery store receipts from 5 years ago.

Most people simply have no need for lots of half-formed ideas, random lists of research papers, and so on. This is what people always miss about Zettelkasten: are you writing a book? Are you a historian or German scholar? Do you publish a dozen papers a year? No? Then why do you think you need a Zettelkasten? If you are going to be pulling out a decent chunk of those references for an essay or something, possibly decades from now, then it can be worth the upfront cost of entering references into your system, knowing that you'll never use most of them and the benefit is mostly from the long tail, and you will, in the natural course of usage, periodically look over them to foster serendipity & creativity; if you aren't writing all that, then there's no long tail, no real benefit, no intrinsic review & serendipity, and it's just a massive time & energy sink. Eventually, the user abandons it... and their life gets better.

Further, these systems are inherently passive, and force people to become secretaries, typists, reference librarians, & writers simply to keep them from rotting (quite aside from any mere software issue): to keep them up to date, revise tenses or references, fix spelling errors, and so on. (Surprisingly, most people do not find that enjoyable.) There is no intelligence in such systems, and they don't do anything.

So what comes after Roam and other personal systems which force the user to do all the thinking? I should think that would be obvious: systems which can think for the user instead. LLMs and other contemporary AI are wildly underused in the personal-system space right now, and can potentially fix a lot of these issues: actively surfacing connections instead of passively waiting for the user to make them on their own and manually record them, and proactively suggesting edits & updates & fixes that the user simply approves in batches. (Think of how much easier it is to copyedit a document using a spellcheck as a series of Y/N semi-automatic edits, than to go through it by eye, fixing typos.)
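As a purely illustrative sketch of that kind of workflow (assuming some embedding function `embed` and a source of `suggested_edits`, neither of which is specified in the comment), the two halves might look roughly like:

```python
import numpy as np

def surface_connections(notes, embed, top_k=3):
    """Propose links between notes by cosine similarity of their embeddings.
    `embed` is assumed to map a string to a 1-D numpy vector (any LLM embedding API)."""
    vecs = np.array([embed(n) for n in notes], dtype=float)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
    sims = vecs @ vecs.T
    np.fill_diagonal(sims, -1.0)                      # ignore self-similarity
    for i, note in enumerate(notes):
        neighbors = np.argsort(sims[i])[::-1][:top_k]
        yield note, [notes[j] for j in neighbors]

def review_in_batches(suggested_edits):
    """Spellcheck-style loop: the system proposes, the user only answers Y/N."""
    approved = []
    for old, new in suggested_edits:
        if input(f"- {old}\n+ {new}\nApply? [y/N] ").strip().lower() == "y":
            approved.append((old, new))
    return approved
```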

However, like most such paradigm shifts, it will be hard to tack it onto existing systems. You can't reap the full benefits of LLMs with some tweaks like 'let's embed documents and add a little retrieval pane!'. You need to rethink the entire system and rewrite it from the ground up on the basis of making neural nets do as much as possible, to figure out the new capabilities and design patterns, and what to drop from the old obsolete personal wikis like Roam.

From what it sounds like, the Roam community would never stand for that, and I have a lot of doubts about whether it makes sense economically to try. It seems like if one wanted to do that, it would be better to start with a clean sheet (and an empty cap table).

gwern22

For extremely opinionated 'tags' like that, where their very existence is probably going too far, maybe users should be encouraged to simply use a comment in their Short Form to list URLs? Since looking up one's Short Form comments to edit in a URL is annoying, this could have some UX slapped on top for convenience: a widget on every page for "add to Personal List [A / B / C / D]", where a 'Personal List' is just a Short Form comment starting with the phrase "A" followed by a list of links, which gets auto-edited to append the next one.

(For less inflammatory ones, I think my personalized-wiki hybrid proposal works fine by clearly subordinating the user comments 'replying' to the tag and indicating responsibility & non-community-endorsement.)
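A rough sketch of the append logic behind the 'Personal List' widget suggested above (all names invented for illustration; no claim about the actual LessWrong API or data model):

```python
def add_to_personal_list(list_name, url, shortform_comments):
    """Append `url` to the Short Form comment acting as the Personal List `list_name`.

    `shortform_comments` is assumed to be a mutable list of the user's Short Form
    comments, each a plain string; a Personal List is just a comment whose first
    line is the list's name phrase.
    """
    for i, comment in enumerate(shortform_comments):
        if comment.split("\n", 1)[0] == list_name:
            shortform_comments[i] = comment + "\n" + url   # auto-edit: append the next link
            return
    # No such list yet: create a new Short Form comment for it.
    shortform_comments.append(f"{list_name}\n{url}")
```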

gwern64

Those are not randomly selected pairs, however. There are 3 major causal patterns: A->B, A<-B, and A<-C->B. Daecaneus is pointing out that for a random correlation between some pair of variables, we do not assign a uniform prior of 33% to each of these patterns. While it may sound crazy to try to argue for some specific prior like 'we should assign 1% to the direct causal patterns of A->B and A<-B, and 99% to the confounding pattern of A<-C->B', this is a lot closer to the truth than thinking that 'a third of the time, A causes B; a third of the time, B causes A; and the other third of the time, it's just some confounder'.

What would be relevant there is "Everything is Correlated". If you look at, say, Meehl's examples of correlations from very large datasets, and ask about causality, I think it becomes clearer. Let's take one of his first examples:

> For example, only children are nearly twice as likely to be Presbyterian than Baptist in Minnesota, more than half of the Episcopalians “usually like school” but only 45% of Lutherans do, 55% of Presbyterians feel that their grades reflect their abilities as compared to only 47% of Episcopalians, and Episcopalians are more likely to be male whereas Baptists are more likely to be female.

Like, if you randomly assigned Baptist children to be converted to Presbyterianism, it seems unlikely that their school-liking would suddenly jump because they go somewhere else on Sunday, or that siblings would appear & vanish; it also seems unlikely that if they started liking school (maybe because of a nicer principal), many of those children would spontaneously convert to Presbyterianism. Similarly, it seems rather unlikely that undergoing sexual-reassignment surgery would make Episcopalian men and Baptist women swap places, and it seems even more unlikely that their religious status caused their gender at conception. In all of these 5 cases, we are pretty sure that we can rule out one of the direct patterns, and that it was probably the third, and we could go through the rest of Meehl's examples. (Indeed, this turns out to be a bad example because we can apply our knowledge that sex must have come many years before any other variable like "has cold hands" or "likes poetry" to rule out one pattern, but even so, we still don't find any 50%s: it's usually pretty obviously direct causation from the temporally earlier variable, or confounding, or both.)

So what I am doing in 'How Often Does Correlation=Causality?' is testing the claim that "yes, of course it would be absurd to take pairs of arbitrary variables and calculate their causal patterns for prior probabilities, because yeah, it would be low, maybe approaching 0 - but that's irrelevant because that's not what you or I are discussing when we discuss things like medicine. We're discussing the good correlations, for interventions which have been filtered through the scientific process. All of the interventions we are discussing are clearly plausible and do not require time travel machines, usually have mechanisms proposed, have survived sophisticated statistical analysis which often controls for covariates or confounders, are regarded as credible by highly sophisticated credentialed experts like doctors or researchers with centuries of experience, and may even have had quasi-randomized or other kinds of experimental evidence; surely we can repose at least, say, 90% credibility, by the time that some drug or surgery or educational program has gotten that far and we're reading about it in our favorite newspaper or blog? Being wrong 1 in 10 times would be painful, but it certainly doesn't justify the sort of corrosive epistemological nihilism you seem to be espousing."

But unfortunately, it seems that the error rate, after everything we humans can collectively do, is still a lot higher than 1 in 10 before the randomized version gets run. (Which implies that the scientific evidence is not very good in terms of providing enough Bayesian evidence to promote the hypothesis from <1% to >90%, or that it's <<1% because causality is that rare.)
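To put a number on the size of update being asked for there (my arithmetic, not part of the original comment): moving a hypothesis from a 1% prior to a 90% posterior requires a combined likelihood ratio of roughly 900.

```python
prior, posterior = 0.01, 0.90
prior_odds = prior / (1 - prior)              # 1:99
posterior_odds = posterior / (1 - posterior)  # 9:1
required_likelihood_ratio = posterior_odds / prior_odds
print(round(required_likelihood_ratio))       # ~891: the Bayes factor the evidence must supply
```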

gwern81

Key graph: https://arxiv.org/pdf/2404.04125#page=6

I don't think they 'consistently find' anything, except possibly about CLIP (which we've known to have severe flaws ever since it came out, as expected from a contrastive loss, despite of course having many extremely valuable uses). Unless I'm missing something in the transcript, Computerphile doesn't mention this at all.

gwern64

My earlier comment on meta-learning and Bayesian RL/inference for background: https://www.lesswrong.com/posts/TiBsZ9beNqDHEvXt4/how-we-picture-bayesian-agents?commentId=yhmoEbztTunQMRzJx

> The main question I have been thinking about is what is a state for language and how that can be useful if so discovered in this way?

The way I would put it is that 'state' is misleading you here. It makes you think that it must be some sort of little Turing machine or clockwork, where there is a 'state', like the current contents of the Turing machine tape or the rotations of each gear in a clockwork gadget, and that the goal is to infer that state. This is misleading: it only works by coincidence in these simple toy problems, which are so simple that there is nothing to know beyond the actual state.

As Ortega et al. highlight in those graphs, what you are really trying to define is the sufficient statistic: the summary of the data (the history) which is 100% adequate for decision-making, such that additionally knowing the original raw data doesn't help you.

In the coin-flip case, the sufficient statistic is simply the 2-tuple (heads, tails), and you define a very simple decision rule over all of the possible observed 2-tuples. Note that the sufficient statistic is less information than the original raw history, because you throw out the ordering. (A 2-tuple like '(3,1)' is simpler than all of the histories it summarizes, like '[1,1,1,0]', '[0,1,1,1]', '[1,0,1,1]', etc.) From the point of view of decision-making, these all yield the same posterior distribution over the coin-flip probability parameter, which is all you need for decision-making (optimal action: 'bet on the side with the higher probability'), and so that's the sufficient statistic. If I tell you the history as a list instead of a 2-tuple, you cannot make better decisions. It just doesn't matter if you got a tails first and then all heads, or all heads first then tails, etc.

It is not obvious that this is true: a priori, maybe that ordering was hugely important, and those correspond to different games. But the RNN there has learned that the differences are not important, and in fact, they are all the same.

And the 2-tuple here doesn't correspond to any particular environment 'state'. The environment doesn't need to store that anywhere. The environment is just an RNG operating according to the coin-flip probability, independently every turn of the game, with no memory. Nothing in the environment is counting heads & tails in a 2-tuple. That count exists solely in the RNN's hidden state as it accumulates evidence over turns, optimally updates priors to posteriors with every observed coin flip, and possibly switches its bet.
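A minimal sketch of that sufficiency claim in code (assuming a uniform Beta(1,1) prior, which is my addition, not something from the comment): the posterior, and hence the optimal bet, depends only on the (heads, tails) counts, not on the ordering of the history.

```python
# Beta(1,1) prior over the coin's heads-probability; the posterior after any
# history depends only on the (heads, tails) counts, not on their order.
def posterior_params(history, a0=1, b0=1):
    heads = sum(history)
    tails = len(history) - heads
    return a0 + heads, b0 + tails            # Beta(a, b) posterior

def bet(history):
    a, b = posterior_params(history)
    return "heads" if a / (a + b) >= 0.5 else "tails"   # bet on the higher posterior mean

# Different orderings of the same counts are indistinguishable for decision-making:
assert posterior_params([1, 1, 1, 0]) == posterior_params([0, 1, 1, 1]) == (4, 2)
assert bet([1, 1, 1, 0]) == bet([0, 1, 1, 1]) == "heads"
```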

So with LLMs on language tasks, it is the same thing, but on a vastly grander scale, and still highly incomplete. They are (trying to) infer sufficient statistics of whatever language-games they have been trained on, and then predicting accordingly.

What are those sufficient statistics in LLMs? Hard to say. The coinflip example is so simple that we can derive the conjugate statistics by hand, know it is just a binomial, and so track heads/tails as the one and only sufficient statistic, and we can then look in the hidden state to find where that is encoded in a converged optimal agent. In LLMs... not so much. There's a lot going on.

Based on interpretability research, studies of how well they simulate people, and all of the anecdotal experience with the base models, we can point to a few latents like honesty, calibration, demographics, and so on. (See Janus's "Simulator Theory" for a more poetic take, less focused on agency than the straight Bayesian meta-imitation-learning take I'm giving here.) Meanwhile, there are tons of things about the inputs that the model wants to throw away, irrelevant details like the exact misspellings of words in the prompt (while recording that there were misspellings, as grist for the inference mill about the environment generating the misspelled text).

So conceptually, the sufficient statistics when you or I punch in a prompt to GPT-3 might look like some extremely long list of variables like, "English speaker, Millennial, American, telling the truth, reliable, above-average intelligence, Common Crawl-like text not corrupted by WET processing, shortform, Markdown formatting, only 1 common typo or misspelling total, ..." and it will then tailor responses accordingly and maximize its utility by predicting the next token accurately (because the 'coin flip' there is simply betting on the logits with the highest likelihood, etc.). Like the coinflip 2-tuple, most of these do not correspond to any real-world 'state': if you or I put in a prompt, there is no atom or set of atoms which corresponds to many of these variables. But they have consequences: if we ask about Tiananmen Square, for example, we'll get a different answer than if we had asked in Mandarin, because the sufficient statistics there are inferred to be very different and yield a different array of latents which cause different outputs.

And that's what "state" is for language: it is the model's best attempt to infer a useful set of latent variables which collectively are sufficient statistics for whatever language-game or task or environment or agent-history or whatever the context/prompt encodes, which then supports optimal decision-making.

gwern235

Rest assured, there is plenty that could leak at OA... (And it might, were there not NDAs, which of course is much of the point of having them.)

For a past example, note that no one knew that Sam Altman had been fired as YC CEO for reasons similar to those behind his firing as OA CEO, until the extreme aggravating factor of the OA coup, 5 years later. That was certainly more than 'run of the mill office politics', I'm sure you'll agree; but if that could be kept secret, surely lesser things now could be kept secret well past 2029?

gwern72

In terms of factorizing or fingerprinting, 20 Tarot concepts seems like a lot; it's exhausting even just to skim them. Why do you think you need so many, and that they aren't mostly just a much smaller number of factors, like the Big Five, or dirty uninterpretable mixes? The 500 closest tokens generally look pretty random to me for each one.

gwernΩ286257

I think it is safe to infer from the conspicuous and repeated silence by ex-OA employees when asked whether they signed an NDA which also included a gag order about the NDA, that there is in fact an NDA with a gag order in it, presumably tied to the OA LLC PPUs (which are not real equity and so probably even less protected than usual).

gwern124

> For example, 70B model trained on next-token prediction only on the entire 20TB GenBank dataset will have better performance at next-nucleotide prediction than a 70B model that has been trained both on the 20TB GenBank dataset and on all 14TB of code on Github.

I don't believe that's obvious, and to the extent that it's true, I think it's largely irrelevant (and part of the general prejudice against scaling & Bitter Lesson thinking, where everyone is desperate to find an excuse for small specialist models with complicated structures & fancy inductive biases because that feels right).

> Once you have a bunch of specialized models "the weights are identical" and "a fine tune can be applied to all members" no longer holds.

Nor do I see how this is relevant to your original claim. If you have lots of task-specialist models, how does this refute the claim that those will be able to coordinate? Of course they will. They will just share weight updates in exactly the way I just outlined, which works so well in practice. You may not be able to share parameter-updates across your protein-only and your Python-only LLMs, but they will be able to share updates within that model family and the original claim ("AGIs derived from the same model are likely to collaborate more effectively than humans because their weights are identical. Any fine-tune can be applied to all members, and text produced by one can be understood by all members.") remains true, no matter how you swap out your definition of 'model'.
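As a toy sketch of what "any fine-tune can be applied to all members" means mechanically (plain parameter dictionaries standing in for real checkpoints; none of this is from the original exchange): because every member starts from identical weights, the update produced by fine-tuning one member is directly addable to all of the others.

```python
import numpy as np

def finetune_delta(base, finetuned):
    """Parameter-wise difference between a fine-tuned checkpoint and its base."""
    return {name: finetuned[name] - base[name] for name in base}

def apply_delta(member, delta, scale=1.0):
    """Merge a fine-tune into any member of the same family (same parameter names/shapes)."""
    return {name: member[name] + scale * delta[name] for name in member}

# All members share the same base weights, so one member's learned update transfers to all.
base = {"w": np.zeros((2, 2)), "b": np.zeros(2)}
member_a = {k: v.copy() for k, v in base.items()}
member_b = {k: v.copy() for k, v in base.items()}
member_a_tuned = {"w": np.ones((2, 2)), "b": np.ones(2)}   # stand-in for a fine-tuned member A
delta = finetune_delta(base, member_a_tuned)
member_b_updated = apply_delta(member_b, delta)            # member B picks up A's fine-tune
```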

DL models are fantastically good at collaborating and updating each other, in many ways completely impossible for humans, whether you are talking about AGI models or narrow specialist models.
