LESSWRONG
LW

Comment Permalink

There is a passage from Jung's "Modern man in search of a soul" that I think about fairly often, on this point (p.229 in my edition)

I know that the idea of proficiency is especially repugnant to the pseudo-moderns, for it reminds them unpleasantly of their deceits. This, however, cannot prevent us from taking it as our criterion of the modern man. We are even forced to do so, for unless he is proficient, the man who claims to be modern is nothing but an unscrupulous gambler. He must be proficient in the highest degree, for unless he can atone by creative ability for his break with tradition, he is merely disloyal to the past

See in context

leogao's Shortform

by leogao

24th May 2022

AI Alignment Forum

1 min read

416

7 Ω 3

This is a special post for quick takes by leogao. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

Mentioned in

348Shallow review of live agendas in alignment & safety

193Shallow review of technical AI safety, 2024

9Alexander Gietelink Oldenziel

1Sheikh Abdur Raheem Ali

4Sheikh Abdur Raheem Ali

4the gears to ascension

4Alexander Gietelink Oldenziel

21Alexander Gietelink Oldenziel

2the gears to ascension

2the gears to ascension

7Alexander Gietelink Oldenziel

2the gears to ascension

1Andreas Chrysopoulos

416 comments, sorted by

top scoring

Click to highlight new comments since: Today at 3:33 PM

Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

[-]leogao25d19181

some lessons from ml research:

any shocking or surprising result in your own experiment is 80% likely to be a bug until proven otherwise. your first thought should always be to comb for bugs.
only after you have ruled out bugs do you get to actually think about how to fit your theory to the data, and even then, there might still be a hidden bug.
most papers are terrible and don't replicate.
most techniques that sound intuitively plausible don't work.
most techniques only look good if you don't pick a strong enough baseline.
an actually good idea can take many tries before it works.
once you have good research intuitions, the most productive state to be in is to literally not think about what will go into the paper and just do experiments that satisfy your curiosity and. convince yourself that the thing is true. once you have that, running the final sweeps is really easy
most people have no intuition whatsoever about their hardware and so will write code that is horribly inefficient. even learning a little bit about hardware fundamentals so you don't do anything obviously dumb is super valuable
in a long and complex enough project, you will almost certainly have a bug that invalidates weeks

... (read more)

[-]Joseph Miller24d2012

I agree and this is why research grant proposals often feel very fake to me. I generally just write up my current best idea / plan for what research to do, but I don't expect it actually pan out that way and it would be silly to try to stick rigidly to a plan.

[-]Alexei24d105

Strongly agree. Exact same experience in research, but in finance / quant trading.

7testingthewaters24d

Can basically attest to all of these, been doing intensive ML upskilling for the last half a year and almost all of these have been true. Highlights include: * Not properly setting up the attention mechanism in multiple experiments, resulting in the conclusion that attention didn't do much (lmao) * So, so many off-by-one, off-by-two errors, especially for next-token prediction setups * Entire series of weeks-long experiments that turn out to be completely useless (usually based on a seemingly-reasonable intuition of some kind) * Accidentally overwriting/resetting the residual element so the RNN was just an NN with a funky hat on * I now hate shapes, reshaping, squeezing, unsqueezing, devices, torch.nn.functional.pad, so many more functions * Using the wrong loss function * Using the right loss function but with the wrong reduction * Using the right loss function but the learning rate is too aggressive/too low/the optimiser is not initialised properly * Using all the right things but loading the model from an incorrect checkpoint/not saving the weights properly * And also learning that google colab was forged in mount doom, a tool of great power crafted with malicious intent.

5Leon Lang23d

Are you using einops and einsum? I hate these somewhat less since using them. See here for more details.

4Mitchell_Porter22d

Be right back, just adding all that to my "AI researcher" prompt

2Garrett Baker23d

I will add: 80% likely to be a bug, or a result from random-matrix theory.

2Eli Tyre23d

Research sounds really finicky and tedious.

[-]leogao9mo10134

it's surprising just how much of cutting edge research (at least in ML) is dealing with really annoying and stupid bottlenecks. pesky details that seem like they shouldn't need attention. tools that in a good and just world would simply not break all the time.

i used to assume this was merely because i was inexperienced, and that surely eventually you learn to fix all the stupid problems, and then afterwards you can just spend all your time doing actual real research without constantly needing to context switch to fix stupid things.

however, i've started to think that as long as you're pushing yourself to do novel, cutting edge research (as opposed to carving out a niche and churning out formulaic papers), you will always spend most of your time fixing random stupid things. as you get more experienced, you get bigger things done faster, but the amount of stupidity is conserved. as they say in running- it doesn't get easier, you just get faster.

as a beginner, you might spend a large part of your research time trying to install CUDA or fighting with python threading. as an experienced researcher, you might spend that time instead diving deep into some complicated distributed trai... (read more)

[-]Carl Feynman9mo220

Not only is this true in AI research, it’s true in all science and engineering research. You’re always up against the edge of technology, or it’s not research. And at the edge, you have to use lots of stuff just behind the edge. And one characteristic of stuff just behind the edge is that it doesn’t work without fiddling. And you have to build lots of tools that have little original content, but are needed to manipulate the thing you’re trying to build.

After decades of experience, I would say: any sensible researcher spends a substantial fraction of time trying to get stuff to work, or building prerequisites.

This is for engineering and science research. Maybe you’re doing mathematical or philosophical research; I don’t know what those are like.

9Alexander Gietelink Oldenziel9mo

I can emphathetically say this is not the case in mathematics research.

1Lorec9mo

Interested to hear how you would put this with "research" tabooed. Personally I don't care if it's research as long as it works.

[-]leogao9mo135

a corollary is i think even once AI can automate the "google for the error and whack it until it works" loop, this is probably still quite far off from being able to fully automate frontier ML research, though it certainly will make research more pleasant

3Nathan Helm-Burger9mo

I agree if I specify 'quite far off in ability-space', while acknowledging that I think this may not be 'quite far off in clock-time'. Sometimes the difference between no skill at a task and very little skill is a larger time and effort gap than the difference between very little skill and substantial skill.

7jacquesthibs9mo

Completely agree. I remember a big shift in my performance when I went from "I'm just using programming so that I can eventually build a startup, where I'll eventually code much less" to "I am a programmer, and I am trying to become exceptional at it." The shift in mindset was super helpful.

5Noosphere899mo

More and more, I'm coming to the belief that one big flaw of basically everyone in general is not realizing how much you needed to deal with annoying and pesky/stupid details to do good research, and I believe some of this dictum also applies to alignment research as well. There is thankfully more engineering/ML experience in LW which alleviates the issue partially, but still, not realizing that pesky details mattering a lot in research/engineering is a problem that basically no one wants to particularly deal with.

4Viliam9mo

I would hope for some division of labor. There are certainly people out there who can't do ML research, but can fix Python code. But I guess, even if you had the Python guy and the budget to pay him, waiting until he fixes the bug would still interrupt your flow.

[-]leogao9mo103

I think there are several reasons this division of labor is very minimal, at least in some places.

You need way more of the ML engineering / fixing stuff skill than ML research. Like, vastly more. There are still a very small handful of people who specialize full time in thinking about research, but they are very few and often very senior. This is partly an artifact of modern ML putting way more emphasis on scale than academia.
Communicating things between people is hard. It's actually really hard to convey all the context needed to do a task. If someone is good enough to just be told what to do without too much hassle, they're likely good enough to mostly figure out what to work on themselves.
Convincing people to be excited about your idea is even harder. Everyone has their own pet idea, and you are the first engineer on any idea you have. If you're not a good engineer, you have a bit of a catch-22: you need promising results to get good engineers excited, but you need engineers to get results. I've heard of even very senior researchers finding it hard to get people to work on their ideas, so they just do it themselves.

1Sheikh Abdur Raheem Ali9mo

This is encouraging to hear as someone with relatively little ML research skill in comparison to experience with engineering/fixing stuff.

2Nathan Helm-Burger9mo

For sure. The more novel an idea I am trying to test, the deeper I have to go into the lower level programming stuff. I can't rely on convenient high-level abstractions if my needs are cutting across existing abstractions. Indeed, I take it as a bad sign of the originality of my idea if it's too easy to implement in an existing high-level library, or if an LLM can code it up correctly with low-effort prompting.

[-]leogao9mo8654

in research, if you settle into a particular niche you can churn out papers much faster, because you can develop a very streamlined process for that particular kind of paper. you have the advantage of already working baseline code, context on the field, and a knowledge of the easiest way to get enough results to have an acceptable paper.

while these efficiency benefits of staying in a certain niche are certainly real, I think a lot of people end up in this position because of academic incentives - if your career depends on publishing lots of papers, then a recipe to get lots of easy papers with low risk is great. it's also great for the careers of your students, because if you hand down your streamlined process, then they can get a phd faster and more reliably.

however, I claim that this also reduces scientific value, and especially the probability of a really big breakthrough. big scientific advances require people to do risky bets that might not work out, and often the work doesn't look quite like anything anyone has done before.

as you get closer to the frontier of things that have ever been done, the road gets tougher and tougher. you end up spending more time building basic infra... (read more)

5Nathan Helm-Burger9mo

I think this is true, and I also think that this is an even stronger effect in wetlab fields where there is lock-in to particular tools, supplies, and methods. This is part of my argument for why there appears to be an "innovation overhang" of underexplored regions of concept space. And, in the case of programming dependent disciplines, I expect AI coding assistance to start to eat away at the underexplored ideas, and for full AI researchers to burn through the space of implied hypotheses very fast indeed. I expect this to result in a big surge of progress once we pass that capability threshold.

3M. Y. Zuo9mo

Or perhaps on the flip side there is a ‘super genius underhang’ where there are insufficient numbers of super competent people to do that work. (Or willing to bet on their future selves being super competent.) It makes sense for the above average, but not that much above average, researcher to choose to focus on their narrow niche, since their relative prospects are either worse or not evaluable after wading into the large ocean of possibilities.

3Amalthea9mo

Or simply when scaling becomes too expensive.

4Jacob Pfau9mo

I agree that academia over rewards long-term specialization. On the other hand, it is compatible to also think, as I do, that EA under-rates specialization. At a community level, accumulating generalists has fast diminishing marginal returns compared to having easy access to specialists with hard-to-acquire skillsets.

2jacquesthibs9mo

This is one of the reasons I think 'independent' research is valuable, even if it isn't immediately obvious from a research output (papers, for example) standpoint. That said, I've definitely had the thought, "I should niche down into a specific area where there is already a bunch of infrastructure I can leverage and churn out papers with many collaborators because I expect to be in a more stable funding situation as an independent researcher. It would also make it much easier to pivot into a role at an organization if I want to or necessary. It would definitely be a much more stable situation for me."(And I also agree that specialization is often underrated.) Ultimately, I decided not to do this because I felt like there were already enough people in alignment/governance who would take the above option due to financial and social incentives and published directions seeming more promising. However, since this makes me produce less output, I hope this is something grantmakers keep in consideration for my future grant applications.

[-]leogao2mo8460

the modern world has many flaws, but I'm still deeply grateful for the modern era of unprecedented peace, prosperity, and freedom in the developed world. 99% of people reading these words have never had to worry about dying in a cholera epidemic, or malaria or smallpox or the plague, or childbirth, or in war, or from a famine, or due to a political purge. this is not true for other times in history, or other places in the world today.

(extremely unoriginal thought, but still important to acknowledge periodically because it's easy to take for granted. especially because it's much more common to complain about ways the world is broken than to acknowledge what has improved over time.)

5ZY2mo

There might already be a selection bias for people who read lesswrong. For people who are in war, having some malaria, and etc - they may not get the chance to access internet, nor have time to gain interest in lesswrong. My point is while we are grateful, we probably would want to reflect on our own privileges on access to peace (not sure unprecedented), and recognize there are still a lot of work needed to be done for people outside our circles. Additionally, sometimes bad things could happen all in a sudden - maybe the country you are in suddenly is war zone, maybe you got some accidents/mental health issues, or maybe the worry about dying is just a healthcare disaster away. The health risk, especially, increases with age.

5Canaletto2mo

And also I think it's less than 99% anyway, ''worry about dying in a cholera epidemic, or malaria or smallpox or the plague, or childbirth, or in war, or from a famine, or due to a political purge" is very broad. Like, surely some of the people here were at some risk during covid epidemic? Some people are male, from countries with drafts which are currently participating or at tangible risk of wars. Some people are from countries with unstable governments and done some activism work before. Like, 75% is my guess.

4samuelshadrach2mo

(My response to you is also unoriginal but worth stating imo.) I would prefer if you used the phrase "US geopolitical sphere of influence" instead of "developed world". It makes it clear your take is political. Leaders within the US govt have obviously contributed to multiple wars and genocides, you just happen to be born into a family that is not on the receiving end of any of them. Part of the reason (but not the full reason) for the economic prosperity is crude oil deals made by the US govt under threat of nuclear war. Statements such as yours give leaders within the US govt implicit consent to continue this sort of rule over the world.

7Hastings2mo

This period of global safety is not fairly distributed, But it is also real https://data.unicef.org/resources/levels-and-trends-in-child-mortality-2024/

4samuelshadrach2mo

Rule of law Energy use per capita Global utility includes the above two things (first two tiers of Maslow's hierarchy) not just counting the number of deaths (where I agree health-related deaths are the biggest bracket). I consider US govt partially responsible for unequal distribution.

0Nick_Tarleton2mo

Neither the mortality-rate nor the energy-use map lines up that closely with the US geopolitical sphere of influence. (E.g. Russia and China on the one hand, Latin America on the other.) I'm not saying the US government isn't partially responsible for unequal distribution, but your previous comment sounds like treating it as the only or primary significant factor. (I'm also not sure what point you're trying to make at all with the energy-use map, given how similar it looks to the mortality-rate map.)

1samuelshadrach2mo

I'm not claiming it's the only factor. Russia and China obviously have significant crude oil reserves which they use domestically. They get to keep them instead of exporting to someone because they have nuclear weapons. All of industry is ultimately based on a few resources such as crude oil, coal and water. These are then used to make steel and electricity which are then used to make industrial supplies for chemicals and so on. So a shortage of drugs or of roads or of hospitals does indirectly tie into the energy use of the country.

1mlsbt2mo

"Crude oil deals made by the US govt" are responsible for a negligible proportion of global economic prosperity, which comes out of the global scientific ecosystem that has been centered in the US for nearly 100 years.

1samuelshadrach2mo

I'm mainly trying to explain this graph of energy use per capita. I agree the US exports a variety of goods including weapons, food, industrial products, aircraft and so on, and this gives them more money to purchase crude oil. And being on the leading edge of science and engineering for these industries enables them to make these exports in the first place. US military protection including nuclear protection is obviously another reason why US gets favourable deals from its allies though.

1samuelshadrach2mo

@Arjun Panickssery I'm not sure what counts as definitive proof to you. US crude oil imports: https://worldpopulationreview.com/country-rankings/us-oil-imports-by-country You can read history of US relations with Saudi Arabia or Iraq or South Korea or any of the other countries at the top of this list.

[-]leogao7mo810

I decided to conduct an experiment at neurips this year: I randomly surveyed people walking around in the conference hall to ask whether they had heard of AGI

I found that out of 38 respondents, only 24 could tell me what AGI stands for (63%)

we live in a bubble

(https://x.com/nabla_theta/status/1869144832595431553)

9Eric Neyman7mo

What's your guess about the percentage of NeurIPS attendees from anglophone countries who could tell you what AGI stands for?

6leogao7mo

not sure, i didn't keep track of this info. an important data point is that because essentially all ML literature is in english, non-anglophones generally either use english for all technical things, or at least codeswitch english terms into their native language. for example, i'd bet almost all chinese ML researchers would be familiar with the term CNN and it would be comparatively rare for people to say 卷积神经网络. (some more common terms like 神经网络 or 模型 are used instead of their english counterparts - neural network / model - but i'd be shocked if people didn't know the english translations) overall i'd be extremely surprised if there were a lot of people who knew conceptually the idea of AGI but didn't know that it was called AGI in english

6Daniel Kokotajlo7mo

Very interesting! Those who couldn't tell you what AGI stands for -- what did they say? Did they just say "I don't know" or did they say e.g. "Artificial Generative Intelligence...?" Is it possible that some of them totally HAD heard the term AGI a bunch, and basically know what it means, but are just being obstinate? I'm thinking of someone who is skeptical of all the hype and aware the lots of people define AGI differently. Such a person might respond to "Can you tell me what AGI means" with "No I can't (because it's a buzzword that means different things to different people)"

[-]leogao7mo150

the specific thing i said to people was something like:

excuse me, can i ask you a question to help settle a bet? do you know what AGI stands for? [if they say yes] what does it stand for? [...] cool thanks for your time

i was careful not to say "what does AGI mean".

most people who didn't know just said "no" and didn't try to guess. a few said something like "artificial generative intelligence". one said "amazon general intelligence" (??). the people who answered incorrectly were obviously guessing / didn't seem very confident in the answer.

if they seemed confused by the question, i would often repeat and say something like "the acronym AGI" or something.

several people said yes but then started walking away the moment i asked what it stood for. this was kind of confusing and i didn't count those people.

4lewis smith7mo

not to be 'i trust my priors more than your data', but i have to say that i find the AGI thing quite implausible; my impression is that most AI researchers (way more than 60%), even ones working in like something very non-deep learning adjacent, have heard of the term AGI, but many of them are/were quite dismissive of it as an idea or associate it strongly (not entirely unfairly) with hype /bullshit, hence maybe walking away from you when you ask them about it. e.g deepmind and openAI have been massive producers of neurips papers for years now (at least since I started a phd in 2016), and both organisations explictly talked about AGI fairly often for years. maybe neurips has way more random attendees now (i didn't go this year), but I still find this kind of hard to believe; I think I've read about AGI in the financial times now.

6leogao7mo

only 2 people walked away without answering (after saying yes initially); they were not counted as yes or no. another several people refused to even answer, but this was also quite rare. the no responders seemed genuinely confused, as opposed to dismissive. feel free to replicate this experiment at ICML or ICLR or next neurips.

2lewis smith7mo

i mean i think that its' definitely an update (anything short of 95% i think would have been quite surprising to me)

3Mo Putera7mo

Why not try out leogao's survey yourself to corroborate/falsify your priors?

4Eli Tyre7mo

Was this possibly a language thing? Are there Chinese or Indian machine learning researchers who would use a different term than AGI in their native language?

6leogao7mo

I'd be surprised if this were the case. next neurips I can survey some non native English speakers to see how many ML terms they know in English vs in their native language. I'm confident in my ability to administer this experiment on Chinese, French, and German speakers, which won't be an unbiased sample of non-native speakers, but hopefully still provides some signal.

4lc7mo

I think if I got asked randomly at an AI conference if I knew what AGI was I would probably say no, just to see what the questioner was going to tell me.

2Nathan Helm-Burger7mo

I'd be curious to hear some of the guesses people make when they say they don't know.

[-]leogao5mo7710

when i was new to research, i wouldn't feel motivated to run any experiment that wouldn't make it into the paper. surely it's much more efficient to only run the experiments that people want to see in the paper, right?

now that i'm more experienced, i mostly think of experiments as something i do to convince myself that a claim is correct. once i get to that point, actually getting the final figures for the paper is the easy part. the hard part is finding something unobvious but true. with this mental frame, it feels very reasonable to run 20 experiments for every experiment that makes it into the paper.

6Gunnar_Zarncke5mo

What is often left out in papers is all of these experiments and the though chains people had about them.

4Sheikh Abdur Raheem Ali5mo

This is also because of Jevon's Paradox. As the cost of doing an experiment reduces with experience, the number of experiments run tends to rise.

[-]leogao16d680

random brainstorming ideas for things the ideal sane discourse encouraging social media platform would have:

have an LM look at the comment you're writing and real time give feedback on things like "are you sure you want to say that? people will interpret that as an attack and become more defensive, so your point will not be heard". addendum: if it notices you're really fuming and flame warring, literally gray out the text box for 2 minutes with a message like "take a deep breath. go for a walk. yelling never changes minds"
have some threaded chat component bolted on (I have takes on best threading system). big problem is posts are fundamentally too high effort to be a way to think; people want to talk over chat (see success of discord). dialogues were ok but still too high effort and nobody wants to read the transcript. one stupid idea is have an LM look at the transcript and gently nudge people to write things up if the convo is interesting and to have UI affordances to make it low friction (eg a single button that instantly creates a new post and automatically invites everyone from the convo to edit, and auto populates the headers)
inspired by the court system, the most autisticall

... (read more)

9MondSemmel11d

Cynical thought: these two points might be incompatible. Social media thrives on network effects, and one requirement for those is that the website be addicting or attention-grabbing. Anti-addictiveness designs are nice in principle, but then your prospective users just spend their time on something that's more addicting instead (whether other websites or Netflix or whatever), and thus can't benefit from the other ways in which your site is better.

7Raemon16d

I'm so torn about "for like 75% or maybe 99% of humans, the chatbot saying 'are you sure you want to say that?' is probably legit an improvement. But... it just feels so slippery-slope-orwellian to me." (In particular, if you build that feature, you need to be confident not only that the current leadership of your company won't abuse it, but that all future leadership won't either, and that the AI company you're renting models from won't enshittify in a way you don't notice) (I am saying this as, like, a forum-maintainer who is actually taking the idea seriously and trying to figure out how to get the good things from the idea, not just randomly dunking on it. Interested in more variants or takes)

3leogao16d

to be clear I explicitly decided not to think too hard about this kind of issue when brainstorming. I think the long run solution is probably something like an elected governance scheme that lets the users control what model to use. maybe make it bicameral to split power between users and funders. but my main motivation for this brainstorming was to think of ideas I could implement in a weekend for shits and giggles to see how well they work irl

2Raemon16d

I lean towards not using models directly as "conversation participants", which feels too likely to spiral out of control, but instead do things like have white-listed specific popups that they decide when to trigger.

4ryan_greenblatt15d

IMO, part of the solution to endless scrolling is to not implement the feature where you can endless scroll. Instead, have an explicit next page button after some moderate amount of scrolling. (Also having the pop up is good, you could even let people program the pop up to be more frequent etc.)

5leogao15d

there's a broader category of things which are not literally scrolling but still time wasting / consuming info not to enrich oneself, but to push the dopamine button, and I think even removing the scroll doesn't fix this (my phone is intentionally quite high friction to use and I still fail to stay off of it)

2Algon14d

I wish to hear these takes.

1AlphaAndOmega15d

I'd be down to try something along those lines. I wonder if anyone has ball-park figures for how much the LLM, used for tone-warnings and light moderation, would cost? I am uncertain about what grade of model would be necessary for acceptable results, though I'd wager a guess that Gemini 2.5 Flash would be acceptable. Disclosure: I'm an admin of themotte.org, and an unusually AI-philic one. I'd previously floated the idea of fine-tuning an LLM on records of previous moderator interactions and associated parent comments (both good and bad, us mods go out of our way to recognize and reward high quality posts, after user reports). Our core thesis is to be a place for polite and thoughtful discussion of contentious topics, and necessarily, we have rather subjective moderation guidelines. (People can be very persistent and inventive about sticking to the RAW while violating the spirit) Even 2 years ago, when I floated the idea, I think it would have worked okay, and these days, I think you could get away without fine-tuning at all. I suspect the biggest hurdle would be models throwing a fit over controversial topics/views, even if the manner and phrasing were within discussion norms. Sadly, now, as it was then, the core user base was too polarized to support such an endeavor. I'd still like to see it put into use. >argument mapping is really cool imo but I think most attempts fail because they try to make arguments super structured and legible. I think a less structured version that lets you vote on how much you think various posts respond to other posts and how well you think it addresses the key points and which posts overlap in arguments would be valuable. like you'd see clusters with (human written and vote selected) summaries of various clusters, and then links of various strengths inter cluster. I think this would greatly help epistemics by avoiding infinite argument retreading Another feature I might float is the idea of granular voting. Let's say there'

4leogao15d

the LLM cost should not be too bad. it would mostly be looking at vague vibes rather than requiring lots of reasoning about the thing. I trust e.g AI summaries vastly less because they can require actual intelligence. I'm happy to fund this a moderate amount for the MVP. I think it would be cool if this existed. I don't really want to deal with all the problems that come with modifying something that already works for other people, at least not before we're confident the ideas are good. this points towards building a new thing. fwiw I think if building a new thing, the chat part would be most interesting/valuable standalone (and I think it's good to have platforms grow out of a simple core rather than to do everything at once)

1sjadler14d

One consideration re: the tone-warning LLMs: make sure to be aware that this means you're pseudo-publishing someone's comment before they meant to. Not publishing in discoverable sense, but logging it to a database somewhere (i.e., probably controlled by the LLM provider) - and depending on the types of writing, this might affect people's willingness to actually write stuff

1ProgramCrafter12d

This is fixable by a) hosting own model, and double-checking that code does not log incoming content in any way, b) potentially, having that model on client side (over time, it might shrink to some manageable size).

[-]leogao7mo6313

it's quite plausible (40% if I had to make up a number, but I stress this is completely made up) that someday there will be an AI winter or other slowdown, and the general vibe will snap from "AGI in 3 years" to "AGI in 50 years". when this happens it will become deeply unfashionable to continue believing that AGI is probably happening soonish (10-15 years), in the same way that suggesting that there might be a winter/slowdown is unfashionable today. however, I believe in these timelines roughly because I expect the road to AGI to involve both fast periods and slow bumpy periods. so unless there is some super surprising new evidence, I will probably only update moderately on timelines if/when this winter happens

[-]leogao7mo3416

also a lot of people will suggest that alignment people are discredited because they all believed AGI was 3 years away, because surely that's the only possible thing an alignment person could have believed. I plan on pointing to this and other statements similar in vibe that I've made over the past year or two as direct counter evidence against that

(I do think a lot of people will rightly lose credibility for having very short timelines, but I think this includes a big mix of capabilities and alignment people, and I think they will probably lose more credibility than is justified because the rest of the world will overupdate on the winter)

8Jozdien7mo

My timelines are roughly 50% probability on something like transformative AI by 2030, 90% by 2045, and a long tail afterward. I don't hold this strongly either, and my views on alignment are mostly decoupled from these beliefs. But if we do get an AI winter longer than that (through means other than by government intervention, which I haven't accounted for), I should lose some Bayes points, and it seems worth saying so publicly.

5leogao7mo

to be clear, a "winter/slowdown" in my typology is more about the vibes and could only be a few years counterfactual slowdown. like the dot-com crash didn't take that long for companies like Amazon or Google to recover from, but it was still a huge vibe shift

5leogao7mo

also to further clarify this is not an update I've made recently, I'm just making this post now as a regular reminder of my beliefs because it seems good to have had records of this kind of thing (though everyone who has heard me ramble about this irl can confirm I've believed sometime like this for a while now)

4the gears to ascension7mo

I was someone who had shorter timelines. At this point, most of the concrete part of what I expected has happened, but the "actually AGI" thing hasn't. I'm not sure how long the tail will turn out to be. I only say this to get it on record.

6TsviBT7mo

If you keep updating such that you always "think AGI is <10 years away" then you will never work on things that take longer than 15 years to help. This is absolutely a mistake, and it should at least be corrected after the first round of "let's not work on things that take too long because AGI is coming in the next 10 years". I will definitely be collecting my Bayes points https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce

5yams7mo

Does it seem likely to you that, conditional on ‘slow bumpy period soon’, a lot of the funding we see at frontier labs dries up (so there’s kind of a double slowdown effect of ‘the science got hard, and also now we don’t have nearly the money we had to push global infrastructure and attract top talent’), or do you expect that frontier labs will stay well funded (either by leveraging low hanging fruit in mundane utility, or because some subset of their funders are true believers, or a secret third thing)?

3Noosphere897mo

My guess is that for now, I'd give around a 10-30% chance to "AI winter happens for a short period/AI progress slows down" by 2027. Also, what would you consider super surprising new evidence?

1anaguma7mo

What do you think would be the cause(s) of the slowdown?

[-]leogao3mo579

every 4 years, the US has the opportunity to completely pivot its entire policy stance on a dime. this is more politically costly to do if you're a long-lasting autocratic leader, because it is embarrassing to contradict your previous policies. I wonder how much of a competitive advantage this is.

[-]interstice3mo2515

Or disadvantage, because it makes it harder to make long-term plans and commitments?

[-]Garrett Baker3mo150

Autarchies, including China, seem more likely to reconfigure their entire economic and social systems overnight than democracies like the US, so this seems false.

4leogao3mo

It's often very costly to do so - for example, ending the zero covid policy was very politically costly even though it was the right thing to do. Also, most major reconfigurations even for autocratic countries probably mostly happen right after there is a transition of power (for China, Mao is kind of an exception, but thats because he had so much power that it was impossible to challenge his authority even when he messed up).

2Garrett Baker3mo

The closing off of China after/during Tinamen square I don't think happened after a transition of power, though I could be mis-remembering. See also the one-child policy, which I also don't think happened during a power transition (allowed for 2 children in 2015, then removed all limits in 2021, while Xi came to power in 2012). I agree the zero-covid policy change ended up being slow. I don't know why it was slow though, I know a popular narrative is that the regime didn't want to lose face, but one fact about China is the reason why many decisions are made is highly obscured. It seems entirely possible to me there were groups (possibly consisting of Xi himself) who believed zero-covid was smart. I don't know much about this though. I will also say this is one example of china being abnormally slow of many examples of them being abnormally fast, and I think the abnormally fast examples win out overall. Ish? The reason he pursued the cultural revolution was because people were starting to question his power, after the great leap forward, but yeah he could be an outlier. I do think that many autocracies are governed by charismatic & powerful leaders though, so not that much an outlier.

[-]leogao3mo122

I mean, the proximate cause of the 1989 protests was the death of the quite reformist general secretary Hu Yaobang. The new general secretary, Zhao Ziyang, was very sympathetic towards the protesters and wanted to negotiate with them, but then he lost a power struggle against Li Peng and Deng Xiaoping (who was in semi retirement but still held onto control of the military). Immediately afterwards, he was removed as general secretary and martial law was declared, leading to the massacre.

6Ben3mo

Having unstable policy making comes with a lot of disadvantages as well as advantages. For example, imagine a small poor country somewhere with much of the population living in poverty. Oil is discovered, and a giant multinational approaches the government to seek permission to get the oil. The government offers some kind of deal - tax rates, etc. - but the company still isn't sure. What if the country's other political party gets in at the next election? If that happened the oil company might have just sunk a lot of money into refinery's and roads and drills only to see them all taken away by the new government as part of its mission to "make the multinationals pay their share for our people." Who knows how much they might take? What can the multinational company do to protect itself? One answer is to try and find a different country where the opposition parties don't seem likely to do that. However, its even better to find a dictatorship to work with. If people think a government might turn on a dime, then they won't enter into certain types of deal with it. Not just companies, but also other countries. So, whenever a government does turn on a dime, it is gaining some amount of reputation for unpredictability/instability, which isn't a good reputation to have when trying to make agreements in the future.

[-]leogao7mo5435

people around these parts often take their salary and divide it by their working hours to figure out how much to value their time. but I think this actually doesn't make that much sense (at least for research work), and often leads to bad decision making.

time is extremely non fungible; some time is a lot more valuable than other time. further, the relation of amount of time worked to amount earned/value produced is extremely nonlinear (sharp diminishing returns). a lot of value is produced in short flashes of insight that you can't just get more of by spending more time trying to get insight (but rather require other inputs like life experience/good conversations/mentorship/happiness). resting or having fun can help improve your mental health, which is especially important for positive tail outcomes.

given that the assumptions of fungibility and linearity are extremely violated, I think it makes about as much sense as dividing salary by number of keystrokes or number of slack messages.

concretely, one might forgo doing something fun because it seems like the opportunity cost is very high, but actually diminishing returns means one more hour on the margin is much less valuable than the average implies, and having fun improves productivity in ways not accounted for when just considering the intrinsic value one places on fun.

[-]habryka7mo299

but actually diminishing returns means one more hour on the margin is much less valuable than the average implies

This importantly also goes in the other direction!

One dynamic I have noticed people often don't understand is that in a competitive market (especially in winner-takes-all-like situations) the marginal returns to focusing more on a single thing can be sharply increasing, not only decreasing.

In early-stage startups, having two people work 60 hours is almost always much more valuable than having three people work 40 hours. The costs of growing a team are very large, the costs of coordination go up very quickly, and so if you are at the core of an organization, whether you work 40 hours or 60 hours is the difference between being net-positive vs. being net-negative.

This is importantly quite orthogonal whether you should rest or have fun or whatever. While there might be at an aggregate level increasing marginal returns to more focus, it is also the case that in such leadership positions, the most important hours are much much more productive than the median hour, and so figuring out ways to get more of the most important hours (which often rely on peak cognitive performance and a non-conflicted motivational system) is even more leveraged than adding the marginal hour (but I think it's important to recognize both effects).

[-]leogao7mo1411

agree it goes in both directions. time when you hold critical context is worth more than time when you don't. it's probably at least sometimes a good strategy to alternate between working much more than sustainable and then recovering.

my main point is this is a very different style of reasoning than what people usually do when they talk about how much their time is worth.

1Maxime Riché7mo

It seems that your point applies significantly more to "zero-sum markets". So it may be good to notice it may not apply for altruistic people when non-instrumentally working on AI safety.

1CstineSublime7mo

Are these people trying to determine how much they (subjectively) value their time or how much they should value their time? Because I think if it's the former and Descriptive, wouldn't the obvious approach be to look at what time-saving services they have employed recently or in the past and see how much they have paid for them relative to how much time they saved? I'm referring to services or products where they could have done it themselves as they have the tools, abilities and freedom to commit to it, but opted to buy a machine or outsource the task to someone else. (I am aware that the hidden variable of 'effort' complicates this model). For example, in what situations will I walk or take public transport to get somewhere, and which ones will I order an Uber: There's a certain cross-over point where if the time-saved is enough I'll justify the expense to myself, which would seem to be a good starting point for evaluating in descriptive terms how much I value my time. I'm guessing if you had enough of these examples where the effort-saved was varied enough then you'd begin to get more accurate model of how one values their time?

7leogao7mo

I think the most important part of paying for goods and services is often not the raw time saved, but the cognitive overhead avoided. for instance, I'd pay much more to avoid having to spend 15 minutes understanding something complicated (assuming there is no learning value) than 15 minutes waiting. so it's plausibly more costly to have to figure out the timetable, fare system, remembering to transfer, navigating the station, than the additional time spent in transit (especially applicable in a new unfamiliar city)

9Viliam7mo

I guess is depends on the kind of work you do (and maybe whether you have ADHD). From my perspective, yes, attention is even more scarce than time or money, because when I get home from work, it feels like all my "thinking energy" is depleted, and even if I could somehow leverage the time or money for some good purpose, I am simply unable to do that. Working even more would mean that my private life would fall apart completely. And people would probably ask "why didn't he simply...?", and the answer would be that even the simple things become very difficult to do when all my "thinking energy" is gone. There are probably smart ways to use money to reduce the amount of "thinking energy" you need to spend in your free time, but first you need enough "thinking energy" to set up such system. The problem is, the system needs to be flawless, because otherwise you still need to spend "thinking energy" to compensate for its flaws. EDIT: I especially hate things like the principal-agent problem, where the seemingly simple answer is: "just pay a specialist to do that, duh", but that immediately explodes to "but how can I find a specialist?" and "how can I verify that they are actually doing a good job?", which easily become just as difficult as the original problem I tried to solve.

3CstineSublime7mo

I wasn't asking how most people go about determining which goods or services to pay for generally, but rather if you're noticing that they are using the working hours by salary equation to determine what their time is worth, if it's to put a dollar figure on what they do in fact value it at, (and that isolates the time element from the effort or cognitive load element) I didn't specify nor imply that one route took more cognitive load than the other, only that one was quicker than the other, and that differential would be one such way of revealing the value of time. (Otherwise they're not, in fact, trying to ascertain what their time is worth at all... but something else) Nowadays using Public Transport is often no more complicated or takes no more effort than using Uber thanks to Google Maps, but this tangent is immaterial to my question: are you noticing these people are trying to measure how much they DO value their time, or are they trying to ascertain how much they SHOULD value their time?

[-]leogao2mo481

random half baked thoughts from a sleep deprived jet lagged mind: my guess is that the few largest principal components of variance of human intelligence are something like:

a general factor that affects all cognitive abilities uniformly (this is a sum of a bazillion things. they could be something physiological like better cardiovascular function, or more efficient mitochondria or something; or maybe there's some pretty general learning/architecture hyperparameter akin to lr or aspect ratio that simply has better or worse configurations. each small change helps/hurts a little bit). having a better general factor makes you better at pattern recognition and prediction, which is the foundation of all intelligence. whether this is learning a policy or a world model, you need to be able to spot regularities in the world to exploit to have any hope of making good predictions.may
a systematization factor (how much to be inclined towards using the machinery of pattern recognition towards finding and operating using explicit rules about the world, vs using that machinery implicitly and relying on intuition). this is the autist vs normie axis. importantly, it's not like normies are born with

... (read more)

[-]leogao2mo160

why is ADHD also strongly correlated with systematization? it could just be worse self modelling - ADHD happens when your brain's model of its own priorities and motivations falls out of sync from your brain's actual priorities and motivations. if you're bad at understanding yourself, you will misunderstand your priorities, and also you will not be able to control your priorities, because you won't know what kinds of evidence will really persuade your brain to adopt a specific priority, and your brain will learn that it can't really trust you to assign it priorities to satisfy its motives (burnout).

[-]leogao2mo17-1

why do stimulants help ADHD? well, they short circuit the part where your brain figures out what priorities to trust based on whether they achieve your true motives. if your brain has already learned that your self model is bad at picking actions that eventually pay off towards its true motives, it won't put its full effort behind those actions. if you can trick it by making every action feel like it's paying off, you can get it to go along.

honestly unclear whether this is good or bad. on the one hand, if your self model has fallen out of sync, this is pretty necessary to get things done, and could get you out of a bad feedback loop (ADHD is really bad for noticing that your self model has fallen horribly out of sync and acting effectively on it!). some would argue on naturalistic grounds that ideally the true long term solution is to use your brain's machinery the way it was always intended, by deeply understanding and accepting (and possibly modifying) your actual motives/priorities and having them steer your actions. the other option is to permanently circumvent your motivation system, to turn it into a rubber stamp for whatever decrees are handed down from the self model, which, forever unmoored from needing to model the self, is no longer an understanding of the self but rather an aspirational endpoint towards which the self is molded. I genuinely don't know which is better as an end goal.

1Morpheus2mo

I view taking stimulants more as a move to get the more reflective parts of my brain more power ("getting my taxes done is good, because we need to do it eventually, now is actually a good time, doing my taxes now will be as boring as doing them in the future, rather than playing magic the gathering now") in steering compared to my more primitive "true motives" that tend to be hyperbolicly discounted ("dosing in bed is nice", "washing dishes is boring", "doing taxes is boring"). Maybe I am horrible at self-modelling, but the part where the self model is out of sync as an explanation why the self-reflective parts have less steering power seems unnecessary.

4leogao2mo

it is kind of funny that caring a lot about reflective stability of alignment proposals and paradoxes arising from self modelling (e.g in action counterfactuals) is most common in the people who are the worst at modelling themselves

2Mateusz Bagiński2mo

do you think that Stanovich's reflective mind and need for cognition are downstream from these two?

2Jonas Hallgren2mo

I think you're framing the intuition vs. systematization relationship in a limiting way. From a predictive coding perspective, these aren't opposing traits on an "autist vs normie axis", they're complementary processes working within the same neural architecture. Predictive coding research shows our brains use both bottom-up signals (intuition) and top-down predictions (systematization) in a dynamic interplay . These are integrated parts of how our brains process information. One person can excel at both. What appears as preference for systematization reflects differences in how prediction errors are weighted and processed - not a fundamental limitation. You can develop both capacities because they use the same underlying predictive machinery. I would, however, agree with your take that most people don't do this but that is because they generally don't search for prediction error after a certain point since it is easier to just live in your secure bubble. So you're right in that this is probably how it looks like in practice since people will just use the strat (top-down systemization or bottom-up intuition) that has lead to the most amount of reward in the past. Shorter blog on how emotions interact with this hierarchical processing system - https://blog.dropbox.com/topics/work-culture/the-mind-at-work--lisa-feldman-barrett-on-the-metabolism-of-emot Relating this to Kahnemahns's system 1 and 2 work - https://pmc.ncbi.nlm.nih.gov/articles/PMC8979207/ edit: first link was formatted wrong.

2Mo Putera2mo

Link is broken, can you reshare?

3Jonas Hallgren2mo

Fixed the comment, thanks! (Here it is otherwise:) https://pmc.ncbi.nlm.nih.gov/articles/PMC5390700/

1KanizsaBoundary2mo

I wonder to what degree the genome has "solved" intelligence. You could imagine perhaps that we are all sort of noisy instantiations of the ideal intelligence, and that reduction in noise (possibly mainly literal cortex-to-cortex SNR) is mostly what results in intelligence variations. Even considering this, the genome probably does not encode a truly complete solution in the sense that there are plenty of cases where there are mental skills that have the potential for positive feedback and a positive correlation, but basically don't. The genome probably has no understanding of the geometric langlands conjecture. That is to say, there are deep and useful truths, especially ones that are pointing out symmetries between extremely deep natural categories, and we have not adapted to them at a deep level yet. Therefore the positive manifold of all mental skills is very much still under construction. One could then wonder to what degree variance comes from genetic denoising and what fraction comes from aligning to novel-to-genome deep truths. All that said, may be ill-posed, defining noise and novelty here seems like it could be hard.

[-]leogao2mo4717

execution is necessary for success, but direction is what sets apart merely impressive and truly great accomplishment. though being better at execution can make you better at direction, because it enables you to work on directions that others discard as impossible.

[-]ryan_greenblatt2mo244

execution also lets you try directions out faster to get some signal

5Dagon2mo

I expect that there's no simple relationship between these factors and success. Both are required, and it's idiosyncratic which one is most lacking in any given margin between not-success and success.

3Nullity2mo

I usually think of execution as compute and direction as discernment. Compute = ability to work through specific directions effectively, discernment = ability to decide which of two directions is more promising. Probably success is upper-bounded by the product of the two, in a sufficiently informal way.

2Mo Putera2mo

Thomas Kwa's Effectiveness is a Conjunction of Multipliers seems relevant. He factors multipliers into judgment (sort of maps to your 'direction', or research taste I guess), ambition (which counts hard work as a driver), and risk appetite. Some domains seem to reward hard work superlinearly, probably worth looking out for those. You shouldn't skip leg day because you'd miss out on multipliers (that phrase came from SBF of all people). Also finding multipliers is hard and information-gathering is particularly valuable when it helps you find a multiplier and figure out if you can apply it to your personal situation (I'm cribbing most of these from Thomas and the comments).

[-]leogao4mo477

timelines takes

i've become more skeptical of rsi over time. here's my current best guess at what happens as we automate ai research.
- for the next several years, ai will provide a bigger and bigger efficiency multiplier to the workflow of a human ai researcher.
  - ai assistants will probably not uniformly make researchers faster across the board, but rather make certain kinds of things way faster and other kinds of things only a little bit faster.
  - in fact probably it will make some things 100x faster, a lot of things 2x faster, and then be literally useless for a lot of remaining things
  - amdahl's law tells us that we will mostly be bottlenecked on the things that don't get sped up a ton. like if the thing that got sped up 100x was only 10% of the original thing, then you don't get more than a 1/(1 - 10%) speedup.
    - i think the speedup is a bit more than amdahl's law implies. task X took up 10% of the time because there is diminishing returns to doing more X, and so you'd ideally do exactly the amount of X such that the marginal value of time spent on X is exactly in equilibrium with time spent on anything else. if you suddenly decrease the cost of X substantially, the equilibrium point shifts

... (read more)

[-]ryan_greenblatt4mo*362

My current best guess median is that we'll see 6 OOMs of effective compute in the first year after full automation of AI R&D if this occurs in ~2029 using a 1e29 training run and compute is scaled up by a factor of 3.5x^[1] over the course of this year^[2]. This is around 5 years of progress at the current rate^[3].

How big of a deal is 6 OOMs? I think it's a pretty big deal; I have a draft post discussing how much an OOM gets you (on top of full automation of AI R&D) that I should put out somewhat soon.

Further, my distribution over this is radically uncertain with a 25th percentile of 2.5 OOMs (2 years of progress) and a 75th percentile of 12 OOMs.

The short breakdown of the key claims is:

Initial progress will be fast, perhaps ~15x faster algorithmic progress than humans.
Progress will probably speed up before slowing down due to training smarter AIs that can accelerate progress even faster, and this being faster than returns diminish on software.
We'll be quite far from the limits of software progress (perhaps median 12 OOMs) at the point when we first achieve full automation.

Here is a somewhat summarized and rough version of the argument (stealing heavily from some of Tom... (read more)

2Stephen McAleese4mo

Thanks for these thoughtful predictions. Do you think there's anything we can do today to prepare for accelerated or automated AI research?

1Hjalmar_Wijk4mo

Maybe distracting technicality: This seems to make the simplifying assumption that the R&D automation is applied to a large fraction of all the compute that was previously driving algorithmic progress right? If we imagine that a company only owns 10% of the compute being used to drive algorithmic progress pre-automation (and is only responsible for say 30% of its own algorithmic progress, with the rest coming from other labs/academia/open-source), and this company is the only one automating their AI R&D, then the effect on overall progress might be reduced (the 15X multiplier only applies to 30% of the relevant algorithmic progress). In practice I would guess that either the leading actor has enough of a lead that they are already responsible for most of their algorithmic progress, or other groups are close behind and will thus automate their own AI R&D around the same time anyway. But I could imagine this slowing down the impact of initial AI R&D automation a little bit (and it might make a big difference for questions like "how much would it accelerate a non-frontier lab that stole the model weights and tried to do rsi").

6ryan_greenblatt4mo

Yes, I think frontier AI companies are responsible for most of the algorithmic progress. I think its unclear how much the leading actor benefits from progress done at other slightly behind AI companies and this could make progress substantially slower. (However, it's possible the leading AI company would be able to acquire the GPUs from these other companies.)

2Garrett Baker4mo

Why do you think this? It seems to me that for most tasks once an AI gets some skill it is much cheaper to run it for that skill than a human.

[-]leogao5mo468

libraries abstract away the low level implementation details; you tell them what you want to get done and they make sure it happens. frameworks are the other way around. they abstract away the high level details; as long as you implement the low level details you're responsible for, you can assume the entire system works as intended.

a similar divide exists in human organizations and with managing up vs down. with managing up, you abstract away the details of your work and promise to solve some specific problem. with managing down, you abstract away the mission and promise that if a specific problem is solved, it will make progress towards the mission.

(of course, it's always best when everyone has state on everything. this is one reason why small teams are great. but if you have dozens of people, there is no way for everyone to have all the state, and so you have to do a lot of abstracting.)

when either abstraction leaks, it causes organizational problems -- micromanagement, or loss of trust in leadership.

[-]leogao5mo427

there are a lot of video games (and to a lesser extent movies, books, etc) that give the player an escapist fantasy of being hypercompetent. It's certainly an alluring promise: with only a few dozen hours of practice, you too could become a world class fighter or hacker or musician! But because becoming hypercompetent at anything is a lot of work, the game has to put its finger on the scale to deliver on this promise. Maybe flatter the user a bit, or let the player do cool things without the skill you'd actually need in real life.

It's easy to dismiss this kind of media as inaccurate escapism that distorts people's views of how complex these endeavors of skill really are. But it's actually a shockingly accurate simulation of what it feels like to actually be really good at something. As they say, being competent doesn't feel like being competent, it feels like the thing just being really easy.

5trevor5mo

"power fantasies" are actually a pretty mundane phenomenon given how human genetic diversity shook out; most people intuitively gravitate towards anyone who looks and acts like a tribal chief, or towards the possibility that you yourself or someone you meet could become (or already be) a tribal chief, via constructing some abstract route that requires forging a novel path instead of following other people's. Also a mundane outcome of human genetic diversity is how division of labor shakes out; people noticing they were born with savant-level skills and that they can sink thousands of hours into skills like musical instruments, programming, data science, sleight of hand party tricks, social/organizational modelling, painting, or psychological manipulation. I expect the pool to be much larger for power-seeking-adjacent skills than art, and that some proportion of that larger pool of people managed to get their skills's mental muscle memory sufficiently intensely honed that everyone should feel uncomfortable sharing a planet with them.

3Canaletto5mo

The alternative is to pit people against each other in some competitive games, 1 on 1 or in teams. I don't think the feeling you get from such games is consistent with "being competent doesn't feel like being competent, it feels like the thing just being really easy", probably mainly because there is skill level matching, there are always opponents who pose you a real challenge. Hmm maybe such games need some more long tail probabilistic matching, to sometimes feel the difference. Or maybe variable team sizes, with many incompetent people versus few competent, to get a more "doomguy" feeling.

2Viliam4mo

Some games do put their finger on the scale, for example you have a first-person shooter where you learn to aim better but you also now have a gun that deals 200 damage per hit, as opposed to your starting gun that dealt 10. But puzzle-solving games are usually fair, I think.

[-]leogao1y4020

reliability is surprisingly important. if I have a software tool that is 90% reliable, it's actually not that useful for automation, because I will spend way too much time manually fixing problems. this is especially a problem if I'm chaining multiple tools together in a script. I've been bit really hard by this because 90% feels pretty good if you run it a handful of times by hand, but then once you add it to your automated sweep or whatever it breaks and then you have to go in and manually fix things. and getting to 99% or 99.9% is really hard because things break in all sorts of weird ways.

I think this has lessons for AI - lack of reliability is one big reason I fail to get very much value out of AI tools. if my chatbot catastrophically hallucinates once every 10 queries, then I basically have to look up everything anyways to check. I think this is a major reason why cool demos often don't mean things that are practically useful - 90% reliable it's great for a demo (and also you can pick tasks that your AI is more reliable at, rather than tasks which are actually useful in practice). this is an informing factor for why my timelines are longer than some other people's

[-]faul_sname1y107

One nuance here is that a software tool that succeeds at its goal 90% of the time, and fails in an automatically detectable fashion the other 10% of the time is pretty useful for partial automation. Concretely, if you have a web scraper which performs a series of scripted clicks in hardcoded locations after hardcoded delays, and then extracts a value from the page from immediately after some known hardcoded text, that will frequently give you a ≥ 90% success rate of getting the piece of information you want while being much faster to code up than some real logic (especially if the site does anti-scraper stuff like randomizing css classes and DOM structure) and saving a bunch of work over doing it manually (because now you only have to manually extract info from the pages that your scraper failed to scrape).

4leogao1y

I think even if failures are automatically detectable, it's quite annoying. the cost is very logarithmic: there's a very large cliff in effort when going from zero manual intervention required to any manual intervention required whatsoever; and as the amount of manual intervention continues to increase, you can invest in infrastructure to make it less painful, and then to delegate the work out to other people.

6Noosphere891y

While I agree with this, I do want to note that this: Only lengthens timelines very much if we also assume scaling can't solve the reliability problem.

[-]leogao1y170

even if scaling does eventually solve the reliability problem, it means that very plausibly people are overestimating how far along capabilities are, and how fast the rate of progress is, because the most impressive thing that can be done with 90% reliability plausibly advances faster than the most impressive thing that can be done with 99.9% reliability

4Alexander Gietelink Oldenziel1y

Perhaps it shouldn't be too surprising. Reliability, machine precision, economy are likely the deciding factors to whether many (most?) technologies take off. The classic RoP case study: the bike.

3M. Y. Zuo1y

Motorola engineers figured this out a few decades ago, even 99.99 to 99.999 makes a huge difference on a large scale. They even published a few interesting papers and monographs on it from what I recall.

6gietema1y

This can be explained when thinking about what these accuracy levels mean: 99.99% accuracy is one error every 10K trials. 99.999% accuracy is one error every 100K trials. So the 99.999% system is 10x better! When errors are costly and you’re operating at scale, this is a huge difference.

[-]leogao1y*381

i've noticed a life hyperparameter that affects learning quite substantially. i'd summarize it as "willingness to gloss over things that you're confused about when learning something". as an example, suppose you're modifying some code and it seems to work but also you see a warning from an unrelated part of the code that you didn't expect. you could either try to understand exactly why it happened, or just sort of ignore it.

reasons to set it low:

each time your world model is confused, that's an opportunity to get a little bit of signal to improve your world model. if you ignore these signals you increase the length of your feedback loop, and make it take longer to recover from incorrect models of the world.
in some domains, it's very common for unexpected results to actually be a hint at a much bigger problem. for example, many bugs in ML experiments cause results that are only slightly weird, but if you tug on the thread of understanding why your results are slightly weird, this can cause lots of your experiments to unravel. and doing so earlier rather than later can save a huge amount of time
understanding things at least one level of abstraction down often lets you do things more

... (read more)

4Gunnar_Zarncke1y

This seems to be related to Goldfish Reading. Or maybe complementary. In Goldfish Reading one reads the same text multiple times, not trying to understand it all at once or remember everything, i.e., intentionally ignoring confusion. But in a structured form to avoid overload.

6leogao1y

Yeah, this seems like a good idea for reading - lets you get best of both worlds. Though it works for reading mostly because it doesn't take that much longer to do so. This doesn't translate as directly to e.g what to do when debugging code or running experiments.

3Johannes C. Mayer1y

I think it's very important to keep track of what you don't know. It can be useful to not try to get the best model when that's not the bottleneck. But I think it's always useful to explicitly store the knowledge of what models are developed to what extent.

3Johannes C. Mayer1y

The algorithm that I have been using, where what to understand to what extend is not a hyperparameter, is to just solve the actual problems I want to solve, and then always slightly overdo the learning, i.e. I would always learn a bit more than necessary to solve whatever subproblem I am solving right now. E.g. I am just trying to make a simple server, and then I learn about the protocol stack. This has the advantage that I am always highly motivated to learn something, as the path to the problem on the graph of justifications is always pretty short. It also ensures that all the things that I learn are not completely unrelated to the problem I am solving. I am pretty sure if you had perfect control over your motivation this is not the best algorithm, but given that you don't, this is the best algorithm I have found so far.

[-]leogao6mo3710

don't worry too much about doing things right the first time. if the results are very promising, the cost of having to redo it won't hurt nearly as much as you think it will. but if you put it off because you don't know exactly how to do it right, then you might never get around to it.

2Viliam6mo

yep. doing it and then redoing it can still be much faster than procrastinating on it

[-]leogao10mo3616

in some way, bureaucracy design is the exact opposite of machine learning. while the goal of machine learning is to make clusters of computers that can think like humans, the goal of bureaucracy design is to make clusters of humans that can think like a computer

[-]leogao4mo3429

my referral/vouching policy is i try my best to completely decouple my estimate of technical competence from how close a friend someone is. i have very good friends i would not write referrals for and i have written referrals for people i basically only know in a professional context. if i feel like it's impossible for me to disentangle, i will defer to someone i trust and have them make the decision. this leads to some awkward conversations, but if someone doesn't want to be friends with me because it won't lead to a referral, i don't want to be friends with them either.

[-]Neel Nanda4mo2610

Strong agree (except in that liking someone's company is evidence that they would be a pleasant co-worker, but that's generally not a high order bit). I find it very annoying that standard reference culture seems to often imply giving extremely positive references unless someone was truly awful, since it makes it much harder to get real info from references

3Joseph Miller4mo

Agreed, but also most of the world does operate in this reference culture. If you choose to take a stand against it, you might screw over a decent candidate by providing only a quite positive recommendation.

2Neel Nanda4mo

Agreed. If I'm talking to someone who I expect to be able to recalibrate, I just explain that I think the standard norms are dumb, the norms I actually follow, and then give an honest and balanced assessment. If I'm talking to someone I don't really know, I generally give a positive but not very detailed reference or don't reply, depending on context.

[-]leogao16d284

one big problem with using LMs too much imo is that they are dumb and catastrophically wrong about things a lot, but they are very pleasant to talk to, project confidence and knowledgeability, and reply to messages faster than 99.99% of people. these things are more easily noticeable than subtle falsehood, and reinforce a reflex of asking the model more and more. it's very analogous to twitter soundbites vs reading long form writing and how that eroded epistemics.

hotter take: the extent to which one finds current LMs smart is probably correlated with how much one is swayed by good vibes from their interlocutor as opposed to the substance of the argument (ofc conditional on the model actually giving good vibes, which varies from person to person. I personally never liked chatgpt vibes until I wrote a big system prompt)

8Raemon16d

Up for sharing your system prompt?

9leogao15d

it's kind of haphazard and I have no reason to believe I'm better at prompting than anyone else. the broad strokes are I tell it to: * use lowercase * not use emojis * be concise, explain at bird's eye level * don't sugar cost things * not be too professional/formal; use some IRC/twitter slang without overdoing it * speak as if it's a conversation over a dinner table between two close friends who are also technical experts * don't dumb things down but also don't use unnecessary jargon I've also been trying to get it to use CS/ML analogies when it would make things clearer, much the same way people on LW would do, but it's been hard to get the model to do it in a natural, non cringe way. rn it overdoes it and makes lots of very forced and not insightful analogies despite my attempts to explain to it

[-]leogao1yΩ17281

learning thread for taking notes on things as i learn them (in public so hopefully other people can get value out of it)

[-]leogao1y*Ω5110

VAEs:

a normal autoencoder decodes single latents z to single images (or whatever other kind of data) x, and also encodes single images x to single latents z.

with VAEs, we want our decoder (p(x|z)) to take single latents z and output a distribution over x's. for simplicity we generally declare that this distribution is a gaussian with identity covariance, and we have our decoder output a single x value that is the mean of the gaussian.

because each x can be produced by multiple z's, to run this backwards you also need a distribution of z's for each single x. we call the ideal encoder p(z|x) - the thing that would perfectly invert our decoder p(x|z). unfortunately, we obviously don't have access to this thing. so we have to train an encoder network q(z|x) to approximate it. to make our encoder output a distribution, we have it output a mean vector and a stddev vector for a gaussian. at runtime we sample a random vector eps ~ N(0, 1) and multiply it by the mean and stddev vectors to get an N(mu, std).

to train this thing, we would like to optimize the following loss function:

-log p(x) + KL(q(z|x)||p(z|x))

where the terms optimize the likelihood (how good is the VAE at modelling dat... (read more)

[-]leogao7mo279

a take I've expressed a bunch irl but haven't written up yet: feature sparsity might be fundamentally the wrong thing for disentangling superposition; circuit sparsity might be more correct to optimize for. in particular, circuit sparsity doesn't have problems with feature splitting/absorption

2Sodium7mo

Yeah my view is that as long as our features/intermediate variables form human understandable circuits, it doesn't matter how "atomic" they are.

[-]leogao8mo*2623

the most valuable part of a social event is often not the part that is ostensibly the most important, but rather the gaps between the main parts.

at ML conferences, the headline keynotes and orals are usually the least useful part to go to; the random spontaneous hallway chats and dinners and afterparties are extremely valuable
when doing an activity with friends, the activity itself is often of secondary importance. talking on the way to the activity, or in the gaps between doing the activity, carry a lot of the value
at work, a lot of the best conversations happen outside of scheduled 1:1s and group meetings, but rather happen in spontaneous hallway or dinner groups

8johnswentworth8mo

I have heard people say this so many times, and it is consistently the opposite of my experience. The random spontaneous conversations at conferences are disproportionately shallow and tend toward the same things which have been discussed to death online already, or toward the things which seem simple enough that everyone thinks they have something to say on the topic. When doing an activity with friends, it's usually the activity which is novel and/or interesting, while the conversation tends to be shallow and playful and fun but not as substantive as the activity. At work, spontaneous conversations generally had little relevance to the actual things we were/are working on (there are some exceptions, but they're rarely as high-value as ordinary work).

5Garrett Baker8mo

I think you are possibly better/optimizing more than most others at selecting conferences & events you actually want to do. Even with work, I think many get value out of having those spontaneous conversations because it often shifts what they're going to do--the number one spontaneous conversation is "what are you working on" or "what have you done so far", which forces you to re-explain what you're doing & the reasons for doing it to a skeptical & ignorant audience. My understanding is you and David already do this very often with each other.

4johnswentworth8mo

I'm very curious if others also find this to be the biggest value-contributor amongst spontaneous conversations. (Also, more generally, I'm curious what kinds of spontaneous conversations people are getting so much value out of.)

[-]Alexander Gietelink Oldenziel8mo211

One of the directions im currently most excited about (modern control theory through algebraic analysis) I learned about while idly chitchatting with a colleague at lunch about old school cybernetics. We were both confused why it was such a big deal in the 50s and 60s then basically died.
A stranger at the table had overheard our conversation and immediately started ranting to us about the history of cybernetics and modern methods of control theory. Turns out that control theory has developed far beyond whay people did in the 60s but names, techniques, methods have changed and this guy was one of the world experts. I wouldn't have known to ask him because the guy's specialization on the face of it had nothing to do with control theory.

7Lucius Bushnaq7mo

I do not find this to be the biggest value-contributor amongst my spontaneous conversations. I don't have a good hypothesis for why spontaneous-ish conversations can end up being valuable to me so frequently. I have a vague intuition that it might be an expression of the same phenomenon that makes slack and playfulness in research and internet browsing very valuable for me.

1metachirality8mo

What would an event optimized for this sort of thing look like?

3Joseph Miller8mo

Unconferences are a thing for this reason

[-]leogao7mo*2415

a lot of unconventional people choose intentionally to ignore normie-legible status systems. this can take the form of either expert consensus or some form of feedback from reality that is widely accepted. for example, many researchers especially around these parts just don't publish at all in normal ML conferences at all, opting instead to depart into their own status systems. or they don't care whether their techniques can be used to make very successful products, or make surprisingly accurate predictions etc. instead, they substitute some alternative status system, like approval of a specific subcommunity.

there's a grain of truth to this, which is that the normal status system is often messed up (academia has terrible terrible incentives). it is true that many people overoptimize the normal status system really hard and end up not producing very much value.

but the problem with starting your own status system (or choosing to compete in a less well-agreed-upon one) is that it's unclear to other people how much stock to put in your status points. it's too easy to create new status systems. the existing ones might be deeply flawed, but at least their difficulty is a known quantity.

o... (read more)

[-]johnswentworth7mo5925

This comment seems to implicitly assume markers of status are the only way to judge quality of work. You can just, y'know, look at it? Even without doing a deep dive, the sort of papers or blog posts which present good research have a different style and rhythm to them than the crap. And it's totally reasonable to declare that one's audience is the people who know how to pick up on that sort of style.

The bigger reason we can't entirely escape "status"-ranking systems is that there's far too much work to look at it all, so people have to choose which information sources to pay attention to.

[-]Daniel Murfet7mo121

It's a question of resolution. Just looking at things for vibes is a pretty good way of filtering wheat from chaff, but you don't give scarce resources like jobs or grants to every grain of wheat that comes along. When I sit on a hiring committee, the discussions around the table are usually some mix of status markers and people having done the hard work of reading papers more or less carefully (this consuming time in greater-than-linear proportion to distance from your own fields of expertise). Usually (unless nepotism is involved) someone who has done that homework can wield more power than they otherwise would at that table, because people respect strong arguments and understand that status markers aren't everything.

Still, at the end of day, an Annals paper is an Annals paper. It's also true that to pass some of the early filters you either need (a) someone who speaks up strongly for you or (b) pass the status marker tests.

I am sometimes in a position these days of trying to bridge the academic status system and the Berkeley-centric AI safety status system, e.g. by arguing to a high status mathematician that someone with illegible (to them) status is actually approximately equiv... (read more)

4Viliam7mo

Generally, it is about heuristics we can use to find quality in the oceans of crap. If we assume that people are sane to some degree, status is an imperfect proxy for quality. If we assume that people don't use AIs to polish their writing styles, the writing style is an imperfect proxy for quality. I have no experience reading research. I suspect that there are also crackpots who can write using the right kind of style. For example, they may be experts at their own line of research, and also speak overconfidently about different things they do not understand. So if you want to be taken seriously, you probably need to know what kind of crackpot do you remind others of, and then find a way how to distinguish yourself from this kind of crackpot specifically. At some moment it would probably easier to simply do your homework, once, and then have something you can point at. For example, you don't need to publish everything in the established journals, but it would probably help to publish there once -- just to show that if you want, you can; that this is about your priorities, not about lack of quality. There are probably other ways, for example if you don't wont to get involved too much with the system, find someone who already is, and maybe offer them co-authorship in return for jumping through all the hoops. I guess my model is that the costs of complying with the standard system are high but constant. So the more time you spend complaining about the system not taking your seriously, the greater the chance that complying with the system would have actually been cheaper than the accumulating opportunity costs.

3leogao7mo

there is always too much information to pay attention to. without an inexpensive way to filter, the field would grind to a complete halt. style is probably a worse thing to select on than even academia cred, just because it's easier to fake.

9habryka7mo

A thing that I often see happening when people talk about "normie-legible status systems" is that they gaslight themselves into believing that some status system that is extraordinarily legible, or they are part of, is something that is consensus. Academia is the most intense example of this. Most people don't care that much about academic status! This also happens in the other direction. Youtube is a major source of status in much of the world, especially among young people, but is considered low-brow whenever people argue about this, and so people dismiss it. I also think people tend to do a fallacy of gray thing where if a status system is not maximally legible (like writing popular blogposts, or running a popular podcast, or making popular Youtube videos, or being popular on Twitter), they dismiss the status system as not real and "illegible". I think modeling the real status and reputation systems that are present in the world is important, but for example, trying to ascent the academic status hierarchy is a bad use of time and resources. It's extremely competitive, and not actually that influential outside of the academic bubble. It is in some fields better correlated with actual skills and integrity and intelligence, and so I still think a reasonable thing to consider, but I think most people are better placed to trade off a bit of legibility against a whole amount of net realness in status (this importantly does not mean your LW quick takes will be the thing that causes you to become world-renowned, I am not saying "just say smart things and the world will recognize you", I am saying "don't think that only the most legible status systems, or the one with the most mobs hunting dissenters from the status system are the only real ways of gaining recognition in the world").

5leogao7mo

sure, the thing you're looking for is the status system that jointly optimizes for alignedness with what you care about, and how legible it is to the people you are trying to convince.

2habryka7mo

(My guess is you meant to agree with that, but kind of the whole point of my comment was that the dimension that is more important than legibility and alignment with you is the buy-in your audience has for a given status system. Youtube is not very legible, and not that aligned, but for some audiences has very high buy-in.)

1samuelshadrach2mo

For some fields such as biotech, it's difficult to get access to labs outside of academia. And you can't learn without lab access because the cutting edge experiments don't get posted to YouTube (yet).

9Daniel Murfet7mo

There is a passage from Jung's "Modern man in search of a soul" that I think about fairly often, on this point (p.229 in my edition)

4Cole Wyeth7mo

It's possible that this wouldn't work for everyone, but so far I am very satisfied working on a PhD on agent foundations (AIXI). There are a lot of complaints here about academic incentives, but mostly I just ignore them. Possibly this will eventually interfere with my academic career prospects, but in the meantime I get years to work on basically whatever I think is interesting and important, and at the end of it I can reasonably expect to end up with a PhD and a thesis I'm proud of, which seems like enough to land on my feet. Looks like the best of both worlds to me.

3Oliver Daniels7mo

Two common failure modes to avoid when doing the legibly impressive things 1. Only caring instrumentally about the project (decreases motivation) 2. Doing "net negative" projects

1CstineSublime7mo

What kind of changes or outcomes would you expect to see if people around these parts instead of publishing their work independently started trying to get it into traditional ML conferences and related publications?

[-]leogao2y240

random fun experiment: accuracy of GPT-4 on "Q: What is 1 + 1 + 1 + 1 + ...?\nA:"

7leogao2y

blue: highest logprob numerical token orange: y = x

4aphyer2y

...I am suddenly really curious what the accuracy of humans on that is.

7Richard_Kennaway2y

'Can you do Addition?' the White Queen asked. 'What's one and one and one and one and one and one and one and one and one and one?' 'I don't know,' said Alice. 'I lost count.'

1niknoble2y

This is a cool idea. I wonder how it's able to do 100, 150, and 200 so well. I also wonder what are the exact locations of the other spikes?

3niknoble2y

Oh, I see your other graph now. So it just always guesses 100 for everything in the vicinity of 100.

[-]leogao2d234

when people say that (prescription) amphetamines "borrow from the future", is there strong evidence on this? with Ozempic we've observed that people are heavily biased against things that feel like a free win, so the tradeoff narrative is memetically fit. distribution shift from ancestral environment means algernon need not apply

[-]AlphaAndOmega2d161

(I'm a psychiatry resident. I also have ADHD and take prescription stimulants infrequently)

The answer is: not really, or at least not in a meaningful sense. You aren't permanently losing anything, your brain or your wellbeing isn't being burnt out like a GPU running on an unstable OC:

Prescription stimulants often have unpleasant comedowns once they wear off. You might feel tired and burned out. They often come with increased anxiety and jitteriness.
Sleep is negatively affected, you get less REM sleep, and you might experience rebound hypersomnia on days you're not on the drug.
There are minor and usually unimportant elevations in blood pressure.
While focus and stamina are improved, creativity and cognitive flexibility suffer. I've read claims that it also makes people overconfident, which strikes me as prima facie plausible. Ever seen how people behave after using coke?
Animal studies show oxidative damage to the brain, but this has not been demonstrated in humans on therapeutic doses, even if used for performance enhancement in those who don't meet the normal criteria for ADHD.
If started at a young age, growth velocity could be slightly hampered, mostly because of a

... (read more)

3Jonas Hallgren1d

This statement really surprises me? On average you get around 500-1000% more dopamine in the system as a consequence of using amphetamines and from a standard neuroscience perspective this is around 3x as much as caffiene for example. Yes it is not heroin levels but dependency has to be a real concern here from a neuroscience perspective? Long-term potentiation and return to baseline for the brain should mean that the learned patterns would be relatively hard to unlearn after 6 months of frequent usage? How good is the studies on longer term behaviour change due to this stuff? I looked into the studies and it seemed like from a shorter term perspective the addiction effects of it were lower than I thought which I found quite interesting. (dropping a claude research report link here: https://claude.ai/public/artifacts/b10e54df-6616-477f-ac19-fe52b4c9d926) I think an important caveat here is that the addiction and dependence liability is quite dependent on how you administer it to yourself, the dosage, the specific routes that you're ingesting it from etc. (which you mention but I think you're understating some of the dangers of it) CNS drugs are powerful so yes I think we should still have some limits on this? I think one of the main things that are a bit difficult with them is that it can be hard to perceive the difference that they induce in yourself? Like if you're on them, you don't necessarily notice that you have less creativity and awareness, that is not how it feels and so if you're overusing them or similar you just don't get that feedback? (based on some modafinil experience & observations from friends)

4AlphaAndOmega1d

I did try and make it clear that I'm only talking about therapeutic usage here, and even when off-label or for PED purposes, at therapeutic doses. I apologize for not stressing that even further, since it's an important distinction to make. I agree that it's rather important to use it as prescribed, or if you're sourcing it outside the medical system, making a strong effort to ensure you take it as would be prescribed (there's nothing particularly complicated about the dosage, psychiatrists usually start you at the lowest dose, then titrate upwards depending on effect). The Claude Research report seems fine to me, and I would think it aligns with my claims. The main issue with recreational usage is that a lot of people aren't trying to be responsible users, or are taking intentionally talking large doses for recreational purposes. That's more on them than it is on the drug! If you take it within the standard dosage range, the drug itself will not produce much in the way of craving for more. >CNS drugs are powerful so yes I think we should still have some limits on this? I am, if not outright libertarian, certainly leaning in that direction. So it depends on what the "limits" actually are. I think that doctors are currently overly risk-averse and conservative about prescribing them, but I don't think they should be handed out like candy over the counter. I think there's plenty of room in between that avoids the pitfalls of a maximalist position. >I think one of the main things that are a bit difficult with them is that it can be hard to perceive the difference that they induce in yourself? Like if you're on them, you don't necessarily notice that you have less creativity and awareness, that is not how it feels and so if you're overusing them or similar you just don't get that feedback? (based on some modafinil experience & observations from friends) The effects on creativity are rather minor. I can't really tell a difference when I'm on them, but I do

2leogao14h

i haven't looked into this deeply but how strong is the evidence for (lack of) oxidative damage? the SSC post is somewhat unsatisfying in that it doesn't really consider outcomes other than literal Parkinson's, and just kind of says the animal model results are confusion.

9Elizabeth1d

They definitely do for me- I sleep worse that night, and if I use too frequently I get exhaustion that takes weeks to recover from.

[-]leogao2mo23-13

I claim it is a lot more reasonable to use the reference class of "people claiming the end of the world" than "more powerful intelligences emerging and competing with less intelligent beings" when thinking about AI x-risk. further, we should not try to convince people to adopt the latter reference class - this sets off alarm bells, and rightly so (as I will argue in short order) - but rather to bite the bullet, start from the former reference class, and provide arguments and evidence for why this case is different from all the other cases.

this raises the question: how should you pick which reference class to use, in general? how do you prevent reference class tennis, where you argue back and forth about what is the right reference class to use? I claim the solution is you want to use reference classes that have consistently made good decisions irl. the point of reference classes is to provide a heuristic to quickly apply judgement to large swathes of situations that you don't have time to carefully examine. this is important because otherwise it's easy to get tied up by bad actors who avoid being refuted by making their beliefs very complex and therefore hard to argue against.

the b... (read more)

[-]Ben Pace2mo*371

This all seems wrongheaded to me.

I endeavor to look at how things work and describe them accurately. Similarly to how I try to describe how a piece of code works, or how to to build a shed, I will try to accurately describe the consequences of large machine learning runs, which can include human extinction.

I personally think AGI will probably kill everyone. but this is a big claim and should be treated as such.

This isn't how I think about things. Reality is what exists, and if a claim accurately describes reality, then I should not want to hold it to higher standards than claims that do not describe reality. I don't think it's a good epistemology to rank claims by "bigness" and then say that the big ones are less likely and need more evidence. On the contrary, I think it's worth investing more in finding out if they're right, and generally worth bringing them up to consideration with less evidence than for "small" claims.

on the other hand, everyone has personally experienced a dozen different doomsday predictions. whether that's your local church or faraway cult warning about Armageddon, or Y2K, or global financial collapse in 2008, or the maximally alarmist climate people, o

... (read more)

9leogao2mo

i am also trying to accurately describe reality. what i'm saying is, even from the perspective of someone smart and truth-seeking but who doesn't know much about the object-level, it is very reasonable to use bigness of claim as a heuristic for how much evidence you need before you're satisfied, and that if you don't do this, you will be worse at finding the truth in practice. my guess is this applies even more so to the average person. i think this is very analogous to occam's razor / trust region optimization. clearly, we need to discount theories based on complexity because there are exponentially more complex theories compared to simple ones, many of which have no easily observable difference to the simpler ones, opening you up to being pwned. and empirically it seems a good heuristic to live life by. complex theories can still be true! but given two theories that both accurately describe reality, you want the simpler one. similarly, given two equally complex claims that accurately describe the evidence, you want the one that is less far fetched from your current understanding of the world / requires changing less of your worldview. also, it doesn't have to be something you literally personally experienced. it's totally valid to read the wikipedia page on the branch davidians or whatever and feel slightly less inclined to take things that have similar vibes seriously, or even to absorb the vibe from your environs (your aversion to scammers and cranks surely did not come ex nihilo, right?) for most of the examples i raised, i didn't necessarily mean the claim was literally 100% human extinction, and i don't think it matters that it wasn't. first, because the important thing is the vibe of the claim (catastrophic) - since we're talking about heuristics on how seriously to take things that you don't have time to deep dive on, the rule has to be relatively cheap to implement. i think most people, even quite smart people, genuinely don't feel much of an emotional

7Ben Pace2mo

Your points about Occam's razor have got nothing to do with this subject[1]. The heuristic "be more skeptical of claims that would have big implications if true" makes sense only when you suspect a claim may have been adversarially optimized for memetic fitness; it is not otherwise true that "a claim that something really bad is going to happen is fundamentally less likely to be true than other claims". I'm having a little trouble connecting your various points back to your opening paragraph, which is the primary thing that I am trying to push back on.[2] To restate the message I'm reading here: "Give up on having a conversation where you evaluate the evidence alongside your interlocutors. Instead frame yourself as trying to convince them of something, and assume that they are correct to treat your communications as though you are adversarially optimizing for them believing whatever you want them to believe." This assumption seems to give up a lot of my ability to communicate with people (almost ~all of it), and I refuse to simply do it because some amount of communication in the world is adversarially optimized, and I'm definitely not going to do it because of a spurious argument that Occam's razor implies that "claims about things being really bad or claims that imply you need to take action are fundamentally less likely to be true". You are often in an environment where people are trying to use language to describe reality, and in that situation the primary thing to evaluate is not the "bigness" of a claim, but the evidence for and against it. I recommend instead to act in such a way as to increase the size and occurrence of that environment more-so than "act as though it's correct to expect maximum adversarial optimization in communications". (Meta: The only literal quotes of Leo's in this comment are the big one in the quote block, my use of "" is to hold a sentence as object, they are not things Leo wrote.) 1. ^ I agree that the more strongly a

4leogao2mo

in practice many of the claims you hear will be optimized for memetic fitness, even if the people making the claims are genuine. well intentioned people can still be naive, or have blind spots, or be ideologically captured. also, presumably the people you are trying to convince are on average less surrounded by truth seeking people than you are (because being in the alignment community is strongly correlated with caring about seeking truth). i don't think this gives up your ability to communicate with people. you simply have to signal in some credible way that you are not only well intentioned but also not merely the carrier of some very memetic idea that slipped past your antibodies. there are many ways to accomplish this. for example, you can build up a reputation of being very scrupulous and unmindkilled. this lets you convey ideas freely to other people in your circles that are also very scrupulous and unmindkilled. when interacting with people outside this circle, for whom this form of reputation is illegible, you need to find something else. depending on who you're talking to and what kinds of things they take seriously, this could be leaning on the credibility of someone like geoff hinton, or of sam/demis/dario, or the UK government, or whatever. this might already be what you're doing, in which case there's no disagreement between us.

2Ben Pace2mo

You're writing lots of things here but as far as I can tell you aren't defending your opening statement, which I believe is mistaken. Firstly, it's just not more reasonable. When you ask yourself "Is a machine learning run going to lead to human extinction?" you should not first say "How trustworthy are people who have historically claimed the world is ending?", you should of course primarily bring your attention to questions about what sorts of machine is being built, what sort of thinking capacities it has, what sorts of actions it can take in the world, what sorts of optimization it runs, how it would behave around humans if it were more powerful than them, and so on. We can go back to discussing epistemology 101 if need be (e.g. "Hug the Query!"). Secondly, insofar as someone believes you are a huckster or a crackpot, you should leave the conversation, communication here has broken down and you should look for other communication opportunities. However, insofar as someone is only evaluating this tentatively as one of many possible hypotheses about you then you should open yourself up to auditing / questioning by them about why you believe what you believe and your past history and your memetic influences. Being frank is the only way through this! But you shouldn't say to them "Actually, I think you should treat me like a huckster/scammer/serf-of-a-corrupt-empire." This feels analogous to a man on a date with a woman saying "Actually I think you should strongly privilege the hypothesis that I am willing to rape you, and now I'll try to provide evidence for you that this is not true." It would be genuinely a bad sign about a man that he thinks that about himself, and also he has moved the situation into a much more adversarial frame. I suspect you could write some more narrow quick-take such as "Here is some communication advice I find helpful when talking with friends and colleagues about how AI can lead to human extinction", but in generalizing it all the wa

9leogao2mo

I am confused why you think my claims are only semi related. to me my claim is very straightforward, and the things i'm saying are straightforwardly converying a world model that seems to me to explain why i believe my claim. i'm trying to explain in good faith, not trying to say random things. i'm claiming a theory of how people parse information, to justify my opening statement, which i can clarify as: * sometimes, people use the rhetorical move of saying something like "people think 95% doom is overconfident, yet 5% isn't. but that's also being 95% confident in not-doom, and yet they don't consider that overconfident. curious." followed by "well actually, it's only a big claim under your reference class. under mine, i.e the set of all instances of a more intelligent thing emerging, actually, 95% doom is less overconfident than 5% doom" this post was inspired by seeing one such tweet, but i see such claims like this every once in a while that play reference class tennis. * i think this kind of argument is really bad at persuading people who don't already agree (from empirical observation). my opening statement is saying "please stop doing this, if you do it, and thank you for not doing this, if you dont already do it" the rest of my paragraphs provide an explanation of my theory for why this is bad for changing people's minds. this seems pretty obviously relevant for justifying why we should stop doing the thing. i sometimes see people out there talk like this (including my past self at some point), and then fail to convince people, and then feel very confused about why people don't see the error of their ways when presented with an alternative reference class. if my theory is correct (maybe it isn't, this isn't a super well thought out take, it's more a shower thought), then it would explain this, and people who are failing to convince people would probably want to know why they're failing. i did not spell this out in my opening statement because i thought it

2Ben Pace2mo

Thank you for all this. I still think your quick take is wrong on the matter of epistemology. I acknowledge that you make a fine point about persuasion, that someone who is primarily running the heuristic that "claims about the end of the world are probably crack-pots or scammers" will not be persuaded by someone arguing that actually 20:1 against and 20:1 in favor of a claim are equally extreme beliefs. A version of the quick take that I would've felt was just fine would read: But your quick take doesn't confine itself to discussing those people in those situations. It flatly says it's true as a matter of epistemology that you should "use bigness of claim as a heuristic for how much evidence you need before you're satisfied", that you should "use reference classes that have consistently made good decisions irl" and that the crackpots/scammers one is the correct one to use here otherwise you'll risk "getting pwned ideologically". These aren't always the right heuristics (e.g. on this issue they are not for you and for me) and you shouldn't say that they are just so that some people on Twitter will stop using rhetoric that isn't working. I believe you're trying to do your best to empathize with people who are unpersuaded by an unsuccessful rhetorical move, a move that people who believe your position are making in public discourse. That is commendable. I think you are attempting to cause other people who hold your position to stop using that rhetorical move, by telling them off for using it, but to acheive this aim you are repeatedly saying the people who do not hold your position are doing normatively correct epistemology, and you're justifying it with Occam's razor and reference class forecasting, and this is all wrong. In some situations for some people it is reasonable to primarily use theses heuristics, and in other situations for other people it is not. I'm not arguing that the people unpersuaded are being unreasonable, but (for example) your opening sente

4leogao2mo

(I would never make knowingly false statements about epistemology to try to win an object level point; I still disagree with your claims about epistemology and believe that my epistemology arguments are in good faith and capture truth in some way. This disagreement might be because I've not communicated myself well. I originally wasn't going to reply but I felt the need to say this because your comment can be viewed as accusing me of intellectual/epistemic dishonesty, even if that wasn't your intention.)

4Ben Pace2mo

(I affirm that I don’t believe you were being knowingly dishonest or deceptive at any point in this thread.)

2Veedrac2mo

But you should absolutely ask "does it look like I'm making the same mistakes they did, and how would I notice if it were so?" Sometimes one is indeed in a cult with your methods of reason subverted, or having a psychotic break, or captured by a content filter that hides the counterevidence, or many of the more mundane and pervasive failures in kind.

4Ben Pace2mo

But not in full generality! This is a fine question to raise in this context, but in general the correct thing to do in basically all situations is to consider the object level, and then also let yourself notice if people are unusually insane around a subject, or insane for a particular reason. Sometimes that is the decisive factor, but for all questions, the best first pass is to think about how that part of the world works, rather than to think about the other monkeys who have talked about it in the past.

4Veedrac2mo

This seems wrong to me. a. More smaller things happen and there are fewer kinds of smaller thing that happen. b. I bet people genuinely have more evidence for small claims they state than big ones on average. c. The skepticism you should have because particular claims are frequently adversarially generated shouldn't first depend on deciding to be skeptical about it. If you'll forgive the lack of charity, ISTM that leogao is making IMO largely true points about the reference class and then doing the wrong thing with those points, and you're reacting to the thing being done wrong at the end, but trying to do this in part by disagreeing with the points being made about the reference class. leogao is right that people are reasonable in being skeptical of this class of claims on priors, and right that when communicating with someone it's often best to start within their framing. You are right that regardless it's still correct to evaluate the sum of evidence for and against a proposition, and that other people failing to communicate honestly in this reference class doesn't mean we ought to throw out or stop contributing to the good faith conversations avaialable to us.

5leogao2mo

i'm not even saying people should not evaluate evidence for and against a proposition in general! it's just that this is expensive, and so it is perfectly reasonable to have heuristics to decide which things to evaluate, and so you should first prove with costly signals that you are not pwning them, and then they can weigh the evidence. and until you can provide enough evidence that you're not pwning them for it to be worth their time to evaluate your claims in detail, that it should not be surprising that many people won't listen to the evidence; and that even if they do listen, if there is still lingering suspicion that they are being pwned, you need to provide the type of evidence that could persuade someone that they aren't getting pwned (for which being credibly very honest and truth seeking is necessary but not sufficient), which is sometimes different from mere compellingness of argument

3Veedrac2mo

I think the framing that sits better to me is ‘You should meet people where they're at.’ If they seem like they need confidence that you're arguing from a place of reason, that's probably indeed the place to start.

2Ben Pace2mo

Thanks for the comment. (Upvoted.) a. I expect there is a slightly more complicated relationship between my value-function and the likely configuration states of the universe than literally zero-correlation, but most configuration states do not support life and we are all dead, so in one sense a claim that in the future something very big and bad will happen is far more likely on priors. One might counter that we live in a highly optimized society where things being functional and maintained is an equilibrium state and it's unlikely for systems to get out of whack enough for bad things to happen. But taking this straightforwardly is extremely naive, tons of bad things happen all the time to people. I'm not sure whether to focus on 'big' or 'bad' but either way, the human sense of these is not what the physical universe is made out of or cares about, and so this looks like an unproductive heuristic to me. b. On the other hand, I suspect the bigger claims are more worth investing time to find out if they're true! All of this seems too coarse-grained to produce a strong baseline belief about big claims or small claims. c. I don't get this one. I'm pretty sure I said that if you believe that you're in a highly adversarial epistemic environment, then you should become more distrusting of evidence about memetically fit claims. I don't know what true points you think Leo is making about "the reference class", nor which points you think I'm inaccurately pushing back on that are true about "the reference class" but not true of me. Going with the standard rationalist advice, I encourage everyone to taboo "reference class" and replace it with a specific heuristic. It seems to me that "reference class" is pretending that these groupings are more well-defined than they are.

2Veedrac2mo

Well, sure, it's just you seemed to frame this as a binary on/off thing, sometimes you're exposed and need to count it and sometimes you're not, whereas to me it's basically never implausible that a belief has been exposed to selection pressures, and the question is of probabilities and degrees.

1CRISPY2mo

I think you’re correct. There’s a synergistic feedback loop between alarmism and social interaction that filters out pragmatic perspectives. Creating the illusion that the doom surrounding any given topic more prevalent than it really is, or even that it’s near universal. Even before the rise of digital information the feedback phenomenon could be observed in any insular group. In today’s environment where a lot of effort goes into exploiting that feedback loop it requires a conscious effort to maintain perspective, or even remain aware that there are other perspectives.

[-]ryan_greenblatt2mo114

I think the group of people "claiming the end of the world" in the case of AI x-risk is importantly more credentialed and reasonable-looking than most prior claims about the end of the world. From the reference class and general heuristics perspective that you're talking about^[1], I think how credible looking the people are is pretty important.

So, I think the reference class is more like claims of nuclear armageddon than cults. (Plausibly near maximally alarmist climate people are in a similar reference class.)

IDK how I feel about this perspective overall. ↩︎

4leogao2mo

I agree this reference class is better, and implies a higher prior, but I think it's reasonable for the prior over "arbitrary credentialed people warning about something" to be still relatively low in an absolute sense- lots of people have impressive sounding credentials that are not actually good evidence of competence (consider: it's basically a meme at this point that whenever you see a book where the author puts "PhD" after their name, they probably are a grifter / their phd was probably kinda bs), and also there is a real negativity bias where fearmongering is amplified by both legacy and social media. Also, for the purposes of understanding normal people, it's useful to keep in mind that trust in credentials and institutions is not very high right now in the US among genpop.

8quetzal_rainbow2mo

You shouldn't. This epistemic bath has no baby in it and we should throw water out of it.

4Veedrac2mo

This is kind of missing the point of Bayes. One shouldn't "choose" a reference class to update on. One should update to the best of your ability on the whole distribution of hypotheses available to describe the situation. Neither is a 'right' or 'wrong' reference class to use, they're both just valid pieces of evidence about base rates, and you should probably be using both of them.

[-]cubefox2mo111

It seems you are having in mind something like inference to the best explanation here. Bayesian updating, on the other hand, does need a prior distribution, and the question of which prior distribution to use cannot be waved away when there is a disagreement on how to update. In fact, that's one of the main problems of Bayesian updating, and the reason why it is often not used in arguments.

2Veedrac2mo

I'm not really sure what that has to do with my comment. My point is the original post seemed to be operating as if you look for the argmax reference class, you start there, and then you allow arguments. My point isn't that their prior is wrong, it's that this whole operation is wrong. I think also you're maybe assuming I'm saying the prior looks something like {reference class A, reference class B} and arguing about the relative probability of each, but it doesn't, a prior should be over all valid explanations of the prior evidence. Reference classes come in because they're evidence about base rates of particular causal structures; you can say 'given the propensity for the world to look this way, how should I be correcting the probability of the hypotheses under consideration? Which new hypotheses should I be explicitly tracking?' I can see where the original post might have gone astray. People have limits on what they can think about and it's normal to narrow one's consideration to the top most likely hypothesis. But it's important to be aware of what you're approximating here, else you get into a confusion where you have two valid reference classes and you start telling people that there's a correct one to start arguing from.

2cubefox2mo

... but that still leaves the problem of which prior distribution should be used.

2Veedrac2mo

I agree this is an interesting philosophical question but again I'm not sure why you're bringing it up. Given your link maybe you think me mentioning Bayes was referring to some method of selecting a single final hypothesis? I'm not, I'm using it to refer to the Bayesian update rule.

2cubefox2mo

It seems the updating rule doesn't tell you anything about the original argument even when you view information about reference classes as evidence rather than as a method of assigning prior probabilities to hypotheses. Or does it? Can you rephrase the argument in a proper Bayesian way such that it becomes clearer? Note that how strongly some evidence confirms or disconfirms a hypothesis also depends on a prior.

2Veedrac2mo

What argument are you referring to when you say "doesn't tell you anything about the original argument"? My framing is basically this: you generally don't start a conversation with someone as a blank pre-priors slate that you get to inject your priors into. The prior is what you get handed, and then the question is how people should respond to the evidence and arguments available. Well, you should use (read: approximate) the basic Bayesian update rule: hypotheses where an observation is unlikely are that much less probable.

2cubefox2mo

I meant leogao's argument above.

2Veedrac2mo

I think you're underestimating the inferential gap here. I'm not sure why you'd think the Bayes updating rule is meant to "tell you anything about" the original post. My claim was that the whole proposal about selecting reference classes was framed badly and you should just do (approximate) Bayes instead.

2cubefox2mo

And what would this look like? Can you reframe the original argument accordingly?

2Veedrac2mo

It's just Bayes, but I'll give it a shot. You're having a conversation with someone. They believe certain things are more probable than other things. They mention a reference class: if you look at this grouping of claims, most of them are wrong. Then you consider the set of hypotheses: under each of them, how plausible is it given the noted tendency for this grouping of claims to be wrong? Some of them pass easily, eg. the hypothesis that this is just another such claim. Some of them less easily; they are either a modal part of this group and uncommon on base rate, or else nonmodal or not part of the group at all. You continue, with maybe a different reference class, or an observation about the scenario. Hopefully this illustrates the point. Reference classes are just evidence about the world. There's no special operation needed for them.

3samuelshadrach2mo

False? Climate change tail scenarios are worth studying and averting. Nuclear winter was obviously worth studying and averting back in the Cold War, and still is today. 2008 financial crisis was worth studying and averting. Do you not believe average citizens can study issues like these and make moves to solve them?

1Signer2mo

The reference classes you should use work as a heuristic because there is some underlying mechanism that makes them work. So you should use reference classes in situations where their underlying mechanism is expected to work. Maybe the underlying mechanism of doomsday predictions not working is that people predicting doom don't make their predictions based on valid reasoning. So if someone uses that reference class to doubt AI risk, this should be judged as them making a claim about reasoning of people predicting AI doom being similar to people in cults predicting Armageddon.

[-]leogao5mo230

you might expect that the butterfly effect applies to ML training. make one small change early in training and it might cascade to change the training process in huge ways.

at least in non-RL training, this intuition seems to be basically wrong. you can do some pretty crazy things to the training process without really affecting macroscopic properties of the model (e.g loss). one very well known example is that using mixed precision training results in training curves that are basically identical to full precision training, even though you're throwing out a ton of bits of precision on every step.

[-]leogao7mo2310

people often say that limitations of an artistic medium breed creativity. part of this could be the fact that when it is costly to do things, the only things done will be higher effort

9leogao7mo

a medium with less limitations is strictly better for making good art, but it's also harder to identify good art among the sea of bad art because the medium alone is no longer as good a signal of quality

5TsviBT7mo

Yes, but this also happens within one person over time, and the habit (of either investing, or not, in long-term costly high-quality efforts) can gain Steam in the one person.

2Noosphere897mo

This seems the likely explanation for any claim that constraints breed creativity/good things in a field, when the expectation is that the opposite outcome would occur.

5StartAtTheEnd7mo

My own expectation is that limitations result in creativity. Writers block is usually a result of having too many possibilities/choices. If I tell you "You can write a story about anything", it's likely harder for you to think of anything than if I tell you "Write a story about an orange cat". In the latter situation, you're more limited, but you also have something to work with. I'm not sure if it's as true for computers as it is for humans (that would imply information-theoretic factors), but there's plenty of factors in humans, like analysis paralysis and the "See also" section of that page

2Noosphere897mo

My other explanation probably has to do with the fact that it's way easier to work with an already almost-executed object than a specification, because we are constrained to only think about a subset of possibilities for a reasonable time. In other words, constraints are useful given that you are already severely constrained, to limit the space of possibilities.

[-]leogao1y2214

any time someone creates a lot of value without capturing it, a bunch of other people will end up capturing the value instead. this could be end consumers, but it could also be various middlemen. it happens not infrequently that someone decides not to capture the value they produce in the hopes that the end consumers get the benefit, but in fact the middlemen capture the value instead

5Matt Goldenberg1y

can you give examples?

9leogao1y

an example: open source software produces lots of value. this value is partly captured by consumers who get better software for free, and partly by businesses that make more money than they would otherwise. the most clear cut case is that some businesses exist purely by wrapping other people's open source software, doing advertising and selling it for a handsome profit; this makes the analysis simpler, though to be clear the vast majority of cases are not this egregious. in this situation, the middleman company is in fact creating value (if a software is created in a forest with no one around to use it, does it create any value?) by using advertising to cause people to get value from software. in markets where there are consumers clueless enough to not know about the software otherwise (e.g legacy companies), this probably does actually create a lot of counterfactual value. however, most people would agree that the middleman getting 90% of the created value doesn't satisfy our intuitive notion of fairness. (open source developers are more often trying to have the end consumers benefit from better software, not for random middlemen to get rich off their efforts) and if advertising is commoditized, then this problem stops existing (you can't extract that much value as an advertising middleman if there is an efficient market with 10 other competing middlemen), and so most of the value does actually accrue to the end user.

4ryan_greenblatt1y

Often tickets will be sold at prices considerably lower than the equilibrium price and thus ticket scalpers will buy the tickets and then resell for a high price. That said, I don't think this typically occurs because the company/group originally selling the tickets wanted consumers to benefit, it seems more likely that this is due to PR reasons (it looks bad to sell really expensive tickets). This is actually a case where it seems likely that the situation would be better for consumers if the original seller captured the value. (Because buying tickets from random scalpers is annoying.)

4Viliam1y

I wonder how much of this is the PR reasons, and how much something else... for example, the scalpers cooperating (and sharing a part of their profits) with the companies that sell tickets. To put it simply, if I sell a ticket for $200, I need to pay a tax for the $200. But if I sell the same ticket for $100 and the scalper re-sells it for $200, then I only need to pay the tax for $100, which might be quite convenient if the scalper... also happens to be me? (More precisely, some of the $100 tickets are sold to genuine 3rd party scalpers, but most of them I sell to myself... but according to my tax reports, all of them were sold to the 3rd party.)

2the gears to ascension1y

ticket scalping is bad and we should find some sort of fully distributed market mechanism that makes scalping approach impossible without requiring the ticket seller to capture the value. it ought to be possible to gift value to end customers rather than requiring the richest to be the ones who get the benefit, how can that be achieved?

4ryan_greenblatt1y

The simple mechanism is: * Charge market prices (auction or just figure out the equilibrium price normally) * Redistribute the income uniformly to some group. Aka UBI. Of course, you could make the UBI be to (e.g.) Taylor Swift fans in particular, but this is hardly a principled approach to redistribution. Separately, musicians (and other performers) might want to subsidize tickets for extremely hard core fans because these fans add value to the event (by being enthusiastic). For this, the main difficulty is that it's hard to cheaply determine if someone is a hard core fan. (In principle, being prepared to buy tickets before they run out could be an OK proxy for this, but it fails in practice, at least for buying tickets online.) More discussion is in this old planet money episode.

5leogao1y

of course, this is more a question about equilibria than literal transactions. suppose you capture most of the value and then pay it back out to users as a dividend: the users now have more money with which they could pay a middleman, and a middleman that could have extracted some amount of value originally can still extract that amount of value in this new situation. we can model this as a game of ultimatum between the original value creator and the middlemen. if the participation of the OVC and middleman are both necessary, the OVC can bargain for half the value in an iterated game / as FDT agents. however, we usually think of the key differentiating factor between the OVC and middlemen as the middlemen being more replaceable, so the OVC should be able to bargain for a lot more. (see also: commoditizing your complement) so to ensure that the end users get most of the value, you need to either ensure that all middleman roles are commoditized, or precommit to only provide value in situations where the end user can actually capture most of the value

2Dagon1y

The equilibrium comprises literal transactions, right? You should be able to find MANY representative specific examples to analyze, which would help determine whether your model of value is useful in these cases. My suspicion is that you're trying to model "value" as something that's intrinsic, not something which a relation between individuals, which means you are failing to see that the packaged/paid/delivered good is actually distinct and non-fungible with the raw/free/open good, for the customers who choose that route. Note that in the case of open-source software, it's NOT a game of ultimatum, because both channels exist simultaneously and neither has the option to deny the other. A given consumer paying for one does not prevent some other customer (or even the same customer in parallel) using the direct free version.

2leogao1y

I make no claim to fungibility or lack of value created by middlemen.

4Dagon1y

It's worth examining whether "capturing value" and "providing value" are speaking of the same thing. In many cases, the middlemen will claim that they're actually providing the majority of the value, in making the underlying thing useful or available. They may or may not be right. For most goods, it's not clear how much of the consumer use value comes from the idea, the implementation of the idea, or from the execution of the delivery and packaging. Leaving aside government-enforced exclusivity, there are usually reasons for someone to pay for the convenience, packaging, and bundling of such goods. I worked (long ago) in physical goods distribution for toys and novelties. I was absolutely and undeniably working for a middleman - we bought truckloads of stuff from factories, repackaged it for retail, and sold it at a significant markup to retail stores, who marked it up again and sold it to consumers. Our margins were good, but all trades were voluntary and I don't agree with a framing that we were "capturing" existing value rather than creating value in connecting supply with demand.

1StartAtTheEnd1y

All value is finite, and every time value is used, it decreases. The middlemen are merely causing the thing to die faster. For instance, if you discover a nice beach which hasn't been ruined with plastic and glass bottle yet, and make it into a popular area, you won't get to spend many happy summers at that place. If you find oil and sell it, are you creating value, or are you destroying value? I think both perspectives are valid. But since the openness of information in the modern world makes it so that everything which can be exploited will be exploited, and until the point that exploitation is no longer possible (as with the ruined beach), I strongly dislike unsustainable exploitation and personally tend toward the "destroying value" view. And if you want something to worry about, let it be premature exploitation. X 'creates' value and chooses not to exploit it prematurely, but then Y will come along and take it, so X is forced to capitalize on it early. Now you have a moloch problem on your hands.

[-]leogao1y220

saying "sorry, just to make sure I understand what you're saying, do you mean [...]" more often has been very valuable

2Viliam1y

yeah, turns off the combat mode

4leogao1y

more importantly, both i and the other person get more out of the conversation. almost always, there are subtle misunderstandings and the rest of the conversation would otherwise involve a lot of talking past each other. you can only really make progress when you're actually engaging with the other person's true beliefs, rather than a misunderstanding of their beliefs.

[-]leogao2y200

hypothesis: intellectual progress mostly happens when bubbles of non tribalism can exist. this is hard to safeguard because tribalism is a powerful strategy, and therefore insulating these bubbles is hard. perhaps it is possible for there to exist a monopoly on tribalism to make non tribal intellectual progress happen, in the same way a monopoly on violence makes it possible to make economically valuable trade without fear of violence

6Daniel Kokotajlo2y

Continuing the analogy: You'd want there to be a Tribe, or perhaps two or more Tribes, that aggressively detect and smack down any tribalism that isn't their own. It needs to be the case that e.g. when some academic field starts splintering into groups that stereotype and despise each other, or when people involved in the decision whether to X stop changing their minds frequently and start forming relatively static 'camps,' the main Tribe(s) notice this and squash it somehow. And/or maybe arrange things so it never happens in the first place. I wonder if this sorta happens sometimes when there is an Official Religion?

4leogao2y

another way to lean really hard into the analogy: you could have a Tribe which has a constitution/laws that dictate what kinds of argument are ok and which aren't, has a legislative branch that constantly thinks about what kinds of arguments are non truthseeking and should be prohibited, a judicial branch that adjudicates whether particular arguments were truthseeking by the law, and has the monopoly on tribalism in that it is the only entity that can legitimately silence people's arguments or (akin to exile) demand that someone be ostracized. there would also be foreign relations/military (defending the continued existence of the Tribe against all the other tribes out there, many of which will attempt to destroy the Tribe via very nontruthseeking means)

5leogao2y

unfortunately this is pretty hard to implement. free speech/democracy is a very strong baseline but still insufficient. the key property we want is a system where true things systematically win over false things (even when the false things appeal to people's biases), and it is sufficiently reliable at doing so and therefore intellectually legitimate that participants are willing to accept the outcome of the process even when it disagrees with what they started with. perhaps there is some kind of debate protocol that would make this feasible?

2Viliam2y

Prediction markets? Generally, track people's previous success rates about measurable things.

4leogao2y

prediction markets have two major issues for this use case. one is that prediction markets can only tell you whether people have been calibrated in the past, which is useful signal and filters out pundits but isn't very highly reliable for out of distribution questions (for example, ai x-risk). the other is that they don't really help much with the case where all the necessary information is already available but it is unclear what conclusion to draw from the evidence (and where having the right deliberative process to make sure the truth comes out at the end is the cat-belling problem). prediction markets can only "pull information from the future" so to speak.

2Viliam2y

BTW, I like the "monopoly on violence" analogy. We can extend it to include verbal violence -- you can have an environment where it is okay to yell at people for being idiots, or you can have an environment where it is okay to yell at people for being politically incorrect. Both will shape the intellectual development in certain directions. Conflicts arise is when you don't have a monopoly, so sometimes people get yelled at for being idiots, other times for being politically incorrect, and then you have endless "wars" about whether we should or shouldn't study a politically sensitive topic X with an open mind, both sides complaining about lack of progress (from their perspective). The more mutually contradictory constraints you have, the more people will choose the strategy "let's not do anything unusual", because it is too likely to screw up according to some of the metrics and get yelled at.

[-]leogao7mo192

theory: a large fraction of travel is because of mimetic desire (seeing other people travel and feeling fomo / keeping up with the joneses), signalling purposes (posting on IG, demonstrating socioeconomic status), or mental compartmentalization of leisure time (similar to how it's really bad for your office and bedroom to be the same room).

this explains why in every tourist destination there are a whole bunch of very popular tourist traps that are in no way actually unique/comparatively-advantaged to the particular destination. for example: shopping, amusement parks, certain kinds of museums.

6Nina Panickssery7mo

I used to agree with this but am now less certain that travel is mostly mimetic desire/signaling/compartmentalization (at least for myself and people I know, rather than more broadly). I think “mental compartmentalization of leisure time” can be made broader. Being in novel environments is often pleasant/useful, even if you are not specifically seeking out unusual new cultures or experiences. And by traveling you are likely to be in many more novel environments even if you are a “boring traveler”. The benefit of this extends beyond compartmentalization of leisure, you’re probably more likely to have novel thoughts and break out of ruts. Also some people just enjoy novelty.

2leogao6mo

I think this is probably true of you and people around you but also you likely live in a bubble. To be clear, I'm not saying why people reading this should travel, but rather what a lot of travel is like, descriptively.

1CstineSublime7mo

What fraction would you say is genuinely motivated by "seeing and experiencing another culture"? I don't doubt that most travel is performative, but I also think most of the people I interact with seem to have different motivations and talk about things from their travels which are a world away from the Pulp Fiction beer in a McDonalds discussion.

[-]leogao1y198

a great way to get someone to dig into a position really hard (whether or not that position is correct) is to consistently misunderstand that position

[-]leogao1y125

almost every single major ideology has some strawman that the general population commonly imagines when they think of the ideology. a major source of cohesion within the ideology comes from a shared feeling of injustice from being misunderstood.

8leogao1y

There are some people that I've found to be very consistently thoughtful - when we disagree, the crux is often something interesting and often causes me to realize that I overlooked an important consideration. I respect people like this a lot, even if we disagree a lot. I think talking to people like this is a good antidote to digging yourself into a position. On the other hand, there are some people I've talked to where I feel like the conversation always runs in circles so it's impossible to pin down a crux, or they always retreat to increasingly deranged positions to avoid admitting being wrong, or they seem to constantly pattern match my argument to something vaguely similar instead of understanding my argument. I think arguing against people like this too much is actively harmful for your epistemics, because you'll start digging yourself into your positions, and you'll get used to thinking that everyone who disagrees with you is wrong. There are a bunch of people (most notably Eliezer) who seem to me to have gone too far down this path. On the other side of the aisle, I don't know exactly how to consistently become more thoughtful, but I think one good starting point is getting good at deeply understanding people's viewpoints.

4Viliam1y

The people who understand the proper interpretation of the ideology can feel intellectually superior to those who don't. Also, people who misunderstand something are by definition wrong... and therefore the people who understand the ideology correctly must -- quite logically -- be right! (An equivocation between "be right about what is the correct interpretation of the ideology" and "be right about whether the ideology correctly describes the reality".)

4Seth Herd1y

I think this is a subset of: irritating people when discussing the topic is a great way to get someone to dig into a position really hard (whether or not that position is correct). That irritation can be performed any way you like. The most common is insinuating that they're stupid, but making invalid meme arguments and otherwise misunderstanding the position or arguments for the position will serve quite well, too. I think this follows from the strength and insidious nature of motivated reasoning. It's often mistaken for confirmation bias, but it's actually a much more important effect because it drives polarization in public discussion. I've been meaning to write a post about this, but doing it justice would take too much time. I think I need to just write a brief incomplete one.

7leogao1y

I don't think being irritating in general is enough. I think it's specifically the feeling that everyone who has disagreed with you has been wrong about their disagreement that creates a very powerful sense of feeling like you must be onto something.

4Seth Herd1y

Really!? Okay, I'll have to really present the argument when I write that post. I do agree with your logic for why opponents misunderstanding the argument would make people sure they're right, by general association. It's a separate factor from the irritation, so I think I mis-statedit as a subset (although part of it seems to be; it's irritating to have people repeatedly mis-characterize your position). It seems pretty apparent to me when I watch people have discussions/arguments that their irritation/anger makes them dig in on their position. It seems to follow from evolutionary psychology: if you make me angry, my brain reacts like we're in a fight. I now want to win that fight, so I need to prove you wrong. Believing any of your arguments or understating mine would lead to losing the fight I feel I'm in. This isn't usually how motivated reasoning is discussed, so I guess it does really take some careful explanation. It seems intuitive and obvious to me after holding this theory for years, but that could be my own motivated reasoning...

0keltan1y

Unfortunately, I think the average person doesn’t understand misunderstanding. I think it can be taken as… * You’re too dumb to understand * You’re being purposely ignorant * You’re making fun of them I’ll give an example: — I was recently in a conversation with a non-rationalist. I organised it because I wanted to talk about negative opinions they had on me. We talked a lot about certain scenarios. In one scenario, I had seemingly rejected the suggestion that we all go bowling. I had said out loud “I hate bowling”. When what I meant was “I hate bowling, but I’m still very happy to sit in a bowling alley and watch my friends play.” I think I did a bad job communicating there. It made my friends very angry (extra details about situation left out). During our conversation, I asked for levels of anger or annoyance at me before and after I had explained what I had meant to say. I was surprised when one friend didn’t adjust their anger levels at all. I thought I must have done another bad job at explaining. “So, you started at 80% angry at me. And now that I’ve told you my perspective, you’re still 80% angry?” This surprised me. I would adjust my levels down if someone explained that to me. I went back and forth trying to get to the bottom of this for ~half an hour. After which I came to realise we were just wired very different. To do this I used your suggested technique. In the time it took for me to understand this one point, I had deeply annoyed my friend. They were under the impression that I was misunderstanding them on purpose somehow. I think I would have been less comfortable or fulfilled, but better off. If I had just accepted that they were still very angry. And had moved on. Instead, being confused and asking questions made my situation worse. To be clear though. I did get to the truth with this technique. But sometimes winning can’t be about knowing the truth. Which is sad. I don’t like that. But I think it is true.

3leogao1y

It is unfortunately impossible for me to know exactly what happened during this interaction. I will say that the specific tone you use matters a huge amount - for example, if you ask to understand why someone is upset about your actions, the exact same words will be much better received if you do it in a tone of contrition and wanting to improve, and it will be received very poorly if you do it in a tone that implies the other person is being unreasonable in being upset. From the very limited information I have, my guess is you probably often say things in a tone that's not interpreted the way you intended.

1keltan1y

I’d say that’s a good guess given the information I provided. I think I did a good job in this particular circumstance as coming off as confused or curious. That was my aim, and I placed a lot of focus there. However, I haven’t listened back to the audio recordings of the conversation. It’s likely my previous comment is heavily bias.

[-]leogao4mo187

idea: flight insurance, where you pay a fixed amount for the assurance that you will definitely get to your destination on time. e.g if your flight gets delayed, they will pay for a ticket on the next flight from some other airline, or directly approach people on the next flight to buy a ticket off of them, or charter a private plane.

pure insurance for things you could afford to self insure is generally a scam (and the customer base of this product could probably afford to self insure) but this mostly provides value by handling the rather complicated logistics for you rather than by reducing the financial burden, and there are substantial benefits from economies of scale (e.g if you have enough customers you can maintain a fleet of private planes within a few hours of most major airports)

2Gurkenglas4mo

I'd have called this not a scam because it hands off the cost of delays to someone in a better position to avert the delays.

1robo4mo

That's a good Coasian point. Talking out of my butt, but I think the airlines don't carry the risk. The sale channel (airlines, Expedia, etc.) take commissions distributing an insurance product designed another company (Travel Insured International, Seven Corners) who handles product design compliance, with the actual claims being handled by another company and the insurance capital by yet another company (AIG, Berkshire Hathaway). LLMs tell me the distributors get 30–50% commission, which tells you that it's not a very good product for consumers.

2Gurkenglas4mo

I know less than you here, but last-minute flights are marked up because businesspeople sometimes need them and maybe TII/SC get a better price on those?

[-]leogao1y187

it's often stated that believing that you'll succeed actually causes you to be more likely to succeed. there are immediately obvious explanations for this - survivorship bias. obviously most people who win the lottery will have believed that buying lottery tickets is a good idea, but that doesn't mean we should take that advice. so we should consider the plausible mechanisms of action.

first, it is very common for people with latent ability to underestimate their latent ability. in situations where the cost of failure is low, it seems net positive to at least take seriously the hypothesis that you can do more than you think you can. (also keeping in mind that we often overestimate the cost of failure). there are also deleterious mental health effects to believing in a high probability of failure, and then bad mental health does actually cause failure - it's really hard to give something your all if you don't really believe in it.

belief in success also plays an important role in signalling. if you're trying to make some joint venture happen, you need to make people believe that the joint venture will actually succeed (opportunity costs exist). when assessing the likelihood of success... (read more)

[-]leogao2mo16-1

my summary of these two papers: https://arxiv.org/pdf/1805.12152 https://arxiv.org/pdf/1905.02175

the first paper observes a phenomenon where adversarial accuracy and normal accuracy are at odds with each other. the authors present a toy example to explain this.

the construction involves giving each input one channel that is 90% accurate for predicting the binary label, and a bazillion iid gaussian channels that are as noisy as possible individually, so that when you take the average across all of them you get ~100% accuracy. they show that when you do $ℓ_{\infty}$ -adversarial training on the input you learn to only use the 90% accurate feature, whereas normal training uses all bazillion weak channels.

the key to this construction is that they consider an $ℓ_{\infty}$ -ball on the input (distance is the max across all coordinates). so this means by adding more and more features, you can move further and further in $ℓ_{2}$ space (specifically, $\sqrt{n}$ in terms of the number of features). but the $ℓ_{2}$ distance between the means of the two high dimensional gaussians stays constant, so no matter what your $ε$ is, at some point with enough channels you can pertur... (read more)

[-]leogao2y160

Is it a very universal experience to find it easier to write up your views if it's in response to someone else's writeup? Seems like the kind of thing that could explain a lot about how research tends to happen if it were a pretty universal experience.

8ryan_greenblatt2y

I think so/I have this. (I would emoji react for a less heavy response, but doesn't work on older short forms) The corollary is that it's really annoying to respond to widely held views or frames which aren't clearly written up anywhere. Particularly if these views are very inprecise and confused.

8leogao2y

new galaxy brain hypothesis of how research advances: progress happens when people feel unhappy about a bad but popular paper and want to prove it wrong (or when they feel like they can do even better than someone else) this explains: * why it's often necessary to have bad incremental papers that don't introduce any generalizable techniques (nobody will care about the followup until it's refuting the bad paper) * why so much of academia exists to argue that other academics are wrong and bad * why academics sometimes act like things don't exist unless there's a paper about them, even though the thing is really obvious

4MikkW2y

This subjectively seems to me to be the case.

3Pat Myron2y

https://xkcd.com/386/

[-]leogao5mo150

there's an obvious synthesis of great man theory and broader structural forces theories of history.

there are great people, but these people are still bound by many constraints due to structural forces. political leaders can't just do whatever they want; they have to appease the keys of power within the country. in a democracy, the most obvious key of power is the citizens, who won't reelect a politician that tries to act against their interests. but even in dictatorships, keeping the economy at least kind of functional is important, because when the citizens are starving, they're more likely to revolt and overthrow the government. there are also powerful interest groups like the military and critical industries, which have substantial sway over government policy in both democracies and dictatorships. many powerful people are mostly custodians for the power of other people, in the same way that a bank is mostly a custodian for the money of its customers.

also, just because someone is involved in something important, it doesn't mean that they were maximally counterfactually responsible. structural forces often create possibilities to become extremely influential, but only in the direc... (read more)

4Joseph Miller5mo

I think there's a spectrum between great man theory and structural forces theory and I would classify your view as much closer to the structural forces view, rather than a combination of the two. The strongest counter-example might be Mao. It seems like one man's idiosyncratic whims really did set the trajectory for hundreds of millions of people. Although of course as soon as he died most of the power vanished, but surely China and the world would be extremely different today without him.

2Viliam4mo

A synthesis between the structural forces theory and "pulling the rope sideways". The economical and other forces determine the main direction, a leader who already wanted to go in that direction gets elected and starts going in that direction, his idiosyncratic whims get implemented as a side effect. Like, instead of Hitler, there would be another German leader determined to change the post-WW1 world order, but he would probably be less obsessed about the Jews. Also, he might make different alliances.

2Thomas Kwa5mo

Musk only owns 0.1% of the economic output of the US but he is responsible for more than this, including large contributions to * Politics * Space * SpaceX is nearly 90% of global upmass * Dragon is the sole American spacecraft that can launch humans to ISS * Starlink probably enables far more economic activity than its revenue * Quality and quantity of US spy satellites (Starshield has ~tripled NRO satellite mass) * Startup culture through the many startups from ex-SpaceX employees * Twitter as a medium of discourse, though this didn't change much * Electric cars probably sped up by ~1 year by Tesla, which still owns over half the nation's charging infrastructure * AI, including medium-sized effects on OpenAI and potential future effects through xAI Depending on your reckoning I wouldn't be surprised if Elon's influence added up to >1% of Americans combined. This is not really surprising because a Zipfian relationship would give the top person in a nation of 300 million 5% of the total influence.

4Joseph Miller5mo

The Duke of Wellington said that Napoleon's presence on a battlefield “was worth forty thousand men”. This would be about 4% of France's military size in 1812.

2leogao5mo

i'm happy to grant that the 0.1% is just a fermi estimate and there's a +/- one OOM error bar around it. my point still basically stands even if it's 1%. i think there are also many factors in the other direction that just make it really hard to say whether 0.1% is an under or overestimate. for example, market capitalization is generally an overestimate of value when there are very large holders. tesla is also a bit of a meme stock so it's most likely trading above fundamental value. my guess is most things sold to the public sector probably produce less economic value per $ than something sold to the private sector, so profit overestimates value produced the sign on net economic value of his political advocacy seems very unclear to me. the answer depends strongly on some political beliefs that i don't feel like arguing out right now. it slightly complicates my analogy for elon to be both the richest person in the us and also possibly the most influential (or one of). in my comment i am mostly referring to economic-elon. you are possibly making some arguments about influentialness in general. the problem is that influentialness is harder to estimate. also, if we're talking about influentialness in general, we don't get to use the 0.1% ownership of economic output as a lower bound of influentialness. owning x% of economic output doesn't automatically give you x% of influentialness. (i think the majority of other extremely rich people are not nearly as influential as elon per $)

[-]leogao9mo159

one kind of reasoning in humans is a kind of instant intuition; you see something and something immediately and effortlessly pops into your mind. examples include recalling vocabulary in a language you're fluent in, playing a musical instrument proficiently, or having a first guess at what might be going wrong when debugging.

another kind of reasoning is the chain of thought, or explicit reasoning: you lay out your reasoning steps as words in your head, interspersed perhaps with visuals, or abstract concepts that you would have a hard time putting in words. It feels like you're consciously picking each step of the reasoning. Working through a hard math problem, or explicitly designing a codebase by listing the constraints and trying to satisfy them, are examples of this.

so far these map onto what people call system 1 and 2, but I've intentionally avoided these labels because I think there's actually a third kind of reasoning that doesn't fit well into either of these buckets.

sometimes, I need to put the relevant info into my head, and then just let it percolate slowly without consciously thinking about it. at some later time, insights into the problem will suddenly and unpredictably... (read more)

[-]TsviBT9mo110

the possibility that a necessary ingredient in solving really hard problems is spending a bunch of time simply not doing any explicit reasoning

I have a pet theory that there are literally physiological events that take minutes, hours, or maybe even days or longer, to happen, which are basically required for some kinds of insight. This would look something like:

First you do a bunch of explicit work trying to solve the problem. This makes a bunch of progress, and also starts to trace out the boundaries of where you're confused / missing info / missing ideas.
You bash your head against that boundary even more.
- You make much less explicit progress.
- But, you also leave some sort of "physiological questions". I don't know the neuroscience at all, but to make up a story to illustrate what sort of thing I mean: One piece of your brain says "do I know how to do X?". Some other pieces say "maybe I can help". The seeker talks to the volunteers, and picks the best one or two. The seeker says "nah, that's not really what I'm looking for, you didn't address Y". And this plays out as some pattern of electrical signals which mean "this and this and this neuron shouldn't have been firing s

... (read more)

3Kaarel9mo

a thing i think is probably happening and significant in such cases: developing good 'concepts/ideas' to handle a problem, 'getting a feel for what's going on in a (conceptual) situation' a plausibly analogous thing in humanity(-seen-as-a-single-thinker): humanity states a conjecture in mathematics, spends centuries playing around with related things (tho paying some attention to that conjecture), building up mathematical machinery/understanding, until a proof of the conjecture almost just falls out of the machinery/understanding

2Vladimir_Nesov9mo

This is learning of a narrow topic, which builds representations that make thinking on that topic more effective, novel insights might become feasible even through system 1 where before system 2 couldn't help. With o1, LLMs have systems 1 and 2, but all learning is in pretraining, not targeting the current problem and in any case with horrible sample efficiency. Could be a crucial missing capability, though with scale even in-context learning might get there. Sounds like a synthetic data generation pipeline.

1FractalSyn9mo

Relatable. Giorgio Parisi mentionned this in his book; he said that the ah-ah moments tend to spark randomly when doing something else. Bertrand Russell had a very active social life (he praised leisure) and believed it is an active form of idleness that could reveal very productive. A good balance might be the best way to leverage it.

[-]leogao9mo1410

for people who are not very good at navigating social conventions, it is often easier to learn to be visibly weird than to learn to adapt to the social conventions.

this often works because there are some spaces where being visibly weird is tolerated, or even celebrated. in fact, from the perspective of an organization, it is good for your success if you are good at protecting weird people.

but from the perspective of an individual, leaning too hard into weirdness is possibly harmful. part of leaning into weirdness is intentional ignorance of normal conventions. this traps you in a local minimum where any progress on understanding normal conventions hurts your weirdness, but isn't enough to jump all the way to the basin of the normal mode of interaction.

(epistemic status: low confidence, just a hypothesis)

2Ben Pace9mo

Pretty sure @Ronny Fernandez has opinions about this (in particular, I expect he disagrees that actively being visibly weird requires being ignorant of how to behave conventionally).

1CstineSublime9mo

Perhaps I misunderstand your use of the phrase "intentionally ignorant" but I believe many cases of people who are seen to have acted with "integrity" are people who have been hyperaware and well informed of what normal social conventions are in a given environment and made deliberate choice not to adhere to them, not ignoring said conventions out of a lack of interest. I also am not sure what you mean by "weird". I assume you mean any behavior which is not the normal convention of any randomly selected cohesive group of people, from a family, to a local soccer club, to a informal but tight knit circle of friends, to a department of a large company. Have I got that right? My idea of 'weird' tends to involve the stereotypical artists and creatives I associate with, which is, within those circles not weird at all but normal. But I'm meta-aware that might be a weird take.

1MichaelDickens9mo

I don't think I understand what "learn to be visibly weird" means, and how it differs from not following social conventions because you fail to understand them correctly.

1Erich_Grunewald9mo

are you basing this on intuition or personal experience or something else? I guess we should avoid basing it on observations of people who did succeed in that way. People who try and succeed in adapting to social conventions are likely much less noticeable/salient than people who succeed at being visibly weird.

1AprilSR9mo

I think my ideal is to lean into weirdness in a way that doesn't rely on ignorance of normal conventions

[-]leogao2y140

Since there are basically no alignment plans/directions that I think are very likely to succeed, and adding "of course, this will most likely not solve alignment and then we all die, but it's still worth trying" to every sentence is low information and also actively bad for motivation, I've basically recalibrated my enthusiasm to be centered around "does this at least try to solve a substantial part of the real problem as I see it". For me at least this is the most productive mindset for me to be in, but I'm slightly worried people might confuse this for me having a low P(doom), or being very confident in specific alignment directions, or so on, hence this post that I can point people to.

I think this may also be a useful emotional state for other people with similar P(doom) and who feel very demotivated by that, which impacts their productivity.

[-]leogao6d124

hot take: introspection isn't really real. you can't access your internal state in any meaningful sense beyond what your brain chooses to present to you (e.g visual stimuli, emotions, etc), for reasons outside of your direct control. when you think you're introspecting, what's really going on when you think you're introspecting is you have a model of yourself inside your brain, which you learn gradually by seeing yourself do certain things, experience certain stimuli or emotions, etc.

your self-model is not fundamentally special compared to any other models you have. it works the same way as your model of anyone or anything else, except you have way more data on yourself, and also you directly experience your own emotions and sensori stimuli, as opposed to having to infer them for other people. often your emotional brain sabotages your ability to understand yourself, but also it sometimes sabotages your ability to understand other people too (e.g groupthink, tribalism).

your self-model can diverge arbitrarily far from reality. when you're emotionally unintegrated, you have a model of yourself that fails to understand how your emotions truly work, so you will systematically mispredict... (read more)

7yams5d

What experiences have you had that lead you to call this a ‘hot take’? [I rephrased a few times to avoid sounding sarcastic and still may have failed; I’m interested in why it looks to you like others dramatically disagree with this, or in what social environment people are obviously not operating on a model that resembles this one. My sense is a lot of people think this way, but it’s a little socially taboo to broadcast object-level reasoning grounded in this model, since it can get very interpersonally invasive or intimate and lead to undesirable social/power dynamics.]

4leogao5d

the experience that led to calling it a hot take is i was arguing against someone who disagreed with this right before i wrote it up

3yams5d

What was their position? (to the extent that you can reproduce it)

[-]leogao2y*120

a common discussion pattern: person 1 claims X solves/is an angle of attack on problem P. person 2 is skeptical. there is also some subproblem Q (90% of the time not mentioned explicitly). person 1 is defending a claim like "X solves P conditional on Q already being solved (but Q is easy)", whereas person 2 thinks person 1 is defending "X solves P via solving Q", and person 2 also believes something like "subproblem Q is hard". the problem with this discussion pattern is it can lead to some very frustrating miscommunication:

if the discussion recurses into whether Q is hard, person 1 can get frustrated because it feels like a diversion from the part they actually care about/have tried to find a solution for, which is how to find a solution to P given a solution to Q (again, usually Q is some implicit assumption that you might not even notice you have). it can feel like person 2 is nitpicking or coming up with fully general counterarguments for why X can never be solved.
person 2 can get frustrated because it feels like the original proposed solution doesn't engage with the hard subproblem Q. person 2 believes that assuming Q were solved, then there would be many other proposals other than X that would also suffice to solve problem P, so that the core ideas of X actually aren't that important, and all the work is actually being done by assuming Q.

4Dagon2y

I find myself in person 2's position fairly often, and it is INCREDIBLY frustrating for person 1 to claim they've "solved" P, when they're ignoring the actual hard part (or one of the hard parts). And then they get MAD when I point out why their "solution" is ineffective. Oh, wait, I'm also extremely annoyed when person 2 won't even take steps to CONSIDER my solution - maybe subproblem Q is actually easy, when the path to victory aside from that is clarified. In neither case can any progress be made without actually addressing how Q fits into P, and what is the actual detailed claim of improvement of X in the face of both Q and non-Q elements of P.

3Max H2y

I can see how this could be a frustrating pattern for both parties, but I think it's often an important conversation tree to explore when person 1 (or anyone) is using results about P in restricted domains to make larger claims or arguments about something that depends on solving P at the hardest difficulty setting in the least convenient possible world. As an example, consider the following three posts: * Challenge: construct a Gradient Hacker * Gradient hacking is extremely difficult * My Objections to "We’re All Gonna Die with Eliezer Yudkowsky" I think both of the first two posts are valuable and important work on formulating and analyzing restricted subproblems. But I object to citation of the second post (in the third post) as evidence in support of a larger point that doom from mesa-optimizers or gradient descent is unlikely in the real world, and object to the second post to the degree that it is implicitly making this claim. There's an asymmetry when person I is arguing for an optimistic view on AI x-risk and person 2 is arguing for a doomer-ish view, in the sense that person I has to address all counterarguments but person 2 only has to find one hole. But this asymmetry is unfortunately a fact about the problem domain and not the argument / discussion pattern between I and 2.

2the gears to ascension2y

yeah, but that's because Q is easy if you solve P Very nicely described, this might benefit from becoming a top level post

1DPiepgrass2y

For example?

5leogao2y

here's a straw hypothetical example where I've exaggerated both 1 and 2; the details aren't exactly correct but the vibe is more important: 1: "Here's a super clever extension of debate that mitigates obfuscated arguments [etc], this should just solve alignment" 2: "Debate works if you can actually set the goals of the agents (i.e you've solved inner alignment), but otherwise you can get issues with the agents coordinating [etc]" 1: "Well the goals have to be inside the NN somewhere so we can probably just do something with interpretability or whatever" 2: "how are you going to do that? your scheme doesn't tackle inner alignment, which seems to contain almost all of the difficulty of alignment to me. the claim you just made is a separate claim from your main scheme, and the cleverness in your scheme is in a direction orthogonal to this claim" 1: "idk, also that's a fully general counterargument to any alignment scheme, you can always just say 'but what if inner misalignment'. I feel like you're not really engaging with the meat of my proposal, you've just found a thing you can say to be cynical and dismissive of any proposal" 2: "but I think most of the difficulty of alignment is in inner alignment, and schemes which kinda handwave it away are trying to some some problem which is not the actual problem we need to solve to not die from AGI. I agree your scheme would work if inner alignment weren't a problem." 1: "so you agree that in a pretty nontrivial number [let's say both 1&2 agree this is like 20% or something] of worlds my scheme does actually work- I mean how can you be that confident that inner alignment is that hard? in the world's where inner alignment turns out to be easy then my scheme will work." 2: "I'm not super confident, but if we assume that inner alignment is easy then I think many other simpler schemes will also work, so the cleverness that your proposal adds doesn't actually make a big difference."

1DPiepgrass2y

So Q=inner alignment? Seems like person 2 not only pointed to inner alignment explicitly (so it can no longer be "some implicit assumption that you might not even notice you have"), but also said that it "seems to contain almost all of the difficulty of alignment to me". He's clearly identified inner alignment as a crux, rather than as something meant "to be cynical and dismissive". At that point, it would have been prudent of person 1 to shift his focus onto inner alignment and explain why he thinks it is not hard. Note that your post suddenly introduces "Y" without defining it. I think you meant "X".

[-]leogao1mo11-3

a simple elegant intuition for the relationship between SVD and eigendecomposition that I haven't heard before:

the eigendecomposition of A tells us which directions A stretches along without rotating. but sometimes we want to know all the directions things get stretched along, even if there is rotation.

why does taking the eigendecomposition of $A^{T} A$ help us? suppose we rewrite $A = R S$ , where S just scales (i.e is normal matrix), and R is just a rotation matrix. then, $A^{T} A = S^{T} R^{T} R S$ , and the R's cancel out because transpose of rotation matrix is also its inverse.

intuitively, imagine thinking of A as first scaling in place, and then rotating. then, ATA would first scale, then rotate, then rotate again in the opposite direction, then scale again. so all the rotations cancel out and the resulting eigenvalues of ATA are the squares of the scaling factors.

[-]Garrett Baker1mo*194

This is almost right, but a normal matrix is not a matrix that “just scales”, its a normal matrix which can do whatever linear operation it likes.

SVD tells us there exists a factorization $A = U Σ V^{T}$ where $U$ and $V$ are orthogonal, and $Σ$ is a “scaling matrix” in the sense that its diagonal. Therefore, using similar logic to you, $A^{T} A = V Σ U^{T} U Σ V^{T} = V Σ^{2} V^{T}$ which means we rotate, scale by the singular values twice, then rotate back, which is why the eigenvales of this are the squares of the singular values, and the eigenvectors are the right singular vectors.

[-]leogao1y11-1

philosophy: while the claims "good things are good" and "bad things are bad" at first appear to be compatible with each other, actually we can construct a weird hypothetical involving exact clones that demonstrates that they are fundamentally inconsistent with each other

law: could there be ambiguity in "don't do things that are bad as determined by a reasonable person, unless the thing is actually good?" well, unfortunately, there is no way to know until it actually happens

6Dagon1y

I think I need to hear more context (and likely more words in the sentences) to understand what inconsistency you're talking about. "good things are good" COULD be just a tautology, with the assumption that "good things" are relative to a given agent, and "good" is furtherance of the agent's preferences. Or it could be a hidden (and false) claim of universality "good things" are anything that a lot of people support, and "are good" means truly pareto-preferred with no harm to anyone. Your explanation "by a reasonable person" is pretty limiting, there being no persons who are reasonable on all topics. Likewise "actually good" - I think there's no way to know even after it happens.

[-]leogao3yΩ5110

One possible model of AI development is as follows: there exists some threshold beyond which capabilities are powerful enough to cause an x-risk, and such that we need alignment progress to be at the level needed to align that system before it comes into existence. I find it informative to think of this as a race where for capabilities the finish line is x-risk-capable AGI, and for alignment this is the ability to align x-risk-capable AGI. In this model, it is necessary but not sufficient for alignment for alignment to be ahead by the time it's at the finish line for good outcomes: if alignment doesn't make it there first, then we automatically lose, but even if it does, if alignment doesn't continue to improve proportional to capabilities, we might also fail at some later point. However, I think it's plausible we're not even on track for the necessary condition, so I'll focus on that within this post.

Given my distributions over how difficult AGI and alignment respectively are, and the amount of effort brought to bear on each of these problems, I think there's a worryingly large chance that we just won't have the alignment progress needed at the critical juncture.

I also think it's ... (read more)

[-]leogao3mo106

i find it disappointing that a lot of people believe things about trading that are obviously crazy even if you only believe in a very weak form of the EMH. for example, technical analysis is obviously tea leaf reading - if it were predictive whatsoever, you could make a lot of money by exploiting it until it is no longer predictive.

[-]lc3mo*142

Close friend of mine, a regular software engineer, recently threw tens of thousands of dollars - a sizable chunk of his yearly salary - at futures contracts on some absurd theory about the Japanese Yen. Over the last few weeks, he coinflipped his money into half a million dollars. Everyone who knows him was begging him to pull out and use the money to buy a house or something. But of course yesterday he sold his futures contracts and bought into 0DTE Nasdaq options on another theory, and literally lost everything he put in and then some. I'm not sure but I think he's down about half his yearly salary overall.

He has been doing this kind of thing for the last two years or so - not just making investments, but making the most absurd, high risk investments you can think of. Every time he comes up with a new trade, he has a story for me about how his cousin/whatever who's a commodities trader recommended the trade to him, or about how a geopolitical event is gonna spike the stock of Lockheed Martin, or something. On many occasions I have attempted to explain some kind of Inadequate Equilibria thesis to him, but it just doesn't seem to "stick".

It's not that he "rejects" the EMH in these ... (read more)

7Alexander Gietelink Oldenziel3mo

the steelman is that quants do the version of technical analysis that works - they disprove the EMH proportional to quant salaries.

1Kaarel3mo

i agree that most people doing "technical analysis" are doing nonsense and any particular well-known simple method does not actually work. but also clearly a very good predictor could make a lot of money just looking at the past price time series anyway

[-]leogao7mo107

i think it's quite valuable to go through your key beliefs and work through what the implications would be if they were false. this has several benefits:

picturing a possible world where your key belief is wrong makes it feel more tangible and so you become more emotionally prepared to accept it.
if you ever do find out that the belief is wrong, you don't flinch away as strongly because it doesn't feel like you will be completely epistemically lost the moment you remove the Key Belief
you will have more productive conversations with people who disagree with you on the Key Belief
you might discover strategies that are robustly good whether or not the Key Belief is true
you will become better at designing experiments to test whether the Key Belief is true

4Daniel Tan7mo

what are some of your key beliefs and what were the implications if they were false?

9leogao7mo

some concrete examples * "agi happens almost certainly within in the next few decades" -> maybe ai progress just kind of plateaus for a few decades, it turns out that gpqa/codeforces etc are like chess in that we only think they're hard because humans who can do them are smart but they aren't agi-complete, ai gets used in a bunch of places in the economy but it's more like smartphones or something. in this world i should be taking normie life advice a lot more seriously. * "agi doesn't happen in the next 2 years" -> maybe actually scaling current techniques is all you need. gpqa/codeforces actually do just measure intelligence. within like half a year, ML researchers start being way more productive because lots of their job is automated. if i use current/near-future ai agents for my research, i will actually just be more productive. * "alignment is hard" -> maybe basic techniques is all you need, because natural abstractions is true, or maybe the red car / blue car argument for why useful models are also competent at bad things is just wrong because generalization can be made to suck. maybe all the capabilities people are just right and it's not reckless to be building agi so fast

2Viliam6mo

Making a list of your beliefs can be complicated. Recognizing the belief as a "belief" is the necessary first step, but the strongest beliefs (those that examining them would be most useful?) are probably transparent, they feel like "just how the world is". Then again, maybe listing all the strong beliefs would actually be useless, because the list would contain tons of things like "I believe that 2+2=4", and examining those would be mostly a waste of time. We want the beliefs that are strong but possibly wrong. But when you notice that they are "possibly wrong", you have already made the most difficult step; the question is how to get there.

[-]leogao1y104

economic recession and subsequent reduction in speculative research, including towards AGI, seems very plausible

AI (by which I mean, like, big neural networks and whatever) is not that economically useful right now. furthermore, current usage figures are likely an overestimate of true economic usefulness because a very large fraction of it is likely to be bubbly spending that will itself dry up if there is a recession (legacy companies putting LLMs into things to be cool, startups that are burning money without PMF, consumers with disposable income to spend on entertainment).

it will probably still be profitable to develop AI tech, but things will be much more tethered to consumer usefulness.

this probably doesn't set AGI back that much but I think people are heavily underrating this as a possibility. it also probably heavily impacts the amount of alignment work done at labs.

6leogao1y

for a sense of scale of just how bubbly things can get: Bitcoin has a market cap of ~1T, and the entirety of crypto ~2T. Crypto does produce some amount of real value, but probably on the order of magnitude of 1% that market cap. So it's not at all unheard of for speculation to account for literally trillions of dollars of map (or ~tens of billions of earnings per year, at a reasonable P/E ratio)

2Noosphere891y

I will say that crypto is a pretty pathological case where virtually all the benefit is speculation, because in order to deliver on anything real, they'd have to get rid of the money element in it, it's thankfully pretty rare for entire industries to be outright scams/speculation opportunities.

4Raemon1y

say more about what you expect here?

7leogao1y

investment in anything speculative, including alignment, and AGI research, is likely to decrease if the economy is not doing great

[-]leogao2yΩ4100

one man's modus tollens is another man's modus ponens:

"making progress without empirical feedback loops is really hard, so we should get feedback loops where possible" "in some cases (i.e close to x-risk), building feedback loops is not possible, so we need to figure out how to make progress without empirical feedback loops. this is (part of) why alignment is hard"

4Raemon2y

Yeah something in this space seems like a central crux to me. I personally think (as a person generally in the MIRI-ish camp of "most attempts at empirical work are flawed/confused"), that it's not crazy to look at the situation and say "okay, but, theoretical progress seems even more flawed/confused, we just need to figure out some how of getting empirical feedback loops." I think there are some constraints on how the empirical work can possibly work. (I don't think I have a short thing I could write here, I have a vague hope of writing up a longer post on "what I think needs to be true, for empirical work to be helping rather than confusedly not-really-helping")

2the gears to ascension2y

you gain general logical facts from empirical work, which can aide providing a blurry image of the manifold that the precise theoretical work is trying to build an exact representation of

[-]leogao2y100

A common cycle:

This model is too oversimplified! Reality is more complex than this model suggests, making it less useful in practice. We should really be taking these into account. [optional: include jabs at outgroup]
This model is too complex! It takes into account a bunch of unimportant things, making it much harder to use in practice. We should use this simplified model instead. [optional: include jabs at outgroup]

Sometimes this even results in better models over time.

[-]leogao7mo94

the world is too big and confusing, so to get anything done (and to stay sane) you have to adopt a frame. each frame abstracts away a ton about the world, out of necessity. every frame is wrong, but some are useful. a frame comes with a set of beliefs about the world and a mechanism for updating those beliefs.

some frames contain within them the ability to become more correct without needing to discard the frame entirely; they are calibrated about and admit what they don't know. they change gradually as we learn more. other frames work empirically but are a... (read more)

8Vladimir_Nesov7mo

It's as efficient to work on many frames while easily switching between them. Some will be poorly developed, but won't require commitment and can anchor curiosity, progress on blind spots of other frames.

8leogao7mo

corollary: oftentimes, when smart people say things that are clearly wrong, what's really going on is they're saying the closest thing in their frame that captures the grain of truth

4quetzal_rainbow7mo

"...you learn that there's three kinds of intellectuals. There's intellectuals that work in one frame. There's intellectuals that work in two frames. And there's intellectuals that change frames like you and I change clothes."

[-]leogao2y90

for something to be a good way of learning, the following criteria have to be met:

tight feedback loops
transfer of knowledge to your ultimate goal
sufficiently interesting that it doesn't feel like a grind

trying to do the thing you care about directly hits 2 but can fail 1 and 3. many things that you can study hit 1 but fail 2 and 3. and of course, many fun games hit 3 (and sometimes 1) but fail to hit 2.

2leogao2y

corollary: for things with very long feedback loops, or where you aren't motivated by default, it can be faster for learning to do something that is actually not directly the thing you care about

2Viliam2y

This is basically math (and computer science) education. On one hand, some parts are probably not very useful. On the other hand, some people expect that teachers will defend every single step along the way by explaining how specifically this tiny atom of knowledge improves the student's future life. No, I am not preparing a PowerPoint presentation on how knowing that addition is associative and commutative will make you rich one day.

2leogao2y

funnily enough, my experience has been almost entirely from the other direction - almost everything I know is from working directly on things I care about, and very little is from study. one of the reasons behind this shortform was trying to untangle why people spend lots of time studying stuff and whether/when it makes sense for me to study vs simply to learn by doing

2Viliam2y

I think it is good to use your goals as a general motivation for going approximately in some direction, but the opposite extreme of obsessing whether every single detail you learn contributes to the goal is premature optimization. It reminds me of companies where, before you are allowed to spend 1 hour doing something, the entire team first needs to spend 10 hours in various meetings to determine whether that 1 hour would be spent optimally. I would rather spend all that time doing things, even if some of them turn out to be ultimately useless. Sometimes it's not even obvious in advance which knowledge will turn out to be useful.

[-]leogao2y90

lifehack: buying 3 cheap pocket sized battery packs costs like $60 and basically eliminates the problem of running out of phone charge on the go. it's much easier to remember to charge them because you can instantaneously exchange your empty battery pack for a full one when you realize you need one, plugging the empty battery pack happens exactly when you swap for a fresh one, and even if you forget once or lose one you have some slack

[-]leogao3mo80

conference talks aren't worth going to irl because they're recorded anyways. ofc, you're not actually going to remember to watch the recording, but it's not like anyone pays attention at the irl talk anyways

[-]leogao6mo83

a thriving culture is a mark of a healthy and intellectually productive community / information ecosystem. it's really hard to fake this. when people try, it usually comes off weird. for example, when people try to forcibly create internal company culture, it often comes off as very cringe.

[-]leogao7mo80

there are two different modes of learning i've noticed.

top down: first you learn to use something very complex and abstract. over time, you run into weird cases where things don't behave how you'd expect, or you feel like you're not able to apply the abstraction to new situations as well as you'd like. so you crack open the box and look at the innards and see a bunch of gears and smaller simpler boxes, and it suddenly becomes clear to you why some of those weird behaviors happened - clearly it was box X interacting with gear Y! satisfied, you use your newf

... (read more)

3Daniel Tan7mo

Seems to strongly echo Karpathy, in that top-down learning is most effective for building expertise https://x.com/karpathy/status/1325154823856033793?s=46&t=iz509DJpCAibJadbMh4TvQ

[-]leogao9mo810

often the easiest way to gain status within some system is to achieve things outside that system

2Viliam9mo

“A prophet is not without honor except in his own town, among his relatives and in his own home.” (Mark 6:4)

[-]leogao2y80

adhd is a mechanism for seeking domains with tight feedback loops

3leogao7mo

it's (sometimes) also a mechanism for seeking domains with long positive tail outcomes, rather than low variance domains

[-]leogao2y85

the project of rationality is the project of becoming slightly more mesaoptimizery

[-]leogao2y80

Corollary to Others are wrong != I am right (https://www.lesswrong.com/posts/4QemtxDFaGXyGSrGD/other-people-are-wrong-vs-i-am-right): It is far easier to convince me that I'm wrong than to convince me that you're right.

3JBlack2y

Quite a large proportion of my 1:1 arguments start when I express some low expectation of the other person's argument being correct. This is almost always taken to mean that I believe that some opposing conclusion is correct. Usually I have to give up before being able to successfully communicate the distinction, let alone addressing the actual disagreement.

[-]leogao4mo71

fun side project idea: create a matrix X and accompanying QR decomposition, such that X and Q are both valid QR codes that link to the wikipedia page about QR decomposition

[-]leogao2y70

current understanding of optimization

high curvature directions (hessian eigenvectors with high eigenvalue) want small lrs. low curvature directions want big lrs
if the lr in a direction is too small, it takes forever to converge. if the lr is too big, it diverges by oscillating with increasing amplitude
momentum helps because if your lr is too small, it makes you move a bit faster. if your lr is too big, it causes the oscillations to cancel out with themselves. this makes high curvature directions more ok with larger lrs and low curvature directions more ok

... (read more)

2RHollerith2y

What does "the lr" mean in this context?

2leogao2y

learning rate

[-]leogao2y70

Some aspirational personal epistemic rules for keeping discussions as truth seeking as possible (not at all novel whatsoever, I'm sure there exist 5 posts on every single one of these points that are more eloquent)

If I am arguing for a position, I must be open to the possibility that my interlocutor may turn out to be correct. (This does not mean that I should expect to be correct exactly 50% of the time, but it does mean that if I feel like I'm never wrong in discussions then that's a warning sign: I'm either being epistemically unhealthy or I'm talking

... (read more)

4Vladimir_Nesov2y

I find it a helpful framing to instead allow things that feel obviously false to become more familiar, giving them the opportunity to develop a strong enough voice to explain how they are right. That is, the action is on the side of unfamiliar false things, clarifying their meaning and justification, rather than on the side of familiar true things, refuting their correctness. It's harder to break out of a familiar narrative from within.

[-]leogao2y61

hypothesis: the kind of reasoning that causes ML people to say "we have made no progress towards AGI whatsoever" is closely analogous to the kind of reasoning that makes alignment people say "we have made no progress towards hard alignment whatsoever"

ML people see stuff like GPT4 and correctly notice that it's in fact kind of dumb and bad at generalization in the same ways that ML always has been. they make an incorrect extrapolation, which is that AGI must therefore be 100 years away, rather than 10 years away

high p(doom) alignment people see current mode... (read more)

1quetzal_rainbow2y

It's differential progress that matters in alignment. I.e., if you expected that we need additional year of alignment research after creating AGI, it still looks pretty doomed, even if you admit overall progress in field.

2leogao2y

sure, but seems orthogonal to the thing i'm describing - the claim is that a lot of alignment work on current models has ~no bearing on progress towards aligning AGI.

[-]leogao2y60

Understanding how an abstraction works under the hood is useful because it gives you intuitions for when it's likely to leak and what to do in those cases.

[-]leogao2y60

takes on takeoff (or: Why Aren't The Models Mesaoptimizer-y Yet)

here are some reasons we might care about discontinuities:

alignment techniques that apply before the discontinuity may stop applying after / become much less effective
- makes it harder to do alignment research before the discontinuity that transfers to after the discontinuity (because there is something qualitatively different after the jump)
- second order effect: may result in false sense of security
there may be less/negative time between a warning shot and the End
- harder to coordinate and slow do

... (read more)

[-]leogao2y60

The following things are not the same:

Schemes for taking multiple unaligned AIs and trying to build an aligned system out of the whole
- I think this is just not possible.
Schemes for taking aligned but less powerful AIs and leveraging them to align a more powerful AI (possibly with amplification involved)
- This breaks if there are cases where supervising is harder than generating, or if there is a discontinuity. I think it's plausible something like this could work but I'm not super convinced.

[-]leogao3y60

In the spirit of https://www.lesswrong.com/posts/fFY2HeC9i2Tx8FEnK/my-resentful-story-of-becoming-a-medical-miracle , some anecdotes about things I have tried, in the hopes that I can be someone else's "one guy on a message board. None of this is medical advice, etc.

No noticeable effects from vitamin D (both with and without K2), even though I used to live somewhere where the sun barely shines and also I never went outside, so I was almost certainly deficient.
I tried Selenium (200mg) twice and both times I felt like utter shit the next day.
Glycine (2g) for

... (read more)

[-]leogao3mo53

I wonder how many supposedly consistently successful retail traders are actually just picking up pennies in front of the steamroller, and would eventually lose it all if they kept at it long enough.

also I wonder how many people have runs of very good performance interspersed by big losses, such that the overall net gains are relatively modest, but psychologically they only remember/recount the runs of good performance, whereas the losses were just bad luck and will be avoided next time.

[-]leogao9mo51

for a sufficiently competent policy, the fact that BoN doesn't update the policy doesn't mean it leaks any fewer bits of info to the policy than normal RL

2ryan_greenblatt9mo

Something between training the whole model with RL and BoN is training just the last few layers of the model (for current architectures) with RL and then doing BoN on top as needed to increase performance. This means most of the model won't know the information (except insofar as the info shows up in outputs) and allows you to get some of the runtime cost reductions of using RL rather than BoN.

2leogao9mo

I'm claiming that even if you go all the way to BoN, it still doesn't necessarily leak less info to the morel

2ryan_greenblatt9mo

Oh huh, parse error on me.

[-]leogao9mo58

aiming directly for achieving some goal is not always the most effective way of achieving that goal.

2mako yass9mo

You should be more curious about why, when you aim at a goal, you do not aim for the most effective way.

1Political Therapy9mo

What do you believe, then, is the most effective way of achieving a goal?

[-]leogao9mo52

people love to find patterns in things. sometimes this manifests as mysticism- trying to find patterns where they don't exist, insisting that things are not coincidences when they totally just are. i think a weaker version of this kind of thinking shows up a lot in e.g literature too- events occur not because of the bubbling randomness of reality, but rather carry symbolic significance for the plot. things don't just randomly happen without deeper meaning.

some people are much more likely to think in this way than others. rationalists are very far along the... (read more)

[-]leogao2y50

One of the greatest tragedies of truth-seeking as a human is that the things we instinctively do when someone else is wrong are often the exact opposite of the thing that would actually convince the other person.

[-]leogao2y50

it is often claimed that merely passively absorbing information is not sufficient for learning, but rather some amount of intentional learning is needed. I think this is true in general. however, one interesting benefit of passively absorbing information is that you notice some concepts/terms/areas come up more often than others. this is useful because there's simply too much stuff out there to learn, and some knowledge is a lot more useful than other knowledge. noticing which kinds of things come up often is therefore useful for prioritization. I often notice that my motivational system really likes to use this heuristic for deciding how motivated to be while learning something.

1Andreas Chrysopoulos2y

I think it might also depend on your goals. Like how fast you want to learn something. If you have less than ideal time, then maybe more structured learning is necessary. If you have more time then periods of structureless/passive learning could be beneficial.

[-]leogao2y50

a claim I've been saying irl for a while but have never gotten around to writing up: current LLMs are benign not because of the language modelling objective, but because of the generalization properties of current NNs (or to be more precise, the lack thereof). with better generalization LLMs are dangerous too. we can also notice that RL policies are benign in the same ways, which should not be the case if the objective was the core reason. one thing that can go wrong with this assumption is thinking about LLMs that are both extremely good at generalizing ... (read more)

1Daniel Paleka2y

what is the "language models are benign because of the language modeling objective" take?

2leogao2y

basically the Simulators kind of take afaict

[-]leogao2y50

Schmidhubering the agentic LLM stuff pretty hard https://leogao.dev/2020/08/17/Building-AGI-Using-Language-Models/

2niplav2y

Rightfully so! Read your piece back in 2021 and found it true & straightforward.

[-]leogao2mo40

why is it convergent to invent foods which consist of some form of carbs enveloping meat?

3tailcalled2mo

Bypasses the need for cutlery and plates. (Yes, you might also eat such foods in cases where you do have cutlery and plates, but that's downstream of their existence, not the vital reason for their existence.)

2Mateusz Bagiński2mo

many of them have a common origin https://link.springer.com/article/10.1007/s44187-024-00136-1

1ceba2mo

✨Portability!!✨ Hygeine? keeps flies off, less surface area/volume to get your grubby hands on. Storage: Insulation to keep meat cool for longer. Also keeping germs bugs and dirt out would help this. Convenient way to save smaller scraps of meat when you don't have utensils for a stew. Easier food makes better nourished people: so much easier to eat sandwhich, gyoza, sausage roll than almost any other food. Obviously I give very refined examples, but possibly true for more rough forms too.

-1samuelshadrach2mo

I'm going to give a weird answer and say maybe it's because water is a scarce resource for life. (Especially water not polluted by another organism.) All life is made up mainly of lipids/carbohydrates and proteins. Humans therefore need to eat proteins and lipids/carbohydrates in large quantities. Carbohydrates can be dry. Proteins have secondary structure which needs some water content to maintain. Other organisms (such as microorganisms) can compete for that water so it has to be protected. Hence you put the stuff with water content inside a protective case. In theory, yes we could be consuming those proteins that don't have much water content. For example human skin has keratin to protect the rest of the water content in our body, again from microorganisms. In theory we could be eating keratin. But enzymes required to break down those proteins tend to require more water and ATP I think. (Requiring ATP means consuming more carbs again.) Humans lack some of these enzymes. We are smart enough to survive in an evolutionary niche without requiring this much water and carbs, by protecting the water we do have.

[-]leogao4mo40

made an estimate of the distribution of prices of the SPX in one year by looking at SPX options prices, smoothing the implied volatilities and using Breeden-Litzenberger.

(not financial advice etc, just a fun side project)

[-]leogao7mo43

twitter is great because it boils down saying funny things to purely a problem of optimizing for funniness, and letting twitter handle the logistics of discovery and distribution. being e.g a comedian is a lot more work.

[-]leogao7mo42

the financial industry is a machine that lets you transmute a dollar into a reliable stream of ~4 cents a year ~forever (or vice versa). also, it gives you a risk knob you can turn that increases the expected value of the stream, but also the variance (or vice versa; you can take your risky stream and pay the financial industry to convert it into a reliable stream or lump sum)

[-]leogao9mo41

in a highly competitive domain, it is often better and easier to be sui generis, rather than a top 10 percentile member of a large reference class

4leogao9mo

also, from the perspective of an organization, it is highly effective to create lots of opportunities for people to find unique niches on the Pareto frontier of ability, as opposed to linearizing competence

[-]leogao1y43

tourist traps are the lubricant between the expectations and realities of travel

[-]leogao2y40

an interesting fact that I notice is that in domains where there are are a lot of objects in consideration, those objects have some structure so that they can be classified, and how often those objects occur follows a power law or something, there are two very different frames that get used to think about that domain:

a bucket of atomic, structureless objects with unique properties where facts about one object don't really generalize at all to any other object
a systematized, hierarchy or composition of properties or "periodic table" or full grid or objec

... (read more)

[-]leogao2y40

retargetability might be the distinguishing factor between controllers and optimizers

31a3orn2y

as in, controllers are generally retargetable and optimizers aren't? or vice-versa would be interested in reasoning, either way

[-]leogao3y40

House rules for definitional disputes:

If it ever becomes a point of dispute in an object level discussion what a word means, you should either use a commonly accepted definition, or taboo the term if the participants think those definitions are bad for the context of the current discussion. (If the conversation participants are comfortable with it, the new term can occupy the same namespace as the old tabooed term (i.e going forward, we all agree that the definition of X is Y for the purposes of this conversation, and all other definitions no longer appl

... (read more)

[-]leogao3yΩ040

A few axes along which to classify optimizers:

Competence: An optimizer is more competent if it achieves the objective more frequently on distribution
Capabilities Robustness: An optimizer is more capabilities robust if it can handle a broader range of OOD world states (and thus possible pertubations) competently.
Generality: An optimizer is more general if it can represent and achieve a broader range of different objectives
Real-world objectives: whether the optimizer is capable of having objectives about things in the real world.

Some observations: it feels l... (read more)

3leogao3y

Another generator-discriminator gap: telling whether an outcome is good (outcome->R) is much easier than coming up with plans to achieve good outcomes. Telling whether a plan is good (plan->R) is much harder, because you need a world model (plan->outcome) as well, but for very difficult tasks it still seems easier than just coming up with good plans off the bat. However, it feels like the world model is the hardest part here, not just because of embeddedness problems, but in general because knowing the consequences of your actions is really really hard. So it seems like for most consequentialist optimizers, the quality of the world model actually becomes the main thing that matters. This also suggests another dimension along which to classify our optimizers: the degree to which they care about consequences in the future (I want to say myopia but that term is already way too overloaded). This is relevant because the further in the future you care about, the more robust your world model has to be, as errors accumulate the more steps you roll the model out (or the more abstraction you do along the time axis). Very low confidence but maybe this suggests that mesaoptimizers probably won't care about things very far in the future because building a robust world model is hard and so perform worse on the training distribution, so SGD pushes for more myopic mesaobjectives? Though note, this kind of myopia is not quite the kind we need for models to avoid caring about the real world/coordinating with itself.

[-]leogao3y40

A thought pattern that I've noticed myself and others falling into sometimes: Sometimes I will make arguments about things from first principles that look something like "I don't see any way X can be true, it clearly follows from [premises] that X is definitely false", even though there are people who believe X is true. When this happens, it's almost always unproductive to continue to argue on first principles, but rather I should do one of: a) try to better understand the argument and find a more specific crux to disagree on or b) decide that this topic isn't worth investing more time in, register it as "not sure if X is true" in my mind, and move on.

4Dagon3y

For many such questions, "is X true" is the wrong question. This is common when X isn't a testable proposition, it's a model or assertion of causal weight. If you can't think of existence proofs that would confirm it, try to reframe as "under what conditions is X a useful model?".

[-]leogao2y30

there are policies which are successful because they describe a particular strategy to follow (non-mesaoptimizers), and policies that contain some strategy for discovering more strategies (mesaoptimizers). a way to view the relation this has to speed/complexity priors that doesn't depend on search in particular is that policies that work by discovering strategies tend to be simpler and more generic (they bake in very little domain knowledge/metis, and are applicable to a broader set of situations because they work by coming up with a strategy for the task ... (read more)

3leogao2y

another observation is that a meta-strategy with the ability to figure out what strategy is good is kind of defined by the fact that it doesn't bake in specifics of dealing with a particular situation, but rather can adapt to a broad set of situations. there are also different degrees of meta-strategy-ness; some meta strategies will more quickly adapt to a broader set of situations. (there's probably some sort of NFLT kind of argument you can make but NFLTs in general don't really matter)

2leogao2y

the ability to figure out strategies doesn't necessarily have to be all reasoning, it can also encompass the experimental skillset

[-]leogao2y30

random brainstorming about optimizeryness vs controller/lookuptableyness:

let's think of optimizers as things that reliably steer a broad set of initial states to some specific terminal state seems like there are two things we care about (at least):

retargetability: it should be possible to change the policy to achieve different terminal states (but this is an insufficiently strong condition, because LUTs also trivially meet this condition, because we can always just completely rewrite the LUT. maybe the actual condition we want is that the complexity of t

... (read more)

[-]leogao1y20

a tentative model of ambitious research projects

when you do a big research project, you have some amount of risk you can work with - maybe you're trying to do something incremental, so you can only tolerate a 10% chance of failure, or maybe you're trying to shoot for the moon and so you can accept a 90% chance of failure.

budgeting for risk is non negotiable because there are a lot of places where risk can creep in - and if there isn't, then you're not really doing research. most obviously, your direction might just be a dead end. but there are also other t... (read more)

[-]leogao2y20

https://arxiv.org/abs/2304.08612 : interesting paper with improvement on straight through estimator

[-]leogao2y20

https://arxiv.org/abs/2302.07011 : sharpness doesn't seem to correlate with generalization

[-]leogao2y20

the phenomenon of strange bedfellows is probably caused in no small part by outgroup vs fargroup dynamics

[-]leogao2y20

'And what ingenious maneuvers they all propose to me! It seems to them that when they have thought of two or three contingencies' (he remembered the general plan sent him from Petersburg) 'they have foreseen everything. But the contingencies are endless.'

[-]leogao2y20

We spend a lot of time on trying to figure out empirical evidence to distinguish hypotheses we have that make very similar predictions, but I think a potentially underrated first step is to make sure they actually fit the data we already have.

4Thomas Kwa2y

Example?

[-]leogao2y20

Is the correlation between sleeping too long and bad health actually because sleeping too long is actually causally upstream of bad health effects, or only causally downstream of some common cause like illness?

1Portia2y

Afaik, both. Like a lot of shit things - they are caused by depression, and they cause depression, horrible reinforcing loop. While the effect of bad health on sleep is obvious, you can also see this work in reverse; e.g. temporary severe sleep restriction has an anti-depressive effect. Notable, though with not many useful clinical applications, as constant sleep deprivation is also really unhealthy.

[-]leogao2y20

GPT-2-xl unembedding matrix looks pretty close to full rank (plot is singular values)

[-]leogao2y20

Unsupervised learning can learn things humans can't supervise because there's structure in the world that you need deeper understanding to predict accurately. For example, to predict how characters in a story will behave, you have to have some kind of understanding in some sense of how those characters think, even if their thoughts are never explicitly visible.

Unfortunately, this understanding only has to be structured in a way that makes reading off the actual unsupervised targets (i.e next observation) easy.

[-]leogao2y20

An incentive structure for scalable trusted prediction market resolutions

We might want to make a trustable committee for resolving prediction markets. We might be worried that individual resolvers might build up reputation only to exit-scam, due to finite time horizons and non transferability of reputational capital. However, shareholders of a public company are more incentivized to preserve the value of the reputational capital. Based on this idea, we can set something up as follows:

Market creators pay a fee for the services of a resolution company
There i

... (read more)

0Dagon2y

It's amazing how many proposals for dealing with institutional distrust sound a lot like "make a new institution, with the same structure, but with better actors." You lose me at "trustable committee", especially when you don't describe how THOSE humans are motivated by truth and beauty, rather than filthy lucre. Adding more layers of committees doesn't help, unless you define a "final, un-appealable decision" that's sooner than the full shareholder vote.

2leogao2y

the core of the proposal really boils down to "public companies have less incentive to cash in on reputation and exit scam than individuals". this proposal is explicitly not "the same structure but with better actors".

[-]leogao2y20

Levels of difficulty:

Mathematically proven to be impossible (i.e perfect compression)
Impossible under currently known laws of physics (i.e perpetual motion machines)
A lot of people have thought very hard about it and cannot prove that it's impossible, but strongly suspect it is impossible (i.e solving NP problems in P)
A lot of people have thought very hard about it, and have not succeeded, but we have no strong reason to expect it to be impossible (i.e AGI)
There is a strong incentive for success, and the markets are very efficient, so that for partic

... (read more)

[-]leogao2y20

(random shower thoughts written with basically no editing)

Sometimes arguments have a beat that looks like "there is extreme position X, and opposing extreme position Y. what about a moderate 'Combination' position?" (I've noticed this in both my own and others' arguments)

I think there are sometimes some problems with this.

Usually almost nobody is on the most extreme ends of the spectrum. Nearly everyone falls into the "Combination" bucket technically, so in practice you have to draw the boundary between "combination enough" vs "not combination enough to

... (read more)

3leogao2y

related take: "things are more nuanced than they seem" is valuable only as the summary of a detailed exploration of the nuance that engages heavily with object level cruxes; the heavy lifting is done by the exploration, not the summary

[-]leogao3y20

Subjective Individualism

TL;DR: This is basically empty individualism except identity is disentangled from cooperation (accomplished via FDT), and each agent can have its own subjective views on what would count as continuity of identity and have preferences over that. I claim that:

Continuity is a property of the subjective experience of each observer-moment (OM), not necessarily of any underlying causal or temporal relation. (i.e I believe at this moment that I am experiencing continuity, but this belief is a fact of my current OM only. Being a Boltzmann b

... (read more)

[-]leogao3y*20

Imagine if aliens showed up at your doorstep and tried to explain to you that making as many paperclips as possible was the ultimate source of value in the universe. They show pictures of things that count as paperclips and things that don't count as paperclips. They show you the long rambling definition of what counts as a paperclip from Section 23(b)(iii) of the Declaration of Paperclippian Values. They show you pages and pages of philosophers waxing poetical about how paperclips are great because of their incredible aesthetic value. You would be like, "... (read more)

7Dagon3y

I think I'd be confused. Do they care about more or better paperclips, or do they care about worship of paperclips by thinking beings? Why would they care whether I say I would do anything for paperclips, when I'm not actually making paperclips (or disassembling myself to become paperclips)?

1leogao3y

I thought it would be obvious from context but the answers are "doesn't really matter, any of those examples work" and "because they will send everyone to the paperclip mines after ensuring there are no rebellious sentiments", respectively. I've edited it to be clearer.

[-]leogao3y*10

random thoughts. no pretense that any of this is original or useful for anyone but me or even correct

It's ok to want the world to be better and to take actions to make that happen but unproductive to be frustrated about it or to complain that a plan which should work in a better world doesn't work in this world. To make the world the way you want it to be, you have to first understand how it is. This sounds obvious when stated abstractly but is surprisingly hard to adhere to in practice.
It would be really nice to have some evolved version of calibration

... (read more)

3leogao3y

self self improvement improvement: feeling guilty about not self improving enough and trying to fix your own ability to fix your own abilities

1leogao3y

* Lots of things have very counterintuitive or indirect values. If you don't take this into account and you make decisions based on maximizing value you might end up macnamara-ing yourself hard. * The stages of learning something: (1) "this is super overwhelming! I don't think I'll ever understand it. there are so many things I need to keep track of. just trying to wrap my mind around it makes me feel slightly queasy" (2) "hmm this seems to actually make some sense, I'm starting to get the hang of this" (3) "this is so simple and obviously true, I've always known it to be true, I can't believe anyone doesn't understand this" (you start noticing that your explanations of the thing become indistinguishable from the things you originally felt overwhelmed by) (4) "this new thing [that builds on top of the thing you just learned] is super overwhelming! I don't think I'll ever understand it" * The feeling of regret really sucks. This is a bad thing, because it creates an incentive to never reflect on things or realize your mistakes. This shows up as a quite painful aversion to reflecting on mistakes, doing a postmortem, and improving. I would like to somehow trick my brain into reframing things somehow. Maybe thinking of it as a strict improvement over the status quo of having done things wrong? Or maybe reminding myself that the regret will be even worse if I don't do anything because I'll regret not reflecting in addition

[-]leogao3y10

Thought pattern that I've noticed: I seem to have two sets of epistemic states at any time: one more stable set that more accurately reflects my "actual" beliefs that changes fairly slowly, and one set of "hypothesis" beliefs that changes rapidly. Usually when I think some direction is interesting, I alternate my hypothesis beliefs between assuming key claims are true or false and trying to convince myself either way, and if I succeed then I integrate it into my actual beliefs. In practice this might look like alternating between trying to prove something ... (read more)

2Dagon3y

I think this pattern is common among intellectuals, and I'm surprised it's causing confusion. Are you labeling your exploratory beliefs and statements appropriately? An "epistemic status" note for posts here goes a long way, and in private conversation I often say out loud "I'm exploring here, don't take it as what I fully believe" in conversations at work and with friends.

3leogao3y

I think I do a poor job of labelling my statements (at least, in conversation. usually I do a bit better in post format). Something something illusion of transparency. To be honest, I didn't even realize explicitly that I was doing this until fairly recent reflection on it.

Moderation Log

Curated and popular this week

416Comments