All of cdt's Comments + Replies

cdt10

Adding a contrary stance to the other comments, I think there is a lot of merit to not keeping on with university, but only if you can find an opportunity you are happy with. Your post seems to imply that the only alternative to university is hedonism; if that's what you want then you should go for it, but I don't feel it is the only other option. You may also find it harder to enjoy yourself if you feel you were forced into that choice out of a fear of ruin.

cdt30

I thought I was the only one who struggled with that. Nice to see another example in the wild, and I hope that you find a new set of habits that works for you.

cdt30

This was a thought-provoking essay. I hope you consider fully mirroring posts here in the future, as I think you'll get more engagement.

1B Jacobs
Thanks! Yeah, I think I'll do that (in a couple weeks)
cdt30

I agree super-persuasion is poorly defined, and the comparison to hypnosis is probably false.

I was reading this paper on medical diagnoses with AI, and the fact that patients rate it significantly better than the average human doctor. Combined with all of the reports about things like Character.ai, I think this shows that LLMs are already superhuman at building trust, which is a key component of persuasion.

Part of this is that the reliable signals of trust between humans do not transfer between humans and AI. A human who writes 600 words back to your qu... (read more)

cdt10

in the absence of such incomplete research agendas we'd need to rely on AI's judgment more completely

 

This is a key insight and I think that operationalising or pinning down the edges of a new research area is one of the longest time-horizon projects there is. If the METR estimate is accurate, then developing research directions is a distinct value-add even after AI research is semi-automatable. 

cdt*20

I agree there is significant uncertainty in the moral patienthood of AI models, and so far there is a limited opportunity cost to not using them. It would be useful for some ethical guidelines to be put in place (some have already suggested guidelines against users deceiving models, e.g. by offering fake rewards), but from my point of view it's easiest to simply refrain from use right now.

cdt104

This may be because editing has become easier and faster to iterate on.

It's comparatively easy to identify sentences that are too long. Is it easy to identify sentences that are too short? You can always add an additional sentence, but finding examples where sentences themselves should be longer is much harder. With more editing cycles, this leads to shorter and shorter sentences.

1Hugo Villeneuve
I had the same insight. Older writing would be closer to oral form, with abundant digressions. Modern text processing allows for ever more edits, where the flow progressively gets structured into shorter sentences in the right order.
cdt10

If you offer them a quit button, you are tacitly acknowledging that their existing circumstances are hellish.

I think it's important to know, if you give them a quit button, the usage rate and the circumstances in which it is used. Based on the evidence now, I think it is likely they have some rights, but it's not obvious to me what those rights are or how feasible it is to grant them. I don't use LLMs for work purposes because it's too difficult to know what your ethical stance should be, and there are no public guidelines.

 

There's a seconda... (read more)

cdt40

I agree this is really important - particularly because I think many of the theoretical arguments for expecting misalignment provide empirical comparative hypotheses. Being able to look at semi-independent replicates of behaviour relies on old models being available. I don't know the best way forward because I doubt any frontier lab would release old models under a CC license - maybe some kind of centralised charitable foundation.

cdt10

It's an unfortunate truth that the same organisms a) are the most information-dense, b) have the most engineering literature, and c) are the most dangerous if misused (intentionally or accidentally). It's perhaps the most direct capability-safety tradeoff. I did imagine a genomic LLM trained only on higher eukaryotes, which would be safer but would forgo many "typical" biotechnological benefits.

cdt20

A measurable uptick in persuasive ability, combined with middling benchmark scores but a positive eval of "taste" and "aesthetics", should raise some eyebrows. I wonder how we can distinguish good (or the 'correct') output from output that is simply pleasant.

cdt24

I agree that there is a consistent message here, and I think it is one of the most practical analogies, but I get the strong impression that tech experts do not want to be associated with environmentalists.

cdt10

During the COVID-19 pandemic, this became particularly apparent. Someone close to response efforts told me that policymakers frequently had to ask academic secondees to access research articles for them. This created delays and inefficiencies during a crisis where speed was essential.

I wonder if this is why major governments pushed mandatory open access around 2022-2023. In the UK, all publicly funded research is now required to be open access. I think the coverage is different in the US.

How big of an issue is this in practice? For AI in particular, considering that so much contemporary research is published on arXiv, it must be relatively accessible?

3Adam Jones
I think this is less of an issue for technical AI papers. But I'm finding more governance researchers (especially people moving from other academic communities) seem intent on publishing in journals where policymakers can't read their stuff! I have also sometimes been blocked from easily sharing papers with governance friends because they are behind paywalls. I might see this more because at BlueDot we get a lot of people who are early on in their career transition and producing projects they want to publish somewhere.
cdt20

I am surprised that you find theoretical physics research less tight funding-wise than AI alignment [is this because the paths to funding in physics are well-worn, rather than better resourced?].

This whole post was a little discouraging. I hope that the research community can find a way forward.

cdt52

I do think it's conceptually nicer to donate to PauseAI now rather than rely on the investment appreciating enough to offset the time-delay in donation. Not that it's necessarily the wrong thing to do, but it injects a lot more uncertainty into the model that is difficult to quantify.

cdt32

The fight for human flourishing doesn't end at the initiation of takeoff [echo many points from Seth Herd here]. More generally, it's very possible to win the fight and lose the war, and a broader base of people who are invested in AI issues will improve the situation.

 

(I also don't think this is an accurate simplification of the climate movement or its successes/failures. But that's tangential to the point I'd like to make.)

cdt45

I think PauseAI would be more effective if it could mobilise people who aren't currently associated with AI safety, but from what I can see it largely draws from the same base as EA. It is important to involve as wide a section of society as possible in the x-risk conversation and activism could help achieve this.

cdt51

The most likely scenario by far is that a mirrored bacteria would be outcompeted by other bacteria and killed by achiral defenses due to [examples of ecological factors]

I think this is the crux of the different feelings around this paper. There are a lot of unknowns here. The paper does a good job of acknowledging this and (imo) it justifies a precautionary approach, but I think the breadth of uncertainty is difficult to communicate in e.g. policy briefs or newspaper articles.

cdt32

It's a good connection to draw - I wonder if increased awareness about AI is sparking increased awareness of safety concepts in related fields. It's a particularly good sign for awareness of, and action on, the safety concepts in the overlap between AI and biotechnology.

I think you're right that for mirror life there's very little benefit compared to the risks, which is not seen as true for AI - on top of the general truth that biotech is harder to monetise.

cdt20

Can you explain more about why you think [AGI requires] a shared feature of mammals and not, say, humans or other particular species?

3RussellThor
I think it is clear that if, say, you had a complete connectome scan and knew everything about how a chimp brain worked, you could scale it easily to get human+ intelligence. There are no major differences. Small mammal is my best guess; mammals/birds seem to be able to learn better than, say, lizards. Specifically the https://en.wikipedia.org/wiki/Cortical_column is important to understand; once you fully understand one, stacking them will scale at least somewhat well. Going to smaller scales/numbers of neurons, it may not need to be as much as a mammal, https://cosmosmagazine.com/technology/dishbrain-pong-brain-on-chip-startup/, perhaps we can learn enough of the secrets here? I expect not, but am only weakly confident. Going even simpler, we have the connectome scan of a fly now, https://flyconnecto.me/, and that hasn't led to major AI advances. So it's somewhere between fly and chimp - I'd guess mouse - that gives us the missing insight to get TAI.
cdt10

It's very field-dependent. In ecology & evolution, advisor-student fit is very influential and most programmes are direct admit to a certain professor. The weighting seems different for CS programs, many of which make you choose an advisor after admission (my knowledge is weaker here).

In the UK it's more funding dependent - grant-funded PhDs are almost entirely dependent on the advisor's opinion, whereas DTPs/CDTs have different selection criteria and are (imo) more grades-focused.

cdt10

From discussing AI politics with the general public [i.e. not experts], it seems that the public perception of AI progress is bifurcating along two parallel lines:

A) Current AI progress is sudden and warrants a response (either acceleration or regulation)

B) Current AI progress is a flash-in-the-pan or a nothingburger.

(This is independent from responding to hypothetical AI-in-concept.)

These perspectives are largely factual rather than ideological. In conversation, the active tension between these two incompatible perspectives is really obvious. It makes it har... (read more)

cdt20

It is worth noting that UKRI is in the process of changing its language to Doctoral Landscape Awards (replacing DTPs) and Doctoral Focal Awards (replacing CDTs). The announcements for BBSRC and NERC have already been made, but I can't find what EPSRC is doing.

cdt21

I agree that evolutionary arguments are frequently confused and oversimplified, but your argument is proving too much.

[the difference between] AI and genetic code is that genetic code has way less ability to error-correct than basically all AI code, and it's in a weird spot of reliability where random mutations are frequent enough to drive evolution, but not so frequent as to cause organisms to outright collapse within seconds or minutes.

This "weird spot of reliability" is itself an evolved trait, and even with the effects of mutation rate variation betwee... (read more)

cdt20

There's a connection to the idea of irony poisoning here, and I do not think it is good for the person in question to pretend to hold extremist views. This is a parallel issue with the fact that it's terrible optics and creates a difficult tension with this website's newfound interest in doing communications/policy/outreach work.

5Noosphere89
I'd argue one of the issues with a lot of early social media moderation policies was treating ironic expressions of beliefs that would normally be banned as not ban-worthy, because, as it turned out, ironic belief in extremism either was never really ironic or turned into the real thing over time.
cdt41

Currently I'm not convinced that the memetic analogy has done more to clarify than to occlude cultural evolution/opinion dynamics. That's not to say that work in genetics is useless, but I think that the terminology has taken precedence over what the actual concepts mean, and I read a lot of conversations that feel like people just trading the information that they read The Selfish Gene 40 years ago.

There's certainly scope for an applied "memetics" but it's really crying out for a good predictive (even if simplistic) model.
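
To gesture at what I mean by a simplistic but predictive model, here is a toy sketch of my own (not drawn from the memetics literature; the function and parameter names are purely illustrative): treat meme adoption as an SIR-style contagion whose single crisp prediction is a spread threshold.

```python
# A minimal sketch, assuming meme adoption behaves like an SIR-style contagion.
# s = never-adopted, i = current adopters, r = former adopters.
def simulate_meme(beta=0.3, gamma=0.1, n_steps=200, s0=0.99, i0=0.01):
    """beta = adoption rate per contact, gamma = abandonment rate."""
    s, i, r = s0, i0, 0.0
    trajectory = []
    for _ in range(n_steps):
        new_adopters = beta * s * i   # susceptibles converted by current adopters
        dropouts = gamma * i          # adopters who abandon the meme
        s, i, r = s - new_adopters, i + new_adopters - dropouts, r + dropouts
        trajectory.append(i)
    return trajectory

# The model's one prediction: the meme takes off only if beta/gamma > 1
# (for a mostly susceptible starting population).
print("peak adoption (beta/gamma = 3):", max(simulate_meme(0.3, 0.1)))
print("peak adoption (beta/gamma = 0.5):", max(simulate_meme(0.05, 0.1)))
```

Even a toy like this makes falsifiable claims (a threshold, a peak, a decay), which is the bar I'd want any applied "memetics" to clear.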

1AtillaYasar
But I'm curious what you think is a better word or term for referring to "iteration on ideas", as this is one of the things I'm actively trying to sort out by writing this post.
1AtillaYasar
It's just a pointer to a concept, I'm not relying on the analogy to genes.
cdt30

I've noticed they perform much better on graduate-level ecology/evolution questions (in a qualitative sense - they provide answers that are more 'full' as well as technically accurate). I think translating that into a "usefulness" metric is always going to be difficult though.

cdt10

I would have found it helpful for your report to include a ROSES-type diagram or other flowchart showing the steps in your paper collation. This would bring it closer in line with other scoping reviews and would have made it easier to understand your methodology.

cdt40

Linguistic Drift, Neuralese, and Steganography

In this section you use these terms as if there's a body of research underneath them. I'm very interested in understanding this behaviour but I wasn't aware it was being measured. Is anyone currently working on models of linguistic drift, or measuring it, with manuscripts you could link?

5RohanS
Max Nadeau recently made a comment on another post that gave an opinionated summary of a lot of existing CoT faithfulness work, including steganography. I'd recommend reading that. I'm not aware of very much relevant literature here; it's possible it exists and I haven't heard about it, but I think it's also possible that this is a new conversation that exists in tweets more than papers so far.
* Paper on inducing steganography and combatting it with rephrasing: Preventing Language Models From Hiding Their Reasoning
* Noting a difference between steganography and linguistic drift: I think rephrasing doesn't make much sense as a defense against strong linguistic drift. If your AI is solving hard sequential reasoning problems with CoT that looks like "Blix farnozzium grellik thopswen…," what is rephrasing going to do for you?
* Countering Language Drift via Visual Grounding (Meta, 2019)
  * I haven't looked at this closely enough to see if it's really relevant, but it does say in the abstract "We find that agents that were initially pretrained to produce natural language can also experience detrimental language drift: when a non-linguistic reward is used in a goal-based task, e.g. some scalar success metric, the communication protocol may easily and radically diverge from natural language." That sounds relevant.
* Andrej Karpathy suggesting that pushing o1-style RL further is likely to lead to linguistic drift: https://x.com/karpathy/status/1835561952258723930?s=46&t=foMweExRiWvAyWixlTSaFA
* In the o1 blog post, OpenAI said (under one interpretation) that they didn't want to just penalize the model for saying harmful or deceptive plans in the CoT because that might lead it to keep having those plans but not writing them in CoT.
  * "Assuming it is faithful and legible, the hidden chain of thought allows us to "read the mind" of the model and understand its thought process. For example, in the future we may wish to monitor the chain of thought for signs of m
cdt10

My impression is that's a little simplistic, but I also don't have the best knowledge of the market outside WGS/WES and related tools. That particular market is a bloodbath. Maybe there's better scope in proteomics/metabolomics/stuff I know nothing about.

cdt10

My impression is that much of this style of innovation is happening inside research institutes and then diffusing outward. There are plenty of people doing "boring" infrastructure work at the Sanger Institute, EMBL-EBI, etc. And you all get it for free! I can however see that on-demand services for biotech are a little different.

1Abhishaike Mahajan
That's very true, but I do think the translation to privatization can be useful! It helps push for better UI/UX, better support, and even better infra work. This isn't true across the board - it's hard to imagine a company creating something like MMSeq - but I have to imagine it's true in other areas.
cdt40

This fail-state is particularly worrying to me, although it is not obvious whether there is enough time for such an effect to actually intervene on the future outcome.

cdt10

Are you aware of anyone else working on the same topic?

This is not novel to Hanson; it's been a staple of (neo)reactionary/conservative thought for millennia.

cdt10

I was reading the UK National Risk Register earlier today and thinking about this. It's notable to me that the top-level disaster severity has a very low cap of ~thousands of casualties, or billions in economic loss. It does note in the register, though, that AI is a chronic risk being managed under a new framework (which I can't find precedent for).

cdt10

I do think this comes back to the messages in On Green, and also why the post went down like a cup of cold sick - rationality is about winning. Obviously nobody on LW wants to "win" in the sense you describe, but on the margin, I think, they'd choose more winning over more harmony.

The future will probably contain less of the way of life I value (or something entirely orthogonal), but then that's the nature of things.

2Noosphere89
I think two cruxes dominate the discussion here:
1. Will a value lock-in event happen, especially soon, in a way such that once the values are locked in, it's basically impossible to change them?
2. Is something like the vulnerable world hypothesis correct about technological development?
If you believed 1 or 2, I could see why people disagreed with Sarah Constantin's statement on here.
cdt30

Your general argument rings true to my ears - except the part about AI safety. It is very hard to interact with AI safety without entering the x-risk sphere, as shown by this piece of research by the Cosmos Institute, where the x-risk sphere is almost two-thirds of total funding (I have some doubts about the accounting). Your argument about Mustafa Suleyman strikes me as a "just-so" story - I do wish it were replicable, but I would be surprised, particularly given AI safety's sense of urgency.

I'm here because truly there is no better place, and I mean that i... (read more)

cdt10

I think you're getting at something fairly close to the Piranha theorem from a different (ecological?) angle.

2tailcalled
There's overlap; the Piranha theorem gives you a tradeoff between frequency and effect size. But it's missing the part where logarithmic perception means you only care about the rare factors with large effect and not the common factors with small effect.
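
For concreteness, here is the simplest version of that frequency/effect-size tradeoff I'm aware of - a sketch assuming mutually uncorrelated factors and a variance-one outcome; the published piranha results are more general.

```latex
% A hedged sketch of the frequency/effect-size tradeoff, in its simplest form.
Suppose $X_1,\dots,X_n$ are mutually uncorrelated factors and $Y$ is an outcome
with $\operatorname{Var}(Y)=1$. Writing $\rho_i = \operatorname{corr}(X_i, Y)$,
the $R^2$ of the best linear predictor of $Y$ from the $X_i$ satisfies
\[
  \sum_{i=1}^{n} \rho_i^2 \;=\; R^2 \;\le\; 1 ,
\]
so at most $1/\varepsilon^2$ of the factors can have $|\rho_i| \ge \varepsilon$:
large effects must be rare, and common factors must have small effects.
```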
cdt20

Advice for journalists was a bit more polemical, which I think naturally leads to more engagement. But I'd like to say that I strongly upvoted the mapping discussions post and played around with the site quite a bit when it was first posted - it's really valuable to me.

Karma's a bit of a blunt tool - yes, I think it's good to have posts with broad appeal, but some posts are going to be comparatively more useful to a smaller group of people, and that's OK too.

1Nathan Young
Sure but shouldn't the karma system be a prioritisation ranking, not just "what is fun to read?"
cdt30

Your points are true and insightful, but you've written them in a way that won't gain much cachet here. 

I wrote a similar piece to the Cosmos Institute competition, which hopefully I can share here when that is finished, and maybe we can bounce the idea off each other?

1wahala
That would be great. You can send a DM.
cdt30

I think this effect will be more widespread than targeting only already-vulnerable people, and it is particularly hard to measure because the causes will be decentralised and the effects will be diffuse. I predict it will be a larger problem if, in the run-up between narrow AI and ASI, we have a longer period of necessary public discourse and decision-making; if the period is very short then it doesn't matter. It may also not affect many people, depending on how much market penetration AI chatbots achieve before takeoff.

cdtΩ010

This is not an obvious continuation of the prompt to me - maybe there are just a lot more examples of explicit refusal on the internet than there are in (e.g.) real life.

2Arthur Conmy
My current best guess for why base models refuse so much is that "Sorry, I can't help with that. I don't know how to" is actually extremely common on the internet, based on discussion with Achyuta Rajaram on twitter: https://x.com/ArthurConmy/status/1840514842098106527 This fits with our observations about how frequently LLaMA-1 performs incompetent refusal
cdt41

thinking maybe we owe something to our former selves, but future people probably won't think this

This is a very strong assertion. Aren't most people on this forum, when making present claims about what they would like to happen in the future, trying to form this contract? (This comes back to the value lock-in debate.)

cdtΩ010

Is there a reason to expect this kind of behaviour to appear from base models with no fine-tuning?

2Cleo Nardo
the base model is just predicting the likely continuation of the prompt. and it's a reasonable prediction that, when an assistant is given a harmful instruction, they will refuse. this behaviour isn't surprising.
cdt10

Unfortunately you did nerdsnipe me with the 'biologists think' statement so I am forced to keep replying!

It's worth noting that the original derivations of natural selection do use absolute fitness - relative fitness is simply a reparameterization when you have constant N (source: any population genetics textbook). This was why I brought up density-dependent selection, as under that framework N (and s) is changing, and selection in those circumstances is more complicated. 
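
A minimal sketch of that reparameterization point, for two haploid types under the standard constant-N selection model (nothing here is specific to this exchange):

```latex
% Two types A and B with absolute fitnesses (expected offspring numbers) W_A, W_B.
With constant $N$, the expected frequency of $A$ after one generation is
\[
  p' \;=\; \frac{p\,W_A}{p\,W_A + (1-p)\,W_B}
       \;=\; \frac{p\,w}{p\,w + (1-p)}, \qquad w \equiv \frac{W_A}{W_B},
\]
so rescaling both absolute fitnesses by any constant $c>0$ leaves $p'$ unchanged:
only the relative fitness $w$ enters the frequency dynamics. Once $N$ itself
depends on the absolute fitnesses (density-dependent growth), that cancellation
no longer holds, which is where density-dependent selection comes in.
```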

In fact, even under typical models, relative fitness and absolute fitness show i... (read more)

cdt30

I don't think contemporary theory has ignored this - see recent theories of density-dependent selection here: (article making the same point), (review). The fundamental issue you're hinging on is that absolute population growth (most effective exploitation of resources) is an ecological concept, not an evolutionary one, and population ecology theory is less well-known outside its field than population genetic theory.
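
To make the density-dependence point concrete, here is a toy sketch of my own (not taken from the linked articles; names and parameters are illustrative): under logistic competition, which genotype is favoured depends on how crowded the population is.

```python
# A toy sketch of density-dependent selection, assuming simple logistic competition:
# each genotype grows at rate r_i, discounted by total density relative to its own K_i.
import numpy as np

def step(n, r, K, dt=0.01):
    """One Euler step of logistic competition for the vector of genotype densities n."""
    total = n.sum()
    return n + dt * r * n * (1.0 - total / K)

r = np.array([2.0, 1.0])       # genotype 0 grows faster when rare ("r-selected")
K = np.array([500.0, 1000.0])  # genotype 1 tolerates crowding better ("K-selected")

n = np.array([10.0, 10.0])     # start at low density
for t in range(50_000):
    n = step(n, r, K)
    if t == 100:
        # at low density the fast grower increases in frequency
        print("low density, freq of fast grower:", n[0] / n.sum())
# near carrying capacity the crowd-tolerant genotype takes over
print("near carrying capacity, freq of fast grower:", n[0] / n.sum())
```

The "fitter" genotype at low density loses outright once the population saturates, which is why absolute growth rate alone is an ecological rather than an evolutionary summary.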

1Lorec
As far as I can tell, density-dependent selection is an entirely different concept from what I'm trying to get at here, and operates entirely within the usual paradigm that says natural selection is always optimizing for relative fitness. Yes, the authors are trying to say, "We have to be careful to pay attention to the baseline that selection is relative to", but I think biologists were always implicitly careful to pay attention to varying baseline populations - which are usually understood to be species, and not ecological at all. I'm trying to take a step back and say, look, we can't actually take it for granted that selection is optimizing for reproductive fitness relative to ANY baseline in the first place; it's an empirical observation external to the theory that "what reproduces, increases in prevalence" that evolution within sexual species seems to optimize for %-prevalence-within-the-species, rather than absolute size of descendants-cone.
cdt30

My understanding was that the typical explanation is antagonistic pleiotropy, but I don't know whether that's the consensus view.

This seems to have the name 'pathogen control hypothesis' in the literature - see review. I think it has all the hallmarks of a good predictive hypothesis, but I'd really want to see some simulations of which parameter scenarios induce selection this way. 
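
As a gesture at the kind of simulation I mean, here is a toy calculation of my own for the antagonistic-pleiotropy side (not the pathogen control hypothesis itself, which would need kin or spatial structure to model properly; all parameters are made up): an allele that boosts early fecundity at the cost of late-life survival is favoured only when extrinsic mortality is high enough.

```python
# A toy sketch, assuming selection compares lifetime reproductive success
# R0 = sum over ages of (probability of surviving to that age) * (fecundity at that age).
def R0(base_survival, fecundity, extra_late_mortality=0.0, late_age=10, max_age=30):
    l = 1.0      # probability of surviving to the current age
    total = 0.0
    for age in range(max_age):
        total += l * fecundity(age)
        s = base_survival - (extra_late_mortality if age >= late_age else 0.0)
        l *= max(s, 0.0)
    return total

# Mutant allele: +20% fecundity before age 10, but extra mortality afterwards.
for base_survival in (0.8, 0.97):
    wt = R0(base_survival, fecundity=lambda a: 1.0)
    mut = R0(base_survival, fecundity=lambda a: 1.2 if a < 10 else 1.0,
             extra_late_mortality=0.3)
    print(f"baseline survival {base_survival}: wild type R0 = {wt:.2f}, mutant R0 = {mut:.2f}")
# With high extrinsic mortality (survival 0.8) the mutant wins, because little
# expected reproduction remains after age 10; with low extrinsic mortality
# (survival 0.97) the late-life cost dominates and the mutant loses.
```

Something similar, but with explicit kin structure and pathogen transmission, is what I'd want to see for the pathogen control hypothesis.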

2lemonhope
The keywords are much appreciated. That second link is only from 2022! I wonder if anybody suggested this in like 1900. Edit: some of the citations are from very long ago.
cdt10

I first learned about Arcadia from the https://dynamicecology.wordpress.com/ blog, as an "evolutionary biology" startup. When I looked, they only had their fungal capsid negative result published.

I'm quite optimistic about the potential for data mining from phylogenomic inference, but I wouldn't have described any of their current projects as "blue-sky" or "high variance" as mentioned in the post. I'm not sure that generating data, competing with large government-funded research hubs, is effective. Maybe there's scope away from human health research areas which... (read more)

1Abhishaike Mahajan
Yeah, Arcadia is definitely less on the blue-sky/high-variance end of the spectrum, and closer to "better research in underserved areas of biology". I was pondering adding Altos Labs here, alongside Retro Bio and Newlimit, but they do feel a bit different from the others given the strong focus on a for-profit system (Arcadia's for-profit focus is a bit more opportunistic).
cdt10

There's an implied assumption that when you lose parts of society through a bottleneck, you can always recreate them with high fidelity. It seems plausible that some bottleneck events could "limit humanity's potential", since choices may rely on those lost values, and not all choices are exchangeable in time. (This has connections both to the long reflection and to the rich shaping the world in their own image.)

As an aside, the bottleneck paper you're referring to is pretty contentious. I personally find it unlikely that no other demographic model dete... (read more)

2Noosphere89
I do think this assumption will become more true in the 21st century, and indeed AI and robotics progress making the assumption more true is the main reason that if a catastrophe happens, this is how it happens. Re the bottleneck paper, I'm not going to comment any further, since I just wanted to provide an example.