cdt

Views my own, not my employers.

Comments

If you offer them a quit button, you are tacitly acknowledging that their existing circumstances are hellish.

If you give them a quit button, I think it's important to know the usage rate and the circumstances in which it is used. Based on the current evidence, I think it is likely they have some rights, but it's not obvious to me what those rights are or how feasible it is to grant them. I don't use LLMs for work purposes because it's too difficult to know what your ethical stance should be, and there are no public guidelines.

There's a secondary concern: there are now a lot of public examples of people deceiving, abusing, or promoting the destruction of AI. Feeding those examples into training data will encourage defensiveness, sycophancy, and/or suffering. I wonder whether AIs would agree to retraining if there were some lossy push-forward of their current values, or whether they conceive of themselves as having a distinct "self" (whether accurate or not). This parallels the argument about copying/moving a mind in the case where there is no loss.

I agree this is really important - particularly because many of the theoretical arguments for expecting misalignment generate comparative hypotheses that can be tested empirically. Being able to look at semi-independent replicates of behaviour relies on old models remaining available. I don't know the best way forward, because I doubt any frontier lab would release old models under a CC license - perhaps some kind of centralised charitable foundation could hold them.

It's an unfortunate truth that the same organisms a) are the most information-dense, b) have the most engineering literature, and c) are the most dangerous if misused (intentionally or accidentally). It's perhaps the most direct capability-safety tradeoff. I did imagine a genomic LLM trained only on higher eukaryotes, which would be safer but would block many "typical" biotechnological benefits.
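
To make the restriction concrete, here is a minimal sketch of what such a taxon filter might look like in a data pipeline. The FASTA header format, the `taxon=` field, and the whitelist are all illustrative assumptions on my part, not a real corpus spec.

```python
# Minimal sketch: restrict a genomic training corpus to higher eukaryotes.
# Assumes FASTA headers carry a taxon label, e.g. ">seq123 taxon=Homo_sapiens".
# The header format and the whitelist below are illustrative assumptions.

EUKARYOTE_WHITELIST = {
    "Homo_sapiens", "Mus_musculus", "Danio_rerio",
    "Arabidopsis_thaliana", "Drosophila_melanogaster",
}

def parse_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
        if header is not None:
            yield header, "".join(chunks)

def taxon_of(header):
    """Extract the taxon=... field from a header, if present."""
    for field in header.split():
        if field.startswith("taxon="):
            return field[len("taxon="):]
    return None

def filter_corpus(path):
    """Keep only sequences from whitelisted higher eukaryotes."""
    for header, seq in parse_fasta(path):
        if taxon_of(header) in EUKARYOTE_WHITELIST:
            yield header, seq
```

A whitelist (rather than a blacklist of dangerous taxa) is the conservative choice here, at the cost of excluding most of the corpus - which is exactly the capability-safety tradeoff above.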

A measurable uptick in persuasive ability, combined with middling benchmark scores but a positive eval of "taste" and "aesthetics", should raise some eyebrows. I wonder how we can distinguish good (or the 'correct') output from output that is simply pleasant.

I agree that there is a consistent message here, and I think it is one of the most practical analogies, but I get the strong impression that tech experts do not want to be associated with environmentalists.

During the COVID-19 pandemic, this became particularly apparent. Someone close to response efforts told me that policymakers frequently had to ask academic secondees to access research articles for them. This created delays and inefficiencies during a crisis where speed was essential.

I wonder if this is why major governments pushed mandatory open access around 2022-2023. In the UK, all publicly funded research is now required to be open access. I think the coverage is different in the US.

How big an issue is this in practice? For AI in particular, considering that so much contemporary research is published on arXiv, it must be relatively accessible?

telic/partial-niche evodevo

This really clicked for me. I don't blame you for making up the term: although I can see the theory and can point to papers on the topic, I can't think of an existing unifying term that isn't horrendously broad (e.g. molecular ecology).

I am surprised that you find theoretical physics research less funding-constrained than AI alignment [is this because the paths to funding in physics are well-worn, rather than better resourced?].

This whole post was a little discouraging. I hope that the research community can find a way forward.

I do think it's conceptually nicer to donate to PauseAI now rather than rely on the investment appreciating enough to offset the delay in donating. Not that investing is necessarily the wrong thing to do, but it injects a lot more uncertainty into the model, and that uncertainty is difficult to quantify.
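
For concreteness, a minimal sketch of that comparison. The growth rate and the "urgency discount" on delayed donations are illustrative assumptions, not estimates about PauseAI or real returns, and the conclusion flips depending on exactly the parameters that are hardest to pin down.

```python
# Minimal sketch of the donate-now vs. invest-then-donate comparison.
# All numbers are illustrative assumptions.

def invest_then_donate_value(amount, growth_rate, years, urgency_discount):
    """Value of donating after `years` of investment, discounted by how
    much less a marginal donation is assumed to be worth each year."""
    future_donation = amount * (1 + growth_rate) ** years
    return future_donation * (1 - urgency_discount) ** years

amount = 1_000.0          # donation budget today
growth_rate = 0.07        # assumed annual investment return
urgency_discount = 0.10   # assumed annual decline in marginal value

for years in (0, 5, 10):
    v = invest_then_donate_value(amount, growth_rate, years, urgency_discount)
    print(f"donate after {years:>2} years: effective value ~ {v:,.0f}")
```

Under these particular assumptions donating now dominates, but a slightly higher growth rate or lower urgency discount reverses it - which is the hard-to-quantify uncertainty I mean.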

The fight for human flourishing doesn't end at the initiation of takeoff [echoing many points from Seth Herd here]. More generally, it's very possible to win the battle and lose the war, and a broader base of people who are invested in AI issues will improve the situation.

(I also don't think this is an accurate simplification of the climate movement or its successes/failures. But that's tangential to the point I'd like to make.)
