I'm the chief scientist at Redwood Research.
To be clear, I'm sympathetic to some notion like "AI companies should generally be responsible in the sense of having notably higher benefits than costs (such that they could e.g. buy insurance for their activities)", which likely implies that you need jailbreak robustness (or similar) once models are somewhat more capable of helping people make bioweapons. More minimally, I think having jailbreak robustness while also giving researchers helpful-only access probably passes a "normal" cost-benefit analysis at this point relative to not bothering to improve robustness.
But, I think it's relatively clear that AI companies aren't planning to follow this sort of policy when existential risks are actually high, as it would likely require effectively shutting down (and these companies pretty clearly aren't planning to shut down even if reasonable impartial experts would think the risk is reasonably high). (I think this sort of policy would probably require getting cumulative existential risk below 0.25% or so given the preferences of most humans. Getting risk this low would require substantial novel advances that seem unlikely to occur in time.) This sort of thinking makes me more indifferent and confused about demanding that AI companies behave responsibly about relatively lower costs (e.g. $30 billion per year), especially when I expect this directly trades off with existential risks.
(There is the "yes, (deontological) risks are high, but we're net decreasing risks from a consequentialist perspective" objection (aka the ends justify the means), but I think this also applies in the opposite way to jailbreak robustness, where I expect that measures like removing prefill net increase risks in the long term while reducing deontological/direct harm now.)
I voted disagree because I don't think this measure is on the cost-robustness Pareto frontier, and I also generally don't think AI companies should prioritize jailbreak robustness over other concerns except as practice for future issues (and implementing this measure wouldn't be helpful practice).
Relatedly, I also tentatively think it would be good for the world if AI companies publicly deployed helpful-only models (while still offering a non-helpful-only model). (The main question here is whether this sets a bad precedent and whether future, much more powerful models will still be deployed helpful-only when they really shouldn't be due to setting bad expectations.) So, this makes me more indifferent to deploying (rather than just testing) measures that make models harder to jailbreak.
(Inexpert speculation, please forgive my errors.)
I think I basically agree with the bottom line here, but I think one point seems a bit off to me.
Another major issue is that a full or near-full land value tax would likely establish a troubling precedent by signaling that the government has the appetite to effectively confiscate an additional category of assets that people have already acquired long ago through their labor and purchases.
[...]
Beyond setting a harmful precedent that could influence people's future behavior, a land value tax also creates a major disruption to people's current financial plans, particularly for those who have spent decades developing strategies to preserve their wealth.
I agree this applies if the land value tax operates via the government effectively owning all land from now on (e.g. by charging annual rent proportional to unimproved value; in principle, the government could instead do a one-time tax equivalent to the land value and then tax increases in value). Another way to put this is that this version of an LVT is effectively the same as doing a 100% one-time capital levy on a particular type of asset, and it has the corresponding issues that capital levies have, as well as increasing variance by targeting just one type of asset.
One proposal for (partially) getting around this issue is to say that the government owns increases in land value from the time the LVT is introduced, but does not tax existing land value. Precisely, suppose your land is currently worth $1 million. If your land stays at the same value, you pay nothing. If the land appreciates to $1.1 million, you have to pay the government $0.1 million that year. (Or, alternatively, the government charges ongoing rent on the $0.1 million share it now "owns".) With this approach, it's important for the government to give money back if land depreciates, to avoid disincentivizing variance (similar to the sort of proposal discussed here).
(I also think that an LVT should be more like 80% than 100%, but I've used 100% in the numbers above for simplicity.)
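To make the mechanics concrete, here's a minimal sketch of my own (not from the original discussion; the function name and defaults are just for exposition, and the 100% and 80% rates are the ones discussed above) of the appreciation-only scheme, including refunds when land depreciates:

```python
# Toy sketch (my own illustration) of an appreciation-only LVT: only increases
# in assessed unimproved land value are taxed, and depreciation generates a
# refund so that variance isn't penalized.

def appreciation_lvt_payment(prev_value: float, current_value: float, rate: float = 1.0) -> float:
    """Tax owed this year on the change in unimproved land value (negative = refund)."""
    return rate * (current_value - prev_value)

# The $1 million example from above, at the 100% rate used there:
print(appreciation_lvt_payment(1_000_000, 1_100_000))            # 100000.0 owed
# At the ~80% rate suggested in the parenthetical:
print(appreciation_lvt_payment(1_000_000, 1_100_000, rate=0.8))  # 80000.0 owed
# Depreciation from $1.1M to $1.05M results in a refund:
print(appreciation_lvt_payment(1_100_000, 1_050_000))            # -50000.0 (refund)
```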
Of course, this means the LVT brings in even less revenue in the short term if the tax is implemented in a rental style. If land appreciates at 4% per year on average, then in 30 years[1], the government will own about 70% of land by value via this type of LVT, so it shouldn't take that long before this is equivalent to the government owning the unimproved land value outright. (I think you'd probably want to gradually increase the tax after testing it out for a bit to reduce disruption.)
If we imagine that the government directly charges for increases in land value (or sells off the rental rights to get the revenue immediately), then government revenue increases by total_land_value * appreciation * LVT_rate, which is perhaps (for the US) $23 trillion * 4% * 80% = $0.7 trillion. Current tax revenues are about $4.5 trillion in the US, so this would be about 15% of revenue, which isn't that bad: you certainly still need other taxes, but you can displace a substantial amount.
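For what it's worth, here's the back-of-envelope arithmetic behind the ~70% and ~$0.7 trillion figures above (my own check; the 4% appreciation, $23 trillion land value, $4.5 trillion revenue, and 80% rate are taken as stated in the comment, not independently sourced):

```python
# Share of land value the government ends up "owning" after t years if it
# captures all appreciation from now on: 1 - 1 / (1 + g)^t.
years = 30
appreciation = 0.04
gov_share = 1 - 1 / (1 + appreciation) ** years
print(f"{gov_share:.0%}")  # ~69%, i.e. roughly the 70% figure above

# Annual revenue if increases in value are taxed directly at an 80% rate.
total_land_value = 23e12
lvt_rate = 0.8
revenue = total_land_value * appreciation * lvt_rate
print(f"${revenue / 1e12:.2f} trillion")  # ~$0.74 trillion
print(f"{revenue / 4.5e12:.0%} of the ~$4.5T in current US tax revenue")  # ~16%, i.e. the "about 15%" above
```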
Ok, but isn't this still a bad precedent and disruptive, just to a lesser extent? After all, people were expecting that they would get the appreciation in land value and the government is taking that. I think it sets no more of a bad precedent and is no more disruptive than increasing income or capital gains taxes (analogously, people were expecting to see returns on their investment in education or their capital investments). Quantitatively, there is something different about taxing a certain type of appreciation at a high rate (e.g. 80%) rather than increasing taxes more marginally, but it doesn't seem that bad.
I think this type of land value tax, where you tax just appreciation, seems basically strictly better than income or capital gains taxation, even compared to a well-implemented capital gains tax. (That is, putting aside evaluation issues and assuming you can get kinda reasonable exemptions for value discovery.) It still seems bad to suddenly increase a tax to a very high amount due to disruptiveness, but certainly no worse than if you got $0.7 trillion via suddenly increasing some other tax.
I expect transformative AI in less than 30 years and generally think this makes discussion of mundane governance much more confusing, but in this comment, I'm ignoring this. ↩︎
Alexander is replying to John's comment (asking him if he thinks these papers are worthwhile); he's not replying to the top level comment.
Yeah, I meant that I use "AI x-safety" to refer to the field overall and "AI x-safety technical research" to specifically refer to technical research in that field (e.g. alignment research).
(Sorry about not making this clear.)
There maybe should be a standardly used name for the field of generally reducing AI x-risk
I say "AI x-safety" and "AI x-safety technical research". I potentially cut the "x-" to just "AI safety" or "AI safety technical research".
I think this tag should be called "AI Psychology" or "Model Psychology" as LLM is a bit of an arbitrary and non-generalizable term.
(E.g., suppose 99% of compute in training was RL, should it still be called an LLM?)
It has spoilers, though they aren't that big of spoilers, I think.
I expect that as you increase AI R&D labor acceleration, compute becomes a larger and larger bottleneck. So, the first doubling of acceleration has less of a compute bottleneck tax than the 4th doubling.
This can be equivalently thought of in terms of "tax brackets", though perhaps this was a confusing way to put it.
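To illustrate the qualitative point, here's a toy sketch using a simple Amdahl's-law-style model (this is my own assumption for illustration, with a made-up 30% compute-bound fraction; it isn't the model from the comment): each successive doubling of labor speedup buys less overall acceleration, i.e. the "bottleneck tax" grows.

```python
# Toy illustration: assume a fixed fraction of AI R&D progress is bottlenecked
# on compute, which faster labor doesn't speed up (the 0.3 is a made-up number).

def overall_acceleration(labor_speedup: float, compute_bound: float = 0.3) -> float:
    """Overall R&D speedup when only the non-compute-bound fraction is accelerated."""
    return 1 / (compute_bound + (1 - compute_bound) / labor_speedup)

for doubling in range(1, 5):
    labor = 2 ** doubling
    overall = overall_acceleration(labor)
    # The "tax": how much of the labor speedup is lost to the compute bottleneck.
    print(f"labor x{labor:>2} -> overall x{overall:.2f} (only {overall / labor:.0%} of the labor speedup)")
```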
Some notes: