One thing I like about this is making the actual difficulty deltas between colleges more felt/legible/concrete (by anyone who takes the exams). What I might do in your system at my IQ level (which is pretty high outside of EA but pretty mediocre inside EA) is knock out a degree at an easy university to get warmed up, then study for years for a degree at a hard school[1].
In real life, I can download or audit courses from whatever university I want, but I don't know what the grading curve is, so when 5/6 exercises are too hard I don't know if that's because I'm dumb or if 1/6 is B+ level performance. This is a way that the current system underserves a credential-indifferent autodidact. It's really hard to know how difficult a course is supposed to be when you're isolated from the local conditions that make up the grading curve!
Another thing I like about your system is tutoring markets separated from assessment companies. Why is it that we bundle gatekeeping/assessment with preparation? Unbundling might help maintain objective standards and get rid of problems that look like "the professor feels too much affection for the student to fail them".
This is all completely separate from why your proposal is a hard social problem / a complete nonstarter, which is that I don't think the system is "broken" right now. There's an idea you might pick up if you read the smarter leftists, which is that credentialism, especially at elite levels, preserves privilege and status as a first-class use case. This is not completely false today, not least because the further you go back in time in western universities the truer it is.
My prior, 15 years ago, looked like "stanford has a boating scholarship, so obviously selectivity is a wealth/status thing and not reflective of scholarship or rigor", so the fact that I now believe "more selective colleges have harder coursework" means I've seen a lot of evidence. It pains me, believe me, but reality doesn't care :) ↩︎
I get pretty intense visceral outrage at overreaches in immigration enforcement; it just seems the height of depravity. I've looked for a lot of different routes to mental coolness over the last decade (since Trump started his speeches), and they mostly amount to staying busy and distracted. It just seems like a really cost-ineffective kind of activism to get involved in. Bankrolling lawyers for random people isn't really in my action space, and if it were I'd have opportunity cost to consider.
Seems like there's more prior literature than I thought: https://en.wikipedia.org/wiki/Role-based_access_control
My main aim is to work on "hardening the box", i.e. eliminating software bugs so containment schemes don't fail for preventable reasons. But in the famous 4o system card example, the one that looks a little like docker exfiltration, the situation arose from user error: wild guess, in `compose.yaml` or the shell script invoking `docker run`.
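To gesture at the class of error I mean, here's an illustrative misconfiguration in Nix (not the actual incident setup, which I don't know; the values are made up): exposing the Docker daemon's API on an unauthenticated TCP socket, which gives anything on the network root-equivalent control of the host.

```nix
# Illustrative only: the classic Docker misconfiguration where the
# daemon listens on an unauthenticated TCP socket. Anything that can
# reach port 2375 can start privileged containers, i.e. it effectively
# has root on the host.
virtualisation.docker = {
  enable = true;
  daemon.settings.hosts = [
    "unix:///var/run/docker.sock"
    "tcp://0.0.0.0:2375" # no TLS: root-equivalent remote API
  ];
};
```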
On a Linux machine, here's an example Nix file:
```nix
users.users =
  let
    authorized-key-files = [
      "${keyspath}/id_server_ed25519.pub"
      "${keyspath}/id_qd_ed25519.pub"
    ];
  in
  {
    # the agent account: a system user with minimal group membership
    unpermissioneduser = {
      isNormalUser = false;
      extraGroups = [ "docker" ];
      description = "AgentID=claude-0x0000";
    };
    # the human delegator: a full interactive account
    coreuser = {
      isNormalUser = true;
      extraGroups = [
        "wheel"
        "networkmanager"
        "docker"
        "video"
      ];
      home = "/home/coreuser";
      description = "Core User (delegator of unpermissioneduser)";
      shell = pkgs.fish;
      openssh.authorizedKeys.keyFiles = authorized-key-files;
    };
    root = {
      openssh.authorizedKeys.keyFiles = authorized-key-files;
      shell = pkgs.fish;
    };
  };
```
You can see that `unpermissioneduser` has fewer abilities than `coreuser`. So you can imagine I just say that unpermissioneduser is an agent and coreuser is the human delegator.
Nix is simply a fully declarative way to do standard Linux permissioning (a feature not in the snippet is allocating chmod/chown information for particular users to particular parts of the filesystem; a sketch of that is below). There are no conceptual leaps from the status quo.
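For instance, here's a minimal sketch of that missing feature, assuming NixOS's `systemd.tmpfiles.rules` mechanism (the paths are invented for illustration):

```nix
# Declaratively allocate ownership and mode bits for parts of the
# filesystem to particular users.
systemd.tmpfiles.rules = [
  # type path mode owner group age
  "d /srv/delegator 0750 coreuser users -"
  "d /srv/agent 0700 unpermissioneduser users -"
];
```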
Is Linux all that great when you want to keep track of who's a delegatee and who's a delegator? Do we need a more graph-flavored version of Linux userspace/permissions? I'm talking about once we're reasoning about proliferating agents and their permissions on various machines. Linux groups do not support inheritance, but a user can be a member of many groups. So you could in principle MVP a graph-based permissions DSL (perhaps in Nix) on top of the existing Linux user/group ontology (80% confident), but it could be hairier than making a new ontology. idk.
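Something like this minimal sketch, where the role names, the `inherits` field, and the `closure` helper are all invented for illustration:

```nix
# A graph-flavored permissions layer on top of ordinary Linux groups:
# roles form a DAG via `inherits`, and each user's transitive role
# closure is flattened into plain group memberships.
let
  roles = {
    reader = { groups = [ "video" ]; inherits = [ ]; };
    operator = { groups = [ "docker" ]; inherits = [ "reader" ]; };
    delegator = { groups = [ "wheel" "networkmanager" ]; inherits = [ "operator" ]; };
  };
  # walk the inherits DAG, collecting every group along the way
  closure = role:
    let r = roles.${role};
    in r.groups ++ builtins.concatLists (map closure r.inherits);
in
{
  users.users.coreuser.extraGroups = closure "delegator";
  users.users.unpermissioneduser.extraGroups = closure "operator";
}
```

The delegation edges stay legible in the DSL even though Linux only ever sees the flattened groups.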
> Examples of promising risk-targeted applications
This section reeks of the guaranteed safe AI agendas; a lot of agreement from me. For example, using formal methods to harden any box we try to put the AI in is a kind of defensive acceleration that doesn't work (too expensive) until certain pre-ASI stages of development. I'm working on formal verification agents along these lines right now.
@Tyra Burgess and I wrote down a royalty-aware payout function yesterday:
For a type $T$, let $\overleftarrow{T}$ be the "left closure under implication", or the admissible antecedents: i.e., the set of all the antecedents $A$ in the public ledger such that $A \to T$. $\mathrm{price}(A)$ is the price that a proposition $A$ was listed for (admitting summing over duplicates). Suppose players $p_1, \dots, p_n$ have previously proven $A_1, \dots, A_n$, and $\overleftarrow{T}$ is none other than the set of all $A_i$ from $i = 1$ to $n$.
We would like to fix an $\epsilon \in (0, 1)$ (could be fairly big) and say that the royalty-aware payout given $\epsilon$ of $\mathrm{price}(T)$, upon an introduction of $T$ to the database, is such that, where $d_i$ is the number of implications separating $A_i$ from $T$, $\epsilon^{d_i} \cdot \mathrm{price}(A_i)$ is paid out to each player $p_i$.
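A toy instance of the decay property (using the reconstruction above, so treat the exact rule as illustrative):

```latex
% Toy ledger: A_2 -> A_1 and A_1 -> T, so d_1 = 1 and d_2 = 2.
% Equal listing prices, price(A_1) = price(A_2) = 4, and eps = 1/2.
% On introduction of T:
\[
  \mathrm{payout}(p_1) = \epsilon^{d_1} \cdot \mathrm{price}(A_1) = \tfrac{1}{2} \cdot 4 = 2,
  \qquad
  \mathrm{payout}(p_2) = \epsilon^{d_2} \cdot \mathrm{price}(A_2) = \tfrac{1}{4} \cdot 4 = 1.
\]
% The royalty decays with implication distance from the outpaying type T.
```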
This seems vaguely like it has some desirable properties, like the decay of a royalty with the length in implications separating it from the currently outpaying type. You might even be able to reconcile it with cartesian-closedness / currying, where $A \to (B \to C)$ behaves equivalently to $(A \times B) \to C$ under the payout function.
I think to be more theoretically classy, royalties would arise from recursive structure, but it may work well enough without recursion. It'd be fun to advance all the way to coherence and incentive-compatibility proofs, but I certainly don't see myself doing that.
I want a name for the following principle:
> the world-spec gap hurts you more than the spec-component gap
I wrote it out much like this a couple years ago and Zac recently said the same thing.
I'd love to be able to just say "the <one to three syllables> principle", yaknow?
I'm working on making sure we get high quality critical systems software out of early AGI. Hardened infrastructure buys us a lot in the slightly crazy story of "self-exfiltrated model attacks the power grid", but buys us even more in less crazy stories about all the software modules adjacent to AGI having vulnerabilities rapidly patched at crunchtime.
<standup_comedian>
What's the deal with evals? </standup_comedian>
epistemic status: tell me I'm wrong.
Funders seem particularly enchanted with evals, which seem to be defined as "benchmarks, but probably for scaffolded systems, and with scoring that is harder than scoring most of what we call benchmarks".
I can conjure a theory of change. It's like:

1. If measurement is bad then we're working with vibes, so we'd like to make measurement good.
2. If measurement is good then we can demonstrate to audiences (especially policymakers) that warning shots are substantial signals, not vibes.

(Question: what am I missing?)
This is at least a coherent reason why dangerous capability evals pay into governance strats in such a way that maybe philanthropic pressure is correct. It relies on cruxes that I don't share, like that a principled science of measurement would outperform vibes in a meme war in the first place, but it at least has a crux that works as a fulcrum.
Everything worth doing is at least a little dual use; I'm not attacking anybody. But it's a faustian game where, like benchmarks, evals pump up races cuz everyone loves it when number go up. The primal urge to see number go up infects every chart with an x and y axis. In other words, evals come with steep capabilities externalities, because they spray the labs with more charts that number hasn't gone up on yet, daring and challenging the lab to step up its game. So a theory of change in which, in spite of this dynamic, an eval is differentially defensive has to meet a really high standard.
A further problem: the theory of change where we get really high quality / inarguable signals as warning shots, instead of vibes as warning shots, doesn't even apply to most of the evals I'm hearing about from the nonprofit and independent sector. I'm hearing about evals that make me go "huh, I wonder what's differentially defensive about that?" and I don't get good answers. Moreover, an ancient wisdom says "never ask a philanthropist for something capitalism gives you for free". The case that an individual eval wouldn't be created by default lab incentives needs to be especially strong, cuz when it isn't strong one is literally doing the lab's work for them.
I don't know what legible/transferable evidence would be. I've audited a lot of courses at a lot of different universities. Anecdote, sorry.