I think more leaders of orgs should be trying to shape their organizations' incentives and cultures around the challenges of "crunch time". Examples of this include:
I would be very excited to see experiments with ABMs (agent-based models) where the agents model fleets of research agents and tools. I expect that in the near future we can build pipelines where the current fleet configuration - which should be defined in something like the Terraform configuration language - automatically generates an ABM that is used for evaluation, control, and coordination experiments.
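To make that concrete, here's a minimal sketch, assuming a hypothetical config schema and toy agent behaviors (none of this is from an existing pipeline): a declarative fleet definition, in the spirit of a Terraform file, gets expanded into a small ABM that can be run for simple evaluation-style experiments.

```python
# Minimal sketch: generate an agent-based model directly from a declarative
# fleet configuration. The schema, agent types, and metrics are hypothetical
# stand-ins for what a Terraform-style definition might contain.
import random
from dataclasses import dataclass

# A Terraform-like fleet definition, represented here as plain data for simplicity.
FLEET_CONFIG = {
    "pools": [
        {"name": "research_agents", "count": 20, "tools": ["search", "code"]},
        {"name": "monitor_agents", "count": 5, "tools": ["audit"]},
    ],
}

@dataclass
class SimAgent:
    pool: str
    tools: list
    tasks_done: int = 0
    flags_raised: int = 0

    def step(self, rng):
        # Research agents complete tasks; monitors audit outputs at a toy hit rate.
        if "audit" in self.tools:
            self.flags_raised += rng.random() < 0.1  # hypothetical audit hit rate
        else:
            self.tasks_done += 1

def build_abm(config):
    """Expand the declarative fleet config into concrete simulation agents."""
    return [
        SimAgent(pool=pool["name"], tools=pool["tools"])
        for pool in config["pools"]
        for _ in range(pool["count"])
    ]

def run(agents, steps=100, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        for agent in agents:
            agent.step(rng)
    return {
        "tasks_done": sum(a.tasks_done for a in agents),
        "audits_flagged": sum(a.flags_raised for a in agents),
    }

if __name__ == "__main__":
    print(run(build_abm(FLEET_CONFIG)))
```

The point is only the shape of the pipeline: the config is the single source of truth, and the ABM used for evaluation or coordination experiments is regenerated from it whenever the fleet changes.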
This seems mostly good to me, thank you for the proposals (and sorry for my delayed response, this slipped my mind).
OR fewer than three consistent physical instances have been manufactured (e.g. a total of three that includes prototypes or other designs doesn't count).
Why this condition? It doesn't seem relevant to the core contention, and if someone prototyped a single lock using a GS AI approach but didn't figure out how to manufacture it at scale, I'd still consider it to have been an important experiment.
Besides that, I'd agree to the above conditions!
- (8) won't be attempted, or will fail at some combination of design, manufacture, or just-being-pickable. This is a great proposal and a beautifully compact crux for the overall approach.
I agree with you that this feels like a 'compact crux' for many parts of the agenda. I'd like to take your bet, let me reflect if there's any additional operationalizations or conditioning.
...However, I believe that the path there is to extend and complement current techniques, including empirical and experimental approaches alongside formal verification - whatever
I agree with this, I'd like to see AI Safety scale with new projects. A few ideas I've been mulling:
- A 'festival week' bringing entrepreneur types and AI safety types together to cowork from the same place, along with a few talks and a lot of mixers.
- running an incubator/accelerator program at the tail end of a funding round, with fiscal sponsorship and some amount of operational support.
- more targeted recruitment for specific projects to advance important parts of a research agenda.
It's often unclear to me whether new projects should actually...
First off, thank you for writing this - great explanation.
This seems like an important crux to me, because I don't think greatly slowing AI in the US would require new federal laws. I think many of the actions I listed could be taken by government agencies that over-interpret their existing mandates, given the right political and social climate. For instance, the eviction moratorium during COVID obviously should have required congressional action, but was done by fiat through an over-interpretation of authority by an executive branch agency.
What they do or do not do seems mostly dictated by that socio-political climate, and by the courts, which means fewer veto points for industry.
I agree that competition with China is a plausible reason regulation won't happen; that will certainly be one of the arguments advanced by industry and NatSec as to why it should not be throttled. However, I'm not sure that argument will be stronger than the protectionist impulses, and currently don't think it will be. Possibly it will exacerbate the "centralization" of AI dynamic that I listed in the 'licensing' bullet point, where large existing players receive money and de facto license to operate in certain areas and then avoid others (as memeticimagery points out). So fo...
hah yes - seeing that great post from johnwentsworth inspired me to review my own thinking on RadVac. Ultimately I placed a lower estimate on RadVac being effective - or at least effective enough to get me to change my quarantine behavior - such that the price wasn't worth it, but I think I get a rationality demerit for not investing more in the collaborative model building (and collaborative purchasing) part of the process.
Forecast - 25 mins
Thanks for posting this. I recently reread The Fountainhead, which I similarly enjoyed and got more out of than did my teenage self - it was like a narrative, emotional portrayal of the ideals in Marc Andreessen's It's Time to Build essay.
I interpreted your section on The Conflict as the choice between voice and exit.
The larger scientific question was related to Factored Cognition, and getting a sense of the difficulty of solving problems through this type of "collaborative crowdsourcing". The hope was that running this experiment would lead to insights that could then inform the direction of future experiments, in the way that you might fingertip-feel your way around an unknown space to get a handle on where to go next. For example, if it turned out to be easy for groups to execute this type of problem solving, we might push ahead with competitions between teams t...
Thanks, rewrote and tried to clarify. In essence the researchers were testing transmission of "strategies" for using a tool, where an individual was limited in what they could transmit to the next user, akin to this relay experiment.
In fact they found that trying to convey causal theories could undermine the next person's performance; they speculate that it reduced experimentation prematurely.
I've spent a fair bit of time in the forecasting space playing w/ different tools, and I never found one that I could reliably use for personal prediction tracking.
Ultimately for me it comes down to:
1.) Friction: the predictions I'm most interested in tracking are "5-second-level" predictions - "do I think this person is right", "is the fact that I have a cough and am tired a sign that I'm getting sick", etc. - and I need to be able to jot them down quickly.
2.) "Routine": There are certain sites that a...
The commerce clause gives the federal government broad powers to regulate interstate commerce, and in particular the U.S. Secretary of Health and Human Services can exercise it to institute quarantine. https://cdc.gov/quarantine/aboutlawsregulationsquarantineisolation.html
Depression as a concept doesn't make sense to me. Why on earth would it be fitness-enhancing to have a state of withdrawal, retreat, and collapse where a lack of energy prevents you from trying new things? I've brainstormed a number of explanations:
I expect understanding something more explicitly - such as your and another person's boundaries - w/o some type of underlying concept of acceptance of that boundary can increase exploitability. I recently wrote a shortform post on the topic of legibility that describes some patterns I've noticed here.
I don't think on average Circling makes one more exploitable, but I expect it increases variance, making some people significantly more exploitable than they were before because previously invisible boundaries are now visible, and can thus be attacke...
IMO the term "amplification" fits if the scheme results in 1.) a clear efficiency gain and 2.) scalability. This looks like (delivering equivalent results at a lower cost OR providing better results for an equivalent cost, where cost == $$ & time), AND (~O(n) scaling costs).
For example, if there were a group of people who could emulate [Researcher's] fact-checking of 100 claims but do it at 10x speed, then that's an efficiency gain, as we're doing the same work in less time. If we pump the number to 1000 claims and the fac...
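A rough sketch of how that check could be operationalized, with illustrative numbers standing in for the fact-checking example (the function and the slack threshold are mine, not a standard definition):

```python
# Toy check for the "amplification" criterion above: an efficiency gain
# (equivalent results at lower cost, or better results at equivalent cost,
# where cost covers both money and time) AND roughly O(n) scaling.
def is_amplification(baseline_cost, scheme_cost, baseline_quality, scheme_quality,
                     scheme_cost_at_10x):
    efficiency_gain = (
        (scheme_quality >= baseline_quality and scheme_cost < baseline_cost) or
        (scheme_quality > baseline_quality and scheme_cost <= baseline_cost)
    )
    # ~O(n): doing 10x the work should cost roughly 10x, with some slack.
    scales_linearly = scheme_cost_at_10x <= 10 * scheme_cost * 1.2
    return efficiency_gain and scales_linearly

# Fact-checking example: the group matches the researcher's quality on 100 claims
# at a tenth of the time-cost, and 1000 claims cost about 10x what 100 claims did.
print(is_amplification(baseline_cost=100, scheme_cost=10,
                       baseline_quality=1.0, scheme_quality=1.0,
                       scheme_cost_at_10x=100))  # True
```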
Is there not a distillation phase in forecasting? One model of the forecasting process is that person A builds up their model, then distills a complicated question into a high-information/highly compressed datum, which can then be used by others. In my mind it's:
Model -> Distill -> "amplify" (not sure if that's actually the right word)
I prefer the term "scalable" instead of "proliferation" for "can this group do it cost-effectively", as it's a similar concept to that in CS.
Thanks for including that link - seems right, and reminded me of Scott's old post Epistemic Learned Helplessness
The only difference between their presentation and mine is that I’m saying that for 99% of people, 99% of the time, taking ideas seriously is the wrong strategy
I kinda think this is true, and it's not clear to me from the outset whether you should "go down the path" of getting access to level 3 magic given the negatives.
Probably good heuristics are proceeding with caution when encountering new/out there ideas, remember...
As a Schelling point, you can use this Foretold community which I made specifically for this thread.
I watched all of the Grandmaster-level games. When playing against grandmasters, the average win rate of AlphaStar across all three races was 55.25%.
Detailed match by match scoring
While I don't think that it is truly "superhuman", it is definitely competitive against top players.
https://twitter.com/esyudkowsky/status/910941417928777728
I remember seeing other claims/analysis of this but don't remember where
I'd agree w/ the point that giving subordinates plans and the freedom to execute them as best as they can tends to work out better, but that seems to be strongly dependent on other context, in particular the field they're working in (ex. software engineering vs. civil engineering vs. military engineering), cultural norms (ex. is this a place where agile engineering norms have taken hold?), and reward distributions (ex. does experimenting by individuals hold the potential for big rewards, or are all rewards likely to be distributed in a normal fas...
From a 2-min brainstorm of "info products" I'd expect to be action-guiding:
One concrete example is from when I worked in a business intelligence role. What executives wanted was extremely trustworthy, reliable data sources to track business performance over time. In a software environment ...
It seems true that there are a lot of ways to utilize forecasts. In general, forecasting tends to have an implicit and unstated connection to the decision-making process - I think that has to do w/ the nature of operationalization ("a forecast needs to be on a very specific thing") and because much of the popular literature on forecasting has come from business literature (e.g. How to Measure Anything).
That being said, I think action-guidingness is still the correct bar to meet for evaluating the effect it has on the EA community. I would bite ...
what are the bottlenecks preventing 10x-100x scaling of Control Evaluations?
- I'm not confident in the estimates of the safety margin we get from internal-only evaluations - eliciting strong subversion performance seems very hard, which makes it difficult to get satisfactory estimates of models' subversion capability against control protocols.
- I'd feel more confident if we had thousands of people trying to create red-team models, while thousands of blue teams proposed different monitoring methods and control protocols.
- The type of experiments describe