I am confused about the opening of your analysis:
In some sense, this idea solves basically none of the core problems of alignment. We still need a good-enough model of a human and a good-enough pointer to human values.
It seems to me that while the fixed point conception here doesn't uniquely determine a learning strategy, it should be possible to uniquely determine that strategy by building it into the training data.
In particular, if you have a base level of "reality" like the P_0 you describe, then it should be possible to train a model first on this real...
My issue isn't with the complexity of a Turing machine, it's with the term "accessible." Universal search may execute every Turing machine, but it also takes more than exponential time to do so.
In particular, if there are infinitely many Schelling points in the manipulation universe to be manipulated and referenced, then all of that computation has to causally precede the simplest such Schelling point for any answer that needs to be manipulated!
It's not clear to me what it actually means for there to exist a Schelling point...
I'm confused by your intuition that team manipulation's universe has similar complexity to ours.
My prior is that scaling the size of (accessible) things in a universe also requires scaling the complexity of the universe in an unbounded, probably even super-linear, way, such that fully specifying "infinite computing power", or more concretely "sufficient computing power to simulate universes of complexity <=X for time horizons <=Y", requires complexity f(X,Y) which is unbounded in X and Y, and therefore falls apart completely as a practical solution (...
Many people who don't live in San Francisco commute to work at businesses there. I would expect GDP per capita to be misleading for some purposes in such cases.
Broadening to the San Francisco-San Jose area, there are 9,714,023 people with a GDP of $1,101,153,397,000/year, giving a GDP/capita estimate of $113,357. I know enough people who commute between Sunnyvale and San Francisco or even further that I'd expect this to be 'more accurate' in some sense, though obviously it's only slightly lower than your first figure and still absurdly high.
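For reference, the division behind that per-capita figure:

\[
\frac{\$1{,}101{,}153{,}397{,}000}{9{,}714{,}023\ \text{people}} \approx \$113{,}357\ \text{per person per year.}
\]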
But the c...
tl;dr: if models unpredictably undergo rapid logistic improvement, we should expect zero correlation in aggregate.
If models unpredictably undergo SLOW logistic improvement, we should expect positive correlation. This also means getting more fine-grained data should give different correlations.
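As a toy illustration of the tl;dr (a minimal sketch; "correlation" here is my assumed operationalization, namely the correlation across tasks between consecutive per-step gains, and all curve parameters are made up):

```python
# Toy simulation: each task follows a logistic learning curve with an
# unpredictable midpoint; we ask whether the gain at one step predicts
# the gain at the next step, pooled across tasks.
import numpy as np

rng = np.random.default_rng(0)

def consecutive_gain_correlation(steepness, n_tasks=10_000, n_steps=20):
    steps = np.arange(n_steps)                          # proxy for scale / training time
    midpoints = rng.uniform(0, n_steps, size=n_tasks)   # the unpredictable part
    z = np.clip(steepness * (steps[None, :] - midpoints[:, None]), -60, 60)  # avoid overflow
    perf = 1 / (1 + np.exp(-z))                         # logistic learning curve per task
    gains = np.diff(perf, axis=1)                       # per-step improvement per task
    return np.corrcoef(gains[:, :-1].ravel(), gains[:, 1:].ravel())[0, 1]

print(consecutive_gain_correlation(steepness=50.0))  # rapid: roughly zero (slightly negative)
print(consecutive_gain_correlation(steepness=0.3))   # slow: clearly positive
```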
To condense and steelman the original comment slightly:
Imagine that learning curves all look like logistic curves. The following points are unpredictable:
Based on my own experience and the experience of others I know, I think knowledge starts to become taut rather quickly - I’d say at an annual income level in the low hundred thousands.
I really appreciate this specific calling out of the audience for this post. It may be limiting, but it likely limits the post to an audience with strong overlap with LW readership.
Everything money can buy is "cheap", because money is "cheap".
I feel like there's a catch-22 here, in that there are many problems that probably could be solved with money, but I don't know how ...
I think this post does a good job of focusing on a stumbling block that many people encounter when trying to do something difficult. Since the stumbling block is about explicitly causing yourself pain, to the extent that this is a common problem and that the post can help avoid it, that's a very high return prospect.
I appreciate the list of quotes and anecdotes early in the post; it's hard for me to imagine what sort of empirical references someone could make to verify whether or not this is a problem. Well-known quotes and a long list of anecdotes are a su...
Your model of supporters of farm animal welfare seems super wrong to me.
I would predict that supporters of the law will actually be more unhappy the more effect it has on the actual market, because that reveals information about how bad conditions are for farm animals. In particular, if it means shifting pork distribution elsewhere, that means less reduction in pig torture and also fewer options to shift consumption patterns toward more humanely raised meat on the margins.
Those costs can be worth paying, if you still expect some reduction in pig torture, but obviously writing laws to be better defined and easier to measure would be a further improvement.
70% compute, 30% algo (give or take 10 percentage points) over the last 25 years. Without serious experiments, have a look at the Stockfish evolution at constant compute. That's a gain of +700 ELO points over ~8 years (on the high side, historically). For comparison, you gain ~70 ELO per double compute. Over 8 years one has on average gained ~400x compute, yielding +375 ELO. That's 700:375 ELO for compute:algo
Isn't that 70:30 algo:compute?
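Spelling out the arithmetic behind the question, using only the quoted figures:

\[
\frac{700}{700+375} \approx 0.65, \qquad \frac{375}{700+375} \approx 0.35,
\]

so the quoted numbers attribute roughly 65% of the ELO gain to algorithms and 35% to compute, i.e. roughly 70:30 algo:compute, not compute:algo.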
A friend of mine on Facebook notes that the instances of blood clots in Germany were concerning because in Germany it is mostly young health care workers who are getting vaccinated, a population where it's both easier to distinguish small numbers of blood clots from chance and more concerning to see extreme side effects.
The rate is still low enough that pausing vaccination is (obviously) a dangerous move, but dismissing the case that the blood clots may be caused by the vaccine isn't a fair assessment of the evidence, and that may be important in a couple of years, when the supply of non-AZ vaccines is no longer the limiting factor for the world.
Good question.
One thing I looked into was obtaining fast antibody tests (basically strips of paper with some proteins and colloidal gold soaked into them). They're "research use only" and hard to get your hands on if you're in the US, but if you're outside the US (or have a friend outside the US willing to help) it should be easier. They would make it dramatically easier and less error-prone to test, and they'd also test for binding against full COVID proteins directly (rather than the radvac peptides). If I were going to invest much more effort into this ...
Maybe, but the US number lines up with 1% of the population, which lines up with the top-1% figure; if people outside the US are ~50x as likely to be top-1% at various hobbies, that's a bold claim that needs justification, not an obvious rule of thumb!
Or it could be across all time, which lines up with ~100 billion humans in history.
Looks like the initial question was here and a result around it was posted here. At a glance I don't see the comments with counterexamples, and I do see a post with a formal result, which seems like a direct contradiction to what you're saying, though I'll look in more detail.
Coming back to the scaling question, I think I agree that multiplicative scaling over the whole model size is obviously wrong. To be more precise, if there's something like a Q-learning inner optimizer for two tasks, then you need the cross product of the state spaces, so the size of ...
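A minimal sketch of that scaling claim, assuming a tabular Q-learner and hypothetical task sizes (the exact numbers don't matter; the point is additive vs. multiplicative growth):

```python
# Hypothetical task sizes, just for illustration.
n_states_1, n_actions_1 = 1_000, 10
n_states_2, n_actions_2 = 2_000, 5

# Two separate tabular Q-learners: table sizes add.
separate = n_states_1 * n_actions_1 + n_states_2 * n_actions_2

# One Q-learner over the joint task: the state (and action) space is the
# cross product, so the table size multiplies.
joint = (n_states_1 * n_states_2) * (n_actions_1 * n_actions_2)

print(f"separate tables: {separate:,} entries")  # 20,000
print(f"joint table:     {joint:,} entries")     # 100,000,000
```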
I'm replying on my phone right now because I can't stop thinking about it. I will try to remember to follow up when I can type more easily.
I think the vague shape of what I disagree about is how dense GPT-3's sets of implicit knowledge are.
I do think we agree that GPT-5000 will be broadly superhuman, even if it just has a grab bag of models in this way, for approximately the reasons you give.
I'm thinking about "intelligent behavior" as something like the set of real numbers, and "human behavior" as covering something like rational numbers, so we ca...
I think this is obscuring (my perception of) the disagreement a little bit.
I think what I'm saying is, GPT-3 probably doesn't have any general truth+noise models. But I would expect it to copy a truth+noise model from people, when the underlying model is simple.
I then expect GPT-3 to "secretly" have something like an interesting diagnostic model, and probably a few other narrowly superhuman skills.
But I would expect it to not have any kind of significant planning capacity, because that planning capacity is not simple.
In particular my expectation is that co...
This seems like it's using the wrong ontology to me.
Like, in my mind, there are things like medical diagnostics or predictions of pharmaceutical reactions, which are much easier cognitive tasks than general conversation, but which humans are specialized away from.
For example, imagine that the severity of side effects from a specific medication can be computed by figuring out 15 variables about the person and putting them into a neural network with 5000 parameters, that the output is somewhere in a six-dimensional space, and that this model is part of a general model...
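To make the hypothetical concrete (the 15 inputs, ~5000 parameters, and six-dimensional output come from the comment above; the architecture and the hidden width of 227 are just one way to land on that parameter count):

```python
import torch
import torch.nn as nn

# A tiny model of the hypothetical: 15 variables about the person in,
# a point in a six-dimensional "side effect severity" space out.
side_effect_model = nn.Sequential(
    nn.Linear(15, 227),   # 15*227 + 227 = 3,632 parameters
    nn.ReLU(),
    nn.Linear(227, 6),    # 227*6 + 6  = 1,368 parameters
)

print(sum(p.numel() for p in side_effect_model.parameters()))  # 5,000

patient_features = torch.randn(1, 15)            # the 15 variables
severity = side_effect_model(patient_features)   # shape (1, 6)
```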
After reading some of this reddit thread I think I have a better picture of how people are reacting to these events. I will probably edit this post or write a follow-up.
My high level takeaway is:
In the edit, or possibly in a separate follow-up post, I will try to present the model at a further remove from the specific events and...
I appreciate the thread as context for a different perspective, but it seems to me that it loses track of verifiable facts partway through (around here), though I don't mean to say it's wrong after that.
I think in terms of implementation of frameworks around AI, it still seems very meaningful to me how influence and responsibility are handled. I don't think that a federal agency specifically would do a good job handling an alignment plan, but I also don't think Yann LeCun setting things up on his own without a dedicated team could handle it.
I would want to see a strong justification before deciding not to discuss something that is directly relevant to the purpose of the site.
I think this makes sense, but I disagree with it as a factual assessment.
In particular I think "will make mistakes" is actually an example of some combination of inner and outer alignment problems that are exactly the focus of LW-style alignment.
I also tend to think that the failure to make this connection is perhaps the biggest single problem in both ethical AI and AI alignment spaces, and I continue to be confused about why no one else seems to take this perspective.
I am currently writing fiction that features protagonists who are EAs.
This seems at least related to the infrastructure fund goal of presenting EA principles and exposing more people to them.
I think receiving a grant would make me more likely to aggressively pursue options to professionally edit, publish, and publicize the work. That feels kind of selfish and makes me self-conscious, but also wouldn't require a very large grant. It's hard for me to unwrap my feelings about this vs. the actual public good, so I'm asking here first.
Does this sound like a good grant use?
I am reasonably excited about fiction (and am on the Long Term Future Fund). I have written previously about my thoughts on fiction here:
...The track record of fiction
In a general sense, I think that fiction has a pretty strong track record of both being successful at conveying important ideas, and being a good attractor of talent and other resources. I also think that good fiction is often necessary to establish shared norms and shared language.
Here are some examples of communities and institutions that I think used fiction very centrally in their function.
I find both directions plausible. I do agree that I don't see any existing institutions ready to take its place, but looking at Secular Solstice, for example, I definitely expect that better institutions are possible.
There might be a "sufficiency stagnation" that follows mechanics similar to crowding out: because people have a "good enough" option, they don't try to build better things, and centralized leadership causes institutional conservatism.
I would bet this is supported by worse outcomes for more centralized churches, like Unitarians vs. megachurches or Orthodox Catholics, but that's a weakly held belief.
I think I find this plausible. An alternative to MichaelBowbly's take is that religion may crowd out other community organization efforts which could plausibly be better.
I'm thinking of unions, boys and girls clubs, community centers, active citizenship groups, meetup groups, and other types of groups that have never yet existed.
It could be that in practice introducing people to religious practices shows them examples of ways to organize their communities, but it could also be that religious community efforts are artificially propped up by government subsi...
A toy model that makes some sense to me is that the two population distinction is (close to) literally true; that there's a subset of like 20% of people who have reduced their risk by 95%+, and models should really be considering only the other 80% of the population, which is much more homogeneous.
Then because you started with effectively 20% population immunity, that means R0 is actually substantially higher, and each additional piece of immunity is less significant because of that.
I haven't actually computed anything with this model so I don't know whether it is actually explanatory.
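A back-of-the-envelope version of that toy model (the 20% / 95% figures are from the comment above; the observed R value is a hypothetical placeholder):

```python
# Toy two-population model: 20% of people have cut their exposure by 95%,
# the other 80% behave roughly normally.
cautious_frac, cautious_risk = 0.20, 0.05

# Fraction of "normal" exposure the population as a whole still has:
exposure_factor = (1 - cautious_frac) + cautious_frac * cautious_risk   # 0.81

# If the cautious 20% are effectively out of the transmission pool, an R
# estimated from a homogeneous model understates the underlying R0:
r_estimated = 1.2                        # hypothetical observed value
r0_underlying = r_estimated / exposure_factor
print(round(r0_underlying, 2))           # ~1.48

# The naive herd-immunity threshold 1 - 1/R rises from ~17% (R = 1.2) to
# ~32% (R = 1.48), so each additional slice of immunity buys proportionally less.
```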
I did some calculations of basic herd immunity thresholds based on fractal risk (without an infection model) a few months back, and the first split, into high exposure vs low exposure, captures more than half of the change from the limit of infinite splits. The threshold stopped changing almost entirely after three splits, which was only 6 subpopulations.
With as many other variables as exist here I'm not confident that effect would persist, but my default guess is that adding fractal effects to the model will less than double the cha...
I'd like to use this feature, especially to keep track when I meet a user in the Walled Garden or IRL and need consistency to remember which user they are. This is a common feature in video games; without it I would have no idea who most of my friends in League of Legends are.
I wouldn't be that worried about privacy for the notes, since I'd expect few of them to contain sensitive information, though they might contain some awkward information.
Yeah I think my main disagreements are 4 and 5.
Given stories I've heard about cryonics orgs, I'd put 10-50% on 5. Given my impression of neuroscience, I'd put 4 at 25-75%.
Given that I'm more pessimistic in general, I'd put an additional 2x skepticism penalty on their other guesses.
That puts me at around a 0.01%-20% spread, or a one-in-ten-thousand lower bound, which is better than I expected. If I were convinced that a cryo org was actually a responsible business, that would be enough for me to try to make it happen.
I was trying to do a back-of-the-envelope calculation of total cost of work and total value created (where I'm using cost of rent as a (bad) proxy for (capturable) value created).
I definitely wouldn't assume that the government or any single agent would be doing the project, just that the overall amount of capturable value must be worth the investment costs; different parties can then pay portions of those costs in exchange for portions of, or rights to, that value. But I doubt adding in the different parties involved would make my estimates more accurate.
Do you have a source for cost of similar projects? My estimates are definitely very bad for many reasons.
I want to have this post in a physical book so that I can easily reference it.
It might actually work better as a standalone pamphlet, though.
I like that this responds to a conflict between two of Eliezer's posts that are far apart in time. That seems like a strong indicator that it's actually building on something.
Either "just say the truth", or "just say whatever you feel you're expected to say" are both likely better strategies.
I find this believable but not obvious. For example, if the pressure on you is that you'll be executed for saying the truth, saying nothing is probably better than saying the truth. If the pressure on you is remembering being bullied on tumblr, and you're being asked if you...
The factual point that moderate liberals are more censorious is easy to lose track of, and I saw confusion about it today that sent me back to this article.
I appreciate that this post starts from a study, and outlines not just the headline from the study but the sample size. I might appreciate more details on the numbers, such as how big the error bars are, especially for the subgroup stats.
Historical context links are good, and I confirm that they state what they claim to state.
Renee DiResta is no longer at New Knowledge, though her previous work there is st...
It's not a simple enough question for easy answers.
It's also plausible to me that the intersection of enough conditions (owns a house; rents the house out on AirBnB; in a single metro area; measures success in a reasonable way; writes about it on the internet) gets small enough that there are no results.
Looking for general advice (how to succeed as an AirBnB host) might give a model that's easy to fill in, like "you will succeed if the location is X appealing and there are f(X) listings or fewer."
That still seems like a pretty easy answer to me, but it co...
I think you're misunderstanding my analogy.
I'm not trying to claim that if you can solve the (much harder and more general) problem of AGI alignment, then that solution should be able to handle the (simpler, more specific) case of corporate incentives.
It's true that many AGI architectures have no clear analogy to corporations, and if you are using something like a satisficer model with no black-box subagents, this isn't going to be a useful lens.
But many practical AI schema have black-box submodules, and some formulations like mesa-optimization or supervised amplification-d...
As remizidae points out, most of these restrictions are not effectively enforced by governments; they are enforced by individuals and social groups. In California, certainly, the restaurants and bars thing is enforced mostly by the government, but that's mostly a "governments can't act with nuance" problem.
But for things like gatherings of friends, I think this question still applies. The government cannot effectively enforce limits on that, but your group of friends certainly can.
And I think in that context, this question remains. That is, I think groups ...
I think this misunderstands my purpose a little bit.
My point isn't that we should try to solve the problem of how to run a business smoothly. My point is that if you have a plan to create alignment in AI of some kind, it is probably valuable to ask how that plan would work if you applied it to a corporation.
Creating a CPU that doesn't lie about addition is easy, but most ML algorithms will make mistakes outside of their training distribution, and thinking of ML subcomponents as human employees is an intuition pump for how, or whether, your alignment plan interacts with that.
I like this post and would like to see it curated, conditional on the idea actually being good. There are a few places where I'd want more details about the world before knowing if this was true.
This watershed is owned and managed by the Santa Clara Valley Water District.
The bay was designated a Ramsar Wetland of International Importance on February 2, 2012.
I don't know what that ...
It looks like for pool removal there's a cost of between $20-$130 per cubic foot.
Minor correction: that source says filling in a pool is $20-80 per cubic yard, which would only be ~$1-3 per cubic foot. The higher numbers are for demolition, but that's presumably dominated by the cost of the demolition rather than the fill - jackhammers are a pain in the ass.
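The conversion, for reference (27 cubic feet per cubic yard):

\[
\frac{\$20/\mathrm{yd}^3}{27\ \mathrm{ft}^3/\mathrm{yd}^3} \approx \$0.74/\mathrm{ft}^3,
\qquad
\frac{\$80/\mathrm{yd}^3}{27\ \mathrm{ft}^3/\mathrm{yd}^3} \approx \$2.96/\mathrm{ft}^3.
\]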
Not from OpenAI, but the language sounds like this could be the board protecting themselves against securities fraud committed by Altman.