I made a mistake of omission in just saying "sampling from noisy posteriors": note that I didn't say they were performing unbiased sampling.
To extend the Psychology example: a study could be considered a sampling technique of the noisy posterior. You appear to be arguing that the extent to which this is a biased sample is a "skill issue."
I'm arguing that it is often very difficult to perform unbiased sampling in some fields; the issue might be a property of the posterior and not that the researcher has a weak prefrontal cortex. In this framing, it would totally make sense if two researchers studying the same/correlated posterior(s) are biased in the same direction; it's the same posterior!
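As a toy sketch of what I mean (the effect size, bias term, and noise level are all invented for illustration): if the bias lives in the measurement process itself rather than in any one researcher, two teams sampling it independently will miss in the same direction.

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_EFFECT = 0.2    # the quantity both teams want to estimate (invented)
SHARED_BIAS = 0.3    # bias baked into the measurement process itself (invented)
NOISE_SD = 0.5       # per-participant noise (invented)

def run_study(n=50):
    """One 'study': a noisy, biased sample of the same underlying quantity."""
    samples = rng.normal(TRUE_EFFECT + SHARED_BIAS, NOISE_SD, size=n)
    return samples.mean()

team_a = np.mean([run_study() for _ in range(1000)])
team_b = np.mean([run_study() for _ in range(1000)])

# Both independent teams overshoot by roughly SHARED_BIAS, in the same direction:
print(team_a - TRUE_EFFECT, team_b - TRUE_EFFECT)  # both ≈ 0.3
```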
Eh, feels wrong to me. Specifically, this argument feels over-complicated.
As best I can tell, the predominant mode of science in replication-crisis affected fields is that they do causal inference by sampling from noisy posteriors.
The predominant mode of science in non-replication-crisis affected fields is that they don't do this or do this less.
Most of the time, it seems like science is conducted that way in those fields because it has to be. Can you come up with a better way of doing Psychology research? "Science in hard fields is hard" is definitely a less sexy hypothesis, but it seems obviously true?
Can you come up with a better way of doing Psychology research?
Yes. More emphasis on concrete useful results, less emphasis on trying to find simple correlations in complex situations.
For example, "Do power poses work?". They did studies like this one where they tell people to hold a pose for five minutes while preparing for a fake job interview, and then found that the pretend employers pretended to hire them more often in the "power pose" condition. Even assuming there's a real effect where those students from that university actually impress those...
They’re measuring a noisy phenomenon, yes, but that’s only half the problem. The other half of the problem is that society demands answers. New psychology results are a matter of considerable public interest and you can become rich and famous from them. In the gap between the difficulty of supply and the massive demand grows a culture of fakery. The same is true of nutrition: everyone wants to know what the healthy thing to eat is, and the fact that our current methods are incapable of discerning this is no obstacle to people who cl...
I really, really, really did not like this post. I found it to be riddled with bad assumptions, questionable unsupported claims, and critical omissions. I don't think any of the core arguments survive close scrutiny.
Moreover, I took serious issue with the tone throughout. The first half hand-waves some seriously questionable claims into existence with strong confidence, while the second half opines that everyone who ever thought otherwise is some combination of sycophantic, incurious, brainwashed, or an idiot. I would have appreciated more intellectual hum...
Actually, my read of the data is that the mountain west is not more environmentally conscious than the rest of the US.
The mountain west poll does not include national numbers, so I have no idea where your national comparisons are coming from. If I knew, I'd check that they use the same year and the same question, but since I don't know where they're from, I can't.
Take a look at this cool visualization of different state partisan splits from 2018: https://climatecommunication.yale.edu/visualizations-data/partisan-maps-2018/
The mountain west appears neither significantly mo...
My initial reaction, admittedly light on evidence, is that the numbers you present are at least partially due to selection bias. You've picked a set of issues, like climate change, that are not representative of the entire scope of "environmentalism." It shouldn't surprise anybody that "worry about global warming" is a blue issue, but the much more conservative-y issues of "land use," "protection of fish and wildlife," and "conservation" are, for whatever reason, often not measured. In short, it feels a little to me that your actual argument is that...
I hear what you're saying. I probably should have made the following distinction:
The question I understand you to be asking is essentially: how do we make safety cases for AI agents generally? I would argue that's more situation 1 than situation 2, and as I understand it, safety cases are basically only ever applied to situation 2. The nuclear facilities document you linked is definitely situation 2.
So yeah, admittedly the d...
I've published in this area so I have some meta comments about this work.
First the positive:
1. Assurance cases are the state of the art for making sure things don't kill people in a regulated environment. Ever wonder why planes are so safe? Safety cases. Because the actual process of making one is so unsexy (GSNs make me want to cry), people tend to ignore them, so you deserve lots of credit for somehow getting x-risk people to upvote this. More lesswronger types should be thinking about safety cases.
2. I do think you have good / defensible argument...
I don't think it works if there isn't a correct answer, e.g. predicting the future, but I'm positive this is a good way to improve how convincing your claims are to others.
If there isn't ground truth about a claim to refer to, any disagreement around a claim is going to be about how convincing and internally/externally consistent the claim is. As we keep learning from prediction markets, rationales don't always lead to correctness; there are many cases of good heuristics (priors) doing extremely well.
If you want to be correct, good reasoning is often a nice-to-have, not a need-to-have.
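To illustrate the "good heuristics (priors) doing extremely well" point with a toy sketch: here I assume, purely for illustration, that per-case "reasoning" adds noise and no signal, and all the numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

BASE_RATE = 0.2                              # invented base rate of the event
outcomes = rng.random(10_000) < BASE_RATE    # realized outcomes

# Forecaster A: pure prior, always predicts the base rate.
prior_forecast = np.full(outcomes.shape, BASE_RATE)

# Forecaster B: "reasons" about each case, which here only adds noise around the base rate.
reasoned_forecast = np.clip(BASE_RATE + rng.normal(0, 0.25, size=outcomes.shape), 0, 1)

def brier(forecast, outcome):
    """Mean squared error of probabilistic forecasts (lower is better)."""
    return np.mean((forecast - outcome) ** 2)

print(brier(prior_forecast, outcomes))     # ≈ 0.16
print(brier(reasoned_forecast, outcomes))  # higher (worse), despite more per-case "reasoning"
```

Obviously real reasoning can add signal; the sketch just shows that it has to beat the base rate before it's worth anything.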
I very strongly disagree. In my opinion, this argument appears fatally confused about the concept of "software."
As others have pointed out, this post seems to be getting at a distinction between code and data, but many of the examples of software given by OP contain both code and data, as most software does. Perhaps the title should have been "AI is Not Code," but since it wasn't I think mine is a legitimate rebuttal.
I'm not trying to make an argument by definition. My comment is about properties of software that I think we would likely agree o...
I think this post is probably correct, but I think most of the discourse over-complicated what I interpret to be the two core observations:
All else follows. The key takeaway for me is that you should also condition your posteriors on societal norms.
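One concrete way to read "condition your posteriors on societal norms" (my gloss, with invented numbers): fold the field's track record into your prior before updating on any single significant result.

```python
# A back-of-the-envelope Bayes update; all numbers are invented for illustration.
prior_true = 0.10   # prior that a randomly chosen hypothesis in the field is real
power = 0.50        # P(significant result | hypothesis is real)
alpha = 0.05        # P(significant result | hypothesis is false)

p_significant = power * prior_true + alpha * (1 - prior_true)
posterior_true = power * prior_true / p_significant

print(round(posterior_true, 2))  # ≈ 0.53: closer to a coin flip than a confident finding
```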
Here be cynical opinions with little data to back them.
It's important to point out that "AI Safety" in an academic context usually means something slightly different from typical LW fare. For starters, as most AI work descended from computer science, it's pretty hard [1] to get anything published in a serious AI venue (conference/journal) unless you
Both PhD students and their advisors want to publish things in established venues, so by default one should expect academic AI Safety r...
I expect academia to have more appetite for AI safety work that looks like (adversarial) robustness, mechanistic interpretability, etc., than for alignment qua alignment. From the outside, it doesn't seem unlikely that academia would take on projects similar to what Redwood Research does, for example.
Though typical elite academics might also be distracted by shiny publishable projects rather than staying as focused on and dedicated to the core problems as, e.g., Redwood. This is somewhat counterbalanced by the potential of academia having higher quantity and possibly...
The core B/E dichotomy rang true, but the post also seemed to imply a correlated separation between autonomous and joint success/failure modes: building couples succeed/fail on one thing together, entertaining couples succeed/fail on two things separately.
I have not observed this to be true. Experientially, it seems more like a two-by-two, where the building / entertaining distinction is about the type of interaction you crave in a relationship, and the autonomous / joint distinction is about how you focus your productive energies.
Examples:
"What is intelligence?" is a question you can spend an entire productive academic career failing to answer. Intentionally ignoring the nerd bait, I do think this post highlights how important it is for AGI worriers to better articulate which specific qualities of "intelligent" agents are the most worrisome and why.
For example, there has been a lot of handwringing over the scaling properties of language models, especially in the GPT family. But as Gary Marcus continues to point out in his inimitable and slightly controversial way, scaling these models...
I'm seeking some clarification. My reading of your post is that you see the following concepts as intertwined:
As you point out (and I agree), transformer parameters live in a small space, and the realities of human biology seem to imply that we can do #1 better, that is, use a "lighter" algorithm with fewer free parameters to store our learned information.
If I understand you correctly, you believe that this "far more efficient architecture trying to get out" would also...
I think you've made a motte-and-bailey argument:
Your motte is definitely defensible. Obviously, you can alter the payoff structure of the game to a point where you should play it.
That does not mean "there's no real paradox"; it just means you are no longer talking a...