yeah, but there would also be a lot of worlds where the merger would have been totally fine and beneficial, but fell through because people had unfounded fears
i mean, in general, it's a lot easier to tell plausible-seeming stories of things going really poorly than actually high-likelihood stories of things going poorly. so the anecdata of it actually happening is worth a lot
I've been repeatedly loud and explicit about this, but am happy to state again that racing to build superintelligence before we know how to make it not kill everyone (or cause other catastrophic outcomes) seems really bad, and I wish we could coordinate to not do that.
there's an exogenous factor, which is that the entire country was shifting leftward during the 50s and 60s. it's plausible that the 1964 bill would have passed anyways without the 1957 bill, possibly even earlier
what's the current state of analysis on whether the civil rights act of 1957 was actually net positive or negative for civil rights in hindsight? there are two possible stories one can tell, and at the time people were arguing about which is correct:
this feels directly analogous to the question of whether we should accept very weak AI safety regulations today.
a thing i've noticed rat/autistic people do (including myself): one very easy way to trick our own calibration sensors is to add a bunch of caveats or considerations that make it feel like we've modeled all the uncertainty (or at least, more of it than people who haven't added them). so one thing i see a lot is that people are self-aware that they have limitations, but then over-update on how much this awareness makes them calibrated. one telltale hint that i'm doing this myself is if i catch myself saying something because i want to demo my rigor and prove that i've considered some caveat that one might think i forgot to consider.
i've heard others make a similar critique about this as a communication style that can mislead non-rats who aren't familiar with it, but i'm making a different claim here: that one can trick oneself.
it seems that one often believes being self-aware of a certain limitation is enough to correct for it sufficiently that one is at least calibrated about how limited one is. a concrete example: part of being socially incompetent is not just being bad at taking social actions, but being bad at detecting social feedback on those actions. of course, many people are not even aware of the latter. but many are aware of and acknowledge the latter, and then act as if, because they've acknowledged a potential failure mode and will try to be careful to avoid it, they are much less susceptible to it than other people in an otherwise similar reference class.
one variant of this deals with hypotheticals - because hypotheticals often can/will never be evaluated, this allows one to get the feeling that one is being epistemically virtuous and making falsifiable predictions, without ever actually getting falsified. for example, a statement "if X had happened, then i bet we would see Y now" has prediction vibes but is not actually a prediction. this is especially pernicious when one fails but says "i failed but i was close, so i should still update positively on what i did." while not always a bad idea, there's a bias-variance tradeoff here, where doing this more often reduces variance but increases bias. i find that cases where i thought i was close but later realized i was actually far off the mark are sufficiently common that this isn't an imaginary concern.
another variant: i think we are much less susceptible to some forms of brainworms/ideology, and are also much better at understanding the mechanisms behind brainworms and identifying them in others, so we over-update on our own insusceptibility to brainworms (despite evidence from the reference class of rationalists suggesting at least genpop levels of obvious cult-forming, if not higher). in reality, it's just that we are susceptible to different types of brainworms than normies are.
another variant is introspective ability. i think we probably are better at introspection, in the sense that we are better at noticing certain kinds of patterns in our own behavior and developing models for those patterns. but i've also come to believe that this kind of modeling has huge blind spots, and leads many to believe they have much greater mastery over their own minds than they actually do. however, the feeling of being aware that one might have blind spots, and of knowing what they often look like in other people, can lead to overconfidence about whether one would notice those blind spots in oneself.
i feel like the main way i notice these things is by noticing them in other people over long periods of knowing them, and then noticing that my actions are actually deeply analogous to theirs in some way. it also helps to sometimes notice non-rats not falling into the same pitfalls.
i'm not sure how to fix this. merely being aware of it is probably not sufficient. probably the solution is not to stop thinking about one's own limitations, but rather to add some additional cogtech on top. my guess is there is valuable memetic technology out there that especially wise people use but which most people, rat or not, don't. also, difficult-to-fake feedback from reality seems important.
my guess is that lots of people would change their minds if they really reflected on it with full wisdom and the assistance of an aligned and emotionally intelligent assistant. but if truly deep down some/many people value their beliefs over truth, and would never change their minds even if they reflected deeply on it, who are we to tell them not to do that? the best we can ask for is that they leave us alone to do what we believe is good, and vice versa.
in general, publicly known training techniques are behind sota, so this should be taken into account.
I like the spirit of this work, but it would benefit a lot from a more in-depth review of the existing literature and methodologies. some examples (non-exhaustive):
i recently ran into a vegan advocate tabling in a public space, and spoke briefly to them for the explicit purpose of better understanding what it feels like to be the target of advocacy on something i feel moderately sympathetic towards but not fully bought in on. (i find this kind of thing very valuable for noticing flaws in myself and improving; it's much harder to be perceptive of one's own actions otherwise.) the part where i am genuinely quite plausibly persuadable of his position in theory is important; i think if i had talked to e.g. flat earthers, one might say my reaction is just because i'd already decided not to be persuaded. several interesting things i noticed (none of which should be surprising or novel, especially for someone less autistic than me, but as they say, intellectually knowing things is not the same as actual experience):
one possible take is that i'm just really weird, and these modes of interaction work better for normal people because they think less independently, or need to be argued out of poorly-thought-out bad takes, or something like that, idk. i can't rule this out, but my guess is normal people are probably even more like this than i am. also, for the purposes of analogy to the AI safety movement, presumably we want to select for independent thinkers with especially well-thought-out takes, more than for just normal people.
also, my guess is this particular interaction was probably extremely out of distribution from the perspective of those tabling. activists generally have a pretty polished pitch for the most common situations, including a bunch of concrete ways of talking they've empirically found to get people to engage, learned through years of RL against a general audience, but the polish of that pitch doesn't generalize out of distribution when poked at in weird ways. my interlocutor even noted at some point that his conversations when tabling generally don't go the way ours went.