I won't comment on your specific startup, but I wonder in general how an AI Safety startup becomes a successful business. What's the business model? Who is the target customer? Why do they buy? Unless the goal is to get acquired by one of the big labs, in which case, sure, but again, why or when do they buy, and at what price? Especially since they already don't seem to be putting much effort into solving the problem themselves despite having better tools and more money to do so than any new entrant startup.
I really, really hope at some point the Democrats will acknowledge that the reason they lost is that they failed to persuade the median voter of their ideas, and/or to adopt ideas that appeal to said voters. At least among those I interact with, there seems to be denial of the idea that this is how you win elections, which is a prerequisite for governing.
That seems very possible to me, and if and when we can show whether something like that is the case, I do think it would represent significant progress. If nothing else, it would help tell us what the thing we need to be examining actually is, in a way we don't currently have an easy way to specify.
If you can strike in a way that prevents retaliation that would, by definition, not be mutually assured destruction.
Correct, which is in part why so much effort went into developing credible second strike capabilities, building up all parts of the nuclear triad, and closing the supposed missile gap. Because both the US and USSR had sufficiently credible second strike capabilities, it made a first strike much less strategically attractive and reduced the likelihood of one occurring. I'm not sure how your comment disagrees with mine? I see them as two sides of the same coin.
If you live in Manhattan or Washington DC today, you can basically assume you will be nuked first, yet people live their lives. Granted, people could behave differently under this scenario for non-logical reasons.
My understanding is that in the Cold War, a basic MAD assumption was that if anyone were going to launch a first strike, they'd try to do so with overwhelming force sufficient to prevent a second strike, hitting everything at once.
I agree that consciousness arises from normal physics and biology, there's nothing extra needed, even if I don't yet know how. I expect that we will, in time, be able to figure out the mechanistic explanation for the how. But right now, this model very effectively solves the Easy Problem, while essentially declaring the Hard Problem not important. The question of, "Yes, but why that particular qualia-laden engineered solution?" is still there, unexplained and ignored. I'm not even saying that's a tactical mistake! Sometimes ignoring a problem we're not yet equipped to address is the best way to make progress towards getting the tools to eventually address it. What I am saying is that calling this a "debunking" is misdirection.
I've read this story before, including and originally here on LW, but for some reason this time it got me thinking: I've never seen a discussion of what this tradition meant for early Christianity, before the Christians decided to simply declare (supposedly after God sent Peter a vision, an argument that only works by assuming the conclusion) that the old laws no longer applied to them. After all, the Rabbi Yeshua ben Joseph (as the Gospels sometimes call him) explicitly declared the miracles he performed to be a necessary reason why not believing in him was a sin.
We apply different standards of behavior for different types of choices all the time (in terms of how much effort to put into the decision process), mostly successfully. So I read this reply as something like, "Which category of 'How high a standard should I use?' do you put 'Should I lie right now?' in?"
A good starting point might be: one rank higher than you would use for not lying; see how it goes and adjust over time. If I tried to make an effort-ranking of all the kinds of tasks I regularly engage in, I expect there would be natural clusters I could roughly draw an axis through. E.g., I put more effort into client-facing or boss-facing tasks at work than I do into casual conversations with random strangers. I put more effort into setting the table, washing dishes, and plating food for holidays than for a random Tuesday. Those are probably more than one rank apart, but for any given situation, I think the bar for lying should be somewhere in the vicinity of that size gap.
One of the factors to consider, that contrasts with old-fashioned hostage exchanges as described, is that you would never allow your nation's leaders to visit any city that you knew had such an arrangement. Not as a group, and probably not individually. You could never justify doing this kind of agreement for Washington DC or Beijing or Moscow, in the way that you can justify, "We both have missiles that can hit anywhere, including your capital city." The traditional approach is to make yourself vulnerable enough to credibly signal unwillingness to betray one another, but only enough that there is still a price at which you would make the sacrifice.
Also, consider that compared to the MAD strategy of having launchable missiles, this strategy selectively disincentivizes people from wanting to move to whatever cities are the subject of such agreements, which are probably your most productive and important cities.
As things stand today, if AGI is created (aligned or not) in the US, it won't be by the USG or agents of the USG. It'll be by a private or public company. Depending on the path to get there, there will be more or less USG influence of some sort. But if we're going to assume the AGI is aligned to something deliberate, I wouldn't assume AGI built in the US is aligned to the current administration, or at least I'd assume it to a significantly lesser degree than I'd assume AGI built in China by a Chinese company would be aligned to the current CCP.
For more concrete reasons regarding national ideals: the US has a stronger tradition of self-determination and of shifting values over time, plausibly reducing the risk of lock-in. It also has a stronger tradition (modern conservative politics notwithstanding) of immigration and openness.
In other words, it matters a lot whether the aligned US-built AGI is aligned to the Trump administration, the Constitution, the combined writings of the US founding fathers and renowned leaders and thinkers, the current consensus of the leadership at Google or OpenAI, the overall gestalt opinions of the English-language internet, or something else. I don't have enough understanding to make a similar list of possibilities for China, but some of the things I'd expect it would include don't seem terrible. For example, I don't think a genuinely-aligned Confucian sovereign AGI is anywhere near the worst outcome we could get.