So, do you maintain your decision, or was it just a spur of the moment lapse of judgement?
After a fair bit of thought, I don't. I don't think one can really categorize it as purely spur of the moment, though; it lasted quite a while. Perhaps inducing a 'let the AI out of the box' phase would be a more accurate description.
Reminder! Although I haven't yet written about the general principle, the original Drake's Equation was bullshit. Things like this are even more bullshit, since they exploit the human bias of assigning significant probabilities to everything elicited. That creates an unpacking bias, in which the unpacked items are assigned much larger summed probabilities than the corresponding packed categories, so the apparent probability of a conjunction goes down as you helpfully break it into more and more parts. By these means I could equally make the Moon landing appear impossible, just as I could make cryonics appear more and more likely by considering more and more disjunctive pathways to success. It also fails as probability theory because it ignores conditional dependency.
Again, general reminder: across all cases not backed up by actual sampling, someone who offers to helpfully "elicit" a set of "conjunctive" probabilities and multiply them together to get some low number, without considering any disjunctions, while assuming conditional independence, and with no warnings about unpacking bias, is using a Fully General Counterargument that will underestimate the probability of anything. I have yet to see a good Breaking X Down for any X, unless X is a whole population (not a significant subsector of it) and the breakdown is just the actual data about X.
I feel like the unpacking/packing biases ought to be easier to get around than some other biases. Fermi estimates do work (to some extent). I somewhat wonder if giving log probabilities would help more.
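A minimal numeric sketch of the effect described above (the probabilities are purely illustrative, not anything elicited in this thread): unpacking a claim into many conjunctive steps and multiplying drives the estimate toward zero, while unpacking disjunctive pathways with the same casualness drives it toward one.

```python
import numpy as np

# Unpack an event into 10 "conjunctive" steps and elicit a
# reasonable-sounding 80% for each; the product looks damning.
step_probs = [0.8] * 10
p_conjunctive = np.prod(step_probs)
print(p_conjunctive)  # ~0.11

# Run the same trick the other way: unpack 10 "disjunctive" pathways
# to success, give each a modest 20%, and assume independence.
path_probs = [0.2] * 10
p_disjunctive = 1 - np.prod([1 - p for p in path_probs])
print(p_disjunctive)  # ~0.89
```

Neither number means much on its own, since both calculations also assume conditional independence between the pieces, which is the other half of the objection.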
Guess culture acknowledges that there is a social cost to outright denying a request. A good example from Yvain's comment:
Consider an "ask culture" where employees consider themselves totally allowed to say "no" without repercussions. The boss would prefer people work unpaid overtime so ey gets more work done without having to pay anything, so ey asks everyone. Most people say no, because they hate unpaid overtime. The only people who agree will be those who really love the company or their job - they end up looking really good. More and more workers realize the value of lying and agreeing to work unpaid overtime so the boss thinks they really love the company. Eventually, the few workers who continue refusing look really bad, like they're the only ones who aren't team players, and they grudgingly accept.
Only now the boss notices that the employees hate their jobs and hate the boss. The boss decides to only ask employees if they will work unpaid overtime when it's absolutely necessary. The ask culture has become a guess culture.
Oh, obviously there are causal reasons for why guess culture develops. If there wasn't, it wouldn't occur. I agree that having a social cost to denying a request can lead to this phenomenon, as your example clearly shows. I don't think that stops it from being silly.
I feel ask and tell culture are fairly similar in comparison to guess culture. Tell culture seems to me to be just ask culture with a bit more explaining, which seems like a move in the right direction, balanced against time and energy constraints. Guess culture just seems rather silly.
Observation of individual neurons doesn't indicate that they have intelligence; does that mean the intelligence of a human brain is an emergent phenomenon?
Observation of individual atoms and molecules wouldn't reveal any gravitation-like properties either, yet we don't call gravity an emergent phenomenon. Instead we argue that the gravitation-like properties of atoms and molecules are simply not observable. Could you consider that we may grossly underestimate the "intelligent ability" of individual neurons?
What I meant by this is the gravitational influence of N particles is the sum of the gravitational influences of each of the individual particles, and is therefore a strict function of their individual gravitational influences. If you give me any collection of particles, and tell me nothing except their gravitational fields, I can tell you the gravitational field of the system of particles. If you tell me the intelligence of each of your neurons (0), I cannot determine your intelligence.
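For concreteness, a minimal sketch of that superposition point (the `field_at` helper is hypothetical, written just for this illustration): the field of a system of particles is literally the vector sum of the per-particle fields.

```python
import numpy as np

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def field_at(point, masses, positions):
    """Gravitational field at `point` due to point masses at `positions`.

    The total field is nothing but the vector sum of each particle's
    individual contribution, so knowing the parts fixes the whole.
    """
    point = np.asarray(point, dtype=float)
    total = np.zeros(3)
    for m, pos in zip(masses, positions):
        r = np.asarray(pos, dtype=float) - point
        total += G * m * r / np.linalg.norm(r) ** 3
    return total

# Two 1 kg point masses; the system's field at the origin is just the
# sum of the two individual fields.
print(field_at([0, 0, 0], [1.0, 1.0], [[1, 0, 0], [0, 2, 0]]))
```

There is no analogous function that takes the intelligence of each neuron (zero) and returns the intelligence of the brain, which is exactly the disanalogy being drawn.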
There are a number of aspects of EY’s ruleset I dislike. For instance, his ruleset allows the Gatekeeper to type “k” after every statement the AI writes, without needing to read and consider what the AI argues. I think it’s fair to say that this is against the spirit of the experiment, and thus I have disallowed it in this ruleset. The EY Ruleset also allows the gatekeeper to check facebook, chat on IRC, or otherwise multitask whilst doing the experiment. I’ve found this to break immersion, and therefore it’s also banned in the Tuxedage Ruleset.
Eliezer's rules uphold the spirit of the experiment: making things easier for the AI goes very much against what we should expect of any sort of gatekeeping procedure.
I think the gatekeeper having to pay attention to the AI is very in the spirit of the experiment. In the real world, if you built an AI in a box and ignored it, then why build it in the first place?
I'll pay $20 to read the Tuxedage vs SoundLogic chat log.
I would be willing to consider it if you agreed to secrecy and raised it to $1000. You would still have to talk to Tuxedage though.
Does SoundLogic endorse their decision to let you out of the box? How do they feel about it in retrospect?
BTW, I think your pre-planning the conversation works as a great analogue for the superior intelligence a real gatekeeper might be dealing with.
I'm not completely sure. And I can't say much more than that without violating the rules. I would be more interested in how I feel in a week or so.
It seems like many of the advantages/tactics that Tuxedage recommended for the person playing the AI would be absent (or far more difficult) with an actual AI. Or at least they could be made that way with the design of the gatekeeping protocol.
Tailor your arguments to the personality/philosophy/weaknesses/etc. of this particular gatekeeper:
* Have multiple gatekeepers, some of whom are hidden from the AI, and don't let the AI out unless they all agree. For example, have a gatekeeper one-step-removed who reads the transcripts of the conversation between the AI & the first gatekeeper and otherwise has no contact with either of them. This gatekeeper's identity (and possibly even their existence) is hidden from the AI and the first gatekeeper.
Use speed / set the pace / shock & awe with a flurry of arguments:
* Build in delays & breaks. Have a planned schedule of breaks, and a delay before the gatekeeper can choose to release the AI, and a person reading the transcript who has the option of calling for a break if the gatekeeper seems to need one. Another option is to have a bunch of people who read the transcript, where each of them has the job of focusing on one line of argument. If any of them feel like their line of argument hasn't been covered sufficiently, then they can talk to the gatekeeper about it during a break, and prevent the AI from being released until they are satisfied that the argument has been thoroughly covered.
Find a crack in the gatekeeper and press your advantage as soon as you find it:
* This is basically a combination of the previous two, and is countered by the combination of delays and multiple gatekeepers (some hidden).
Use dark arts / lies / manipulation:
* Have people read the transcript with the job of looking for lies, inconsistencies, or other uses of the dark arts from the AI; destroy the AI if they find any. Better yet, have the AI talk with multiple gatekeepers (using multiple copies of the AI which don't share memory, so that the AI talking to gatekeeper #2 doesn't know what is being said to gatekeeper #1), and have the dark-arts-evaluators read all of the transcripts. There could also be similar readers who look for irrationality in the gatekeeper, who have the power to rule the gatekeeper incompetent and in need of replacement.
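As a toy sketch of how several of these countermeasures could combine into a single release rule (all names and numbers below are made up for illustration; this isn't anyone's actual protocol): the AI is released only after a mandatory delay, with unanimous approval from every reviewer, and with no dark-arts flag raised.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

@dataclass
class ReleaseDecision:
    """Hypothetical gatekeeping record: hidden reviewers, delays, and a dark-arts veto."""
    requested_at: datetime
    mandatory_delay: timedelta = timedelta(days=7)
    reviewer_approvals: dict = field(default_factory=dict)  # reviewer name -> approved?
    dark_arts_flagged: bool = False

    def may_release(self, now: datetime) -> bool:
        if self.dark_arts_flagged:
            return False  # detected manipulation means the AI is destroyed, not released
        if now - self.requested_at < self.mandatory_delay:
            return False  # built-in cooling-off period before any release
        # Release requires explicit, unanimous approval from every reviewer,
        # including ones whose existence is hidden from the AI.
        return bool(self.reviewer_approvals) and all(self.reviewer_approvals.values())

decision = ReleaseDecision(requested_at=datetime(2014, 1, 1))
decision.reviewer_approvals = {"hidden_gatekeeper": True, "transcript_reader": False}
print(decision.may_release(datetime(2014, 1, 20)))  # False: approval is not unanimous
```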
Of course, an actual superhuman AI would also have many advantages compared to a human playing the role of AI.
A better mind than Tuxedage could almost certainly keep up the 'feel' of a flurry of arguments even with a schedule of breaks. I myself have had people feel irritated at me in that way even when I talk to them with days in between messages. If I can do that accidentally, I'm certain a superintelligence could do it reliably.
Also, I'm unsure of how much an AI could gather from a single human's text input. I know that I, at least, miss a lot of information that goes past me which I could in theory pick up.
An AI using timeless decision theory could easily compensate for having multiple AIs with unshared memory just by attempting to determine what the other AIs would say.
Yes, a provably secure system has assumptions about the other systems it uses, and they necessarily amount to "all those other systems work correctly and if they don't, it's their bug, not ours."
Provable security means no security-affecting bugs. Precious little software is written to be provably correct (rather than just having the underlying algorithm proven correct). None of it runs on proven-correct operating systems with proven-correct BIOSes, drivers, and chipset ROMs all the way down to the network-card and hard-drive microcontrollers, because such mythical beasts don't exist. (And those microcontrollers have long been general-purpose computers capable of hosting malware.)
And none of that software runs on provably correct hardware, which doesn't exist either: software can be proven correct because it's an algorithm, but how can you prove the perfection of a physical device like a CPU, i.e. the absence of physical implementation errors, like the rowhammer bug, that aren't reflected in any design documents?
Step one involves figuring out the fundamental laws of physics. Step two is to input a complete description of your hardware. Step three is to construct a proof. I'm not sure how to order these in terms of difficulty.