That would be a good argument if it were merely a language model, but if it can answer complicated technical questions (and presumably any other question), then it must have the necessary machinery to model the external world, predict what it would do in such and such circumstances, etc.
Note the framing. Not “should blackmail be legal?” but rather “why should blackmail be illegal?” Thinking for five seconds (or minutes) about a hypothetical legal-blackmail society should point to obviously dystopian results. This is not subtle. One could write the young-adult novel, but what would even be the point?
Of course, that is not an argument. Not evidence.
What? From a consequentialist point of view, of course it is. If a policy (and "make blackmail legal" is a policy) probably has bad consequences, then it is a bad policy.
It was how it was trained, but Gurkenglas is saying that GPT-2 could make human-like conversation because Turing-test transcripts are in the GPT-2 dataset, whereas it is the conversations between humans in the GPT-2 dataset that would make it possible for GPT-2 to hold human-like conversations and thus potentially pass the Turing test.
But if the blackmail information is a good thing to publish, then blackmailing is still immoral, because the information should be published and people should be incentivized to publish it, not to withhold it. We, as a society, should ensure that if, say, someone routinely engages in kidnapping children to harvest their organs, and someone else knows this, then she is incentivized to send this information to the relevant authorities rather than keep it to herself, for reasons that I hope are obvious.
I'm not sure what you're trying to say. I'm only saying that if your goal is to have an AI generate sentences that look like they were written by humans, then you should get a corpus with a lot of sentences that were written by humans, not sentences written by other, dumber programs. I do not see why anyone would disagree with that.
You need to define the terms you use in a way that makes what you are saying useful, i.e. that gives it pragmatic consequences in the real world of actual things, rather than leaving it on the same level as arguing by definition.
Good post. Some nitpicks:
There are many models of rationality from which a hypothetical human can diverge, such as VNM rationality of decision making, Bayesian updating of beliefs, certain decision theories or utilitarian branches of ethics. The fact that many of them exist should already be a red flag on any individual model’s claim to “one true theory of rationality.”
VNM rationality, Bayesian updating, decision theories, and utilitarian branches of ethics all cover different areas. They aren't incompatible and actually fit rather neatly into each other.
Noticing an unachievable goal may force it to have an existential crisis of sorts, resulting in self-termination.
Do you have reasoning behind this being true, or is this baseless anthropomorphism?
It should not hurt an aligned AI, as it by definition conforms to the humans' values, so if it finds itself well-boxed, it would not try to fight it.
So it is a useless AI?
Your whole comment is founded on a false assumption. Look at Bayes' formula. Do you see any mention of whether your probability estimate is "just your prior" or "the result of a huge amount of investigation and very strong reasoning"? No? Well, that means it doesn't affect how much you'll update.
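For reference, a minimal statement of Bayes' rule in its standard form (my notation, not quoted from the thread):

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

The formula takes the prior $P(H)$ as given and says nothing about how it was arrived at; the multiplicative size of the update is determined by the likelihood ratio alone.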
[1] It would be rather audacious to claim that this is true for each of the four axioms. For instance, do please demonstrate how you would Dutch-book an agent that does not conform to the completeness axiom!
How can an agent not conform to the completeness axiom? It literally just says "either the agent prefers A to B, or prefers B to A, or has no preference between them". Offer me an example of an agent that doesn't conform to the completeness axiom.
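For concreteness, the completeness axiom in its usual textbook form (added here for reference, not a quote from anyone in the thread):

$$\forall A, B:\quad A \succeq B \ \text{ or } \ B \succeq A$$

That is, any two lotteries can be compared; an agent violates the axiom only if there is some pair $A, B$ for which it neither weakly prefers $A$ to $B$ nor $B$ to $A$.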
Obviously it’s true that we face trade-offs. What is not so obvious is literally the entire rest of the section I quoted.
(Note: I ask that you not take this as an invitation to continue arguing the primary topic of this thread; however, one of the points you made is interesting enough on its own, and tangential enough to the main dispute, that I wanted to address it for the benefit of anyone reading this.)
...[1] It would be rather audacious to claim that this is true for each of the four axioms. For instance, do please demonstrate how you would Dutch-book an agent that does not conform to the completeness axiom!
How can an agent not conform to the completeness axiom? It literally just says "either the agent prefers A to B, or prefers B to A, or has no preference between them".
This one is not a central example, since I’ve not seen any VNM-proponent put it in quite these terms. A citation for this would be nice. In any case, the sort of thing you cite is not really my primary objection to VNM (insofar as I even have “objections” to the theorem itself rather than to the irresponsible way in which it’s often used), so we can let this pass.
VNM is used to show why you need to have utility functions if you don't want to get Dutch-booked. It's not something the OP invented, it's the whole point of VNM. One wonders what you thought VNM was about.
VNM is used to show why you need to have utility functions if you don’t want to get Dutch-booked. It’s not something the OP invented, it’s the whole point of VNM. One wonders what you thought VNM was about.
This is a confused and inaccurate comment.
The von Neumann-Morgenstern utility theorem states that if an agent’s preferences conform to the given axioms, then there exists a “utility function” that will correspond to the agent’s preferences (and so that agent can be said to behave as if maximizing a “utility function”).
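In symbols (a standard statement of the representation, given here for reference rather than quoted from the comment): the theorem guarantees a function $u$ such that for all lotteries $A, B$,

$$A \succeq B \iff \mathbb{E}[u(A)] \ge \mathbb{E}[u(B)],$$

with $u$ unique up to a positive affine transformation.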
We may then ask whether there is a
This bidimensional model is weird.
But I can't imagine a pure top-left mood. This leads me to think that the mood square is actually a mood triangle, and that there is no top-left mood, only a spectrum of moods between anxiety and mania.
I'm torn. I don't want you to be anxious about writing posts which happen to disagree with a point made by someone. Overall, writing more is better and I hope you don't feel punished by your honorable (IMO) removal of the post. However, I don't think LW is a good place for rebuttals of posts made elsewhere.
If you're making a point of interest to rationalists, I'd recommend making it stand alone, referring in passing to the incorrect/misleading posts only as a pointer to a different take. I wouldn't generally address it specifically to the outside poster; I'd make it more general than that.
The author explains very clearly what the difference is between "people hate losses more than they like gains" and loss aversion. Loss aversion is people hating losing $1 when they have $2 more than they like gaining $1 when they have $1, even though in both cases it is the difference between having $1 and having $2.
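To put the same point in a standard prospect-theory sketch (my illustration, using a simplified linear version of Kahneman and Tversky's value function, not the author's wording): relative to the current reference point, the value function is steeper for losses than for gains,

$$v(x) = \begin{cases} x & x \ge 0 \\ \lambda x & x < 0 \end{cases}, \qquad \lambda > 1,$$

so the displeasure of losing one dollar, $\lambda \cdot 1$, outweighs the pleasure of gaining one dollar, $1$, even though both moves run between the same two wealth levels.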
How does this interact with time preference? As stated, an elementary consequence of this theorem is that either lending (and pretty much every other capitalist activity) is unprofitable, or arbitrage is possible.