If an AI can be Aligned externally, then it's already safe enough. It feels like...
- You're not talking about solving Alignment, but talking about some different problem. And I'm not sure what that problem is.
- For your proposal to work, the problem needs to be already solved. All the hard/interesting parts need to be already solved.
I'm talking about the need for all AIs (and humans) to be bound by legal systems that include key consensus laws/ethics/values. It may seem obvious, but I think this position is under-appreciated and not universally accepted.
By foc...
I'm not sure exactly how many people are working on it, but I have the impression that it is more than a dozen, since I've met some of them without trying.
Glad to hear it. I hope to find and follow such work. The people I'm aware of are listed on pp. 3-5 of the paper. Was happy to see O'Keefe, Bai et al. (Anthropic), and Nay leaning this way.
...It seems to me like you are somewhat shrugging off those concerns, since the technological interventions (eg smart contracts, LLMs understanding laws, whatever self-driving-car people get up to) are very "light"
I believe you have to argue two things:
- Laws are distinct enough from human values (1), but following laws / caring about laws / reporting about predicted law violations prevents the violation of human values (2).
I think the post fails to argue both points. I see no argument that instilling laws is distinct enough from instilling values/corrigibility/human semantics in general (1) and that laws actually prevent misalignment (2).
My argument goes in a different direction. I reject premise (1) and claim there is an "essential equivalence and intimate link betw...
I feel as if there is some unstated idea here that I am not quite inferring. What is the safety approach supposed to be? If there were an organization devoted to this path to AI safety, what activities would that organization be engaged in?
The summary I posted here was just a teaser to the full paper (linked in pgph. 1). That said, your comments show you reasoned pretty closely to points I tried to make therein. Almost no need to read it. :)
...The first part is just "regulation". The second part, "instilling law-abiding values in AIs and humans", seems like a
I submit that current legal systems (or something close) will apply to AIs. And there will be lots more laws written to apply to AI-related matters.
It seems to me current laws already protect against rampant paperclip production. How could an AI fill the universe with paperclips without violating all kinds of property rights, probably prohibitions against mass murder (assuming it kills lots of humans as a side effect), laws against financial and other fraud used to acquire enough resources, etc.? I see it now: some DA will serve a 25,000 count indictment. That AI will be in B...
I guess here I'd reiterate this point from my latest reply to orthonormal:
Again, it's not only about having lots of rules. More importantly it's about the checks and balances and enforcement the system provides.
It may not be helpful to think of some grand utility-maximizing AI that constantly strives to maximize human happiness or some other similar goals, and can cause us to wake up in some alternate reality some day. It would be nice to have some AIs working on how to maximize some things humans value, e.g., health, happiness, attractive and sensibl...
For this reason, giving an AI simple goals but complicated restrictions seems incredibly unsafe, which is why SIAI's approach is figuring out the correct complicated goals.
Tackling FAI by figuring out complicated goals doesn't sound like a good program to me, but I'd need to dig into more background on it. I'm currently disposed to prefer "complicated restrictions," or more specifically this codified ethics/law approach.
In your example of a stamp collector run amok, I'd say it's fine to give an agent the goal of maximizing the number of stamps...
We would probably start with current legal systems and remove outdated laws, clarify the ill-defined, and enact a bunch of new ones. And our (hyper-)rational AI legislators, lawyers, and judges should not be disposed to game the system. AI and other emerging technologies should both enable and require such improvements.
The laws might be appropriately viewed primarily as blocks that keep the AI from taking actions deemed unacceptable by the collective. AIs could pursue whatever goals they see fit within the constraints of the law.
However, the laws wouldn't be all prohibitions. The "general laws" would be more prescriptive, e.g., life, liberty, justice for all. The "specific laws" would tend to be more prohibition-oriented. Presumably the vast majority of them would be written to handle common situations and important edge cases. If someone suspects th...
Thanks for the links. I'll try to make time to check them out more closely.
I had previously skimmed a bunch of lesswrong content and didn't find anything that dissuaded me from the Asimov's Laws++ idea. I was encouraged by the first post in the Metaethics Sequence where Eliezer warns about not "trying to oversimplify human morality into One Great Moral Principle." The law/ethics corpus idea certainly doesn't do that!
RE: your first and final paragraphs: If I had to characterize my thoughts on how AIs will operate, I'd say they're likely to be emin...
Thanks for the thoughts.
You seem to imply that AIs' motivations will be substantially humanlike. Why might AIs be motivated to nobble the courts, control pens, overturn vast segments of law, find loopholes, and engage in other such humanlike gamesmanship? Sounds like malicious programming to me.
They should be designed to treat the law as a fundamental framework to work within, akin to common sense, physical theories, and other knowledge they will accrue and use over the course of their operation.
I was glib in my post suggesting that "before taking acti...
This is a very timely question for me. I asked something very similar of Michael Vassar last week. He pointed me to Eliezer's "Creating Friendly AI 1.0" paper and, like you, I didn't find the answer there.
I've wondered if the Field of Law has been considered as a template for a solution to FAI--something along the lines of maintaining a constantly-updating body of law/ethics on a chip. I've started calling it "Asimov's Laws++." Here's a proposal I made on the AGI discussion list in December 2009:
"We all agree that a few simple laws...
Sure. Getting appropriate new laws enacted is an important element. From the paper:
I'd say the EU AI Act (and similar) work addresses the "new laws" im...