Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Comment author: JoshuaFox 29 March 2017 05:35:11PM 1 point [-]

Eliezer is still writing AI Alignment content on it, ... MIRI ... adopt Arbital ...

How does Eliezer's work on Arbital relate to MIRI? Little is publicly visible of what he is doing at MIRI. Is he focusing on Arbital? What is the strategic purpose?

Comment author: luminosity 15 January 2017 12:42:54PM 0 points [-]

How did you go about finding people to come to your meetups? Pre-existing friends? LW site postings? Something else?

Comment author: JoshuaFox 19 January 2017 09:52:55AM 1 point [-]

Pre-existing friends, postings on Facebook (even though FB does not distribute events to the timelines of group members if there are more than 250 people in a group), and occasionally lesswrong.com (not event postings, but people who are actively interested in LW seek out a Tel Aviv group).

In response to Meetup Discussion
Comment author: JoshuaFox 11 January 2017 07:42:15AM *  1 point [-]

In Tel Aviv, we have three types of meetings, all on Tuesdays. Monthly we have a full meeting, usually a lecture or sometimes Rump Sessions (informal lightning talks). Typical attendance is 12.

Monthly, alternating fortnights from the above, we do game nights.

We are graciously hosted by Meni Rosenfeld's Cluster startup hub. (For a few years we were hosted at Google.)

On other Tuesdays a few LessWrongers get together at a pub.

Comment author: JoshuaFox 04 January 2017 09:06:54AM 0 points [-]

There certainly should be more orgs with different approaches. But possibly, CHCAI plays a role as the representative of MIRI in the mainstream academic world, and so from the perspective of goals, it is OK that the two are quite close.

Meetup : NLP for large scale sentiment detection

0 JoshuaFox 06 September 2016 07:13AM

Discussion article for the meetup : NLP for large scale sentiment detection

WHEN: 13 September 2016 07:00:00PM (+0300)

WHERE: Menachem Begin 125, Floor 36, Tel Aviv


Our speaker is Alon Rozental, Lead Data Scientist at Amobee.

The talk:

Sentiment analysis is the modest task of taking a text and figuring out what it's positive or negative about. It turns out to be rather hard. This talk will discuss one approach to the problem, which issues it tackles (like millions of sentences per day and high-level negation), where it falls flat, plans to improve the process, and how the heck you deal with an emoticon of a dolphin or an eggplant.

We'll meet to chat at 19:00; the presentation starts at 19:30.

Amobee, Menachem Begin 125, Floor 36, Tel Aviv


Meetup : Quantum Homeschooling

1 JoshuaFox 09 February 2016 09:30AM

Discussion article for the meetup : Quantum Homeschooling

WHEN: 16 February 2016 07:00:00PM (+0200)

WHERE: Google Campus, Yigal Alon 98, 34th floor, Tel Aviv

This time we have two talks:

Louise Fox will talk about homeschooling in Israel.

Zvi Schreiber will talk about interpretations of quantum mechanics.

It will be at Google Campus, 34th floor, and not the Google lecture hall where we have been meeting until recently.

You can call Joshua at 0545691165 if you have a hard time finding us.

Please RSVP at the Facebook Event https://www.facebook.com/events/1560682680921315/


MIRIx Israel, Meeting summary

6 JoshuaFox 31 December 2015 07:04AM

Aur Saraf hosted a 3-hour MIRIx meeting on December 18. Yoav Hollander, chip verification pioneer and creator of the e verification language, was there, as well as MIRI Research Associate Vadim Kosoy. Also in attendance were Benjamin Fox, Ofer Mustigman, Matan Mor, Aur Saraf, Eli Sennesh, and Joshua Fox.

Our discussion had its roots in Eliezer Yudkowsky’s 2008 article, in which he suggested that FAI researchers take an example from “computer engineers prov[ing] a chip valid”. Yet, as Yoav pointed out (in a lecture at LessWrong Tel Aviv in October 2015), there are strong limitations on formal verification at all levels, from the smallest arithmetic component on a chip up to entire systems like an airport. Even something as simple as floating-point division can barely be proven formally; as we move up to higher levels of complexity, formal proofs always rest on very tightly constrained specifications of the input and environment. It is impossible to prove more than a tiny part of the full range of relevant predicates.

Formal verification has a role, but most verification is done dynamically, for example by running a simulation against test cases. The goal of this meeting was to come up with a list of directions for applying ideas from the verification world to FAI research.

The state of FAI research

Vadim Kosoy described the state of the art in FAI research, catching us up on the last few years of MIRI’s work. FAI research can be divided into three levels: modules, optimization under self-improvement, and the selection of a human-compatible goal function.


I. Most basically, we can verify lower level modules that can make up an AGI.

II. Second -- and this is most of the research effort -- we can make sure that future AIs optimize for a given implicit human-compatible goal function, even as they grow in strength.

MIRI is focused on accomplishing this with verifiable goal preservation under self-improvement. Some other ideas include:

  • Agents that are deliberately kept weak.
  • Limited-intelligence AIs evaluated on their mathematical ability, but with no knowledge of physics or our real world. (Such AIs might not be strong in induction given real-world physics, but at least this evaluation procedure might allow the relatively safe development of a certain kind of AI.)
  • AIs locked in cryptographic boxes. They run under homomorphic encryption that prevents any side effects of their computation from being revealed to the outside world: only their defined output can reach us.

Such an AI could still accomplish a lot while keeping potentially dangerous information from us. As an illustration, you might ask it to prove the Riemann Hypothesis, also passing in a proof verifier. Operating under the protection of homomorphic encryption, the AI might find a proof of the Riemann Hypothesis and feed it to the proof verifier. The verifier outputs a single bit, “Yes, there is a proof of the Riemann Hypothesis,” but it never shows us the proof.

  • Negative utility for any act of self-analysis.
  • Corrigibility: Ensuring that the AI will allow us to turn it off if we so wish, typically by carefully defining its utility function. 
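The cryptographic-box idea above can be sketched in a few lines. This is a toy illustration only: the homomorphic encryption itself is not implemented, and the "proof search" and verifier are trivial stand-ins. It just shows the interface the meeting discussed, where exactly one bit escapes the box.

```python
# Toy sketch of a cryptographically boxed AI. All names are illustrative;
# real homomorphic encryption is NOT implemented here.

def proof_verifier(statement: str, proof: str) -> bool:
    # Stand-in for a formal proof checker; a toy "proof" just restates
    # the statement.
    return proof == f"QED: {statement}"

def boxed_ai(statement: str) -> str:
    # Stand-in for the AI's proof search, running "inside" the encrypted box.
    return f"QED: {statement}"

def run_boxed_query(statement: str) -> bool:
    proof = boxed_ai(statement)               # never leaves the box
    return proof_verifier(statement, proof)   # only this one bit escapes
```

The design point is that the verifier, not the AI, decides what information crosses the boundary, so even a deceptive proof body stays inside.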

III. The third area of FAI research is in choosing a goal function that matches human values, or, less ambitiously, a function that has some characteristics that match human values.

Verification for Autonomous Robots

Yoav Hollander asked us to focus on autonomous robots like drones and self-driving cars, given the extreme difficulties in verification for AGIs--while fully recognizing the different natures of the challenges.

The state of the art in validating safety for these systems is pretty basic. There is some work on formal verification of some aspects of autonomous robots (e.g. here), and some initial work on dynamic, coverage-driven verification (e.g. here). The most advanced work in autonomous vehicle verification consists of dynamic verification of the entire software stack on automatically generated scenarios. These scenarios are based on recordings of video, LIDAR and other sensors taken while driving real roads; interesting events like pedestrians jumping into the road are superimposed on these.
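A minimal sketch of that superimposition workflow, under heavy simplifying assumptions: the controller, scenario format, and safety property below are all hypothetical, and the controller is deliberately naive so the harness has a bug to catch.

```python
# Toy coverage-driven verification: perturb a recorded scenario by
# superimposing a pedestrian event, then check a safety property.

def plan_speed(obstacle_distance_m: float) -> float:
    # Deliberately naive hypothetical controller: it only slows down
    # gradually, even for nearby obstacles.
    return 50.0 if obstacle_distance_m > 20.0 else obstacle_distance_m * 2.0

def superimpose_pedestrian(scenario: dict, t: float, distance_m: float) -> dict:
    # Add a (time, obstacle-distance) event on top of the recorded scenario.
    modified = dict(scenario)
    modified["events"] = list(scenario.get("events", [])) + [(t, distance_m)]
    return modified

def run_scenario(scenario: dict) -> list:
    # Safety property under test: within 10 m of an obstacle, speed <= 5.
    violations = []
    for t, distance in scenario.get("events", []):
        speed = plan_speed(distance)
        if distance < 10.0 and speed > 5.0:
            violations.append((t, distance, speed))
    return violations
```

Running the unmodified recording finds nothing; superimposing a pedestrian at 8 m exposes the controller's flaw, which is exactly the value such generated scenarios add over replaying real drives.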

An important model for robot behavior is “Belief, Desire, Intention” (BDI), which can be expressed in the AgentSpeak language (among others). The Java-based Jason interpreter (among others) then executes these behavioral models.
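For readers unfamiliar with BDI, here is a minimal Python rendering of the idea, loosely mirroring AgentSpeak's "+trigger : context <- body" plan form. The agent, beliefs, and plans are all made up for illustration; Jason's actual plan selection is richer than this.

```python
# Minimal belief-desire-intention (BDI) loop. Illustrative only.

beliefs = {"battery_low": False}

# Each plan: a triggering goal, a context condition on beliefs, and a body,
# echoing AgentSpeak's "+!goal : context <- body." syntax.
plans = [
    {"trigger": "clean_room", "context": {"battery_low": False},
     "body": lambda: "vacuuming"},
    {"trigger": "clean_room", "context": {"battery_low": True},
     "body": lambda: "docking to recharge"},
]

def context_ok(plan: dict) -> bool:
    return all(beliefs.get(c, False) == v for c, v in plan["context"].items())

def deliberate(goal: str) -> str:
    # Pick the first applicable plan for the goal, roughly as Jason
    # does by default.
    for plan in plans:
        if plan["trigger"] == goal and context_ok(plan):
            return plan["body"]()
    return "no applicable plan"
```

The point for verification is that plans and contexts are explicit, inspectable objects, which is what makes BDI models amenable to both formal and dynamic checking.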

We can connect the three areas of FAI research (above) to the work of autonomous robot engineers:

  1. Analyzing the modules is a good idea, though figuring out what the modules of an FAI would be is not easy.

  2. Optimizing goal functions according to our true intentions--Do What I Mean--is relevant for autonomous robots just as it would be for an AGI.

  3. Choosing a utility function looks a bit more difficult if we can't have the AI-under-test output even a half-page description of its preferences for humans to read. There is no clear way to identify unexpected perversities fully automatically, even in the limited autonomous robots of today.

Safety Tools for AI Development

Aur Saraf suggested a software tool that checks a proposed utility function for ten simple perversities. Researchers are urged to run it before pressing the Big Red Button.

After a few runs, the researcher would start to abandon their facile assumptions about the safety of their AI. For example, the researcher runs it and learns that “Your AI would have turned us all into paperclips.” The researcher fixes the problem, runs the tool again, and learns “Your improved AI would have cryogenically frozen all of us.” Again, the researcher fixes the AI and runs the tool, which answers “Your twice-fixed AI would have turned the universe into a huge blob of molten gold.” At this point, maybe they would start realizing the danger.
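Aur's proposed tool might look, in skeleton, something like the following. Everything here is hypothetical: the perverse-outcome list, the checker interface, and the deliberately flawed utility function are invented to illustrate the workflow, not an actual MIRI artifact.

```python
# Hypothetical "perversity checker": screen a candidate utility function
# against a fixed list of known catastrophic outcomes before pressing
# the Big Red Button.

PERVERSE_OUTCOMES = [
    "everyone turned into paperclips",
    "all humans cryogenically frozen",
    "universe converted to molten gold",
]

def check_utility(utility) -> list:
    """Return the perverse outcomes the candidate utility would endorse."""
    return [o for o in PERVERSE_OUTCOMES if utility(o) > 0]

# A deliberately flawed utility: it rewards anything involving paperclips.
naive_utility = lambda outcome: 1.0 if "paperclips" in outcome else -1.0
```

Even ten such checks would not make a utility function safe, of course; the value is psychological, in forcing the researcher to see concrete failures before running anything.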

If we can create an industry-standard “JUnit for AGI testing,” we can then distribute safety tests as part of this harness. The real danger in AGI, as Vadim pointed out, is that a developer has a light finger on the Run button while developing the AI. (“Let’s just run it for a few minutes, see what happens.”) A generic test harness for AGI testing, which achieves widespread distribution and might be a bother for others to reimplement, could then be a platform for ensuring much more awareness of and care about safety among AI developers, as per the “perversity checker” mentioned by Aur.



  1. Here is Yoav’s summary of his opinions on this, as formed after the meeting.

  2. Thanks to Aur Saraf for the notes on which this is based and to other participants for comments. Any flaws are mine.

Meetup : Biases and making better decisions

1 JoshuaFox 26 August 2015 02:58PM

Discussion article for the meetup : Biases and making better decisions

WHEN: 01 September 2015 07:00:00PM (+0300)

WHERE: 98 Yigal Alon Street, Tel Aviv

We're going to have a meetup on Tuesday, Sep. 1, at Google Tel Aviv offices, Electra Tower, 98 Yigal Alon St., Tel Aviv.

Yonatan Cale will discuss two topics in this lecture:

  • Biases: The cognitive mistakes we have as humans. Noticing and overcoming our biases is one of the core topics of LessWrong.
  • Decision making: A new method designed to let us make better decisions, easily, to advance our own goals.

We'll meet at the 29th floor of the building (not the one with Google Campus) at 19:00. If you arrive and can't find your way around, call Anatoly who's hosting us at 054-245-1060 or Joshua Fox at 0545691165.

If you have FB, please click "Join" at the FB page so we can get a sense of attendance. https://www.facebook.com/events/109945732685336/


In response to Gatekeeper variation
Comment author: kilobug 10 August 2015 01:41:27PM 1 point [-]

This won't work, like all other similar schemes, because you can't "prove" the gatekeeper down to the quark level of its hardware (so you're vulnerable to some kind of side attack, like the memory bit-flipping attack that was discussed recently), nor shield the AI from communicating through side channels (like varying the temperature of its internal processing unit, which in turn will influence the air conditioning system, ...).

And that's not even considering that the AI could actually discover new physics (new particles, ...) and have some ability to manipulate them with its own hardware.

This whole class of approaches can't work, because there are just too many avenues for side attacks and side channels of communication, and you can't formally prove that none of them are available without making proofs over the whole system (AI + gatekeeper + power generator + air conditioner + ...) at the Schrödinger-equation level.

Comment author: JoshuaFox 10 August 2015 01:58:48PM 0 points [-]

You're quite right--these are among the standard objections for boxing, as mentioned in the post. However, AI boxing may have value as a stopgap in an early stage, so I'm wondering about the idea's value in that context.

Comment author: Luke_A_Somers 08 August 2015 03:28:39AM 1 point [-]

Seems like your usual 'independently verify everything the AI says' concept, only way more restrictive.

Comment author: JoshuaFox 08 August 2015 06:21:09PM 0 points [-]

Sure, but to "independently verify" the output of an entity smarter than you is generally impossible. This scheme makes verification possible, while also limiting the potential of the boxed AI to choose its answers.
