Less Wrong is a community blog devoted to refining the art of human rationality.

Comment author: Raemon 25 February 2017 08:37:52PM 2 points [-]

Note: I'd prefer comments go in the EA forum.

[Link] What Should the Average EA Do About AI Alignment?

4 Raemon 25 February 2017 08:37PM
Comment author: Raemon 15 February 2017 03:23:08AM 1 point [-]

Thank you for writing this up - I don't know if I had formally asked for it, but I was hoping for a more concise summary. :)

In response to The Social Substrate
Comment author: WhySpace 10 February 2017 06:40:45AM *  4 points [-]

I rather like this concept, and probably put higher credence on it than you. However, I don't think we are actually modeling that many layers deep. As far as I can tell, it's actually rare to model even 1 layer deep. I think your hypothesis is close, but not quite there. We are definitely doing something, but I don't think it can properly be described as modeling, at least in such fast-paced circumstances. It's something close to modeling, but not quite it. It's more like what a machine learning algorithm does, I think, and less like a computer simulation.

Models have moving parts, and diverge rapidly at points of uncertainty, like how others might react. When you build a model, it is a conscious process, and requires intelligent thought. The model takes world states as inputs, and simulates the effects these have on the components of the model. Then, after a bunch of time consuming computation, the model spits out a play-by-play of what we think will happen. If there are any points of uncertainty, the model will spit out multiple possibilities stemming from each, and build up multiple possible branches. This is extremely time consuming, and resource intense.

But there's a fast, System 1 friendly way to route around needing a time-consuming model: just use a lookup table.^[1] Maybe run the time-consuming model a bunch of times for different inputs, and then mentally jot down the outputs for quick access later, on the fly. Build a big 2xn lookup table, with model inputs in one column and results in the other. Do the same for every model you find useful. Maybe have one table for a friend's preferences: inputting tuna fish outputs gratitude (for remembering her preferences). Inputting tickling outputs violence.
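To make the lookup-table trick concrete, here's a minimal Python sketch. The `friend_model` function and all its inputs and outputs are invented for illustration; the point is only the shape of the trick: pay the simulation cost once, offline, then answer on the fly with a cheap table hit.

```python
def friend_model(stimulus: str) -> str:
    """Stand-in for a slow, deliberate System 2 simulation of a friend."""
    reactions = {"tuna fish": "gratitude", "tickling": "violence"}
    return reactions.get(stimulus, "confusion")

# Run the expensive model once per input of interest, ahead of time...
lookup_table = {s: friend_model(s) for s in ("tuna fish", "tickling", "small talk")}

# ...then answer instantly with a constant-time dictionary hit, no re-simulation.
print(lookup_table["tuna fish"])  # gratitude
```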

Perhaps this is why we obsess over stressful situations, going over all the interpretations and everything we could have done differently. We're building models of worrying situations, running them, and then storing the results for quick recall later. Maybe some of this is actually going on in dreams and nightmares, too.

But there's another way to build a lookup table: directly from data, without running any simulation. I think we just naturally keep tabs of all sorts of things without even thinking about it. Arguably, most of our actions are being directed by these mental associations, and not by anything containing conscious models.

Here's an example of what I think is going on, mentally:

Someone said something that pattern matches as rash? Quick, scan through all the lookup tables within arm’s reach for callous inputs. One output says joking. Another says accident. A third says he's being passive aggressive. Joking seems to pattern match the situation the best.

But oh look, you also ran it through some of the lookup tables for social simulations, and one came up with a flashing red light saying Gary's mom doesn't realize it was a joke.

That's awkward. You don't have any TAPs (Trigger Action Plans) installed for what to do in situations that pattern match to an authority figure misunderstanding a rude joke as serious. Your mind spirals out to less and less applicable TAP lookup tables, and the closest match is a trigger called "friend being an ass". You know he's actually joking, but this is the closest match, so you look at the action column, and it says to reprimand him, so you do.

Note that no actual modeling has occurred, and that all lookup tables used could have been generated purely experimentally, without ever consciously simulating anyone. This would explain why it's so hard to explain the parts of our model when asked: we have no model, just heuristics, and fuzzy gut feeling about the situation. Running the model again would fill in some of the details we've forgotten, but takes a while to run, and slows down the conversation. That level of introspection is fine in an intimate, introspective conversation, but if it's moving faster, the conversation will have changed topics by the time you've clarified your thoughts into a coherent model.
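The TAP story above can be caricatured in a few lines of Python: a pure nearest-match lookup, with no simulation anywhere. The triggers, actions, and word-overlap similarity measure are all made up for illustration.

```python
# Trigger -> action table; the entries are invented for illustration.
TAPS = {
    "friend being an ass": "reprimand him",
    "friend telling a joke": "laugh politely",
    "stranger being rude": "ignore it",
}

def similarity(a: str, b: str) -> float:
    """Crude pattern match: fraction of shared words (Jaccard overlap)."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

def closest_tap(situation: str) -> str:
    """Fire the action of the best-matching trigger, even if the match is poor."""
    best_trigger = max(TAPS, key=lambda t: similarity(t, situation))
    return TAPS[best_trigger]

# No trigger exists for an authority figure misunderstanding a joke, so the
# nearest one ("friend being an ass") fires anyway:
print(closest_tap("authority figure thinks friend being an ass"))  # reprimand him
```

Notice that the "wrong but closest" behavior from the Gary's-mom example falls out automatically: `max` always returns *some* trigger, however bad the fit.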

Most of the time though, I don't think we even explicitly think about the moving parts that would be necessary to build a model. Take lying, for example:

We rarely think "A wants B to think X about C, because A models B as modeling C in a way that A doesn't like, and A realizes that X is false but would cause B to act in a way that would benefit A if B believed it." (I'm not even sure that will parse correctly for anyone who reads it. That's kind of my point though.)

Instead, we just think "A told lie X to B about C". Or even just "A lied", leaving out all the specific details unless they become necessary. All the complexity of precisely what a lie is gets tucked away neatly inside the handle "lie", so we don't have to think about it or consciously model it. We just have to pattern match something to it, and then we can apply the label.

If pressed, we'll look up what "lied" means, and say that "A said X was true, but X is actually false". If someone questions whether A might actually believe X, we'll improve our model of lying further, to include the requirement that A not actually believe X. We'll enact a TAP to search for evidence that A thinks X, and come up with memories Y and Z, which we will recount verbally. If someone suspects that you are biased against A, or just exhibiting confirmation bias, they may say so. This just trips a defensive TAP, which triggers a "find evidence of innocence" action. So, your brain kicks into high gear and automatically searches all your lookup tables for things which pattern match as evidence in your favor.

We appear to be able to package extremely complex models up into a single function, so it seems unlikely that we are doing anything different with simpler models of things like lying. There's no real difference in how complex the concept of god feels from the concept of a single atom or something, even though one has many more moving parts under the hood of the model. We're not using any of the moving parts of the model, just spitting out cached thoughts from a lookup table, so we don't notice the difference.

If true, this has a bunch of other interesting implications:

  • This is likely also why people usually act first and pick a reason for that choice second: we don't have a coherent model of the results until afterward anyway, so it's impossible to act like an agent in real time. We can only do what we are already in the habit of doing, by following cached TAPs. This is the reason behind akrasia, and the "elephant and rider" (System 1 and System 2) relationship.

  • Also note that this scales much better: you don't need to know any causal mechanisms to build a lookup table, so you can think generally about how arbitrarily large groups will act based only on past experience, without needing to build it up from simulating huge numbers of individuals.

  • It implies that we are just Chinese Rooms most of the time, since conscious modeling is rarely involved. Another way of thinking of it is that we store the answers to the sorts of common computations we expect to do in (working?) memory, so that the more computationally intense consciousness can concentrate on the novel or difficult parts. Perhaps we could even expand our consciousness digitally to always recompute responses every time.

[1] For the record, I don't think our minds have neat, orderly lookup tables. I think they use messy, associative reasoning, like the Rubes and Bleggs in How An Algorithm Feels From The Inside. This is what I'm referring to when I mention pattern matching, and each time I talk about looking something up in an empirically derived lookup table, a simulation input/results lookup table, or a TAP lookup table.

I think these sorts of central nodes with properties attached make up a vast, web-like network, built like network 2 in the link. All the properties are themselves somewhat fuzzy, just like the central "rube"/"blegg" node. We could de-construct "cube" into constituent components the same way: 6 sides, all are flat, sharp corners, sharp edges, sides roughly 90 degrees apart, etc. You run into the same mental problems with things like rhombohedrons, and are forced to improve your sloppy default mental conception of cubes somehow if you want to avoid ambiguity.

Each node is defined only by its relations to adjacent nodes, just like the central rube/blegg node. There are no labels attached to the nodes, just node clusters for words and sounds and letters attached to the thing they are meant to represent. It would be a graph theory monster if we tried to map it all out, but in principle you could do it by asking someone how strongly they associated various words and concepts.
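In miniature, that web is just a weighted graph. Here's a toy Python sketch in the spirit of the rube/blegg diagram; every node name and weight is invented for illustration:

```python
# Each central node is defined only by weighted links to its property nodes.
associations = {
    "blegg": {"blue": 0.9, "egg-shaped": 0.9, "furred": 0.8},
    "rube":  {"red": 0.9, "cube-shaped": 0.9, "smooth": 0.8},
}

def classify(observed_properties: set) -> str:
    """Pick the central node whose neighbors best overlap the observations."""
    def score(node: str) -> float:
        return sum(weight for prop, weight in associations[node].items()
                   if prop in observed_properties)
    return max(associations, key=score)

print(classify({"blue", "furred"}))  # blegg
```

Mapping a real mind this way would mean eliciting those edge weights one association at a time, which is the "graph theory monster" part.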

Comment author: Raemon 10 February 2017 08:34:03PM 0 points [-]

I'm not sure you actually disagree with the OP. I think you are probably right about the mechanism by which people identify and react to social situations.

I think the main claims of the OP hold whether you're making hyper-fast calculations or lookup checks. The lookup checks still correspond roughly to what the hyper-fast calculations would be, and I read the OP mainly as a cautionary tale for people who attempt to use System 2 reasoning to analyze social situations (and, especially, for people attempting to change social norms).

Aspiring rationalists are often the sort of people who look for inefficiencies in social norms and try to change them. But this often results in missing important pieces of all the nuances that System 1 was handling.

Comment author: Lumifer 09 February 2017 05:24:17PM 0 points [-]

Let me be more clear.

Point 1: The Newcomb Problem tells you nothing about actual social interactions between actual humans. If you're interested in social structures, techniques, etc., the Newcomb Problem is the wrong place to start.

Point 2: Trust in this context can be defined more or less as "accepting without verifying". There is no trust involved in the Newcomb problem.

Oh, and in case you're curious, I two-box.

Comment author: Raemon 09 February 2017 07:13:58PM 1 point [-]

If you two-box, shouldn't Point 1 be "Newcomb's problem doesn't tell you anything useful about anything" rather than "Newcomb's problem doesn't tell you anything useful about trust"?

In response to The Social Substrate
Comment author: Dagon 09 February 2017 08:44:50AM 0 points [-]

I'm not sure William Newcomb would agree to pose as Omega, and if you're going to change the problem, you really ought to explore the ramifications. Like what happens if the prediction is wrong - it becomes a boring "convince the agent you'll one-box, then two-box" problem if you assume only human-like predictive abilities. Being a cloaked psychopath in a community of cooperators probably nets you more points than just cooperating all the time.
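A quick expected-value sketch of the "what if the prediction is wrong" case, using the standard payoffs ($1M in the opaque box if you're predicted to one-box, $1k always in the clear box); the accuracy figures here are arbitrary:

```python
def expected_value(one_box: bool, accuracy: float) -> float:
    """Expected payoff when the predictor guesses your choice with `accuracy`."""
    if one_box:
        # The opaque box is filled only if the predictor foresaw one-boxing.
        return accuracy * 1_000_000
    # Two-boxing always nets the $1,000, plus the million when mispredicted.
    return 1_000 + (1 - accuracy) * 1_000_000

for acc in (0.5, 0.9, 0.999):
    print(acc, expected_value(True, acc), expected_value(False, acc))
```

At merely human accuracy (around 50%), two-boxing dominates; one-boxing only pulls ahead once the predictor beats roughly 50.05% accuracy, which is why the human-predictor version collapses into "convince them you'll one-box, then two-box".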

Also, Hofstadter's idea of superrationality deserves a link in the "other sources" list.

In response to comment by Dagon on The Social Substrate
Comment author: Raemon 09 February 2017 08:46:43AM 1 point [-]

I think this is explored in Critch's post (which is linked at the bottom).


Comment author: lukeprog 27 January 2017 02:41:03PM *  3 points [-]

Today I encountered a real-life account of the chain story — involving a cow rather than an elephant — around 24:10 into the "Best of BackStory, Vol. 1" episode of the podcast BackStory.

Comment author: Raemon 29 January 2017 09:25:18PM 0 points [-]

Cool, I'd been wondering about that. :)

Comment author: Vaniver 19 January 2017 07:11:19PM 23 points [-]

Two notes on things going on behind the scenes:

  1. Instead of Less Wrong being a project that's no org's top focus, we're creating an org focused on rationality community building, which will have Less Wrong as its primary project (until Less Wrong doesn't look like the best place to have the rationality community).

  2. We decided a few weeks ago that the LW codebase was bad enough that it would be easier to migrate to a new codebase and then make the necessary changes. My optimistic estimate is that it'll be about 2 weeks until we're ready to migrate the database over, which seems like it might take a week. It's unclear what multiplier should be applied to my optimism to get a realistic estimate.

Comment author: Raemon 21 January 2017 03:08:12PM 1 point [-]

I'm curious what plans you have re: open source accessibility on the new codebase?

It might be cool to get the minimum viable version up and running, with a focus on making the documentation necessary to contribute really good, and then do a concerted push to get people to make various improvements.

That may not work, but it'd be an obvious time to try for it.

Comment author: lifelonglearner 21 January 2017 02:38:20AM 0 points [-]


Comment author: Raemon 21 January 2017 02:40:20AM 0 points [-]

Lists and Headers are especially vulnerable.

Comment author: lifelonglearner 21 January 2017 02:34:03AM 0 points [-]

Yeah. When I first posted this, it turned into Times New Roman, which looked really bad. I had to transfer it to Word, change the font, and then repaste it here. Do you know specifically what's going on?

Comment author: Raemon 21 January 2017 02:36:36AM 2 points [-]

LW parses formatting differently than Word or Wordpress.

What I end up doing is writing the document in a plaintext file, and then copying it into LW and any other blogs I'm posting to.
