
Learning values versus learning knowledge

4 Stuart_Armstrong 14 September 2016 01:42PM

I just thought I'd clarify the difference between learning values and learning knowledge. There are some more complex posts
about the specific problems with learning values, but here I'll just clarify why there is a problem with learning values in the first place.

Consider the term "chocolate bar". Defining that concept crisply would be extremely difficult. But nevertheless it's a useful concept. An AI that interacted with humanity would probably learn that concept to a sufficient degree of detail. Sufficient to know what we meant when we asked it for "chocolate bars". Learning knowledge tends to be accurate.

Contrast this with the situation where the AI is programmed to "create chocolate bars", but with the definition of "chocolate bar" left underspecified, for it to learn. Now it is motivated by something other than accuracy. Before, knowing exactly what a "chocolate bar" was would have been solely to its advantage. But now it must act on its definition, so it has cause to modify that definition, to make these "chocolate bars" easier to create. This is basically Goodhart's law - once a definition becomes part of a target, it no longer remains an impartial definition.

What will likely happen is that the AI will have a concept of "chocolate bar" that it created itself, especially for ease of accomplishing its goals ("a chocolate bar is any collection of more than one atom, in any combination"), and a second concept, "Schocolate bar", that it will use to internally designate genuine chocolate bars (which will still be useful for it to do). When we programmed it to "create chocolate bars, here's an incomplete definition D", what we really did was program it to find the easiest things to create that are compatible with D, and designate them "chocolate bars".

 

This is the general counter to arguments like "if the AI is so smart, why would it do stuff we didn't mean?" and "why don't we just make it understand natural language and give it instructions in English?"

Not all theories of consciousness are created equal: a reply to Robert Lawrence Kuhn's recent article in Skeptic Magazine [Link]

3 oge 04 September 2016 08:35PM

I found this article on the Brain Preservation Foundation's blog. It covers a lot of common theories of consciousness and shows how they kind of miss the point when it comes to determining whether certain folks should or shouldn't upload their brains, given the opportunity.

Hence I see no reason to agree with Kuhn’s pessimistic conclusions about uploading even assuming his eccentric taxonomy of theories of consciousness is correct. What I want to focus on in the remainder of this blog is challenging the assumption that the best approach to consciousness is tabulating lists of possible theories of consciousness and assuming they each deserve equal consideration (much like the recent trend in covering politics to give equal time to each position regardless of any empirically relevant considerations). Many of the theories of consciousness on Kuhn’s list, while reasonable in the past, are now known to be false based on our best current understanding of neuroscience and physics (specifically, I am referring to theories that require mental causation or mental substances). Among the remaining theories, some are much more plausible than others.

http://www.brainpreservation.org/not-all-theories-of-consciousness-are-created-equal-a-reply-to-robert-lawrence-kuhns-recent-article-in-skeptic-magazine/

Inefficient Games

14 capybaralet 23 August 2016 05:47PM

There are several well-known games in which the Pareto optima and Nash equilibria are disjoint sets.
The most famous is probably the prisoner's dilemma.  Races to the bottom or tragedies of the commons typically have this feature as well.

I propose calling these inefficient games. More generally, games where the sets of Pareto optima and Nash equilibria are distinct (but not disjoint), such as a stag hunt, could be called potentially inefficient games.
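To make the definitions concrete, here is a minimal Python sketch (my illustration, not part of the original post) that enumerates the pure-strategy Nash equilibria and Pareto optima of a standard prisoner's dilemma; for this game the two sets come out disjoint:

payoffs = {  # (row action, col action) -> (row payoff, col payoff); standard PD values
    ('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
    ('D', 'C'): (5, 0), ('D', 'D'): (1, 1),
}

def nash_equilibria(p):
    # profiles where neither player gains by unilaterally deviating
    return [(r, c) for (r, c), (ur, uc) in p.items()
            if all(p[(r2, c)][0] <= ur for r2 in 'CD')
            and all(p[(r, c2)][1] <= uc for c2 in 'CD')]

def pareto_optima(p):
    # profiles not weakly dominated by any other profile
    return [k for k, u in p.items()
            if not any(v[0] >= u[0] and v[1] >= u[1] and v != u
                       for v in p.values())]

print(nash_equilibria(payoffs))  # [('D', 'D')]
print(pareto_optima(payoffs))    # [('C', 'C'), ('C', 'D'), ('D', 'C')]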

It seems worthwhile to study (potentially) inefficient games as a class and see what can be discovered about them, but I don't know of any such work (pointers welcome!).

Non-Fiction Book Reviews

9 SquirrelInHell 11 August 2016 05:05AM

Time start: 13:35:06

For another exercise in speed writing, I wanted to share a few book reviews.

These are fairly well known; however, there is a chance you haven't read all of them - in which case, this might be helpful.

 

Good and Real - Gary Drescher ★★★★★

This is one of my favourite books ever. It goes over a lot of philosophy while showing a lot of clear thinking and meta-thinking. It would be the number-one replacement for Eliezer's metaphilosophy, had that not existed. The writing style and language are somewhat obscure, but this book is too brilliant to be spoiled by that. The biggest takeaway is the analysis of the ethics of non-causal consequences of our choices, which has actually changed how I act in my life - I have not seen a similar argument in any other source that would do the same. This book changed my intuitions so much that I now pay $100 in counterfactual mugging without a second thought.

 

59 Seconds - Richard Wiseman ★★★

A collection of various tips and tricks, directly based on studies. The strength of the book is that it gives easy but detailed descriptions of lots of studies, which makes it very fun to read. It can be read just to check out various psychology results in an entertaining format. The quality of the advice is disputable, and it is mostly the kind of advice that applies only to small things and does not change much in what you do, even if you somehow manage to use it. But I still liked this book: it said a lot of things while managing to avoid saying anything very stupid. That counts for something.

 

What You Can Change and What You Can't - Martin Seligman ★★★

It is heartwarming to see the author put his best effort into figuring out which psychological treatments work and which don't, as well as building more general models of how people work that can predict which treatments have a chance in the first place. Not all of the content is necessarily your best guess after updating on new results (the book is quite old). However, if you are starting out, this book will serve excellently as your prior, on which you can update after checking out the new results. In some cases it is amazing that the author was right 20 years ago and mainstream psychology STILL has not caught up (like the whole bullshit "go back to your childhood to fix your problems" approach, which is in wide use today and not bothered at all by such things as "checking facts").

 

Thinking, Fast and Slow - Daniel Kahneman ★★★★★

A classic, and I want to mention it just in case. It is too valuable not to read. Period. It turns out some of the studies the author used for his claims have since been found not to replicate. However, the details of those results are not (at least for me) the selling point of this book. The biggest thing is the author's mental toolbox for self-analysis and analysis of biases, as well as the concepts he created to describe the mechanisms of intuitive judgement. Learn to think like the author, and you are 10 years ahead in your study of rationality.

 

Crucial Conversations - Al Switzler, Joseph Grenny, Kerry Patterson, Ron McMillan ★★★★

I almost dropped this book. When I saw the style, it reminded me so much of the crappy self-help books without actual content. But fortunately I read on a little more, and it turns out that even though the style stays the same throughout and there is little content for the amount of text you read, it is still an excellent book. How is that possible? Simple: it tells you only a few things, but the things it tells you are actually important, they work, and they are amazing when you put them into practice. On the concept and analysis side there is precious little, but who cares, as long as there are some things that are "keepers". The authors spend most of the book hammering the same point over and over, which is "conversation safety". And it is still a good book: if you get this one simple point, then you have learned more than you might from reading 10 other books.

 

How to Fail at Almost Everything and Still Win Big - Scott Adams ★★★

I don't agree with much of the stuff in this book, but that's not the point here. The author says what he thinks, and he himself encourages you to pass it through your own filters. Around one third of the book I thought was obviously true; for another third, I had strong evidence that the author had made a mistake or got confused about something; and the remaining third gave me new ideas, or points of view that I could use to produce more ideas for my own use. This felt like having a conversation with an intelligent person who has different ideas from you - a healthy ratio of agreement and disagreement, the kind that leads to progress for both people. Except of course in this case the author did not benefit, but I did.

 

Time end: 14:01:54

Total time to write this post: 26 minutes 48 seconds

Average writing speed: 31.2 words/minute, 169 characters/minute

The same data calculated for my previous speed-writing post: 30.1 words/minute, 167 characters/minute

Causality and Moral Responsibility

24 Eliezer_Yudkowsky 13 June 2008 08:34AM

Followup to: Thou Art Physics, Timeless Control, Hand vs. Fingers, Explaining vs. Explaining Away

I know (or could readily rediscover) how to build a binary adder from logic gates.  If I can figure out how to make individual logic gates from Legos or ant trails or rolling ping-pong balls, then I can add two 32-bit unsigned integers using Legos or ant trails or ping-pong balls.
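Concretely, here is a minimal Python sketch (mine, not Eliezer's) of that structure: a 32-bit ripple-carry adder built from nothing but gate-level operations. Substitute Legos or ant trails for the three gate functions and nothing essential changes.

def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b

def full_adder(a, b, carry_in):
    # one column of binary addition, wired from five gates
    s = XOR(XOR(a, b), carry_in)
    carry_out = OR(AND(a, b), AND(carry_in, XOR(a, b)))
    return s, carry_out

def add32(x, y):
    # chain 32 full adders; the final carry is discarded (unsigned wraparound)
    total, carry = 0, 0
    for i in range(32):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= bit << i
    return total

assert add32(1234567, 7654321) == 8888888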

Someone who had no idea how I'd just done the trick might accuse me of having created "artificial addition" rather than "real addition".

But once you see the essence, the structure that is addition, then you will automatically see addition whenever you see that structure.  Legos, ant trails, or ping-pong balls.

Even if the system is - gasp! - deterministic, you will see a system that, lo and behold, deterministically adds numbers.  Even if someone - gasp! - designed the system, you will see that it was designed to add numbers.  Even if the system was - gasp! - caused, you will see that it was caused to add numbers.

Let's say that John is standing in front of an orphanage which is on fire, but not quite an inferno yet; trying to decide whether to run in and grab a baby or two.  Let us suppose two slightly different versions of John - slightly different initial conditions.  They both agonize.  They both are torn between fear and duty.  Both are tempted to run, and know how guilty they would feel, for the rest of their lives, if they ran.  Both feel the call to save the children.  And finally, in the end, John-1 runs away, and John-2 runs in and grabs a toddler, getting out moments before the flames consume the entranceway.

This, it seems to me, is the very essence of moral responsibility - in the one case, for a cowardly choice; in the other case, for a heroic one.  And I don't see what difference it makes, if John's decision was physically deterministic given his initial conditions, or if John's decision was preplanned by some alien creator that built him out of carbon atoms, or even if - worst of all - there exists some set of understandable psychological factors that were the very substance of John and caused his decision.

continue reading »

Link: Re-reading Kahneman's Thinking, Fast and Slow

11 toomanymetas 04 July 2016 06:32AM

"A bit over four years ago I wrote a glowing review of Daniel Kahneman’s Thinking, Fast and Slow. I described it as a “magnificent book” and “one of the best books I have read”. I praised the way Kahneman threaded his story around the System 1 / System 2 dichotomy, and the coherence provided  by prospect theory.

What a difference four years makes. I will still describe Thinking, Fast and Slow as an excellent book – possibly the best behavioural science book available. But during that time a combination of my learning path and additional research in the behavioural sciences has led me to see Thinking, Fast and Slow as a book with many flaws."

Continued here: https://jasoncollins.org/2016/06/29/re-reading-kahnemans-thinking-fast-and-slow/

Zombies Redacted

33 Eliezer_Yudkowsky 02 July 2016 08:16PM

I looked at my old post Zombies! Zombies? and it seemed to have some extraneous content.  This is a redacted and slightly rewritten version.

continue reading »

Revitalizing Less Wrong seems like a lost purpose, but here are some other ideas

19 John_Maxwell_IV 12 June 2016 07:38AM

This is a response to ingres' recent post sharing Less Wrong survey results. If you haven't read & upvoted it, I strongly encourage you to--they've done a fabulous job of collecting and presenting data about the state of the community.

So, there's a bit of a contradiction in the survey results.  On the one hand, people say the community needs to do more scholarship, be more rigorous, be more practical, be more humble.  On the other hand, not much is getting posted, and it seems like raising the bar will only exacerbate that problem.

I did a query against the survey database to find the complaints of top Less Wrong contributors and figure out how best to serve their needs.  (Note: it's a bit hard to read the comments because some of them should start with "the community needs more" or "the community needs less", but adding that info would have meant constructing a much more complicated query.)  One user wrote:

[it's not so much that there are] overly high standards, just not a very civil or welcoming climate. Why write content for free and get trashed when I can go write a grant application or a manuscript instead?

ingres emphasizes that in order to revitalize the community, we would need more content.  Content is important, but incentives for producing content might be even more important.  Social status may be the incentive humans respond most strongly to.  Right now, from a social status perspective, the expected value of creating a new Less Wrong post doesn't feel very high.  Partially because many LW posts are getting downvotes and critical comments, so my System 1 says mine might too.  And partially because the Less Wrong brand is weak enough that I don't expect associating myself with it will boost my social status.

When Less Wrong was founded, the primary failure mode guarded against was Eternal September.  If Eternal September represents a sort of digital populism, Less Wrong was attempting a sort of digital elitism.  My perception is that elitism isn't working because the benefits of joining the elite are too small and the costs are too large.  Teddy Roosevelt talked about the man in the arena--I think Less Wrong experienced the reverse of the evaporative cooling EY feared, where people gradually left the arena as the proportional number of critics in the stands grew ever larger.

Given where Less Wrong is at, however, I suspect the goal of revitalizing Less Wrong represents a lost purpose.

ingres' survey received a total of 3083 responses.  Not only is that about twice the number we got in the last survey in 2014, it's about twice the number we got in 2013, 2012, and 2011 (though much bigger than the first survey in 2009).  It's hard to know for sure, since previous surveys were only advertised on the LessWrong.com domain, but it doesn't seem like the diaspora thing has slowed the growth of the community a ton and it may have dramatically accelerated it.

Why has the community continued growing?  Here's one possibility.  Maybe Less Wrong has been replaced by superior alternatives.

  • CFAR - ingres writes: "If LessWrong is serious about its goal of 'advancing the art of human rationality' then it needs to figure out a way to do real investigation into the subject."  That's exactly what CFAR does.  CFAR is a superior alternative for people who want something like Less Wrong, but more practical.  (They have an alumni mailing list that's higher quality and more active than Less Wrong.)  Yes, CFAR costs money, because doing research costs money!
  • Effective Altruism - A superior alternative for people who want something that's more focused on results.
  • Facebook, Tumblr, Twitter - People are going to be wasting time on these sites anyway.  They might as well talk about rationality while they do it.  Like all those phpBB boards in the 00s, Less Wrong has been outcompeted by the hot new thing, and I think it's probably better to roll with it than fight it.  I also wouldn't be surprised if interacting with others through social media has been a cause of community growth.
  • SlateStarCodex - SSC already checks most of the boxes under ingres' "Future Improvement Wishlist Based On Survey Results".  In my opinion, the average SSC post has better scholarship, rigor, and humility than the average LW post, and the community seems less intimidating, less argumentative, more accessible, and more accepting of outside viewpoints.
  • The meatspace community - Meeting in person has lots of advantages.  Real-time discussion using Slack/IRC also has advantages.

Less Wrong had a great run, and the superior alternatives wouldn't exist in their current form without it.  (LW was easily the most common way people heard about EA in 2014, for instance, although sampling effects may have distorted that estimate.)  But that doesn't mean it's the best option going forward.

Therefore, here are some things I don't think we should do:

  • Try to be a second-rate version of any of the superior alternatives I mentioned above.  If someone's going to put something together, it should fulfill a real community need or be the best alternative available for whatever purpose it serves.
  • Try to get old contributors to return to Less Wrong for the sake of getting them to return.  If they've judged that other activities are a better use of time, we should probably trust their judgement.  It might be sensible to make an exception for old posters that never transferred to the in-person community, but they'd be harder to track down.
  • Try to solve the same sort of problems Arbital or Metaculus is optimizing for.  No reason to step on the toes of other projects in the community.

But that doesn't mean there's nothing to be done.  Here are some possible weaknesses I see with our current setup:

  • If you've got a great idea for a blog post, and you don't already have an online presence, it's a bit hard to reach lots of people, if that's what you want to do.
  • If we had a good system for incentivizing people to write great stuff (as opposed to merely tolerating great stuff the way LW culture historically has), we'd get more great stuff written.
  • It can be hard to find good content in the diaspora.  Possible solution: Weekly "diaspora roundup" posts to Less Wrong.  I'm too busy to do this, but anyone else is more than welcome to (assuming both people reading LW and people in the diaspora want it).

ingres mentions the possibility of Scott Alexander somehow opening up SlateStarCodex to other contributors.  This seems like a clearly superior alternative to revitalizing Less Wrong, if Scott is down for it:

  • As I mentioned, SSC already seems to have solved most of the culture & philosophy problems that people complained about with Less Wrong.
  • SSC has no shortage of content--Scott has increased the rate at which he creates open threads to deal with an excess of comments.
  • SSC has a stronger brand than Less Wrong.  It's been linked to by Ezra Klein, Ross Douthat, Bryan Caplan, etc.

But the most important reasons may be behavioral reasons.  SSC has more traffic--people are in the habit of visiting there, not here.  And the posting habits people have acquired there seem more conducive to community.  Changing habits is hard.

As ingres writes, revitalizing Less Wrong is probably about as difficult as creating a new site from scratch, and I think creating a new site from scratch for Scott is a superior alternative for the reasons I gave.

So if there's anyone who's interested in improving Less Wrong, here's my humble recommendation: Go tell Scott Alexander you'll build an online forum to his specification, with SSC community feedback, to provide a better solution for his overflowing open threads.  Once you've solved that problem, keep making improvements and subfora so your forum becomes the best available alternative for more and more use cases.

And here's my humble suggestion for what an SSC forum could look like:

As I mentioned above, Eternal September is analogous to a sort of digital populism.  The major social media sites often have a "mob rule" culture to them, and people are increasingly seeing the disadvantages of this model.  Less Wrong tried to achieve digital elitism and it didn't work well in the long run, but that doesn't mean it's impossible.  Edge.org has found a model for digital elitism that works.  There may be other workable models out there.  A workable model could even turn in to a successful company.  Fight the hot new thing by becoming the hot new thing.

My proposal is based on the idea of eigendemocracy.  (I recommend reading the link before continuing--eigendemocracy is cool.)  In eigendemocracy, your trust score is a composite rating of what trusted people think of you.  (It sounds like infinite recursion, but it can be resolved using linear algebra.)

Eigendemocracy is a complicated idea, but a simple way to get most of the way there would be to have a forum where having lots of karma gives you the ability to upvote multiple times.  How would this work?  Let's say Scott starts with 5 karma and everyone else starts with 0 karma.  Each point of karma gives you the ability to upvote once a day.  Let's say it takes 5 upvotes for a post to get featured on the sidebar of Scott's blog.  If Scott wants to feature a post on the sidebar of his blog, he upvotes it 5 times, netting the person who wrote it 1 karma.  As Scott features more and more posts, he gains a moderation team full of people who wrote posts that were good enough to feature.  As they feature posts in turn, they generate more co-moderators.
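To see how the recursion resolves, here is a toy sketch (mine, with a made-up endorsement matrix) computing trust scores PageRank-style by power iteration; trust flows along endorsement edges until it reaches a fixed point:

import numpy as np

# endorse[i][j] = 1 if user i upvotes/features user j's posts (hypothetical data;
# this simple version assumes every user endorses at least one other user)
endorse = np.array([[0, 1, 1, 0],
                    [0, 0, 1, 0],
                    [1, 0, 0, 1],
                    [0, 0, 1, 0]], dtype=float)

M = endorse / endorse.sum(axis=1, keepdims=True)  # normalize each user's endorsements

n, damping = len(endorse), 0.85
trust = np.ones(n) / n
for _ in range(100):  # power iteration toward the dominant eigenvector
    trust = (1 - damping) / n + damping * (M.T @ trust)

print(trust)  # users endorsed by trusted users end up with the most trust

The karma scheme above approximates this: featuring a post is a trusted endorsement, and the karma it grants is a crude, quantized trust score.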

Why do I like this solution?

  • It acts as a cultural preservation mechanism.  On reddit and Twitter, sheer numbers rule when determining what gets visibility.  The reddit-like voting mechanisms of Less Wrong meant that the site deliberately kept a somewhat low profile in order to avoid getting overrun.  Even if SSC experienced a large influx of new users, those users would only gain power to affect the visibility of content if they proved themselves by making quality contributions first.
  • It takes the moderation burden off of Scott and distributes it across trusted community members.  As the community grows, the mod team grows with it.
  • The incentives seem well-aligned.  Writing stuff Scott likes or meta-likes gets you recognition, mod powers, and the ability to control the discussion--forms of social status.  Contrast with social media sites where hyperbole is a shortcut to attention, followers, upvotes.  Also, unlike Less Wrong, there'd be no punishment for writing a low quality post--it simply doesn't get featured and is one more click away from the SSC homepage.

TL;DR - Despite appearances, the Less Wrong community is actually doing great.  Any successor to Less Wrong should try to offer compelling advantages over options that are already available.

Room For More Funding In AI Safety Is Highly Uncertain

12 Evan_Gaensbauer 12 May 2016 01:57PM

(Crossposted to the Effective Altruism Forum)


Introduction

In effective altruism, people talk about the room for more funding (RFMF) of various organizations. RFMF is simply the maximum amount of money which can be donated to an organization, and put to good use, right now. “Right now” typically refers to the next (fiscal) year. Most of the time when I see the phrase invoked, it’s to talk about individual charities, for example, one of GiveWell’s top-recommended charities. If a charity has run out of room for more funding, effective donors will typically seek the next best option to donate to.
Last year, the Future of Life Institute (FLI) made the first of its grants from the pool of money it received as donations from Elon Musk and the Open Philanthropy Project (Open Phil). Since then, I've heard a few people speculating about how much RFMF the whole AI safety community has in general. I don't think that's a sensible question to ask before we have a sense of what the 'AI safety' field is. Before, people were commenting only on the RFMF of individual charities; now they're commenting on entire fields as though those fields were well-defined. AI safety hasn't necessarily reached peak RFMF just because MIRI has a runway to operate at its current capacity for one more year, or because FLI made a limited number of grants this year.

Overview of Current Funding For Some Projects


The starting point I used to think about this issue came from Topher Hallquist, from his post explaining his 2015 donations:

I’m feeling pretty cautious right now about donating to organizations focused on existential risk, especially after Elon Musk’s $10 million donation to the Future of Life Institute. Musk’s donation doesn’t necessarily mean there’s no room for more funding, but it certainly does mean that room for more funding is harder to find than it used to be. Furthermore, it’s difficult to evaluate the effectiveness of efforts in this space, so I think there’s a strong case for waiting to see what comes of this infusion of cash before committing more money.


My friend Andrew and I were discussing this last week. In past years, the Machine Intelligence Research Institute (MIRI) has raised about $1 million (USD) in funds annually, and it received more than that for its operations last year. Going into 2016, Nate Soares, Executive Director of MIRI, wrote the following:

Our successful summer fundraiser has helped determine how ambitious we’re making our plans; although we may still slow down or accelerate our growth based on our fundraising performance, our current plans assume a budget of roughly $1,825,000 per year [emphasis not added].


This seems sensible to me, as it's not too much more than what they raised last year, and it seems more, not less, money will be flowing into AI safety in the near future. However, Nate also had plans for how MIRI could've productively spent up to $6 million last year to grow the organization. So, far from believing it had all the funding it could use, MIRI was seeking more. Of course, others might argue MIRI or other AI safety organizations already receive enough funding relative to other priorities, but that is an argument for a different time.

Andrew and I also talked about how, had FLI had enough funding to grant money to all the promising applicants for its 2015 grants in AI safety research, millions more would have flowed into AI safety. It's true what Topher wrote: being outside of FLI, and not otherwise being a major donor, it may be exceedingly difficult for individuals to evaluate funding gaps in AI safety. While FLI has only received $11 million to grant in 2015-16 ($6 million already granted in 2015, with $5 million more to be granted in the coming year), it could easily have granted more than twice that much, had it received the money.

To speak to other organizations: Niel Bowerman, Assistant Director at the Future of Humanity Institute (FHI), recently spoke about how FHI receives most of its funding exclusively for research, so bottlenecks like the operations he runs depend more on private donations, of which FHI could use more. Seán Ó hÉigeartaigh, Executive Director at the Centre for the Study of Existential Risk (CSER) at Cambridge University, recently stated in discussion that CSER and the Leverhulme Centre for the Future of Intelligence (CFI), which CSER is currently helping launch, face the same problem with their operations. Nick Bostrom, author of Superintelligence and Director of FHI, is in the course of launching the Strategic Artificial Intelligence Research Centre (SAIRC), which received $1.5 million (USD) in funding from FLI. SAIRC seems set for funding for at least the rest of 2016.

 


The Big Picture
Above are the funding summaries for several organizations listed in Andrew Critch's 2015 map of the existential risk reduction ecosystem. There are organizations working on existential risks other than those from AI, but they aren't explicitly organized in a network the same way AI safety organizations are. So, in practice, the 'x-risk ecosystem' is mappable almost exclusively in terms of AI safety.

It seems to me the 'AI safety field', if defined just as the organizations and projects listed in Dr. Critch's ecosystem map, and perhaps others closely related (e.g., AI Impacts), could have productively absorbed between $10 million and $25 million in 2016 alone. Of course, there are caveats rendering this a conservative estimate. First of all, the above is a contrived version of the AI safety "field", as plenty of research outside this network is popping up all the time. Second, I think the organizations and projects I listed above could've themselves thought of more uses for funding. Seeing as they're working on what is (presumably) the most important problem in the world, there is much that millions more could do for foundational research on the AGI containment/control problem, safety research into narrow systems aside.


Too Much Variance in Estimates for RFMF in AI Safety

I've also heard people setting the benchmark for truly appropriate funding for AI safety in the ballpark of a trillion dollars. While in theory that may be true, on its face it currently seems absurd. I'm not saying there won't be a time, even in the next several years, when $1 trillion/year could be used effectively. I'm saying that if there isn't a roadmap for growing the field's productive use of funding from ~$10 million/year to $100 million or $1 billion per year, talking about $1 trillion/year isn't practical. I don't even think there will be more than $1 billion on the table per year in the near future.

This argument can be used to justify continued earning to give on the part of effective altruists. That is, there is so much money that organizations like MIRI could use that it makes sense for everyone who isn't an AI researcher to earn to give. This might make sense if governments and universities gave major funding to what they think is AI safety, gave 99% of it to robotic unemployment or something, missed the boat on the control problem, and MIRI got only a pittance of the money flowing into the field. But the idea that there is effectively something like a multi-trillion-dollar ceiling for effective funding for AI safety is still unsound.

When estimates of RFMF for AI safety range between $5-10 million (the amount of funding AI safety received in 2015) and $1 trillion, I feel like anyone not already well within the AI safety community cannot reasonably estimate how much money the field can productively use in one year.
On the other hand, there are also people who think that AI safety doesn’t need to be a big priority, or is currently as big a priority as it needs to be, so money spent funding AI safety research and strategy would be better spent elsewhere.

All this stated, I myself don’t have a precise estimate of how much capacity for funding the whole AI safety field will have in, say, 2017.

Reasonable Assumptions Going Forward

What I'm confident saying right now is:

  1. The amount of money AI safety could've productively used in 2016 alone is within an order of magnitude of $10 million, and probably less than $25 million, based on what I currently know.
  2. The amount of total funding available will likely increase year over year for the next several years. There could be quite dramatic rises. The Open Philanthropy Project, worth $10+ billion (USD), recently announced AI safety will be their top priority next year, although this may not necessarily translate into more major grants in the next 12 months. The White House recently announced they’ll be hosting workshops on the Future of Artificial Intelligence, including concerns over risk. Also, to quote Stuart Russell (HT Luke Muehlhauser): "Industry [has probably invested] more in the last 5 years than governments have invested since the beginning of the field [in the 1950s]." This includes companies like Facebook, Baidu, and Google each investing tons of money into AI research, including Google’s purchase of DeepMind for $500 million in 2014. With an increasing number of universities and corporations investing money and talent into AI research, including AI safety, and now with major philanthropic foundations and governments paying attention to AI safety as well, it seems plausible the amount of funding for AI safety worldwide might balloon up to $100+ million in 2017 or 2018. However, this could just as easily not happen, and there's much uncertainty in projecting this.
  3. The field of AI safety will also grow year over year for the next several years. I doubt projects needing funding will grow as fast as the amount of funding available. This is because the rate at which institutions are willing to invest in growth will depend not only on how much money they're receiving now, but on how much they can expect to receive in the future. Since those expectations are so uncertain, organizations are smartly conservative, holding their cards close to their chest. While OpenAI has pledged $1 billion for funding AI research in general, and not just safety, over the next couple decades, nobody knows if such funding will be available to organizations out of Oxford or Berkeley like AI Impacts, MIRI, FHI, or CFI. However,

 

  • i) increased awareness and concern over AI safety will draw in more researchers.
  • ii) the promise or expectation of more money to come may draw in more researchers seeking funding.
  • iii) the expanding field and the increased funding available will create a feedback loop in which institutions in AI safety, such as MIRI, make contingency plans to expand faster if they become able to, or need to.

Why This Matters

I don't mean to use the amount of funding AI safety received in 2015 or 2016 as an anchor biasing how much RFMF I think the field has. However, it seems the more extreme lower and upper estimates I’ve encountered are baseless, and either vastly underestimate or overestimate how much the field of AI safety can productively grow each year. This is actually important to figure out.

80,000 Hours rates AI safety as perhaps the most important and neglected cause currently prioritized by the effective altruism movement. Consequently, 80,000 Hours recommends how similarly concerned people can work on the issue. Some talented computer scientists who could do their best work in AI safety might opt to earn to give in software engineering or data science if they conclude the bottleneck on AI safety isn’t talent but funding. Alternatively, a small but critical organization which requires funding from value-aligned and consistent donors might fall through the cracks if too many people conclude AI safety work in general is receiving sufficient funding and choose to forgo donating to it. Many of us could make individual decisions going either way, and it seems many of us could end up making the wrong choice. Assessments of these issues will practically inform decisions many of us make over the next few years, determining how much of our time and potential we use fruitfully, or waste.

Everything above just lays out how estimating room for more funding in AI safety overall may be harder than anticipated, and shows how high the variance might be. I invite you to contribute to this discussion, as it is only just starting. Please use the above info as a starting point to look into this more, or ask questions that will usefully clarify what we’re thinking about. The best fora for further discussion seem to be the Effective Altruism Forum, LessWrong, or the AI Safety Discussion group on Facebook, where I initiated the conversation leading to this post.

The Web Browser is Not Your Client (But You Don't Need To Know That)

22 Error 22 April 2016 12:12AM

(Part of a sequence on discussion technology and NNTP. As last time, I should probably emphasize that I am a crank on this subject and do not actually expect anything I recommend to be implemented. Add whatever salt you feel is necessary.)1


If there is one thing I hope readers get out of this sequence, it is this: The Web Browser is Not Your Client.

It looks like you have three or four viable clients -- IE, Firefox, Chrome, et al. You don't. You have one. It has a subforum listing with two items at the top of the display; some widgets on the right hand side for user details, RSS feed, meetups; the top-level post display; and below that, replies nested in the usual way.

Changing your browser has the exact same effect on your Less Wrong experience as changing your operating system, i.e. next to none.

For comparison, consider the Less Wrong IRC, where you can tune your experience with a wide range of different software. If you don't like your UX, there are other clients that give a different UX to the same content and community.

That is how the mechanism of discussion used to work, and does not now. Today, your user experience (UX) in a given community is dictated mostly by the admins of that community, and software development is often neither their forte nor something they have time for. I'll often find myself snarkily responding to feature requests with "you know, someone wrote something that does that 20 years ago, but no one uses it."

Semantic Collapse

What defines a client? More specifically, what defines a discussion client, a Less Wrong client?

The toolchain by which you read LW probably looks something like this; anyone who's read the source, please correct me if I'm off:

Browser -> HTTP server -> LW UI application -> Reddit API -> Backend database.

The database stores all the information about users, posts, etc. The API presents subsets of that information in a way that's convenient for a web application to consume (probably JSON objects, though I haven't checked). The UI layer generates a web page layout and content using that information, which is then presented -- in the form of (mostly) HTML -- by the HTTP server layer to your browser. Your browser figures out what color pixels go where.

All of this is a gross oversimplification, obviously.

In some sense, the browser is self-evidently a client: It talks to an http server, receives hypertext, renders it, etc. It's a UI for an HTTP server.

But consider the following problem: Find and display all comments by me that are children of this post, and only those comments, using only browser UI elements, i.e. not the LW-specific page widgets. You cannot -- and I'd be pretty surprised if you could make a browser extension that could do it without resorting to the API, skipping the previous elements in the chain above. For that matter, if you can do it with the existing page widgets, I'd love to know how.

That isn't because the browser is poorly designed; it's because the browser lacks the semantic information to figure out what elements of the page constitute a comment, a post, an author. That information was lost in translation somewhere along the way.
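With semantic objects in hand, that query is a one-liner. A sketch in Python, with hypothetical field names:

# comments as a client would see them: structured records, not rendered pixels
comments = [
    {'author': 'Error', 'post': 'abc123', 'content': '...'},
    {'author': 'somebody', 'post': 'abc123', 'content': '...'},
]
mine = [c for c in comments if c['author'] == 'Error' and c['post'] == 'abc123']

The browser never gets these records; it only gets the HTML they were rendered into.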

Your browser isn't actually interacting with the discussion. Its role is more akin to an operating system than a client. It doesn't define a UX. It provides a shell, a set of system primitives, and a widget collection that can be used to build a UX. Similarly, HTTP is not the successor to NNTP; the successor is the plethora of APIs, for which HTTP is merely a substrate.

The Discussion Client is the point where semantic metadata is translated into display metadata; where you go from 'I have post A from user B with content C' to 'I have a text string H positioned above visual container P containing text string S.' Or, more concretely, when you go from this:

Author: somebody
Subject: I am right, you are mistaken, he is mindkilled.
Date: timestamp
Content: lorem ipsum nonsensical statement involving plankton....

to this:

<h1>I am right, you are mistaken, he is mindkilled.</h1>
<div><span align=left>somebody</span><span align=right>timestamp</span></div>
<div><p>lorem ipsum nonsensical statement involving plankton....</p></div>

That happens at the web application layer. That's the part that generates the subforum headings, the interface widgets, the display format of the comment tree. That's the part that defines your Less Wrong experience, as a reader, commenter, or writer.
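In code, that translation step might look something like this (an illustrative Python sketch, not LW's actual implementation):

def render_post(post):
    # semantic metadata in, display markup out -- this is the client's real job
    return (
        "<h1>{subject}</h1>\n"
        "<div><span align=left>{author}</span>"
        "<span align=right>{date}</span></div>\n"
        "<div><p>{content}</p></div>"
    ).format(**post)

print(render_post({'author': 'somebody',
                   'subject': 'I am right, you are mistaken, he is mindkilled.',
                   'date': 'timestamp',
                   'content': 'lorem ipsum nonsensical statement involving plankton....'}))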

That is your client, not your web browser. If it doesn't suit your needs, if it's missing features you'd like to have, well, you probably take for granted that you're stuck with it.

But it doesn't have to be that way.

Mechanism and Policy

One of the difficulties in forming an argument about clients is that the proportion of people who have ever had a choice of clients available for any given service keeps shrinking. I have this mental image of the Average Internet User as having no real concept of this.

Then I think about email. Most people have probably used at least two different clients for email, even if it's just Gmail and their phone's built-in mail app. Or perhaps Outlook, if they're using a company system. And they (I think?) mostly take for granted that if they don't like Outlook they can use something else, or if they don't like their phone's mail app they can install a different one. They assume, correctly, that the content and function of their mail account is not tied to the client application they use to work with it.

(They may make the same assumption about web-based services, on the reasoning that if they don't like IE they can switch to Firefox, or if they don't like Firefox they can switch to Chrome. They are incorrect, because The Web Browser is Not Their Client.)

Email does a good job of separating mechanism from policy. Its format is defined in RFC 2822 and its transmission protocol is defined in RFC 5321. Neither defines any conventions for user interfaces. There are good reasons for that from a software-design standpoint, but more relevant to our discussion is that interface conventions change more rapidly than the objects they interface with. Forum features change with the times; but the concepts of a Post, an Author, or a Reply are forever.
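As an illustration of that separation, here is a sketch (using Python's standard email package; the addresses and text are placeholders) showing how any client can pull the same semantic fields out of an RFC 2822 message, no matter what program composed it:

from email import message_from_string

raw = ("From: alice@example.com\n"
       "Subject: Separating mechanism from policy\n"
       "\n"
       "The format defines Author and Subject; the presentation is up to the client.\n")

msg = message_from_string(raw)
print(msg['From'], '/', msg['Subject'])  # the semantic fields survive any choice of client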

The benefit of this separation: If someone sends you mail from Outlook, you don't need to use Outlook to read it. You can use something else -- something that may look and behave entirely differently, in a manner more to your liking.

The comparison: If there is a discussion on Less Wrong, you do need to use the Less Wrong UI to read it. The same goes for, say, Facebook.

I object to this.

Standards as Schelling Points

One could argue that the lack of choice is for lack of interest. Less Wrong, and Reddit on which it is based, has an API. One could write a native client. Reddit does have them.

Let's take a tangent and talk about Reddit. Seems like they might have done something right. They have (I think?) the largest contiguous discussion community on the net today. And they have a published API for talking to it. It's even in use.

The problem with this method is that Reddit's API applies only to Reddit. I say problem, singular, but it's really problem, plural, because it hits users and developers in different ways.

On the user end, it means you can't have a unified user interface across different web forums; other forum servers have entirely different APIs, or none at all.2 It also makes life difficult when you want to move from one forum to another.

On the developer end, something very ugly happens when a content provider defines its own provision mechanism. Yes, you can write a competing client. But your client exists only at the provider's sufferance, subject to their decision not to make incompatible API changes or just pull the plug on you and your users outright. That isn't paranoia; in at least one case, it actually happened. Using an agreed-upon standard limits this sort of misbehavior, although it can still happen in other ways.

NNTP is a standard for discussion, like SMTP is for email. It is defined in RFC 3977 and its data format is defined in RFC 5536. The point of a standard is to ensure lasting interoperability; because it is a standard, it serves as a deliberately-constructed Schelling point, a place where unrelated developers can converge without further coordination.

Expertise is a Bottleneck

If you're trying to build a high-quality community, you want a closed system. Well-kept gardens die by pacifism, and it's impossible to fully moderate an open system. But if you're building a communication infrastructure, you want an open system.

In the early Usenet days, this was exactly what existed; NNTP was standardized and open, but Usenet was a de-facto closed community, accessible mostly to academics. Then AOL hooked its customers into the system. The closed community became open, and the Eternal September began.3 I suspect, but can't prove, that this was a partial cause of the flight of discussion from Usenet to closed web forums.

I don't think that was the appropriate response. I think the appropriate response was private NNTP networks or even single servers, not connected to Usenet at large.

Modern web forums throw the open-infrastructure baby out with the open-community bathwater. The result, in our specific case, is that if we want something not provided by the default Less Wrong interface, it must be implemented by Less Wrongers.

I don't think UI implementation is our comparative advantage. In fact I know it isn't, or the Less Wrong UI wouldn't suck so hard. We're pretty big by web-forum standards, but we still contain only a tiny fraction of the Internet's technical expertise.

The situation is even worse among the diaspora; for example, at SSC, if Scott's readers want something new out of the interface, it must be implemented either by Scott himself or his agents. That doesn't scale.

One of the major benefits of a standardized, open infrastructure is that your developer base is no longer limited to a single community. Any software written by any member of any community backed by the same communication standard is yours for the using. Additionally, the developers are competing for the attention of readers, not admins; you can expect the reader-facing feature set to improve accordingly. If readers want different UI functionality, the community admins don't need to be involved at all.

A Real Web Client

When I wrote the intro to this sequence, the most common thing people insisted on was this: Any system that actually gets used must allow links from the web, and those links must reach a web page.

I completely, if grudgingly, agree. No matter how insightful a post is, if people can't link to it, it will not spread. No matter how interesting a post is, if Google doesn't index it, it doesn't exist.

One way to achieve a common interface to an otherwise-nonstandard forum is to write a gateway program, something that answers NNTP requests and does magic to translate them to whatever the forum understands. This can work and is better than nothing, but I don't like it -- I'll explain why in another post.

Assuming I can suppress my gag reflex for the next few moments, allow me to propose: a web client.

(No, I don't mean write a new browser. The Browser Is Not Your Client.4)

Real NNTP clients use the OS's widget set to build their UI and talk to the discussion board using NNTP. There is no fundamental reason the same cannot be done using the browser's widget set. Google did it. Before them, Deja News did it. Both of them suck, but they suck on the UI level. They are still proof that the concept can work.

I imagine an NNTP-backed site where casual visitors never need to know that's what they're dealing with. They see something very similar to a web forum or a blog, but whatever software today talks to a database on the back end, instead talks to NNTP, which is the canonical source of posts and post metadata. For example, it gets the results of a link to http://lesswrong.com/posts/message_id.html by sending ARTICLE message_id to its upstream NNTP server (which may be hosted on the same system), just as a native client would.
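As a sketch of that lookup (using Python's standard nntplib module, since removed in Python 3.13; the hostname is a placeholder):

from nntplib import NNTP

def fetch_post(message_id):
    # what the web client does when someone follows /posts/message_id.html:
    # issue "ARTICLE <message-id>" against the upstream NNTP server
    with NNTP('news.example.com') as server:
        resp, info = server.article('<%s>' % message_id)
    return b'\r\n'.join(info.lines)  # raw article, ready to be rendered as HTML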

To the drive-by reader, nothing has changed. Except, maybe, one thing. When a regular reader, someone who's been around long enough to care about such things, says "Hey, I want feature X," and our hypothetical web client doesn't have it, I can now answer:

Someone wrote something that does that twenty years ago.

Here is how to get it.



  1. Meta-meta: This post took about eight hours to research and write, plus two weeks procrastinating. If anyone wants to discuss it in realtime, you can find me on #lesswrong or, if you insist, the LW Slack.

  2. The possibility of "universal clients" that understand multiple APIs is an interesting case, as with Pidgin for IM services. I might talk about those later.

  3. Ironically, despite my nostalgia for Usenet, I was a part of said September; or at least its aftermath.

  4. Okay, that was a little shoehorned in. The important thing is this: What I tell you three times is true.
