Comment author: ozziegooen 10 October 2016 10:38:57PM *  1 point [-]
Comment author: tog 19 August 2015 03:15:15PM 0 points [-]

I like the name it sounds like you may be moving to - "guesstimate".

Comment author: ozziegooen 19 August 2015 05:33:16PM *  0 points [-]

Thanks!

Guesstimates as a thing aren't very specific; what I am proposing is at least a lot more involved than what has typically been considered a guesstimate. That said, very few people seem familiar with the old word, so it seems like it could be extended easily.

Comment author: jessicat 04 June 2015 06:01:12AM 3 points [-]

Thanks for the detailed response! I do think the framework can still work with my assumptions. The way I would model it would be something like:

  1. In the first stage, we have G->Fremaining (the research to an AGI->FAI solution) and Gremaining (the research to enough AGI for UFAI). I expect G->Fremaining < Gremaining, and a relatively low leakage ratio.
  2. After we have AGI->FAI, we have Fremaining (the research for the AGI to input to the AGI->FAI) and Gremaining (the research to enough AGI for UFAI). I expect Fremaining > Gremaining, and furthermore I expect the leakage ratio to be high enough that we are practically guaranteed to have enough AGI capabilities for UFAI before FAI (though I don't know how long before). Hence the strategic importance of developing AGI capabilities in secret, and not having them lying around for too long in too many hands. I don't really see a way of avoiding this: the alternative is to have enough research to create FAI but not a paperclip maximizer, which seems implausible (though it would be really nice if we could get this state!).

Also, it seems I had misinterpreted the part about rg and rf, sorry about that!

Comment author: ozziegooen 04 June 2015 06:36:10AM 2 points [-]

Good point.

I guess the most controversial, and hopefully false, assumption of this paper is #3: 'If Gremaining is reached before Fremaining, a UFAI will be created. If after, an FAI will be created.'

This basically is the AI Foom scenario, where the moment an AGI is created, it will either kill us all or bring about utopia (or both).

If this is not the case, and we have a long time to work with the AGI as it develops to make sure it is friendly, then this model isn't very useful.

If we do accept these assumptions, I would also expect that we will reach Gremaining before Fremaining, or at least that a private organization will end up doing so. However, I am also very skeptical of the power of secrets. I think I find us reaching Fremaining first more likely than a private institution reaching Gremaining first, but hiding it until it later reaches Fremaining, though both may be very slim. If the US military or a similar group with a huge technological and secretive advantage were doing this, there could be more of a chance. This definitely seems like a game of optimizing small probabilities.

Either way, I think we definitely would agree here that the organization developing these secrets can strategically choose projects that deliver high amounts of FAI research relative to the amount of AGI research they will have to keep secret. Begin with the easy, non-secretive wins and work from there.

We may need the specific technology to create a paperclip maximizer before we make an FAI, but if we plan correctly, we hopefully will be really close to reaching an FAI by that point.

Comment author: jessicat 03 June 2015 10:02:59PM *  13 points [-]

This model seems quite a bit different from mine, which is that FAI research is about reducing FAI to an AGI problem, and solving AGI takes more work than doing this reduction.

More concretely, consider a proposal such as Paul's reflective automated philosophy method, which might be able to be implemented using episodic reinforcement learning. This proposal has problems, and it's not clear that it works -- but if it did, then it would have reduced FAI to a reinforcement learning problem. Presumably, any implementations of this proposal would benefit from any reinforcement learning advances in the AGI field.

Of course, even if a proposal like this works, it might require better or different AGI capabilities from UFAI projects. I expect this to be true for black-box FAI solutions such as Paul's. This presents additional strategic difficulties. However, I think the post fails to accurately model these difficulties. The right answer here is to get AGI researchers to develop (and not publish anything about) enough AGI capabilities for FAI without running a UFAI in the meantime, even though the capabilities to run it exist.

Assuming that this reflective automated philosophy system doesn't work, it could still be the case that there is a different reduction from FAI to AGI that can be created through armchair technical philosophy. This is often what MIRI's "unbounded solutions" research is about: finding ways you could solve FAI if you had a hypercomputer. Once you find a solution like this, it might be possible to define it in terms of AGI capabilities instead of hypercomputation, and at that point FAI would be reduced to an AGI problem. We haven't put enough work into this problem to know that a reduction couldn't be created in, say, 20 years by 20 highly competent mathematician-philosophers.

In the most pessimistic case (which I don't think is too likely), the task of reducing FAI to an AGI problem is significantly harder than creating AGI. In this case, the model in the post seems to be mostly accurate, except that it neglects the fact that serial advances might be important (so we get diminishing marginal progress towards FAI or AGI per additional researcher in a given year).

Comment author: ozziegooen 04 June 2015 05:19:54AM *  3 points [-]

[Edited: replaced Gremaining with Fremaining, which is what I originally meant]

Thanks for the comment jessicat! I haven't read those posts yet, will do more research on reducing FAI to an AGI problem.

A few responses & clarifications:

Our framework assumes the FAI research would happen before AGI creation. If we can research how to reduce FAI to an AGI problem in a way that would reliably make a future AGI friendly, then that amount of research would be our variable Fremaining. If that is quite easy to do, then that's fantastic; an AI venture would have an easy time, and the leakage ratio would be low enough to not have to worry about. Additional required capabilities that we'll find out we need would be added to Fremaining.

"I think the post fails to accurately model these difficulties." -> This post doesn't attempt to model the individual challenges to understand how large Fremaining actually is. That's probably a more important question than what we addressed, but one for a different model.

"The right answer here is to get AGI researchers to develop (and not publish anything about) enough AGI capabilities for FAI without running a UFAI in the meantime, even though the capabilities to run it exist." -> This paper definitely advocates for AGI researchers to develop FAI research while not publishing much AGI research. I agree that some internal AGI research will probably be necessary, but hope that it won't be a whole lot. If the tools to create an AGI were figured out, even if they were kept secret by an FAI research group, I would be very scared. Those would be the most important and dangerous secrets of all time, and I doubt they could be kept secret for very long (20 years max?)

"In this case, the model in the post seems to be mostly accurate, except that it neglects the fact that serial advances might be important (so we get diminishing marginal progress towards FAI or AGI per additional researcher in a given year)."

-> This paper purposefully didn't model research effort, but rather, abstract units of research significance. "the numbers of rg and rf don't perfectly correlate with the difficulty to reach them. It may be that we have diminishing marginal returns with our current levels of rg, so similar levels of rf will be easier to reach."

A model that would also take into account the effort required would require a few more assumptions and additional complexity. I prefer to start simple and work from there, so we at least know what people do agree on before adding additional complexity.

Comment author: 27chaos 03 June 2015 07:57:18PM 7 points [-]

This seems like a mathematical write up of a very simple idea. I dislike papers such as this. The theory itself could have been described in one sentence, and nothing other than the theory itself is presented here. No evidence of the theory's empirical value, no discussion of what the actual leakage ratio is or what barriers to Friendliness remain. A lot of math used as mere ornamentation.

Comment author: ozziegooen 03 June 2015 08:33:11PM 2 points [-]

The theory will be a lot more useful once actual leakage ratios are estimated. This paper was mathematically specific, because the purpose of it was to establish a few equations to use when estimating the Friendliness ratio and constraints to AI projects. It was written more to build a mathematical foundation for that than it was a simple intro of the ideas to most readers.

Obviously this was meant as more of a research article than a blog post, but we felt like LessWrong was a good place to publish it given the subject.

Comment author: [deleted] 08 January 2015 01:31:27AM 1 point [-]

I vouch for Ozzie Estimate.

I take shminux's point to be primarily one of ease, or maybe portability. The need to understand sensitivity in heuristical estimation is a real one, and I also believe that your tools here may be the right approach for a different level of scale than was originally conceived by Fermi. It might be worth clarifying the kinds of decisions that require the level of analysis involved with your method to prevent confusion.

Have you seen the work of Sanjoy Mahajan? Street-Fighting Mathematics, or The Art of Insight in Science and Engineering?

In response to comment by [deleted] on Graphical Assumption Modeling
Comment author: ozziegooen 08 January 2015 05:00:01AM 1 point [-]

I actually watched his TED talk last night. Will look more into his stuff.

The main issues I'm facing are understanding the math behind combining estimates and actually making the program right now. However, he definitely seems to be one of the top world experts on actually making these kinds of models.

Comment author: Kenny 07 January 2015 07:08:21PM *  3 points [-]

And even cooler if (web) discussions of models included embedded diagrams like what you've produced.

Comment author: ozziegooen 08 January 2015 04:58:44AM 0 points [-]

Good point

Comment author: ozziegooen 06 January 2015 06:57:40AM 0 points [-]

Quick comment: I still have a lot of questions about the problem of combining estimated probability distributions. If any of you know of good research on how to combine large group estimates / probability distributions, I would be very interested. I realize that the field of 'decision research' and similar is quite significant, but the specific math for combining probabilistic estimates is something I'm having a hard time finding literature on. (Much of this may be because a lot of it is behind academic paywalls.)

Comment author: shminux 03 January 2015 08:56:20PM 1 point [-]

Re your combined and larger models:

If your Fermi estimate does not fit on the back of an envelope, it's no longer a Fermi estimate.

Comment author: ozziegooen 03 January 2015 09:10:54PM 2 points [-]

Perhaps 'Fermi estimate' was not the best term to use, but I couldn't think of an equally understandable but better one. It could be called simply 'estimate', but I think the important thing here is that it's used very similarly to how a Fermi estimate would be (with very high uncertainty of the inputs, and done in a very simple manner). What would you call it? (http://lesswrong.com/lw/h5e/fermi_estimates/).

Graphical Assumption Modeling

13 ozziegooen 03 January 2015 08:22PM

The Flaws of Fermi Estimates

Why don’t we use more Fermi estimates?[1] Many of us want to become more rational, and we have plenty of numbers to think about and important variables to consider. There are a few reasons.

Fermi calculations get really messy. Once a few variables are introduced, a problem can quickly become difficult to imagine and outline. Many people, especially those who are not used to writing academic papers, do not practice the skills of formalizing inputs and outputs. It can be tedious even for those who do.

Fermi models typically do not include estimates of certainty. Certainty propagates. It creates bottlenecks. As a Fermi model grows, specific uncertain assumptions could undermine the result. Certainty estimates are typically not measured, and when they are, they require formalization and significant calculation.

Fermi calculations are not fun to share. Most of them are pretty simple; they just involve multiplication and addition and 3–5 variables. However, in order to write them, one must formalize them as a few lines of math or as a few long paragraphs which really should be math.

We propose the use of simple graphical models in order to represent estimates and Fermi models. We think these have the capacity to solve the issues mentioned above and make complex estimations more simple, more sharable, and more calculable. A formal and rigorous graphical model could not only improve on existing Fermi calculations, but it could also extend them to functions they have not yet been used for.

Multiplication

Let’s say we are trying to estimate the number of smiles per day in a park. A first attempt at this may be to guess the number of people in the park and to estimate the number of smiles on average per person in the park.

This is easy to calculate directly. 100 People x 10 smiles/(day * person) = 1000 smiles/day.

As a model, we can represent the variables as lines and the function as a box in between them. This fits nicely with similar diagramming standards. The function of multiplication acts as an object with inputs and outputs.

multiplication_1.png

Independent variables, or user selected variables, are shown in black, and dependent variables are shown in blue.

We can condense this diagram by moving the number of smiles per day per person into the multiplication block.

multiplication_2.png

Say we wanted to find the total smiles per year in the park. We can simply extend the model as follows.

multiplication_3.png
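As a rough sketch of how the multiplication diagrams above could be written down in code, each block can be treated as a function with named inputs and one output. The `multiply` helper and the 365-day year are assumptions made for illustration; only the 100 people and 10 smiles per person per day come from the example.

```python
# Each block is a plain function: inputs go in, one output comes out.
def multiply(*inputs: float) -> float:
    """Multiplication block: returns the product of all inputs."""
    result = 1.0
    for value in inputs:
        result *= value
    return result

people_in_park = 100            # people (from the example above)
smiles_per_person_per_day = 10  # smiles / (day * person) (from the example above)
days_per_year = 365             # assumption: the park is visited every day

smiles_per_day = multiply(people_in_park, smiles_per_person_per_day)
smiles_per_year = multiply(smiles_per_day, days_per_year)

print(smiles_per_day)   # 1000.0
print(smiles_per_year)  # 365000.0
```

Chaining the output of one block into the next mirrors how the diagram is extended from smiles per day to smiles per year.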

Addition

Perhaps we think that kids and adults have different rates of smiling and would like to separate our model accordingly. We estimate the number of kids in the park, the number of adults in the park, and their corresponding smiling estimates. Then we add them with a similar block as we used for multiplication.

addition_1.png
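The kids/adults split might be sketched like this. All four input numbers are hypothetical placeholders (the values in the diagram are not reproduced here), and `add` and `multiply` are illustrative helper names.

```python
def multiply(a: float, b: float) -> float:
    """Multiplication block."""
    return a * b

def add(*inputs: float) -> float:
    """Addition block: sums any number of inputs with the same unit."""
    return sum(inputs)

# Hypothetical inputs, chosen only to illustrate the structure.
kids_in_park = 40             # people
adults_in_park = 60           # people
smiles_per_kid_per_day = 15   # smiles / (day * person)
smiles_per_adult_per_day = 5  # smiles / (day * person)

kid_smiles = multiply(kids_in_park, smiles_per_kid_per_day)
adult_smiles = multiply(adults_in_park, smiles_per_adult_per_day)
total_smiles_per_day = add(kid_smiles, adult_smiles)

print(total_smiles_per_day)  # 900
```

The addition block only makes sense when both inputs share a unit, here smiles per day.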

Uncertainty

If we have uncertainty estimates we can make them explicit. Estimates of certainty typically get left out of Fermi calculations, but become essential when making large models.

addition_1.png

It is not clear what the best way is to annotate an uncertainty interval. In this case, the intervals described are meant as 90% Gaussian confidence intervals, but these could vary. They do not have to be Gaussian-like intervals, but could be complex probability distributions. These may require graphical representations and additional software. However, for many estimations, even simple models of uncertainty would be advantageous.
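One way to make a 90% Gaussian interval concrete is to convert it into a mean and standard deviation. The sketch below assumes a symmetric interval around the mean and uses Python's standard `statistics.NormalDist`; the function name `gaussian_from_interval` and the example intervals are hypothetical.

```python
from statistics import NormalDist

def gaussian_from_interval(low: float, high: float, confidence: float = 0.90) -> NormalDist:
    """Return the Gaussian whose central `confidence` interval is [low, high]."""
    # z is the number of standard deviations covering half the interval
    # (about 1.645 for a 90% interval).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    mean = (low + high) / 2
    sigma = (high - low) / (2 * z)
    return NormalDist(mean, sigma)

# Hypothetical intervals for the park example.
kids = gaussian_from_interval(30, 50)    # 90% sure there are 30 to 50 kids
adults = gaussian_from_interval(50, 70)  # 90% sure there are 50 to 70 adults

# Sums of independent Gaussians are Gaussian; NormalDist supports this directly.
people = kids + adults
print(people.mean)             # 100.0
print(round(people.stdev, 2))  # spread of the total head count
```

Products of Gaussians are not Gaussian, so a multiplication block would need either an approximation or sampling.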

Estimate Combination

If two people give two estimates for a number, they could be combined to find the resulting probability distribution.

combination_estimates.png

Uncertainty distributions are valuable for this. If two agents both state their uncertainty distributions, we can find a weighted average of their estimations with a calculated resulting uncertainty distribution.
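A common textbook way to do this, assuming both estimates are independent Gaussians, is inverse-variance weighting. This is only one candidate method and not necessarily the right one for this framework; the function name and the two example estimates are hypothetical.

```python
from math import sqrt

def combine_gaussian_estimates(estimates: list[tuple[float, float]]) -> tuple[float, float]:
    """Combine independent (mean, sigma) estimates by inverse-variance weighting."""
    # More certain estimates (smaller sigma) receive larger weights.
    weights = [1 / sigma ** 2 for _, sigma in estimates]
    total_weight = sum(weights)
    mean = sum(w * m for w, (m, _) in zip(weights, estimates)) / total_weight
    sigma = sqrt(1 / total_weight)
    return mean, sigma

# Hypothetical estimates of smiles per day from two people.
first_estimate = (1000.0, 200.0)   # (mean, standard deviation)
second_estimate = (1400.0, 400.0)

combined_mean, combined_sigma = combine_gaussian_estimates([first_estimate, second_estimate])
print(round(combined_mean, 1))   # 1080.0
print(round(combined_sigma, 1))  # 178.9
```

The combined estimate sits closer to the more certain input, and its standard deviation is smaller than either input's. That narrowing depends on the independence assumption; correlated estimators would justify less of it.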

Model Combination

We can combine models by combining their resulting estimates. So far we have shown two unique attempts at modeling the number of smiles in a park. They produced the same unit output, so they can be combined.

combination_models.png

Both of them still have predictive power, and a combination could produce a more accurate estimate than either alone. The model with greater certainty, in this case the adult/child split model, will have more influence on the final calculation, but the result will still be moderated by the other model. Combining many properly calibrated models will always give a more accurate result.

Abstraction

Large sections can be combined into black boxes.[2] Black boxes can be used to summarize large models into simple objects with specified inputs and outputs. This means that one can work on a very large total model in small pieces and have it be manageable.

black boxing.png

Decision Making

Say we must decide between two options. One common way to do so is to estimate a value for each, and choose the one with a higher (or lower) value.

decision making.png

In this case we make a decision of which lemonade will sell better. We use a decision ‘block’, which could hold any arbitrary decision function. In this case, it simply outputs the value of the highest input value.

This can be useful if one can assume that the best of the alternatives will be used. In a larger model, there may be many decisions determined by the model. The outputs of these decisions could be used for later estimations or decisions.
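A decision block of this kind could be sketched as a function that takes named options and returns the best one. The option names and dollar values below are hypothetical, and `decide` is an illustrative name.

```python
def decide(options: dict[str, float]) -> tuple[str, float]:
    """Decision block: return the name and value of the highest-valued option."""
    best = max(options, key=options.get)
    return best, options[best]

# Hypothetical expected sales, in dollars per hour.
lemonade_options = {
    "Regular lemonade": 12.0,
    "Pink lemonade": 15.0,
}

choice, value = decide(lemonade_options)
print(choice, value)  # Pink lemonade 15.0

# The output can feed later blocks, e.g. a shared profit margin (assumed 50%).
profit_per_hour = value * 0.5
```

Any other decision function (lowest cost, or a rule that accounts for risk) could be swapped in for `max` without changing the rest of the model.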

Larger Models

These techniques can be combined to produce large and intricate models. As these increase in size they can become more valuable.

complex.png

In the model above, a person is attempting to find the best use of their time to produce money. There are several options to sell lemonade, and there’s also the opportunity to work overtime. The estimator makes an estimate for each and uses the model to understand them in relation to each other.

This larger model demonstrates the option of configuration in these models. The profit percentage of lemonade sales was expected to be similar for different kinds of lemonade in different locations. It could have instead been multiplied individually for each one, but it was simpler to move it after the decision block between them.

In this case it may have been reasonable to use a table instead of a graphical model. However, a table would not necessarily demonstrate the unique constraints and considerations of each type of input. For instance, lemonade sales had a margin of profit, and overtime work had a different net income number. In tables many of the important calculations are often difficult to read at the same time as the data. We believe this form of modeling helps make the numbers understandable as well as the assumptions and certainties that go into those numbers.

Possible Automated Analysis

Once we arrive at the model above, we would have enough information to calculate the value of information (VOI) of additional certainty for each metric. For instance, a reduction of uncertainty of the variable ‘Regular Lemonade at Dolores Park’ to 0 could produce an expected few dollars per hour, assuming that resulting decisions would be made using the model.
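To illustrate the idea, the sketch below estimates the value of learning one variable exactly, assuming Gaussian estimates and a decision rule that always picks the option with the highest expected value. The means, standard deviations, and the 'Overtime work' option name are hypothetical, not taken from the diagram.

```python
import random
from statistics import mean

def value_of_resolving(options: dict[str, tuple[float, float]], target: str,
                       samples: int = 100_000, seed: int = 0) -> float:
    """Expected gain from learning `target` exactly before choosing an option.

    `options` maps a name to (mean, sigma), in dollars per hour.
    """
    rng = random.Random(seed)
    means = {name: mu for name, (mu, _) in options.items()}
    # Without new information we simply pick the best expected value.
    baseline = max(means.values())
    best_other = max(mu for name, mu in means.items() if name != target)
    mu, sigma = options[target]
    # With the target resolved, we pick it only when it beats the best alternative.
    informed = mean(max(rng.gauss(mu, sigma), best_other) for _ in range(samples))
    return informed - baseline

# Hypothetical estimates in dollars per hour: (mean, standard deviation).
options = {
    "Regular Lemonade at Dolores Park": (12.0, 6.0),
    "Overtime work": (14.0, 1.0),
}
voi = value_of_resolving(options, "Regular Lemonade at Dolores Park")
print(round(voi, 2))  # roughly 1.5 dollars per hour with these made-up inputs
```

The value is positive even though the lemonade option looks worse on average, because there is a real chance that resolving the uncertainty would change the decision.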

The value of new options could also be calculated easily if one could come up with a probability distribution of their expected earnings per hour.

While these kinds of analyses are well established in academia, they are currently difficult to use. If estimations could be simply mapped, it may make them significantly more accessible.

Similar Work

This work can be seen as similar to Unified Modeling Language (UML) in that it attempts to graphically specify a complex system of knowledge. UML was an attempt to define a graphical language for software architecture. There were claims that programs that produced UML could be used to produce their corresponding programs. This hasn’t really happened. The UML spec went through several versions and became so specific and complex that few programmers now bother with it. However, it did encourage the use of whiteboard modeling among programmers and still enjoys some popularity on larger projects.

Graphical computer software is challenging. Most attempts have failed, but a few companies have had success with it. LabView is a popular visual programming tool used by scientists and engineers. It uses a Dataflow programming paradigm, which would also be appropriate for Graphical Assumption Modeling.

The theory of this work is similar to that of Probabilistic Graphical Models. These are typically more formal models aimed at computer input and output rather than direct human interaction.

Future Work

This research is very young. The diagrams could use more experimentation and exploration. We have not included a method for subtraction or division, for example. Even if they were better established, it could take a long time for them to become accepted by other communities.

It’s obvious that if these models are useful, it would be valuable to have a computer program to make them. Ozzie Gooen has made a simple attempt called Fermihub. Fermihub is functional, free, and open source. However, it applies only a few simple analytic approximations and does not incorporate Monte Carlo simulations. For accurate or large models, Monte Carlo simulations will be necessary.
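As an illustration of what a Monte Carlo version might look like (this is not Fermihub's code), the sketch below samples every input of the kids/adults smile model and reads a 90% interval off the simulated outputs. All input distributions are hypothetical and assumed to be independent Gaussians.

```python
import random
from statistics import mean, quantiles

def simulate_total_smiles(samples: int = 50_000, seed: int = 0) -> list[float]:
    """Sample the kids/adults smile model; every input is a hypothetical Gaussian."""
    rng = random.Random(seed)
    totals = []
    for _ in range(samples):
        kids = rng.gauss(40, 6)        # people
        adults = rng.gauss(60, 6)      # people
        kid_rate = rng.gauss(15, 3)    # smiles / (day * person)
        adult_rate = rng.gauss(5, 1)   # smiles / (day * person)
        # Same structure as the diagram: two multiplications, one addition.
        totals.append(kids * kid_rate + adults * adult_rate)
    return totals

totals = simulate_total_smiles()
cuts = quantiles(totals, n=20)  # 19 cut points: 5%, 10%, ..., 95%
low, high = cuts[0], cuts[-1]   # empirical 90% interval

print(round(mean(totals)))      # close to 900 smiles/day
print(round(low), round(high))
```

Sampling sidesteps the need for closed-form rules for each block, at the cost of run time and some sampling noise in the result.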

There could be more research done in this kind of estimation. While much of the math has already been solved, the art of efficiently creating large models and collaborating with others has a lot of work left. There is also some debate on the proper way to combine estimates, which is crucial for large models.


Note: I realize that the math in the models above, specifically in the combinations of estimates, is incorrect.  I'm currently investigating how to do it correctly.   

References

  1. Fermi Estimates, LukeProg. 2013
  2. See Wikipedia for a high-level understanding of black boxes. They are a fundamental unit for systems research, which in part has led to many diagrams we see today.
