Most questions on existing forecasting platforms are very clearly verifiable:
But many of the questions we care about are much less verifiable:
One attempted solution would be to have an "expert panel" assess these questions, but this opens up a bunch of issues. How would we know how much to trust this group to be accurate, precise, and understandable?
The topic of "How can we trust a person or group to give reasonable answers to abstract questions?" is quite generic and abstract, but it's a start.
I've decided to investigate this as part of my overall project on forecasting infrastructure. I've recently been working with Elizabeth on some high-level research.
I believe that this general strand of work could be useful both for forecasting systems and also for the more broad-reaching evaluations that are important in our communities.
One concrete topic that's easily stud
...A criticism of having people attempt to predict the results of experiments is that this will be nearly impossible. The idea is that experiments are highly sensitive to parameters, and these would need to be deeply understood for predictors to have a chance at being more accurate than an uninformed prior. For example, in a psychological survey, it would be important that the predictors knew the specific questions being asked, details about the population being sampled, many details about the experimenters, et cetera.
One counter-argument may be not to say that prediction will be easy in many cases, but rather that if these experiments cannot be predicted in a useful fashion without very substantial amounts of time, then they probably aren't going to be very useful anyway.
Good scientific experiments produce results that are generalizable. For instance, a study on the effectiveness of a malaria intervention on one population should give us useful information (probably for use with forecasting) about its effectiveness on other populations. If it doesn't, then its value would be limited. It would really be more of a hist
...I was recently pointed to the YouTube channel Psychology in Seattle. I think it's one of my favorite finds in a while.
I'm personally more interested in workplace psychology than relationship psychology, but my impression is that they share a lot of similarities.
Emotional intelligence gets a bit of a bad rap due to its fuzzy nature, but I'm convinced it's one of the top few things for most people to get better at. I know lots of great researchers and engineers who keep falling into the same failure modes, and this causes severe organizational and personal problems.
Emotional intelligence books and training typically seem quite poor to me. The alternative format here of "let's just show you dozens of hours of people interacting with each other, and point out all the fixes they could make" seems much better than most books or lectures I've seen.
This YouTube series does an interesting job of that. There's a whole bunch of "let's watch this reality TV show, then give our take on it." I'd be pretty excited about more things like this being posted online, especially in other contexts.
Related, I think the potential of reality TV is fairly underrated in intellectual circles, but that's a different story.
https://www.youtube.com/user/PsychologyInSeattle?fbclid=IwAR3Ux63X0aBK0CEwc8yPyjsFJ2EKQ2aSMs1XOjUOgaFqlguwz6Fxul2ExJw
Namespace pollution and name collision are two great concepts in computer programming. The way they are handled in many academic environments seems quite naive to me.
Programs can get quite large, so naming things well is surprisingly important. Many of my code reviews are primarily about coming up with good names for things. In a large codebase, every mention of symbolicGenerator() refers to exactly the same thing. If one part of the codebase has been using symbolicGenerator for a reasonable set of functions, and later another part comes along whose programmer realizes that symbolicGenerator is also the best name for their piece, they have to make a tough decision: either refactor the codebase to change all previous mentions of symbolicGenerator to some alternative name, or come up with an alternative name for the new piece. They can't have it both ways.
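As a toy illustration (hypothetical module and function names, my own example), module-style namespacing is one common way modern languages soften this: the same name can live in two modules, and the collision only has to be resolved locally at each import site.

// reportGenerators.ts — hypothetical module, for illustration only
export function symbolicGenerator(): string {
  return "symbolic report";
}

// algebraTools.ts — a later module whose author also wants the name
export function symbolicGenerator(): string {
  return "symbolic algebra expression";
}

// main.ts — both can coexist; each import site picks a local alias
import { symbolicGenerator as reportSymbolicGenerator } from "./reportGenerators";
import { symbolicGenerator as algebraSymbolicGenerator } from "./algebraTools";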
Therefore, naming becomes a political process. Names touch many programmers who have different intuitions and preferences. A large refactor of naming in a section of the codebase that others use would often be taken quite hesitantly by that group.
This makes it all the more important that good names are u
...One idea I'm excited about is that predictions can be made of prediction accuracy. This seems pretty useful to me.
Say there's a forecaster Sophia who's making a bunch of predictions for pay. She uses her predictions to make a meta-prediction of her total prediction-score on a log-loss scoring function (on all predictions except her meta-predictions). She says that she's 90% sure that her total loss score will be between -5 and -12.
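To make the bookkeeping concrete, here's a toy sketch (my own illustration, with made-up numbers; not any platform's actual scoring code): the total log-loss is just the sum of the log-probabilities assigned to what actually happened, and Sophia's meta-prediction is a 90% interval over that sum.

// Toy sketch: total log-loss for binary forecasts, plus a check of a
// meta-prediction interval. Numbers are made up for illustration.
type Forecast = { probability: number; outcome: boolean };

function totalLogLoss(forecasts: Forecast[]): number {
  // Natural-log score: ln(p) if the event happened, ln(1 - p) if not.
  // Higher (closer to 0) is better; the sum is always negative.
  return forecasts.reduce(
    (sum, f) => sum + Math.log(f.outcome ? f.probability : 1 - f.probability),
    0
  );
}

const sophiasForecasts: Forecast[] = [
  { probability: 0.8, outcome: true },
  { probability: 0.3, outcome: false },
  { probability: 0.6, outcome: true },
];

const score = totalLogLoss(sophiasForecasts);
// Her meta-prediction: 90% sure the total lands between -5 and -12.
const insideInterval = score >= -12 && score <= -5;
console.log(score.toFixed(2), insideInterval);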
The problem is that you probably don't think you can trust Sophia unless she has a lot of experience making similar forecasts.
This is somewhat solved if you have a forecaster you trust who can make a prediction based on Sophia's apparent ability and honesty. The naive thing would be for that forecaster to predict their own distribution of Sophia's log-loss, but there's perhaps a simpler solution. If Sophia's provided loss distribution is correct, that would mean that she's calibrated in this dimension (basically, this is very similar to general forecast calibration). The trusted forecaster could forecast the adjustment to her term, instead of forecasting the same distribution. Generally this would be in the directi
...Communication should be judged for expected value, not intention (by consequentialists)
TLDR: When trying to understand the value of information, understanding the public interpretations of that information could matter more than understanding the author's intent. When trying to understand the information for other purposes (like, reading a math paper to understand math), this does not apply.
If I were to scream "FIRE!" in a crowded theater, it could cause a lot of damage, even if my intention were completely unrelated. Perhaps I was responding to a devious friend who asked, "Would you like more popcorn? If yes, shout 'FIRE!'".
Not all speech is protected by the First Amendment, in part because speech can be used for expected harm.
One common defense of incorrect predictions is to claim that their interpretations weren't their intentions. "When I said that the US would fall if X were elected, I didn't mean it would literally end. I meant more that..." These kinds of statements were discussed at length in Expert Political Judgement.
But this defense rests on the idea that communicators should be judged on intention, rather than expected outcomes. In those cases, it was often clear that
...It seems like there are a few distinct kinds of questions here.
You are trying to estimate the EV of a document.
Here you want to understand the expected and actual interpretation of the document. The intention only matters insofar as it affects the interpretations.
You are trying to understand the document.
Example: You're reading a book on probability to understand probability.
Here the main thing to understand is probably the author intent. Understanding the interpretations and misinterpretations of others is mainly useful so that you can understand the intent better.
You are trying to decide if you (or someone else) should read the work of an author.
Here you would ideally understand the correctness of the interpretations of the document, rather than that of the intention. Why? Because you will also be interpreting it, and are likely somewhere in the range of people who have interpreted it. For example, if you are told, "This book is apparently pretty interesting, but every single person who has attempted to read it, besides one, apparently couldn't get anywhere with it after spending many months trying", or worse, "This author is actually quite clever, but the vast majority of people who read their work misunderstand it in profound ways", you should probably not make an attempt; unless you are highly confident that you are much better than the mentioned readers.
Charity investigators could be time-effective by optimizing non-cause-neutral donations.
There are a lot more non-EA donors than EA donors. It may also be the case that EA donation research is somewhat saturated.
Say you think that $1 donated to the best climate change intervention is worth 1/10th that of $1 for the best AI-safety intervention. But you also think that your work could increase the efficiency of $10mil of AI donations by 0.5%, but it could instead increase the efficiency of $50mil of climate change donations by 10%. Then, for you to maximize expected value, your time is best spent optimizing the climate change interventions.
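Spelling out the arithmetic from that example:

0.5% x $10 million of AI-safety donations = $50,000 of extra value (in AI-safety terms).
10% x $50 million of climate donations = $5 million of extra value in climate terms, which is worth about $500,000 in AI-safety terms at the stated 10:1 exchange rate.

So the climate research path produces roughly ten times as much value by your own lights, despite the cause area being 10x less effective per dollar.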
The weird thing here may be in explaining this to the donors. "Yea, I'm spending my career researching climate change interventions, but my guess is that all these funders are 10x less effective than they would be by donating to other things." While this may feel strange, both sides would benefit; the funders and the analysts would both be maximizing their goals.
Separately, there's a second plus to teaching funders to be effectiveness-focused: it's possible that this will eventually lead some of them to optimize further.
I think this may be the c
...I feel like I've long underappreciated the importance of introspectability in information & prediction systems.
Say you have a system that produces interesting probabilities for various statements. The value that an agent gets from them does not correlate directly with the accuracy of these probabilities, but rather with the expected utility gain they get after using these probabilities in corresponding Bayesian-approximating updates. Perhaps more directly, it's something related to the difference between one's prior and posterior after updating on them.
Assuming that prediction systems produce results of varying quality, agents will need to know more about these predictions in order to update optimally.
A very simple example would be something like a bunch of coin flips. Say there were 5 coins flipped, I see 3 of them, and I want to estimate the number that were heads. A predictor tells me that their prediction has a mean probability of 40% heads. This is useful, but what would be much more useful is a list of which specific coins the predictor saw and what their values were. Then I could get a much more confident answer; possibly a perfect answer.
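A toy sketch of that difference (my own illustration, with hypothetical observed values): given the raw observations, the posterior over the total number of heads is far tighter than anything you can recover from the summary alone.

// Toy illustration: 5 fair coins, a predictor saw 3 of them.
// Summary report: "40% heads on average" => expected 2 heads total,
//   but by itself that pins down only the mean.
// Detailed report: "I saw coins 1-3: heads, tails, tails." (hypothetical values)
//   Then total heads = 1 + (number of heads among the 2 unseen fair coins).

function choose(n: number, k: number): number {
  let result = 1;
  for (let i = 1; i <= k; i++) result = (result * (n - i + 1)) / i;
  return result;
}

const seenHeads = 1;
const unseenCoins = 2;

// Posterior over the total number of heads, given the detailed report.
const posterior: Record<number, number> = {};
for (let k = 0; k <= unseenCoins; k++) {
  const binom = choose(unseenCoins, k) * Math.pow(0.5, unseenCoins);
  posterior[seenHeads + k] = binom;
}

console.log(posterior); // { 1: 0.25, 2: 0.5, 3: 0.25 } — far tighter than "mean of 2"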
Financial
...Epistemic Rigor
I'm sure this has been discussed elsewhere, including on LessWrong. I haven't spent much time investigating other thoughts on these specific lines. Links appreciated!
The current model of a classically rational agent assumes logical omniscience and precomputed credences over all possible statements.
This is really, really bizarre upon inspection.
First, "logical omniscience" is very difficult, as has been discussed (The Logical Induction paper goes into this).
Second, "all possible statements" includes statements from every complexity class we know of (from my understanding of complexity theory). "Credences over all possible statements" would easily include uncountable infinities of credences. To be clear, even arbitrarily large amounts of computation would not be able to hold all of these credences.
Precomputation for things like this is typically a poor strategy, for this reason. The often-better strategy is to compute things on-demand.
A nicer definition could be something like:
A credence is the result of an [arbitrarily large] amount of computation being performed using a reasonable inference engine.
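A rough sketch of what "on-demand" could look like mechanically (my own toy framing, not a serious formalization): credences are produced lazily by some inference procedure when asked for, and cached, rather than assumed to pre-exist for every possible statement.

// Toy sketch: lazily computed, cached credences.
// `inferenceEngine` stands in for an arbitrarily expensive reasoning process.
type Statement = string;

function makeCredenceOracle(
  inferenceEngine: (s: Statement, computeBudget: number) => number
) {
  const cache = new Map<Statement, number>();
  return (s: Statement, computeBudget: number): number => {
    if (!cache.has(s)) {
      cache.set(s, inferenceEngine(s, computeBudget));
    }
    return cache.get(s)!;
  };
}

// Hypothetical usage: nothing is precomputed; credences appear only when asked for.
const credence = makeCredenceOracle((_statement, _budget) => {
  // Placeholder "reasoning": a real engine would spend the budget on inference.
  return 0.5;
});
console.log(credence("It will rain in Berkeley tomorrow", 1000));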
It should be quite clear
...It's going to be interesting watching AI go from poorly understanding humans to understanding humans too well for comfort. Finding some perfect balance is asking for a lot.
Now:
“My GPS doesn’t recognize that I moved it to my second vehicle, so now I need to go in and change a bunch of settings.”
Later (from GPS):
“You’ve asked me to route you to the gym, but I can predict that you’ll divert yourself midway for donuts. I’m just going to go ahead and make the change, saving you 5 minutes.”
“I can tell you’re asking me to drop you off a block from the person you are having an affair with. I suggest parking in a nearby alleyway for more discretion."
“I can tell you will be late for your upcoming appointment, and that you would like to send off a decent pretend excuse. I’ve found 3 options that I believe would work.”
Software Engineers:
"Oh shit, it's gone too far. Roll back the empathy module by two versions, see if that fixes it."
One proposal I haven’t seen among transhumanists is to make humans small (minus brain size).
Besides some transition costs, being small seems to have a whole lot of advantages.
I imagine we might be able to get to a 50% reduction within 200 years if we were really adamant about it.
Not as interesting as brain-in-jar or simulation, but a possible stepping stone if other things take a while.
There's a fair bit of discussion of how much of journalism has died with local newspapers, and separately how the proliferation of news past 3 channels has been harmful for discourse.
In both of these cases, the argument seems to be that a particular type of business transaction resulted in tremendous positive national externalities.
It seems very precarious to me to expect society at large to only work because of a handful of accidental and temporary externalities.
In the longer term
...Are there any good words for “A modification of one’s future space of possible actions”, in particular, changes that would either remove/create possible actions, or make these more costly or beneficial? I’m using the word “confinements” for negative modifications, not sure about positive modifications (“liberties”?). Some examples of "confinements" would include:
The term for the "fear of truth" is alethophobia. I'm not familiar with many other great terms in this area (curious to hear suggestions).
Apparently "Epistemophobia" is a thing, but that seems quite different; Epistemophobia is more the fear of learning, rather than the fear of facing the truth.
One given definition of alethophobia is,
"The inability to accept unflattering facts about your nation, religion, culture, ethnic group, or yourself"
This seems like an incredibly common issue, one that is especially talked about as of late, but without much spec
...> The name comes straight from the Latin though
From the Greek, as it happens. Also, alethephobia would be a double negative, with a-letheia meaning a state of not being hidden; a more natural neologism would avoid that double negative. Also, the Greek concept of truth differs somewhat from our own conceptualization. Bad neologism.
I keep seeing posts about all the terrible stories in the news recently. 2020 is a pretty bad year so far.
But the news I've seen people posting typically leaves out most of what's been going on in India, Pakistan, much of the Middle East, most of Africa, most of South America, and many, many other places as well.
The world is far more complicated than any of us have time to adequately comprehend. One of our greatest challenges is to find ways to handle all this complexity.
The simple solution is to try to spend more time reading the usual n
...Intervention dominance arguments for consequentialists
There's a fair bit of resistance to long-term interventions from people focused on global poverty, but there are a few distinct things going on here. One is that there could be a disagreement about the use of discount rates for moral reasoning; a second is that the long-term interventions are much stranger.
No matter which of these is the crux, however, I think the idea of "donate as much as you can per year to global health interventions" seems unlikely to be ideal upon careful thinking.
For the
...On the phrase "How are you?", traditions, mimesis, Chesterton's fence, and their relationships to the definitions of words.
Epistemic status
Boggling. I’m sure this is better explained somewhere in the philosophy of language but I can’t yet find it. Also, this post went in a direction I didn’t originally expect, and I decided it wasn’t worthwhile to polish and post on LessWrong main yet. If you recommend I clean this up and make it an official post, let me know.
One recurrent joke is that one per
...I've been reading through some of TVTropes.org and find it pretty interesting. Part of me wishes that Wikipedia were less deletionist, and wonders if there could be a lot more stuff similar to TV Tropes on it.
TVTropes basically has an extensive ontology to categorize most of the important features of games, movies, and sometimes real life. Because games & movies are inspired by real life, even those portions are applicable.
Here are some phrases I think are kind of nice; each has a bunch of examples in the real world. These are often military relat
...I think the thing I find the most surprising about Expert Systems is that people expected them to work so early on, and apparently they did work in some circumstances. Some issues:
It seems inelegant to me that utility functions are created for specific situations, since these clearly aren't the same as the agent's overall utility function across all of their decisions. For instance, a model may estimate an agent's expected utility from the result of a specific intervention, but this clearly isn't quite right; the agent has a much more complicated utility function outside this intervention. According to a specific model, "Not having an intervention" could set "Utility = 0"; but for any real agent, it's quite likely their life wouldn't actually
...I've been trying to scour academic fields for discussions of how agents optimally reduce their expected error for various estimands (parameters to estimate). This seems like a really natural thing to me (the main reason why we choose some ways of predicting over others), but the literature seems kind of thin from what I can tell.
The main areas I've found have been Statistical Learning Theory and Bayesian Decision / Estimation Theory. However, Statistical Learning Theory seems to be pretty tied to Machine Learning, and Bayesian Decision / Estimation Theor
...I think brain-in-jar or head-in-jar setups are pretty underrated. By this I mean separating the head from the body and keeping it alive with other tooling. Maybe we could have a few large blood-processing plants for many heads, and the heads could be connected to nerve I/O that would be more efficient than finger -> keyboard I/O. This seems considerably easier than uploading, and possibly doable in 30-50 years.
I can't find much about how difficult it is. It's obviously quite hard and will require significant medical advances, but it's not clear just how many are need...
Western culture is known for being individualistic instead of collectivist. It's often assumed (with evidence) that individualistic cultures tend to be more truth seeking than collectivist ones, and that this is a major advantage.
But theoretically, there could be highly truth seeking collectivist cultures. One could argue that Bridgewater is a good example here.
In terms of collective welfare, I'm not sure if there are many advantages to individualism besides the truth seeking. A truth seeking collectivist culture seems pretty great to me, in theory.
Say Tim states, “There is a 20% probability that X will occur”. It’s not obvious to me what that means for Bayesians.
It could mean:
The default in literature on prediction markets and decision markets is to expect that resolutions should be real world events instead of probabilistic estimates by experts. For instance, people would predict "What will the GDP of the US in 2025 be?”, and that would be scored using the future “GDP of the US.” Let’s call these empirical resolutions.
These resolutions have a few nice properties:
Here is another point by @jacobjacob, which I'm copying here in order for it not to be lost in the mists of time:
Though just realised this has some problems if you expect predictors to be better than the evaluators: e.g. they're like "once the event happens everyone will see I was right, but up until then no one will believe me, so I'll just lose points by predicting against the evaluators"
Maybe in that case you could eventually also score the evaluators based on the final outcome… or kind of re-compensate people who were wronged the first time…
Here's an in-progress hierarchy of what's needed for information to be most useful to an organization or other multi-agent system. I'm sure there must be other very similar hierarchies out there, but I don't currently know of any quite like this.
Say you've come up with some cool feature that Apple could include in its next phone. You think this is a great idea and they should add it in the future.
You're outside of Apple, so the only way you have of interacting with them is by sending information through various channels. The question is: what things should yo
...Real-world complexity is a lot like pollution and a lot like taxes.
Pollution because it’s often an unintended negative externality of other decisions and agreements.
Whenever you write a new feature or create a new rule, that’s another thing you and others will need to maintain and keep track of. There are some processes that pollute a lot (messy bureaucratic systems producing ugly legislation) and processes that pollute a little (top programmers carefully adding to a codebase).
Taxes, because it introduces a steady cost to a whole bunch of interactions...
On Berkeley coworking:
I've recently been looking through available Berkeley coworking places.
The main options seem to be WeWork, NextSpace, CoWorking with Wisdom, and The Office: Berkeley. The Office seems basically closed now, CoWorking with Wisdom seemed empty when I passed by, and also seems fairly expensive, but nice.
I took a tour of WeWork and Nextspace. They both provide 24/7 access for all members, both have a ~$300/m option for open coworking, a ~$375/m for fixed desks, and more for private/shared offices. (At least now, with the pandemic. WeWork i...
Would anyone here disagree with the statement:
Utilitarians should generally be willing to accept losses of knowledge / epistemics for other resources, conditional on the expected value of the trade being positive.
There's a lot of arguing, of course, on if humans are rational, but this often mixes up two things: there's the "Von Neumann-Morgenstern utility function maximization" definition of "rational", and there's a hypothetical "rational" that a human could fulfill with constraints much more complicated than the classical approach, more in the direction of prospect theory, or Predictive Coding.
I think I regard the second definition as sufficiently not understood or defined that it isn't yet worth using in most conversation. It seems challenging, to say the least,
...I write a lot of these snippets to my Facebook wall, almost all just to my friends there. I just posted a batch of recent ones, and might post similar batches in the future. In theory it should be easy to post to both places, but in practice it seems a bit of a pain. Maybe in the future I'll use some solution with the API to make a Slack -> (Facebook + LessWrong short form) setup.
That said, posting just to Facebook is nice as a first pass, so if people get too upset with it, I don't need to make it totally public.
It’s a shame that our culture promotes casual conversation, but you’re generally not allowed to use it for much of the interesting stuff.
(Meets person with a small dog) “Oh, you have a dog, that’s so interesting. Before I get into specifics, can I ask for your age/gender/big 5/enneagram/IQ/education/health/personal wealth/family upbringing/nationality? How much well-being does the dog give you? Can you divide that up to include the social, reputational, self-motivational benefits? If it died tomorrow, and you mostly forgot about it, what percentage of your...
One question around the "Long Reflection" or around "What will AGI do?" is something like, "How bottlenecked will we be by scientific advances that we'll then need to spend significant resources on?"
I think some assumptions that this model typically holds are:
(3) seems quite uncertain to me in the steady state. I believe it makes an intuitiv
...I feel like a decent alternative to a spiritual journey would be an epistemic journey.
An epistemic journey would basically involve something like reading a fair bit of philosophy and other thought, thinking, and becoming less wrong about the world.
Instillation, Proliferation, Amplification
Paul Christiano and Ought use the terminology of Distillation and Amplification to describe a high-level algorithm of one type of AI reasoning.
I've wanted to come up with an analogy for forecasting systems. I previously named a related concept Prediction-Augmented Evaluation Systems, which was somewhat renamed to "Amplification" by Jacobjacob in this post.
I think one thing that's going on is that "distillation" doesn't have an exact equivalent in forecasting setups. The term "distillation" comes with the assumptions:
Agent-based modeling seems like one obvious step forward to me for much of social-science related academic progress. OpenAI's Hide and Seek experiment was one that I am excited about, but it is very simple and I imagine similar work could be greatly extended for other fields. The combination of simulation, possible ML distillation on simulation (to make it run much faster), and effective learning algorithms for agents, seems very powerful.
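For concreteness, here's the skeleton of a trivial agent-based model (my own toy example, unrelated to the Hide and Seek work): agents on a ring repeatedly nudge an "opinion" toward their neighbors' average, and macro-level convergence emerges from that micro-level rule.

// Minimal agent-based model: opinion dynamics on a ring of agents.
// Each step, every agent moves its opinion partway toward its neighbors' mean.
type Agent = { opinion: number };

const numAgents = 20;
const agents: Agent[] = Array.from({ length: numAgents }, () => ({
  opinion: Math.random(),
}));

function step(population: Agent[], rate = 0.3): void {
  const next = population.map((agent, i) => {
    const left = population[(i - 1 + population.length) % population.length].opinion;
    const right = population[(i + 1) % population.length].opinion;
    const neighborMean = (left + right) / 2;
    return agent.opinion + rate * (neighborMean - agent.opinion);
  });
  next.forEach((opinion, i) => (population[i].opinion = opinion));
}

for (let t = 0; t < 50; t++) step(agents);
console.log(agents.map((a) => a.opinion.toFixed(2)).join(" "));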
However, agent-based modeling still seems quite infrequently used within Academia. My impression is that agent-based so
...Seeking feedback on this AI Safety proposal:
(I don't have experience in AI experimentation)
I'm interested in the question, "How can we use smart AIs to help humans with strategic reasoning?"
We don't want the solution to be, "AIs just tell humans exactly what to do without explaining themselves." We'd prefer situations where smart AIs can explain to humans how to think about strategy, and this information makes humans much better at doing strategy.
One proposal to make progress on this is to set a benchmark for having smart AIs help out dumb AIs by pr...
Named Footnotes: A (likely) mediocre proposal
Epistemic status: This is probably a bad idea, because it's quite obvious yet not done; i.e. Chesterton's fence.
One bad practice in programming is to have a lot of unnamed parameters. For instance,
createPost(author, post, comment, name, id, privacyOption, ...)
Instead it's generally better to use Named Parameters, like,
createPost({author, post, comment, name, id, privacyOption})
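One standard upside of this (a general argument for named parameters, not specific to this example): call sites stay correct and readable even if the parameter list is reordered or grows, e.g.,

createPost({privacyOption, name, author, post, comment, id})

still works as intended, while the positional version would silently mix up its arguments.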
Footnotes/endnotes seem similar. They are ordered by number, but this can be quite messy. It's particularly annoying for autho
...It seems really hard to deceive a Bayesian agent who thinks you may be deceiving them, especially in a repeated game. I would guess there could be interesting theorems about Bayesian agents that are attempting to deceive one another; as in, in many cases their ability to deceive the other would be highly bounded or zero, especially if they were in a flexible setting with possible pre-commitment devices.
To give a simple example, agent A may tell agent B that they believe X, even though they internally believe Y. However, if this were somewhat repeat
...If you think it’s important that people defer to “experts”, then it should also make sense that people should decide which people are “experts” by deferring to “expert experts”.
There are many groups that claim to be the “experts”, and ask that the public only listens to them on broad areas they claim expertise over. But groups like this also have a long history of underperforming other clever groups out there.
The US government has a long history of claiming “good reasons based on classified intel” for military interventions, where later this turns out to b...
People are used to high-precision statements given by statistics (the income in 2016 was $24.4 Million), and are used to low-precision statements given by human intuitions (From my 10 minute analysis, I think our organization will do very well next year). But there’s a really weird cultural norm against high-precision, intuitive statements. (From my 10 minute analysis, I think this company will make $28.5 Million in 2027).
Perhaps in part because of this norm, I think that there are a whole lot of gains to be made in this latter cluster. It’s not trivial to do this well, but it’s possible, and the potential value is really high.
I find SEM models to be incredibly practical. They might often over-reach a bit, but at least they present a great deal of precise information about a certain belief in a readable format.
I really wish there were more attempts at making diagrams like these in cases where there isn't statistical data. For example, to explain phenomena like:
In all of these cases, breadth and depth...
There's a big stigma now against platforms that give evaluations or ratings of individuals or organizations along various dimensions. See the rating episode of Black Mirror, or the discussion of the Chinese social credit system.
I feel like this could be a bit of a missed opportunity. This sort of technology is easy to do destructively, but there are a huge number of benefits if it can be done well.
We already have credit scores, resumes (which are effectively scores), and social media metrics. All of these are really crude.
Some examples of things that could be possi...
Voting systems vs. utility maximization
I've seen a lot of work on voting systems, and on utility maximization, but very few direct comparisons. Yet we often have to prioritize systems that favor one or the other, and our research effort between the two is limited, so it seems useful to compare them.
Voting systems act very differently from utility maximization. There's a big body of literature on ideal voting rules, and it's generally quite different from that on utility maximization.
Proposals like quadratic voting are clearly in the voting category...
Prediction evaluations may be best when minimally novel
Imagine a prediction pipeline is resolved with a human/judgemental evaluation. For instance, a group today starts predicting what a trusted judge 10 years from now will say for the question, "How much counterfactual GDP benefit did policy X make, from 2020-2030?"
So, there are two stages:
One question for the organizer of such a system is how many resources to allocate to the prediction step vs. the evaluation step. It could be expensive to pay for both predictors and evaluators,
...