All of snarles's Comments + Replies

Thanks for the link, MakoYass.

I am familiar with the concept of superrationality, which seems similar to what you are describing. The lack of special relationship between observer moments--let's call it non-continuity--is also a common concept in many mystical traditions. I view both of these concepts as different from the concept of unity, "we are all one".

Superrationality combines a form of unity with a requirement for rationality. I could think that "we are all one" without thinking that we should behave rationally. If I th... (read more)

I am not familiar with those concepts. References would be appreciated. 🙏

4mako yass
I'm not sure if these articles try to convey the personal, spiritual dimension of LDT's claims about agency, but they describe what it is https://arbital.com/p/logical_dt/?l=5gc Basically: LDT is the realisation that we should act as if our decisions will be reflected by every similarly rational agent that exists, it is one way of saying "all is one, you are not separate from others". It could even be framed as a paraphrasing of a constrained notion of karma, in a way ("your policy will be reflected back at you by others"). What's extraordinary about it is it says these things in the most precise, pragmatic terms. Metaphysical continuity of measure.. you're probably familiar with the concept even if you wouldn't have a name for it.. like.. you know how people worry that being teleported would be a kind of death? Because there's an interruption, a discontinuity between selves? And then one may answer, "but there is a similar, perhaps greater discontinuity every night, during sleep, but you don't seem to fear that." I don't know how many of us have noticed this, I've met a few, but we're starting to realise that anthropic measure, the substance of experience or subjectivity, there isn't some special relationship between observer-moments that're close in time and space, there's just a magnitude, and the magnitude can change over time. If we want to draw a line connecting observer-moments, it's artificial. So what I'm getting at is, that substance of experience can't really be divided into a bunch of completely separate lines of experience. If we care about one being's experience, we should generally care about every being's experience. We don't have to, of course, because of the orthogonality thesis, but I think most people will once they get it.
It seems obvious that your change in relationship with suffering constitutes a kind of value shift, doesn't it?

This is not obvious to me. In the first place, I never had the value "avoid suffering", even before I started my practices. Since before I even knew the concept of suffering, I have always had the compulsion to avoid suffering, but my value was to transcend it.

What's your relationship with value drift? Are you unafraid of it? That gradual death by mutation? The infidelity of your future self?

I am afraid of value drift, but I ... (read more)

Anyone, it seems, can have the experience of “feeling totally fine and at ease while simultaneously experiencing intense … pain”[1]:

It would greatly please me if people could achieve a deeper understanding of suffering just by taking analgesics. If that were the case, perhaps we should encourage people to try them just for that purpose. However, I'm guessing that the health risks, especially cognitive side-effects (a reduction of awareness that would preclude the possibility of gaining any such insight), risks of addiction and logistical issues surr... (read more)

5mako yass
It seems obvious that your change in relationship with suffering constitutes a kind of value shift, doesn't it? What's your relationship with value drift? Are you unafraid of it? That gradual death by mutation? The infidelity of your future self? Do you see it as a kind of natural erosion, a more vital aspect of the human telos than the motive aspects it erodes?

I would assign that a probability less than 0.1, and that's because I already experienced some insights which defy verbal transmission. For instance, I feel that I am close to experientially understanding the question of "what is suffering?" The best way I can formulate my understanding into words is, "there is no such thing as suffering. It is an illusion." I don't think additional words or higher-context instructions would help in conveying my understanding to someone who cannot relate to the experience of feeling totall... (read more)

Anyone, it seems, can have the experience of “feeling totally fine and at ease while simultaneously experiencing intense … pain”[1]:

It turns out there is painless pain: lobotomized people experience that, and “reactive dissociation” is the phrase used to describe the effects sometimes of analgesics like morphine when administered after pain has begun, and the patient reports, to quote Dennett 1978 [PDF] (emphasis in original), that “After receiving the analgesic subjects commonly report not that the pain has disappeared or diminished (as with aspirin) bu

... (read more)

I'm reducing my subjective probability that you will abandon rationality...

I suppose what you are attempting is similar to what Buddha did in the first place. The sages of his time must have felt pained to see their beautiful non-dualism sliced and diced into mass-produced sutras, rather than the poems and songs and mythology which were, up until then, the usual vehicle of expression for these truths.

I guess I'm just narcissistic enough to still be a Quinean naturalist and say 'yep, that is also me.'

Considering God to be part of yourse... (read more)

3romeostevensit
yeah, intimacy with aversions is one decent compression of the path.

The truths of General Relativity cannot be conveyed in conventional language. But does one have to study the underlying mathematics before evaluating its claims?

Just as there exists a specialized language that accurately conveys General Relativity, there similarly exists a specialized language (mythological language) for conveying mystical truths. However, I think the wrong approach would be to try to understand that language without having undergone the necessary spiritual preparation. As St. Paul says in 1 Corinthians 2:14

The natural person does not a
... (read more)
1SpectrumDT
General Relativity makes testable predictions. Conversely, whenever I hear descriptions of "nonduality", it is not at all clear that these claims make any predictions at all. Most statements I have heard about nonduality seem like non-statements with no ramifications. But I might be wrong. You do bring up one example of a potentially testable prediction of nonduality: Why merely "others around them"? If "we are all One", I would think it should also be possible to detect the thoughts and feelings of people on the other side of the Earth.
0mako yass
Hm, I think there might be something really interesting here. If I were to try to phrase this claim about God in terms of LDT's synchronicity, and the incoherence of the notion of any metaphysical continuity between observer-moments (or, vessels of anthropic measure), would you agree that we're talking about the same thing? (Are you familiar with these terms?)
8Said Achmiz
Yes. Of course you do. The delusion that such statements “approximately capture the truth” of things like GR is pervasive, but no less a delusion for it. Once again, this is delusion. Eliezer wrote an entire sequence about this. Basically your entire set of claims and comments is mostly “mysterious answers to mysterious questions”.

I started out as a self-identified rationalist, got fascinated by mysticism and 'went native.' Ever since, I have been watching the rationality community from the sidelines to see if anyone else will 'cross over' as well.

I predict that if Romeo continues to work on methods for teaching meditation, he will eventually also 'go mystical' and publicly rescind his claim that all perceived metaphysical insights can be explained as pathological disconnects with reality caused by neural rewiring. Conditional on his continuing to teach, I... (read more)

2thecupisblue
I understand your viewpoint. I've always been a rationalist but being raised while being exposed to topics of taoism and buddhism always made me doubt that it's everything. Then, after understanding more and more of how the reality works but after looking 'mystic' and talking about it to people who did psychedelics I realised that they're saying the same things. Then I took some and even tho I still have doubt, I know the truth is what it is. The Gateway Experience by the CIA describes diving into 'subspace' or 'taking off this reality blanket' pretty nicely. If you can dive into the zone deep enough then your consciousness gets pulled up into a state of uninterrupted connection to the one - it's the brain that makes us human, that give us the ego, but it's also what renders this reality for our consciousness in this life. Everyone is trying to talk about the same thing, but it's hard when people are so stuck onto words and status and images. I believe that with higher entropy and complexity of technology, we'll all achieve being the one on both this plane or create a new universe so we can be everything again. We don't have to waste time describing it constantly.
9romeostevensit
I've had multiple religious experiences (total reality dissolution, contact with seeming entities of infinite benevolence etc.) and I guess I'm just narcissistic enough to still be a Quinean naturalist and say 'yep, that is also me.' I'd say I basically endorse Theory M already? I posit that words (and images, and felt senses) are low dimensional projections of many-dimensional objects.
Benquo210

Why no probability on "there exists a truth that is very difficult to express in conventional language, such that as contexts change, fixed written accounts of it tend to decay into uselessness, it's so difficult that even most people who get it lack the verbal skill to express it clearly in their words in their time, this is compounded by most people needing higher-context instruction than words alone to get to the point where the words can mean anything to them, and because of this the vast majority of people trying to talk about this round it ... (read more)

3Elo
There are a few of us that have "crossed over" as you call it. From my journey it seems to be a developmentally relevant stage.
6Kaj_Sotala
This seems hard to engage with, given that you've said little about the mystical truth in question, and in fact stated that it can't be expressed in conventional language. How can I evaluate the claim that M is true, if I don't know what M is?

Cool, I will take a look at the paper!

Great comment, mind if I quote you later on? :)

That said, if you have example problems where a logically omniscient Bayesian reasoner who incorporates all your implicit knowledge into their prior would get the wrong answers, those I want to see, because those do bear on the philosophical question that I currently see Bayesian probability theory as providing an answer to--and if there's a chink in that armor, then I want to know :-)

It is well known where there might be chinks in the armor, which is what happens when two logically omniscient Bayesians si... (read more)

7So8res
Sure! I would like to clarify, though, that by "logically omniscient" I also meant "while being way larger than everything else in the universe." I'm also readily willing to admit that Bayesian probability theory doesn't get anywhere near solving decision theory, that's an entirely different can of worms where there's still lots of work to be done. (Bayesian probability theory alone does not prescribe two-boxing, in fact; that requires the addition of some decision theory which tells you how to compute the consequences of actions given a probability distribution, which is way outside the domain of Bayesian inference.) Bayesian reasoning is an idealized method for building accurate world-models when you're the biggest thing in the room; two large open problems are (a) modeling the world when you're smaller than the universe and (b) computing the counterfactual consequences of actions from your world model. Bayesian probability theory sheds little light on either; nor is it intended to. I personally don't think it's that useful to consider cases like "but what if there's two logically omniscient reasoners in the same room?" and then demand a coherent probability distribution. Nevertheless, you can do that, and in fact, we've recently solved that problem (Benya and Jessica Taylor will be presenting it at LORI V next week, in fact); the answer, assuming the usual decision-theoretic assumptions, is "they play Nash equilibria", as you'd expect :-)

If the game is really working like they say it is, then the frequentist is often concentrating probability around some random psi for no good reason, and when we actually draw random thetas and check who predicted better, we'll see that they actually converged around completely the wrong values. Thus, I doubt the claim that, setting up the game exactly as given, the frequentist converges on the "true" value of psi. If we assume the frequentist does converge on the right answer, then I strongly suspect either (1) we should be using a prior where

... (read more)
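
A minimal sketch of the style of setup being argued about here (the particular theta(x) and pi(x) below are invented, and this is far cruder than the actual Robins/Ritov construction): the Horvitz-Thompson estimator uses the known sampling probabilities and converges to psi = E[Y] without modeling theta at all, while the naive average over the observed Y does not.

```python
# Sketch (not from the original post): the Horvitz-Thompson estimator in a
# Robins/Ritov-style setup. theta(x) and pi(x) are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

x = rng.uniform(size=n)                      # covariates
pi = 0.05 + 0.9 * (x > 0.8)                  # known sampling probabilities pi(x)
theta = 0.2 + 0.6 * (x > 0.8)                # unknown "true" E[Y | X = x]
y = rng.binomial(1, theta)                   # full outcomes (mostly unobserved)
r = rng.binomial(1, pi)                      # r = 1 means Y is observed

psi_true = theta.mean()                      # psi = E[Y]
psi_naive = y[r == 1].mean()                 # ignores pi(x): biased
psi_ht = np.mean(r * y / pi)                 # Horvitz-Thompson: unbiased for any theta

print(psi_true, psi_naive, psi_ht)
```
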
7So8res
I understand the "no methods only justifications" view, but it's much less comforting when you need to ultimately build a reliable reasoning system :-) I remain mostly unperturbed by this game. You made a very frequentist demand. From a Bayesian perspective, your demand is quite a strange one. If you force me to achieve it, then yeah, I may end up doing frequentist-looking things. In attempts to steel-man the Robins/Wasserman position, it seems the place I'm supposed to be perturbed is that I can't even achieve the frequentist result unless I'm willing to make my prior for theta depend on pi, which seems to violate the spirit of Bayesian inference? Ah, and now I think I see what's going on! The game that corresponds to a Bayesian desire for this frequentist property is not the game listed; it's the variant where theta is chosen adversarially by someone who doesn't want you to end up with a good estimate for psi. (Then the Bayesian wants a guarantee that they'll converge for every theta.) But those are precisely the situations where the Bayesian shouldn't be ignoring pi; the adversary will hide as much contrary data as they can in places that are super-difficult for the spies to observe. Robins and Wasserman say "once a subjective Bayesian queries the randomizer (who selected pi) about the randomizer’s reasoned opinions concerning theta (but not pi) the Bayesian will have independent priors." They didn't show their math on this, but I doubt this point carries their objection. If I ask the person who selected pi how theta was selected, and they say "oh, it was selected in response to pi to cram as much important data as possible into places that are extraordinarily difficult for spies to enter," then I'm willing to buy that after updating (which I will do) I now have a distribution over theta that's independent of pi. But this new distribution will be one where I'll eventually converge to the right answer on this particular pi! So yeah, if I'm about to start play

Ok. So the scenario is that you are sampling only from the population f(X)=1.

EDIT: Correct, but you should not be too hung up on the issue of conditional sampling. The scenario would not change if we were sampling from the whole population. The important point is that we are trying to estimate a conditional mean of the form E[Y|f(X)=1]. This is a concept commonly seen in statistics. For example, the goal of non-parametric regression is to estimate a curve defined by g(x) = E[Y|X=x].
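
To make the two estimation targets concrete, here is a toy sketch (the data-generating process is invented for illustration): the subsample average estimates E[Y|f(X)=1] for a selection function f, and a kernel average estimates the regression curve g(x0) = E[Y|X=x0] at a point.

```python
# Toy illustration (not from the post): two ways to estimate a conditional mean.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=5_000)
y = np.sin(x) + rng.normal(scale=0.5, size=x.size)

# E[Y | f(X) = 1] for the selection f(x) = 1{x > 0}: just average the subsample.
cond_mean = y[x > 0].mean()

# Nonparametric regression g(x0) = E[Y | X = x0] via a Nadaraya-Watson kernel average.
def nw_estimate(x0, bandwidth=0.2):
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)   # Gaussian kernel weights
    return np.sum(w * y) / np.sum(w)

print(cond_mean, nw_estimate(1.0))
```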

Can you exhibit a simple example of the scenario in the section "A non-parametric Bayesian approach"

... (read more)
0Richard_Kennaway
That looks like a parametric model. There is one parameter, a binary variable that chooses h or j. A belief about that parameter is a probability p that h is the function. Yes, I can see that updating p on sight of the data may give a better estimate of E[Y|f(X)=1], which is known a priori to be either h(1) or j(1). I expect it would be similar for small numbers of parameters also, such as a linear relationship between X and Y. Using the whole sample should improve on only looking at the subsample around f(X)=1. However, in the nonparametric case (I think you are arguing) this goes wrong. The sample size is not large enough to estimate a model that gives a narrow estimate of E[Y|f(X)=1]. Am I understanding you yet? It seems to me that the problem arises even before getting to the nonparametric case. If a parametric model has too many parameters to estimate from the sample, and the model predictions are everywhere sensitive to all of the parameters (so it cannot be approximated by any simpler model) then trying to estimate E[Y|f(X)=1] by first fitting the model, then predicting from the model, will also not work. It so clearly will not work that it must be a wrong thing to do. It is not yet clear to me that a Bayesian statistician must do it anyway. The set {Y|f(X)=1} conveys information about E[Y|f(X)=1] directly, independently of the true model (assumed for the purpose of this discussion to be within the model space being considered). Estimating it via fitting a model ignores that information. Is there no Bayesian method of using it? A partial answer to your question would be that the less the model helps, the less attention you pay it relative to calculating Mean{Y|f(X)=1}. I don't have a mathematical formulation of how to do that though.
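
A sketch of the two-hypothesis case described above (h, j, and the noise model are invented here): the posterior probability that g = h is updated on the whole sample, and the resulting mixture of h(1) and j(1) is compared with the raw mean of the subsample where f(X)=1.

```python
# Sketch of the two-function parametric case: update P(g = h) on all the data,
# then estimate E[Y | f(X)=1] as a posterior-weighted mix of h(1) and j(1),
# versus the plain average of the selected subsample.
import numpy as np

rng = np.random.default_rng(2)
h = lambda x: 0.3 + 0.1 * x
j = lambda x: 0.7 - 0.2 * x
sigma = 0.5

x = rng.uniform(0, 1, size=50)
y = h(x) + rng.normal(scale=sigma, size=x.size)      # truth: g = h

def loglik(g):
    return -0.5 * np.sum((y - g(x)) ** 2) / sigma**2

log_odds = loglik(h) - loglik(j)                     # uniform prior on {h, j}
p_post = 1 / (1 + np.exp(-log_odds))                 # posterior P(g = h | data)

model_estimate = p_post * h(1.0) + (1 - p_post) * j(1.0)
subsample_estimate = y[x > 0.8].mean()               # uses only points in the f(X)=1 sliver

print(model_estimate, subsample_estimate)
```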

Update from the author:

Thanks for all of the comments and corrections! Based on your feedback, I have concluded that the article is a little bit too advanced (and possibly too narrow in focus) to be posted in the main section of the site. However, it is clear that there is a lot of interest in the general subject. Therefore, rather than posting this article to main, I think it would be more productive to write a "Philosophy of Statistics" sequence which would provide the necessary background for this kind of post.

6IlyaShpitser
I enjoyed your article, and learned things from you. I want to encourage you to post more on stats subjects here.

The confusion may come from mixing up my setup and Robins/Ritov's setup. There is no missing data in my setup.

I could write up my intuition for the hierarchical model. It's an almost trivial result if you don't assume smoothness, since for any x1,...,xn the parameters g(x1)...g(xn) are conditionally independent given p and distributed as F(p), where F is the maximum entropy Beta with mean p (I don't know the form of the parameters alpha(p) and beta(p) off-hand). Smoothness makes the proof much more difficult, but based on high-dimensional intuition one ... (read more)

I didn't reply to your other comment because although you are making valid points, you have veered off-topic since your initial comment. The question of "which observations to make?" is not a question of inference but rather one of experimental design. If you think this question is relevant to the discussion, it means that you understand neither the original post nor my reply to your initial comment. The questions I am asking have to do with what to infer after the observations have already been made.

1Richard_Kennaway
Ok. So the scenario is that you are sampling only from the population f(X)=1. Can you exhibit a simple example of the scenario in the section "A non-parametric Bayesian approach" with an explicit, simple class of functions g and distribution over them, for which the proposed procedure arrives at a better estimate of E[ Y | f(X)=1 ] than the sample average? Is the idea that it is intended to demonstrate, simply that prior knowledge about the joint distribution of X and Y would, combined with the sample, give a better estimate than the sample alone?

By "importance sampling distribution" do you mean the distribution that tells you whether Y is missing or not?

Right. You could say the cases of Y1|D=1 you observe in the population are an importance sample from Y1, the hypothetical population that would result if everyone in the population were treated. E[Y1], the quantity to be estimated, is the mean of this hypothetical population. The importance sampling weights are q(x) = Pr[D=1|x]/p(x), where p(x) is the marginal distribution (i.e. you invert these weights to get the average); the importance sampling distribution is the conditional density of X|D=1.
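
A small simulation of the reweighting idea (using the standard inverse-probability form of the weights, which may be parameterized slightly differently from the q(x) above; all numbers are invented): the treated units are an importance sample from the Y1 population, and reweighting them by 1/Pr[D=1|x] recovers E[Y1].

```python
# Sketch: the treated sample over-represents units with high Pr[D=1|x]; inverse
# probability weights undo that and recover E[Y1]. Everything numeric is invented.
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

x = rng.binomial(1, 0.5, size=n)             # a single binary covariate
p_treat = np.where(x == 1, 0.8, 0.2)         # Pr[D = 1 | x]
d = rng.binomial(1, p_treat)                 # treatment assignment
y1 = rng.normal(loc=1.0 + 2.0 * x)           # potential outcome under treatment

true_mean = 1.0 + 2.0 * 0.5                  # E[Y1] = 2.0
naive = y1[d == 1].mean()                    # mean of the importance sample: biased toward x = 1
w = 1.0 / p_treat[d == 1]                    # importance weights for the treated units
ipw = np.sum(w * y1[d == 1]) / np.sum(w)     # self-normalized reweighted estimate of E[Y1]

print(true_mean, naive, ipw)
```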

4IlyaShpitser
Still slightly confused. I think Robins and Ritov have a theorem (cited in your blog link) claiming that to get E[Y] if Y is MAR you need to incorporate info about 1/p(x) somewhere into your procedure (?the prior?) or you don't get uniform consistency. Is your claim that you can get around this via some hierarchical model, e.g.: Is this just intuition or did you write this up somewhere? That sounds very interesting. ---------------------------------------- Why did you start thinking about conditional sampling at all? If estimating E[Y] via importance sampling/inverse weights/covariate adjustment is already something of a difficulty for Bayesians, why think about E[Y | event]? Isn't that trivially at least as hard?

I will go ahead and answer your first three questions

  1. Objective Bayesians might have "standard operating procedures" for common problems, but I bet you that I can construct realistic problems where two Objective Bayesians will disagree on how to proceed. At the very least the Objective Bayesians need an "Objective Bayesian manifesto" spelling out what are the canonical procedures. For the "coin-flipping" example, see my response to RichardKennaway where I ask whether you would still be content to treat the problem as coin-fl

... (read more)

It is worth noting that the issue of non-consistency is just as troublesome in the finite setting. In fact, in one of Wasserman's examples he uses a finite (but large) space for X.

Yes, I think you are missing something (although it is true that causal inference is a missing data problem).

It may be easier to think in terms of the potential outcomes model. Y0 is the outcome under no treatment, Y1 is the outcome under treatment; you only ever observe either Y0 or Y1, depending on whether D=0 or 1. Generally you are trying to estimate E[Y1] or E[Y0] or their difference.

The point is that the quantity Robins and Wasserman are trying to estimate, E[Y], does not depend on the importance sampling distribution. Whereas the quantity I am trying ... (read more)
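
To make the potential-outcomes setup above concrete, a minimal rendering (all numbers invented): each unit carries both Y0 and Y1, but the observed data contain only one of them per unit, so E[Y1] and E[Y0] are means of columns we never fully see.

```python
# A minimal rendering of the potential-outcomes setup: both columns exist on paper,
# but a real dataset only ever contains y_obs.
import numpy as np

rng = np.random.default_rng(4)
n = 10

y0 = rng.normal(loc=0.0, scale=1.0, size=n)   # outcome if untreated
y1 = y0 + 1.5                                  # outcome if treated (true effect = 1.5)
d = rng.binomial(1, 0.5, size=n)               # which potential outcome gets revealed
y_obs = np.where(d == 1, y1, y0)               # the only outcome column a real dataset contains

print("E[Y1] target:", y1.mean(), " E[Y0] target:", y0.mean())
print("observed rows:", list(zip(d, np.round(y_obs, 2))))
```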

2IlyaShpitser
Not following. By "importance sampling distribution" do you mean the distribution that tells you whether Y is missing or not? If so, changing this distribution will change what you have to do to estimate E[Y] in the Robins/Wasserman case. For example, if you change the distribution to just depend on an independent coin flip you move from "MAR" to "MCAR" (in causal inference from "conditional ignorability" to "ignorability.") Then your procedure depends on this distribution (but your target does not, this is true). Similarly "p(y | do(a))" does not change, but the functional of the observed data equal to "p(y | do(a))" will change if you change the treatment assignment distribution. (Btw, people do versions of ETT where D is complicated and not a simple treatment event. Actually I have something in a recent draft of mine called "effect of treatment on the indirectly treated" that's like that).

My example is very similar to the Robins/Wasserman example, but you end up drawing different conclusions. Robins/Wasserman show that you can't make sense of importance sampling in a Bayesian framework. My example shows that you can't make sense of "conditional sampling" in a Bayesian framework. The goal of importance sampling is to estimate E[Y], while the goal of conditional sampling is to estimate E[Y|event] for some event.

We did talk about this before, that's how I first learnt of the R/W example.

4IlyaShpitser
I think these are isomorphic, estimating E[Y] if Y is missing at random conditional on C is the same as estimating E[Y | do(a)] = E[Y | "we assign you to a given C"]. "Causal inference is a missing data problem, and missing data is a causal inference problem." ---------------------------------------- Or I may be "missing" something. :)

I do not need to model the process f by which that population was selected, only the behaviour of Y within that population?

There are some (including myself and presumably some others on this board) who see this practice as epistemologically dubious. First, how do you decide which aspects of the problem to incorporate into your model? Why should one only try to model E[Y|f(X)=1] and not the underlying function g(x)=E[Y|x]? If you actually had very strong prior information about g(x), say that "I know g(x)=h(x) with probability 1/2 or g(x) = j(x) ... (read more)

0Richard_Kennaway
That question must be directed at both the Bayesian and the frequentist. In my other comment I gave two toy examples, in one of which looking at a wider sample is provably inferior to looking only at f(X)=1, and one in which the reverse is the case. Anyone faced with the problem of estimating E[Y|f(X)=1] needs to decide, somehow, what observations to make. How do a Bayesian or a frequentist make that decision?
4Richard_Kennaway
What would it tell you if you could? The problem is to estimate Y for a certain population. Therefore, look at that population. I am not seeing a reason why one would consider modelling g, so I am at a loss to answer the question, why not model g? Jaynes and a few others generally write things like E[ Y | I ] or P( Y | I ) where I represents "all of your background knowledge", not further analysed. f(X)=1 is playing the role of I here. It's a placeholder for the stuff we aren't modelling and within which the statistical reasoning takes place. Suppose f was a very simple function, for example, the identity. You are asked to estimate E[ Y | X=1 ]. What do the Bayesian and the frequentist do in this case? They are still only being asked about the population for which X=1. Can either of them get better information about E[ Y | X=1 ] by looking (also) at samples where X is not 1? The example is a simplification of Wasserman's; I'm not sure if a similar answer can be made there. BTW, I'm not a statistician, and these aren't rhetorical questions. ETA: Here's an even simpler example, in which it might be possible to demonstrate mathematically the answer to the question, can better information be obtained about E[ Y | X=1 ] by looking at members of the population where X is not 1? Suppose it is given that X and Y have a bivariate normal distribution, with unknown parameters. You take a sample of 1000, and are given a choice of taking it either from the whole population, or from that sliver for which X is in some range 1 +/- ε for ε very small compared with the standard deviation of X. You then use whatever tools you prefer to estimate E[ Y | X=1 ]. Which method of sampling will allow a better estimate? ETA2: Here is my own answer to my last question, after looking up some formulas concerning linear regression. Let Y1 be the mean of Y in a sample drawn from a narrow neighbourhood of X=1, and let Y2 be the estimate of E[ Y | X=1 ] obtained by doing linear regression on a
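
A quick simulation of the ETA2 question (the particular correlation and sample sizes are arbitrary): draw the sample either from the sliver X in 1 +/- eps or from the whole population, estimate E[Y|X=1] by the sliver mean or by the fitted regression line respectively, and compare the errors over many repetitions.

```python
# Sketch of the ETA2 comparison for a standard bivariate normal with correlation rho,
# for which E[Y | X = 1] = rho. The parameters are arbitrary illustration choices.
import numpy as np

rng = np.random.default_rng(5)
rho, n, eps, true_value = 0.5, 1000, 0.01, 0.5

def sliver_estimate():
    x = 1.0 + rng.uniform(-eps, eps, size=n)             # sample confined to the sliver near X = 1
    y = rho * x + rng.normal(scale=np.sqrt(1 - rho**2), size=n)
    return y.mean()

def regression_estimate():
    x = rng.normal(size=n)                                # sample from the whole population
    y = rho * x + rng.normal(scale=np.sqrt(1 - rho**2), size=n)
    slope, intercept = np.polyfit(x, y, 1)
    return slope * 1.0 + intercept

sliver = np.array([sliver_estimate() for _ in range(500)])
regress = np.array([regression_estimate() for _ in range(500)])
print("RMSE sliver:    ", np.sqrt(np.mean((sliver - true_value) ** 2)))
print("RMSE regression:", np.sqrt(np.mean((regress - true_value) ** 2)))
```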

Good catch, it should be Beta(991, 11). The prior is uniform = Beta(1, 1) and the data is (990 successes, 10 fails).
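
Spelled out, the conjugate Beta-Binomial update behind those numbers:

```python
# Conjugate Beta-Binomial update: uniform Beta(1, 1) prior plus 990 successes and
# 10 failures gives a Beta(1 + 990, 1 + 10) = Beta(991, 11) posterior.
alpha0, beta0 = 1, 1                            # uniform prior
successes, failures = 990, 10

alpha_post = alpha0 + successes                 # 991
beta_post = beta0 + failures                    # 11
print(alpha_post, beta_post)                    # Beta(991, 11)
print(alpha_post / (alpha_post + beta_post))    # posterior mean ~= 0.989
```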

0OrphanWilde
It looks like you didn't replace all the distributions with the update?
2gjm
Yup, sorry, I made two separate mistakes there (misparameterizing the beta distribution by alpha+beta,beta rather than alpha,beta, and the off-by-one error) -- but at least my wrong parameters were less wrong than yours :-).

How do you get the top portion of the second payoff matrix from the first? Intuitively, it should be by replacing the Agent A's payoff with the sum of the agents' payoffs, but the numbers don't match.

Most people are altruists but only to their in-group, and most people have very narrow in-groups. What you mean by an altruist is probably someone who is both altruistic and has a very inclusive in-group. But as far as I can tell, there is a hard trade-off between belonging to a close-knit, small in-group and identifying with a large, diverse but weak in-group. The time you spend helping strangers is time taken away from potentially helping friends and family.

0leplen
It's the average ({4-2}/2), rather than the sum, since the altruistic agent is interested in maximizing the average utility. The tribal limitations on altruism that you allude to are definitely one of the tendencies that much of our cultural advice on altruism targets. In many ways the expanding circle of trust, from individuals, to families, to tribes, to cities, to nation states, etc. has been one of the fundamental enablers of human civilization. I'm less sure about the hard trade-off that you describe. I have a lot of experience being a member of small groups that have altruism towards non-group members as an explicit goal. In that scenario, helping strangers also helps in-group members achieve their goals. I don't think large-group altruism precludes you from belonging to small in-groups, since very few in-groups demand any sort of absolute loyalty. While full effort in-group altruism, including things like consciously developing new skills to better assist your other group members, would absolutely represent a hard trade-off with altruism on a larger scale, people appear to be very capable of belonging to a large number of different in-groups. This implies that the actual level of commitment required to be a part of most in-groups is rather low, and the socially normative level of altruism is even lower. Belonging to a close-knit in-group with a particularly needy member (e.g. having a partially disabled parent, spouse, or child) may shift the calculus somewhat, but for most in-groups being a member in good standing has relatively undemanding requirements. Examining my own motivations it seems that for many of the groups that I participate in, most of the work that I do toward fulfilling expectations and helping others within those groups is more directly driven by my desire for social validation than by my selfless perception of the intrinsic value of the other group members.
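
For illustration of the averaging transformation leplen describes (the payoff numbers here are hypothetical, not the ones from the original post):

```python
# The altruistic agent's entry becomes the *average* of both agents' payoffs, not the sum.
payoffs = {                      # (A's payoff, B's payoff) for each pair of moves
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"):    (-2, 4),
    ("defect", "cooperate"):    (4, -2),
    ("defect", "defect"):       (0, 0),
}

altruistic = {
    moves: ((a + b) / 2, b)      # A maximizes average utility; B is unchanged
    for moves, (a, b) in payoffs.items()
}
print(altruistic)                # e.g. ("defect", "cooperate") -> (1.0, -2)
```
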
snarles00

Like V_V, I don't find it "reasonable" for utility to be linear in things we care about.

I will write a discussion topic about the issue shortly.

EDIT: Link to the topic: http://lesswrong.com/r/discussion/lw/mv3/unbounded_linear_utility_functions/

snarles30

I'll need some background here. Why aren't bounded utilities the default assumption? You'd need some extraordinary arguments to convince me that anyone has an unbounded utility function. Yet this post and many others on LW seem to implicitly assume unbounded utility functions.

2JamesPfeiffer
  1. We don't need an unbounded utility function to demonstrate Pascal's Mugging. Plain old large numbers like 10^100 are enough.

  2. It seems reasonable for utility to be linear in things we care about, e.g. human lives. This could run into a problem with non-uniqueness, i.e., if I run an identical computer program of you twice, maybe that shouldn't count as two. But I think this is sufficiently murky as to not make bounded utility clearly correct.
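
The arithmetic behind point 1, with made-up numbers: under a utility linear in lives, even a vanishingly small credence in a 10^100-lives threat dominates any ordinary stake, whereas a bounded utility does not get dominated.

```python
# All probabilities and payoffs here are invented for illustration.
import math

p_mugger = 1e-50                  # credence the mugger's threat is real
lives_at_stake = 1e100
ordinary_gain = 1.0               # e.g. save one life for sure

# Linear (unbounded) utility in lives:
eu_pay = p_mugger * lives_at_stake          # 1e50 -- dominates
eu_refuse = ordinary_gain                   # 1.0
print(eu_pay > eu_refuse)                   # True: the mugging "works"

# A bounded utility, e.g. U(n) = 1 - exp(-n / 10^6), capped at 1:
U = lambda n: 1 - math.exp(-n / 1e6)
print(p_mugger * U(lives_at_stake) > U(ordinary_gain))   # False: bounded utility resists
```
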
0Lumifer
Because here the default utility is the one specified by the Von Neumann-Morgenstern theorem and there is no requirement (or indication) that it is bounded. Humans, of course, don't operate according to VNM axioms, but most of LW thinks it's a bug to be fixed X-/
snarles20

Let's talk about Von Neumann probes.

Assume that the most successful civilizations exist digitally. A subset of those civilizations would selfishly pursue colonization; the most convenient means would be through Von Neumann machines.

Tipler (1981) pointed out that due to exponential growth, such probes should already be common in our galaxy. Since we haven't observed any, we must be alone in the universe. Sagan and Newman countered that intelligent species should actually try to destroy probes as soon as they are detected. This counterargument, known as... (read more)

snarles20

Sociology, political science and international politics, economics (graduate level), psychology, psychiatry, medicine.

snarles20

Undergraduate mathematics, Statistics, Machine Learning, Intro to Apache Spark, Intro to Cloud Computing with Amazon

snarles00

Thanks--this is a great analysis. It sounds like you would be much more convinced if even a few people already agreed to tutor each other--we can try this as a first step.

snarles00

That's OK, you can get better. And you can use any medium which suits you. It could be as simple as assigning problems and reading, then giving feedback.

snarles00

This is an interesting counterexample, and I agree with Larry that using priors which depend on pi(x) is really no Bayesian solution at all. But if this example is really so problematic for Bayesian inference, can one give an explicit example of some function theta(x) for which no reasonable Bayesian prior is consistent? I would guess that only extremely pathological and unrealistic choices of theta(x) would cause trouble for Bayesians. What I notice about many of these "Bayesian non-consistency" examples is that they require consistency over ve... (read more)

snarles10

EDIT: Edited my response to be more instructive.

On some level it's fine to make the kinds of qualitative arguments you are making. However, to assess whether a given hypothesis is really robust to parameters like ubiquity of civilizations, colonization speed, and alien psychology, you have to start formulating models and actually quantify the size of the parameter space which would result in a particular prediction. A while ago I wrote a tutorial on how to do this:

http://lesswrong.com/lw/5q7/colonization_models_a_tutorial_on_computational/

which covers the ... (read more)
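
In the spirit of that tutorial, a toy version of such a parameter scan (the model and numbers are invented and far cruder than the linked tutorial): for each combination of civilization birth rate and expansion speed, ask whether Earth should already sit inside some civilization's colonization front.

```python
# Toy parameter scan over civilization birth rate and expansion speed.
import numpy as np

galaxy_radius_ly = 50_000
galaxy_age_yr = 1e10

birth_rates = np.logspace(-12, -8, 40)        # civs per year appearing in the galaxy
speeds = np.logspace(-4, -1, 40)              # expansion speed, as a fraction of c

visible = 0
total = 0
for rate in birth_rates:
    for v in speeds:
        expected_civs = rate * galaxy_age_yr          # expected number of civs so far
        front_radius = v * galaxy_age_yr / 2          # crude: average civ is half the galaxy's age
        total += 1
        visible += (expected_civs >= 1) and (front_radius >= galaxy_radius_ly)

print(f"{visible / total:.0%} of this toy parameter grid predicts visible colonization")
```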

snarles00

The second civ would still avoid building them too close to each other. This is all clear if you do the analysis.

0Wes_W
So instead of every civ filling its galaxy, we get every civ building one in every galaxy. For this to not result in an Engine on every star, you still have to fine-tune the argument such that new civs are somehow very rare. There are some hypotheticals where the details are largely irrelevant, and you can back up and say "there are many possibilities of this form, so the unlikeliness of my easy-to-present example isn't the point". "Alien civs exist, but prefer to spread out a lot" does not appear to be such a solution. As such, the requirement for fine-tuning and multiple kinds of exotic physics seems to me like sufficiently burdensome details that this makes a bad candidate.
snarles10

Thanks for the references.

I am interested in answering questions of "what to want." Not only is it important for individual decision-making, but there are also many interesting ethical questions. If a person's utility function can be changed through experience, is it ethical to steer it in a direction that would benefit you? Take the example of religion: suppose you could convince an individual to convert to a religion, and then further convince them to actively reject new information that would endanger their faith. Is this ethical? (My opi... (read more)

0Dagon
Cool. The [Metaethics Sequence](http://wiki.lesswrong.com/wiki/Metaethics_sequence) is useful for some of those things. I have to admit that, for myself, I remain unconvinced that there is an objective truth to be had regarding "what should I want". Partly because I'm unconvinced that "I" is a coherent unitary thing at any given timepoint, let alone over time. And partly because I don't see how to distinguish "preferences" from "tendencies" without resorting to unmeasurable guesses about qualia and consciousness.
snarles00

Ordinarily, yes, but you could imagine scenarios where agents have the option to erase their own memories or essentially commit group suicide. (I don't believe these kinds of scenarios are extreme beyond belief--they could come up in transhuman contexts.) In this case nobody even remembers which action you chose, so there is no extrinsic motivation for signalling.

snarles00

The second civilization would just go ahead and build them anyways, since doing so maximizes their own utility function. Of course, there is an additional question of whether and how the first civilization will try to stop this from happening, since the second civ's Catastrophe Engines reduce their own utility. If the first civ ignores them, the second civ builds Catastrophe Engines the same way as before. If the first civ enforces a ban on Catastrophe Engines, then the second civ colonizes space using conventional methods. But most likely the first civ would eliminate the second civ (the "Berserker" scenario.)

0Wes_W
Then why isn't there an Engine on every star?
snarles00

For the original proposal:

Explain:

  • A mechanism for explosive energy generation on a cosmic scale might also explain the Big Bang.

Invalidate:

  • Catastrophe engines should still be detectable due to extremely concentrated energy emission. A thorough infrared sky survey would rule them out along with more conventional hypotheses such as Dyson spheres.

  • If it becomes clear there is no way to exploit vacuum energy, this eliminates one of the main candidates for a new energy source.

  • A better understanding of the main constraints for engineering Matrioshka br

... (read more)
snarles10

Disclaimer: I am lazy and could have done more research myself.

I'm looking for work on what I call "realist decision theory." (A loaded term, admittedly.) To explain realist decision theory, contrast with naive decision theory. My explanation is brief since my main objective at this point is fishing for answers rather than presenting my ideas.

Naive Decision Theory

  1. Assumes that individuals make decisions individually, without need for group coordination.

  2. Assumes individuals are perfect consequentialists: their utility function is only a funct

... (read more)
1Stingray
Why do people even signal anything? To get something for themselves from others. Why would signaling be outside the scope of consequentialism?
1Dagon
Unpack #1 a bit. Are you looking for information about situations where an individual's decisions should include predicted decisions by others (which will in turn take into account the individual's decisions)? The [Game Theory Sequence](http://lesswrong.com/lw/dbe/introduction_to_game_theory_sequence_guide/) is a good starting point. Or are you looking for cases where "individual" is literally not the decision-making unit? I don't have any good less-wrong links, but both [Public Choice Theory](http://lesswrong.com/lw/2hv/public_choice_and_the_altruists_burden/) and the idea of sub-personal decision modules come up occasionally. Both topics fit into the overall framework of classical decision theory (naive or not, you decide) and expected value. Items 2-4 don't contradict classical decision theory, but fall somewhat outside of it. Decision theory generally looks at instrumental rationality - how to best get what one wants, rather than questions of what to want.
snarles00

I mostly agree with you, but we may disagree on the implausibility of exotic physics. Do you consider all explanations which require "exotic physics" to be less plausible than any explanation that does not? If you are willing to entertain "exotic physics", then are there many ideas involving exotic physics that you find more plausible than Catastrophe Engines?

In the domain of exotic physics, I find Catastrophe Engines to be relatively plausible since there are already analogues of similar phenomena to Catastrophe Engines in known physics: f... (read more)

1Dentin
Before we go further:

  • What specific observations and evidence does your idea explain, other than the Fermi paradox?

  • What specific observations and evidence, if we had them, would invalidate your idea?
snarles00

There are only a limited number of ideas we can work on

You are right in general. However, it is also a mistake to limit your scope to too few of the most promising ideas. Suppose we put a number K on the number of different explanations we should consider for the Fermi paradox. What number K do you think would give the best tradeoff between thoroughness and time?

snarles10

It's not a contest. And although my explanation invokes unknown physics, it makes specific predictions which could potentially be validated or invalidated, and it has actionable consequences. Could you elaborate on what criteria make an idea "worth entertaining"?

3[anonymous]
But it is. There are only a limited number of ideas we can work on, so we'd better have some reason to think that this idea has more potential than any of the innumerable other ideas we could be working on instead.
snarles00

Regardless of whether ETs are sending signals, presumably we should be able to detect Type II or Type III civilizations given most proposals for what such civilizations should look like.

snarles20

There exists a technological plateau for general intelligence algorithms, and biological neural networks already come close to optimal. Hence, recursive self-improvement quickly hits an asymptote.

Therefore, artificial intelligence represents a potentially much cheaper way to produce and coordinate intelligence compared to raising humans. However, it will not have orders of magnitude more capability for innovation than the human race. In particular, if humans are unable to discover breakthroughs enabling vastly more efficient production of computational ... (read more)

snarles70

There is no way to raise a human safely if that human has the power to exponentially increase their own capabilities and survive independently of society.

-1kokotajlod
Yep. "The melancholy of haruhi suzumiya" can be thought of as an example of something in the same reference class.
snarles00

You can try to reduce philosophy to science, but how can you justify the scientific method itself? To me, philosophy refers to the practice of asking any kind of "meta" question. To question the practice of science is philosophy, as is the practice of questioning philosophy. The arguments you make are philosophical arguments--and they are good arguments. But to make a statement to the effect of "all philosophy is cognitive science" is too broad a generalization.

What Socrates was doing was asking "meta" questions about intuiti... (read more)

2[anonymous]
By the fact that it works, where "works" is defined as getting goals reached? I didn't mean to discount or disparage philosophy by it, I meant to improve it. Once we know what philosophers are trying to learn, we can try to find better methods to achieve the same. Whether that would be called philosophy or cognitive science is beside the point: the point is it would deliver what philosophers want to get delivered. What kind of meta? "How to recruit recruiters who can recruit the kind of recruiters who can recruit the kind of recruiters who can recruit a lot of people?" is very meta, but not philosophical. "What is music?" is philosophical. But it reduces to "By what algorithm does our subconscious hindbrain find a sequence of sounds musical or not?" The point is, once we know philosophy is largely looking for this kind of meta, we can try to propose more efficient methods for finding them. But he was doing precisely that; consider the example of the guy with the borrowed sword. He designed thought experiments to test people's intuitions about justice. To be more precise, to test people's intuitive proposals for an algorithm of justice against their intuitive judgements of examples which the algorithm was supposed to predict. He was looking for an algorithm that predicts intuitive judgements of examples, and tested all proposals. This is scientific enough. He was trying to arrive at a universal truth - could not find it, but science cannot always get a final answer at the first try. He managed to at least dispel some commonly proposed bad algorithms, such as justice is obedience to rulers, or paying debts, or similar ones, and proving some common ideas are misconceptions is an important part of the work of science. Given that his experiments were not fully successful at finding the final truth, of course it was not the universal truth yet, but going that direction by dispelling some myths. To make it clear: it would be a truth about how, by what algorithms, do
snarles40

Hopefully people here do not interpret "rationalists" as synonymous for "the LW ingroup." For one, you can be a rationalist without being a part of LW. And secondly, being a part of LW in no way certifies you as a rationalist, no matter how many internal "rationality tests" you subject yourself to.

snarles40

A different kind of "bias-variance" tradeoff occurs in policy-making. Take college applications. One school might admit students based only on the SAT score. Another admits students based on scores, activities, essays, etc. The first school might reject a lot of exceptional people who just happen to be bad at test-taking. The second school tries to make sure they accept those kinds of exceptional people, but in the process of doing so, they will admit more unexceptional people with bad test scores who somehow manage to impress the admissions... (read more)
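
A statistical rendering of the admissions analogy (purely synthetic data, with the training sample deliberately small so the effect shows): a one-feature "scores only" predictor is biased but stable, while a many-feature "holistic" predictor fit on few applicants is less biased but much noisier.

```python
# Bias-variance sketch: predict applicant quality from 1 feature vs. all 20 features
# when only a small training sample is available. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(6)
n_train, n_test, n_features = 25, 2000, 20

def fit_and_score(k):
    """Fit least squares on the first k features; return test MSE."""
    beta = rng.normal(scale=0.3, size=n_features)
    beta[0] = 1.0                                        # feature 0 ~ "test score", the strongest signal
    Xtr = rng.normal(size=(n_train, n_features))
    ytr = Xtr @ beta + rng.normal(size=n_train)
    Xte = rng.normal(size=(n_test, n_features))
    yte = Xte @ beta + rng.normal(size=n_test)
    coef, *_ = np.linalg.lstsq(Xtr[:, :k], ytr, rcond=None)
    pred = Xte[:, :k] @ coef
    return np.mean((pred - yte) ** 2)

scores_only = np.mean([fit_and_score(1) for _ in range(200)])
holistic = np.mean([fit_and_score(n_features) for _ in range(200)])
print("test MSE, scores only:", round(scores_only, 2), " all features:", round(holistic, 2))
```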

3Andy_McKenzie
Thanks, these are great examples. How did you come up with them?
snarles30

EM202623997 state complexity hierarchy

Relative to any cellular automata capable of universal computation, initial states can be classified according to a nested hierarchy of complexity classes. The first three levels of the hierarchy were informally known since the beginnings of cellular automata theory in the 20th century, and the next two levels were also speculated to exist, motivated by the idea of formalizing an abstract notion of "organism" and an abstract notion of "sentience", respectively. EM-brain 202623897, a descendant of ... (read more)

snarles190

Simulated dream state experiments

Simulated dream state experiments (SDSEs) are computer simulation experiments involving simulated human sentiences in a dream state. Since the passing of the Banford agreement (1) in 2035, SDSEs are the exclusive means of ethically conducting simulation experiments of simulated human sentiences without active consent (2), although contractual consent (3) is still universally required for SDSEs. SDSEs have widespread scientific, commercial, educational, political, military and legal purposes. Scientific studies using SDSEs... (read more)

snarles20

Daniel grew up as a poor kid, and one day he was overjoyed to find $20 on the sidewalk. Daniel could have worked hard to become a trader on Wall Street. Yet he decides to become a teacher instead, because of his positive experiences in tutoring a few kids while in high school. But as a high school teacher, he will only teach a thousand kids in his career, while as a trader, he would have been able to make millions of dollars. If he multiplied his positive experience with one kid by a thousand, it still probably wouldn't compare with the joy of finding $20 on the sidewalk times a million.

0A1987dM
Nice try, but even if my utility for oiled birds was as nonlinear as most people's utility for money is, the fact that there are many more oiled birds than I'm considering saving means that what you need to compare is (say) U(54,700 oiled birds), U(54,699 oiled birds), and U(53,699 oiled birds) -- and it'd be a very weird utility function indeed if the difference between the first and the second is much larger than one-thousandth the difference between the second and the third. And even if U did have such kinks, the fact that you don't know exactly how many oiled birds are there would smooth them away when computing EU(one fewer oiled bird) etc. (IIRC EY said something similar in the sequences, using starving children rather than oiled birds as the example, but I can't seem to find it right now.) Unless you also care about who is saving the birds -- but you aren't considering saving them with your own hands, you're considering giving money to save them, and money is fungible, so it'd be weird to care about who is giving the money.
0Jiro
Because Daniel has been thinking of scope insensitivity, he expects his brain to misreport how much he actually cares about large numbers of dollars: the internal feeling of satisfaction with gaining money can't be expected to line up with the actual importance of the situation. So instead of just asking his gut how much he cares about making lots of money, he shuts up and multiplies the joy of finding $20 by a million....