All of kim0's Comments + Replies

kim000

And yet again I am reminded why I do not frequent this supposedly rational forum more. Rationality swishes by over most people's heads here, except for a few really smart ones. You people make it too complicated. You write too much. Lots of these supposedly deep intellectual problems have quite simple answers, such as this Ellsberg paradox. You just have to look and think a little outside their boxes to solve them, or see that they are unsolvable, or that they are the wrong questions.

I will yet again go away, to solve more useful and interesting problems on my own.

Oh, and Orthonormal, here is my correct final answer to you: You do not understand me, and this is your fault.

[This comment is no longer endorsed by its author]
kim0-30

Bayesian reasoning is for maximizing the probability of being right. Kelly's criterion is for maximizing aggregated value.

And yet again, the distributions of the probabilities are different, because they have different variances, and differences in variance give different aggregated values, which is what people tend to try to optimize.

Aggregating value in this case is to get more pies, and fewer boots to the head. Pies are of no value to you when you are dead from boots to the head, and this is the root cause for preferring lower variance.

This isn't much of a discussion when you just ignore and deny my argument instead of trying to understand it.
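kim0's distinction between maximizing expected value and maximizing long-run growth can be made concrete with a short computation. The sketch below is my own addition, assuming an even-money bet that wins with probability 0.6: every stake size has positive expected value per bet, but the expected log growth of a repeatedly reinvested bankroll, which is what Kelly's criterion maximizes, peaks at the Kelly fraction and turns negative for large stakes.

```python
import math

def log_growth(f, p=0.6):
    """Expected log growth per round when betting fraction f of the
    bankroll on an even-money bet that wins with probability p."""
    return p * math.log(1 + f) + (1 - p) * math.log(1 - f)

# For an even-money bet the Kelly fraction is f* = 2p - 1 = 0.2 here.
for f in (0.1, 0.2, 0.5, 0.9):
    print(f"f={f:.1f}  E[log growth] = {log_growth(f):+.4f}")
```

Betting half the bankroll each round has higher expected value per bet than betting 20%, yet its expected log growth is negative, so the aggressive bettor almost surely goes broke in the long run. This is the "pies are of no value when you are dead" point in miniature.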

0orthonormal
If I decide whether you win or lose by drawing a random number from 1 to 60 in a symmetric fashion, then rolling a 60-sided die and comparing the result to the number I drew, this is the same random variable as a single fair coinflip. Unless you are playing multiple times (in which case you'll experience higher variance from the correlation) or you have a reason to suspect an asymmetric probability distribution of green vs. blue, the two gambles will have the exact same effect in your utility function. The above paragraph is mathematically rigorous. You should not disagree unless you find a mathematical error.
kim020

No, because expected value is not the same thing as variance.

Betting on red gives 1/3 winnings, exactly.

Betting on green gives 1/3 +/- x winnings, and this is a variance, which is bad.

5Caspian
You don't get exactly 1/3 of a win with no variance in either case. You get exactly 1 win, 1/3 of the time, and no win 2/3 of the time. As an example when betting on green, suppose there's a 1/3 chance of 30 blue and 30 green balls, a 1/3 chance of 60 green, and a 1/3 chance of 60 blue. And there are always 30 red balls. There is a 1/3 of 1/3 chance that there are 30 green balls and you pick one. There is a 2/3 of 1/3 chance that there are 60 green balls and you pick one. There is no chance that there are no green balls and you still pick one. There is no other way to get a green ball. The total chance of picking a green ball is therefore 1/3, that is, 1/3 of 1/3 plus 2/3 of 1/3. That means that 1/3 of the time you win and 2/3 of the time you lose, just as in the case of betting on the red ball. A distribution of 1 one third of the time and 0 two thirds of the time has some computable variance. Whatever it is, that's the variance in your number of wins when you bet on green, and it's also the variance in your number of wins when you bet on red.
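Caspian's arithmetic can be checked mechanically. The sketch below is my addition, using his three-composition prior; it computes the win probability and single-bet payoff variance exactly, with no simulation noise.

```python
from fractions import Fraction

# Caspian's prior over compositions: (green, blue) is (30, 30), (60, 0)
# or (0, 60), each with probability 1/3; there are always 30 red balls,
# so the urn always holds 90 balls.
compositions = [(30, 30), (60, 0), (0, 60)]

def win_prob(color):
    total = Fraction(0)
    for green, blue in compositions:
        counts = {"red": 30, "green": green, "blue": blue}
        total += Fraction(1, 3) * Fraction(counts[color], 90)
    return total

p_red, p_green = win_prob("red"), win_prob("green")
print(p_red, p_green)            # both come out to 1/3

# A single bet pays 1 on a win and 0 otherwise, so the payoff variance
# is p*(1-p), which is identical because the two probabilities are.
var_red = p_red * (1 - p_red)
var_green = p_green * (1 - p_green)
print(var_red, var_green)        # both 2/9
```

Under this symmetric prior the red bet and the green bet are the same random variable, which is orthonormal's point.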
0orthonormal
Like I said below, write out the actual random variables you use as a Bayesian: they have identical distributions if the mean of your green:blue prior is 30 to 30. There is literally no sane justification for the "paradox" other than updating on the problem statement to have an unbalanced posterior estimate of green vs. blue.
kim000

Preferring red is rational, because it is a known amount of risk, while each of the other two colours has an unknown risk.

This is according to Kelly's criterion and Darwinian evolution. Negative outcomes outweigh positive ones because negative ones lead to sickness and death through starvation, poverty, and kicks in the head.

This is only valid in the beginning, because when the experiment is repeated, the probabilities of blue and green become clearer.

0orthonormal
I think what you're saying is just that humans are risk-averse, and so a gamble with lower variance is preferable to one with higher variance (and the same mean)... but if the number of green vs. blue is randomly determined with expected value 30 to 30, then it has the same variance. You need to involve something more (like the intentional stance) to explain the paradox.
kim0231

There often is not any difference at all between flirting and friendliness. People vary very much in their ways. And yet we are supposed to easily tell the difference, with threat of imprisonment for failing.

The main effects I have seen and experienced are that flirting typically involves more eye contact, and that a lot of people flirt while denying they do it, refusing to tell what they would do if they really flirted, and disparaging others for not knowing the difference.

My experience is also that ordinary people are much more direct and clear in the difference between flirting and friendship, while academic people muddle it.

wnoise450

yet we are supposed to easily tell the difference, with threat of imprisonment for failing.

It can be hard to tell the difference, and it can be easy to mess up when trying to flirt back, but it takes rather more than simply failing to tell the difference between flirtation and friendliness for imprisonment. There have to be actual unwelcome steps taken that cross significant lines.

The way the mating dance typically goes is as a series of small escalations. One of the purposes this serves is to let parties make advances without as much risk of everyo…

and that a lot of people flirt while denying they do it

Or without even realising. Several years ago an acquaintance on whom I was developing a crush told me she was aware of this; this puzzled me since I thought I hadn't yet initiated anything like flirting, so I asked how she knew. Then she took my hand and replicated the way in which, a few days before, I had passed her some small object (probably a pen). I didn't realise I was doing it at the time, but in that casual gesture I was prolonging the physical contact a lot more than necessary, and once put on the receiving side it was bloody obvious what was going on.

kim030

Most places I have worked, the reputation of the job has been quite different from the actual job. I have compared my experiences with those of friends and colleagues, and they are relatively similar. Having an M.Sc. in physics and lots of programming experience made it possible for me to have a wider variety of engineering jobs, and thus more varied experience.

My conclusion is that the anthropic principle holds for me in the work place, so that each time I experience Dilbertesque situations, they are representative of typical work situations. So yes, I do think my work situation is typical.

My current job doing statistical analysis for stock analysts pays $73,000, while the average pay elsewhere is $120,000.

kim040

I am, and I am planning to leave it to get a higher, more average pay. From my viewpoint, it is terribly overrated and undervalued.

6Daniel_Burfoot
Can you expand on this? Do you think your experience is typical?
kim000

That was a damn good article!

It was short, to the point, based on real data, and useful as well. So unlike the polite verbiage of karma whores. Even William of Ockham would have been proud of you.

Kim0+

kim030

I wondered how humans are grouped, so I got some genes from around the world, did an eigenvalue analysis, and this is what I found:

http://kim.oyhus.no/EigenGenes.html

As you can see, humans are indeed clustered in subspecies.

0Jack
This doesn't demonstrate subspecies.
kim000

Many-Worlds explained, with pretty pictures.

http://kim.oyhus.no/QM_explaining_many-worlds.html

The story about how I deduced the Many-Worlds interpretation, with pictures instead of formulas.

Enjoy!

0RobinZ
There is a more recent open-thread if you want to post there.
5wedrifid
It did work, in a way. But I did expend some effort trying to tease out the positive parts from the presentation. With only a minute's worth of extra effort you could have made it work better, and I would be lining up here to give you status and respect for your contribution. I know you (may believe and/or assert that you) aren't interested in status or respect here, but if you care about the ideas you are trying to 'work' here then you should be. You don't want your beliefs to be associated with 'the kind of things that only social outcasts would think'. That's what will happen if you go around supporting things that you care about while also acting like a prat, due to similar dynamics to the ones you described.
2RobinZ
If you intended the original comment as a test, you should have pointed out that it was a test. To do otherwise is quite rude.
kim000

Yes. Quadratic regression is often better. The problem is that the number of coefficients to adjust in the model gets squared, which goes against Ockham's razor. This is precisely the problem I am working on these days, though in the context of the oil industry.

0Matt_Simpson
It's not too difficult to check to see if adding the extra terms improves the regression. In my original comment, I listed AIC and BIC among others. On the other hand, different diagnostics will give different answers, so there's the question of which diagnostic to trust if they disagree. I haven't learned much about regression diagnostics yet, but at the moment they all seem ad hoc (maybe because I haven't seen the theory behind them yet).
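Matt_Simpson's point, that one can check whether the quadratic terms earn their keep, can be sketched concretely. The snippet below is my addition (standard library only, synthetic data): it fits linear and quadratic models by least squares and compares them with one common form of AIC, n*log(RSS/n) + 2k, where lower is better.

```python
import math, random

random.seed(0)
# Synthetic data with genuine curvature, so the quadratic term should help.
xs = [i / 10 for i in range(50)]
ys = [2.0 + 1.5 * x + 0.8 * x * x + random.gauss(0, 0.5) for x in xs]

def ols(design, y):
    """Least squares via normal equations and Gaussian elimination."""
    k = len(design[0])
    ata = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    for col in range(k):                      # partial pivoting
        piv = max(range(col, k), key=lambda r: abs(ata[r][col]))
        ata[col], ata[piv] = ata[piv], ata[col]
        atb[col], atb[piv] = atb[piv], atb[col]
        for r in range(col + 1, k):
            f = ata[r][col] / ata[col][col]
            for c in range(col, k):
                ata[r][c] -= f * ata[col][c]
            atb[r] -= f * atb[col]
    beta = [0.0] * k
    for r in reversed(range(k)):              # back substitution
        s = atb[r] - sum(ata[r][c] * beta[c] for c in range(r + 1, k))
        beta[r] = s / ata[r][r]
    return beta

def aic(design, y):
    beta = ols(design, y)
    rss = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    n, k = len(y), len(design[0])
    return n * math.log(rss / n) + 2 * k      # penalize extra coefficients

linear = [[1.0, x] for x in xs]
quadratic = [[1.0, x, x * x] for x in xs]
print("AIC linear:   ", aic(linear, ys))
print("AIC quadratic:", aic(quadratic, ys))
```

Here the curvature in the data is strong enough that the quadratic model wins despite its extra coefficient, which is exactly the trade-off the 2k penalty term arbitrates.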
4XFrequentist
Thank you for making such a polite and kind comment!
wedrifid100

It frustrates me to read this comment. There are some important insights in there that are being sullied by involvement in such a low-status presentation. The comment is needlessly confrontational, in need of proofreading, and uses hyperbole where it does not help. It also misses the details of the situation, suggesting an "all I have is a hammer" understanding.

The social dynamics mentioned in the parent do occur and there is potential for detrimental psychological consequences for both parties of letting the status game become so unbalanced. Th…

kim010

I guess you down-voters of me felt quite rational when doing so.

And this is precisely the reason I seldom post here, and only read a few posters that I know are rational from their own work on the net, not from what they write here:

There are too many fake rationalists here. The absence of any real arguments either way in response to my article above is evidence of this.

My Othello/Reversi example above was easy to understand, and a very central problem in AI systems, so it should be of interest to real rationalists interested in AI, but there is only negative reaction…

3SarahNibs
Your Othello Reversi example is fundamentally flawed, but it may not seem like it unless you realize that at LW the tradition is to say that utility is linear in paperclips to Clippy. That may be our fault, but there's your explanation. "Winning 60-0", to us using our jargon, is equivalent to one paperclip, not 60. And "winning 33-31" is also equivalent to one paperclip, not 33. (or they're both equivalent to x paperclips, whatever) So when I read your example, I read it as "80% chance of 1 paperclip, or 90% chance of 1 paperclip". I'm sure it's very irritating to have your statement miscommunicated because of a jargon difference (paperclip = utility rather than f(paperclip) = utility)! I encourage you to post anyway, and begin with the assumption that we misunderstand you rather than the assumption that we are "fake rationalists", but realize that in the current environment (unfortunately or not, but there it is) the burden of communication is on the poster.
0Benquo
While most of this of this seems sensible, I don't understand how your last sentence follows. I have heard similar strategies suggested to reduce the probability of paperclipping, but it seems like if we actually succeed in producing a true friendly AI, the quantity it tries to maximize (expected winning, P(winning), or something else) will depend on how we evaluate outcomes.
0[anonymous]
This made some sense to me, at least to the point where I'd expect an intelligent refutation from disagreers, and seems posted in good faith. What am I missing about the voting system? Or about this post.
1kim0
I guess you down-voters of me felt quite rational when doing so. And this is precisely the reason I seldom post here, and only read a few posters that I know are rational from their own work on the net, not from what they write here: There are too many fake rationalists here. The absence of any real arguments either way to my article above, is evidence of this. My Othello/Reversi example above was easy to understand, and a very central problem in AI systems, so it should be of interest to real rationalists interested in AI, but there is only negative reaction instead, from people I guess have not even made a decent game playing AI, but nevertheless have strong opinions on how they must be. So, for getting intelligent rational arguments on AI, this community is useless, as opposed to Yudkowsky, Schmidhuber, Hansen, Tyler, etc. which has shown on their own sites that they have something to contribute. To get real results in AI and rationality, I do my own math and science.
kim0-40

You got voted down because you were rational. You went over some people's heads.

These are popularity points, not rationality points.

That is something we worry about from time to time, but in this case I think the downvotes are justified. Tim Tyler has been repeating a particular form of techno-optimism for quite a while, which is fine; it's good to have contrarians around.

However, in the current thread, I don't think he's taking the critique seriously enough. It's been pointed out that he's essentially searching for reasons that even a Paperclipper would preserve everything of value to us, rather than just putting himself in Clippy's place and really asking for the most efficient way…

1timtyler
I think the usual example assumes that the machine assigns a low probability to the hypothesis that paperclips are not the only valuable thing - because of how it was programmed.
kim0-10

I have an Othello/Reversi playing program.

I tried making it better by applying probabilistic statistics to the game tree, quite like anthropic reasoning. It then became quite bad at playing.

Ordinary minimax with alpha-beta pruning did very well.

Game algorithms that ignore density of states in the game tree, and only focus on minimaxing, do much better. This is a close analogy to the experience trees of Eliezer, and therefore a hint that anthropic reasoning here has some kind of error.

Kim0

4rwallace
That's because those games are nonrandom, and your opponent can be expected to single out the best move. Algorithms for games like backgammon and poker that have a random element, do pay attention to density of states. (Oddly enough, so nowadays do the best known algorithms for Go, which surprised almost everyone in the field when this discovery was made. Intuitively, this can be seen as being because the game tree of Go is too large and complex for exhaustive search to work.)
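The contrast rwallace draws, worst-case backup for deterministic adversarial games versus averaging over outcomes when moves are random, fits in a few lines. The toy two-ply tree below is my own invention, chosen so the two backup rules pick different moves.

```python
# Each move maps to the leaf values of the opponent's possible replies.
# Against a perfect opponent only the minimum reply matters; averaging
# over replies (weighting by "density of states") is only sound when the
# replies are genuinely random, as in backgammon-like games.
tree = {
    "move_a": [5, 5, 5],     # solid: opponent's best reply still leaves 5
    "move_b": [9, 9, -2],    # trap: great unless the opponent spots -2
}

def minimax_value(replies):
    return min(replies)                  # opponent picks our worst case

def expectimax_value(replies):
    return sum(replies) / len(replies)   # replies assumed uniformly random

best_minimax = max(tree, key=lambda m: minimax_value(tree[m]))
best_expecti = max(tree, key=lambda m: expectimax_value(tree[m]))
print(best_minimax, best_expecti)
```

Minimax prefers the solid move; expectimax prefers the trap. Against an opponent who singles out the best reply, the averaging player walks into the refutation, which is plausibly what happened to kim0's Othello program.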
kim010

What exactly makes it difficult to use Russian? I know Russian, so I will understand the explanation.

I find my native Norwegian better to express concepts in than English. If I program something especially difficult, or do some difficult math, physics, or logic, I also find Norwegian better.

However, if I do some easier task, where I have studied it in English, I find it easy to write in English, due to a "cut and paste" effect. I just remember stuff, combine it, and write it down.

1cousin_it
Whenever I try translating some math or programming stuff from Russian into English or vice versa, the Russian version ends up about 20% longer. Maybe it's because many useful connective words in Russian are polysyllabic, e.g. "kotoryi" (which) ,"chtoby" (to), "poetomu" (so), making sentences with complex logical structure sound clumsy. Translating into Russian always feels like a poetic jigsaw puzzle to make the phrase sound okay, while translating into English feels more anything-goes at the expense of emotional nuance. YMMV.
kim0-30

Interesting, but too verbose.

The author is clearly not aware of the value of the K.I.S.S. principle, or Ockham's razor, in this context.

1Annoyance
The rules of chess don't explicitly state whether the vast majority of moves are good ideas or bad ones. (Exceptions involve moves that would put your king in check - and that's not bad, it's disallowed.) You can know all of the rules, and not be able to determine how you should react in a chess game. Because all of the principles that govern 'good' play arise as consequences from the explicit rules. If proper moves were as easy to determine in chess as they were in Tic-Tac-Toe, no one would bother playing it. Go is the same, only more so.
kim0-30

Giving it up is rational thinking, because there is no "it" there when the label is too broad.

In Bayesian inference, it is equivalent to P( A | B v C v D v ...), which is somewhat like underfitting. The space of possibilities becomes too large for it to be possible to find a good move. In games it is precisely the unclear parts of the game space that are interesting to the losing side, because it is most likely there will be better moves there. But when it is not even possible to analyze those parts, then true optimal play regresses to quarrelin…

-1Annoyance
The rules of Go are perfectly clear. It's the consequences of those rules that we have a great deal of trouble understanding. Or that you do, at least.
kim000

I agree. We seem to have the same goal, so my first advice stands, not my second.

I am currently trying to develop a language that is both simple and expressive, and making some progress. The overall design is finished, and I am now down to what instructions it should have. It is a general bi-graph, but with a sequential program structure, and no separation of program and data.

It is somewhat different from what you want, as I also need something that has measurable use of time and memory, and is provably able to run fast.

0bogdanb
Could I have a link, or maybe some more information about this language? It's something I'm very interested in (merging expressiveness and guarantees about resource usage).
4loqi
Massive semantic confusion. Just because the word "Go" is used to denote a family of games and game-like activities doesn't mean there can't be concrete realizations of the concept that capture most or all of its interesting qualities. Concluding that the game has "no true core" and giving it up, merely because its label is too broad for your taste, strikes me as very confused thinking.
kim000

Then I would go for Turing machines, Lambda calculus, or similar. These languages are very simple, and can easily handle input and output.

Even simpler languages, like cellular automaton rule 110 or combinatory logic, might be better, but those are quite difficult to get to handle input and output correctly.

The reason simple languages, or universal machines, should be better, is that the upper bound for error in estimating probabilities is 2 to the power of the complexity of the program simulating one language in another, according to algorithmic information t…

0timtyler
Using simple languages is the conventional approach. However, simple languages typically result in more complex programs. The game of life is very simple - yet try writing a program in it. If you are trying to minimise the size of an emulator of other languages, then highly simple languages don't seem to fit the bill. Why would one want a decent formulation of Occam's razor? To help solve the problem of the priors.
kim000

That depends on what you mean by "best".

Is speed of calculation important? What about suitability for humans? I guess you want one where complexities are as small as possible.

Given 2 languages, L1 & L2, and their complexity measures, K1 & K2.

If K1(L2) < K2(L1) then I take that as a sign that L1 is better for use in the context of Ockham's razor. It is also a sign that L1 is more complex than L2, but that effect can be removed by doing lots of comparisons like that, so the unnecessarily complex languages lose against those that are actu…

2timtyler
"Best": most accurate - i.e. when Occam's razor says A is a more likely hypothesis than B, then that is actually true.
1Vladimir_Nesov
kim0, you are trolling now. You are not communicating clearly, and then claim that the objections to your unclear communication are invalid, because you can retroactively amend the bad connotations and ambiguities, but in the process of doing so, you introduce further false-sounding and ambiguous statements. You should choose your words more carefully.
kim0-30

It is universal, because every possible sequence is generated.

It is universal, because it is based on universally recursive functions.

It is universal, because it uses a universal computer.

People knowing algorithmic complexity know that it is about probability measures, spaces, universality, etc. You apparently did not, while nitpicking instead.

0smoofra
I'm not nitpicking, you're wrong. "Universal" in this context means, to quote the original poster What the heck does that have to do with every possible sequence being generated? For that matter what does it have to do with sequences at all? The solomonoff measure is a measure over sequences in a finite alphabet, or to put it simpler, Integers. How do I express an event like "it will rain next tuesday" as a subset of the integers? Whatever you are using the word "universal" to mean, it is not anything like what the OP had in mind. The Solomonoff measure is an interesting mathematical object for sure, and it may be quite relevant to the topic of real-world Bayesian reasoning, but it's obviously not universal in that sense. also: what the heck does "universally recursive" mean? Did you just make up that term right now? Because I've never heard it before, it only has 10 google hits, and none of them are relevant to this discussion.
kim0-20

You are wrong because I did specify a probability space.

The probability space I specified was one where the sample space was the set of all outputs of all programs for some universal computer, and the measure was one from the book I mentioned. One could for instance choose the Solomonoff measure, from 4.4.3.

From your writings I conclude that it is quite likely that you are neither quite aware of the concept, nor understanding what I write, while believing you do.

0smoofra
No, you specified the points and not the measure. OK! Now we've got a space. Of course if you wanted to talk about solomonoff measure, why didn't you just say so 5 comments ago. Pretty much everyone reading less wrong would have immediately known what you were talking about. You still haven't justified calling the solomonoff space "universal". Now you're just being rude. You don't know me, you certainly don't know what I do or don't know.
0smoofra
I have. I'm not an expert in it, but I'm quite aware of the concept. You have not specified a probability space, and you have not made any attempt to justify calling the space you didn't specify "universal" uh huh.
kim030

I guess the point is to model artificial intelligences, of which we know almost nothing, so the models and problems need the robustness of logic and simpleness.

That's why they are brittle when used for modeling people.

kim000

O.K.

One wants a universal probability space where one can find the probability of any event. This is possible:

One way of making such a space is to take all recursive functions of some universal computer, run them, and store the output, resulting in a universal probability space, because every possible set of events will be there, as the results of infinitely many recursive functions, or programs as they are called. The probabilities correspond to the density of these outputs, these events.

A counterargument is that it is too dependent on the actual univ…

4smoofra
OK.... what!? You haven't yet described a probability space. The aforementioned set is infinite, so the uniform distribution is unavailable. What probability distribution will you have on this set of recursive-function-runs. And in what way is the resulting probability space universal?
kim0-10

The technically precise reference was this part:

"This is algorithmic information theory,.."

But if you claim my first line was too obfuscated, I can agree.

Kim Øyhus

8Vladimir_Nesov
Please specify in what sense the first line was correct, or declare it an error. Pronouncing assertions known to be incorrect and then just shrugging that off shouldn't be acceptable on this forum.
kim0-30

All recursive probability spaces converge to the same probabilities, as the information increases.

Not that those people making up probabilities know anything about that.

If you want a universal probability space, just take some universal computer, run all programs on it, and keep those that output event A. Then you can see how many of those also output event B, and thus you can get p(B|A), whatever A and B are.

This is algorithmic information theory, and should be known by any black-belt Bayesian.

Kim Øyhus
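The recipe kim0 describes, run all programs, keep those whose output satisfies A, then count those also satisfying B, can only ever be approximated, but a bounded toy version is easy to write. The machine below is deliberately trivial and entirely my own invention, not a genuinely universal computer; it only illustrates the weighted counting.

```python
from itertools import product

# Toy machine (illustrative, not universal): a program is a string over
# {'0', '1', 'd'}; '0'/'1' append that bit to the output, and 'd'
# duplicates the last output bit (a no-op on empty output).
def run(program):
    out = []
    for op in program:
        if op in "01":
            out.append(op)
        elif op == "d" and out:
            out.append(out[-1])
    return "".join(out)

# Enumerate all programs up to length 8, weighting length-L programs by
# 3**-L so shorter programs dominate, as in algorithmic probability.
weights = {}
for L in range(1, 9):
    for prog in product("01d", repeat=L):
        weights["".join(prog)] = 3.0 ** -L

def p(event):
    return sum(w for prog, w in weights.items() if event(run(prog)))

A = lambda out: out.startswith("1")    # event A: output begins with 1
AB = lambda out: out.startswith("11")  # event A and B: it begins with 11
print("p(B|A) =", p(AB) / p(A))        # conditional probability by counting
```

With a real universal machine the enumeration never terminates and the measure is only semi-computable, which is part of what the thread is arguing about; the sketch only shows the shape of the construction.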

1smoofra
As far as I can tell, you are talking absolute gibberish. If I'm wrong, please explain. edit: if someone who downvoted me could please explain what the heck a "recursive probability space" is supposed to be, I'd appreciate it.
4Vladimir_Nesov
Google gives 0 hits on "recursive probability space". Blanket assertions like this need to be technically precise. I refer interested readers to the Algorithmic probability article on Scholarpedia.
kim000

Very interesting article that.

However, evolution is able to test and spread many genes at the same time, thus achieving higher efficiency than the article suggests. Sort of like spread spectrum radio.

I am quite certain its speed is lower than that of some statistical methods, but not by that much. I guess at something like a constant factor slower, for doubling gene concentration, as compared to one standard deviation of certainty in the goodness of the gene by Gaussian statistics.

Random binary natural testing of a gene is less accurate than statistics, but it avoids putti…

1timtyler
We can see that intelligent design beats random mutations by quite a stretch - by looking at the acceleration of change due to cultural evolution and technology. Of course cultural evolution is still a kind of evolution - but intelligent mutations, multi-partner recombination and all the other differences do seem to add up to something pretty substantial.
kim000

What is your evidence for this assertion?

In my analysis, evolution by sexual reproduction can be very good at rationality, collecting about 1 bit of information per generation per individual, because an individual can only be naturally selected, or die, once.

The factors limiting the learning speed of evolution are the high cost of this information, namely death, and that this is the only kind of data going into the system. And the value to be optimized is avoidance of death, which also avoids data gathering. And this optimization function is almost impossible…

0JGWeissman
Well, perhaps you could reduce the effectiveness of even a Bayesian super intelligence to the level of evolution by restricting the evidence it observes to the evidence that evolution actually uses. But that is not the point. Evolution ignores a lot of evidence, for example, it does not notice that a gene that confers a small advantage is slowly increasing in frequency and that it would save a lot of time to just give every member of the evolving population a copy of that gene. When a mutation occurs, evolution is incapable of copying that mutation in a hundred organisms to filter out noise from other factors in evaluating its contribution to fitness. An amazingly beneficial mutation could die with the first organism to have it, because of the dumb luck of being targeted by a predator when only a few days old. For more on the limitations of evolution, and some example of how human intelligence does much better, see Evolutions Are Stupid.
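JGWeissman's point that a beneficial mutation often dies by dumb luck can be seen in a small simulation. The sketch below is my own, a Wright-Fisher-style model with assumed parameters, not anything from the thread.

```python
import random

random.seed(1)

def mutation_fixes(pop_size=30, s=0.1, generations=150):
    """One copy of a beneficial allele (relative fitness 1+s) enters the
    population; return True if it takes over, False if it is lost."""
    count = 1
    for _ in range(generations):
        if count == 0:
            return False
        if count == pop_size:
            return True
        # Selection-weighted chance that a given offspring carries the allele.
        p = count * (1 + s) / (count * (1 + s) + (pop_size - count))
        count = sum(random.random() < p for _ in range(pop_size))
    return count == pop_size

trials = 400
fixed = sum(mutation_fixes() for _ in range(trials))
print(f"beneficial allele fixed in {fixed} of {trials} trials")
# Despite the 10% fitness edge, the allele is usually lost to drift; the
# classical heuristic puts the fixation chance of a new mutant near 2s.
```

Evolution throws away most of its lucky draws, which is exactly the kind of evidence a Bayesian reasoner would not discard.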
kim010

All control systems DO have models of what they are controlling. However, the models are typically VERY simple.

A good principle for constructing control systems is: given that I have a very simple model, how do I optimize it?

The models I learned about in cybernetics were all linear, implemented as matrices, resistors and capacitors, or discrete time step filters. The most important thing was to show that the models and reality together did not result in amplification of oscillations. Then one made sure that the system actually did some controlling…
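kim0's description, a very simple linear model plus a check that the loop does not amplify oscillations, can be made concrete with the simplest possible case. This is my own sketch: one state variable under proportional control.

```python
# The plant state x is nudged toward zero by u = -gain * x each step.
# The linear model makes stability easy to read off: x[t+1] = (1 - gain) * x[t],
# so the loop damps disturbances for 0 < gain < 2 and amplifies
# oscillations beyond that.
def simulate(gain, steps=20, x0=1.0):
    x = x0
    for _ in range(steps):
        x = x - gain * x
    return abs(x)

print("gain 0.5:", simulate(0.5))   # decays smoothly
print("gain 1.5:", simulate(1.5))   # decays while oscillating in sign
print("gain 2.5:", simulate(2.5))   # oscillation grows without bound
```

The "very simple model" here is just the scalar 1 - gain; showing its magnitude is below 1 is the discrete-time analogue of the oscillation-amplification check kim0 mentions.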

kim010

Verbal probabilities are typically impossible because the priors are unknown and important.

However, relative probabilities and similar can often be given useful estimates, or limits.

For instance: Seeing a cat is more likely than seeing a black cat because black cats are a subset of cats.

Stuff like this is the reason that pure probability calculations are not sufficient for general intelligence.

Probability distributions, however, seem to me to be sufficient. This cat example cuts the distribution in two.

Kim Øyhus
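The cat example is just monotonicity of probability: P(black cat) ≤ P(cat), because black cats are a subset of cats. A tiny check over a made-up joint distribution (the numbers are mine, purely illustrative):

```python
from fractions import Fraction

# A sighting has a species and a colour, with an arbitrary assumed
# joint distribution over the four combinations (sums to 1).
dist = {
    ("cat", "black"): Fraction(1, 10),
    ("cat", "other"): Fraction(3, 10),
    ("dog", "black"): Fraction(2, 10),
    ("dog", "other"): Fraction(4, 10),
}

p_cat = sum(p for (species, _), p in dist.items() if species == "cat")
p_black_cat = dist[("cat", "black")]
print(p_cat, p_black_cat)
# The subset event can never be more probable than its superset,
# whatever the priors are, which is the "limit" kim0 is pointing at.
```

This kind of bound holds for any distribution, which is why it survives even when the priors themselves are unknown.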