Comments

niplav · 16h · 20

The obsessive autists who have spent 10,000 hours researching the topic and writing boring articles in support of the mainstream position are left ignored.

It seems like you're setting up three different categories of thinkers: academics, public intellectuals, and "obsessive autists".

Notice that the examples you give overlap across those categories: Hanson and Caplan are academics (professors!), while Natália Mendonça is not an academic but is approaching being a public intellectual by now(?). Similarly, Scott Alexander strikes me as being in the "public intellectual" bucket much more than in any other bucket.

So your conclusion, as far as I read the article, should be "read obsessive autists" rather than "read obsessive autists who support the mainstream view". My current best guess is that "obsessive autists" are usually not under strong pressure to say politically palatable things, very unlike professors.

niplav · 19h · 50

My best guess is that people in these categories were ones who were high in some other trait, e.g. patience, which allowed them to collect datasets or run careful experiments for quite a while, thus enabling others to make great discoveries.

I'm thinking, for example, of Tycho Brahe, who is best known for 15 years of careful astronomical observation & data collection, or of Gregor Mendel's 7-year-long experiments on peas. Same for Dmitri Belyaev and fox domestication. Of course I don't know their cognitive scores, but those don't seem like they were a bottleneck in their work.

So the recipe to me looks like "find an unexplored data source that requires long-term observation to bear fruit, but would yield a lot of insight if studied closely, then investigate".

niplav · 3d · 80

I think the Diesel engine would've taken 10 or 20 years longer to be invented: from the Wikipedia article, it sounds like it was fairly unintuitive to people at the time.

niplav · 4d · 145

A core value of LessWrong is to be timeless and not news-driven.

I do really like the simplicity and predictability of the Hacker News algorithm. More karma means more visibility, older means less visibility.

Our current goal is to produce a recommendations feed that both makes people feel like they're keeping up to date with what's new (something many people care about) and also suggest great reads from across LessWrong's entire archive.

I hope that we can avoid getting swallowed by Shoggoth for now by putting a lot of thought into our optimization targets

(Emphasis mine.)

Here's an idea[1] for a straightforward(?) recommendation algorithm: Quantilize over all past LessWrong posts by using inflation-adjusted karma as a metric of quality.

The advantage is that this is dogfooding on some pretty robust theory. I think this isn't super compute-intensive, since the only thing one has to do is compute the cumulative distribution function once a day (associating each post with its CDF value), and then sample from the CDF via inverse transform sampling.
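
Here's a minimal sketch of what that could look like (hypothetical names; the list of (post_id, adjusted_karma) pairs stands in for whatever the real post database provides):

```python
import random

def recommend(posts, q=0.05):
    """Quantilize over posts, using inflation-adjusted karma as the quality metric.

    posts: list of (post_id, adjusted_karma) pairs.
    q: the top fraction of the karma distribution to sample from.
    """
    # Rank posts by karma; this is the once-a-day step, and each post's
    # position in the ranking is its value of the empirical CDF.
    ranked = sorted(posts, key=lambda p: p[1])
    # Inverse transform sampling, restricted to the top q of the CDF:
    # draw a uniform quantile in [1 - q, 1] and map it back to a post.
    u = random.uniform(1 - q, 1)
    index = min(int(u * len(ranked)), len(ranked) - 1)
    return ranked[index]

# Made-up example data:
posts = [("post-a", 12.0), ("post-b", 230.5), ("post-c", 57.0), ("post-d", 3.2)]
print(recommend(posts, q=0.5))  # uniformly picks one of the top half by karma
```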

Recommending this way has the disadvantage of not favoring recency (which I personally like) and of not being personalized (which I also like).

By default, it also excludes posts below a certain karma threshold. That could be solved by exponentially tilting the distribution instead of cutting it off (with the tilting parameter to be determined (experimentally?)). Such a recommendation algorithm wouldn't be as robust against very strong optimizers, but since we have some idea what high-karma LessWrong posts look like (& we're not dealing with a superintelligent adversary… yet), that shouldn't be a problem.
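
A sketch of the tilted variant (again hypothetical names, with theta as the tilting parameter to be tuned): every post keeps a nonzero probability, weighted by exp(theta · karma), instead of being cut off below a threshold.

```python
import math
import random

def recommend_tilted(posts, theta=0.02):
    """Sample a post from the exponentially tilted karma distribution.

    theta -> 0 recovers uniform sampling over all posts; larger theta
    concentrates probability on the highest-karma posts, but no post
    ever gets probability zero.
    """
    karmas = [karma for _, karma in posts]
    top = max(karmas)  # subtract the maximum for numerical stability
    weights = [math.exp(theta * (karma - top)) for karma in karmas]
    return random.choices(posts, weights=weights, k=1)[0]

posts = [("post-a", 12.0), ("post-b", 230.5), ("post-c", 57.0), ("post-d", 3.2)]
print(recommend_tilted(posts, theta=0.05))
```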


  1. If I were more virtuous, I'd write a pull request instead of a comment. ↩︎

niplav · 5d · 20

My reason for caring about internal computational states is: in the twin prisoner's dilemma[1], I cooperate because we're the same algorithm. If we modify the twin to have a slightly longer right index fingernail, I would still cooperate: even though they're now a different algorithm, little enough has been changed about the algorithm that the internal states are still similar enough.

But it could be that I'm in a prisoner's dilemma with some program that, given some inputs, returns the same outputs as I do, but for completely different "reasons": that is, its internal states are very different, and a slight change in input would cause its output to be radically different. My logical correlation with that program is pretty small, because, even though it gives the same output, it gives that output for very different reasons, so I don't have much control over its outputs by controlling my own computations.

At least, that's how I understand it.
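
A toy illustration of that "same outputs, different reasons" case (my own example, not anyone's actual decision algorithm): two programs that agree on the inputs we happen to check, but compute their answers in very different ways, so their behavior comes apart on a nearby input.

```python
def reasoned_cooperator(history: str) -> str:
    """Cooperates because it explicitly predicts the other agent runs similar code."""
    predicted_twin_move = "C"  # internal state: a prediction about the other agent
    return "C" if predicted_twin_move == "C" else "D"

def lookup_cooperator(history: str) -> str:
    """Gives the same answers on the inputs below, but via a hard-coded
    table, so a slightly different input flips its output."""
    table = {"": "C", "CC": "C"}
    return table.get(history, "D")

# Identical behavior on the observed inputs...
assert reasoned_cooperator("") == lookup_cooperator("")
assert reasoned_cooperator("CC") == lookup_cooperator("CC")
# ...but the internals differ, which a nearby input exposes:
print(reasoned_cooperator("CCC"), lookup_cooperator("CCC"))  # C D
```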


  1. Is this actually ECL, or just acausal trade? ↩︎

niplav · 6d · 20

I don't have a concrete use case for it yet.

niplav · 6d · 20

The strongest logical correlation is -0.5; the lower, the better.

For the two programs in question, the logical correlation would be pretty strong, assuming that both have the same output.

This is because equal output guarantees a logical correlation of at most 0, and one can then improve the logical correlation by also having similar traces. If the outputs have string distance 1, then the smallest logical correlation can be only 0.5.
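
As a sketch (not necessarily the definition from the post, just a toy measure with the properties described above): take the string distance between the outputs and subtract up to 0.5 for trace similarity.

```python
def logical_correlation(output_distance: float, trace_similarity: float) -> float:
    """Toy measure with the properties above; lower means stronger correlation.

    output_distance: string distance between the two programs' outputs (>= 0).
    trace_similarity: similarity of the execution traces, scaled to [0, 1].
    Equal outputs give at most 0; identical traces subtract a further 0.5,
    so the strongest possible value is -0.5.
    """
    return output_distance - 0.5 * trace_similarity

print(logical_correlation(output_distance=0, trace_similarity=1.0))  # -0.5, the strongest value
print(logical_correlation(output_distance=0, trace_similarity=0.0))  #  0.0
print(logical_correlation(output_distance=1, trace_similarity=1.0))  #  0.5, the best possible at distance 1
```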

niplav · 6d · 30

Whenever people have written/talked about ECL, a common thing I've read/heard was that "of course, this depends on us finding some way of saying that one decision algorithm is similar/dissimilar to another one, since we're not going to encounter the case of perfect copies very often". This was at least the case when I last asked Oesterheld about this, but I haven't read Treutlein 2023 closely enough yet to figure out whether he has a satisfying solution.

The fact that we didn't have a characterization of logical correlation bugged me and stayed in the back of my mind, since it felt like a problem one could make progress on. Today in the shower I was thinking about this, and the post above is what came of it.

(I also have the suspicion that having a notion of "these two programs produce the same/similar outputs in a similar way" might be handy in general.)
