All of wangscarpet's Comments + Replies

He's also a writer with book titles that sound like LessWrong articles, though they were written before this site hit the mainstream. He wrote "The Overton Window" in 2010, and "The Eye of Moloch" in 2012.

What does Flutter do that various JS/CSS frameworks don't? Does it give design advantages on the web, or is the benefit that it brings web-style design to non-web applications?

mako yass
It makes non-web applications possible. It has a better layout system and rendering system, and it animates everything properly. It centers on Dart, which seems to be a pretty good language: it can be compiled ahead of time for faster startup (although I'm not completely sure that TypeScript won't be basically just as compilable once wasm-gc is up), has better type reflection, will have better codegen (it already supports annotations), has a reasonable import system, better data structures, and potentially higher performance due to the type system not being an afterthought (although TypeScript is still very good relative to Dart).

This is a little like coordination vs. competition in game theory, actually. Coordination is when you can constrain all actors to change in the same way; competition is when each can change while holding the others fixed. Evolutionary replicator dynamics is a game-theoretic model that captures the latter.

Even if the beetles all currently share the same genes, any one beetle can have a mutation that competes with his/her peers in future generations. Therefore, reduced variation at the current time doesn't cause the system to be stable, unless there's some way to ensure that any change is passed to all beetles (like having a queen that does all the breeding).
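The replicator-dynamics framing can be sketched concretely. This is a minimal toy (the payoff matrix and starting frequencies are made up for illustration) showing a rare mutant strategy spreading through a population even though almost everyone initially shares the same strategy:

```python
import numpy as np

def replicator_step(freqs, payoff):
    """One discrete-time replicator-dynamics update:
    strategies with above-average payoff grow in frequency."""
    fitness = payoff @ freqs   # each strategy's payoff against the current mix
    avg = freqs @ fitness      # population-average fitness
    return freqs * fitness / avg

# Hawk-Dove style payoff matrix (rows: own strategy, cols: opponent's):
# strategy 0 = "mutant" (hawk), strategy 1 = incumbent (dove).
payoff = np.array([[0.5, 2.0],
                   [0.0, 1.0]])
freqs = np.array([0.01, 0.99])  # a rare mutant invading a near-uniform population

for _ in range(200):
    freqs = replicator_step(freqs, payoff)
print(freqs)  # the mutant takes over despite starting rare
```

The point matches the beetle example: uniformity today is no defense, because the dynamics only care about relative fitness once a variant appears.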

For the preprint --> journal point: if all the bad preprints simply never made it to publication, you'd see the same result. Maybe journals sometimes act more as filters than as enhancers.

Or, the norm of expecting to publish in a journal motivates people to do good work. If you post a preprint with the expectation of publishing it, you'll do good work. But there can still be a lot of terrible science out there in preprints. I don't think what you wrote is evidence that we should trust random PDFs on the internet.

What do median and mode case peaks mean here? Is that over your distribution of cases by day? Like, if for Jan 15-19 the counts are 10, 20, 30, 14, 16, is the mode the 17th, the median the 19th, and the mean whatever you'd get by averaging?

How would you go about getting a high-risk person close to the front of the line for Paxlovid treatment? I was really heartened to see the news about imminent approval because a very high-risk family member was recently exposed, but I think without hustle he would definitely not get it in time. I plan to call hospitals in his area tomorrow -- anything else?

Answer by wangscarpet

Fantasy by Eternity Forever: posthardcore supergroup with the grooviest instrumentals

I really loved reading this series. Came for the puns, stayed for the story. Thank you for writing!

In continuous control problems, what you're describing is called "bang-bang control": switching between different full-strength actions. In continuous-time systems this is often the optimal behavior (because over a short timescale you get the same effect from a double-strength action applied for half as long). That holds until you factor in non-linear energy costs, at which point a smoother controller becomes preferable.

Caridorc Tergilti
Half as long right?
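A toy numerical check of the "double-strength for half as long" point, using a unit point mass under Euler integration (my own example, not from the thread):

```python
def integrate(x, v, u, dt, steps):
    """Euler-integrate a unit point mass: x' = v, v' = u (the control force)."""
    for _ in range(steps):
        x += v * dt
        v += u * dt
    return x, v

# Full-strength action for half the time vs. half-strength for the full time:
x1, v1 = integrate(0.0, 0.0, u=2.0, dt=0.001, steps=500)   # |u|=2 for 0.5 s
x2, v2 = integrate(0.0, 0.0, u=1.0, dt=0.001, steps=1000)  # |u|=1 for 1.0 s
print(v1, v2)  # same final velocity: bang-bang delivers the same impulse faster
```

With a quadratic energy cost like the integral of u^2 over time, the double-strength option costs twice as much for the same impulse, which is exactly what starts to favor smoother controllers.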

Kiln People is a fantastic science fiction story that explores the same question, in a setting where the embodied copies are temporary (~24 hours). It digs into questions of employment, privacy, life purpose, and legality in a world where this cloning procedure is common. I highly recommend it to those interested.

"Now, suppose that in addition to g, you learn that Bob did well on paragraph comprehension. How does this change your estimate of Bob's coding speed? Amazingly, it doesn't. The single number g contains all the shared information between the tests."

I don't think this is right if some fraction of the test battery behind g is paragraph comprehension. If g is a weighted average of paragraph comprehension and addition skill, then knowing g and paragraph comprehension gives you addition skill.

Richard_Kennaway
Yes, conditioning on g makes all the observations anti-correlated with each other (assuming for simplicity that the coefficients that define g are all positive). The case of two factors has come up on LW a few times, but I don't have a reference: if you can get into an exclusive university through a combination of wealth and intelligence, then among its members wealth and intelligence will be anticorrelated. When all the eigenvalues after the first are equal, what has zero correlation conditional on g is the measurements along the corresponding principal axes, which are other linear combinations of the observations.
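The exclusive-university example is easy to simulate. In this sketch (the threshold and sample size are arbitrary), two traits that are independent in the full population become clearly anticorrelated once you condition on admission:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
wealth = rng.normal(size=n)
intelligence = rng.normal(size=n)   # independent of wealth in the full population

# Admit anyone whose combined score clears a threshold:
admitted = (wealth + intelligence) > 2.0

r_all = np.corrcoef(wealth, intelligence)[0, 1]
r_adm = np.corrcoef(wealth[admitted], intelligence[admitted])[0, 1]
print(r_all, r_adm)  # ~0 overall, strongly negative among the admitted
```

The intuition: among the admitted, a student low on one trait must be high on the other to have cleared the bar, so the selection itself induces the negative correlation.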

Yep, they're different. It's just an architecture. Among other things, Chess and Go have different input/action spaces, so the same architecture can't be used on both without some way to handle this.

This paper uses an egocentric input, which allows many different types of tasks to use the same architecture. That would be the equivalent of learning Chess/Go based on pictures of the board.

Can you extrapolate the infectiousness ratio between the newest, most virulent strain and the original? I assume the original has all but died out, but maybe by chaining together estimates of the intermediate strains?
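The chaining idea amounts to multiplying the intermediate ratios. A trivial sketch with made-up numbers (these are not real strain estimates):

```python
# Hypothetical strain-over-predecessor infectiousness ratios along the chain
# original -> A -> B -> newest:
relative_steps = [1.5, 1.6, 1.4]

ratio_to_original = 1.0
for step in relative_steps:
    ratio_to_original *= step
print(ratio_to_original)  # ~3.36x the original strain
```

The catch in practice is that each link carries its own uncertainty, and multiplying point estimates compounds the error bars as well.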

I'm reading BDA3 right now, and I'm on chapter 6. You described it well. It takes a lot of thinking to get through, but it's very comprehensive. I like that it's explicitly not just a theory textbook: the authors demonstrate each major point by describing a real-world problem (measuring cancer rates across populations, comparing test-prep effectiveness) and attacking it with multiple models (usually a frequentist one to show its limitations, and then their Bayesian model more thoroughly). The focus is on learning the tools well enough to apply them to real-world problems.


Jan Christian Refsgaard
I loved that example as well. I have heard it described elsewhere as "the law of small numbers": small subsets have higher variance and therefore more frequent extreme outcomes. I think it's particularly apt because the most important part of the Bayesian paradigm is the focus on uncertainty. The appendix on HMC is also a very good supplement for gaining a deeper understanding of the algorithm after having read a description in another book first.
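The law of small numbers is easy to see in simulation. Here both group sizes share the same true rate (the sizes and rate are arbitrary), and only the small groups produce extreme observed rates:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.1   # the same true underlying rate everywhere

# Observed rates in many small vs. many large populations:
small = rng.binomial(n=20,   p=p, size=10_000) / 20
large = rng.binomial(n=2000, p=p, size=10_000) / 2000

print(small.std(), large.std())   # small groups vary far more around 0.1
print(small.max(), large.max())   # and produce the extreme outcomes
```

This is the mechanism behind the cancer-rate example: the most extreme county rates, high and low, both tend to come from the least populous counties.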

I think it comes from a feeling that the proportions of blame need to sum to one, and that by apologizing first you're putting more of the blame on your own actions. You often can't say "I apologize for the 25% of this mess I'm responsible for."

I think the general mindset of apportioning blame (as well as looking for a single blame-target) is a dangerous one. There's a whole world of things that contribute to every conflict outside of the two people having it.

jbash
You're already being tactical when you decide that Carol isn't a threat and (falsely) uprank her. What changes if you go a step further and decide that she is a threat? In fact, I think the standard formalism defines "tactical voting" as submitting a vote that doesn't faithfully reflect your true preferences. Under that formalism, falsely upranking Carol is tactical, but switching back to your true preferences because of what you expect others to do actually isn't tactical.

... and it's odd to talk about tactical voting as a "downside" of one system or another, since there's a theorem that says tactical voting opportunities will exist in any voting system choosing between more than two alternatives: https://en.wikipedia.org/wiki/Gibbard–Satterthwaite_theorem . At best you can argue about which system has the worst case of the disease. And, if you're comparing the two, plurality has a pretty bad case of tactical vulnerability, probably worse than IRV/RCV. That's why people want to change it: tactical voting under plurality entrenches two-party systems.
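For concreteness, a toy plurality election (the vote shares are invented numbers, not from the comment) showing the tactical incentive that entrenches two parties:

```python
from collections import Counter

# 35% prefer A, 33% prefer B, 32% prefer C; C-voters' second choice is B.
honest = Counter({"A": 35, "B": 33, "C": 32})
print(honest.most_common(1)[0][0])    # A wins if everyone votes honestly

# If C's supporters vote tactically for their second choice instead:
tactical = Counter({"A": 35, "B": 33 + 32, "C": 0})
print(tactical.most_common(1)[0][0])  # B wins -- so C-voters are pushed to abandon C
```

Once enough voters reason this way, third candidates can't attract honest ballots at all, which is the two-party lock-in effect.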
Answer by wangscarpet

I think things become simpler when you look at the sum of all stocks, versus particular ones. Then, you only need to consider the market cap of the entire stock market, and what makes it change over time.

The economy is much bigger than the stock market. Money flows from small companies to larger ones as the economy consolidates -- since the latter are more likely to be publicly traded than the former, that makes the stock market bigger.

It's easier to invest in the stock market now than in the past. Since it's accessible to more people, more people's ...

When reading HPMOR, I thought Harry's decision not to kill Voldemort was hubris, even granting his morals. He should have known that all plans have a non-zero chance of failing, and that Voldemort coming back could be an existential risk -- so, finite harm now to stop a small chance of nearly infinite harm later. Nice to see this come back to bite him. I'm looking forward to the conclusion!

Pattern
0. Listed above.
1. You're ignoring any possible benefits.
2. You might say he did kill him (there was a prophecy, and prophecies may have a 'you know when it's over' property).
3. Arguably (post-obliviation) it's a risk like he's a risk, except maybe less.

But Harry had no way of killing him because of all the horcruxes. Memory-erasure was simply the best he could do.

In what world is giving the second dose to the same person, raising them from 87% to 96% protected, a higher priority than vaccinating a second person? 

One benefit of 96% vs. 87% is that the former could allow you to live an un-quarantined life, while the latter wouldn't result in much behavioral change. Clearly, one-dose-first is better for net deaths and the like, but the QALY calculation looks a little different.

I still agree with you. But it's worth considering the above reasoning for completeness.

Zvi
Agreed that this is the counter-argument, if you think that 87% wouldn't be enough to allow changes in behavior - which of course can also be an advantage, since now you really did reduce that person's risk by 87%, and the risk of them infecting others too. Note that when I ask, for myself, whether 87% now (with 96% coming in 4-6 months) would be enough for me to un-quarantine, I get a strong yes. I would still not take 'stupid' risks but would mostly e.g. see friends freely.
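The net-deaths side of the comparison is just arithmetic on the two efficacy figures. A back-of-the-envelope sketch, treating the numbers as per-person probabilities of protection for two equally exposed people:

```python
# Efficacy figures from the thread: 87% after one dose, 96% after two.
one_dose_each   = 0.87 + 0.87   # expected protection summed across both people
two_doses_first = 0.96 + 0.00   # second person gets nothing for now

print(one_dose_each, two_doses_first)  # 1.74 vs 0.96 -- first-doses-first averts more
```

The QALY caveat above is exactly what this sketch leaves out: it counts protection, not the value of the first person being confident enough to un-quarantine.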

So interesting -- I thought you were going to go the opposite direction at the end. I have felt slight amounts of imposter syndrome before, and it came from feeling like a well-liked and well-respected person whose skills did not fully back up my reputation. So I was high on the social hierarchy, but I perceived that as coming from dominance and not prestige.

a gently pricked vein
I was also surprised. Having spoken to a few people with crippling impostor syndrome, the summary seemed to be "people think I'm smart/skilled, but it's not Actually True."  I think the claim in the article is they're still in the game when saying that, just another round of downplaying themselves? This becomes really hard to falsify (like internalized misogyny) even if true, so I appreciate the predictions at the end.
Sarahí Aguilar
Yes. I think it could also go the other way around: one can come from a privileged background which gives them an 8-10 dominance, yet lack the skill. However, I believe this is not a sustainable scenario. Or is it?

A parallel problem with prediction markets at non-financial-industry scales: they're used as signals of confidence. How often do you see, after someone makes a bold claim, a response saying to "put your money where your mouth is"? But the act of signalling confidence can itself be intrinsically valuable to the person making the claim. In bet-capped places like PredictIt, this can produce equilibria different from the optimal ones, because there are non-monetary incentives at work.

This is a good and valid question -- I agree, it isn't fair to say generalization comes entirely from human beliefs.

An illustrative example: suppose we're talking about deep learning, so our predictor is a neural network whose architecture we haven't specified yet. We choose two architectures and train both of them on our subsampled human-labeled D* items. Almost surely, these two models won't give exactly the same outputs on every input, even in expectation. So where did this variability come from? Some sort of inductive bias from the model architecture!
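A minimal stand-in for the two-architectures point, using polynomial fits of different degree in place of neural networks (the data points and degrees are invented): both models are trained on the same three labeled points, yet extrapolate very differently, and that difference can only come from the model class.

```python
import numpy as np

# A tiny "human-labeled" training set (a stand-in for the D* items above):
x_train = np.array([-1.0, 0.0, 1.0])
y_train = np.array([1.0, 0.0, 1.0])

# Two different "architectures": a line and a parabola, fit to the same data.
line = np.polynomial.Polynomial.fit(x_train, y_train, deg=1)
quad = np.polynomial.Polynomial.fit(x_train, y_train, deg=2)

# Trained identically, yet they disagree badly off-distribution:
x_new = 3.0
print(line(x_new), quad(x_new))  # very different extrapolations from identical data
```

The line is forced to average the data away (predicting ~0.67 at x=3) while the parabola passes through all three points and predicts 9.0 there; neither answer came from the labels alone.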