I went through Gwern’s posts and collected all those with importance 8 or higher as of 2024-09-04, in case someone else was searching for something like this.
10
9
8
...Things I learned that surprised me from a deep dive into how the medication I've been taking for years (Vyvanse) actually gets metabolized:
Here is some real data, which fits simple exponential decay rather well (it's from children, though, who metabolize dextroamphetamine faster, which is why the half-life is only ~10h).
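For intuition, the decay implied by a given half-life is easy to compute (a minimal sketch; the function name and the 20h example are mine, and the ~10h figure is the children's value from above):

```python
def remaining_fraction(t_h: float, half_life_h: float = 10.0) -> float:
    """Simple exponential decay: fraction of the drug left after t_h hours."""
    return 0.5 ** (t_h / half_life_h)

# With a ~10h half-life, a quarter of the dextroamphetamine remains after 20h.
print(remaining_fraction(20.0))  # -> 0.25
```

With the ~12h adult half-life instead, you'd still have ~31% left after 20h, which is part of why the numbers for children and adults diverge so much over a day.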
If legibility of expertise is a bottleneck to progress and the adequacy of civilization, then creating better benchmarks of knowledge and expertise for humans might be a valuable public good. While that seems difficult for aesthetics, it seems easier for engineering? I'd rather listen to a physics PhD who gets Thinking Physics questions right (with good calibration) years into their professional career than to one who doesn't.
One way to do that is to force experts to make forecasts, but this takes a lot of time to hash out and even more time to resolve.
One idea I just had related to this: the same way we use datasets like MMLU, MMMU, etc. to evaluate language models, we could use a small dataset like that for experts. They are allowed to take the test, performance on the test is always public, and you make a new test every month or year.
Maybe you also get some participants to do these questions in a quiz show format and put it on YouTube, so the test becomes more popular? I would watch that.
The disadvantage of this method compared to tests people prepare for in academia would be that the data would be quite noisy. On the other hand, this measure could be more robust to g...
Has anyone here investigated whether washing vegetables/fruits is worth it? Until recently I never washed my vegetables, because I had classified that as a bullshit value claim.
Intuitively, given that I am otherwise not super hygienic (e.g. I don't always wash my hands before eating), it doesn't seem that plausible to me that vegetables are where I am going to pick up an infection from other people having touched the carrots etc. Being in quarantine during a pandemic might be an exception, but then again I don't know if I am going to get rid of viruses if I am just lazily rinsi...
While there is currently a lot of attention on assessing language models, it puzzles me that no one seems to be independently assessing the quality of different search engines and recommender systems. Shouldn't this be easy to do? The only related thing I could find is this Russian site (it might be propaganda from Yandex, which is listed as the top-quality site?). Taking their “overall search quality” rating at face value does seem to support the popular hypothesis that Google's search quality has slightly deteriorated over the last 10 years (alt...
One reason why I find Lesswrong valuable is that it serves as a sort of "wisdom feed" for myself, where I got exposed to a lot of great writing. Especially writing that ambitiously attempts to build long-lasting gears. Sadly, for most of the good writers on Lesswrong, I have already read or at least skimmed all of their posts. I wonder, though, to which extent I am missing out on great content like that on the wider internet. There are textbooks, of course, but then there's also all this knowledge that is usually left out of textbooks. For myself, it proba...
It is time that I apply the principle of More Dakka and just start writing (or rather publishing) more. I know deliberate practice works for writing. “If you want to be good at something, you need to do it badly and then just keep going” is very common advice. I still find it very hard to do.
It's hard to decide what to write about
I get anxiety about writing something that is not good enough
Topics that pop into my head about what to write about revolve mostly around how it is
I feel like there should exist a more advanced sequence that explains problems with filtered evidence leading to “confirmation bias”. I think the Luna sequence is already a great step in the right direction, but there is a lack of an equivalent non-fiction version that just plainly lays out the issue. Maybe what I am envisioning is just a version of “What Evidence Filtered Evidence?” with more examples of how to practice this skill (applied to search engines, language models, someone’s own thought process, information actively hidden from you, ra...
If I had more time I would have written a shorter letter.
TL;DR: I looked into how much it would take to fine-tune gpt-4 to do Fermi estimates better. If you liked the post/paper on fine-tuning language models to make predictions, you might like reading this. I evaluated gpt-4 on the first dataset I found, but gpt-4 was already making better Fermi estimates than the examples in the dataset, so I stopped there (my code).
First problem I encountered: there is no public access to fine-tuning gpt-4 so far. OK, we might as well just use gpt-3.5, I guess.
First, I foun...
Probably silly
Quantifying uncertainty is great and all, but it also exhausts precious mental energy. I am getting quite fond of giving probability ranges instead of point estimates when I want to communicate my uncertainty quickly. For example: “I'll probably (40-80%) show up to the party tonight.” For some reason, translating natural-language uncertainty words into probability ranges feels more natural (at least to me) and so requires less work for the writer.
If the difference is important, the other person can ask, but it still seems better than just saying 'probably'.
I am not sure how much of a problem this was, but I felt like listening to more pop music on Spotify slowly led to value drift, because so many songs are about love and partying.
I felt a stronger desire to invest more time into fixing the fact that I am single, which I do not actually endorse on reflection. The best solution I've found so far is listening to music in languages I don't understand, which works great!
Hypothesis based on the fact that status is a strong drive and people who are on the outer ends of that spectrum get classified as having a "personality disorder" and are going to be very resistant to therapy:
Can anyone here recommend particular tools to practice grammar? Or does anyone have strong opinions on the best workflow/tool to correct grammar on the fly? I already know Grammarly and LanguageTool, but Grammarly seems steep at $30 per month when I don’t know if it is any good. I have tried GPT-4 before, but the main problems there are that it is too slow and changes my sentences more than I would like (I tried to make it do that less through prompting, which did not help much).
I notice that feeling unconfident about my grammar/punctuation leads me to wri...
Metaculus recently updated the way they score user predictions. For anyone who used to be active on Metaculus and hasn't logged on for a while, I recommend checking out your peer and baseline accuracy scores in the past years. With the new scoring system, you can finally determine whether your predictions were any good compared to the community median. This makes me actually consider using it again instead of Manifold.
By the way, if you are new to forecasting and want to become better, I would recommend past-casting and/or calibration games instead, becaus...
Not sure what's going on, but gpt-4o keeps using its search tool when it shouldn't and telling me about either the weather or Sonic the Hedgehog. I couldn't find anything about this online. Are funny things like this happening to anyone else? I checked both my custom instructions and the memory items, and nothing there mentions either of these.
Epistemic Status: Anecdote
Two weeks ago, I was dissatisfied with the amount of workouts I do. When I considered how to solve the issue, my brain generated the excuse that while I like running outside, I really don’t like doing workouts with my dumbbells in my room, even though that would be a more intense and therefore more useful workout. Somehow I ended up actually thinking and asked myself why I don’t just take the dumbbells with me outside. This was of course met by resistance, because it looks weird. It’s even worse! I don’t know how to “properly” ...
I was just thinking that there is actually a way to justify using Occam's razor: by using it, you will always converge on the true hypothesis in the limit of accumulating evidence. Not sure if I've seen this somewhere else before, or if I gigabrained myself into some nonsense:
Let's say the true world is some finite state machine M'∈M with input alphabet {1} and output alphabet {0,1}. Now I feed into it an infinite sequence of 1s. If I use a uniform prior over all possible finite state automata, then at any step of observing the output, t...
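The elimination/convergence dynamic can be sketched in code (a toy hypothesis class of four hand-picked machines rather than an enumeration of all automata, and all names are mine; since outputs are deterministic, the likelihood is 0/1 and Bayesian updating just zeroes out inconsistent hypotheses):

```python
from fractions import Fraction

# Each machine: (transitions, outputs) — state -> next state, state -> output bit.
# The input alphabet is {1}, so there is one transition per state.
machines = {
    "always0": ([0], [0]),
    "always1": ([0], [1]),
    "alternate": ([1, 0], [0, 1]),       # outputs 0,1,0,1,...
    "one_then_zeros": ([1, 1], [1, 0]),  # outputs 1,0,0,0,...
}

def run(machine, n):
    """Feed n ones into the machine and collect the output bits."""
    trans, out = machine
    state, bits = 0, []
    for _ in range(n):
        bits.append(out[state])
        state = trans[state]
    return bits

true_output = run(machines["alternate"], 6)  # pretend "alternate" is the true world

# Uniform prior; deterministic likelihood (1 if consistent with the data, else 0).
posterior = {name: Fraction(1, len(machines)) for name in machines}
for t, bit in enumerate(true_output):
    for name, m in machines.items():
        if run(m, t + 1)[-1] != bit:
            posterior[name] = Fraction(0)
    total = sum(posterior.values())
    posterior = {k: v / total for k, v in posterior.items()}

# After two observations, only "alternate" survives.
print([k for k, v in posterior.items() if v > 0])
```

In this toy class the posterior concentrates on the true machine after finitely many steps; the interesting part of the full argument is what happens with infinitely many hypotheses, where the choice of prior starts to matter.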
I like the agreement voting feature for comments! Not only does it change the incentives/signals people receive; I also notice that, when deciding whether to press the button, I more often ask myself whether I merely endorse a comment or actually believe it. Which seems great. The added time spent considering whether to press a button is costly, but for this particular button that seems more like a feature than a bug.
Summary: I have updated towards being more conscientious than I thought.
Since most of the advice on 80,000 Hours is aimed at high-performing college students, I find it difficult to tell how much of this advice should apply to me, someone who just graduated from high school. Previously I had thought of myself as talented in math (I was the best in my class of 40 students since first grade), but mid to below average in conscientiousness. I also feel slightly ashamed of my (hand)writing: most of my teachers commented that my texts were too short and my writing is not ...
If every private message on LessWrong or every cold email you ever wrote has received a response, you are either spending too much time writing them, are very young, or aren't sending enough of them.
Thinking about this thread on “How could I have thought of that faster?”: in practice, I noticed the phrasing doesn't work well for me, and I prefer “How could I have seen this faster?” I especially like this framing for when I have identified a “blindspot” or “developmental milestone”, or when noticing someone's wizard power clearly hints at a powerful learnable skill or concept that I wasn't aware of and don't yet possess.
I feel this frame helps me to better find areas where there are cached beliefs to correct, and remember hints at the shape of the thing ...
The recent post on reliability and automation reminded me that my text-expansion tool Espanso is not reliable enough on Linux (Ubuntu, GNOME, X11). Is anyone here using reliable alternatives?
I've been using Espanso for a while now, but its text expansions miss characters too often, which is worse than useless. I fiddled with Espanso's settings just now and set the backend to Clipboard, which seems to help, but it still has bugs, like special characters remaining ("@my_email_shorthand" -> "@myemail@gmail.com").
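For anyone hitting the same dropped-characters issue, the switch I mean is a one-line change in Espanso's main config file (path and option name per the Espanso docs; treat this as a sketch of my setup rather than a recommendation):

```yaml
# ~/.config/espanso/config/default.yml
backend: Clipboard  # the default injection backend was dropping characters for me
```

The Clipboard backend pastes the expansion instead of simulating keystrokes, which trades the injection glitches for the occasional clipboard quirk.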
I noticed some time ago that there is a big overlap between the lines of hope mentioned in Garret Baker's post and lines of hope I already had. The remaining things he mentions are lines of hope that I at least can't antipredict, which is rare. It's currently the top plan/model of alignment that I would want to read a critique of (to destroy or strengthen my hopes). Since no one else seems to have written that critique yet, I might write a post myself (leave a comment if you'd be interested in reviewing a draft or have feedback on the points below).
Testing a claim from the lesswrong_editor tag about the spoiler feature: first trying ">!":
! This should be hidden
Apparently markdown does not support ">!" for spoiler tags. Now trying ":::spoiler ... :::":
It's hidden!
works.
Inspired by John's post on How To Make Prediction Markets Useful For Alignment Work I made two markets (see below):
I feel like there are pretty important predictions to be made around things like whether the current funding situation is going to continue as it is. It seems hard to tell, though, what kind of question to ask that would provide someone more value than just reading something like the recent post on what the marginal LTFF grant looks like.
Has someone bothered moving the content on Arbital into a format where it is (more easily) accessible? By now I have figured out that (and where) you can see all the math- and AI-alignment-related content, but I only found that by accident, when Arbital's main page actually managed to load, unlike the other 5 times I clicked on its icon. I had already assumed it was nonexistent, but it's just slow as hell.
I wonder if you could exploit instrumental convergence for IRL (inverse reinforcement learning). For example, for humans we lack information about, we would still guess that money would probably help them. In some sense, most of the work is probably done by the assumption that the human is rational.
Epistemic status: Speculation
"Everyone" is misinterpreting the implications of the original "no-free-lunch theorems". Stuart Armstrong is misinterpreting the implications of his no-free-lunch theorems for value learning.
The original no-free-lunch theorems show that if you use a terrible prior over your hypotheses, then you will not converge/learning is impossible. In practice, this is not important, because we always make the assumption that learning is possible. We call these priors "simplicity priors", but the actually important bit about these is not th...
While reading "P vs. NP for dummies" recently, I was really intrigued by Scott's probabilistic reasoning about math questions. It occurred to me that of all the sciences, math seems like a really fruitful area for betting markets, because compared to areas like psychology, where you have to argue with people about what the results of studies actually mean, mathematicians seem better at reaching a consensus (it could potentially also help to uncover areas where this is not the case?). I also just remembered that there are a few math-related questions on Metacu...
“Causality is part of the map, not the territory”. I think I had already internalized that this is true for probabilities, but not for “causality”, a concept that I don't have a solid grasp on yet. This should be sort of obvious. It's probably written somewhere in the sequences. But not realizing this made me very confused when thinking about causality in a deterministic setting after reading the post on finite factored sets in pictures (causality doesn't seem to make sense in a deterministic setting). Thanks to Lucius for making me realize this.
This could have been a post, so more people could link it (many don't reflexively notice that you can easily get a link to a LessWrong quick take or Twitter or Facebook post by mousing over the date between the upvote count and the poster's name; this also works with tab and hotkey navigation for people like me who avoid using the mouse/touchpad whenever possible).