Note for posterity: “Let’s think step by step” is a joke.
I downvoted this and I feel the urge to explain myself - the LLMism in the writing is uncanny.
The combination of “Let's think step by step”, “First…” and “Not so fast…” gives me a subtle but dreadful impression that a highly valued member of the community is being finetuned by model output in real time. This emulation of the “Wait, but!” pattern is a bit too much for my comfort.
My comment doesn’t have much to do with the content - it is more about how unsettled I feel. I don’t think LLM outputs are all necessarily infohazardous - but I am beginning to see the potential failure modes that people have been gesturing at for a while.
Prolific as ever! Small nitpick - the SBF interview link appears to be pointing at something else?
FYI, I find aider, using mixed routing between r1 and o3-mini-high as the architect model with sonnet as the editor model, to be slightly better than cursor/windsurf etc.
Or for minimal setup, this is what is currently ranking highest on the aider polyglot benchmark:

`aider --architect --model openrouter/deepseek/deepseek-r1 --editor-model sonnet`
Is the bet on general-purpose models still open? I guess it depends on the specific resolver/resolution criteria - considering that OpenAI has gotten the answers and solutions to most of the hard questions. Does o3's 25% even count?
The "biologically imposed minimal wage" is definitely going into my arsenal of verbal tools. This is one of the clearest illustrations of the position that has been argued since the dawn of LW.
I think this is a rather legitimate question to ask - I often dream about retiring to an island for the last few months of my life, hanging out with friends and reading my books. And then looking at the setting sun until my carbon and silicon are repurposed atom by atom.
However, that is just a dream. I suspect the moral of the story is often at the end:
"Don’t panic. Don’t despair. And don’t give up."
I am a fool - what does RSI mean in this case? I couldn't find it in the og post.
I think that is just true. In hindsight, my mistake is that I hadn't updated sufficiently towards how the major players are shifting to their own chip-design capacity. (Apple comes to mind, but I was definitely caught a bit off guard by how far even Meta and Amazon had moved.) I had the impression that Amazon had a bad time with their previous generation of chips - and that their new generation is focused on inference anyways.
But now with the blending of inference and training regime, maybe the "intermediaries" like Nvidia n...
I think this category of actors is neglected as a whole. (As well as SKH, Micron, etc.)
TSMC makes the chips for NVIDIA and everyone else - I didn’t talk much about them because they are already a lynchpin in many countries’ AI/national-security policy (the PRC, Taiwan, and at least the United States). And by their nature, they are already under heavy surveillance for prosaic (traditional national security and chip self-sufficiency) reasons.
Great stuff! I don't have strong fundamentals in math and statistics but I was still able to hobble along and understand the post. It reminds me of what Rissanen said about data/observation - that data is really all we have, and there is no true state of nature. Our job is to squeeze as much alpha out of observation as possible, instead of trying to find a "true" generator function. This post hit the same spot for me :)
p=1, Soylent still seems to be the top choice at the moment. (They have been running into some supply chain problems recently.)
(Huel seemed fine too from personal experience. If you care about refined oil/canola oil and protein sources it could be a decent alt)
Good stuff! Though it did take a while for me to extrapolate what M&E is actually supposed to do and look like; or "What does good M&E even look like?".
Non-profits seem quite hard, and naturally easy for power to entrench in (especially in an environment where people oppose legibility). I hope Abi finds their next venture more meaningful.
Update - Hacker News posts today and LessWrong posts today are very similar in length. That doesn't mean they do an equal job at being concise - maybe LessWrongers say precious little for the length of their treatises. But deriving the sophistication of the posts is left as an exercise for the readers and beyond my pay grade:
Hacker News - avg. 2876.125 words, for the current top 10 posts.[1]
LessWrong - avg. 2581.2 words, for the top ten posts in the last 24 hrs. (God damn it Zvi)
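For what it's worth, the averaging itself is trivial; here is a minimal sketch (the post bodies would have to be fetched or pasted in separately - nothing here scrapes either site):

```python
def average_word_count(texts):
    """Average whitespace-delimited word count across a list of post bodies."""
    counts = [len(t.split()) for t in texts]
    return sum(counts) / len(counts) if counts else 0.0

# e.g. feed in the plain-text bodies of the top 10 posts:
print(average_word_count(["one two three", "four five"]))  # 2.5
```

Whitespace splitting slightly overcounts compared to a proper tokenizer, but the error should be similar for both sites.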
A few problems with this 5-minute method of comparison:
While I agree that we don't live at the Pareto frontier of conciseness, explainability, etc., those are some odd examples to use to support your thesis. And the comparison to the Hacker News posts is likely using the wrong reference class.
Two of the three examples are heavily downvoted. Whether that's because of untruthful content, stylistics (length, tone, etc.), or memetic reasons (Eliezer ~ prophet), those posts are hardly the poster children of what LessWrong can do or even is.
As for Vanessa Kosoy's piece, the last third was filled with quotati...
And for me, the (correct) reframing of RL as the cherry on top of our existing self-supervised stack was the straw that broke my hopeful back.
And o3 is more straws to my broken back.
Could it be worth buying 23andMe stock if you want some of their users' data?
Naively, the sticker price of 75M USD (today's market cap) for all of their user data might seem cheap - altogether, the genomes of roughly 12 million users, or roughly $6 per genome. It seems reasonably cheap to me on a replacement-cost and opportunity-cost basis.
However, the 49% basically-majority shareholder is CEO Anne Wojcicki, and a "possibility is that Wojcicki has unreasonable plans to take the company private at a bargain-basement price[1]". If you take this path forward as a decidedly import...
A super silly heuristic I often use is "What media do you consume?". Intuitively, it kinda makes sense as an informational parallel to the old adage "You are what you eat." Looking at someone's Spotify/RSS/blogging/short-form-video consumption habits tends to inform me whether A and B would at least have a decent first conversation. But this seems much better at matching friends than partners - presumably common interests and shared information consumption are a slightly less important factor in continued romantic life (because there are so many other things!).
Is it Moskovitz's "irrational" responses that got him or a set of rather legible needs like "avoiding funding anything that might have unacceptable reputational costs for Dustin Moskovitz"?
we irrationally expect everyone else to be rational.
I am not sure that we do? I think we are not immune to the typical mind fallacy. But there is plenty of talk around these parts about optimal strategies when confronted with irrational opponents - the correct decision is not always to throw away your own rationality. Communicating with emotional people in a language that resonates with them seems like a fine practice of rationality.
Now, that is strawmanning a little bit. Perhaps this is talking about a maximally exploitative strategy against irrational/...
Thanks! I had built these before I ran into cleanairkits and the school of thought that "lower efficiency with higher throughput is better" - I think per dollar, their CADR is likely quite a bit better! Looking at their Exhalaron lineup - it is essentially two of my style of filter, glued and tensioned down in a portable package.
And a similar HEPA filter is used here as well, with two fans each nominally rated at 75 CFM. 92 CFM / 2 / 75 CFM = 61% - instead of the 80% figure I handwaved! (the new numbers should be up soon tha...
I think Thomas's "Instead of using HEPA to 'one-shot' (the original design intention) the air filtration task, the 'few-shot' approach with much higher throughput at a MERV-13-ish level of efficiency is generally better" is mostly correct. I see that Dynomight's IKEA filter investigations reached a similar conclusion (although more in the case of HEPA vs MORE HEPA).
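As a sanity check, the implied per-pass efficiency from the CADR and fan figures in the comment above can be computed directly:

```python
# Per-pass filtration efficiency implied by the figures above:
# measured CADR divided by total nominal fan airflow.
measured_cadr = 92.0       # CFM (Exhalaron figure from the comment)
fans = 2
rated_cfm_per_fan = 75.0   # nominal airflow per fan

efficiency = measured_cadr / (fans * rated_cfm_per_fan)
print(f"{efficiency:.0%}")  # 61%
```

This assumes the fans actually deliver their nominal CFM against the filter's back pressure, which real units rarely do, so 61% is a lower bound on the filter media's efficiency.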
However, I didn't want to 3D-print/jerry-rig an enclosure to fit the recommended filters, and where I am, I couldn't source a nice self-supporting (non-HEPA) filter that I can easily p...
For those interested in Chinese philosophy, I'd suggest 韩非子 (Han Feizi), which offers a thoughtful meta-analysis of earlier philosophers like Laozi and contemporaries such as Xunzi, insofar as their thoughts applied to statecraft. (The first Emperor was a big fan of the work.) Note: avoid the Burton Watson translation.
This recommendation assumes some basic knowledge of Chinese history.
For those new to the subject, [Recommendation to come, I am trying to find the English version for a children's book to Chinese Philosophy and History] might b...
I did read the original. It was long and I skimmed it. It was better in the coherence sense, in that the OOP didn’t post a probability on whether it is true or not. Hell, the OOP hedged it by saying “Do I believe what I’m saying? Well, yes and no”.
I guess the core of my confusion is the radical mismatch in confidence projection between its explicit form and its implicit form (through tone and context setting). [Note: the updated wording definitely tempers the expectations in the right direction, though still a bit bonkers at first glance.]
50% is extremely high. And lighthearted tones are often used to convey a sense of “I know this is a far-fetched theory. But I hold this strong claim very/appropriately weakly”.
Though not meant as derision, it is absolutely wild to read “Though I don't know that much about orcas” and “50% that orcas could do superhuman scientific problem solving” in the same paragraph.
My uneasiness with this post is that I am not sure how serious/joking it is. It has some of the hallmarks of a relatively lighthearted post written in a serious way (the interaction with the IP, for example), and the tone of the conversation is light in parts. Yet the calls to action are confusing - they are not really motivating and seem to offload responsibility too eagerly for someone who actually believes what they are writing about.
I am very confused about the post and not sure what to think about it.
Stephen puts it elegantly. Though for me, more of a code monkey, I'd like to think of it as "runtime non-zero-cost type safety through some const generics".
I can see how the article can be convincing. But it is worth keeping in mind that Hunterbrook is also a hedge fund that trades on its own news - an obvious case of potential alignment failure if there ever was one. Though I am not sure if they are shorting this one.
Perhaps more damningly:
...Jiangsu Pacific Quartz Co., Ltd. (SHA: 603688) produces HPQ in China. Earlier this year, state legislators evaluated North Carolina House Bill 385, which could ban ownership of local quartz mines by foreign entities from countries designated as adversarial to the U.S.
A quick sanity check on the Chinese side of the web revealed a couple of manufacturers of semiconductor-grade quartz, allegedly with manufacturing and processing centres in Jiangsu, CN.
My prior on this product type actually being a critical single point of failure is low.
See below:
http://www.quartzpacific.com/api/upload/uploadService/dowloadEx?fileId=1113&tenantId=147391
^Product spec (one of many semiconductor-grade product shapes)
http://zj.people.com.cn/BIG5/n2/2023/0316/c186327-40338436.html
^Investment news on new sites and manufacturing cap...
It doesn’t seem like you are arguing that breastfeeding is universally more convenient than formula. But breastfeeding can be very inconvenient:
Formula’s convenience lies in enabling asynchronous feeding of the baby - by separating the role of the producer and the role of the feeder, the other partner can take care of the baby while the mother sleeps.
Another compromise is to store breast milk and reheat it on demand!
Continuing the list...
On Lesswrong being a dispersed internet community:
If the ACX survey is informative, discussing local policy works surprisingly well here! I’d say a significant chunk of people are in the Bay Area at large and the Boston/NYC/DC area - that should be enough of a cluster to support discussions of local policy. And policies in California/DC have an oversized effect on things we care about as well.
I am curious, what were other "visions" of this workshop that you generated in the pre-planning stage?
And now that you have done the workshop, which part of the previous visions might you incorporate into later workshops?
I hope the partial unveiling of your user_id hash will not doom us all, somehow.
I am not everyone else, but the reason I downvoted on the second axis is because:
There is some good stuff here! And I think it is accurate that some of these are controversial. But it also seems like a strange mix of good ideas and “reverse-stupidity is not necessarily intelligence” ideas.
Directionally good but oddly framed: it seems like great advice to tell people that going straight for the goal (“software programming”) is a good way to approach a seemingly difficult problem. But one does not necessarily need to be mentored - that is only one of many ways. In fact, many programmers started and expanded their curiosity from typing something like ‘man systemctl’ into their shell.
It seems like, instead of asking the object-level question, asking a probing “What can you tell me about the drive to the conference?” and expanding from there might get you closer to the desired result.
Witty, but I feel like that is not actually true?
It is likely that the rationality oft named is not the true name of the thing. Or “just be a perfect bayesian agent lol” is not practical. But does that actually mean anything legible is immediately false?
TL;DR: This is a long metaphor drawing parallels between the Hanseatic League and the broader EA/LW communities. It is OK not to be a {corporation, societas, collegium, universitas} with a common/top-down/bottom-up violence-monopolizing system. The price to implement a resolution system people would find satisfactory might be too high.
The fact that it is hard to resolve conflict is an integral part of the bargain, not an isolated bug. I personally don't want us to become a chimera with nine heads - a chimera for the sole purpose of being able to utter that...
Oh my, I hope your sanity is holding.
In a sort of morbid way, it seems like things are working as intended - the "sharp" fella is winning social battles (invented or not) and keeps exploiting the ever-widening strategy space. Emboldened, he quickly gets to the "this is the line and no further" boundary of his current strategy. But instead of modifying it and keeping his old strategy as a tool in his arsenal, he over-exploits it and disrupts the equilibrium so much that he gets kicked out.
Seems like he's winning the battle but losing the war. He's not making allies, friends, or experiencing happiness.
Definitely preferable if he wins a longer-term, positive-sum game.
It is very possible that it works - though I am somewhat doubtful and I don’t have a unit to test it.
A quick way for us to learn more would be, I guess, to duct-tape the screen to the laptop at the angle/height you want - and work with it for a bit. That might yield more experimental data than our theorycrafting.
Indeed! But these are side loads, rather than loads directly above the hinges.
Imagine this... you are a hinge. You are designed to take loads that roughly match the motion of opening and closing the lid, plus a bit of additional tolerance. But when someone mounts something heavy on the side of the laptop, you are mostly annoyed but OK with it, because the side load will try to rip the hinges out of their respective housings in different directions - the housing is usually plastic on cheaper computers, but perhaps aluminum on Macs?
The problem with ha...
It might work, but it seems like the main difficulty would be the laptop hinge.
The hinge would be taking far more force than it is intended to take. And from my experience, it is somewhere around 3 kg for a MacBook Air 13 (your model is likely different) - so an M156 mounted 35 cm above the hinge might produce quite a bit of stress.
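For a rough sense of scale, here is a back-of-the-envelope torque sketch. The monitor mass below is an assumption for illustration, not a figure from the thread:

```python
# Rough torque estimate for a portable monitor mounted above a laptop hinge.
g = 9.81                # m/s^2
monitor_mass_kg = 0.9   # ASSUMED mass of an M156-class portable monitor
lever_arm_m = 0.35      # 35 cm above the hinge, per the comment

torque_nm = monitor_mass_kg * g * lever_arm_m
print(f"{torque_nm:.1f} N*m about the hinge")  # 3.1 N*m
```

Whether that is "quite a bit of stress" depends on the hinge's design torque, which neither of us knows; but the long lever arm is clearly doing most of the damage.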
Just going off a hunch, mostly the asymmetry of risk and reward?:
Reward: spreading the gospel of probabilistic truth; potentially, personal intellectual growth
Risk: retaliation (especially if the author is in the field), harassment, threats, lawsuits, wait… just more kinds of retaliation really. And potentially being seen as someone against the field of anti-aging despite an attempt at doing good science.
Hiya Yovel! Q1: How have you been impacted by the recent hostilities? Q2: What do you think are the potential end goals of this newly re-escalated conflict for the Israeli government? (As a naive observer, it seems like [occupying Gaza / leaving a power vacuum / letting Hamas reorg after] are all rather bad outcomes)
On #2, my personal view is that Israel has as much of an end-game strategy in Gaza as the US did after 9/11 invading Afghanistan - essentially none, but so much public pressure to overreact that they will go in and try to take over anyways.
Always welcome more optionality in the opportunity space!
Suggestion: potential improvement in narrative signalling by lowering the number of RAs hired (thus increasing pay):
Hey Winston, thanks for writing this out. This is something we talked a lot about internally. Here are a few thoughts:
Comparisons: At 35k a year, it seems it might be considerably lower than industry equivalent even when compared to other programs
I think the more relevant comparison is academia, not industry. In academia, $35k is (unfortunately) well within the normal range for RAs and PhD students. This is especially true outside the US, where wages are easily 2x - 4x lower.
Often academics justify this on the grounds that you're receiving m...
Viktor has a point here - the title is informative, but not well optimized (perhaps intentionally) for attracting eyeballs.
Something akin to:
Military and AI Compute: the DoD's $100 million cheque and what it got them
Might do the trick a bit better.
*Not actual advice
Blow the matter up in an election season; concentrate media focus at minimal cost. Contact local political activists and famous NIMBYs; mass pamphlet-style mobilization. Silent protest (of even just one person) outside the Berkeley Department of Transportation.
Agreed, there can be an optimum. But I think the intuition here is that it is exceedingly rare to run into a situation that is a local optimum in all "directions".
It is only an "optimum" when all 175 billion parameters are telling you to screw off and stop trying.
My instinct is that the lottery odds were not truly random, or close to truly random. Or the odds for the specific lotteries were a lot better than assumed.
Or in other words, the prior on the lotteries being fair is low.
Quick note on “Ukraine…has not trained its people in guerrilla warfare.” I am sure that Ukraine has not engaged in public programs to turn a significant percentage of its population into capable guerrilla fighters.
However, from my sources in the NATO deployments, the Ukrainian irregulars and volunteers have been rigorously trained in “…irregular warfare” in significant numbers. Will provide more rigorous and structured info shortly.
I am curious, and you have probably thought much about this - but how would the transition happen from the existing economy to this new one? How do you convince existing property owners to give up their “out-of-proportion” ownership claims? (Would it just be political coercion, as in the post? Then who would convince the state, and how?)