Don't worry, we did eventually find the source and context (see the thread)! We just had to be patient and wait to get lucky.
There will be no genius-level insights in 2025, but he could automate a lot of routine alignment work, like evaluating models.
What routine research work of your own have you automated with your digital-me?
Because, as far as I can tell, the LLMs don't seem to train on the Markdown versions of pages.
I link the Markdown versions with rel="alternate" link metadata (as well as in-page), but that doesn't seem to work, so I've taken the additional step of serving the Markdown source, instead of the HTML, to any HTTP request which specifies that it accepts Markdown anywhere in the request. This trick seems increasingly common with LLM agents, since they handle Markdown so much better than HTML gobbledegook, and I hope it becomes universally used. See https://github.com/gwern/gwern.net/commit/79ded21772a9aa338158c16a19dd4dad5a8f3d6b for details/background/nginx implementation.
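As a minimal sketch of the same content negotiation in pure Python (the actual implementation is nginx config, per the commit above; the site/ layout, port, and bare substring match on the Accept header here are just assumptions for illustration):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

SITE = Path("site")  # assumed layout: site/foo.html with a site/foo.md sibling

class NegotiatingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # If the client mentions Markdown anywhere in its Accept header,
        # prefer the Markdown source over the rendered HTML.
        accept = self.headers.get("Accept", "")
        wants_md = "markdown" in accept.lower()
        stem = self.path.strip("/") or "index"
        path = SITE / (stem + (".md" if wants_md else ".html"))
        if not path.is_file():  # no path sanitization: sketch only
            self.send_error(404)
            return
        body = path.read_bytes()
        self.send_response(200)
        self.send_header("Content-Type",
                         "text/markdown; charset=utf-8" if wants_md
                         else "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), NegotiatingHandler).serve_forever()
```

With that running, `curl -H 'Accept: text/markdown' http://localhost:8000/foo` gets site/foo.md, while a browser asking for HTML gets site/foo.html.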
Or you could go the other direction and overload in terms of allusion/connotation, which is part of the goal of "October".
Reviewing my LW posts/comments (any clear flaws, any objections I should pre-empt, how others might respond)
Does Gemini-2.5-Pro still work for this, given how sycophantic the post-0325 models were?
Even if that were true, it might not mean anything. Why might a country not invest in Y2K prevention? Well, maybe it's not a problem there! You don't decide on investments at random, after all.
And this is clearly a case where (1) USA/Western investments would save a lot of other countries the need to invest in Y2K prevention because that is where most software comes from; and (2) those countries might not have the problem in the first place because they computerized later (and skipped the phase of hardwiring in dangerously short data types), or hadn't computerized at all. ("We don't have a Y2K problem because we don't have any computers" doesn't imply Y2K prevention is a bad idea.)
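To make the "dangerously short data types" concrete, here is a toy illustration in Python (mine, not any particular legacy system) of the two-digit-year convention behind the bug:

```python
# Toy illustration of the Y2K bug: years hardwired into a two-digit field,
# under the baked-in assumption that every year is 19xx.
def parse_yy(yy: str) -> int:
    return 1900 + int(yy)  # the dangerous legacy convention

issued  = parse_yy("99")  # 1999: correct
renewed = parse_yy("00")  # meant as 2000, but parsed as 1900
print(renewed - issued)   # -99 years, instead of the expected 1
```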
The genre here is psychological horror fiction, and the style is first-person short story; so it's reminiscent of Edgar Allan Poe or Ted Chiang; but it's not clearly condensed or tightly edited the way those tend to be, and the narrator's style is prolix and euphuistic. From an editing perspective, I think the question I would have is to what extent this is a lack of editing & killing-your-darlings, versus a deliberate unreliable-narrator stylistic choice in which the prose is trying to mirror the narrator's brute-force piano style, or perhaps the dystonia-inducing music itself (because it can be hard to distinguish deliberately flawed writing from just flawed writing - especially in the vastness of the Internet, where there is so much flawed writing). I expect the latter, given Tomas's other stories, but I think it's not sufficiently well done if I have to even ask the question. So it could probably be improved by more precise writing, or a more ostentatiously musical structure, or some carefully chosen formal game; but one could also imagine making much more drastic revisions and escalating the horror in a more linguistic way - say, a variant in which the narrator suffers an aphasia or dyslexia from reading a pessimized piece of text (revenge?), and the writing itself disintegrates towards the end.
Hard to say. Oyster larvae are highly mobile and move their bodies around extensively, both to eat and to find places to eventually anchor to, but I don't know how I would compare that to spores or seeds, say, or to lifetime movement; and even anchored adult oysters "move their bodies around" and are not purely static - they would die if they couldn't open and close their shells or pump water. (And all the muscle they use to do that is why we eat them.)
Anthropic doesn't delete weights of released models
How do you know that? OpenAI, after all, has done exactly that.
https://en.wikipedia.org/wiki/Manege_Affair