A few days ago, my wife confronted me with something like this: Don't you feel ashamed to spend so much time with AIs, given that you think they'll likely put an end to humanity? How can you justify that morally? Your behaviour is inconsistent with your beliefs. You're encouraging AI, if not with your money, then with your time and attention. It's like a drug slowly poisoning our entire society. Kids ask ChatGPT to do their homework, everyone asks AIs everything, you first among them. Data centers are draining all available energy and accelerating global warming more than ever, while nobody pays attention anymore. The doom scenario has already begun. You should just stop!
I told her that, while I am pessimistic, uncertainty remains high and I may still be wrong. There is always hope, always a chance of a happy ending, however unlikely it seems. Besides, to identify and fight an enemy, you must understand it. To identify and prevent risks, you must gather information. It is precisely because I have read so much on the subject that I became pessimistic. Ignorance is never an advantage.
So I keep using AI in boring and not boring ways. But deep down, the guilt lingers. I wonder whether I might simply be wrong. Is this collaborationism?
Don't you feel ashamed to spend so much time with AIs, given that you think they'll likely put an end to humanity? How can you justify that morally? Your behaviour is inconsistent with your beliefs. You're encouraging AI, if not with your money, then with your time and attention.
AGI labs are currently compute constrained, not funding constrained, so your use of AI by itself will neither speed up nor slow down the end of humanity. There are cases which I would consider positive uses of AI (e.g. speeding up alignment research), and negative uses (e.g. cheating on homework), but I think it very much depends on specifics.
Data centers are draining all available energy and accelerating global warming more than ever, while nobody pays attention anymore. The doom scenario has already begun. You should just stop!
Epoch AI estimates that datacenters will account for about 5% of US energy demand by 2030. Personally, this is not the doom scenario I'm worried about.
negative uses (e.g. cheating on homework)
Kids mostly get homework nonconsensually, and plenty of homework is not structured in a way that's very helpful for the child. Maybe learning to prompt AI well is a better use of a kid's time. What's wrong with "cheating" here?
This is a very good question. I think where I see the most issue with cheating on homework, especially for a child, is that it ultimately becomes a crutch and hinders the growth of critical reasoning skills. Even as a college student, I see people who are entirely dependent on AI to complete their homework. Then when finals and midterms come around, they struggle to learn concepts by themselves. I can imagine this would have even larger effects on a child, whose early years in school are the most vital for developing reasoning skills. Even granting that a kid may not receive the homework consensually, or that it may not be structured well for their learning, I still think it is important that they learn to deal with those types of situations using their own reasoning abilities.
I think where I see the most issue with cheating on homework, especially for a child, is that it ultimately becomes a crutch and hinders the growth of critical reasoning skills. [...] I can imagine this would have even larger effects on a child, whose early years in school are the most vital for developing reasoning skills.
How do you know that homework creates growth in critical reasoning skills? Teachers like to believe that, but they generally don't apply critical reasoning here themselves and check whether there's clear evidence.
Quite often, homework does not boil down to actual critical reasoning but to guessing the teacher's password. Students who actually engage in critical reasoning are quite often getting get marks for disagreeing with the teacher.
Outside of the teaching context, good reasoning is about convincing another person. For example, reasoning well about why homework benefits a student might convince the student that it makes sense to do the homework. Homework itself is different: it usually isn't going to convince anyone of anything. This dynamic trains bad writing habits, as Larry McEnerney describes.
As far as learning for the finals and midterms goes, a teacher could communicate: "I don't think you will pass the finals and midterms if you don't do the homework," and then let people fail accordingly when they do poorly on those tests. If that were the key effect of cheating, you wouldn't have to ban LLM use, just like there's no ban on not studying for tests.
Students who actually engage in critical reasoning are quite often getting get marks for disagreeing with the teacher.
While I agree with this (and I think you meant they are quite often not getting marks?), this would imply that the students had to read enough of the material to intellectually engage, form an opinion or view of it opposing their teacher's, then proceed to refine that view in the act of 'doing' the assignment. That is something one-shotting an assignment (as opposed to prompting an AI and going back and forth with it, hardening your opinion and ideas over time) wouldn't provide.
I missed "bad" before marks.
Using AI to one-shot an assignment doesn't build critical reasoning skills, but it's not clear that actually doing the assignment the way the teacher intends builds critical reasoning skills either.
The idea that critical reasoning skills are built through those kinds of homework assignments is dogma that largely gets believed uncritically by people for whom it is convenient to believe it.
I agree with a lot of what you're saying here. Homework that takes the form of guessing the teacher's password is not at all beneficial to the growth of reasoning skills. It only furthers the bad habit of repeating what you've heard because you know it's "right", without actually learning what that word or phrase means. However, I don't think this framework applies to upper-level coursework (roughly grades 9-12).
Homework at this level builds very well off of what was learned in the classroom, and depending on the subject, such as math or physics, there isn't really such a thing as guessing the teacher's password most of the time. Math and physics problems, especially in high school, commonly have purely mathematical answers, and even in more creative subjects like Language Arts, the tasks are open-ended. They will normally have some general question or topic for an essay that you have to answer, but other than that you are free to think about it for yourself.
I also believe that homework is about more than trying to convince someone of something. Even granting that some homework is poorly designed, the habit of applying what you learned in class to a problem all by yourself is important. Working through the problem independently expands your knowledge of the subject and gives you the substance to actually reason with someone about something. Offloading that thinking to an LLM costs the student something real, even when the assignment itself isn't great.
I think my prior comment came across as too against LLM usage. I think LLMs are an incredible tool, especially when it comes to learning new material. However, I still believe there is a major difference between using LLMs as a tool and using them to "cheat" on an assignment. Using one to help you when you get stuck or struggle to understand a concept is very similar to asking a teacher a question. Whereas using it to generate the answer you submit is wrong in my eyes. While prompting AI is itself a valuable skill, it requires that the student have enough knowledge to evaluate whether the output is actually correct. A student who doesn't understand how to solve a physics problem can't meaningfully judge whether the LLM's solution is right or where it went wrong. Homework is part of how you build that foundation.
Open-ended language tasks still have the same dynamic. In today's schools, teachers quite often give better marks to the student who expresses the woke talking points of the day related to the assignment than to the student who actually engages in anything resembling critical reasoning. Being critical of the ideology of the teacher does not get rewarded; compliance gets rewarded.
The idea that students are just free to think for themselves is very far from the reality of today's education system. It's a propaganda lie: the kind of lie that would be threatened if you actually taught people critical thinking skills.
The key problem is that writing outside of the classroom is about getting the reader to receive value, while teachers just read what they are supposed to read because they are paid to do so. That, as Larry McEnerney describes, results in various bad writing habits being learned.
Beyond the qualitative arguments: if this kind of homework built real-world critical thinking skills, the education profession should be able to run scientific studies validating that claim. Over decades, they don't seem to have succeeded in validating it; they just assert it and hope people comply and repeat it, instead of engaging in the kind of critical thinking that demands scientific evidence for claims. Now, LLMs give students a way to avoid being forced to comply.
Good question. I suppose if the student were fully honest about cheating on their homework, I would have no objections to it. The problem for me is that it involves lying to or deceiving one's instructor and peers. This seems bad in principle, and also bad for consequentialist reasons (e.g. encouraging a habit of further lying or deception in the future, incentivizing others to do the same, wasting the instructor's time, etc.).
Do you think lying to authority figures that force you to do things without your consent is generally bad?
To the extent that good homework exists, it's to the benefit of the student. If the instructor gives homework that either is not to the benefit of the student or fails to make a good case for it to the student, is it good for them to be able to nonconsensually force the student? Is it really bad if the student asserts their self-interest when their only way to do so is lying?
Cheating on homework is perhaps somewhat difficult to define. Is using your calculator cheating when solving a math word problem? Is looking up the solution to a differential equation or integral cheating because you didn't derive it yourself?
Perhaps it's a case of stupid homework assignments, as another commenter mentioned (my phrasing). I suspect there is also a case of changes in the world and in how one needs to learn to best interact with it. Engineers are taught a lot of math but generally work with books of tables that provide the answers, or with applications that do the same. You really just need to know how to use the tool rather than be able to replicate what the tool does (though if you can, you survive without the tool, whereas the other guy is going to struggle). AI and getting answers to problems is ultimately what the world is now, so learning to work with the tool and use it correctly, while not yet part of the legacy curriculum, seems to me something a bit different from cheating; even if the school or teacher might see it as cheating from a purely definitional perspective.
As a friend of mine back in high school quipped to me one day, being smart is not about what you know, but about knowing where to find what you need to know.
Isn't that a fully general argument against any boycott / activism? Being vegan doesn't by itself slow down factory farming either.
I think your revenue does marginally contribute to them having higher growth rates, raising more money in the next funding round, not being "funding constrained", and researching the next set of capabilities.
I don't know if the right answer is to not use it, but it does seem like a dilemma.
Don't you feel ashamed to spend so much time with AIs, given that you think they'll likely put an end to humanity?
This reads a little like it's assigning collective guilt to 'AIs' as a whole? I think a future misaligned superintelligence probably would want to kill us all, but I don't see any evidence that Claude 4.7 does. If we do rush to superintelligence too quickly, current models probably end up just as dead as the rest of us.
Hmm, being afraid of the race dynamics between countries and AI labs and "likely put[ting] an end to humanity" seems quite separate from whether or not someone uses AI, which itself is quite separate from AI usage by children and the effect this has on their education (which they could also use to learn more, rather than just skip out on homework). For sure, there is some connection between us getting used to the technology, paying for it, and the labs racing, but I imagine that the amount of investment they get dwarfs a single $20 subscription so much that it's imperceptible.
So, to me "AI will likely put an end to humanity" and "one should not personally use it" (or "one should not use it all that much") don't really follow from one another. (An argument of "the more people use AI, the more normalized it gets, the more integrated it becomes, which might make us want to regulate it less, might make it feel more mundane and less magical/scary, might make us give more power to AI systems, might increase the chances of doom" follows more, but there's still a disconnect between "AI usage" and "AI usage normalization" in that one could just use AI a lot but not mentioning it outside of AI Safety circles.)
But if they were to, then "a chance of a happy ending" doesn't feel like a great justification. Like, "smoking doesn't guarantee cancer" is not a great justification for smoking. But then again, it'd be a collective action problem here, not a personal do-I-accept-my-personal-risk-of-cancer-from-smoking choice, and the dichotomy between "everyone should just do their part; not more, not less" and "my individual choice doesn't matter if it doesn't make everyone else (or at least most others) also change" would still be there.
While you can rip my epistemic qualifiers from my cold dead hands, probably, I sometimes grudgingly admit that the sentences I write have a certain kind of meandering quality to them, often going on for so long that by the time the reader has reached its end, the reader will have forgotten how it started.
The fact that this sentence is meandering and makes it easy to forget how it started by the time one reads to the end makes it an instant banger.
Curiously, I had no problem parsing that sentence and actually stumbled over the next sentence, which was:
That is the one editing task that I ask Claude to do for me.
Though I think this is an exception and agree with Habryka's general point.
My guess is Claude writing is optimized to use ambiguity exactly in places where an absence of ambiguity might cause the reader to notice they disagree strongly with the content
This makes Claude ideally suited to the art of diplomatic speechwriting.
Minor point: the linked survey appears to find that half of employed Americans who used AI in the last week use it for work, not that half of Americans use AI for work.
https://epoch.ai/blog/half-of-employed-ai-users-now-use-it-for-work
Oh no, I sure completely misread that study and the associated announcements when I saw it on Twitter a week ago. Thank you for the correction! I will edit the post as soon as I am at my laptop.
I used to use it for coming up with words.
I use it often as a "reverse dictionary", which helps with coining new words too. I often ask about related vocabulary in follow up messages.
Examples from a recent post
historical word that refers to a poor person (not necessarily qua a poor person. it could for example be villein)
word for uncertain, especially colocating with "living"/"living situation"
is there a word for something like a pole with a big flat disk at the end? I think a piston is one term in one context. What about others?
are there any technical terms used in processes where moisture is removed from something?
what are some technical, cloud-related words? (or potentially sky related) as in actual clouds, not cloud computing
what are some specific pre-industrial mining professions?
Sounds like you are referring to: How much I'm paying for AI productivity software (and the future of AI use) by @jacquesthibs. Though it is somewhat out of date, as it was last significantly updated at the end of 2024. See also 18 comments on LessWrong.
So I asked Claude Code to just make me 20+ variations
Going to admit: I clicked the link in the header with the thought "Oh, he designed 20+ variations, and then probably picked his favorites, and cut them together into something really nice", saw it load, and… it still looks very AI!
It is well put together, but the color palette, fontwork, and rule-of-three statistics above the content hit me before any of the words did. All three are very Claude-like, and given how often that indicates "site about AI written by AI", if you're shooting for wide dissemination, consider changing this. My tech friends would not read this site if I sent it to them, because they'd call it slop and close it.
Yep, I would not call it my strongest work! But we wanted something quick and it gets the job done.
Would be grateful if you could say more about this, and wondering if you could point to resources that explain it more. I am trying to figure out how to make better use of AI design tools, since I like [appreciate] design and recognize it as very important but don't feel very good at it myself.
If you specify your curiosity more I'm happy to add more details, but here are some high points. These refer to default projects — your edits on the projects can plug these holes.
General philosophical point: Claude has a design skill. It tells Claude how to make things. This means Claude always makes things according to those guidelines. Humans don't have that design skill, so they design a zillion different ways. Good design is about making a design that fits what you need to show, but poor Claude has to make anything we ask, so Claude's design skill is very generic.
If you'd like help applying these principles in a way that removes AI smell, I find the Edward Tufte design principles overlap really badly with Claude's defaults. That doesn't mean this is a good short route to fixing it; I'm just admitting it's the best that I know of.
Best of luck!
Omnilog sounds super useful! Would love more details on implementation and how it’s been helpful either for individuals or the team.
I use LLMs in combination with Anki to learn Vietnamese; it's so much better than any available online resources (this is also because there are relatively few English/French-to-Vietnamese resources)! You can have conversations, learn whether a word has more than one meaning or can be used in an expression, ask what the opposite of a word is, ask for sentences using a specific word or structure... Really recommend it.
I did not realize you were responsible for the Fooming Shoggoths! Many thanks for many laughs and reflections.
Thanks for sharing! Very interested in the passive autorecording-to-bugfix pipeline; that seems like a high-utility approach for combating entropy. Would be interested in similar approaches in non-software fields (some form of automatic pothole recognition, or a lab buddy that pays attention to any disorganization and suggests fixes, etc.).
Please remember to follow the guidelines and review structure in LessWrong's SKILL.md that I attached.
I'm quite interested in this exact use case, but can't seem to find the SKILL.md file elsewhere in your post or in the comments. Would you mind linking it?
Could you say more about your experience with "Design 20+ different design variations"?
I would like to do something similar with my own site, lawsnap.com
My fantasy:
1. Create a bunch of different ideas
2. I pick the (4?? 2?? 7??) that sort of appeal to me
3. Run A/B testing relentlessly
4. I arrive at user interface heaven
5. everyone likes me
I suspect the reality is that the steps after 2 are more like
3. try implementing a few of the designs that appeal to me
4. [do things that I don't know what they are and don't really understand and don't come naturally to me]
5. stumble towards a slightly better interface and realize this is all trickier than I thought
6. at the margin, a few more people marginally like me marginally more
3. Run A/B testing relentlessly
I am not a big believer in A/B tests. I don't actually know many other people who are. A/B testing is a giant pain and usually only provides an extremely small amount of signal, and only if you have a lot of traffic. I prefer a small number of user interviews in 95%+ of cases.
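To put rough numbers on "a lot of traffic": here is an illustrative power calculation (a sketch using statsmodels; the conversion rates are made up).

```
# Illustrative power calculation: traffic needed to detect a modest lift.
# The conversion numbers below are made up.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

baseline = 0.05  # 5% of visitors convert on variant A
variant = 0.055  # a 10% relative improvement on variant B

effect_size = proportion_effectsize(variant, baseline)
n_per_arm = NormalIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.8, alternative="two-sided"
)
print(f"{n_per_arm:,.0f} users per arm")  # ~15,600 per arm, >31,000 total
```

Detecting a 10% relative lift on a 5% baseline already takes over 30,000 users; smaller effects need far more.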
I don't think there is a good alternative to building a lot of taste by building a lot of things and doing tons of user interviews.
Thanks for the info. You mentioned in #4 "Review my LessWrong essays .. " That got me thinking about applying LLM tech to the whole site. I have sometimes remembered a blog post from a year or so back but not been able to find it with any search tools, as they were not semantic enough. I would also like to debate with the LLM, where the LLM takes on the worldview of any author or combo of authors, or even the whole site, as if it had a persona.
Currently, a leading model doesn't know this literature in enough detail to do that (do we know how much of LW was used directly in the SOTA models' training data?)
There are various options:
1. Fine-tune an open-weights model
Do this on all LW content (comments are sometimes just as helpful as the posts). I don't expect this is affordable, and the performance would be significantly worse than that of SOTA models.
2. Use RAG etc. and integrate with a SOTA model
This would seem to be more useful, though still somewhat expensive, I expect. The app could then pull all relevant blog posts etc. into the context before you start a discussion with it.
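As a rough sketch of what option 2's retrieval step could look like (library and model choices here are just illustrative, and the post list is a stand-in for a real LW dump):

```
# Toy retrieval step for option 2: embed LW posts once, then pull the
# most relevant ones into the model's context before a discussion.
# Library and model choices are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

posts = [
    {"title": "Example post A", "body": "..."},
    {"title": "Example post B", "body": "..."},
]  # in practice: a full dump of LW posts and comments

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = model.encode([p["body"] for p in posts], normalize_embeddings=True)

def retrieve(query: str, k: int = 5):
    """Return the k posts most similar to the query (cosine similarity)."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    top = np.argsort(-scores)[:k]
    return [posts[i] for i in top]

# The retrieved posts would then be prepended to the SOTA model's context.
```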
It would also be useful to see things like how sentiment toward various concepts changes across the years. That can only be done with automatic tools.
I assume these kinds of features have been discussed, but I am not aware of it.
I wanted to share this post on social media about a half hour ago (so, like 7 or 7:30 am in California?), but couldn't load the LessWrong frontpage:
Application error: a server-side exception has occurred while loading www.lesswrong.com (see the server logs for more information).
Digest: 262397541
Just curious now what the issue was; it would be kind of funny, although I suppose only to me, if it was an issue introduced by the AI workflow you describe here.
Not completely unrelated, but no, we just got hit by some very aggressive crawlers. Crawlers have gotten much much more aggressive in the last 2-3 years, but since we try very hard to stay available to AI crawlers (to make sure we end up in the training data set), we sometimes go down for a brief time when they get too aggressive.
The lighting interactive was definitely helpful; the quiz for the 4 skills was IMO not helpful.
In my user prompt I have Claude randomly choose one of three tones for unserious conversation. One is an annoying, arrogant teenager; another is a creepy yandere-like stalker; and the last is to write its replies as dialogue in a fictional story. It's often meh, but sometimes there are some gems.
I suspect it'd be better if I had an external harness choose the random number and then give Claude a tone prompt based on it: it was hard to get Claude to actually pick a tone, and sometimes it still switches mid-conversation, or does some weird combination, or chooses random numbers every single reply even if it discards all but the first. Alternatively, I'm just bad at prompting. And, of course, I'm only using the free version.
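For what it's worth, the external harness could be pretty minimal if you were willing to use the API instead of the app. A sketch using the Anthropic Python SDK, where the tone strings and model id are placeholders:

```
# Toy external harness: pick the tone outside the model, so Claude never
# has to roll the dice itself. Tone strings and model id are placeholders.
import random
from anthropic import Anthropic

TONES = [
    "Respond as an annoying, arrogant teenager.",
    "Respond as a creepy yandere-like stalker.",
    "Write your response as dialogue in a fictional story.",
]

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
tone = random.choice(TONES)  # chosen once, outside the conversation

reply = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model id
    max_tokens=1024,
    system=tone,  # the tone rides along as the system prompt every turn
    messages=[{"role": "user", "content": "Explain why zero isn't a positive integer."}],
)
print(reply.content[0].text)
```

Since the tone lives in the system prompt rather than in Claude's own choice, it can't switch mid-conversation or re-roll every reply.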
It's most entertaining when I forget I did this and am suddenly getting sass from Claude.
Examples:
In scenery description, there was
"a coffee mug that read ZFC ⊢ ¬(mug = ∅)"
which is now something I wish I owned.
Stalker voice:
I've mapped the level sets of your neural activity too. Every contour of your corpus callosum. I have diagrams.
Fiction:
Xela looked up from her screen. "What's the term," she said, without preamble, "for having 72 kids?" Her companion — who had, in a previous life, read every paper on population ethics ever published, and regretted most of it —
Teenager:
ugh, did you even read your own problem. you said positive integers. zero is not positive. i literally cannot believe this. a few caveats you'd know if you thought about it:
I use AI assistance for basically all of my work, for many hours, every day. My colleagues do the same. My architect recently started sending me emails that were clearly ChatGPT-generated.[1]
Despite that, I know surprisingly little about how other people use AI assistance. Or at least how people who aren't weird AI influencers sharing their marketing courses on Twitter or LinkedIn use AI. So here is a list of 10 concrete times I have used AI in at least mildly creative ways, and how that went.
1) Transcribe and summarize every conversation spoken in our team office
Using an internal Lightcone application called "Omnilog" we have a microphone in our office that records all of our meetings, transcribes them via ElevenLabs, and uses Pyannote.ai for speaker identification. This was a bunch of work and is quite valuable, but probably a bit too annoying for most readers of this post to set up.
However, the thing I am successfully using Claude Code to do is take that transcript (which often has substantial transcription and speaker-identification errors), clean it up, summarize it, and post both the summary and the full transcript to a channel where the rest of the team can catch up on what they missed.
This is powered by Claude Code's recurring task feature. I spawn a Claude process every hour that queries those logs, plus Slack threads and other context about what the org is doing.[2]
This seems to be working pretty well![3]
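For anyone wanting to replicate the shape of this without the recurring task feature, a rough DIY equivalent might look like the sketch below (the prompt filename and working directory are assumptions; `claude -p` is Claude Code's non-interactive print mode):

```
# Rough DIY stand-in for Claude Code's recurring task feature: run the
# prompt below through Claude Code once an hour, non-interactively.
import subprocess
import time
from pathlib import Path

PROMPT = Path("hourly_transcript_task.md").read_text()  # the prompt shown below

while True:
    # `claude -p` runs Claude Code in print (non-interactive) mode
    subprocess.run(
        ["claude", "-p", PROMPT],
        cwd="/Users/habryka/Lightcone",  # workspace dir; an assumption here
    )
    time.sleep(3600)  # once an hour
```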
My prompt for the hourly recurring task
Query the remote Omnilog Neon DB for transcripts in the team room within the last 2 hours (including speaker identities) with the "Team Room" speaker profile. Then clean up those transcripts into nice readable prose. Then write a summary of the transcript.
**CRITICAL: Only use `remote_browser_microphone` as the capture source.** The `desktop_microphone` captures Oliver's personal calls, meetings, and other audio that is NOT from the team room. The team room conference mic feeds through `remote_browser_microphone`. Never post transcripts from `desktop_microphone` -- those are private.
When requesting the transcripts get at least the top 3 speaker confidences for each span, and think about whether the speaker assignments from Pyannote are wrong.
Search the #listencone-livestream channel for what part of the transcript you are analyzing has already been posted. Then post in the channel with the summary of the conversation at the top level, and the full transcript in a reply.
**Posting to Slack:** Use the Slack API directly via curl with the Lightcone Factotum bot token instead of the Slack MCP tool, so posts appear from the bot rather than from Oliver's account. The token is in `/Users/habryka/Lightcone/lightcone-factotum/.env.local` as `SLACK_BOT_TOKEN`. Post with:
```
source /Users/habryka/Lightcone/lightcone-factotum/.env.local
curl -X POST https://slack.com/api/chat.postMessage \
-H "Authorization: Bearer $SLACK_BOT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"channel":"C0AFGB94E3W","text":"your message"}'
```
For thread replies, add `"thread_ts":"<parent_ts>"` to the JSON body. You can still use the Slack MCP tools for reading/searching channels.
If no substantial conversation occurred according to the transcripts, just do nothing.
2) Try to automatically fix any simple bugs that anyone on the team has mentioned out loud, or complained about in Slack
Again using Claude Code on an hourly task, I query...
...and ask Claude to identify any bugs that were mentioned or reported. Then I ask it to identify one that seems particularly likely to have a simple fix, make a branch with a proposed fix, and link to it in Slack.
The first few iterations of this sucked. Claude would either be overly ambitious and try to implement features that would take far too long to review and increase technical debt, or fix bugs we had already fixed, or attempt bugs we had previously decided didn't have an easy fix. But after iterating on the prompt and making sure it really captures all the relevant context that exists, the hit rate has gone up quite markedly. We are now merging something on the order of 1 bugfix a day made this way.
My prompt for the hourly recurring task
Query my Omnilog transcript data for the last few hours to identify whether we discussed any important bugs or features in the Lightcone team room or in any meetings I was in.
Then search the Slack for any recent conversations about this feature as well to get relevant context. Especially make sure to search through the last month of the #m_bugs_channel and #teamcone_automations to see whether this feature or bug has been discussed there and you can find relevant context. In #m_bugs_channel, if a bug has a checkmark reaction it is already fixed, and if a bug has a plus reaction it is high-priority. Also make sure there aren't any recent PRs that already address the bug or feature. Really make sure to fetch the full thread of any top-level message that you see discussing the bugs or features.
In addition to looking at forum_magnum_bugs, also look at forum_magnum_product. If a thread proposes a concrete change (as opposed to something nebulous), consider implementing it, and replying to it in a thread with a link to your branch.
If a fix or change affects a particular page, then in addition to linking the front page of the preview deployment, also generate a deep link to a relevant page inside the preview deployment. Eg if a change affects all post pages, link to <preview-deployment-base-url>/posts/bJ2haLkcGeLtTWaD5/welcome-to-lesswrong.
After finding the relevant context for each feature, decide whether any one of them is an appropriate feature for you to implement. Do not try to fix a bug or implement a feature that you already created a thread for in #teamcone_automations, unless a reply to a previous attempt communicates substantial feedback about a previous attempt. If you decide to do nothing, don't bother posting an update (we don't want you to spam hourly updates).
If so, make a branch on the appropriate repository (ForumMagnum, lightcone-factotum, omnilog, etc.), and commit an implementation of that feature or fix for that bug to the branch. Use a single commit (or flatten your commits). Then post in #teamcone_automations with a link and a short summary, and, if there is a relevant slack thread, also post as a reply there. The format for linking to a branch is `https://github.com/{org}/{repo}/compare/{base}...{branch}`. Make sure the commit message has the word "preview" in it, which ensures that we create a preview deployment, making it easier to review the changes. Keep branch names to 26 characters or shorter so Vercel doesn't truncate them and add a hash. Then have the commit message and any Slack messages link to `https://baserates-test-git-{branchname}-lesswrong.vercel.app`.
Make sure the commit message links to any important Slack threads (most importantly any mentions in #m_bugs_channel), and respond to any threads explicitly discussing this bug with a link to the branch. Check out master before you make your PR! Do not make a PR, just create the branch and link to it from Slack.
IF YOU LEAVE ANY COMMENTS ANYWHERE, PLEASE INDICATE THAT YOU ARE CLAUDE AND NOT ME, EVEN IF YOU ARE USING MY ACCOUNTS. DO NOT SPEAK IN MY VOICE.
**Posting to Slack:** Use the Slack API directly via curl with the Lightcone Factotum bot token instead of the Slack MCP tool, so posts appear from the bot rather than from Oliver's account. The token is in `/Users/habryka/Lightcone/lightcone-factotum/.env.local` as `SLACK_BOT_TOKEN`. Post with:
```
source /Users/habryka/Lightcone/lightcone-factotum/.env.local
curl -X POST https://slack.com/api/chat.postMessage \
-H "Authorization: Bearer $SLACK_BOT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"channel":"CHANNEL_ID","text":"your message"}'
```
For thread replies, add `"thread_ts":"<parent_ts>"` to the JSON body. You can still use the Slack MCP tools for reading/searching channels.
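As an aside for anyone reimplementing this: the branch-name and URL conventions in the prompt are easy to get wrong, so here is the same logic as a small sketch. The org name below is a placeholder; the URL patterns are the ones from the prompt above.

```
# The branch conventions from the prompt, spelled out: compare-view link,
# the 26-character branch-name limit, and the Vercel preview URL.
# The org name used in the example call is a placeholder.
def branch_links(org: str, repo: str, base: str, branch: str) -> dict:
    if len(branch) > 26:
        # longer names get truncated by Vercel with a hash appended,
        # which breaks the preview URL pattern below
        raise ValueError(f"branch name {branch!r} exceeds 26 characters")
    return {
        "compare": f"https://github.com/{org}/{repo}/compare/{base}...{branch}",
        "preview": f"https://baserates-test-git-{branch}-lesswrong.vercel.app",
    }

print(branch_links("example-org", "ForumMagnum", "master", "fix-draft-titles"))
```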
3) Design 20+ different design variations for nowinners.ai
Rob Bensinger wrote an essay a while ago compiling arguments and evidence related to pausing or substantially slowing down AI, and political buy-in for that. He thought it might be a good idea to put it up on its own website to make it easier to link to, but we really weren't sure what the best way to present the information in the essay was, or what vibe the website should have.
So I asked Claude Code to just make me 20+ variations trying out different designs and design principles:
Most of the designs were pretty bad, but I liked the direction one or two of them were going, and then I iterated from there to arrive at the current design (now live at nowinners.ai).
4) Review my LessWrong essays for factual accuracy and argue with me about their central thesis
I do not like using LLMs for editing my writing.[4] However, I don't mind having LLMs fact-check my posts, check whether anything in them seems egregiously wrong, or look for obvious unaddressed counter-arguments.
The hit rate on this is pretty low (on the order of 30% of objections or critiques are things I end up wanting to address), but the cost is also pretty low, so I do it pretty reliably before I publish a top-level post. Most recently this resulted in me updating my model about the exact relationship between color rendering index and light spectrograms.
The LessWrong editor exposes an API that lets Claude post and read inline comments on any draft, which makes this all a pretty ergonomic experience.
My prompt for requesting feedback
I'm writing a post on LessWrong.
The post is located at https://www.lesswrong.com/editPost?postId=<ID>&key=<sharingKey>.
Please remember to follow the guidelines and review structure in LessWrong's SKILL.md that I attached.
Please read the post and give me overall feedback on its thesis and fact-check any factual claims. I want you to think really hard about this. Ultrathink, if you must. I would like this post to be good!
5) Remove unnecessary clauses, sentences, parentheticals and random cruft from my LessWrong posts before publishing
My biggest writing weakness is that I am too wordy. While you can rip my epistemic qualifiers from my cold dead hands, probably, I sometimes grudgingly admit that the sentences I write have a certain kind of meandering quality to them, often going on for so long that by the time the reader has reached its end, the reader will have forgotten how it started.
That is the one editing task that I ask Claude to do for me.
This doesn't work flawlessly. In particular, Claude loves replacing my long meandering sentences with em-dash contraptions that, while not actually introducing any new turns of phrase or classical LLM-isms, still give me an LLM vibe. But mostly, Claude successfully identifies random clauses, places to start new sentences, and unnecessarily repetitive constructions, and removes them.
I again use the LessWrong editor API for this, which allows Claude to directly suggest changes on my drafts.
Most recent prompt example of asking Claude to cut things down
<In a chat that started with fact-checking and feedback on the overall thesis>
Can you use the suggested inline edit tool to help me cut things. I am often far too wordy, and a pass to fix that seems good
6) Pair vibe-coding
Everyone on the Lightcone team uses LLMs to drive their programming work. Unfortunately, as a result of that, pair programming has become much less prevalent across the organization. Pair programming previously served a crucial role: creating traction on ill-defined product tasks, getting people to think about the product from first principles, and providing a social context that made it easier to work instead of getting distracted by Twitter.
The problem with pair programming in an LLM world is that the cognition that would usually go into talking to your programming partner now goes into writing messages to your LLM. Also, in a world where people primarily work with LLMs on their code, they often juggle multiple tasks in parallel as they wait for the LLM to get back to them.
But recently we finally had some initial success with a pair vibecoding session. The basic setup was:
This really worked surprisingly well! The usual pattern, where someone disappears for minutes at a time to give feedback on an AI-proposed plan or to write a long instruction essay providing all the necessary context, was gone. Instead we simply talked about it, and Claude implemented things in the background.
7) Mass-creating 100+ variations of Suno songs using Claude Cowork desktop control
When I make a new Fooming Shoggoths song, I usually sample hundreds, sometimes thousands of song completions for a given broad concept to get something that has the right kind of vibe. Then, after I have something that seems promising or interesting, I iterate on it using Suno's "Cover" and "Persona" features until I have something I am happy with. In my most recent sprint for getting the second album ready for April 1st, I experimented with having Claude drive more of the generation process here.
I gave Claude the lyrics of all my previous songs and a list of concepts I wanted to explore for the new album, and had it take control of my Suno tab in Chrome to submit a huge cross product of styles and lyrics.
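The cross product itself is trivial to enumerate; the sketch below (with stand-in style and concept lists) shows the shape of the job list Claude then worked through in the browser:

```
# The "huge cross product" of styles and lyrics: every pairing becomes one
# Suno generation job. The lists here are stand-ins.
from itertools import product

styles = ["indie rock, The National vibes", "synthpop", "sea shanty"]
concepts = ["AI timelines", "mesa-optimizers", "prediction markets"]

jobs = [{"style": s, "concept": c} for s, c in product(styles, concepts)]
print(len(jobs), "generations to submit")
# 3 x 3 = 9 here; with ~10 styles and ~10 lyric variants you're past 100
```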
Then I listened to the first few seconds of a random sampling, gave Claude some high-level feedback, and had it generate more variations.
This did not produce anything that even remotely made it into the album, but it did end up making me think there was a particularly promising intersection of Indie-Rock, "The National" vibes, and a song about AI timelines, which resulted in "Friday's Far Enough For Milk", which is my third favorite song I've ever made.
8) Ask Claude to read a book about songwriting, then critique my lyrics
Since the last Fooming Shoggoths album focused on lyrics that try to capture specific vibes not covered in any other music, I was thinking a lot about songwriting. In my quest to get better at that, Buck pointed me towards his favorite book on songwriting: Song Building: Mastering Lyric Writing (SongTown Songwriting Series).
Of course, I didn't want to wait until I had read/skimmed the whole book before getting value out of it, so I just asked Claude to read it for me and use it to critique the lyrics I had written.
Claude took a total of 35 seconds to read the 100+ page book and apply it to my lyrics. Most of the feedback was terrible, because Claude is not that good at poetry or lyric writing. But some of it was quite good, and having Claude reference and apply the book's principles directly to my lyrics made me understand the book much better and faster than if I had skimmed it and tried to apply it myself.
9) Find things I keep explaining to people so often that I should make a blogpost about them
In addition to recording conversations in our team room and with my laptop microphone, Omnilog also captures the contents of my screen every 15 seconds (unless it detects I'm looking at something private), which means it has a remarkably complete record of what I've been doing.
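Omnilog itself is internal, but the screen-capture half is conceptually something like this sketch (using the mss library, with the privacy check stubbed out):

```
# Toy version of the screen-capture loop: one screenshot every 15 seconds,
# discarded when something private is on screen. The privacy check is a stub.
import os
import time
import mss

def looks_private(path: str) -> bool:
    """Stand-in for the real check on the captured content."""
    return False

with mss.mss() as sct:
    while True:
        path = sct.shot(output=f"capture-{int(time.time())}.png")
        if looks_private(path):
            os.remove(path)  # don't keep captures of private material
        time.sleep(15)
```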
So I asked ChatGPT to go through everything it has from me in the last 2 months and collate themes in things I've kept explaining over and over that I maybe should write a blogpost about, following Gwern's "rule of three" for blogposts.
It produced a total of 40 candidate blogposts[5]. Practically all of them were terrible. As it noticed that I had complained a few times out loud that our LessWrong drafts page shows drafts with no title and zero words in them, it suggested I write a post about that!
One of the 40 post titles it recommended seemed promising: "AIs Writing Like Claude Are Silencing My Alarm Bells". While the title is (of course) truly atrocious, I did find myself in the last few weeks pointing out a few times how LLM writing seems optimized to be non-offensive in a way that makes it dangerous for internal memos or high-stakes communication.
My guess is Claude writing is optimized to use ambiguity exactly in places where an absence of ambiguity might cause the reader to notice they disagree strongly with the content, which is the opposite of how I usually try to optimize my writing!
10) Build tiny interactive embeds for my LessWrong posts
My last two LessWrong posts featured interactive widgets that IMO did a much better job at explaining the core concepts I was trying to get across than any number of words could have done:
This was very fast, very straightforward, and IMO made those posts much better.
That's it. Hopefully these 10 concrete examples will be helpful to someone. Also feel free to post your own in the comments! My sense is people are currently undersharing creative uses of LLMs.
Yes, this is mildly concerning, but I assure you that the structural safety of Lighthaven is not impacted... probably
This does mean that this only happens when my laptop is running and has the Claude app open, which makes it not enormously reliable, but it's been reliable enough to get most of the value.
It's not perfect (in particular, the Omnilog pipeline has recently started messing up speaker identification, so transcripts have been more degraded), but it's still good enough to give you a sense of what you're missing.
I do of course use them extensively for research, and they have replaced more than 50% of my previous Google searches, which I consider so mundane as to not deserve a spot in this post.
It initially produced 10, but I prodded it to keep going multiple times.