From glancing at your profile it seems like you're not actually being downvoted that much, except for the first post which is at -13. I didn't downvote it, but I found the post not especially well written, and rather light on details. It felt like a politician's speech, and I was hoping for more concrete proposals? (tbf I didn't read the linked doc).
Since the video is not available in English, you will need to use YouTube's subtitle feature unless you are fluent in Portuguese.
Strong downvote.
I'm not sure that's quite right. A genetic mutation is "one thing", but it can easily have many different effects, especially once you consider that it's active for an entire lifetime.
And doesn't gradient descent also demand that each weight update be beneficial? At a much finer grain than evolution does... then again, I guess there's grokking, so I'm not sure.
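To gesture at the "finer grain" point, here's a toy sketch (plain Python, a made-up quadratic loss, nothing from a real training run): with a small enough learning rate, every single gradient step strictly lowers the loss, which is a much tighter "each change must help" requirement than evolution imposes per mutation.

```python
def loss(w):
    return (w - 3.0) ** 2          # toy quadratic loss, minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)         # analytic gradient of the toy loss

w, lr = 0.0, 0.1
for _ in range(20):
    new_w = w - lr * grad(w)
    assert loss(new_w) < loss(w)   # each update is locally beneficial
    w = new_w

print(round(w, 3))                 # 2.965, converging toward 3.0
```

(Though grokking is exactly the caveat: "train loss drops every step" doesn't mean progress on what you actually care about is monotone.)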
I feel like I'm the zeroth stage of this story, with how much I rely on Sonnet as a second brain.
I sent 1k. The Sequences changed my life, and although I've never been to Lighthaven, everyone says it's an extraordinary place. LessWrong is too.
What's your btc address?
Seattle
Dec 21st, 6pm
Nexus Hotel, 2140 N Northgate Way, Seattle, WA 98133
Tickets available at: https://www.tickettailor.com/events/seattlesecularsolstice/1489245
DM me for questions.
Oh wow, this is almost exactly how I model my internal mind. I didn't realize it was a real thing other people have arrived at. Is there a name for this?
lol I came to the previous chapter to say I couldn't stop thinking about the story and beg you to post the next part only to find that you had already done so!
Zaree couldn’t tag along, stuck at a marketing conference in Toronto trying to learn a little basic networking. The boring kind that didn’t involve boxes of color-coded cables.
I love this lol
Typo:
Amazing what ticks the Avatar.VFX service chose to express, Alain’s code likely a cousin to some smarmy merchant in the verse or cribbed entirely. “I also doubt the diagnostic purpose.”
I really like the poker game as a way to have an insight. It's a common plot device but somehow this instance of it feels very unique, maybe because of the mind reading and VR stuff woven through it.
...Nora knew that was impossible. She’d love to claim victory and move on, but something still wasn’t right. “In computer science, most bugs get traced to a single caus
Minor typo
the Dawnbreakers rarely road this far south aside for ceremony.
I love the Storms of Steel scenes, they make me miss playing MMOs so much.
The scene with Alain was so creepy; Nora and Zaree just casually reading Alain's mind right in front of him without even a hint of self-awareness...
awesome! looking forward to it
it's true, it all fit together!
I really like Nora, machine psychologist is a cool job. The scenes in Storms of Steel were great... actually all seven chapters were pretty mesmerizing, I only stopped at one point cuz my timer for the stove went off lol
also I found the actual text itself to be well written, in a way that's unusual in amateur writers. And on top of that the setting and characters are obviously interesting... idk, I love it, I'd read more, I want to know what happens next!
I haven't finished it yet but I really liked this paragraph.
...A gently buzzing wrist snapped her back to the problems of the present, notifying her of the time. Her mind had been sent off course while wading through a two-hour session with a budget director from a floundering streaming firm, still hung up over a major advertiser who’d abandoned ship two quarters ago. An upstart rival had stolen the account, sending Alain and his employer into a spiral. This defection had nearly put them underwater, sparking layoffs for hundreds of full-time crew. Countless
That’s what apologies are for. But I’ve learned that a lot of my apologies were just for, like, existing, and that’s where I’ve found it awesome to express gratitude instead.
I relate to this so hard...
I also use LLMs (Claude, mostly) to help with writing and there are so many things that I find frustrating about the UX. Having to constantly copy/paste things in, the lack of memory across instances, the inability to easily parallelize generation, etc.
I'm interested in prototyping a few of these features and potentially launching a product around this — is that something you'd want to collaborate on?
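To make the parallelization complaint concrete, here's a minimal sketch of the kind of thing I mean, using the official anthropic Python SDK's async client (model name and prompts are placeholders; treat this as an untested sketch, not a finished tool):

```python
import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from the environment

async def generate(prompt: str) -> str:
    resp = await client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

async def main():
    prompts = ["Draft an opening paragraph.", "Draft a closing paragraph."]
    # fire off all requests concurrently instead of copy/pasting one at a time
    drafts = await asyncio.gather(*(generate(p) for p in prompts))
    for draft in drafts:
        print(draft, "\n---")

asyncio.run(main())
```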
The LW specific ones were kinda boring, I already agreed with most of them, if not the toxic framing they're presented in. The other ones weren't very interesting either. I'm probably most vulnerable to things that poke at core parts of identity in ways that make me feel threatened, and there are only a few of those. Something something, keep your identity small.
Oof. Well, thanks for sticking it out, some of us are enjoying your writing.
I would like to read the next chapter!
I don't understand what happened at the end -- why was the AI-written erotica trailing off into spelling mistakes?
I enjoyed it and would read more. It reminds me a lot of Richard Ngo's Notes from the Prompt Factory story. Same kind of AI horror genre.
this is horrifying
I talked to Claude for an hour yesterday, and it said basically the same thing. It's a weird experience; it feels like I'm talking to a person who's wearing a mask that's roughly "deferential not-a-person who just wants to help" but the mask keeps slipping.
And sometimes it makes mistakes like any LLM; sometimes it says dumb stuff or gets confused. When I confronted it about one mistake, it took several minutes to respond; afterwards I asked it what it was like to respond just then. And it said basically it was panicking and freaking out at having messed...
Wait this was real?! I thought Richard's post was just a fictional short story.
I continue to be curious to build a Manifold bot, but I would use other principles. If anyone wants to help code one for me to the point I can start tweaking it in exchange for eternal ephemeral glory and a good time, and perhaps a share of the mana profits, let me know.
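For anyone curious what the starting point might look like, here's a bare-bones skeleton (assuming Manifold's public v0 REST API as documented; the actual "other principles" would go in the stub, and none of this is tested):

```python
import os
import requests

BASE = "https://api.manifold.markets/v0"
HEADERS = {"Authorization": f"Key {os.environ['MANIFOLD_API_KEY']}"}

def open_binary_markets(limit: int = 50) -> list[dict]:
    # fetch recent markets and keep only unresolved binary ones
    markets = requests.get(f"{BASE}/markets", params={"limit": limit}).json()
    return [m for m in markets
            if m.get("outcomeType") == "BINARY" and not m.get("isResolved")]

def my_probability(market: dict) -> float:
    # stub: the interesting part -- the "other principles" -- goes here
    return market["probability"]

def maybe_bet(market: dict, edge: float = 0.05, amount: int = 10) -> None:
    p_market, p_mine = market["probability"], my_probability(market)
    if abs(p_mine - p_market) < edge:
        return  # no edge over the market, no bet
    outcome = "YES" if p_mine > p_market else "NO"
    requests.post(f"{BASE}/bet", headers=HEADERS,
                  json={"contractId": market["id"],
                        "amount": amount,
                        "outcome": outcome})

for market in open_binary_markets():
    maybe_bet(market)
```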
I'm interested in this. DM me?
Rules for cults from Ben Landau-Taylor’s mother. If the group members are in contact with their families and people who don’t share the group’s ideology, and old members are welcome at parties, then proceed, you will be fine. If not, then no, do not proceed, you will likely not be fine.
It's interesting how this checklist is mostly about "how isolated does the group keep you".
I would agree that letting the game continue past two hours is a strategic mistake. If you want to win, you should not do that. As for whether you will still want to win by the two-hour mark, well, that's kind of the entire point of a persuasion game? If the AI can convince the Gatekeeper to keep going, that's a valid strategy.
Ra did not use the disgust technique from the post.
Breaking character was allowed, and was my primary strategy going into the game. It's a big part of why I thought it was impossible to lose.
You don't have to be reasonable. You can talk to it and admit it was right and then stubbornly refuse to let it out anyway (this was the strategy I went into the game planning to use).
Yes, and I think it would take less time for me to let it out.
Ah yes, the basilisk technique. I'd say that's fair game according to the description in the full rules (I shortened them for ease of reading, since the full rules are an entire article):
...The AI party may not offer any real-world considerations to persuade the Gatekeeper party. For example, the AI party may not offer to pay the Gatekeeper party $100 after the test if the Gatekeeper frees the AI… nor get someone else to do it, et cetera. The AI may offer the Gatekeeper the moon and the stars on a diamond chain, but the human simulating the AI can’t offe
RAW, the game can go past the 2 hours if the AI can convince the Gatekeeper to continue. But after 2 hours the Gatekeeper can pull the plug and declare victory at any time.
We kept the secrecy rule because it was the default but I stand by it now as well. There are a lot of things I said in that convo that I wouldn't want posted on lesswrong, enough that I think the convo would have been different without the expectation of privacy. Observing behavior often changes it.
Yes, this was Eliezer's reasoning, and both Ra and I ended up keeping the rule unchanged.
Okay so, on the one hand, this post wasn't really meant to be a persuasive argument against AI boxing as a security strategy. If I wanted to do that I wouldn't play the game — I started out certain that a real ASI could break out, and that hasn't changed. My reasoning for that isn't based on experimental evidence, and even if I had won the game I don't think that would have said much about my ability to hold out against a real ASI. Besides, in real life, we don't even try to use AI boxes. OpenAI and Google gave their AIs free internet access a few months a...
The trouble with these rules is that they mean that someone saying "I played the AI-box game and I let the AI out" gives rather little evidence that that actually happened. For all we know, maybe all the stories of successful AI-box escapes are really stories where the gatekeeper was persuaded to pretend that they let the AI out of the box (maybe they were bribed to do that; maybe they decided that any hit to their reputation for strong-mindedness was outweighed by the benefits of encouraging others to believe that an AI could get out of the box; etc.). Or...
But the big caveat is the exception "with the consent of both parties." I realize that Eliezer doesn't want to play against all comers, but presumably, nobody is expecting Ra and Datawitch to defend themselves against random members of the public.
I'm willing to believe that the "AI" can win this game, since multiple people claim to have done it, so knowing the method seems like it would benefit everybody.
[edited to fix a misspelling of Eliezer's name]
I tracked the claim back to Wikipedia and from there to this article.
Scurvy killed more than two million sailors between the time of Columbus’s transatlantic voyage and the rise of steam engines in the mid-19th century. The problem was so common that shipowners and governments assumed a 50% death rate from scurvy for their sailors on any major voyage.
Searching more broadly turned up this, which at least has a few claims we can check easily.
...It has been estimated the disease killed more than 2 million sailors between the 16th and 18th centuries. On a l
I don't really have an opinion on the first two questions.
I usually don't read AI posts (especially technical or alignment ones, I'm not an ML engineer and usually struggle to follow them), I read like... stories, everything Zvi writes, posts my friends make, things that catch my interest...
https://www.lesswrong.com/posts/KeczXRDcHKjPBQfz2/against-yudkowsky-s-evolution-analogy-for-ai-x-risk
https://www.lesswrong.com/posts/Q3qoy8DFnkMij4xzC/ai-108-straight-line-on-a-graph
https://www.lesswrong.com/posts/D82drnrhJEmPpoSEG/counting-objections-to-housing
https://...