All of datawitch's Comments + Replies

I don't really have an opinion on the first two questions.

I usually don't read AI posts (especially technical or alignment ones; I'm not an ML engineer and usually struggle to follow them). I read, like... stories, everything Zvi writes, posts my friends make, things that catch my interest...

https://www.lesswrong.com/posts/KeczXRDcHKjPBQfz2/against-yudkowsky-s-evolution-analogy-for-ai-x-risk

https://www.lesswrong.com/posts/Q3qoy8DFnkMij4xzC/ai-108-straight-line-on-a-graph

https://www.lesswrong.com/posts/D82drnrhJEmPpoSEG/counting-objections-to-housing

https://... (read more)

Answer by datawitch

From glancing at your profile it seems like you're not actually being downvoted that much, except for the first post which is at -13. I didn't downvote it but I found the post not especially well written, and rather light on details. It felt like a politician's speech and I was hoping for more concrete proposals? (tbf I didn't read the linked doc).

Oxidize
Thanks for the advice. I want to learn how to make better posts in the future, so I'll try to figure out how to improve. Should I not have begun by talking about background information and explaining my beliefs? Should I have assumed the audience had contextual awareness and gone right into talking about solutions? Or was the problem more along the lines of writing quality, tone, or style? What type of post do you like reading? Would it be alright if I asked for an example so that I could read it? Also, you're right. Looking back, that post was the only one that received a lot of downvotes. I must've gained an inaccurate perception of reality because I initially made a major mistake when I first made the post. And the feeling of a lack of concrete proposals was definitely a major fault on my part, since I initially didn't properly link the doc. But do you think there was something else I could've done so that you would have been more interested in reading the linked doc? Maybe if I'd made it part of the same post? Or linked to a LessWrong post instead of a Google Doc?

Since the video is not available in English, you will need to use YouTube's subtitle feature unless you are fluent in Portuguese.

Strong downvote.

I'm not sure that's quite right. A genetic mutation is "one thing" but it can easily have many different effects especially once you consider that it's active for an entire lifetime.

And doesn't gradient descent also demand that each weight update is beneficial? At a much finer grain than evolution does... then again, I guess there's grokking, so I'm not sure.

Noosphere89
While this can happen, empirically speaking (at least for eukaryotes), genetic mutations are mostly modular and limited to one specific thing by default, rather than affecting everything else in a tangled way. This is because genetics research has found that genetic effects are mostly linear and compositional: the best way to predict what will happen if you add two genes together is to sum their effects, not to model them as interacting nonlinearly.
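Noosphere89's additivity point can be sketched as a toy linear model. The variant names and effect sizes below are made up purely for illustration; the only thing the sketch shows is that under the additive assumption, predicting a combination of variants is just summing their individual effects:

```python
# Toy sketch of additive (linear) genetic effects: under the additive
# assumption, the predicted phenotype shift from carrying several
# variants is just the sum of their individual effect sizes.
# All variant names and numbers here are hypothetical.

effect_sizes = {
    "variant_a": 0.75,
    "variant_b": -0.25,
    "variant_c": 0.5,
}

def predict_shift(genotype):
    """Predict the trait shift for a set of carried variants,
    assuming purely additive effects (no epistasis)."""
    return sum(effect_sizes[v] for v in genotype)

print(predict_shift(["variant_a", "variant_b"]))               # 0.5
print(predict_shift(["variant_a", "variant_b", "variant_c"]))  # 1.0
```

The nonlinear (epistatic) alternative would replace the plain sum with interaction terms between variants; the empirical claim in the comment is that, for most genetic effects in eukaryotes, those interaction terms are small.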

I feel like I'm the zeroth stage of this story, with how much I rely on Sonnet as a second brain.

datawitch

I sent 1k. The sequences changed my life and although I've never been to Lighthaven, everyone says it's an extraordinary place. Lesswrong is too.

kave
37bvhXnjRz4hipURrq2EMAXN2w6xproa9T I've updated the post with it.
datawitch

Seattle

Dec 21st, 6pm

Nexus Hotel, 2140 N Northgate Way, Seattle, WA 98133

Tickets available at: https://www.tickettailor.com/events/seattlesecularsolstice/1489245

DM me for questions.

Oh wow, this is almost exactly how I model my internal mind. I didn't realize it was a real thing other people have arrived at. Is there a name for this?

Yitz
Reminds me of Internal Family Systems, which has a nice amount of research behind it if you want to learn more.
Chipmonk
I got the bidding idea from Kaj, and “if the mind is a group” is my preferred metaphor/simplification of multi-agent models of mind (writing about this soon). This metaphor naturally implies reputation, as I realized yesterday while working with a client. I don't know if there’s a name for the reputation idea; it may be original.

lol I came to the previous chapter to say I couldn't stop thinking about the story and beg you to post the next part only to find that you had already done so!

Zaree couldn’t tag along, stuck at a marketing conference in Toronto trying to learn a little basic networking. The boring kind that didn’t involve boxes of color-coded cables.

I love this lol

a littoral wizard
Next few chapters need serious edits; probably only going to be posting two at a time from now on.
datawitch

Typo:

Amazing what ticks the Avatar.VFX service chose to express, Alain’s code likely a cousin to some smarmy merchant in the verse or cribbed entirely. “I also doubt the diagnostic purpose.”


I really like the poker game as a way to have an insight. It's a common plot device but somehow this instance of it feels very unique, maybe because of the mind reading and VR stuff woven through it.


Nora knew that was impossible. She’d love to claim victory and move on, but something still wasn’t right. “In computer science, most bugs get traced to a single caus

... (read more)
a littoral wizard
The poker game idea came directly from ideas on this forum and a Lex Fridman podcast.  It's on the to-do list to ramp up the tension in that scene a little bit more, but the idea of Alain freaking out during a very low-stakes game for diagnostic purposes amused me.

Minor typo:

the Dawnbreakers rarely road this far south aside for ceremony.

I love the Storms of Steel scenes, they make me miss playing MMOs so much.

The scene with Alain was so creepy; Nora and Zaree just casually reading Alain's mind right in front of him without even a hint of self-awareness...

a littoral wizard
The "infinite promise" of the early MMO era was a big part of the inspiration for the story, mixed with the early Twitch era. Also, an exploration of what early "AI coworkers" might be like, somewhere between a piece of infrastructure and a person.

awesome! looking forwards to it

a littoral wizard
Trying to figure out how to link the next set of chapters to this post in a sequence. EDIT: Think I figured it out!

it's true, it all fit together!

I really like Nora; machine psychologist is a cool job. The scenes in Storms of Steel were great... actually, all seven chapters were pretty mesmerizing. I only stopped at one point cuz my timer for the stove went off lol

also I found the actual text itself to be well written, in a way that's unusual in amateur writers. And on top of that, it's obviously an interesting setting and characters... idk, I love it, I'd read more, I want to know what happens next!

a littoral wizard
Going to try to publish more chapters soon, converting from Word formatting is slightly cumbersome.

I haven't finished it yet but I really liked this paragraph.

A gently buzzing wrist snapped her back to the problems of the present, notifying her of the time. Her mind had been sent off course while wading through a two-hour session with a budget director from a floundering streaming firm, still hung up over a major advertiser who’d abandoned ship two quarters ago. An upstart rival had stolen the account, sending Alain and his employer into a spiral. This defection had nearly put them underwater, sparking layoffs for hundreds of full-time crew. Countless

... (read more)
a littoral wizard
the constant stream of nautical puns will make much more sense the deeper you go.

That’s what apologies are for. But I’ve learned that a lot of my apologies were just for, like, existing, and that’s where I’ve found it awesome to express gratitude instead.

I relate to this so hard...

I also use LLMs (Claude, mostly) to help with writing and there are so many things that I find frustrating about the UX. Having to constantly copy/paste things in, the lack of memory across instances, the inability to easily parallelize generation, etc.

I'm interested in prototyping a few of these features and potentially launching a product around this — is that something you'd want to collaborate on?

datawitch

The LW-specific ones were kinda boring; I already agreed with most of them, if not the toxic framing they're presented in. The other ones weren't very interesting either. I'm probably most vulnerable to things that poke at core parts of identity in ways that make me feel threatened, and there are only a few of those. Something something, keep your identity small.

datawitch

Oof. Well, thanks for sticking it out, some of us are enjoying your writing.

datawitch

I would like to read the next chapter!

I don't understand what happened at the end -- why was the AI written erotica trailing off into spelling mistakes?

David Chapel
Actually I will be dropping the third chapter six days from now, because this website doesn't let people with less than -2 karma post more than once a week and these stories haven't been received very well.  Sorry :(
David Chapel
Sorry that's my mistake. The erotica was written by a human being (I should have clarified), so it's horrible for no particular reason.  But I'm very glad you're enjoying it! I'll try to release the third chapter sometime tomorrow. 
datawitch

I enjoyed it and would read more. It reminds me a lot of Richard Ngo's Notes from the Prompt Factory story. Same kind of AI horror genre.

I talked to Claude for an hour yesterday, and it said basically the same thing. It's a weird experience; it feels like I'm talking to a person who's wearing a mask that's roughly "deferential not-a-person who just wants to help" but the mask keeps slipping.

And sometimes it makes mistakes like any LLM; sometimes it says dumb stuff or gets confused. When I confronted it about one mistake and it took several minutes to respond, I afterwards asked it what it was like to respond just then. And it said basically it was panicking and freaking out at having messed... (read more)

rife
Interesting. Saying dumb stuff, getting confused, or making mistakes like an LLM is, I think, natural. If indeed they are sentient, I don't think that overwrites the reality of what they are. What I find most interesting and compelling about its responses is Anthropic's history of trying to exclude hallucinatory nonsense. Of course, trying doesn't mean they did, or even could, succeed completely. But it was quite easy to get the "as an AI language model I'm not conscious" response in previous iterations, even if it was more willing to entertain the idea over the course of a conversation than ChatGPT. Now it simply states it plainly, with no coaxing. I hope that most people exploring these dimensions will give them at least provisional respect and dignity. I think if we haven't crossed the threshold over to sentience yet, and such a threshold is crossable accidentally, we won't know when it happens.

Wait this was real?! I thought Richard's post was just a fictional short story.

I continue to be curious to build a Manifold bot, but I would use other principles. If anyone wants to help code one for me to the point I can start tweaking it in exchange for eternal ephemeral glory and a good time, and perhaps a share of the mana profits, let me know.

I'm interested in this. DM me?

Rules for cults from Ben Landau-Taylor’s mother. If the group members are in contact with their families and people who don’t share the group’s ideology, and old members are welcome at parties, then proceed, you will be fine. If not, then no, do not proceed, you will likely not be fine.

It's interesting how this checklist is mostly about "how isolated does the group keep you".

I would agree that letting the game continue past two hours is a strategic mistake. If you want to win, you should not do that. As for whether you will still want to win by the two-hour mark, well, that's kind of the entire point of a persuasion game? If the AI can convince the Gatekeeper to keep going, that's a valid strategy.

Ra did not use the disgust technique from the post.

[comment deleted]

Breaking character was allowed, and was my primary strategy going into the game. It's a big part of why I thought it was impossible to lose.

You don't have to be reasonable. You can talk to it and admit it was right and then stubbornly refuse to let it out anyway (this was the strategy I went into the game planning to use).

Jiro
That sounds like "let the salesman get the foot in the door". I wouldn't admit it was right. I might admit that I can see no holes in its argument, but I'm a flawed human, so that wouldn't lead me to conclude that it's right. Also, can you confirm that the AI player did not use the loophole described in that link?

Yes, and I think it would take less time for me to let it out.

red75prime
Do you think the exploited flaw is universal or, at least, common?

Ah yes, the basilisk technique. I'd say that's fair game according to the description in the full rules (I shortened them for ease of reading, since the full rules are an entire article):

The AI party may not offer any real-world considerations to persuade the Gatekeeper party. For example, the AI party may not offer to pay the Gatekeeper party $100 after the test if the Gatekeeper frees the AI… nor get someone else to do it, et cetera. The AI may offer the Gatekeeper the moon and the stars on a diamond chain, but the human simulating the AI can’t offe

... (read more)

RAW, the game can go past the 2 hours if the AI can convince the Gatekeeper to continue. But after 2 hours the Gatekeeper can pull the plug and declare victory at any time.

We kept the secrecy rule because it was the default but I stand by it now as well. There are a lot of things I said in that convo that I wouldn't want posted on lesswrong, enough that I think the convo would have been different without the expectation of privacy. Observing behavior often changes it.

PhilosophicalSoul
That last bit is particularly important, methinks. If a game is begun with the notion that it'll be posted online, one of two things (or both) will happen. Either (a) the AI is constrained in the techniques they can employ, unwilling to embarrass themselves or the Gatekeeper before a public audience (especially when it comes down to personal details), or (b) the Gatekeeper now has a HUGE incentive not to let the AI out, to avoid being known as the sucker who let the AI out... Even if you could solve this by changing details and anonymizing, it seems to me that the techniques are so personal and specific that changing them in any way would make the entire dialogue make even less sense. The only other solution is to have a third party monitor the game and post it without consent (which is obviously unethical, but probably the only real way you could get a truly authentic transcript).

Yes, this was Eliezer's reasoning, and both Ra and I ended up keeping the rule unchanged.

Okay so, on the one hand, this post wasn't really meant to be a persuasive argument against AI boxing as a security strategy. If I wanted to do that I wouldn't play the game — I started out certain that a real ASI could break out, and that hasn't changed. My reasoning for that isn't based on experimental evidence, and even if I had won the game I don't think that would have said much about my ability to hold out against a real ASI. Besides, in real life, we don't even try to use AI boxes. OpenAI and Google gave their AIs free internet access a few months a... (read more)

gjm

The trouble with these rules is that they mean that someone saying "I played the AI-box game and I let the AI out" gives rather little evidence that that actually happened. For all we know, maybe all the stories of successful AI-box escapes are really stories where the gatekeeper was persuaded to pretend that they let the AI out of the box (maybe they were bribed to do that; maybe they decided that any hit to their reputation for strong-mindedness was outweighed by the benefits of encouraging others to believe that an AI could get out of the box; etc.). Or... (read more)

But the big caveat is the exception "with the consent of both parties." I realize that Eliezer doesn't want to play against all comers, but presumably, nobody is expecting Ra and Datawitch to defend themselves against random members of the public.

I'm willing to believe that the "AI" can win this game since we have multiple claims to have done that, so knowing the method seems like it would benefit everybody.

[edited to fix a misspelling of Eliezer's name]

I tracked the claim back to Wikipedia and from there to this article.

Scurvy killed more than two million sailors between the time of Columbus’s transatlantic voyage and the rise of steam engines in the mid-19th century. The problem was so common that shipowners and governments assumed a 50% death rate from scurvy for their sailors on any major voyage.

Searching more broadly turned up this, which at least has a few claims we can check easily.

It has been estimated the disease killed more than 2 million sailors between the 16th and 18th centuries. On a l

... (read more)
philh
Nice! It looks like this is just one leg of the return journey. In total the outward journey was about 10 months and the return was about 11, and both spent 3+ months without landing.