All of Bridgett Kay's Comments + Replies

Alignment Paradox and a Request for Harsh Criticism

"Future progress is a part of current human values," of course, but the danger lies in the "future" always being just that: the future. One would hope it wouldn't go this way, but continually putting off the future because now is always the present is a possible outcome. Even with current models, it can be a struggle to get them to generate novel ideas, because they stubbornly refuse to say anything for which there is not yet evidence.

Thank you for that criticism. I hadn't given that point enough thought, and I think I am starting to see where the weaknesses are.

Alignment Paradox and a Request for Harsh Criticism

Bridgett Kay1mo10

Yeah, calling myself a failed scifi writer really was half in jest. I had some very limited success as an indie writer for a good number of years, and recently need has made me shift direction. Thank you for the encouragement, though!

Alignment Paradox and a Request for Harsh Criticism

Bridgett Kay1mo30

"If your conclusion is that we don't know how to do value alignment, I and I think most alignment thinkers would agree with you. If the conclusion is that AGI is useless, I don't think it is at all."

Sort of- I worry that it may be practically impossible for current humans to align AGI to the point of usefulness.

"If we had external help that allowed us to focus more on what we truly want (like eliminating premature death from cancer or accidents, or accelerating technological progress for creative and meaningful projects), we’d arrive at a very different futur...

3Seth Herd1mo

I agree with everything you've said there. The bigger question is whether we will achieve usefully aligned AGI. And the biggest question is what we can do. Ease your mind! Worries will not help. Enjoy the sunshine and the civilization while we have it, don't take it all on your shoulders, and just do something to help! As Sarah Connor said: NO FATE. We are not in her unfortunately singular shoes. It does not rest on our shoulders alone. As most heroes in history have, we can gather allies and enjoy the camaraderie and each day. On a different topic, I wish you wouldn't call yourself a failed scifi author or a failed anything. I hope it's in jest or excessive modesty. Failure is only when you give up on everything or are dead. I think there is much to accomplish in writing good fiction. It doesn't have to be perfect. Changing directions isn't failing either; it's changing strategy, hopefully as a result of learning.

Don’t ignore bad vibes you get from people

Bridgett Kay1mo10

Does anyone have any generally helpful advice for someone who doesn't really get vibes? Should I just continue to be more timid than normal, or is there a helpful heuristic I can use (aside from the "don't talk to strangers, don't join MLMs, be wary of things that are too good to be true" advice our parents give us at a young age)?

7Said Achmiz1mo

As with any expertise, the standard heuristic is “if you can’t do it in-house, outsource it”. In this case, that means “if you have a trusted friend who does ‘get vibes’, consult with them when in doubt (or even when not in doubt, for the avoidance thereof)”. Of course, the other standard heuristic is “it takes expertise to recognize expertise”, so finding a sufficiently trusted friend to consult on such things may be difficult, if you do not already have any such. Likewise, principal-agent problems apply (although sufficiently close friends should be as close to perfect alignment with the principal as any agent can realistically get).

shminux's Shortform

Bridgett Kay5mo81

1991/1992, actually (Harry Potter was born July 1980, and the story takes place the school year after his 11th birthday.)

Yoav Ravid's Shortform

Bridgett Kay5mo10

Seems to me that the only winning move is not to play.

This is already your second chance

Bridgett Kay7mo32

This might be our third, fourth, fifth... nth chance.

Meta Alignment: Communication Wack-a-Mole

Bridgett Kay8mo10

Thank you.

Bridgett Kay10mo10

This is legitimate- the definition of weirdness was kept open-ended. I intended weirdness to be any behavior that is divergent from what most in a certain group considers to be the status quo, but even within a group, each member may have a different definition of what weird behavior is, and a consensus will be difficult to pin down.

I would consider rudeness to be weird behavior under this definition. It is a social behavior that comes with the cost of disrupting social cohesion. What is considered rude, vs. frank and straightforward, will vary from ...

[April Fools' Day] Introducing Open Asteroid Impact

Bridgett Kay1y329

We don't know how to align asteroids' trajectories, so it's important to use smaller asteroids to align larger ones- like a very large game of amateur billiards.

LessWrong's (first) album: I Have Been A Good Bing

Bridgett Kay1y129

I love this! But I find myself a little disappointed there's not a musical rendition of the "I have been a good bing" dialogue.

Can we get an AI to "do our alignment homework for us"?

Answer by Bridgett KayFeb 28, 202410

As one scales up a system, any small misalignment within that system will become more apparent- more skewed. I use shooting an arrow as an example. Say you shoot an arrow at a target from only a few feet away. If you are only a few degrees off from being lined up with the bullseye, when you shoot the close target your arrow will land very close to the bullseye. However, if you shoot a target many yards away with the same degree of error, your arrow will land much, much farther from the bullseye.
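The arrow example can be put in numbers with a minimal sketch. It assumes the arrow flies in a straight line (gravity and drag are ignored), so the miss at the target is the range times the tangent of the aiming error. The function name `miss_distance` and the sample ranges are illustrative assumptions, not part of the original comment.

```python
import math


def miss_distance(range_to_target: float, angle_error_deg: float) -> float:
    """Lateral miss at the target for a straight-line shot.

    Assumption: the arrow travels in a straight line, so the miss is
    range * tan(aiming error). Units of the result match the range.
    """
    return range_to_target * math.tan(math.radians(angle_error_deg))


# Same 2-degree aiming error at two hypothetical ranges (same units).
near = miss_distance(3.0, 2.0)
far = miss_distance(90.0, 2.0)

print(f"miss at range 3:  {near:.3f}")
print(f"miss at range 90: {far:.3f}")
```

Under this assumption the miss grows in direct proportion to the range: thirty times the distance gives thirty times the miss for the same angular error.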

So if you get a less powerful AI aligned with your goals...

My Weirdest Experience

Bridgett Kay2y*40

That seems fairly consistent with what happened to me. I did not experience my entire life in the dream- just the swim meet and the aftermath, and my memories were things I just summoned in the moment, like just coming up with small pieces of a story in real time. The thing that disturbed me the most wasn't living another life- though that was disturbing enough- but the fact that a character in the dream knew a truth that "I" did not.

3[anonymous]2y

Who knows, maybe it was your right hemisphere. Shout-outs to them, if so. Almost definitely the first time someone has directly referred to them, that's got to be very exciting. Even if you are not literally their right hemisphere (not like you would know of course), but if you are there and if you have access to high-level knowledge of the world: hi, good job all of these years!

My Weirdest Experience

Bridgett Kay2y10

I have a similar trick I use with pirouettes- if I can turn and turn without stopping, then it is a dream. Of course, in this dream, I was not a dancer and had never danced, so I didn't even think of it.

Ways I Expect AI Regulation To Increase Extinction Risk

Bridgett Kay2y64

Lately I've been appreciating, more and more, something I'm starting to call "Meta-Alignment." Like, with everything that touches AI, we have to make sure that thing is aligned just enough to where it won't mess up or "misalign" the alignment project. For example, we need to be careful about the discourse surrounding alignment, because we might give the wrong idea to people who will vote on policy or work on AI/AI adjacent fields themselves. Or policy needs to be carefully aligned, so it doesn't create misaligned incentives that mess up the alignment project; the same goes for policies in companies that work with AI. This is probably a statement of the obvious, but it is really a daunting prospect the more I think about it.

The LessWrong 2019 Review

Bridgett Kay4y30

I was just wondering, on the subject of research debt, if there was any sort of system so that people could "adopt" the posts of others. Say someone posts an interesting idea that they don't have the time to polish or expand upon: they could post it somewhere for people who can.

4Ruby4y

I think a good option here is to take the core idea of the post and make its own wiki page for it (we hope to shortly make wiki-page creation straightforward, for now it's fine to treat tag pages as wikis even when you don't want the tags). This might be unconventional in the sense that wikis generally are more for "established" facts, but I think a wiki where people are fleshing out thoughts would be cool and good, definitely worth people trying.

Raemon4y110

There isn't a formal system, but in general people are free to write new distillations of old posts.

My Weirdest Experience

Bridgett Kay4y20

Yeah- the experience really shook me. I'm prone to fairly vivid and interesting dreams, but this was definitely the strangest.

Null-boxing Newcomb’s Problem

Bridgett Kay5y80

But this was the final trick, for as soon as Maxwell accepted the two million dollars, the simulation ended.

Seeing the Matrix, Switching Abstractions, and Missing Moods

Bridgett Kay6y30

How would you compare this technique to a more standard mindfulness practice?

7Raemon6y

I’d call this more of an ability I randomly got one day than a technique, and it seems plausible that mindfulness practice is a path to more reliably gain the ability.

Open Thread January 2019

Bridgett Kay6y10

Well, I'm setting up a SETI style project looking for extra-temporal info... in other words looking for time travelers. I did an initial set of experiments which were poorly planned out and riddled with paradox, but I've redesigned the experiments and will be starting them soon.

Open Thread January 2019

Bridgett Kay6y10

I see. Just running with the premise as it stood.

Open Thread January 2019

Bridgett Kay6y40

Do you think it is more likely that r&d will simply cease rather than there being fewer and fewer returns from r&d over time, causing companies to put more money into it to stay competitive? I wonder if the situation might not cause the prices to actually go up, like with medication.

2avturchin6y

I don't think that r&d will cease. My argument was in the style of "if A then B", but I don't think that A is true. I am arguing here against those who associate the end of Moore's law with the end of growth of computational power.

Open Thread January 2019

Bridgett Kay6y170

I've been lurking for a while but haven't posted very much. I'm a writer who also enjoys doing weird experiments in my spare time. Hi there :)

3Pee Doom6y

What weird experiments?

3Raemon6y

Welcome!

Is Science Slowing Down?

Bridgett Kay6y30

I'm also partial to the low-hanging fruit explanation. Unfortunately, it seems to me we can really only examine progress in already established fields. It is much harder to tell whether there is much left to discover outside of established fields: the opportunities to make big discoveries that establish whole new fields of study. This is where the undiscovered, low-hanging fruit would be, I think.

Double-Dipping in Dunning--Kruger

Bridgett Kay6y40

This is probably good general advice, but it's a different matter when there is evidence that points to being an actual imposter. For example, when I write novels that do not sell, or blog posts that get downvoted to oblivion, it is difficult to get honest feedback as to how I might improve my writing. The feedback I get is almost always positive, but reviews are self-selected because people rarely are motivated to review something unless they especially like it. Plus, politeness prohibits people from being harsh when you ask for feedback. For these reas...

Hero Licensing

Bridgett Kay7y60

There's one person who always has the authority to say you can't try: yourself. Some people have a harder time than others when it comes to ignoring the discouragement of Pat and Maude.