All of Perhaps's Comments + Replies

There are some models on HuggingFace that do automatic PII redaction; I've been working on a project that uses them to automate redaction for documents. AI4privacy's models and Microsoft Presidio have been helpful.
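In case it helps anyone doing something similar, here's a minimal sketch of the Presidio part, assuming `presidio-analyzer` and `presidio-anonymizer` are installed (plus a spaCy English model, which Presidio's default analyzer uses). The sample text is made up, and the AI4privacy models would be loaded separately through HuggingFace:

```python
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()      # detects PII spans (names, emails, phone numbers, ...)
anonymizer = AnonymizerEngine()  # replaces detected spans with placeholders

text = "Contact Jane Doe at jane.doe@example.com or +1-555-010-9999."
results = analyzer.analyze(text=text, language="en")
redacted = anonymizer.anonymize(text=text, analyzer_results=results)

print(redacted.text)  # e.g. "Contact <PERSON> at <EMAIL_ADDRESS> or <PHONE_NUMBER>."
```

By default the placeholders are just the entity types; Presidio also accepts custom operators (masking, hashing, etc.) if plain placeholders aren't enough.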

You might find some puzzle games to be useful. In particular, Understand is a game that was talked about on here as being good for learning how to test hypotheses and empirically deduce patterns, similar to your Baba Is You experiments.

Any way that we can easily get back our own results from the survey? I know you can sometimes get a copy of your responses when you submit a Google form.

7Screwtape
Not easily. In order to give people a copy of their responses with google forms, I would need to collect emails. It even becomes a required question on the form. Collecting emails changes the tenor of the survey quite a bit I think, even if I invited people to enter nonsense for an email if they didn't want to give that information.

What happens to the general Lightcone portfolio if you don't meet a fundraising target, either this year or a future year?

For concreteness, say you miss the $1M target by $200K. 

The karma buttons are too small for actions that, in my experience, are done a lot more often than clicking to listen to the post. It's pretty easy to misclick.

Additionally, it's unclear what the tags are, as they're no longer right beside the post to indicate their relevance. 

6habryka
The big vote buttons are at the bottom of the post, where I would prefer more of the voting to happen (I am mildly happy to discourage voting at the top of the post before you read it, though I am not confident).

I think this post would benefit from an abstract / summary / general conclusion that summarizes the main points and makes it easier to interact with. Usually I read a summary to get an idea of a post, then browse the main points and see if I'm interested enough to read on. Here, it's hard to engage, because the writing is long and the questions it seems to deal with are nebulous.

How did you find LessWrong?

Do you still have any Mormon friends? Do you want to help them break away, do you think it's something they should do on their own, or do you find it immaterial whether they remain Mormon or not?

Do you think being a Mormon was not suited for you, or do you think it doesn't work as a way of life in general? How do you think that your answer would change 50 years ago vs today?

Did you have contact/ongoing relationships with other Mormon communities while you were there? What is the variation between people/communities? How devout/lax are d...

7ErioirE
1. I found LessWrong via HPMoR. I found HPMoR because it was mentioned offhand in an unrelated article as an example of a piece of fan-fiction that was actually good... and it was.
2. Many of my friends and most of my immediate and extended family are Mormon. I'll happily discuss my opinions with any who seem like they want help/are searching for answers, but I'm not in the habit of proactively questioning their beliefs, since bringing up info directly critical of the Church tends to make True Believers instantly defensive and suspicious. Social 'immune systems' are incredible in how they defend existing beliefs. I do try to indirectly raise the sanity waterline in the hope that others will find their own way out, of their own volition. I'm more or less resigned to leaving those who are content alone; it seems quite futile to try to rescue someone who doesn't want to be rescued, and they might not appreciate it even if the 'brute force logic' solution did manage to convince them.
3. Anything that can be destroyed by the truth should be. Fifty years ago it would have been more difficult to find my way out, because the internet didn't exist.
4. Even between households there's a lot of variation in how certain rules are interpreted. My family was more on the extreme end of the Sabbath Observance spectrum, while most other kids I knew were allowed to play video games and generally do what they wanted on Sunday. It was very difficult to grow up dreading every Sunday, and even though I wasn't all that 'rebellious' of a teenager, necessity caused me to find a quiet spot to read sci-fi/fantasy novels or play video games when my parents weren't looking. Some other things my parents didn't allow:
* No PG-13 movies, mostly because anything remotely sexually suggestive was considered 'inappropriate'. This was essentially enforced only at home. They acknowledged there was nothing stopping us from watching whatever movies at our friends' houses and they wouldn't try t

Well, someone was working on a similar-ish project recently, @Bruce Lewis with HowTruthful. Maybe you two can combine your ideas or settle on an amalgamation together. 

If possible, please let us know how it goes a couple months from now!

1Bruce Lewis
The best path forward might be for @DPiepgrass to make a prototype or mockup, borrowing ideas from HowTruthful and then discussing from there.

So this is Sam Altman raising the $5-7 trillion, not OpenAI as an entity, right?

6Carl Feynman
Right.  He’s raising the money to put into a new entity.

Could some kind of caustic gas, or the equivalent of a sandstorm, be used to make drones not useful? I feel like large-scale pellet spreads wouldn't be too useful if the drones are armoured, but I don't know too much about armour or how much piercing power you could get. I wonder if some kind of electric netting could be fired to mass-electrocute a swarm, or maybe just regular netting that interferes with their blades. Spiderwebs from the sky?

Interesting post, although I feel like it would benefit from inline references. For most of the post it feels like you're pulling your assertions out of nowhere, and only at the end do we get some links to some of the things you said. I understand time/effort constraints though.

4RussellThor
Thanks - yes, I was somewhat pulling my assertions out of nowhere; it was more an invitation to the reader to think about physical limits and question the current situation than a claim that I knew the details of where it would all lead. If the articles I linked to did not exist yet, I would still be writing a similar article and claiming they soon would. Specifically for netting, it is already used on choke points (trenches for both sides in Ukraine) - however, couldn't the cheapest suicide drone explode against the netting to then let others through?

I derive a lot of enjoyment from these posts; just walking through tidbits of materials science is very interesting. Please keep making them.

I think at its most interesting it looks like encrypting your actions and thought processes so that they look like noise or chaos to outside observers.

1nim
I think that's definitely an aspect of the interesting side: effective encryption relies on deep understanding of how well the opponent can break the encryption. It needs to be strong enough to seem certain it won't be broken in a reasonable timeframe, but that balances against being fast enough to encrypt/decrypt so it's practical to use. The encryption metaphor also highlights a side of rationality as rendering one's thoughts and actions maximally legible to observers, which strikes me as being true in some ways and interestingly limited in others.

I would say value preservation and alignment of the human population. I think these are the hardest problems the human race faces, and the ones that would make the biggest difference if solved. You're right, humanity is great at developing technology, but we're very unaligned with respect to each other and are constantly losing value in some way or another. 

If we could solve this problem without AGI, we wouldn't need AGI. We could just develop whatever we want. But so far it seems like AGI is the only path for reliable alignment and avoiding Molochian issues.

1lemonhope
I agree deeply with the first paragraph. I was going to list coordination as the only great thing I know of where AI might be able to help us do something we really couldn't do otherwise. But I removed it because it occurred to me that I have no plausible story for how that would actually happen. How do you imagine that going down? All I've got is "some rogue benevolent actor does CEV or a pivotal act", which I don't think is very likely.

I think what those other things do is help you reach that state more easily and reliably. It's like a ritual that you do before the actual task, to get yourself into the right frame of mind and form a better connection, similar to athletes having pre-game rituals.

Also yeah, I think it makes the boredom easier to manage and helps you slowly get into it, rather than being pushed into it without reference. 

Probably a lot of other hidden benefits though, because most meditation practices have been optimized for hundreds of years, and are better than others for a reason.

1Decaeneus
The parallel to athlete pre-game rituals is an interesting one, but I guess I'd be interested in seeing the comparison between the following two groups:
* group A: is told to meditate the usual way for 30 minutes / day, and does
* group B: is told to just sit there for 30 minutes / day, and does
So both of the groups considered are sitting quietly for 30 minutes, but one group is meditating while the other is just sitting there. In this comparison, we'd be explicitly ignoring the benefit from meditation which acts via the channel of just making it more likely you actually sit there quietly for 30 minutes.

I feel like it's not very clear here what type of coordination is needed.

How strong does coordination need to become before we can start reaching takeoff levels? And how material does that coordination need to be?

Strong coordination, as I'm defining here, is about how powerfully the coordination constrains certain actions.

Material coordination, as I'm defining it here, is about on what level the coordination "software" is running. Is it running on your self (i.e. it's some kind of information that's been coded into the algorithm that runs on your brain, examp...

Rather than this Feeling Good app for patients, I'd be more interested in an app that lets people practice applying CBT techniques to patient case studies (or maybe even LLMs with specified traits), in order to improve their empathy and help them better understand people. If this could actually develop good therapists with great track records, then that would prove the claims made in this article and help produce better people.

2ChristianKl
Interacting with a case study and interacting with a real person are quite different as far as empathy goes. If you train on making an LLM feel understood, that might not make a person feel understood if you do the same thing.

I'm not sure it only applies to memory. I imagine that ancient philosophers had to do most of their thinking in their heads, without being able to clean it up by writing it out and rethinking it. They might be better able to edit their thoughts in real time, and might have stronger control over whether unreasonable or illogical thoughts and thought processes take over. In that sense, being illiterate might lend a mental stability and strength that people who rely on writing things out may lack.

Still, I think that the benefits of writing are too enormous to ignore, and it's already entrenched in our systems. Reversing the change won't give a competitive edge.

If compute is limited in the universe, we can expect that civilizations or agents with access to it will only run simulations strategically, unless running simulations is part of their value function. Simulations according to a value function would probably be more prevalent, and would probably have spiderman or other extreme phenomena. 

However, we can't discount being in one of those information-gathering simulations. If for some reason you needed to gather information from a universe, you'd want to keep everything as simple as possible, and only tun...

It's possible that with the dialogue written, a well-prompted LLM could distill the rest, especially if each section that was distilled could be linked back to the section in the dialogue it was distilled from.
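A minimal sketch of how that could look, assuming the OpenAI Python client and a made-up prompt/section-numbering convention (nothing here is an existing LW feature, and the model name is just a placeholder):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def distill_dialogue(sections: list[str], model: str = "gpt-4o-mini") -> str:
    """Ask an LLM for a distilled summary, with each point citing the
    dialogue section it came from (sections are numbered [1], [2], ...)."""
    numbered = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sections))
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": (
                "Distill the following dialogue into its key points. "
                "After every point, cite the section number(s) it was "
                "distilled from, e.g. (from [2])."
            )},
            {"role": "user", "content": numbered},
        ],
    )
    return response.choices[0].message.content

# Hypothetical usage: sections would come from the dialogue transcript.
# print(distill_dialogue(["Alice: ...", "Bob: ..."]))
```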

6Said Achmiz
Sure, it’s possible. I don’t trust LLMs nearly enough to depend directly on such a thing in a systematic way, but perhaps there could be a workflow where the LLM-generated summary is then fed back to the dialogue participants to sign off on. That might be a very useful thing for either the LW team or some third party to build, if it worked.

I like the idea, but as a form of social media it doesn't seem very engaging, and as a single source of truth it seems strictly worse than, say, a wiki. Maybe look at Arbital; they seem to have been doing something similar. I also feel that dealing with complex sentences with lots of implications would be tough; there are many different premises that lead to a statement.

Personally I'd find it more interesting if each statement was decomposed into the premises and facts that make it up. This would allow tracing an opinion back to find the crux between...

1Bruce Lewis
Agreed there's a lot of work ahead in making it engaging. I define "pro" as anything one might say in defense of a statement, and that includes decomposing it. It can also include disambiguating it. Or citing a source. Thanks for the well-wishes. Only two paid users so far, but I'm getting very useful feedback and will have a second iteration with key improvements.

I guess while we're anthropomorphizing the universe, I'll ask some crux-y questions I've reached.

If humanity builds a self-perpetuating hell, does the blame lie with humanity or the universe?

If humanity builds a perfect utopia, does the credit lie with humanity or the universe?

Frankly it seems to me like what's fundamentally wrong with the universe is that it has conscious observers, when it needn't have bothered with any to begin with.    

Answer by Perhaps8-4

If there's something wrong with the universe, it's probably humans who keep demanding so much of it. 

Most universes are hostile to life, and at most would develop something like prokaryotes. That our universe enabled the creation of humans is a pretty great thing. Not only that, but we seem to be pretty early in the universal timespan, which means that we get a great view of the night sky and fewer chances of alien invasion. That's not something we did ourselves; that's something the universe we live in enabled. None of the systemic problems faced by h...

1Caerulea-Lawrence
Hello Perhaps,
In a way, you are deftly evading the issue at hand, by implying we should be thankful and, possibly, get our shit together ourselves. You say that none of the systemic problems faced by humans today are caused by the universe, but that is making the opposite claim of the one I made in my question, and so you aren't tackling the issue within the given rules; you seem to ignore them to give an answer that is right given your arguments. It isn't that I don't see your points, simply that they skip mine, and as such they oversimplify the issue. If you could include what I wrote in how you answer, I would appreciate it.
Caerulea-Lawrence
9Ratios
"If there's something wrong with the universe, it's probably humans who keep demanding so much of it. " Frankly, this is one of the most infuriating things I've read in LessWrong recently, It's super disappointing to see it being upvoted. Look, if you weigh the world's suffering against its joy through hedonistic aggregation, it might be glaringly obvious that Earth is closer to hell than to heaven. Recall Schopenhauer’s sharp observation: “One simple test of the claim that the pleasure in the world outweighs the pain…is to compare the feelings of an animal that is devouring another with those of the animal being devoured.” It's all roses and sunshine when you're sitting pretty in a Western country, isn't it? But I bet that perspective crumbles if you're looking through the eyes of a Sudanese child soldier or an animal getting torn apart. If a human dreamt up the cruel process that is evolution, we'd call him a deranged lunatic. At the very least, we should expect the universe not to treat conscious beings like disposable objects, but apparently, that's "demanding so much of it."
2dr_s
I mean, by the anthropic principle, we couldn't ever be born in such universes. And I think this only applies if we consider various tunings of our known laws of physics. At the two extremes of the possibility spectrum, though, there is:
* the constants aren't really constants but emergent phenomena themselves, and thus this is actually the only universe possible
* the laws aren't in any way constant, the general symmetries and principles aren't constant, there's a whole Tegmark Type IV multiverse out there to explore
In the former case, I guess there's not much to complain about. In the latter, there surely are an infinity of universes better than this one (though also many much worse).

I think that most of the people who would take notes on LW posts are the same people who would benefit from, and may use, a general note taking system. A system like Obsidian or Notion or whatever would be used for a bunch of stuff, LW posts included. In that sense, I think it's unlikely that they'd want a special way to note-take just for LW, when it'd probably be easier and more standardized to use their existing note taking system.

If you do end up going for it, an "Export Notes" feature would be nice, in an easily importable format.

3Steven Byrnes
FWIW, I have a general notes system (in the genre of Obsidian / Notion / Roam / LogSeq / etc.) and also a Zotero full of downloaded articles and books. I keep almost everything in my general notes system … but I also sometimes attach individual notes to things in Zotero. These days those Zotero notes are usually one of the following two categories: (1) a note that just says “See my general notes”, indicating that I have written something about this article into my general notes system (and then I can go find it by searching for the author’s name, if I want to), or (2) a list of acronyms / jargon that are specific to that article and that I don’t already know. I can imagine using LW private notes for those same two things, just like my Zotero notes, if such a feature existed. It’s very low priority for me, though—I’m getting by just fine as is.

I think this is pretty good advice. I am allergic to nuts, and that has defined a small but occasionally significant part of my interactions with people. While on the whole I'd say I've probably had more negative experiences because of it (I once went into anaphylaxis), I've often felt that it marked me as special or different from other people.

About 5 or so years ago my mom heard about a trial run by a doctor where they fed you small amounts of what you're allergic to in order to desensitize and acclimate your immune system to the food. She rec...

It seems like multi-modality will also result in AIs that are much less interpretable than pure LLMs.

9Adam Shai
This is not obvious to me. It seems somewhat likely that the multimodality actually induces more explicit representations and uses of human-level abstract concepts, e.g. a Jennifer Aniston neuron in a human brain is multimodal.

This seems like a pretty promising approach to interpretability, and I think GPT-6 will probably be able to analyze all the neurons in itself with >0.5 scores, which seems to be recursive self-improvement territory. It would be nice if, by the time we got there, we already mostly knew how GPT-2, 3, 4, and 5 worked. Knowing how previous-generation LLMs work is likely to be integral to aligning a next-generation LLM, and it's pretty clear that we're not going to be stopping development, so having some idea of what we're doing is better than none. Even if an...
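To make the ">0.5 scores" concrete: a minimal sketch, assuming the scoring works roughly the way OpenAI's automated-interpretability write-up describes it (an explainer model writes an explanation for a neuron, a simulator predicts activations from that explanation, and the score reflects how well simulated and real activations agree). The numbers below are made up:

```python
import numpy as np

def explanation_score(real_activations, simulated_activations) -> float:
    """Correlation between a neuron's real activations and the activations
    a simulator predicts from a natural-language explanation of the neuron."""
    real = np.asarray(real_activations, dtype=float)
    sim = np.asarray(simulated_activations, dtype=float)
    return float(np.corrcoef(real, sim)[0, 1])

# Hypothetical neuron that fires on four-digit years, and a simulation from
# the explanation "activates on four-digit years".
real = [0.1, 0.9, 0.0, 0.8, 0.2, 0.7]
sim = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(explanation_score(real, sim))  # ~0.97 here; >0.5 would count as a decent explanation
```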

1sanxiyn
This seems to assume the task (writing explanations for all neurons with >0.5 scores) is possible at all, which is doubtful. Superposition and polysemanticity are certainly things that actually happen.

Indeed, in India especially it's not uncommon for people to be dragged off the streets by human traffickers, have their organs removed and sold, and be killed after that. Making selling kidneys illegal at least ensures that this isn't an easy and straightforward thing to do. In Pakistan, for example, an estimated 2,500 kidneys were sourced in 2007.

Waiting for the day all my AI safety bookmarks can be summarized into just one website.

4jacquesthibs
Asking because I’m gonna work on stuff like this: What do you use for bookmarking? What does “summarized” mean to you? What would be the ideal setup?
8plex
I'm working on it. aisafety.world and /map are still a WIP, but very useful already.

Just read your novel, it's good! And it has successfully reignited my AI doomer fears! I was a bit surprised by the ending; I was about 60/40 for the opposite outcome. I enjoyed the explainer at the end and I'm impressed by your commitment to understanding AI. Please keep writing, we need more writers like you!

3Karl von Wendt
Wow! That was quick! Thank you very much! You may want to read this post by blaked, which he posted at about the same time I wrote this post. In it, he describes the kind of emotional manipulation that plays an important role in my novel. Only that it already happened in reality.

Well in the end, I think the correct view is that as long as the inventor is making safety measures from first principles, it doesn't matter whether they're an empath or a psychopath. Why close off part of the human race who are interested in aligning the world-ending AI just because they don't have some feelings? It's not like their imagined utopia is much different from yours anyways.

4blaked
It sounds correct when you approach it theoretically. And it might well be that this results in a good outcome; it doesn't preclude it, at least if we talk about a random person that has psychopathy. However, when I think about it practically, it feels wrong, like when I think about which world has the best chance to produce utopia: the one where AGI is achieved by Robert Miles, or by North Korea. There are a few more nation states that are making large progress that I would want to name but won't, to avoid political debate. These are the people I mostly was referring to, not random sociopaths working in the AI field about whom I don't know anything. Which is why my personal outlook is such that I want as many people as possible who are not like that to participate in the game, to dilute the current pool of lottery participants, who are, most of them, let's be honest, not particularly virtuous individuals, but currently have very high chances of being the first to achieve this.

Honestly I don't think that in the aftermath of a full-scale nuclear war or large asteroid impact any government would be funneling money into AGI. The entire supply chain would be broken, and they'd be scrambling just to keep basic life support on. This is mostly a nitpick though, as I agree with your points and I think this is sufficiently unlikely as to not matter.

1[anonymous]
The bias when I wrote this was I was thinking just a smidge into the future. A nuclear war literally tomorrow or an asteroid impact, sure. But once narrow AIs that run robotics pretty well are available? Early proto AGIs with the source code stored on some flash drive or tape that survives the war? Sounds like a plot to a science fiction novel.

I love this story, thanks for making it.

I love the Team Physics and Team Manipulation characterization, gives big pokemon vibes.

Excited and happy that you are moving forward with this project. It's great to know that more paths to alignment are being actively investigated. 

Bought this game because of the recommendation here, and it has replaced reading I Spy books with my sister as our bonding activity. I really like the minimalism, and its lack of addictive qualities. I've only got to 2-7 so far, but the fact that I eventually get stuck after about half an hour to an hour of playing means that it provides a natural stopping point for me, which is pretty nice. Thank you for the great review!

2MondSemmel
Thanks for the kind comment! I figured the game might work well as a cooperative experience, so I'm glad to hear that that was indeed the case :).

I think it's pretty reasonable when you consider the best-known general intelligence: humans. Humans frequently create other humans and then try to align them. In many cases the alignment doesn't go well, and the new humans break off, sometimes at vast financial and even physical cost to their parents. Some of these cases occur when the new humans are very young too, so clearly it doesn't require having a complete world model or having lots of resources. Corrupt governments try to align their population, but in many cases the population successfully revolt...

I think the point is more like, if you believe that the brain could in theory be emulated, with infinite computation (no souls or mysterious stuff of consciousness), then it seems plausible that the brain is not the most efficient conscious general intelligence. Among the general space of general intelligences, there are probably some designs that are much simpler than the brain. Then the problem becomes that while building AI, we don't know if we've hit one of those super simple designs, and suddenly have a general intelligence in our hands (and soon out of...

In addition to what Jay Bailey said, the benefits of an aligned AGI are incredibly high, and if we successfully solved the alignment problem we could easily solve pretty much any other problem in the world (assuming you believe the "intelligence and nanotech can solve anything" argument). The danger of AGI is high, but the payout is also very large.

In terms of utility functions, the most basic is: do what you want. "Want" here refers to whatever the agent values. But in order for the "do what you want" utility function to succeed effectively, there's a lower level that's important: be able to do what you want.

Now for humans, that usually refers to getting a job, planning for retirement, buying insurance, planning for the long term, and doing things you don't like for a future payoff. Sometimes humans go to war in order to "be able to do what you want", which should show you that satisfyi...

1scott loop
Thanks for the response. Definitely going to dive deeper into this.

The first type of AI is a regular narrow AI, the type we've been building for a while. The second type is an agentic AI, a strong AI, which we have yet to build. The problem is, AIs are trained using gradient descent, which basically involves searching through the space of all possible AI designs: gradient descent will train the AI that can maximize the reward best. As a result, agentic AIs become more likely because they are better at complex tasks. While we can modify the reward scheme, as tasks get more and more complex, agentic AIs are pretty much the way to go, so we can't avoid building an agentic AI, and we have no real idea if we've even created one until it displays behaviour that indicates it.

1Sune
+1 for the word agentic AI. I think that is what I was looking for. However, I don’t believe that gradient descent alone can turn an AI agentic. No matter how long you train a language model, it is not going to suddenly want to acquire resources to get better at predicting human language (unless you specifically ask it questions about how to do that, and then implement the suggestions. Even then you are likely to only do what humans would have suggested, although maybe you can make it do research similar to and faster than humans would have done it).

Awesome post, putting into words the intuitions I had about which dimensions the alignment problem lives in. You've basically meta-bounded the alignment problem, which is exactly what we need when dealing with problems like this.

China, overrated probably - I'm worried about signs that Chinese research is going stealth in an arms race. On the other hand, all of the samples from things like CogView2 or Pangu or Wudao have generally been underwhelming, and further, Xi seems to be doing his level best to wreck the Chinese high-tech economy and funnel research into shortsighted national-security considerations like better Uighur oppression, so even though they've started concealing exascale-class systems, it may not matter. This will be especially true if Xi really is insane enough to

1David Johnston
Are the unranked Chinese exascale systems relevant for AI research, or is it more that if they've built 2-3 such systems semi-stealthily, they might also be building AI-focused compute capacity too?

Well it depends on your priors for how an AGI would act, but as I understand it, all AGIs will be power-seeking. If an AGI is power-seeking, and has access to some amount of compute, then it will probably bootstrap itself to superintelligence, and then start pushing its utility function all over. Different utility functions cause different results, but even relatively mundane ones like "prevent another superintelligence from being created" could result in the AGI killing all humans and taking over the galaxy to make sure no other superintelligence gets made....

1HunterJay
Aye, I agree it is not a solution to avoiding power seeking, only that there may be a slightly easier target to hit if we can relax as many constraints on alignment as possible.

I'd say building an AGI that self-destructs would be pretty good. Especially since, as long as a minimum breeding population of humans exists, and assuming life is not totally impossible (i.e. the AI hasn't already deconstructed the earth, or completely poisoned all water and atmosphere), humans could still survive. Making an AGI that doesn't die would probably not be in our best interests until almost exactly the end.

2Shmi
Yeah, I assume the case where humans are completely extinct, except in the memory banks of the AI.

Thanks for the answer! As you suspected, I don't think wireheading is a good thing, but after reading about infinite ethics and the repugnant conclusion I'm not entirely sure that there exists a stable, mathematically expressible form of ethics we could give to an AGI. Obviously I think it's possible if you specify exactly what you want and tell the AGI not to extrapolate. However, I feel that realistically it's going to take our ethics and carry it to its logical end, and there exists no ethical theory that really expresses how utility should be valued with...

Would the appropriate analogy to agents be that humans are a qualitatively different type of agent compared to animals and basic RL agents, and thus we should expect that there will be a fundamental discontinuity between what we have so far, and conscious agents?

You may also want to consider opportunities on the EA Volunteer Job Board. Some of them are similar low effort wiki building.

https://airtable.com/embed/shrQvU9DMl0GRvdIN/tbll2swvTylFIaEHP

5robopsychologist
Regarding that airtable, I don't know who manages it but it needs a date posted field on each post because you can't tell what is new or what is old (and these things take time to apply to, which could be totally wasted if they're all years old). Better yet would be a date posted and date closed field so people know when they can apply. 

I think in general, the most innovative candies have been candies that break the norm. I remember a lot of buzz when some gum company made gum wrappers that you could eat with your gum (Cinnaburst?). Nowadays though, it seems like companies don't need to go that far for people to buy their new chocolate/candy, and there are so many flavours and textures they can slap on if people get tired.

Hi, I really like this series and how it explains some of the lower-level results we can expect from high-level future scenarios. However, I'd like to know how you expect digital people will interact with an economy that has been using powerful, high-level AI models or bureaucracies for a couple decades or longer (approximately my timeline for mind uploading, assuming no singularity). I've mostly read LessWrong posts and haven't done anything technical, but I feel that a lot of the expected areas in which digital people would shine might end up being accommodated by narrow-ish AI.
