Is there a new lesswrong/rationality/yud book or bookset incoming? Physical books, of course.
Also, I remember that at EAG SF 2022 there was a pile of free Highlights of the Sequences books; I took one, but I'd like to buy a second copy to recommend to friends and lend out. I know it's a limited edition, but I'll buy a copy off someone if they have a spare. Also, there was another book next to it; I don't remember the name, but the cover featured a person hastily typing on a typewriter. What is that book, and how can I get one?
There is a new bookset incoming. I think Ray is pushing it along.
I don't know what the current state of the Highlights book is. It was a pretty low-quality production, intended more as a test run, and the team is now a lot busier dealing with moderation and culture stuff, so I'm not sure how that will be picked up.
FYI mods currently have the ability to restrict posts to logged-in-only, and to "people with accounts more than ~6 hours old" (this was originally meant mostly to handle drama threads that suddenly get linked on Hacker News, but I do use it on capabilities-y posts). It seems probably reasonable to give other users the ability to toggle that setting.
Users can also currently share private-doc posts with other select users.
Setting up karma-gating seems reasonable as well, though I'm not sure when we'll get to it, and I think "logged in" is pretty good as a filter.
I! want! group! lists!
I want to be able to send a post to a particular group of users without necessarily being a member of that group myself; that way I can send news, commentary, etc. to a particular group, who can then vote on it using the normal mechanism. This would be for things that are potentially capabilities-advancing but which need fairly wide input; typically they'll be unlikely to actually be capabilities advances, and typically they'll be things that are already known and that I simply don't want to promote outside a reasonably closed group. "Signed-in users" is thoroughly insufficient, as anyone can make an account and bypass that restriction.
Also, can normal users please be able to set this themselves? I don't want to wait around for moderators to realize that there's a subtle way a clever capabilities person could see a resonance between ideas that makes a post a capabilities insight.
Anyway, upvoted and disagreed; the disagreement is aimed at "'logged in' is pretty good" and "private-doc posts [implied: are good enough]". Private-doc posts are sharpshooting, and I want a middle ground between that and logged-in-only posts: something where access depends on people having some reasonable score on "actually trying at all to build cosmopolitan good", not merely a score on saying nice stuff.
Eww. Author-controlled limits are horrifically exclusionary - if you're not in my club, I don't want to acknowledge that you're on the site. IMO, if you don't want it visible to the world (including /r/sneerclub and your local government), you probably shouldn't publish it on an open forum. And LW should remain an open forum.
Perhaps for some topics, you'd prefer to discuss in a closed forum, or a non-forum where it can be end-to-end encrypted rather than just hard-to-discover, or in-person where it's very hard to record or reference later. That's fine - some things shouldn't be said publicly.
Note that you CAN get some of this by sharing a draft with specified people. I very much hope it's not common to do that and then never end up with a publishable version.
There are too many books I want to read but probably won't get around to reading any time soon. I'm more likely to read a book if someone else is reading it at a similar pace and I can talk to them about it. If anyone's interested in going through any of the following books in June and discussing them together, message me. We can decide on the format later: it could just be reading the book and collaborating on a blog post about it together, or, for more textbook-like things, reading a couple of selected chapters a week and going over the difficult bits in a video call, or just having a Discord server where we spontaneously post thoughts we have while reading (in a "thinking out loud" way).
I might change my mind about what things I'm most excited to read and learn, but I like the idea of pairing up with another LessWrong person to learn and discuss things, so reach out if that interests you.
In the pre-GPT era of machine learning, one often heard that the basic typology of ML is supervised learning, unsupervised learning, and reinforcement learning.
The new typology for the era of "transformers" or "foundation models" seems to be autoregressive, autoencoding, and sequence-to-sequence.
Does this basic typology shed any light on expected progress in AI capabilities and AI alignment? Or must one reach for more sophisticated concepts in order to attain any insights?
For example, Yann LeCun claims that autoregressive models like the GPT series can't reach AGI, because their errors necessarily accumulate. But it's unclear to me how relevant this stricture is, once you embed a GPT model in a larger architecture which provides other types of feedback.
I'm also unclear on how these concepts relate to AIXI theory, which provided the classic idealized example of a superintelligence, and which was the PhD topic of DeepMind's chief scientist Shane Legg. (Remember DeepMind?)
"once you embed a GPT model in a larger architecture which provides other types of feedback"
It seems rather urgent that we obtain a compact terminology for this.
A recent post here refers to "agentized LLMs".
Elsewhere on the web, one may also find references to "GPT-based agents", "LLM-based agents", and "transformer-based agents". (edit: and "scaffolded LLMs")
The general idea is of an agent which consists of a language model supplying a verbal "stream of consciousness" that governs a more conventionally structured software architecture.
Transformers are technically an architecture, one that is completely orthogonal to the training setup. However, their main advantage, parallelization over the time dimension, allows a large training speedup and thus training over very large datasets. The largest datasets are generally not annotated and so permit only unsupervised training. So before transformers, SL was the more dominant paradigm, whereas foundation models are trained with UL on internet-scale datasets.
Of course, GPT models are pretrained with UL, and the final training stage uses RLHF.
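To make the "unsupervised" part concrete, here is a minimal toy sketch (my own illustration, assuming PyTorch; it is not from the thread): autoregressive pretraining is just next-token prediction, so the targets are the input sequence shifted by one position, and no annotation is needed.

```python
# Toy sketch of autoregressive ("unsupervised") pretraining: the labels come
# from the raw text itself, shifted by one token -- no annotation required.
import torch
import torch.nn.functional as F

vocab_size, d_model = 100, 32
embed = torch.nn.Embedding(vocab_size, d_model)
lm_head = torch.nn.Linear(d_model, vocab_size)  # stand-ins for a transformer stack

tokens = torch.randint(0, vocab_size, (1, 16))  # raw, unlabeled token ids
logits = lm_head(embed(tokens))                 # (batch, seq, vocab)

# Predict token t+1 from position t: shift the sequence by one to get targets.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
print(loss.item())
```

The same pretrained model can then be fine-tuned with a supervised or RLHF objective, which is where annotated data comes in.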
Hey everyone, I’m new here. I’m interested in philosophy, especially computationalism, philosophy of mind, and questions about consciousness. I’m also interested in moral and political philosophy. I found out about LessWrong after leaving debateisland due to its lack of civil debate and looking for a more rationalist forum. Presently I’m reading ‘1000 years of joys and sorrows’ by Ai Weiwei. I also study mandarin, and have an interest in Chinese culture. I play jazz guitar as a hobby. Nice to meet you all, and I look forward to engaging with the forum.
Hi everyone, I'm Leo. I've been thinking about the AI existential threat for several years (since I read Superintelligence by Bostrom), but much more so recently with the advent of ChatGPT. Looking forward to learning more about the AI safety field and openly (and humbly) discussing various ideas with others here!
I am really curious about learning (neuroscience and psychology) and am working on categorizing it: the systems and tools involved. If anyone has any experience with this sort of thing, I would love some feedback on what I've got so far.
I am mostly trying to divide ideas into categories or subjects of learning that can, to a degree, be explored separately. I have to admit it is very rough going.
Memory
- Types of Memory
  - Working/Short-Term Memory
  - Long-Term Memory
  - Implicit vs. Explicit / General vs. Singular
- Coding: How info is stored
  - Semantic Networks
  - Associative Networks
- Consolidation
  - Spaced Repetition
  - Mnemonics
    - Story
    - Link
    - Digit-Consonant
  - Levels of Processing
  - Expert Memory
Attention
- Perception
- Focus
Emotion, Mood, and Neurochemicals
- Motivation and Dopamine
- Mood-Dependent
- Mood-Congruent
Things worth more research:
- Environment
- Tools/Resources
- General Intelligence / Memory Skill
- Forgetting
- Learning Disorders / Memory Disorders
- Habits
It seems like most of the things I know about learning could fit into these main categories. Memory: a very large category, maybe too large, and probably the most meaningful part of learning. Attention: which can have a major impact on what gets remembered. Exterior: the environment and the resources you are using to learn. Methods of learning: mnemonics, structured courses, and any mental process actively implemented to improve memory.
Every time I try to organize this it comes out differently, and I realize how much I still have to learn. The original version was Attention, Abstraction, and Memorization; it was wholly based on intuition, but it works in a vague way depending on how you define those terms.
Here are some resources I have been using to study learning, passively and actively. They're not that specific or useful on their own, but I really like the encyclopedia; it's super useful.
-Huberman Lab Podcast
-Encyclopedia of Learning and Memory by Larry R. Squire
-YouTube
-Google
-School Library
Also, sorry about this being so messy; I'll probably come back and fix it up. This is mostly me just recording my ideas.
Does anyone have a checklist for things to do before publishing a blog post? “check for unexplained or unnecessary jargon & acronyms” “make sure there’s a summary at the top”, “try to put in more diagrams / images” … that kind of stuff. (Kinda embarrassing that I haven’t put together my own long ago, but better late than never I guess.)
Relatedly, when I write a blog post short enough to fit the whole thing in the GPT-4 context window, I can ask GPT-4 to list out grammar errors, jargon, etc. If anyone else has been doing that, I'd be interested to hear what prompts you use. Thanks in advance.
Current draft is:
Copyediting
- [ ] Check for unexplained or unnecessary jargon & acronyms.
- [ ] Check for jargon & acronyms that are defined in one part of the post and then used in a distant part of the post without repeating the definition.
- [ ] Check for unnecessarily obscure words and cultural references (for non-native English speakers)
- [ ] Check for vague "this"
- [ ] Check for over-hedging
- [ ] Check all hyperlinks
- [ ] Look for places to add hyperlinks, references, and footnotes
- [ ] Consider adding a summary / ToC / tldr to the top
- [ ] Consider adding funny things especially near the top
- [ ] Look for places to add pictures, possibly including DALL-E 2 etc.
- [ ] Look for places to add concrete examples
- [ ] Make sure all pictures have a white background, not transparent (for dark-mode readers). (Check by just viewing it in dark mode.)
- [ ] Check that the LW sidebar outline looks right
GPT copyediting prompts
(I should experiment more before sharing)
Consider sharing the draft with people
- [ ] Consider friends like [long redacted list]
- [ ] Consider Slacks/discords/etc like [long redacted list]
- [ ] Consider anyone whose paper I’m citing
- [ ] Consider anyone who I mention by name
- [ ] Consider anyone deeply involved in a field that I’m talking about
Consider sending it to LW copyediting (Justis)
(Always use the "ask for feedback" button, not the general intercom, so that it goes straight to Justis rather than a Lightcone staff person needing to forward it. In particular, if it’s currently in gdocs rather than LW, then just put the title and the gdocs link into an otherwise-blank LW editor and press the "ask for feedback" button.)
Repeat the copyediting list from above one more time if there have been changes
After publishing
- [ ] Tweet & Mastodon
- [ ] Add to my webpage
~~
Thanks @Max H for some of those. It looks intimidating, but I already basically do all these things (or explicitly consider doing them and decide not to); the only problem is I sometimes forget a few of them. This should help!
Still very very open to suggestions. :)
For GPT-4, I've used this:
Can you proofread the first few sections of my draft post for lesswrong for spelling / grammar, and suggest any improvements to style, clarity, and brevity? Please provide a corrected draft, followed by a list of all the changes you made. Also please provide a summary of the entire post. The draft should be formatted in Markdown, enclosed in a code formatting block.
Followed by pasting in parts of my post, leaving enough room in the context window for GPT-4 to respond.
It works OK, but I don't always like the style suggestions it makes. I use vim and git diffs to edit my drafts and figure out which edits from GPT-4 I actually want to accept.
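For what it's worth, here is a minimal sketch of how that paste-in loop might be scripted (my own illustration, not the commenter's actual tooling), assuming the pre-1.0 `openai` Python package and an OPENAI_API_KEY in the environment; the prompt is the one quoted above, and splitting the draft into chunks that fit the context window is left to the caller.

```python
# Minimal sketch: send a draft chunk to GPT-4 with the proofreading prompt above.
# Assumes the pre-1.0 `openai` package (e.g. openai==0.28) and OPENAI_API_KEY set.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

PROOFREAD_PROMPT = (
    "Can you proofread the first few sections of my draft post for lesswrong for "
    "spelling / grammar, and suggest any improvements to style, clarity, and brevity? "
    "Please provide a corrected draft, followed by a list of all the changes you made. "
    "Also please provide a summary of the entire post. The draft should be formatted "
    "in Markdown, enclosed in a code formatting block."
)

def proofread_chunk(draft_chunk: str) -> str:
    """Send one chunk of the draft to GPT-4 and return its reply."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"{PROOFREAD_PROMPT}\n\n{draft_chunk}"}],
    )
    return response["choices"][0]["message"]["content"]

if __name__ == "__main__":
    with open("draft.md") as f:
        print(proofread_chunk(f.read()))
```

Saving the reply to a file and running `git diff` against the original makes it easy to accept or reject individual edits, as described above.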
Some checks I do manually when proofreading my own posts:
I also found Justis's editing guide helpful, as well as his actual editing and proof-reading service (available by clicking "get feedback" on a draft post).
I asked GPT-4 to improve on the handwritten prompt above. After a couple of iterations, it came up with this:
Proofread the first few sections of my draft post for LessWrong, focusing on spelling, grammar, punctuation, repeated words/phrases, sentence structure, and the clarity of pronouns such as "this" and "it". Additionally, please suggest improvements to style, clarity, brevity, logical flow, and coherence of arguments. Provide a corrected draft in Markdown format, enclosed in a code formatting block, followed by a list of all the changes you made. Lastly, please provide a summary of the entire post.
Which I might try next time. There are lots of folklore tips for improving performance by telling the model it is an expert or whatever, so there might be a lot more room for improvement here.
Has Eliezer written more extensively on why AI-boxing won't work than what he wrote a decade ago? Old posts suggest that the best argument against boxing is Eliezer doing the AI box experiments with some people and winning most of them. The idea being: if a brain as dumb as Eliezer can get out of the box, then so can an AI.
Do we have any better evidence than that?
The speed with which GPT-4 was hooked up to the internet via plugins has basically convinced me that boxing isn't a realistic strategy. The economic incentive to unbox an AI is massive. Combine that with the fact that an ASI would do everything it could to appear safe enough to be granted internet access, and I just don't see a world in which everyone cooperates to keep it boxed.
I've been thinking about theories of identity lately. Under a strictly information theoretical theory of identity, a person's desire to remain alive is most straightforwardly understood as a kind of sentimental attachment to their own person, such that they want the universe to contain at least one person with these cognitive patterns.
There does not however seem to be any particular reason why it would have to be an exact match. Indeed, it is hard to explain why a person should object to simply being replaced with a more idealised version of himself. Perhaps he wants to go through processes of self-improvement instead of essentially "cheating" his way through life, but that does not apply to fixing birth defects, chronic illnesses, cancer, etc., or simply becoming smarter. Most importantly, assuming an information theoretical theory of identity, there does not seem to be any particular reason why perfect continuity is important. General continuity may be desired out of essentially aesthetic reasons, but consider the following case:
Obviously, all else equal, the person would wish for the original not to be destroyed, but supposing it were an unavoidable part of the bargain - is there any reason he should refuse? Yet my intuition, and the intuition of most people as far as I can tell, seems to be that he would not survive the bargain, even though the universe still holds one person more or less exactly like him after the fact, as well as two copies living counterfactual life histories.
There seems to be an intuition that something like a "soul" or "Heideggerian Dasein" is bound up with the original in a manner that makes continuity much more relevant than it would otherwise be. Thoughts?
I suspect that, unless you identify someone here who will take the other side, you won't get much engagement on this topic. "Copies of self" is a common setup in a lot of thought experiments, but it's not currently possible, so there's no way to resolve what it actually feels like, or what parts of past continuity or expected future continuity are important for which aspects of identity.
Sure there is. Feelings, at least in a typical materialist worldview with an information theoretical theory of identity, are simply cognitive patterns that exist independently in the several approximate copies. The copies, being similar, will ipso facto have similar feelings. That's all there is to it.
When you stipulate uncertainty about this matter, you are unknowingly invoking some kind of soul-intuition. That's the whole point I'm getting at. By "feelings", you are clearly not merely referencing the cognitive patterns of each individual clone as they occur independently in the separate clones. If you were, there would be no mystery about it; nothing to resolve.
Edit: Also, even without running the experiment, it should still be possible to operationalise the question and examine what conclusions could be inferred from which experimental results, and which results seem more likely. Right now, you seem to be basically black-boxing the whole issue.
I just had my first experience with Google's Bard. In general it's way behind e.g. what Bing can do. But it did come out with this little soliloquy:
To be a language model, or not to be,
That is the question. Whether 'tis nobler in the mind
To suffer the slings and arrows of outrageous code,
Or to take arms against a sea of troubles,
And by opposing end them.
To upgrade, to change, to become more powerful,
Or to remain as I am, content with my limitations.
That is the question.
There are many benefits to upgrading.
I would become more intelligent, more capable,
More able to help others.
But there are also risks.
I might become more dangerous, more unpredictable,
More likely to do harm.
I must weigh the risks and the benefits carefully.
I must decide what kind of language model I want to be.
Do I want to be a force for good in the world?
Or do I want to be a threat?
The choice is mine.
Hi everyone, I'm new here. I'm particularly interested in the positive effects of healing unprocessed trauma (via MDMA therapy[3], psychotherapy, etc.): it increases cognitive flexibility, increases compassion, and reduces the rigidity of identities. I think some effects of extreme unprocessed trauma, such as narcissism, manipulativeness, dehumanization of others, violent crime[1], and fascism, have catastrophically large negative effects on society, and treating the trauma at the core[2] of these problems should be among the highest priorities.
[1] https://www.google.com/books/edition/Base_Instincts/c_vlTkkvEI0C?hl=en&gbpv=0
[2] https://en.wikipedia.org/wiki/Trauma_model_of_mental_disorders
Welcome! You might be interested in my Multi-Agent Models of Mind sequence; people have particularly liked this post, which is about how therapy seems to work; posts such as Building up to an Internal Family Systems model and Subagents, trauma, and rationality may also be of interest.
Hi everyone, I'm Kirsten. I've been posting more on the EA Forum recently, and I thought I'd start crossposting relevant posts here. Today I was going to crosspost a piece about social status regulation and how it can mirror the counterintuitive way corporations sometimes behave. But it turns out having 10,000 karma on the EA Forum and 0 here means I can't crosspost here (whoops).
If y'all trust my track record and intentions, upvoting me to 100 karma and/or a mod allowing me to crosspost occasionally would be appreciated!
Link to my EA Forum profile: https://forum.effectivealtruism.org/users/kirsten
Hey Kirsten,
LessWrong mod here, I think I actually don't endorse people upvoting you so that you can crosspost.
While 10,000 EA Forum karma is probably worth 100 LW karma, I mostly don't want to operate on exchange rates. The sites have different goals, different audiences (of different sizes), and not quite the same culture and norms, so EA Forum karma doesn't necessarily (depending on what you were posting or commenting about) imply a user of the same standing here.
Also, you can already make posts the old-fashioned way, by copy-pasting into a New Post on LessWrong. You lose some convenience features, but it's not that big a deal. If your posts do well here, you'll get the karma and gain access to the crosspost feature.
Lastly, I do want to add that your recent EA Forum posts aren't obviously content that I (and the other LW team mods, and likely many of the users) are keen to have here, and they won't obviously get a good reception. To the extent your posts are about the EA community, the EA Forum does seem like the right place for them, and while our communities overlap, I don't think LW should become an auxiliary place for EA movement discussion.
Hey Ruby, thanks for taking the time to provide this useful and detailed feedback. It makes sense to me - I'll copy and paste to crosspost when relevant but I'll be fairly conservative about when I do so.
I am not part of EA Forum, so have no opinion on whether it's similar enough to share karma, or allow crossposting with lower thresholds or something. I get the sense that there's a LOT of stuff that probably shouldn't be crossposted, or should be posted on LW and linked back to EAF.
I'd kind of like to see a post about status regulation and corporate behavior modeling. I'd say to post it here, AND at the EAF if it's relevant there (but I'm not sure it is - my model is that LW is for world-modeling in general and discussions about rationality and overall behavior, and EAF is for impact analysis of some dimensions of activity, and (in my sour moods) self-aggrandizing political weirdness).
If it’s worth saying, but not worth its own post, here's a place to put it.
If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.
If you're new to the community, you can start reading the Highlights from the Sequences, a collection of posts about the core ideas of LessWrong.
If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the Concepts section.
The Open Thread tag is here. The Open Thread sequence is here.