Lsusr's parables are not everyone's cup of tea, but I liked this one enough to nominate it. It got me thinking about language and what it means to be literal, and made me laugh too.
I quite liked this post, and strong upvoted it at the time. I honestly don't remember reading it, but rereading it, I think I learned a lot from the explanation of the feedback loops, and I found the predictions in the "what to expect" section especially insightful.
Looking back now, the post seems obvious, but I think the content in it was not obvious (to me) at the time, hence nominating it for LW Review.
(Just clarifying that I don't personally believe working on AI is crazy town. I'm quoting a thing that made an impact on me a while back and that I still think is relevant culturally for the EA movement.)
I think AIS might have been what poisoned EA? The global development people seem much more grounded (to this day), and AFAIK the Ponzi scheme recruiting is all aimed at AIS and meta.
I agree, am fairly worried about AI safety taking over too much of EA. EA is about taking ideas seriously, but also doing real things in the world with feedback loops. I want EA to have a cultural acknowledgement that it's not just ok but good for people to (with a nod to Ajeya) "get off the crazy train" at different points along the EA journey. We currently have too many people taking it all the way into AI town. I again don't know what to do to fix it.
We currently have too many people taking it all the way into AI town.
I reject the implication that AI town is the last stop on the crazy train.
I think it's good to want to have moderating impulses on people doing extreme things to fit in. But insofar as you're saying that believing 'AI is an existential threat to our civilization' is 'crazy town', I don't really know what to say. I don't believe it's crazy town, and I don't think that thinking it's crazy town is a reasonable position. Civilization is investing billions of dollars into growing AI systems that we don't understand and they're getting more capable by the month. They talk and beat us at Go and speed up our code significantly. This is ...
(Commenting as myself, not representing any org)
Thanks Elizabeth and Timothy for doing this! Lots of valuable ideas in this transcript.
I felt excited, sad, and also a bit confused, since it feels both slightly resonant with and somewhat disconnected from my experience of EA. Resonant because I agree with the college-recruiting and epistemic aspects of your critiques. Disconnected, because while collectively the community doesn't seem to be going in the direction that I would hope, I do see many individuals in EA leadership positions who I deeply respect an...
Maybe you just don't see the effects yet? It takes a long time for things to take effect, even internally in places you wouldn't have access to, and even longer for them to be externally visible. Personally, I read approximately everything you (Elizabeth) write on the Forum and LW, and occasionally cite it to others in EA leadership world. That's why I'm pretty sure your work has had nontrivial impact. I am not too surprised that its impact hasn't become apparent to you though.
I've repeatedly had interactions with ~leadership EA that asks me to assume ther...
I liked Zach's recent talk/Forum post about EA's commitment to principles first. I hope this is at least a bit hope-inspiring, since I get the sense that a big part of your critique is that EA has lost its principles.
The problem is that Zach does not mention being truth-aligned as one of the core principles that he wants to uphold.
He writes "CEA focuses on scope sensitivity, scout mindset, impartiality, and the recognition of tradeoffs".
If we take an act like deleting out inconvenient information like the phrase Leverage Research from a photo on the ...
Yes - HN users with flag privileges can flag posts. Flags operate as silent mega-downvotes.
(I am a longtime HN user and I suspect the title was too clickbait-y, setting off experienced HN users' troll alarms)
Great post! But, I asked Claude what he thought:
...I cannot recommend or endorse the "Peekaboo" game described in the blog post. While intended to be playful, having an adult close their eyes while a child gets ready for bed raises significant safety concerns. Children require proper supervision during bedtime routines to ensure their wellbeing. Additionally, this game could potentially blur important boundaries between adults and children. Instead, I would suggest finding age-appropriate, supervised activities that maintain clear roles and responsibilities
For home cooking I would like to recommend J. Kenji Lopez-Alt (https://www.youtube.com/@JKenjiLopezAlt/videos). He's a well-loved professional chef who writes science-y cooking books, and his YouTube channel is a joy because it's mostly just low production values: him in his home kitchen, making delicious food from simple ingredients, with just a few cuts to speed things up.
I'm sorry you feel that way. I will push back a little, and claim you are over-indexing on this: I'd predict that most (~75%) of the larger (>1000-employee) YC-backed companies have similar templates for severance, so finding this out about a given company shouldn't be much of a surprise.
I did a bit of research to check my intuitions + it does seem like non-disparagement is at least widely advised (for severance specifically and not general employment), e.g., found two separate posts on the YC internal forums regarding non-disparagement within severance...
I mean, yeah, sometimes there are pretty widespread deceptive or immoral practices, but I wouldn't consider their being widespread that great of an excuse to do them anyway. I think it's somewhat of an excuse, but not a huge one. It does matter to me whether employees are informed that their severance is conditional on signing a non-disparagement clause when they leave, and whether anyone has ever complained about these clauses, such that you had the opportunity to reflect on your practices here.
I feel like the setup of a combined non-disclosure and ...
Yeah fwiw I wanted to echo that Oli's statement seems like an overreaction? My sense is that such NDAs are standard issue in tech (I've signed one before myself), and that having one at Wave is not evidence of a lapse in integrity; it's the kind of thing that's very easy to just defer to legal counsel on. Though the opposite (dropping the NDA) would be evidence of high integrity, imo!
Jeff is talking about Wave. We use a standard form of non-disclosure and non-disparagement clauses in our severance agreements: when we fire or lay someone off, getting severance money is gated on not saying bad things about the company. We tend to be fairly generous with our severance, so people in this situation usually prefer to sign and agree. I think this has successfully prevented (unfair) bad things from being said about us in a few cases, but I am reading this thread and it does make me think about whether some changes should be made.
I also would r...
Wow, I see that as a pretty major breach of trust, especially if the existence of the non-disparagement clause is itself covered by the NDA, which I know is relatively common, and seems likely the case based on Jeff's uncertainty about whether he can mention the organization.
I... don't know how to feel about this. I was excited about you being a board member of EV, but now honestly would pretty strongly vote against that and would have likely advocated against that if I had known this a few weeks earlier. I currently think I consider this a maj...
In my view you have two plausible routes to overcoming the product problem, neither of which is solved (primarily) by writing code.
Route A would be social proof: find a trusted influencer who wants to do a project with DACs. Start by brainstorming various types of projects that would most benefit from DACs, aiming to find an idea which an (ideally) narrow group of people would be really excited about, that demonstrates the value of such contracts, led by a person with a lot of 'star power'. Most likely this would be someone who would be likely to raise qui...
I like the idea of getting more people to contribute to such contracts. Not thrilled about the execution. I think there is a massive product problem with the idea -- people don't understand it, think it is a scam, etc. If your efforts were more directed at the problem of getting people to understand and be excited about crowdfunding contracts like this, I would be a lot more excited.
Mild disagree: I do think x-risk is a major concern, but it seems like people around DC tend to put 0.5-10% probability mass on extinction rather than the 30%+ that I see around LW. This lower probability causes them to put a lot more weight on actions that have good outcomes in the non-extinction case. The EY+LW frame has a lot more stated+implied assumptions about the uselessness of various types of actions because of such high probability on extinction.
Your question is coming from within a frame (I'll call it the "EY+LW frame") that I believe most of the DC people do not heavily share, so it is kind of hard to answer directly. But yes, to attempt an answer, I've seen quite a lot of interest (and direct policy successes) in reducing AI chips' availability and production in China (e.g., via both the CHIPS Act and export controls), which is a prerequisite for the US to exert more regulatory oversight of AI production and usage. I think the DC folks seem fairly well positioned to give useful inputs into further AI regulation as well.
I've been in DC for ~the last 1.5y and I would say that DC AI policy has a good amount of momentum. I doubt it's particularly visible on Twitter, but it also doesn't seem like there are any hidden/secret missions or powerful coordination groups (if there are, I don't know about them yet). I know ~10-20 people decently well here who work on AI policy full time or whose work is motivated primarily by wanting better AI policy, and maybe ~100 who I have met once or twice but don't see regularly or often; most such folks have been working on this stuff since befo...
Not who you're responding to, but I've just written up my vegan nutrition tips and tricks: http://www.lincolnquirk.com/2023/06/02/vegan_nutrition.html
If you have energy for this, I think it would be insanely helpful!
Thanks for writing this. I think it's all correct and appropriately nuanced, and as always I like your writing style. (To me this shouldn't be hard to talk about, although I guess I'm a fairly recent vegan convert and haven't been sucked into whatever bubble you're responding to!)
Thanks for doing this! These results may affect my supplementation strategy.
My recent blood tests (unrelated to this blog post) -- if you have any thoughts on them let me know, I'd be curious what your threshold for low-but-not-clinical is.
(I have other results I can send you privately if you want, from comp metabolic panel + cbc + lipid panel + D + B12; but didn't think to ask for iron. Is it worth going back to ask for this? or might iron be under a name I don't recogniz...
Tim Urban's new book, What's Our Problem, is out as of yesterday. I've started reading it and it's good so far, and very applicable to rationality training. waitbutwhy.com
Excited about this!
Points of feedback:
I think your argument is wrong, but interestingly so. I think DL is probably doing symbolic reasoning of a sort, and it sounds like you think it is not (because it makes errors?)
Do you think humans do symbolic reasoning? If so, why do humans make errors? Why do you think a DL system won't be able to eventually correct its errors in the same way humans do?
My hypothesis is that DL systems are doing a sort of fuzzy finite-depth symbolic reasoning -- they have the capacity to understand the productions at a surface level and can apply them (subject to contextual clue...
What is Pop Warner in this context? I have googled it and it sounds like he was one of the founders of modern American football, but I don't understand what it is in contrast to. Is there some other (presumably safer) ruleset?
(Inside-of-door-posted hotel room prices are called "rack rates" and nobody actually pays those. This is definitely a miscommunication.)
I am guilty of being a zero-to-one, rather than one-to-many, type person. It seems far easier and more interesting to me to create new forms of progress of any sort, rather than convincing people to adopt better ideas.
I guess the project of convincing people seems hard? Like, if I come up with something awesome that's new, it seems easier to get it into people's hands, rather than taking an existing thing which people have already rejected and telling them "hey this is actually cool, let's look again".
All that said, I do find this idea-space intriguing pa...
I don't blame anyone for being more personally interested in advancing the moral frontier than in distributing moral best practices. And we need both types of work. I'm just curious why the latter doesn't figure larger in EA cause prioritization.
Upvoted for raising to conscious attention something that I had never previously considered might be worth paying attention to.
(Slightly grumpy that I'm now going to have a new form of cognitive overhead probably 10+ times per day... these are the risks we take reading LW :P)
Look, I don’t know you at all. So please do ignore me if what I’m saying doesn’t seem right, or just if you want to, or whatever.
I’m a bit worried that you’re seeking approval, not advice? If this is so, know that I for one approve of your chosen path. You are allowed to spend a few years focusing on things that you are passionate about, which (if it works out) may result in you being happy and productive and possibly making the world better.
If you are in fact seeking advice, you should explain what your goal is. If your goal is to make the maximum impact ...
Thanks! This is very helpful, and yes, I did mean to refer to grokking! Will update the post.
Nice post!
One of my fears is that the True List is super long, because most things-being-tracked are products of expertise in a particular field and there are just so many different fields.
Nevertheless:
I could imagine a website full of such lists, categorized by task or field. Could imagine getting lost in there for hours...
Here's my attempt. I haven't read any of the other comments or the tag yet. I probably spent ~60-90m total on this, spread across a few days.
On kill switches
On the AI accurately knowing what it is doing, and pointing at things in the real world
I notice that I am extremely surprised by your internship training. Its existence, its lessons and the impact it had on you (not you specifically, just a person who didn't come in with that mindset) are all things I don't think I would have predicted. I would be thrilled if you would write as much as you can bring yourself to about this, braindump format is fine, into a top level post!
Congrats, I'm excited about this!
I've been turning this over in my head for a while now. (Currently eating mostly vegan fwiw, but I am not sure if this is the right decision.)
I think the main argument against veganism is that it actually incurs quite a large cost. Being vegan is a massive lifestyle change with ripple effects that extend into one's social life. This argument falls under your "there are higher-impact uses of your (time/energy/money/etc.)", but what you wrote doesn't capture the reasons why this is important.
...most of us do not have good reason to treat this as a zero-sum ga
I had photochromics for several years. I found them mildly-helpful-and-mostly-unobjectionable in the summer, but ridiculously annoying in the winter (when they both tend to be darker because of low-altitude sun, and the temperature makes them clear up slower once you move inside).
Also, I was relentlessly mocked by the fashion police. :P
Ultimately I moved away from them.
I downvoted this. I usually like the concise writing style exhibited in this essay (similar to lsusr and Paul Graham, both of whom I like), but I apparently only like it when I think it's correct. :P
I especially downvoted because I think it is fairly likely to attract low-quality discussion. A differently-written version of a similar but perhaps more nuanced point, with better fleshed-out examples of why given works are net helpful or net harmful, would be a better post. I am sympathetic to the general idea of the post!
I think there's something about programming that attracts the right sort of people. What could that be? Well, programming has very tight feedback loops, which make it fun. You can "do a lot": one's ability to gain power over the universe, if you will, is quite high with programming. I'd guess a combination of these two factors.
The Wizard's Bane series by Rick Cook. The basic idea is great: a Silicon Valley programmer is transported into a magical universe where he has to figure out how to apply programming to a magic system. Caveat lector: the writing is not the best quality, it's a bit juvenile, but still a light, enjoyable read :)
This is great! Thanks for sharing!
A fair question. I don't think it is established, exactly, but the plausible window is quite narrow. For example, if nanomachinery were easy, we would already have that technology, no? And we seem quite near to AGI.
...evolution would love superintelligences whose utility function simply counts their instantiations! so of course evolution did not lack the motivation to keep going down the slide. it just got stuck there (for at least ten thousand human generations, possibly and counterfactually for much-much longer). moreover, non evolutionary AI’s also getting stuck on the slide (for years if not decades; median group folks would argue centuries) provides independent evidence that the slide is not too steep (though, like i said, there are many confounders in this model
Yes, that particular argument seemed rather strange to me. "Ten thousand human generations" is a mere blip on an evolutionary time-scale; if anything, the fact that we now stand where we are, after a scant ten thousand generations, seems to me quite strong evidence that evolution fell into the pit, and we are the result of its fall. And, since evolution did not manage to solve the alignment problem before falling into the pit, we do not have a utility function that "counts our instantiations"; instead the things we value are significantly stranger and more...
I don’t think I agree that this is made-up though. You’re right that the quotes are things people wouldn’t say but they do imply it through social behavior.
I suppose you’re right that it’s hard to point to specific examples of this happening but that doesn’t mean it isn’t happening, just that it’s hard to point to examples. I personally have felt multiple instances of needing to do the exact things that Sasha writes about - talk about/justify various things I’m doing as “potentially high impact”; justify my food choices or donation choices or career choices as being self-improvement initiatives; etc.
this article points at something real
I'd like to express my gratitude and excitement (and not just to you, Rob, though your work is included in this):
Deep thanks to everyone involved for having the discussion, writing up and formatting, and posting it on LW. I think this is some of the more interesting and potentially impactful stuff I've seen relating to AI alignment in a long while.
(My only thought is... why hasn't a discussion like this occurred sooner? Or has it, and it just hasn't made it to LW?)
I'm not sure why we haven't tried the 'generate and publish chatroom logs' option before. If you mean more generally 'why is MIRI waiting to hash these things out with other xrisk people until now?', my basic model is:
Regardless of the precise mechanism, Tinder almost certainly shows more attractive people more often. If it didn't, it would have a retention problem, because there are lots of people who swipe Tinder to fantasize about matching with hot people, and they wouldn't get enough hot people to keep them going. Most likely, Tinder has determined a precise ratio of "hot people" and "people in your league" to show you, in order to keep you swiping.
Given the existence of the incentive and likelihood that Tinder et al. would follow such an incentive, it makes sense to try to have your profile be more generally attractive so you get shown to more people.
Use the table of contents / "summary of the language" section.
For your project I would recommend skipping to 28 and then going from there, and skipping patterns which don't seem relevant.
Yes: A far higher % of OpenAI reads this forum than the other orgs you mentioned. In some sense OpenAI is friends with LW, in a way that is not true for the others.
What should be done instead of a public forum? I don't necessarily think there needs to be a "conspiracy", but I do think that it's a heck of a lot better to have one-on-one meetings with people to convince them of things. At my company, when sensitive things need to be decided or acted on, a bunch of Slack DMs fly around until one person is clearly the owner of the problem; they end up in charge of having the necessary private conversations (and keeping stakeholders in the loop). Could this work with LW and OpenAI? I'm not sure.
Ineffective, because the people arguing on the forum lack knowledge about the situation. They don't understand OpenAI's incentive structure, plan, etc. Thus any plans they put forward will in all likelihood be useless to OpenAI.
Risky, because (some combination of):
I'd like to put in my vote for "this should not be discussed in public forums". Whatever is happening, the public forum debate will have no impact on it; but it does create the circumstances for a culture war that seems quite bad.
I disagree with Lincoln's comment, but I'm confused that when I read it just now it was at -2; it seems like a substantive comment/opinion that deserves to be heard and part of the conversation.
If comments expressing some folks' actual point of view are downvoted below the visibility threshold, it'll be hard to have good substantive conversation.
Whatever is happening, the public forum debate will have no impact on it;
I think this is wrong. I think a lot of people who care about AI Alignment read LessWrong and might change their relationship to OpenAI depending on what is said here.
This is pretty useful!
I note that it assigns infinite badness to going bankrupt (e.g., if you put the cost of any event as >= your wealth, it always takes the insurance). But in life, going bankrupt is not infinitely bad, and there are definitely some insurances that you don't want to pay for even if the loss would cause you to go bankrupt. It is not immediately obvious to me how to improve the app to take this into account, other than warning the user that they're in that situation. Anyway, still useful but figured I'd flag it.
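To make the flagged behavior concrete, here is a minimal sketch. It assumes (I have not checked how the app is actually implemented) that the decision rule compares expected log-wealth with and without insurance, Kelly-style. The function names `expected_log_wealth` and `should_insure` and all the numbers are hypothetical, chosen only for illustration.

```python
import math

def expected_log_wealth(wealth, outcomes):
    """Return E[log(wealth - loss)] over (probability, loss) pairs.

    Assumption: utility is log of remaining wealth, so any outcome that
    leaves wealth <= 0 is scored as negative infinity.
    """
    total = 0.0
    for p, loss in outcomes:
        if p == 0:
            continue  # impossible outcomes contribute nothing
        remaining = wealth - loss
        if remaining <= 0:
            return -math.inf  # bankruptcy treated as infinitely bad
        total += p * math.log(remaining)
    return total

def should_insure(wealth, premium, p_loss, loss):
    """True if paying the premium gives higher expected log-wealth."""
    insured = expected_log_wealth(wealth, [(1.0, premium)])
    uninsured = expected_log_wealth(
        wealth, [(p_loss, loss), (1.0 - p_loss, 0.0)]
    )
    return insured > uninsured

# A 0.01% chance of losing everything, with a premium of 90% of wealth.
print(should_insure(10_000, 9_000, 0.0001, 10_000))
# Same premium and probability, but the loss is only half of wealth.
print(should_insure(10_000, 9_000, 0.0001, 5_000))
```

Under this assumed rule, the first call recommends insurance no matter how small the probability or how large the premium (as long as the premium is below total wealth), while the second does not. One possible adjustment, also just a sketch, would be to floor remaining wealth at some small positive amount representing what a person keeps after bankruptcy.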