Okay, I have not downvoted any of your posts, but I see the three posts you probably mean, and I dislike them, and shall try to explain why. I'm going to take the existence of this question as an excuse to be blunt.
The Snuggle/Date/Slap Protocol: Frankly, wtf? You took the classic Fuck/Marry/Kill game, tweaked it slightly, and said adding this as a feature to GPT-4 "would have all sorts of nice effects for AI alignment." At the end, you also started preaching about your moral system, which I didn't care for. (People gossiping about the dating habits of minor AI celebrities is an amusing idea though.) If you actually built this, I confess I'd still be interested to see what this system does, and if you can then get the LLM to assume different personae.
My Mental Model of Infohazards: As a general rule, long chains of reasoning where you disagree strongly with some early step are unpleasant to follow (at each following step, you have to keep track of whether you agree with the point being made, whether it follows from the previous steps, and whether this new world being described is internally consistent), especially if the payoff isn't worth it, as in your post. You draw some colorful analogy to black holes and supernovae, but are exceedingly vague, and everything you did say is either false or vacuous. You specifically flagged one of your points as something you expected people to disagree with, but offered no support for it, nor any rebuttal to arguments against it.
Ethicophysics II: Politics is the Mind-Savior (see also: Ethicophysics I): You open "We present an ethicophysical treatment on the nature of truth and consensus reality, within the framework of a historical approach to ludic futurecasting modeled on the work of Hegel and Marx." This looks like pompous nonsense to me, and my immediate thought was that you were doing the Sokal hoax thing. And then I skimmed your first post, and you seem serious, so I concluded you're a crank. ("Gallifreyan" is a great name for a physics-y expression, btw. It fits in perfectly with Lagrangians and Hamiltonians.)
More generally, voting on posts doesn't disentangle dislike vs. disagree, so a lot of the downvotes might just be the latter.
I'm definitely a crank, but I personally feel like I'm onto something? What's the appropriate conduct for a crank that knows they're a crank but still thinks they've solved some notorious unsolved problem? Surely it's something other than "crawl into a hole and die"...
Can you clarify what part of the downvote system is broken? If someone posts multiple things that get voted below zero, that indicates to me that most voters don't want to see more of that on LW. Are you saying it means something else?
I do wish there were agreement indicators on top-level posts, so it would be much easier to remind people "voting is about whether you think this is good to see on LW; agreement is about the specific arguments". But even absent that, I don't see many below-zero post scores that surprise me or that I think are strongly incorrect. If I did, I'd somewhat expect the mods to override a throttle.
I think your automatic restriction is currently too tight. I would suggest making it decay faster.
The problem is that a long-time contributor can be heavily downvoted once and become heavily rate-limited, and then it's on them to earn back their points to be able to post again. I wouldn't say such a thing is necessarily terrible, but it seems to me to have driven away a number of people I was optimistic about, who were occasionally saying something many people disagree with and getting heavily downvoted.
I think you need to be more frugal with your weirdness points (and more generally your demanding-trust-and-effort-from-the-reader points), and more mindful of the inferential distance between yourself and your LW readers.
Also remember that for every one surprisingly insightful post by an unfamiliar author, we all come across hundreds that are misguided, mediocre, or nonsensical. So if you don't yet have a strong reputation, many readers will be quick to give up on your posts and quick to dismiss you as a crank or dilettante. It's your job to prove that you're not, and to do so before you lose their attention!
If there's serious thought behind The Snuggle/Date/Slap Protocol then you need to share more of it, and work harder to convince the reader it's worth taking seriously. Conciseness is a virtue but when you're making a suggestion that is easy to dismiss as a half-baked thought bubble or weird joke, you've got to take your time and guide the reader along a path that begins at or near their actual starting point.
Ethicophysics II: Politics is the Mind-Savior opens with language that will trigger the average LWer's bullshit detector, and appears to demand a lot of effort from the reader before giving them reason to think it will be worthwhile. LW linkposts often contain the text of the linked article in the body of the LW post, and at first glance this looks like one of those. In any case, we're probably going to scan the body text before clicking the link. So before we've read the actual article we are hit with a long list of high-effort, unclear-reward, and frankly pretentious-looking exercises. When we do follow the link to Substack we face the trivial inconvenience of clicking two more links and then, if we're not logged in to academia.edu, are met with an annoying 'To Continue Reading, Register for Free' popup. Not a big deal if we're truly motivated to read the paper! But at this point we probably don't have much confidence that it will be worth the hassle.
This is solid advice, I suppose. A friend of mine has compared my rhetorical style to that of Dr. Bronner - I say a bunch of crazy shit, then slap it around a bar of the finest soap ever made by the hand of man.
I started posting my pdfs to academia.edu because I wanted them to look more respectable, not less. Earlier drafts of them used to be on github with no paywall. I'm going to post my latest draft of Ethicophysics I and Ethicophysics II to github later tonight; hopefully this decreases the number of hoops that interested readers have to jump through.
I don't think you're being consistently downvoted: most of your comments are neutral-to-slightly positive?
I do see one recent post of yours that was downvoted noticeably, https://www.lesswrong.com/posts/48X4EFJdCQvHEvL2t/ethicophysics-ii-politics-is-the-mind-savior
I downvoted that post myself. (Uh... sorry?) My engagement with it was as follows:
I... guess this might be me being impatient or narrow-minded or some such. But ideally I would like either to see what your post is about directly, or at least have a clearer and more comprehensible summary that makes me think putting the effort into digging in will likely be rewarded.
The Snuggle/Date/Slap Protocol seems to me to be just not how language models work, and I don't expect many people to actually care in the relevant ways if any of your forecasted newspaper articles get written about ChatGPT outputting those tokens at people.
I did not read your ethicophysics stuff, nor did I downvote. You can probably get me to read those by summarizing your main methods and conclusions, with some obvious facts about human morality which you can reconstruct to lend credence to your hypothesis, and nonobvious conclusions to show you're not just saying tautologies. I in fact expect that doc to be filled with a bunch of tautologies or to just be completely wrong.
5 posts in 5 hours is way too many. People will be much more generous if you space things out or put them on shortform. I've never discussed this with anyone else, but if I were going to make up a number, it would be no more than two posts per month unless they're getting fantastic karma, and avoid having two posts above the fold at the same time.
I don't think the raw number is the problem. If someone writes too many posts in general, they'll start to get ignored, not heavily downvoted.
I skimmed The Snuggle/Date/Slap Protocol and Ethicophysics II: Politics is the Mind-Savior, which are two of your recent downvoted posts. I think they get negative karma because they are difficult to understand and it's hard to tell what you're supposed to take away from them. They would probably be better received if the content were written so that it's easy to understand what your message is at an object level, as well as what the point of your post is.
I read the Snuggle/Date/Slap Protocol and feel confused about what you're trying to accomplish (is it solving AI Alignment?) and how the method is supposed to accomplish that.
In the ethicophysics posts, I understand the object-level claims/material (like the homework/discussion questions) but fail to understand what the goal is. It seems like you are jumping to grounded mathematical theories for stuff like ethics/morality, which immediately makes me feel dubious. It's a too-much, too-grand, too-certain kind of reaction. Perhaps you're just spitballing/brainstorming some ideas, but that's not how it comes across, and I infer you feel deeply assured that it's correct, given statements like "It [your theory of ethics modeled on the laws of physics] therefore forms an ideal foundation for solving the AI safety problem."
I don't necessarily think you should change whatever you're doing, BTW; I'm just pointing out some likely reactions/impressions driving the negative karma.
Thanks, this makes a lot of sense.
The snuggle/date/slap protocol is meant to give powerful AIs a channel to use their intelligence to deliver positive outcomes in the world by emitting performative speech acts in a non-value-neutral but laudable way.
Sampling one of your downvoted posts: the one here is nonsensical and probably a joke?
It proposes adding <SNUGGLE>/<DATE>/<SLAP> tokens in order to "control" GPT-4. But tokens are numbers that represent common sequences of characters in an NLP dataset. They are not buttons on an LLM remote control: if the <SNUGGLE> token doesn't have any representation in the dataset, the model won't know what it means. You could tell ChatGPT "when I type <SNUGGLE>, do X", but that'd be different from somehow "converting" Noam Chomsky's works into a <SLAP> feature and building it into the base model.
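To make that concrete, here's a minimal sketch using OpenAI's tiktoken library (the exact subword split shown in the comment is illustrative and depends on the vocabulary):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
token_ids = enc.encode("<SNUGGLE>")

# A string with no dedicated entry in the tokenizer's vocabulary gets split
# into several generic subword tokens; there is no single <SNUGGLE> "button"
# for the model to attach behavior to.
print(token_ids)
print([enc.decode([t]) for t in token_ids])  # e.g. something like ['<', 'SN', 'UG', 'GLE', '>']
```

Actually making such a token meaningful would require adding it to the vocabulary and then training (or at least fine-tuning) the model on data that uses it.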
Edit/Addendum: I have just read your post on infohazards. I retract my criticisms; as a fellow infopandora and jessicata stan, I think the site requires more of your work.
As an additional reference, this talk from the University of Chicago has been very helpful for me and might be helpful for you too.
The presenter, Larry McEnerney, talks about why the most important thing is not what original work or feelings we have; he argues that it's about changing people's minds, and that we writers must understand the reader- and community-driven norms at work in this process.
Thanks for that talk. I actually took the class that McEnerney taught at UChicago, and it greatly improved my writing.
One big issue is that you are not respecting the format of LW: add more context, and either link to a document directly or put the text inline. Resolving this would cover half of the most downvoted posts. You can ask people to review your posts for this before submitting.
Another big issue is that you are a prolific writer, but not a good editor. Just edit more: your writing could be ~5x shorter without losing anything meaningful. You have an overly academic style in your scientific writing; it's not good on the internet, and not even good in scientific papers. A good take here: https://archive.is/29hNC
From "The elements of Style": "Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid all detail and treat his subjects only in outline, but that every word tell."
Also, you are trying to move too fast, pursuing too many fronts. Why don't you just focus on one thing for some time, clarify and polish it enough so that people can actually grasp clearly what you mean?
I just awarded all of the prizes, but this answer feels pretty useful. You can also claim $100 if you want it.
Most of this work was done in 2018 or before, and just never shared with the mainstream alignment community. The only reason it looks like I'm trying to move too fast is that I am trying to get credit for the work I have done too fast.
What would you recommend I polish first? My intuition says Ethicophysics I, just because that sounds the least rational and is the most foundational.
In short:
Is this clear enough:
I posit that the reason that humans are able to solve any coordination problems at all is that evolution has shaped us into game players that apply something vaguely like a tit-for-tat strategy meant to enforce convergence to a nearby Schelling Point / Nash Equilibrium, and to punish defectors from this Schelling Point / Nash Equilibrium. I invoke a novel mathematical formalization of Kant's Categorical Imperative as a potential basis for coordination towards a globally computable Schelling Point. I believe that this constitutes a prom...
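To make the tit-for-tat piece concrete, here's a minimal sketch of the standard iterated-prisoner's-dilemma version of the strategy (textbook payoffs, purely illustrative; this is not the formalization of the Categorical Imperative from the paper):

```python
# Illustrative only: tit-for-tat cooperates first, then mirrors the
# opponent's previous move, rewarding cooperation and punishing defection.
PAYOFF = {  # (my move, their move) -> (my points, their points)
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(history_b)  # each player sees the other's past moves
        b = strategy_b(history_a)
        history_a.append(a)
        history_b.append(b)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30): stable mutual cooperation
print(play(tit_for_tat, always_defect))  # (9, 14): one-time gain, then punished
```

Tit-for-tat sustains mutual cooperation against itself and quickly punishes a defector, which is the "punish defectors from the equilibrium" dynamic claimed above.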
(preface: writing and communicating are hard, and i'm glad you are trying to improve)
i sampled two:
this post was hard to follow, and didn't seem to be very serious. it also reads as unfamiliar with the basics of the AI Alignment problem (the proposed changes to gpt-4 don't concretely address many/any of the core Alignment concerns, for reasons addressed by other commenters)
this post makes multiple (self-proclaimed controversial) claims that seem wrong or are not obvious, but doesn't try to justify them in-depth.
overall, i'm getting the impression that 1) your ideas are wrong and you haven't thought about them enough, and/or 2) you aren't communicating them well enough. i think the former is more likely, but it could also be some combination of both. i think this means that:
I did SERI-MATS in the winter cohort in 2023. I am as familiar with the alignment field as is possible without having founded it or been given a research grant to work in it professionally (which I have sought but been turned down for in the past).
I'm happy to send out drafts, and occasionally I do, but the high-status people I ask to read my drafts never quite seem to have the time to read them. I don't think this is because of any fault of theirs, but it also has not conditioned me to seek feedback before publishing things that seem potentially controversial.
I haven't seen this mentioned explicitly, so I will. Your tone is off relative to this community, in particular ways that give rise to legitimate complaints.
You do a good job of sounding humble in some places, but your most-downvoted "ethicophysics I" sounds pretty hubristic. It seems to claim that you have a scientifically sound and complete explanation for religion and for history. Those are huge claims, and they're mentioned with no hint of epistemic modesty (recognizing that you're not sure you're right).
This community is really big on epistemic modesty, and I think there's a good reason. It's easier to have productive discussions when everyone doesn't just assume they're sure they're right, and assume the problem must be that others don't recognize their infallible logic and evidence.
The other big problem with the tone and content of that post is that it doesn't mention a single previous bit of work or thought, nor does it use terminology beyond "alignment" indicating that you have read others' theories before writing about your own. I think this is also a legitimate cultural expectation. Everyone has limited reading time, so rereading the same ideas stated in different terms is a bad idea. If you haven't read the previous literature, you're probably restating existing ideas, and you can't help the reader know where your ideas are new.
I actually upvoted that post because it's succinct and actually addresses the alignment problem. But I think tone is a big reason people downvote, even if they don't consciously recognize why they disliked something.
Well, what's the appropriate way to act in the face of the fact that I AM sure I am right? I've been offering public bets of a high-karma person's nickel against my $100, which seems like a fair and attractive bet for anyone who doubts my credibility and my ability to reason about the things I am talking about.
I will happily bet anyone with significant karma that Yudkowsky will find my work on the ethicophysics valuable a year from now, at the odds given above.
I have around 2K karma and will take that bet at those odds, for up to 1000 dollars on my side.
Resolution criteria: ask EY about his views on this sequence as of December 1st, 2024, literally "which of Zac or MadHatter won this bet"; the bet resolves with no payment if he declines to respond or does not explicitly rule for any other reason.
I'm happy to pay my loss by eg Venmo, and would request winnings as a receipt for your donation to GiveWell's all-grants fund.
Hey @MadHatter - Eliezer confirms that I've won our bet.
I ask that you donate my winnings to GiveWell's All Grants fund, here, via credit card or ACH (preferred due to lower fees). Please check the box for "I would like to dedicate this donation to someone" and include zac@zhd.dev as the notification email address so that I can confirm here that you've done so.
We've been in touch, and agreed that MadHatter will make the donation by the end of February. I'll post a final update in this thread when I get the confirmation from GiveWell.
I strongly downvoted Homework Answer: Glicko Ratings for War. The reason is that it appears to be a pure data dump not intended to be actually read by a human. As it is a follow-up to a previous post, it might have been better as a comment or an edit on the original post, linking to your github with the data instead.
Looking at your post history, I will propose that you could improve the quality of your posts by spending more time on them. There are only a few users who manage to post multiple times a week and consistently get many upvotes.
A "Gallifreyan" sounds also like a Doctor Who timelord (IE an alien from Galifrey).
That was the inspiration. It's meant to be an RLHF cost function corresponding to the question "What would the Doctor think about what you just said?"
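If it helps, here is a hypothetical sketch of what I mean (a toy illustration, not the actual prototype; `doctor_judge` is a trivial keyword stand-in for a learned reward model or a prompted LLM judge):

```python
# Toy stand-in for an RLHF cost term. A real system would query a reward
# model (or an LLM prompted to role-play the Doctor) instead of keywords.
def doctor_judge(text: str) -> float:
    """Return disapproval in [0, 1]: how much the Doctor would object."""
    disapproved = ("exterminate", "cruel", "deceive")
    hits = sum(word in text.lower() for word in disapproved)
    return min(1.0, hits / len(disapproved))

def gallifreyan_cost(response: str) -> float:
    # Lower cost means the Doctor would approve; this could be added as a
    # penalty term to an RLHF objective.
    return doctor_judge(response)

print(gallifreyan_cost("We should exterminate the opposition."))  # 0.33...
print(gallifreyan_cost("Let's look for a kind solution."))        # 0.0
```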
It looks like bad luck.
I'd just note: beware the concept that social "status" is a one-dimensional variable. But yes, the things the typical one-dimensional characterization refers to are, in very many cases, summaries of real dynamics.
my brain is insufficiently flexible to be able to surrender to social-status incentives without letting that affect my ability to optimise purely for my goal. the costs of compromise (++) between different optimisation criteria are steep, so i would encourage more people to rebel against prevailing social dynamics. it helps you think more clearly. it also makes you miserable, so you have to balance it with concerns about motivation. altruism never promised to be easy. 🍵
One of the homework questions in Ethicophysics II is to compile that list and then search for patterns in it. For instance, Islamic fighting groups comprise both the uppermost and the lowermost echelons. (The all-time lowest rating is for ISIS.)
Since no one else is doing my homework problems, I thought I would prime the pump by posting a partial answer and encouraging people to trawl through it for patterns.
This seems like very solid advice. I will switch to shortform.
Just added a post entitled My Research Agenda that starts to document this, and added a new Sequence to contain the posts I am working on.
This is a reply to lc's answer, not an answer.
I agree it sounds like I am trolling, but I am paying literal factual dollars to a contractor to build out a prototype of an LLM-powered app that can emit performative speech acts in this fashion.
My job title is engineering fellow, and the name of the company where I work ends in "AI".
In another of your most downvoted posts, you say
I kind of expect this post to be wildly unpopular
I think you may be onto something here.
weird, i was intending that as a reply to @trevor's answer, but it got plopped as its own answer instead.
I found some of your posts to be really difficult to read. I still don't really know what some of them are even talking about, and on originally reading them I was not sure whether there was anything even making sense there.
Sorry if this isn't all that helpful. :/
They were difficult to write, and even more difficult to think up in the first place. And I'm still not sure whether they make any sense.
So I'll try to do a better job of writing expository content.
Wow, I came here fully expecting this post to have been downvoted to oblivion, and then realized this was not reddit and the community would not collectively downvote your post as a joke.
I've had reddit redirect here for almost a year now (with some slip-ups here and there). It's been fantastic for my mental health.
I feel like I've posted some good stuff in the past month, but the bits that I think are coolest have pretty consistently gotten very negative karma.
I just read the rude post about rationalist discourse basics, and, while I can guess why my posts are receiving negative karma, that would involve a truly large amount of speculating about the insides of other people's heads, which is apparently discouraged. So I figured I would ask.
I will offer a bounty of $1000 for the answer I find most helpful, and a bounty of $100 for the next most helpful three answers. This will probably be paid out over Venmo, if that is a decision-relevant factor.
Note that I may comment on your answer asking for clarification.
Edit 11-30-2023 1:27 AM: I have selected the recipients of the bounties. The grand prize of $1000 goes to @Shankar Sivarajan. The three runner-up prizes of $100 go to @tslarm, @Joe Kwon, and @trevor. Please respond to my DM to arrange payment or select a worthy charity to receive your winnings.
Edit 11-30-2023 12:08 PM: I have paid out all four bounties. Please contact me in DM if there is any issue with any of the bounties.