All of Iknownothing's Comments + Replies

Yup, that's definitely something that can be argued by people on the Against side during the Debate Stage!
And they might come to the same conclusion!

1Remmelt
Sure. Keep in mind that as an organiser, you are setting the original framing.
2Jan Christian Refsgaard
Ahh, I know that's a first-year course for most math students, but only math students take that class :). I have never read an analysis book :) - I took the applied path and read 3 other Bayesian books before this one, so I thought the math in this book was simultaneously very tedious and basic :)

I'm not a grad physics student - I don't have a STEM degree, or the equivalent - but I found the book very readable nonetheless. It's by far my favourite textbook - it feels like it was actually written by someone sane, unlike most.

2Jan Christian Refsgaard
That's surprising to me. I think you can read the book two ways: 1) you skim the math, enjoy the philosophy, and take his word that the math says what he says it says; 2) you try to understand the math. If you take 2), then you need to at least know the chain rule of integration and what a Dirac delta function is, which seem like high-level math concepts to me. Full disclaimer: I am a biochemist by training, so I have also read it without the prerequisite formal training. I think you are right that if you ignore chapter 2 and a few sections about partition functions and such, then the math level for the other 80% is undergraduate-level math.

I'm really glad you wrote this! 
I think you address an important distinction there, but I think there might be a further one to be made- in that how we measure/tell if a model is aligned in the first place. 
There seems to be a growing voice which says that if a model's output seems to be the output we might expect from an aligned AI, then it's aligned. 
I think it's important to distinguish that from the idea that the model is aligned only if you actually have a strong idea of what its values are, how it's gotten them, etc. 

I'm really excited to see this!! 
I'd like it if this became embeddable so it could be used on ai-plans.com and on other sites!!
Goodness knows, I'd like to be able to get summaries and answers to obscure questions on some alignmentforum posts!

What do you think someone who knows about PDP knows that someone with a good knowledge of DL doesn't?
And why would it be useful?

I think folks in AI Safety tend to underestimate how powerful and useful liability and an established duty of care would be for this.

I think calling things a 'game' makes sense to lesswrongers, but just seems unserious to non-lesswrongers.

I don't think a lack of IQ is the reason we've been failing at making AI sensibly. Rather, it's a lack of good incentive design. 
Making AI recklessly is currently much more profitable than not doing so - which, imo, shows a flaw in the efforts that have gone towards making AI safe: not accepting that some people have very different mindsets/beliefs/core values, and not figuring out a structure/argument that would incentivize people across a broad range of mindsets.

Hasn't Eliezer Yudkowsky largely failed at solving alignment and at getting others to solve alignment? 
And wasn't he largely responsible for many people noticing that AGI is possible and potentially highly fruitful?
Why would a world where he's the median person be more likely to solve alignment?

2Zack_M_Davis
In a world where the median IQ is 143, the people at +3σ are at 188. They might succeed where the median fails.

Update: Rob Miles will also be judging some critiques! He'll be judging Communication!

Hi, I'm Kabir Kumar, the founder of AI-Plans.com, I'm happy to answer any questions you might have about the site or the Critique-a-Thon!

Hi, we've already made a site which does this!

Answer by Iknownothing

Probably much better for health overall to have a bowl of veg and fruit at your table for easy healthy snacking (carrots, cucumber, etc)

Most of my knowledge of dependencies and addictions comes from a brief study I did on neurotransmitters' roles in alcohol dependence/abuse while in school, for an EPQ, so I'm really not sure how much of this applies. Also, a lot of my study was finding that my assumptions were in the wrong direction (I didn't know about endorphins) - but I think a lot of the stuff on neurotransmitters and receptors holds across different areas. Take it with some salt, though. 

Quitting cold turkey rarely ever works for addictions/dependencies. The vast majority of t... (read more)

When I say media, I mean social media, movies, videos, books etc- any type of recording or something that you believe you're using as entertainment. 

I'm trying this myself. I've done single days before, sometimes 2 or 3 days, but failed to keep it consistent. I did find that when I did it, my work output was far higher and of greater quality, I had a much better sleep schedule, and I was generally in a much more enjoyable mood.
I also ended up spending more time with friends and family, meeting new people, trying interesting things, spending time outdoors, e... (read more)

2trevor
I predict (losing Bayes points if I'm wrong) that most people will have a similar experience, but I also predict that the best strategy is to quit cold turkey; nicotine does not run SGD to notice that user retention is at risk and autonomously take actions that were successful at mitigating risk in the past.  It would be hard for them to make their systems not optimize in weird ways due to Goodhart's law; furthermore, anyone running a successful social media platform would need to give the algorithms a wide leeway to experiment with user retention, since competitor platforms might be running systems that also autonomously form novel strategies.

A challenge for folks interested: spend 2 weeks without media-based entertainment. 

2trevor
I'd love it if people could try the basic precautions and see how harmless they are! Especially because they might be the minimum ask in order to avoid getting your brain and motivation/values hacked. I guess there would be bonus points for avoiding watching videos that millions of other people have watched.

"CESI’s Artificial Intelligence Standardization White Paper released in 2018 states that “AI systems that have a direct impact on the safety of humanity and the safety of life, and may constitute threats to humans” must be regulated and assessed, suggesting a broad threat perception (Section 4.5.7).42 In addition, a TC260 white paper released in 2019 on AI safety/security worries that “emergence” (涌现性) by AI algorithms can exacerbate the black box effect and “autonomy” can lead to algorithmic “self-improvement” (Section 3.2.1.3).43"
From https://concordia-consult... (read more)

I disagree with this paragraph today: "A lot of what AI does currently, that is visible to the general public seems like it could be replicated without AI"

I was talking about for a farmer. For a consumer, they can get their eggs/milk from such a farmer and fund/invest in such a farm, if they can. 
Or talk to a local farm about setting aside some chickens, pay for them to be given extra space, better treatment, etc.

I don't really know what you mean about the EA reducetarian stuff. 

Also, if you as an individual want to be healthy, not contribute to harming animal and have the time, space, money, willingness etc to raise some chickens, why not? 

Exercise in general is pretty great, yes. Especially if done outdoors, imo.

Could a solution to some of this be to raise some chickens for eggs, treat them nicely, give them space to roam, etc? 
Obviously the best would be to raise cows as well, treat them well, don't kill the male calves, etc- but that's much less of an option for most.

3orthonormal
Your framing makes it sound like individual raising of livestock, which is silly—specialization of expertise and labor is a very good thing, and "EA reducetarians find or start up a reasonably sized farm whose animal welfare standards seem to them to be net positive" seems to dominate "each EA reducetarian tries to personally raise chickens in a net positive way" (even for those who think both are bad, the second one seems simply worse at a fixed level of consumption).

This is great! Thank you for doing this! Might add some of these to ai-plans.com!

3Logan Zoellner
Cool site! It doesn't look like there's a button for "Add a strength" on e.g. https://ai-plans.com/post/f180b51d7e6a (although it appears possible to do so if I click the "show post" button). I also wish there was some way to understand the depth/breadth of plans.  E.g. is this a "full alignment plan" (examples would be The Plan or Provably Safe Systems) or is this a narrow technical research direction (e.g. this post)? Ideally, there would be some kind of prediction-market-style mechanism that assigned "dignity points" to plans that were most likely to contribute significantly to AI Alignment.

I think this kind of thing makes people feel like you're pushing a message, to which the automatic response is to push back.
What I've found works is to be agreeable, inviting, meet them at their own values, and present it as a hard problem to solve which isn't being competently tackled by this other dumb group (not us, we wouldn't do this). 
That kind of thing. Had a 100% success rate so far.
I'm simplifying my approach, since I'm not spending a lot of time on this, but if you imagine I'm not a dumbass and think about what kind of approach like this could work a lot, while not being dumb in that it doesn't actually address the problem, you'll probably get what I mean.

I'm generally disincentivized to post or put effort into a post from the system where someone can just heavily downvote my post, without even giving a reason.
 

A simple way to improve this system would be to require someone to comment/give a reason when heavily upvoting/heavily downvoting things. 
 

"In the ancestral environment, politics was a matter of life and death." - this is a pretty strong statement to make with no evidence to back it up.

5Rebecca
They’re talking about technical research orgs/labs, not ancillary orgs/projects

I think your ideas are some of the most promising I've seen - I'd love to see them pursued further, though I'm concerned about the air-gapping.

Hi Ruby! Thanks for the great feedback!! Sorry for the late reply, I've been working on the site!

So, we're not doing just criticisms anymore - we're ranking plans by Total Strength score minus Total Vulnerability score. Quite a few researchers have been posting their plans on the site!
Going to do a full rebuild soon, to make the site look nicer and be even faster to work on.
We're also holding regular critique-a-thons. The last one went very well! 
We had 40+ submissions and produced what I think is really great work!
We also made a Broad List of Vulnerabi... (read more)

This was really great. Thanks for making it.

I was curious why Trump was dropping some of the best takes!

Yeah, I think you're right- at least about the sequences. 

I think something more specific about attitudes would be more accurate and useful.

Thank you! I've sorted that now!!

Please let me know if you have any other feedback!!

From my very spotty info on evolution:
Humans got 'trained' to maximise reproduction, and in doing so maximised a bunch of other stuff along the way - including resource acquisition.

What I spoke about here is creating an environment where a more intelligent and faster agent is deliberately placed such that it can only survive by helping much dumber, slower agents - training it to act co-operatively. 

Writing this out, I may have just made an overcomplicated version of reinforcement learning.

That was something like what I was thinking. But I think this won't work, unless modified so much that it'd be completely different. More an idea to toss around.
 

I'll start over with something else. I do think something that might have value is designing an environment that induces empathy/values/whatever, rather than directly trying to design the AI to be what you want from scratch. 
Environment design can be very powerful in influencing humans, but that's in huge part because we (or at least, those of us who put thought in designing environments... (read more)

1mishka
Yes, I think we are looking at "seeds of feasible ideas" at this stage, not at "ready to go" ideas... I tried to look at what it would take for super-powerful AIs
* not to destroy the fabric of their environment together with themselves and everything
* to care about "interests, freedom, and well-being of all sentient beings"
That's not too easy, but might be doable in a fashion invariant with respect to recursive self-modification (and might be more feasible than more traditional approaches to alignment). Of course, the fact that we don't know what's sentient and what's not sentient does not help, to say the least ;-) But perhaps we and/or AIs and/or our collaborations with AIs might figure this out sooner rather than later... Anyway, I did scribble a short write-up on this direction of thinking a few months ago: Exploring non-anthropocentric aspects of AI existential safety

On the porch/outside/indoors thing - maybe that's not a great example, because having the numbers there seems to add nothing of value to me. Other than maybe clarifying to yourself how you feel about certain ideas/outcomes, but that's something that anyone with decent thinking does anyways.

2moridinamael
The Party Problem is a classic example taught as an introductory case in decision theory classes, that was the main reason why I chose it.

Sorry, I think I have an idea of what you're saying, but I'm not really sure. Do you mind elaborating? With a little less LessWrong lingo, please.

Absolutely! 

One of the reasons I've gone against the idea of tags, different ways of sorting, etc. (though they get brought up a lot) is that they could lead to the plans which are most attractive, understandable, or appealing at first glance getting the most attention.
It's very important that what a criticism's points measure is the validity of the criticism to the plan and not something else - though, I think if there are two criticisms making the same point and one gets a higher amount of points because it's more readable... (read more)

2Gurkenglas
Suppose an outcome pump picks a random property, checks if papers with it Goodhart your points, and time-loops until it finds one. Do you think it would eventually find one? Unfortunately, optimization tries all properties in parallel, without even an outcome pump. Treat hardness proofs (perpetual motion, NP, ...) as neon tubes on the box to think outside of. Find any difference between the proven-hard problem and yours (usually exists!), then imagine leads that wouldn't help on the proven-hard problem, leads you don't get better at ruling out by knowing the existing proof. To not fall to the dire kind of "adversary" that moves after you, don't calculate a number.

Thank you, I think there's an error in my phrasing. 
I should have said: 

Currently, it takes a very long time to get an idea of who is doing what in the field of AI Alignment and how good each plan is, what the problems are, etc.

Thank you very much for this. 
I agree, it does seem like this way, people will end up getting a bunch of karma even for bad criticisms. Which would defeat the whole point of the points system.

I'm not sure I fully understand "So I would rather make sure that the bottom half of criticism gets an increasing potential for negative karma impact, by applying a weight on the upvote points starting from 1 for the median criticism, and progressing towards 0 for the worst criticism. (goodness can be measured as unweighted votes divided by number of votes.)"

I th... (read more)

2Zoltan Foris
Let me explain this suggestion of mine: "So I would rather make sure that the bottom half of criticism gets an increasing potential for negative karma impact, by applying a weight on the upvote points starting from 1 for the median criticism, and progressing towards 0 for the worst criticism. (goodness can be measured as unweighted votes divided by number of votes.)" I'll explain with an example. Say 800 criticisms arrived in January 2024 in total, and all have their upvote/downvote-based points (let us say as of 15 Feb); let me call these "raw points". We put them in order of increasing raw points. Let the worst be -5, the 100th 5, the 400th (the middle one) 25, and the top one 110. Now a multiplier "m" is calculated for the bottom 400 criticisms: it will be 1-(400-x)/400, where x is the rank of the criticism, so x=1 for the worst one, x=100 for the 100th one. Now, for example, the worst criticism had raw points of -5, calculated as upvote points minus downvote points (raw = up - down); let us assume total upvote points of 10 and total downvote points of 15, so -5 = 10-15. We now apply the multiplier: final points = m*up - down. In this example, final points = (1-399/400)*10 - 15 = -14.975, so approximately -15, because a heavy multiplier has decreased the value of the upvotes.
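The weighting scheme described above can be sketched in a few lines. This is just an illustrative sketch of the proposal, not anything implemented on the site; the function name and parameters are my own labels for the quantities in the example (rank x within the bottom half, and the size of that bottom half):

```python
def weighted_points(raw_up, raw_down, rank, n_bottom):
    """Proposed downweighting for a criticism in the bottom half by raw points.

    rank: position in the bottom half, 1 = worst criticism, n_bottom = median.
    The multiplier m grows linearly from ~0 (worst) to 1 (median), so only the
    upvote points are discounted; downvotes count in full.
    """
    m = 1 - (n_bottom - rank) / n_bottom
    return m * raw_up - raw_down

# Worked example from the comment: the worst of 400 bottom-half criticisms,
# with 10 upvote points and 15 downvote points.
print(weighted_points(10, 15, rank=1, n_bottom=400))  # ≈ -14.975, i.e. about -15
```

At the median (rank = n_bottom) the multiplier is exactly 1, so the final score reduces to the ordinary raw score, which matches the intent that the weighting only penalizes the bottom half.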

Not just that - it's because the field isn't organized at all. 

2TAG
Alignment -- getting the utility function exactly right -- and Control are the two main proposals for AI safety. Whilst LeCun's proposal isn't alignment, it is control.

I'm really interested about what you mean here!

1worse
Idk the public access of some of these things, like with nonlinear's recent round, but seeing a lot of apps there and organized by category, reminded me of this post a little bit. edit - in terms of seeing what people are trying to do in the space. Though I imagine this does not capture the biggest players that do have funding.

This plan originated from the idea of trying to have a hackathon to disprove alignment plans. I'm still very interested in that!

I don't think you meant for it to be, but this, like a lot of EA stuff, reads like it was written by a psychopath slightly obsessed with helping people. 
Though, to be fair, a lot of EA stuff seems more like an obsession with 'making the world a better place' than helping people, so this is actually less disturbing than a lot of EA stuff.

Edit: which is probably part of why so many people are turned off by EA stuff.

5Mitchell_Porter
Seems the same as a thousand other reports written by people at the intersection of volunteer work and organized charity, trying to ameliorate poverty, domestic violence, you name it. I really don't see what's "disturbing" about it (let alone "psychopathic"!). 

Thank you, I will look at these.
