All of Noah Topper's Comments + Replies

Domain: Miscellaneous (Running D&D/RPGs)

Link: Your First Adventure, Your First Town, Prepping an Adventure, Sandboxing

Person: Matt Colville

Why: Matt's videos are really informative and his primary goal is to show that DMing is a straightforward thing that anyone can do, not something reserved for certain members of the tabletop RPG hobby. I've tried to pick out a few of his videos that are especially hands-on, but he has lots more which are very useful. I recommend exploring them at your leisure if you're into TTRPGs at all.

Thanks for coming. :)

Answer by Noah Topper

I am confused by your confusion. Your basic question is "what is the source of the adversarial selection". The answer is "the system itself" (or in some cases, the training/search procedure that produces the system satisfying your specification). In your linked comment, you say "There's no malicious ghost trying to exploit weaknesses in our alignment techniques." I think you've basically hit on the crux, there. The "adversarially robust" frame is essentially saying you should think about the problem in exactly this way.

I think Eliezer has conceded that Stu... (read more)

Quintin Pope
I think that the intuitions from "classical" multivariable optimization are poor guides for thinking about either human values or the cognition of deep learning systems. To highlight a concrete (but mostly irrelevant, IMO) example of how they diverge: this claim is largely false for deep learning systems, whose parameters mostly[1] don't grow to extreme positive or negative values. In fact, in the limit of wide networks under the NTK initialization, the average change in parameter values goes to zero.

Additionally, even very strong optimization towards a given metric does not imply that the system will pursue whatever strategy is available that minimizes the metric in question. E.g., GPT-3 was subjected to an enormous amount of optimization pressure to reduce its loss, but GPT-3 itself does not behave as though it has any desire to decrease its own loss. If you ask it to choose its own curriculum, it won't default to the most easily predicted possible data.

Related: humans don't sit in dark rooms all day, even though doing so would minimize predictive error, and their visual cortexes both (1) optimize towards low predictive error, and (2) have pathways available by which they can influence their human's motor behavior.

Related: Reward is not the optimization target

[1] Some specific subsets of weights, such as those in layer norms, can be an exception here, as they can grow into the hundreds for some architectures.
Noosphere89
Another source of adversarial robustness issues relates to the model itself becoming deceptive. As for this: I unfortunately think this is exactly what real-world AI companies are building.

I mean, fair enough, but I can't weigh it against every other opportunity available to you on your behalf. I did try to compare it to learning other languages. I'll add to the post that I also think it's comparatively easy to learn.

Jiro
Your post only really compares it to other languages for the purpose of saying "yes, it really is a language", not for deciding whether learning another language would be better than learning it. I'm not expecting you to compare it to every single possibility, but it does take a lot of time, and that's something you need to take into account: the opportunity cost is huge, and you're really glossing over it. There are a lot of things you can do in four years other than learn ASL.

FWIW I genuinely think ASL is easy to learn with the videos I linked above. Overall I think sign is more worthwhile to learn than most other languages, but yes, not some overwhelming necessity. Just very personally enriching and neat. :)

Alicorn
I watched one of the videos and it was clearly a great example of the category.  And yet.  I think ease of learning varies with language and also with learner.  ASL in particular seems likely to be very interpersonally variable - I definitely found it harder than making equivalent progress in French, Chinese, or Japanese, and those last two are famously considered difficult for English natives.  It requires manual dexterity!  If you get confused in the middle of a sign language sentence you're going to poke yourself in the ear or tangle your elbows together or something.  You have to look at people's facial expressions; they have grammatical import - you have to look at those and at their hands.  There's no good way to take notes because it has no written form or transliteration; I wound up, in my class, writing down things like "quotey eyes" (I don't even remember what that word was) and trying to hang muscle memory on the resemblance between "sorry" and "Canada".  I'm glad you find it easy and exciting!  But I believe you're overgeneralizing.

It's entirely just a neat thing. I think most people should consider learning to sign, and the idea of it becoming a rationalist "thing" just sounded fun to me.  I did try to make that clear, but apologies if it wasn't. And as I said, sorry this is kind of off topic, it's just been a thing bouncing around in my head.

Honestly I found ASL easier to learn than, say, the limited Spanish I tried to learn in high school. Maybe because it doesn't conflict with the way you already communicate. Just from watching the ASL 1-4 lectures I linked to, I was surprisingly able to manage once dropped into a one-on-one conversation with a deaf person.

It would definitely be good to learn with a buddy. My wife hasn't explicitly learned it yet, but she's picked up some from me. Israel is a tough choice; I'm not sure what the learning resources are like for it.

Sameerishere
You should put this in your main post - it greatly increased my interest in actually trying to learn.

...and now I am also feeling like I really should have realized this myself.

I agree that there isn’t an “obvious” set of assumptions for the latter question that yields a unique answer. And granted, I didn’t really dig into why entropy is a good measure, but I do think it ultimately yields the unique best guess given the information you have. The fact that it’s not obvious is rather the point! The question has a best answer, even if you don’t know what it is or how to give it.
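
To make that concrete, here is a minimal sketch of the maximum-entropy setup I have in mind, assuming a finite outcome space and generic expectation constraints (the symbols f_k, c_k, and λ_k are standard notation, not anything from this thread):

```latex
% Maximum-entropy inference on a finite outcome space: among all
% distributions consistent with what you know, pick the one that
% maximizes entropy.
\[
  \max_{p} \; H(p) = -\sum_i p_i \log p_i
  \quad \text{subject to} \quad
  \sum_i p_i = 1, \qquad \sum_i p_i f_k(i) = c_k .
\]
% H is strictly concave and the constraint set is convex, so the
% maximizer is unique and takes the exponential-family form
\[
  p_i \;\propto\; \exp\!\Big( \sum_k \lambda_k f_k(i) \Big),
\]
% with the multipliers lambda_k pinned down by the constraints.
% This is the sense in which the question has a single best answer
% given exactly the information you have.
```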

In any real-life inference problem, nobody is going to tell you: "Here is the exact probability space, with a precise, known probability for each outcome." (I literally don't know what such a thing would mean anyway.) Is all inference thereby undefined? As Einstein said, "As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality." If you can't actually fulfill the axioms in real life, what's the point?

If you still want to make inferences anyway, I think you're going to have to adopt ... (read more)

JBlack
Sure! The original problem invites assumptions of symmetry and integrality[1] and so on, and the implied assumptions result in a fully specified probability model. The adjusted problem adds information that necessarily breaks symmetry, and there is no correspondingly "obvious" set of assumptions that leads to a unique answer.

So yes, communicating probability problems requires some shared concepts and assumptions, just like communication about anything else. I've previously commented here about my experience with word problems in teaching mathematics, and how the hardest part is not anything to do with the mathematics itself, but with background assumptions about interpreting the problem that were not explicitly taught.

[1] The original problem never said that the random number generator only gives integers in the range 1 to 6. That's an additional assumption.

Would one be allowed to make multiple submissions distilling different posts? I don't know if I would necessarily want to do that, but I'm at least curious about the ruling.

Cool, makes sense. I was planning on making various inquiries along these lines starting in a few weeks, so I may reach out to you then. Would there be a best way to do that?

Davidmanheim
Nope, find me online, I'm pretty easy to reach.

But even for someone still in school, one would need the background of having proved non-trivial original theorems? This all sounds exactly like the research agenda I'm interested in. I have a BS in math and am working on an MS in computer science. I have a good math background, but not at that level yet. Should I consider applying or no?

Davidmanheim
For this position, we are looking for people already able to contribute at a very high level. If you're interested in working on the agenda to see if you'd be able to do this in the future, I'd be interested in chatting separately, looking at whether some form of financial support or upskilling would be useful, and looking at where to apply for funding.

I said nothing about an arbitrary utility function (nor proof for that matter). I was saying that applying utility theory to a specific set of terminal values seems to basically get you an idealized version of utilitarianism, which is what I thought the standard moral theory was around here.

TAG
If you know the utility function that is objectively correct, then you have the correct metaethics, and VNM-style utility maximisation only tells you how to implement it efficiently. The first thing is "utilitarianism is true"; the second thing is "rationality is useful". But that goes back to the issue everyone criticises: EY recommends an object-level decision... prefer torture to dust specks... unconditionally, without knowing the reader's UF. If he had succeeded in arguing, or even tried to argue, that there is one true objective UF, then he would be in a position to hand out unconditional advice. Or if he could show that preferring torture to dust specks is rational given an arbitrary UF, then he could also hand out unconditional advice (in the sense that conditioning on a subjective UF doesn't make a difference). But he doesn't do that, because if someone has a UF that places negative infinity utility on torture, that's not up for grabs... their personal UF is what it is.

None of what you have linked so far has particularly conveyed any new information to me, so I think I just flatly disagree with you. As that link says, the "utility" in utilitarianism just means some metric or metrics of "good". People disagree about what exactly should go into "good" here, but godshatter refers to all the terminal values humans have, so that seems like a perfectly fine candidate for what the "utility" in utilitarianism ought to be. The classic "higher pleasures" in utilitarianism lend credence to this fitting into the classical frame... (read more)

Said Achmiz
You don’t get utilitarianism out of it because, as explained at the link, VNM utility is incomparable between agents (and therefore cannot be aggregated across agents). There are no versions of utilitarianism that can be constructed out of decision-theoretic utility. This is an inseparable part of the VNM formalism.

That having been said, even if it were possible to use VNM utility as the “utility” of utilitarianism (again, it is definitely not!), that still wouldn’t make them the same theory, or necessarily connected, or conceptually identical, or conceptually related, etc. Decision-theoretic expected utility theory isn’t a moral theory at all. Really, this is all explained in the linked post…

Re: the “EDIT:” part: No, I do not agree that he’s doing this. Yes, he’s a utilitarian. (“Torture vs. Dust Specks” is a paradigmatic utilitarian argument.)

I would call that “being confused”. How to (coherently, accurately, etc.) map “human well-being” (whatever that is) to any usable scalar (not vector!) “utility” which you can then maximize the expectation of, is probably the biggest challenge and obstacle to any attempt at formulating a moral theory around the intuition you describe. (“Utilitarianism using VNM utility” is a classic failed and provably unworkable attempt at doing this.) If you don’t have any way of doing this, you don’t have a moral theory—you have nothing.
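
To spell out the incomparability point in symbols (a standard fact about the formalism, sketched here rather than quoted from the linked post): the VNM theorem fixes an agent's utility function only up to a positive affine transformation, which is exactly why sums across agents mean nothing.

```latex
% If u represents an agent's preferences, then so does any
% positive affine transformation of it:
\[
  u'(x) \;=\; a\,u(x) + b, \qquad a > 0 .
\]
% Rescaling one agent (say a = 1000 for agent 1) leaves all of
% that agent's choices unchanged, but can reverse the ranking
% induced by any interagent sum such as
\[
  u_1(x) + u_2(x),
\]
% so such sums carry no preference-independent meaning:
% aggregation across agents is undefined within the formalism.
```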

I meant to convey a utility function with certain human values as terminal values, such as pleasure, freedom, beauty, etc.; godshatter was a stand-in. 

If the idea of a utility function has literally nothing to do with moral utilitarianism, even around here, I would question why Eliezer references expected utility calculations in the above when discussing moral questions. I would also point to “intuitions behind utilitarianism” as drawing connections between the two. Or “shut up and multiply”? Need I go on?

I know classical utilitarianism is n... (read more)

TAG
If he has a proof that utilitarianism, as usually defined (the highly altruistic ethical theory), is equivalent to maximization of an arbitrary UF, given some considerations about coherence, then he has something extraordinary that should be widely known. Or he is using "utilitarianism" in a weird way... or he is not, and he is just confused.
Said Achmiz
Yes, I understood your meaning. My response stands.

What is the connection? Expected utility calculations can be, and are, relevant to all sorts of things, without being identical to, or similar to, or inherently connected with, etc., utilitarianism.

The linked post makes some subtle points, as well as some subtle mistakes (or, perhaps, instances of unclear writing on Eliezer’s part; it’s hard to tell). The “utility” of utilitarianism and the “utility” of expected utility theory are two very different concepts that, quite unfortunately and confusingly, share a term. This is a terminological conflation, in other words. Here is a long explanation of the difference.

Hm, I worry I might be a confused LWer. I definitely agree that "having a utility function" and "being a utilitarian" are not identical concepts, but they're highly related, no? Would you agree that, to a first approximation, being a utilitarian means having a utility function with the evolutionary godshatter as terminal values? Even this is not identical to the original philosophical meaning, I suppose, but it seems highly similar, and it is what I thought people around here meant.

Said Achmiz
This is not even close to correct, I’m afraid. In fact being a utilitarian has nothing whatever to do with the concept of a utility function. (Nor—separately—does it have much to do with “evolutionary godshatter” as values; I am not sure where you got this idea!) Please read this page for some more info presented in a systematic way.

I'm curious about what continued role you do expect yourself to have. I think you could still provide a lot of value in helping train up new researchers at MIRI. I've read you saying you've developed a lot of sophisticated ideas about cognition that are hard to communicate, but I imagine they could be transmitted more easily within MIRI. If we need a continuing group of sane people to be on the lookout for positive miracles, would you still take a relatively active role in passing on your wisdom to new MIRI researchers? I would genuinely imagine that being in more direct mind-to-mind contact with you would be useful, so I hope you don't become a hermit.

Chris_Leong
Agreed. If MIRI feels like they aren't going to be able to solve the problem, then it makes sense for them to focus on training up the next generation instead.

What do you think about ironically hiring Terry Tao?

Eliezer, do you have any advice for someone wanting to enter this research space at (from your perspective) the eleventh hour? I’ve just finished a BS in math and am starting a PhD in CS, but I still don’t feel like I have the technical skills to grapple with these issues, and probably won’t for a few years. What are the most plausible routes for someone like me to make a difference in alignment, if any?

I don't have any such advice at the moment.  It's not clear to me what makes a difference at this point.

I think I might actually be happy to take e.g. the Bellman equation, a fundamental equation in RL, as a basic expression of consistent utilities, and thereby claim value iteration, Q-learning, and deep Q-learning all as predictions/applications of utility theory. Certainly this seems fair if you get to count applications of the central limit theorem as applications of probability theory.

To expand a bit, the Bellman equation only expresses a certain consistency condition among utilities. The expected utility of this state must equal its immediate utility plus the best expected ut... (read more)
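
For reference, the consistency condition being described is the Bellman optimality equation, in its standard form (this is textbook notation, not a quote from the truncated comment above): the value of a state is the best available immediate reward plus the discounted expected value of the successor state.

```latex
% Bellman optimality equation: a pure consistency condition
% among state utilities.
\[
  V^{*}(s) \;=\; \max_{a} \Big[ R(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big]
\]
```

And value iteration is just the algorithm that applies this backup until it holds everywhere. A minimal sketch, with a toy MDP whose dictionary layout is my own illustrative assumption rather than anything from the thread:

```python
# Minimal value iteration: repeatedly apply the Bellman optimality
# backup until the value estimates stop changing.

# Toy MDP (assumed layout): transitions[state][action] is a list of
# (probability, next_state, reward) triples.
transitions = {
    "s0": {"stay": [(1.0, "s0", 0.0)], "go": [(1.0, "s1", 1.0)]},
    "s1": {"stay": [(1.0, "s1", 0.5)], "go": [(1.0, "s0", 0.0)]},
}
gamma = 0.9  # discount factor

V = {s: 0.0 for s in transitions}
while True:
    delta = 0.0
    for s, actions in transitions.items():
        # Bellman backup: best action's expected one-step return.
        best = max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in actions.values()
        )
        delta = max(delta, abs(best - V[s]))
        V[s] = best
    if delta < 1e-9:  # values are now self-consistent
        break

print(V)  # converged state utilities
```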

Hey all, organizer here. I don't know if you'll automatically get notified of this message, but I don't have emails for everyone. I just wanted to give some parking info. You can get a daily parking pass here: https://parking.ucf.edu/permits/visitor-permits/. You can get a virtual pass and they'll check by plate. It's $3. I'd recommend parking in Garage A or I. Hope to see everyone there!

From what I understand, it's difficult enough to get an abortion as it is. Clinics are rather rare, insurance doesn't always cover it, there may be mandatory waiting periods and counseling, etc. I don't think it would be impossible to still get one, but the added inconvenience is not trivial.  At minimum, a big increase in travel time and probable insurance complications. But if someone here knows more than me, I'd very much like to hear it.

I'd like to note that Texas is passing strong restrictions on abortion. They've passed a "heartbeat bill" banning abortions after six weeks, and it seems likely that they'll pass a trigger bill outlawing abortion almost entirely, contingent on the Supreme Court overturning Roe v Wade. 

I'm not a Supreme Court expert, but I know people who are sincerely worried about Roe v Wade being undone. This would be a pretty big deal breaker for my fiancée (and by extension myself). From what I read, the Supreme Court will make a Roe v Wade ruling in the middle of... (read more)

As a non-American: If the problem just applies to Texas, or to Republican states in general, are there substantial barriers to getting an abortion in another state (for rationalists)? I have often heard that argument made online for why passing state-level abortion bans is ineffective.

Adele Lopez
This would be a deal breaker for me too.

I think you need to fix the days listed on the application form; they say August 17th - 20th.

Linda Linsefors
Fixed! Thank you for pointing this out.

I don't have a wonderful example of this insta-feedback (which definitely sounds ideal for learning), but I've gotten annoyed lately with any math book that doesn't have exercises. Some of the books on MIRI's Research Guide list are like this, and it really boggles my mind how anyone could learn math from a book when they don't have anything to practice with. So I'm getting more selective.

Even some books with exercises are super hard, and really don't have any kind of walkthrough process. AI: A Modern Approach is a criti... (read more)

Well, I tend to throw them onto my general to-read list, so I'm not entirely sure. A few I remember are Gödel, Escher, Bach; Judgment Under Uncertainty: Heuristics and Biases; Influence: Science and Practice; The End of Time; QED: The Strange Theory of Light and Matter; The Feynman Lectures; Surely You're Joking, Mr. Feynman!; Probability Theory: The Logic of Science; Probabilistic Inference in Intelligent Systems; and The Player of Games. There's a longer list here, but it's marked as outdated.

Sounds awesome! A meatspace group would be great, I'm sure. One of my issues with self-study is having nobody to go to when I have questions or don't understand something. Having an empirical goal can also tell you if you've succeeded or failed in your attempt to learn the art.

Stefan De Young
Having a group of rationalists to talk to in person has been invaluable to me this year. It's helping me emerge from depression, overcome impostor syndrome, and find my project. The previous sentence reads like the battles have been won, but they are being fought constantly. Check this list of upcoming meetups: https://slatestarcodex.com/2018/08/09/ssc-meetups-2018-times-and-places/ Right now is a really good time to start or join meatspace communities.

I definitely agree that there's a bigger issue, but I think this could be a good small-scale test. Can we apply our own individual rationality to pick up skills relevant to us and distinguish between good and bad practices? Are we able to coordinate as a community to distinguish between good and bad science? Rationality should in theory be able to work on big problems, but we're never going to be able to craft the perfect art without being able to test it on smaller problems first and hone the skill.

So yeah. I think a guide putting together good ... (read more)

tinyanon
Give me a month to make a fitness one. I train a bunch of my friends, including one rationalist friend who has been pushing me towards writing some analyses of studies, so I have a good amount of experience trying to find ways to get people into fitness who've had issues fighting against their baser urge to just sit down and conserve calories.