All of lalaithion's Comments + Replies

Perhaps there is a different scheme for dividing gains from coöperation which satisfies some of the things we want but not superadditivity, but I’m unfamiliar with one. Please let me know if you find anything in that vein, I’d love to read about some alternatives to Shapley Value.

I had a weird one today; I asked it to write a program for me, and it wrote one about the Golden Gate Bridge, and when I asked it why, it used the Russian word for “program” instead of the English word “program”, despite the rest of the response being entirely in English.

1Ann
Kind of interesting how this is introducing people to Sonnet quirks in general, because that's within my expectations for a Sonnet 'typo'/writing quirk. Do they just not get used as much as Opus or Haiku?

I don't think the Elimination approach gives P(Heads|Awake) = 1/3 or P(Monday|Awake) = 2/3 in the Single Awakening problem. In that problem, there are 6 possibilities:

P(Heads&Monday) = 0.25

P(Heads&Tuesday) = 0.25

P(Tails&Monday&Woken) = 0.125

P(Tails&Monday&Sleeping) = 0.125

P(Tails&Tuesday&Woken) = 0.125

P(Tails&Tuesday&Sleeping) = 0.125

Therefore:

P(Heads|Awake)

= P(Heads&Monday) / (P(Heads&Monday) + P(Tails&Monday&Woken) + P(Tails&Tuesday&Woken))

= 0.5

And:

P(Monday|Awake)

= (P(Heads&Monday) + P(T... (read more)
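For concreteness, the arithmetic above can be checked mechanically. This is just a sketch of the calculation; the "Woken"/"Sleeping" labels encode the six outcomes listed, with Heads&Tuesday treated as a sleeping day, matching the denominator used above:

```python
# Joint probabilities for the six outcomes in the Single Awakening problem.
# "Woken"/"Sleeping" marks whether Beauty is awakened on that day.
outcomes = {
    ("Heads", "Monday", "Woken"): 0.25,
    ("Heads", "Tuesday", "Sleeping"): 0.25,
    ("Tails", "Monday", "Woken"): 0.125,
    ("Tails", "Monday", "Sleeping"): 0.125,
    ("Tails", "Tuesday", "Woken"): 0.125,
    ("Tails", "Tuesday", "Sleeping"): 0.125,
}

# Condition on being awake: keep only "Woken" outcomes and renormalize.
awake = {k: v for k, v in outcomes.items() if k[2] == "Woken"}
total = sum(awake.values())

p_heads_given_awake = sum(v for k, v in awake.items() if k[0] == "Heads") / total
p_monday_given_awake = sum(v for k, v in awake.items() if k[1] == "Monday") / total

print(p_heads_given_awake)   # 0.5
print(p_monday_given_awake)  # 0.75
```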

1Ape in the coat
The Elimination argument is the logic of "There are four equiprobable mutually exclusive outcomes but only 3 of them are compatible with an observation, so we update to 3 equiprobable outcomes", applied to the Sleeping Beauty problem. This logic itself is correct when there actually are four equiprobable mutually exclusive outcomes. Like in a situation where two coins were tossed and you know that the resulting outcome is not Heads Heads. So the argument tries to justify why the same situation is true for the Sleeping Beauty problem. It does so based on these facts: 1. P(Heads) = P(Tails) - because the coin is fair 2. P(Monday) = P(Tuesday) - because the experiment lasts for two days 3. Days happen independently of the outcomes of the coin tosses. The thing is, that's not actually enough. And to show it, I apply this reasoning to another problem where all the same conditions are satisfied, yet the answer is definitely not 1/3. What you've done is not an application of the Elimination argument to the Single-Awakening problem. You are trying to apply a version of the Updating model to the Single-Awakening problem. And just as the Updating model, when applied to Sleeping Beauty, is actually solving the Observer problem, you are solving the Observer-Single-Awakening problem: You seem to be doing it correctly. No update happens because P(Heads|Awake) = P(Tails|Awake).

I also consider myself as someone who had—and still has—high hopes for rationality, and so I think it’s sad that we disagree, not on the object level, but on whether we can trust the community to faithfully report their beliefs. Sure, some of it may be political maneuvering, but I mostly think it’s political maneuvering of the form of—tailoring the words, metaphors, and style to a particular audience, and choosing to engage on particular issues, rather than outright lying about beliefs.

I don’t think I’m using “semantics” in a non-standard sense, but I may ... (read more)

I owe you an apology; you’re right that you did not accuse me of violating norms, and I’m sorry for saying that you did. I only intended to draw parallels between your focus on the meta level and Zack’s focus on the meta level, and in my hurry I erred in painting you and him with the same brush.

I additionally want to clarify that I didn’t think you were accusing me of lying, but merely wanted preemptively close off some of the possible directions this conversation could go.

Thank you for providing those links! I did see some of them on his blog and skipped ... (read more)

3Said Achmiz
Apology accepted! You’re quite welcome. Hmm. I continue to think that you are using the term “semantics” in a very odd way, but I suppose it probably won’t be very fruitful to go down that avenue of discussion… I imagine the answer to this one will depend on the details—which people, disagreeing on what specific matter, in what way, etc. Certainly it seems implausible that none of it is “political maneuvering” of some sort (which I don’t think is “bizarre”, by the way; really it’s quite the opposite—perfectly banal political maneuvering, of the sort you see all the time, especially these days… more sad to see, perhaps, for those of us who had high hopes for “rationality”, but not any weirder, for all that…).

I haven’t read everything Zack has written, so feel free to link me something, but almost everything I’ve read, including this post, includes far more intra-rationalist politicking than discussion of object level matters.

I know other people are interested in those things. I specifically phrased my previous post in an attempt to avoid arguing about what other people care about. I can neither defend nor explain their positions. Neither do I intend to dismiss or malign those preferences by labeling them semantics. That previous sentence is not to be read as a... (read more)

I haven’t read everything Zack has written, so feel free to link me something, but almost everything I’ve read, including this post, includes far more intra-rationalist politicking than discussion of object level matters.

Certainly:

https://www.lesswrong.com/posts/LwG9bRXXQ8br5qtTx/sexual-dimorphism-in-yudkowsky-s-sequences-in-relation-to-my

https://www.greaterwrong.com/posts/juZ8ugdNqMrbX7x2J/challenges-to-yudkowsky-s-pronoun-reform-proposal

https://www.lesswrong.com/posts/RxxqPH3WffQv6ESxj/blanchard-s-dangerous-idea-and-the-plight-of-the-lucid

http://unrem... (read more)

Yeah, what factual question about empirical categories is/was Zack interested in resolving? Tabooing the words “man” and “woman”, since what I mean by semantics is “which categories get which label”. I’m not super interested in discussing which empirical category should be associated with the phonemes /mæn/, and I’m not super interested in the linguistic investigation of the way different groups of English speakers assign meaning to that sequence of phonemes, both of which I lump under the umbrella of semantics.

2Said Achmiz
Zack has written very many words about this, including this very post, and the ones prior to it in the sequence; and also his other posts, on Less Wrong and on his blog. But other people are interested in these things (and related ones), as it turns out; and the question of why they have such interest, as well as many related questions, are also factual in nature. What’s more, “A Human’s Guide to Words” (which I linked to in the grandparent) explains why reassigning different words to existing categories is not arbitrary, but has consequences for our (individual and collective) epistemics. So even such choices cannot be dismissed by labeling them “semantics”.

What factual question is/was Zack trying to figure out? “Is a woman” or “is a man” are pure semantics, and if that’s all there is then… okay… but presumably there’s something else?

5Said Achmiz
Given some referent—some definition, either intensional or extensional—of the word “man” (in other words, some discernible category with the label “man”), the question “is X a man” (i.e., “is X a member of this category labeled ‘man’”) is an empirical question. And “man”, like any commonly used word, can’t be defined arbitrarily. All of the above being the case, what do you mean by “pure semantics” such that your statement is true…?

I think this post could be really good, and perhaps there should be an effort to make this post as good as it can be. Right now I think it has a number of issues.

  1. It's too short. It moves very quickly past the important technical details, trusting the reader to pick them up. I think it would be better if it were a bit longer and luxuriated in the important technical bits.

  2. It is very physics-brained. Ideally we could get some math-literate non-physicists to go over this with help from a physicist to do a better job phrasing it in ways that are unfamiliar t

... (read more)

There are distributions whose sample means won't approach a normal—Lévy distributions and Cauchy distributions are the most commonly known.
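A quick numerical sketch of the contrast: the Cauchy distribution has no finite mean or variance, so the hypotheses of the CLT fail, and sample means never settle down the way they do for a Normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Means of Normal samples concentrate as the sample size grows;
# means of Cauchy samples do not (the Cauchy mean is undefined).
for n in (100, 10_000, 1_000_000):
    normal_mean = rng.normal(size=n).mean()
    cauchy_mean = rng.standard_cauchy(size=n).mean()
    print(f"n={n:>9,}  normal mean {normal_mean:+.4f}   cauchy mean {cauchy_mean:+.4f}")
```

The normal means shrink toward zero roughly like 1/sqrt(n); the Cauchy means keep jumping around no matter how large n gets.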

Yeah, to be clear I don't have any information to suggest that the above is happening—I don't work in EA circles—except for the fact that Ben said the EA ecosystem doesn't have defenses against this happening, and that is one of the defenses I expect to exist.

Yeah, this post makes me wonder if there are non-abusive employers in EA who are nevertheless enabling abusers by normalizing behavior that makes abuse possible. Employers who pay their employees months late without clarity on why and what the plan is to get people paid eventually. Employers who employ people without writing things down, like how much people will get paid and when. Employers who try to enforce non-disclosure of work culture and pay.

None of the things above are necessarily dealbreakers in the right context or environment, but when an employ... (read more)

9Rob Bensinger
Do any of those things happen much in EA? (I don't think I've ever heard of an example of one of those things outside of Nonlinear, but maybe I'm out of the loop.)

Find an area of the thing you want to do where quality matters to you less. Instead of trying to write the next great American novel, write fanfic[1]. Instead of trying to paint a masterpiece, buy a sketchbook and trace a bunch of stuff. Instead of trying to replace your dish-ware with handmade ceramics, see how many mugs you can make in an hour. Instead of trying to invent a new beautiful operating system in a new programming language, hack together a program for a one-off use case and then throw it away.

[1] not a diss to fanfic—but for me, at least, it's easier to not worry about my writing quality when I do so

7Elizabeth
Some things that help me with this: 1. pots theory of art (which was actually about film photography, which I think was unusually likely to benefit from more raw attempts, but chanting "pots theory" still helps me). 2. Zefrank's Brain Crack video. This is the guy who does True Facts about [Animals] and the sad cat diaries, so he has credibility on artistic work.  3. Remember my taste will always exceed my ability and it would kind of be a bad sign if I could live up to my own standard of perfection.

I think an important point missing from the discussion on compute is training vs inference: you can totally get a state-of-the-art language model performing inference on a laptop.

This is a slight point in favor of Yudkowsky: thinking is cheap, finding the right algorithm (including weights) is expensive. Right now we're brute-forcing the discovery of this algorithm using a LOT of data, and maybe it's impossible to do any better than brute-forcing. (Well, the human brain can do it, but I'll ignore that.)

Could you run a LLM on a desktop from 2008? No. But, o... (read more)

So, like, I remain pretty strongly pro Hanson on this point:

  1. I think LLaMA 7b is very cool, but it's really stretching it to call it a state-of-the-art language model. It's much worse than LLaMA 65b, which is much worse than GPT-4, which most people think is > 100b as far as I know. I'm using a 12b model right now while working on an interpretability project... and it is just much, much dumber than these big ones.

  2. Not being able to train isn't a small deal, I think. Learning in a long-term way is a big part of intelligence.

  3. Overall, and not to be too

... (read more)

General principles of OSes and networks are invaluable to basically everyone.

How programming languages, compilers, and interpreters work will help you master specific programming languages.

Hmm, no, I don't believe I use sex and gender interchangeably. Let's taboo those two terms.

I think that most people don't care about a person's chromosomes. When I inspect the way I use the words "sex" and "gender", I don't feel like either of them is a disguised query for that person's chromosomes.

I think that many people care about hormone balances. Testosterone and Estrogen change the way your body behaves, and the type of hormone a person's body naturally produces and whether they're suppressing that and/or augmenting with a different hormone is defini... (read more)

2ymeskhout
I don't understand your objection to the Harvard analogy exactly, especially since you included it in your list. I also don't understand how exactly you're distinguishing "social role" from "physical fact" (which one is 'lesbian'?) or why identifying as a Bostonian is meaningful but Harvard grad is not? Given that lack of clarity, I worry we have different understandings of the phrase "social role". My definition would be something along the lines of "the set of behaviors, expectations, and responsibilities associated with a particular position or status within a group or society". Assuming you agree with that definition, I absolutely understand why someone would care about wanting to fit into and interact with the world as a specific role. But desire for a particular social role is not a sufficient condition for attaining it. Babies are a social role. No one complains when they sleep all day instead of going to work, and no one is surprised when other people clean them up when they shit their pants. Babies are treated this way because they're vulnerable and they lack the capacity to take care of themselves because they haven't sufficiently developed at their age. And while age is a really good proxy for self-sufficiency (especially early on) it's obviously not the only one. See for example how we treat the elderly and the severely disabled. The simplistic analysis here would be to deny a vulnerable person the pampered treatment just because they're too old to be a baby. Conversely, if I had the intent to be treated the same way as a baby — fed Gerber and pampered by a dedicated two-person team while I sleep all day — it would be reasonable to ask why I warrant that role. If I'm able-bodied and self-sufficient, it's reasonable for me to be denied that social role. You can call this a sort of "price of admission" to the social role if you'd like. I'm not sure what made you think that? I don't have an opinion on what the "correct" segregation method for dr

I think that one thing you're missing is that lots of people... use gender as a very strong feature of navigating the world. They treat "male" and "female" as natural categories, and make lots of judgements based on whether someone "is" male or female.

You don't seem to do that, which puts you pretty far along the spectrum towards gender abolition, and you're right, from a gender abolition perspective there's no reason to be trans (or to be worried about people using the restroom they prefer or wearing the clothes they prefer or taking hormones to alter the... (read more)

5ymeskhout
I think I finally understand the source of the confusion. Correct me if I'm wrong but it appears to me that you use "sex" and "gender" interchangeably, where a change in the latter is (or at least should be treated as) functionally a change in the former. I definitely use sex as "a very strong feature of navigating the world" because I can readily point to endless circumstances where knowing someone's sex (and the likely associated secondary characteristics) provides useful information (height, violence propensity, affinity towards trains, whatever). I recognize that someone's sex is a dense piece of information to have because it has reliable predictive qualities. There's plenty of other traits with similarly reliable predictive qualities and maybe it's helpful to set aside sex/gender for a bit. If I knew nothing about a person except that they're a Harvard graduate, it's reasonable for me to guess that they're more likely to be intelligent, ambitious, privileged, conceited, etc. Obviously not every Harvard grad fits those traits, and if somehow there's a grad whom I assume is ambitious and conceited but who idles his days as a homeless beach bum, well the error is on me. But suppose the only thing I knew about a person is they "identify" as a Harvard graduate. I would like to think that everyone's first question would be "what does that mean??" and suppose this individual responds with "although I technically never went to Harvard, I am nevertheless intelligent, ambitious, privileged, conceited, etc. and it's just easier/faster/simpler for me to implant that impression through a single informational payload rather than piece by piece." I may not endorse this tactic, but I get it! Of course, the natural follow-up question would be "but how do we know that you're actually any of those traits you mentioned?" and also "but how do you ensure that you implant the intended impression?" These are the exact same questions I would ask of a trans individual. There seems

I've met many self-identified women (trans and otherwise) that did not prefer female-gendered terms, prompting plenty of inadvertent social gaffes on my end.

I think that if someone self identifies as a woman to you, and you use a gendered term to describe them (she, policewoman, actress) that is not a social gaffe on your part. I think that it is fine for someone to identify as a woman, but advocate for the use of gender neutral language in all cases even applied to them, but they should not put pressure on those who do so differently.

and the most rel

... (read more)

Some examples of places where knowing someone identifies as a woman vs. as a man vs. as nonbinary would affect your view of their behavior:

I appreciate that you provided specific examples but my immediate reaction to your list was one of bafflement.

While I personally don't care what bathrooms or changing room anyone uses, to the extent someone does prompt a negative reaction for being in the "wrong" room it would be entirely predicated on how that individual is perceived by others (read: pass), not what they personally identify as. 

Similarly, I don't ... (read more)

The common justification trotted out (that it’s necessary to include the theoretically-possible transman who somehow can get pregnant and apparently suffers no dysphoria from carrying a fetus to term) is completely daft.

This is as far as I can tell completely false. Plenty of trans men carry fetuses to term. Plenty of trans men carried fetuses to term before they came out as trans men. Plenty of trans men decide to carry fetuses to term after they come out as trans men. A couple of facts I believe about the world that may help you make sense of this:

... (read more)

Hi! I'm not sure where exactly in this thread to jump in, so I'm just doing it here.

I like this thread! It's definitely one of my favorite discussions about gender between people with pretty different perspectives. I also like the OP; I found it to be surprisingly clear and grounded, and to point at some places where I am pretty confused myself.

>Originally you said that my post lacked an "understanding of the experiences of trans people" and I'm still eager to learn more! What am I missing exactly and what sources would you recommend I read?

I'm taking a... (read more)

7ymeskhout
Thank you so much for replying and engaging with my post, I really appreciate it. I admit I should have qualified my assertion and used less polemical wording in that passage. I didn't intend to imply that pregnancy must mandate feelings of dysphoria among transmen, my overall point in that paragraph was collateral to that issue either way. I'm always eager to learn more! I have a habit of finding myself go down some deep research rabbit holes, and this post definitely was not an exception. I made an earnest attempt to find good sources on trans experience (e.g. looking up trans philosophers and reading their work) and I reached out to many people to discuss further. Obviously this is a touchy subject but it was disappointing to encounter so many people averse to a critical discussion on the topic. If you have any sources you believe I should be familiar with, please send them my way! I admit, this is extremely confusing to me. I've met many self-identified women (trans and otherwise) that did not prefer female-gendered terms, prompting plenty of inadvertent social gaffes on my end. I've since learned not to assume and to be more mindful about asking for people's specific term preferences but this dovetails into your latter point about relying on categories as a modeling tool which I already touched upon in my post: My kingdom for some specifics! If someone invited me to model their "personality, desires, actions, etc." based on just their gender category, you're bound to see my face take on the appearance of a stuck boot-up screen. I've tried to really think hard and introspect about how exactly I would treat someone differently based on their gender category and the most reliable heuristic I could think of was "in conversation, don't bring up video games or guns when talking to women." Hilariously this guidance is completely off the mark with the transwomen I've hung out with. Beyond that very crude and unreliable guidance, I remain rudderless and find gender

Orthogonality in design states that we can construct an AGI which optimizes for any goal. Orthogonality at runtime would be an AGI design that consists of an AGI which can switch between arbitrary goals while operating. Here, we are only really talking about the latter orthogonality.

This should not be relegated to a footnote. I've always thought that design-time orthogonality is the core of the orthogonality thesis, and I was very confused by this post until I read the footnote.

5beren
Fair point. I need to come up with a better name than 'orthogonality' for what I am thinking about here -- 'well factoredness?' Will move the footnote into the main text.

There are tactics I have available to me which are not oriented towards truthseeking, but instead oriented towards "raising my status at the expense of yours". I would like to not use those tactics, because I think that they destroy the commons. I view "collaborative truth seeking" as a commitment between interlocutors to avoid those tactics which are good at status games or preaching to the choir, and focus on tactics which are good at convincing.

Additionally,

I can just ... listen to the counterarguments and judge them on their merits, without getting

... (read more)

I walked through some examples of Shapley Value here, and I'm not so sure it satisfies exactly what we want on an object level. I don't have a great realistic example here, but Shapley Value assigns counterfactual value to individuals who did in fact not contribute at all, if they would have contributed were your higher-performers not present. So you can easily have "dead weight" on a team which has a high Shapley Value, as long as they could provide value if their better teammates were gone.
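A minimal sketch of that "dead weight" point, using a made-up two-worker game: there is one machine, worker A produces 10 with it, worker B produces 8 with it, and together they still produce only 10 because only one of them can use the machine at a time.

```python
from itertools import permutations

def shapley(players, v):
    """Shapley value: average marginal contribution over all join orders."""
    values = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = frozenset()
        for p in order:
            values[p] += v(coalition | {p}) - v(coalition)
            coalition = coalition | {p}
    return {p: val / len(perms) for p, val in values.items()}

# One machine, two workers: A alone makes 10, B alone makes 8,
# together still 10 (A uses the machine; B adds nothing).
def v(coalition):
    if "A" in coalition:
        return 10
    if "B" in coalition:
        return 8
    return 0

print(shapley(["A", "B"], v))  # {'A': 6.0, 'B': 4.0}
```

B contributes nothing once A is present, yet B's Shapley value is 4, because B's counterfactual contribution in the orderings where B arrives first gets averaged in.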

my friend responds, "Your beliefs/desires, and ultimately your choice to study English, was determined by the state of your brain, which was determined by the laws of physics. So don't feel bad! You were forced to study English by the laws of physics."

The joke I always told was

"I choose to believe in free will, because if free will does exist, then I've chosen correctly, and if free will doesn't exist, then I was determined to choose this way since the beginning of the universe, and it's not my fault that I'm wrong!"

Reductionism means that "Jones" and "Smith" are not platonic concepts. They're made out of parts, and you can look at the parts and ask how they contribute to moral responsibility.

When you say "Smith has a brain tumor that made him do it", you are conceiving of Smith and the brain tumor as different parts, and concluding that the non-tumor part isn't responsible. If you ask "Is the Smith-and-brain-tumor system responsible for the murder", the answer is yes. If you break the mind of Jones into parts, you could similarly ask "Is the visual cortex of Jones re... (read more)

One adversarial prior would be "my prior for bets of expected value X is O(1/X)".

No public estimates, but the difficulty of self driving cars definitely pushed my AGI timelines back. In 2018 I predicted full self driving by 2023; now that’s looking unlikely. Yes, the advance in text and image understanding and generation has improved a lot since 2018, but instead of shortening my estimates that’s merely rotated which capabilities will come online earlier and which will wait until AGI.

However, I expect some crazy TAI in the next few years. I fully expect “solve all the millennium problems” to be doable without AGI, as well as much of coding/design/engineering work. I also think it’s likely that text models will be able to do the work of a paralegal/research assistant/copywriter without AGI.

Additionally, if you have a problem which can be solved by either (a) crime or (b) doing something complicated to fix it, your ability to do (b) is higher the smarter you are.

It would be nice to have a couple examples comparing concrete distributions Q and P and examining their KL-divergence, why it's large or small, and why it's not symmetric.

1CallumMcDougall
I think some of the responses here do a pretty good job of this. It's not really what I intended to go into with my post since I was trying to keep it brief (although I agree this seems like it would be useful).
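One concrete pair, as a sketch of the kind of example being asked for: two Bernoulli distributions. KL(P‖Q) weights the log-ratio by P, so it punishes Q for putting little probability where P has a lot of mass, which is why swapping the arguments changes the answer.

```python
import numpy as np

def kl(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

p = [0.5, 0.5]   # fair coin
q = [0.9, 0.1]   # heavily biased coin

print(kl(p, q))  # ≈ 0.5108
print(kl(q, p))  # ≈ 0.3681
```

KL(p‖q) is the larger of the two here because p puts half its mass on the outcome that q considers improbable (0.1), and the log-ratio log(0.5/0.1) is large there.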

If I were making one up, I might say "g distributes over composition of f".

2niplav
I like it!

It would be better if it were an organization that merely had contradictory goals (maybe a degrowth anarcho-socialist group? A hardcore anti-science Christian organization?) but wasn't organized around the dislike of our group specifically.

3[comment deleted]

More likely, the AI just finds a website with a non-compliant GET request, or a GET request with a SQL injection vulnerability.

1nem
So in your opinion, is an AI with access to GET requests essentially already out of the box?

I agree that before that point, an AI will be transformative, but not to the point of “AGI is the world superpower”.

That’s what people used to say about chess and go. Yes, mathematics requires intuition, but so does chess; the game tree’s too big to be explored fully.

Mathematics requires greater intuition and has a much broader and deeper “game” tree, but once we figure out the analogue to self-play, I think it will quickly surpass human mathematicians.

2Archimedes
Sure. I’m not saying it won’t happen, just that an AI will already be transformative before it does happen.

GPT-4 (Edited because I actually realize I put way more than 5% weight on the original phrasing): SOTA on language translation for every language (not just English/French and whatever else GPT-3 has), without fine-tuning.

Not GPT-4 specifically, assuming they keep the focus on next-token prediction of all human text, but "around the time of GPT-4": Superhuman theorem proving. I expect one of the millennium problems to be solved by an AI sometime in the next 5 years.

4Archimedes
AI solving a millennium problem within a decade would be truly shocking, IMO. That's the kind of thing I wouldn't expect to see before AGI is the world superpower. My best guess, coming from a mathematics background, is that dominating humanity is an easier problem for an AI.

"metis" is an ancient Greek word (and goddess) which originally meant "magical cunning", and which drifted throughout ancient Greek culture to mean something more like "wisdom/prudence/the je ne sais quoi of being able to solve practical problems".

James C. Scott uses it in his book Seeing Like a State to mean the implicit knowledge passed down through a culture.

How did you decide where the y-intercept for Huang’s law should be? It seems that even if you fix the slope to 25x per 5 years, the line could still be made to fit the data better by giving it a different y-intercept.

2Marius Hobbhahn
The comparison lines (dotted) have completely arbitrary y-intercepts. You should only take the slope seriously. 
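For what it's worth, if one did want a principled intercept for a fixed-slope comparison line, least squares gives a closed form: with slope m held fixed, the best intercept in log-space is just the mean residual, b = mean(log y − m·x). A sketch with hypothetical data points (the x/log_y values below are made up for illustration):

```python
import numpy as np

# Hypothetical (year, log10 performance) data points.
x = np.array([2012, 2014, 2016, 2018, 2020], dtype=float)
log_y = np.array([1.0, 2.4, 3.6, 5.1, 6.2])

# Huang's-law-style slope: 25x per 5 years, i.e. log10(25)/5 per year.
slope = np.log10(25) / 5

# With the slope fixed, the least-squares intercept is the mean residual.
intercept = np.mean(log_y - slope * x)

print(intercept)
```

This minimizes the sum of squared vertical distances from the data to the line log_y = slope·x + b over all choices of b.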

I think it's not necessarily the case that free-market pairwise bargaining always leads to the Shapley value. 10Y = X - Y has an infinite number of solutions, and the only principled ways I know of for choosing a solution are either the Shapley value or the fact that in this scenario, since there are no other jobs, the owner should be able to negotiate X and Y down to epsilon.

8tgb
It looks like Shapley values satisfy an equilibrium property that takes into account more than just pairwise bargaining. Specifically, there is no subset of participants that can gain more than their Shapley values by excluding the others (assuming that v satisfies [superadditivity](https://en.wikipedia.org/wiki/Shapley_value#Stand-alone_test), i.e. a group is always at least as valuable as its subsets individually added together). We can prove this: First, by induction, see that ∑_{R ⊆ S} w(R) = v(S) for any S. And by superadditivity, w(S) ≥ 0 for all S. Then, writing v_S(i) for the Shapley value of player i in the game on S, we can do: ∑_{i ∈ R} v_S(i) = ∑_{i ∈ R} ∑_{R′ ⊆ S} 1_{R′}(i) w(R′)/|R′| ≥ ∑_{i ∈ R} ∑_{R′ ⊆ R} 1_{R′}(i) w(R′)/|R′| = ∑_{R′ ⊆ R} w(R′) = v(R). That means that the total value produced by the subset R is going to be less than (or equal to) the total of the Shapley values they obtain from participating in the whole group. Therefore, they can't possibly all profit by excluding anyone, since there's not enough profit to go around. Presumably this is well known and has a name. It's basically a direct extension of the 'stand-alone test' that Wikipedia lists, so maybe it's the 'stand-together test'? So that makes me think Shapley values are what you might get after multi-party bargaining arrives at equilibrium. This is a pretty amazing topic, and a great selection of examples to explore!

If you apply Shapley value to a situation where there are more workers than jobs, regardless of how many businesses the jobs are split between, people who can't get jobs still have a nonzero Shapley value based on their counterfactual contribution to the enterprises they could be working at. 

1Alexander Gietelink Oldenziel
Sure. That seems very sensible to me. I don't see the problem?

Unfortunately, sometimes your body doesn't give you a choice! If you use caffeine once a week, maybe you can avoid acclimating to it, but in my experience, drinking black tea went from "whoa, lots of caffeine" to "slight boost" over ~2 years of drinking it 5 days/week.

2mako yass
And you haven't been able to reset your tolerance with a break? Or would it not be worth it? (I can't provide any details about what the benefits would be sry)

My experience:

  • If you don't currently drink caffeinated drinks, an entire coffee is probably overkill for you right now. Start slow and build your way up. I find black tea and caffeinated sparkling water (e.g. https://www.drinkaha.com/products/mango-black-tea) to be good pre-coffee drinks (which also don't have sugar).
  • Try to aim caffeine intake for periods where you can be heads-down working. I find caffeine less effective when I'm running meetings or being interrupted constantly.
2mako yass
Why work your way up at all? The lower you can keep your tolerance, the better, I'd guess? I don't intend on ever switching away from my sencha/japanese green tea.

I think this is covered by preamble item -1: “None of this is about anything being impossible in principle.”

3lc
You're probably right, I just confused myself. I think it'd be more helpful to explain why it'd be hard to engineer an honest AGI in that section because that's the relevant part, even if you're just pointing back to another section.

High dimensional spaces are unlikely to have local optima, and probably don’t have any optima at all.

Just recall what is necessary for a set of parameters to be at an optimum. All the gradients need to be zero, and the Hessian needs to be positive semidefinite. In other words, you need to be surrounded by walls. In 4 dimensions, you can walk through walls. GPT-3 has 175 billion parameters. In 175 billion dimensions, walls are so far beneath your notice that if you observe them at all, it is like God looking down upon individual protons.

If there’s any randomne

... (read more)
8Daniel Kokotajlo
Wait, how is it possible for there to be no optimum at all? There's only a finite number of possible settings of the 175 billion parameters; there has to be at least one setting such that no other setting has lower loss. (I don't know much math, I'm probably misunderstanding what optimum means.)
  1. DallE2 is bad at prepositional phrases (above, inside) and negation. It can understand some sentence structure, but not reliably.

  2. In the first example, none of those are paragraphs longer than a single sentence.

  3. In the first example, the images are not stylistically coherent! The bees are illustrated inconsistently from picture to picture. They look like they were drawn by different people working off of similar prompts and with similar materials.

  4. The variational feature is not what I’m talking about; I mean something like “Draw a dragon sleeping o

... (read more)

DallE2 is bad at prepositional phrases (above, inside) and negation. It can understand some sentence structure, but not reliably.

Goalpost moving. DALL-E 2 can generate samples matching lots of complex descriptions which are not 'noun phrases', and GLIDE is even better at it (also covered in the paper). You said it can't. It can. Even narrowly, your claim is poorly supported, and for the broader discussion this is in the context of, misleading. You also have not provided any sources or general reasons for this sweeping assertion to be true, or for the br... (read more)

Here's another possible explanation: The models aren't actually as impressive as they're made out to be. For example, take DallE2. Yes, it can create amazingly realistic depictions of noun phrases automatically. But can it give you a stylistically coherent drawing based on a paragraph of text? Probably not. Can it draw the same character in three separate scenarios? No, it cannot.

DallE2 basically lifts the floor of quality for what you can get for free. But anyone who actually wants or needs the things you can get from a human artist cannot yet get it from an AI.

See also, this review of a startup that tries to do data extraction from papers: https://twitter.com/s_r_constantin/status/1518215876201250816

Meta: I disagree with Alex's decision to delete Gwern's comment on this answer. People can reasonably disagree about the optimal balance between 'more dickish' (leaves more room for candor, bluntness, and playfulness in discussions) and 'less dickish' (encourages calm and a focus on content) in an intellectual community. And on LW, relatively high-karma users like Alex are allowed to moderate discussion of their posts, so Alex is free to promote the balance he thinks is best here.

But regardless of where you fall on that spectrum, I think LW should have a s... (read more)

[+][comment deleted]180

 As other commenters have said, approximating integer ratios is important.

  • 1:2 is the octave
  • 2:3 is the perfect fifth
  • 3:4 is the perfect fourth
  • 4:5 is the major third
  • 5:6 is the minor third

and it just so happens that these ratios are close to powers of the 12th root of 2. 

  • 2^(12/12) is the octave
  • 2^(7/12) is the perfect fifth
  • 2^(5/12) is the perfect fourth
  • 2^(4/12) is the major third
  • 2^(3/12) is the minor third

You can do the math and verify those numbers are relatively close.
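Doing that math explicitly (my addition, a minimal Python sketch; the dictionary just restates the ratios listed above):

```python
# Compare just-intonation ratios with their 12-tone equal-temperament
# approximations: the n-semitone interval is 2**(n/12).
intervals = {
    "octave":         (2 / 1, 12),
    "perfect fifth":  (3 / 2, 7),
    "perfect fourth": (4 / 3, 5),
    "major third":    (5 / 4, 4),
    "minor third":    (6 / 5, 3),
}
for name, (just, semitones) in intervals.items():
    et = 2 ** (semitones / 12)
    print(f"{name:14s} just={just:.4f}  equal-tempered={et:.4f}  "
          f"error={abs(et - just) / just:.2%}")
```

The fifth and fourth land within about 0.1% of the just ratios; the thirds are off by a little under 1%, which is why they sound slightly "rougher" in equal temperament.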

It's important to recognize that this correspondence is relatively recently discov... (read more)

  1. Even the political party which would be disincentivized by the phrasing to answer "yes" to the above question answered with an 11% positive rate. If we accept that the discrepancy in the results is purely because of the engineered phrasing, then we have at least 22% of the population agreeing with committing violence in order to preserve their notion of what this country ought to be.
  2. This is an isolated demand for rigor.  Your post uses a tweet and your personal feelings as evidence for the fact that "perhaps 1-5%" of the population supports a communist v
... (read more)
2lc
1. I mostly agree, and that's why I said the results were still interesting. In fact, both numbers should be lower bounds; it says at least that 33% of conservatives are willing to tell surveyors they'd support violence. I would crowdfund a study similarly phrased so that left-wing respondents can say yes; I could imagine it coming up with 15%, 40%, 30%, but I don't really know, and I think I would have probably undershot it a few days ago. If the left and the right are equally bellicose, then that would mean we'd expect a third of the country to support violence, not a fifth.

2. Perhaps I'm being partial, but I don't think I'm making an isolated demand for rigor. It's not a general critique of the methods of scientific polling so much as punching in the effects of the wind. If you had said "I see Facebook posts by conservatives making similar threats all the time" I wouldn't have had a problem with that. On the other hand, if I had said something like "well, you need a meta-analysis of different results by different polling companies before you can make that claim", then that would of course be unfair, because I'd have no a priori expectation that the further evidence would move away from the already established conclusion. In this case, however, we do have the ability to conclude that the study is biased based on mechanical details, so we can and should adjust our GPS coordinates.

I do, by the way, totally believe conservatives on the whole are more willing to use violence, mostly because they have a self-image as political underdogs who have been shut out of large institutions like Hollywood/news media/education by elites and have "no other option". I'm more scared of communists because I think their extremists are more capable, and that might reflect a bias on my part.

“Republicans (30%) are approximately three times as likely as Democrats (11%) to agree with the statement, “Because things have gotten so far off track, true American patriots may have to resort to violence in order to save our country.” Agreement with this statement rises to 40% among Republicans who most trust far-right news sources and is 32% among those who most trust Fox News. One in four white evangelicals (26%) also agree that political violence may be necessary to save the country”

https://www.prri.org/press-release/competing-visions-of-america-an-e... (read more)

8philh
Note that someone agreeing with the quoted statement might not support any kind of uprising. They may be thinking of using violence to defend against an uprising from the other side, or even against a foreign aggressor. Like, I think of "support for a violent uprising" as being something like "the country is fucked and we need to use violence to make it better". But someone answering yes to that question might be more like "the country is pretty okay but we might need to use violence to keep it that way".
3lc
This was in the answer section for a while, and I deleted it for that reason. Then it was moved to the comments section, and it seems like an admin might have moved it.

Note to lesswrong devs:
1. It'd be nice if I could do that myself.
2. The delete reasons should probably show up on answers like they do comments.
3lc
In case you didn't notice, the phrasing ("true American patriots", "gotten so far off track") is engineered to produce the discrepancy highlighted in the press release. The survey itself is interesting but it's propaganda.

Programming languages: If they were written idiomatically and quickly, you can absolutely tell the difference between a list of primes generated by a Python vs C program. Hint: Python and C have different default numeric types.
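To spell out the hint (my addition, a sketch rather than anything from the original comment; the `as_int32` helper is an illustrative simulation of C's behavior): Python's `int` is arbitrary-precision, while C's default `int` is typically 32 bits, so a long enough idiomatically-written prime list in C wraps around where the Python one stays exact.

```python
INT_MAX = 2**31 - 1  # 2147483647, the largest value a typical C `int` holds

def as_int32(x):
    """Simulate C's 32-bit two's-complement wraparound (illustrative helper)."""
    return (x + 2**31) % 2**32 - 2**31

big = INT_MAX + 12        # a value just past what C's `int` can represent
print(big)                # Python prints it exactly: 2147483659
print(as_int32(big))      # the C-with-int behavior: -2147483637, negative garbage
```

So a glance at how the large entries behave (exact continuation vs. wraparound, assuming nobody reached for `long long` or a bignum library) distinguishes the two lists.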

I think that the main barrier is that I have no idea how to read this book.

It looks like a forum to me. Do I read all the posts, or are some of them comments from readers? Do I need to pay attention to usernames, post times, history of each comment?

There are no reader comments -- this is a "glowfic", and the whole thread is the story. Each "comment" is a portion of the story. You need to read it all.

You do need to pay attention to the usernames, because they often identify which character is speaking (e.g. "Keltham", or "Carissa Sevar"), along with an approximation of their facial expression via the graphics.

(The post times and history of each comment are not important in-story: the "history" is just edits made to that story portion, and the post times are just when each portion was posted.)

One of the authors is "Iarwain" (who is EY) the other author is "lintamande".

5Robert Kennedy
Good questions! It's a forum with posts between two users, "Iarwain" (Yudkowsky) and "lintamande", who is co-authoring this piece. There are no extraneous posts, although there are (very rarely) OOC posts, for instance announcing the discord or linking to a thread for a side story. In each post, either user will post as either a character (i.e. Keltham, Carissa, and others - each user writes multiple characters) or without a character (for 3rd person exposition). I usually use the avatars when possible to quickly identify who is speaking. You don't need to pay attention to history or post time, until you catch up to the current spot and start playing the F5 game (they are writing at a very quick pace).

There was a critical moment in 2006(?) where Hinton and Salakhutdinov(?) proposed training Restricted Boltzmann machines unsupervised in layers, and then 'unrolling' the RBMs to initialize the weights in the network, and then you could do further gradient descent updates from there, because the activations and gradients wouldn't explode or die out given that initialization.  That got people to, I dunno, 6 layers instead of 3 layers or something? But it focused attention on the problem of exploding gradients as the reason why deeply layered neural nets

... (read more)
Load More