All of rotatingpaguro's Comments + Replies

Why I'm Moving from Mechanistic to Prosaic Interpretability

I think I prefer Claude's attitude as assistant. The other two look too greedy to be wise.

No, the Polymarket price does not mean we can immediately conclude what the probability of a bird flu pandemic is. We also need to know the interest rate!

Referring to the section "What is Intelligence Even, Anyway?":

I think AIXI is fairly described as a search over the space of Turing machines. Why do you think otherwise? Or maybe are you making a distinction at a more granular level?

1Daniel Tan4mo

Upon consideration I think you are right, and I should edit the post to reflect that. But I think the claim still holds (if you expect intelligence looks like AIXI then it seems quite unlikely you should expect to be able to understand it without further priors)

When you say "true probability", what do you mean?

The current hypotheses I have about what you mean are (in part non-exclusive):

You think some notion of objective, non-observer dependent probability makes sense, and that's the true probability.
You do not think "true probability" exists, you are referencing to it to say the market price is not anything like that.
You define "true probability" a probability that observers contextually agree on (like a coin flip observed by humans who don't know the thrower).

AI #95: o1 Joins the API

Don't Associate AI Safety With Activism

Anton Leicht says evals are in trouble as something one could use in a regulation or law. Why? He lists four factors. Marius Hobbhahn of Apollo also has thoughts. I’m going to post a lot of disagreement and pushback, but I thank Anton for the exercise, which I believe is highly useful.

I think there's one important factor missing: if you really used evals for regulation, then they would be gamed. I trust more the eval when the company is not actually at stake on it. If it was, there would be a natural tendence for evals to slide towards empty box-checking.

rotatingpaguro4mo1214

I sometimes wonder about this. This post does pose the question, but I don't think it gives an analysis that could make me change my mind on anything, it's too shallow and not adversarial.

Monthly Roundup #24: November 2024

Rethinking Laplace's Rule of Succession

I read part of the paper. That there's a cultural difference north-south about honesty and willingness to break the rules matches my experience on the ground.

rotatingpaguro5mo5-2

I find this intellectually stimulating, but it does not look useful in practice, because with repeated i.i.d. data the information in the data is much higher than the prior if the prior is diffuse/universal/ignorance.

6Cleo Nardo5mo

You raise a good point. But I think the choice of prior is important quite often: 1. In the limit of large i.i.d. data (N>1000), both Laplace's Rule and my prior will give the same answer. But so too does the simple frequentist estimate n/N. The original motivation of Laplace's Rule was in the small N regime, where the frequentist estimate is clearly absurd. 2. In the small data regime (N<15), the prior matters. Consider observing 12 successes in a row: Laplace's Rule: P(next success) = 13/14 ≈ 92.3%. My proposed prior (with point masses at 0 and 1): P(next success) ≈ 98%, which better matches my intuition about potentially deterministic processes. 3. When making predictions far beyond our observed data, the likelihood of extreme underlying probabilities matters a lot. For example, after seeing 12/12 successes, how confident should we be in seeing a quadrillion more successes? Laplace's uniform prior assigns this very low probability, while my prior gives it significant weight.

Monthly Roundup #24: November 2024

Italians over time sorted themselves geographically by honesty, which is both weird and damn cool, and also makes a lot of sense. There are multiple equilibria, so let everyone find the one that suits them. We need to use this more in logic puzzles. In one Italian villa everyone tells the truth, in the other…

I can't get access to the paper, anyone has a tip on this?

1rotatingpaguro5mo

I read part of the paper. That there's a cultural difference north-south about honesty and willingness to break the rules matches my experience on the ground.

8gwern5mo

If you simply search the title, you will find many PDFs: https://scholar.google.com/scholar?q=Rule Breaking%2C Honesty%2C and Migration (eg)

AI #90: The Wall

I agree with whay you say about how to maximize what you get out of an interview. I also agree about that discussion vs. debate distinction you make, and I wasn't specifically trying to go there when I used the word "debate", I was just sloppy with words.

I guess you agree that it is friction to create a social norm that you should do a read up of the other person material before engaging in public. I expect less discussions would happen. There is not a clear threshold at how much you should be prepared.

I guess we disagree about how much value do we lose du... (read more)

3AnthonyC5mo

That's a good point about public discussions. It's not how I absorb information, but I can definitely see that.

AI #90: The Wall

The Online Sports Gambling Experiment Has Failed

I see your proposed condition for meaningful debate as bureaucracy that adds friction rather than value.

3AnthonyC5mo

I'm not sure where I'm proposing bureaucracy? The value is in making sure a conversation efficiently adds value for both parties, by not having to spend time rehashing things that are much faster absorbed in advance. This avoids the friction of needing to spend much of the time rehashing 101-level prerequisites. A very modest amount of groundwork beforehand maximizes the rate of insight in discussion. I'm drawing in large part from personal experience. A significant part of my job is interviewing researchers, startup founders, investors, government officials, and assorted business people. Before I get on a call with these people, I look them (and their current and past employers, as needed) up on LinkedIn and Google Scholar and their own webpages. I briefly familiarize myself with what they've worked on and what they know and care about and how they think, as best I can anticipate, even if it's only for 15 minutes. And then when I get into a conversation, I adapt. I'm picking their brain to try and learn, so I try to adapt to their communication style and translate between their worldview and my own. If I go in with an idea of what questions I want answered, and those turn out to not be the important questions, or this turns out to be the wrong person to discuss it with, I change direction. Not doing this often leaves everyone involved frustrated at having wasted their time. Also, should I be thinking of this as a debate? Because that's very different than a podcast or interview or discussion. These all have different goals. A podcast or interview is where I think the standard I am thinking of is most appropriate. If you want to have a deep discussion, it's insufficient, and you need to do more prep work or you'll never get into the meatiest parts of where you want to go. I do agree that if you're having a (public-facing) debate where the goal is to win, then sure, this is not strictly necessary. The history of e.g. "debates" in politics, or between creationists a

AI #90: The Wall

rotatingpaguro5mo61

I somewhat disagree with Tenobrus' commentary about Wolfram.

I watched the full podcast, and my impression was that Wolfram uses a "scientific hat", of which he is well aware of, which comes with a certain ritual and method for looking at new things and learning them. Wolfram is doing the ritual of understanding what Yudkowsky says, which involves picking at the details of everything.

Wolfram often recognizes that maybe he feels like agreeing with something, but "scientifically" he has a duty to pick it apart. I think this has to be understood as a learning process rather than as a state of belief.

5AnthonyC5mo

I can totally believe this. But, I also think that responsibly wearing the scientist hat entails prep work before engaging in a four hour public discussion with a domain expert in a field. At minimum that includes skimming the titles and ideally the abstracts/outlines of their key writings. Maybe ask Claude to summarize the highlights for you. If he'd done that he'd have figured out many of the answers to many of these questions on his own, or much faster during discussion. He's too smart not to. Otherwise, you're not actually ready to have a meaningful scientific discussion with that person on that topic.

rotatingpaguro5mo90

So, should the restrictions on gambling be based on feedback loop length? Should sport betting be broadly legal when about the far enough future?

AnthonyC5mo119

It's a good question. I'd also say limiting mid-game advertising might be a good idea. I'm not really a sports fan in general and don't gamble, but a few months ago I went to a baseball game, and people advertising - I think it was Draftkings? - were walking up and down the aisles constantly throughout the game. It was annoying, distracting, and disconcerting.

Bogdan Ionut Cirstea's Shortform

rotatingpaguro5mo63

current inference scaling methods tend to be tied to CoT and the like, which are quite transparent

Aschenbrenner in Situational Awareness predicts illegible chains of thought are going to prevail because they are more efficient. I know of one developer claiming to do this (https://platonicresearch.com/) but I guess there must be many.

Occupational Licensing Roundup #1

johnswentworth's Shortform

Related, I have a vague understanding on how product safety certification works in EU, and there are multiple private companies doing the certification in every state.

rotatingpaguro6mo30

Half-informed take on "the SNPs explain a small part of the genetic variance": maybe the regression methods are bad?

3johnswentworth6mo

Two responses: * It's a pretty large part - somewhere between a third and half - just not a majority. * I was also tracking that specific hypothesis, which was why I specifically flagged "about 25% of IQ variability (using a method which does not require identifying all the relevant SNPs, though I don't know the details of that method)". Again, I don't know the method, but it sounds like it wasn't dependent on details of the regression methods.

Slightly More Than You Wanted To Know: Pregnancy Length Effects

rotatingpaguro6mo40

Not sure if I missed something because I read quickly, but: all these are purely correlational studies, without causal inference, right?

2JustisMills6mo

They're correlational, though the broad cohorts help - not sure what you can do beyond just canvassing an entire birth cohort and noticing differences. There are possible pitfalls like the decision to induct early being made by people with genes that predict bad outcomes? But I really don't think that's major.

OpenAI defected, but we can take honest actions

rotatingpaguro6mo40

OpenAI is recklessly scaling AI. Besides accelerating "progress" toward mass extinction, it causes increasing harms. Many communities are now speaking up. In my circles only, I count seven new books critiquing AI corps. It’s what happens when you scrape everyone's personal data to train inscrutable models (computed by polluting data centers) used to cheaply automate out professionals and spread disinformation and deepfakes.

Could you justify that it causes increasing harms? My intuition is that OpenAI is currently net-positive without taking into account fu... (read more)

3Remmelt6mo

Appreciating your inquisitive question! One way to think about it: For OpenAI to scale more toward “AGI”, the corporation needs more data, more automatable work, more profitable uses for working machines, and more hardware to run those machines. If you look at how OpenAI has been increasing those four variables, you can notice that there are harms associated with each. This tends to result in increasing harms. One obvious example: if they increase hardware, this also increases pollution (from mining, producing, installing, and running the hardware). Note that the above is not a claim that the harms outweigh the benefits. But if OpenAI & co continue down their current trajectory, I expect that most communities would look back and say that the harms to what they care about in their lives were not worth it. I wrote a guide to broader AI harms meant to emotionally resonate for laypeople here.

Amalthea6mo114

I'd agree the OpenAI product line is net positive (though not super hung up on that). Sam Altman demonstrating what kind of actions you can get away with in front of everyone's eyes seems problematic.

The Hopium Wars: the AGI Entente Delusion

rotatingpaguro6mo30

Ok, that. China seems less interventionist, and to use more soft power. The US is more willing to go to war. But is that because the US is more powerful than China, or because Chinese culture is intrinsically more peaceful? If China made the killer robots first, would they say "MUA-HA-HA actually we always wanted to shoot people for no good reason like in yankee movies! Go and kill!"

Since politics is a default-no on lesswrong, I'll try to muddle the waters by making a distracting unserious figurative narration.

Americans maybe have more of a culture of "if ... (read more)

The Hopium Wars: the AGI Entente Delusion

rotatingpaguro6mo17-10

[Alert: political content]

About the US vs. China argument: have any proponent made a case that the Americans are the good guys here?

My vague perspective as someone not in China neither in the US, is that the US is overall more violent and reckless than China. My personal cultural preference is for US, but when I think about the future of humanity, I try to set aside what I like for myself.

So far the US is screaming "US or China!" while creating the problem in the first place all along. It could be true that if China developed AGI it would be worse, but tha... (read more)

Seth Herd6mo115

I think the general idea is that the US is currently a functioning democracy, while China is not. I think if this continued to be true, it would be a strong reason to prefer AGI in the hands of the US vs Chinese governments. I think this is true despite agreeing that the US is more violent and reckless than China (in some ways - the persecution of the Uigher people by the Chinese government hits a different sort of violence than any recent US government acts).

If the government is truly accountable to the people, public opinion will play a large role in de... (read more)

Matrice Jacobine6mo174

Relevant: China not that interested in developing AGI and substantial factions of Chinese elites are concerned about AI safety

Most arguments for AI Doom are either bad or weak

rotatingpaguro6mo30

I agree it's not a flaw in the grand scheme of things. It's a flaw for using it for consensus for reasoning.

Most arguments for AI Doom are either bad or weak

rotatingpaguro6mo21

I start with a very low prior of AGI doom (for the purpose of this discussion, assume I defer to consensus).

You link to a prediction market (Manifold's "Will AI wipe out humanity before the year 2100", curretly at 13%).

Problems I see with using it for this question, in random order:

It ends in 2100 so the incentive is effectively about what people will believe a few years from now, not about the question. It is a Keynesian beauty contest. (Better than nothing.)
Even with the stated question, you win only if it resolves NO, so it is strategically correct to b

... (read more)

3AnthonyC6mo

(3) is not necessarily a flaw. Every prediction market is an action market unless the outcome is completely outside human influence. If there were a prediction market where a concerned group of billionaires could invest a huge sum on the "No" side of "Will humans solve how to make AGI and ASI safety to ensure continued human thriving?" (or some much better operationalization of the idea), that would be great.

AI #85: AI Wins the Nobel Prize

rotatingpaguro6mo32

This type of issue is a huge effective blocker for people with my level of skills. I find myself excited to write actual code that does the things, but the thought of having to set everything up to get to that point fills with dread – I just know that the AI is going to get something stupid wrong, and everything’s going to be screwed up, and it’s going to be hours trying to figure it out and so on, and maybe I’ll just work on something else. Sigh. At some point I need to power through.

Reminds me of this 2009 kalzumeus quote:

I want to quote a real customer

... (read more)

Advice for journalists

rotatingpaguro6mo24

Ah, sorry for being so cursory.

A common trope about mathematicians vs. other math users is that mathematicians are paranoid persnickety truth-seekers, they want everything to be exactly correct down to every detail. Thus engineers and physicists often perceive mathematicians as a sort of fact-checker caste.

As you say, in some sense mathematicians deal with made-up stuff and engineers with real stuff. But from the engineer's point of view, they deal with mathematicians when writing math, not when screwing bolts, and so perceive mathematicians as "the annoyi... (read more)

3Nathan Young6mo

Hmmmm. I wonder how common this is. This is not how I think of the difference. I think of mathematicians as dealing with coherent systems of logic and engineers dealing with building in the real world. Mathematicians are useful when their system maps to the problem at hand, but not when it doesn't. I should say i have a maths degree so it's possible that my view of mathematicians and the general view are not conincident.

Advice for journalists

rotatingpaguro6mo25

The analogy with mathematicians is very stretched.

3Nathan Young6mo

Hmmm, what is the picture that the analogy gives you. I struggle to imagine how it's misleading but I want to hear.

2Nathan Young6mo

Why do you think it's stretched. It's about the difference between mathematicians and engineers. One group are about relating the real world the other are about logically consistent ideas that may be useful.

Conventional footnotes considered harmful

Eye contact is effortless when you’re no longer emotionally blocked on it

Include, in the cue to each note, a hint as to its content, besides just the ordinal pointer. A one-letter abbreviation, standardised thruout the work, may work well, e.g.:
"c" for citation supporting the marked claim
"d" for a definition of the marked term
"f" for further, niche information extending the marked section
"t" for a pedantic detail or technicality modifying the marked clause
Commit to only use notes for one purpose — say, only definitions, or only citations. State this commitment to the reader.

These don't look like good solutions to me. Just a first impression.

rotatingpaguro7mo83

I don't make eye contact while speaking but fix people while in silence. Were there people like me? Did they managed to reverse this? The way I feel inside is more like I can't think both about the face of someone and what I am saying at once, too many things to keep track of.

0mako yass7mo

This can be quite a bad thing, since a person's face often tells you whether what you're saying is landing for them or whether you need to elaborate on certain points (unless they have a people pleaser complex, in which case they'll just nod and smile always even when they're confused and offended on the inside lmao). The worst I've seen it was this discussion with Avi Loeb where he was lecturing someone who he had a disagreement with and he actually closed his eyes while he was talking and although I'm sure it wasn't fully self-aware about it, it was very arrogant. He was not talking to that person; he must waste a lot of time, in reckonings, retreading old ground without making progress towards reconciliation.

5Chipmonk7mo

Yeah for many people it was hard to both make eye contact and think at the same time. Some of them told me that this changed what they spoke about. Personally I have a very hard time recalling the past or thinking very logically while making eye contact

AI #83: The Mask Comes Off

[Completed] The 2024 Petrov Day Scenario

Is champerty legal in California?

rotatingpaguro7mo1513

I take this as a fun occasion to lose some of my karma in a silly way to remind myself lesswrong karma is not important.

On Measuring Intellectual Performance - personal experience and several thoughts

I want a good multi-LLM API-powered chatbot

A very interesting problem is measuring something like general intelligence. I’m not going to delve deeply into this topic but simply want to draw attention to an idea that is often implied, though rarely expressed, in the framing of such a problem: the assumption that an "intelligence level," whatever it may be, corresponds to some inherent properties of a person and can be measured through their manifestations. Moreover, we often talk about measurements with a precision of a few percentage points, which suggests that, in theory, the measurement should be

... (read more)

2Alexander Gufan7mo

Of course I do.

AI #79: Ready for Some Football

Still 20$/usd

rotatingpaguro8mo50

Does it imply that normally everyone would receive as many spam calls, but the more expensive companies are spending a lot of their budget to actively fight against the spammers?

Yeah, they said this is what happens.

"things that mysteriously don't work in USA despite working more or less okay in most developed countries"

Let my try:

expensive telephone (despite the infamous bell breakup)
super-expensive health
super-expensive college
no universally free or dirty cheap wire transfers between any bank
difficult to find a girlfriend
no gun control (disputed)
crap trai

... (read more)

5Viliam8mo

I would add: * mass transit sucks (depending on the city) * expensive internet connection (probably also location-dependent) * problems with labeling allergens contained in food * too much sugar in food (including the kinds of food that normally shouldn't contain it, e.g. fish) I may be wrong about some things here, but that's kinda my point -- I would like someone to treat this seriously, to separate actual America-specific things from things that generally suck in many (but not all) places across developed countries, to create an actual America-specific list. And then, analyze the causes, both historical (why it started) and current (why it cannot stop). Sorry for going off-topic, but I would really really want someone to write about this. It's a huge mystery to me, and most people don't seem to care; I guess everyone just takes their situation as normal.

AI #79: Ready for Some Football

rotatingpaguro8mo30

I don't really know about this specific proposal to deter spam calls, but speaking in general: I'm from another large first world country, and when staying in the US a striking difference was receiving on average 4 spam calls per day. My american friends told me it was because my phone company was low-cost, but it was O(10) more expensive (per unit data) than what I had back home, with about O(1) spam calls per year.

So I expect that it is totally possible to solve this problem without doing something too fancy, even if I don't know how it's solved where I am from.

3Viliam8mo

OK, that is way too much. I don't understand how specifically that causes more spam calls. Does it imply that normally everyone would receive as many spam calls, but the more expensive companies are spending a lot of their budget to actively fight against the spammers? Neither do I, so I am filing it under "things that mysteriously don't work in USA despite working more or less okay in most developed countries". Someone should write a book about this whole set, because I am really curious about it, and I assume that Americans would be even more curious.

Are most people deeply confused about "love", or am I missing a human universal?

Answer by rotatingpaguroAug 27, 202421

Most people do not have the analytical clarity to be able to give an explanation of love isomorphic to their implementation of love; to that extent, they are "confused about love".

This though does not imply that their usage of the word "love" is amiss, the same way people are able to get through simple reasoning without learning logic, or walking without learning Physics.

So I'll assume that people are wielding "love" meaningfully, and try to infer what the word means.

It seems to indicate prolonged positive emotional involvement with an external entity. Oth... (read more)

How great is the utility of "saving" endangered languages?

I'm reminded of the recent review of How Language Began on ACX: the missionary linguist becomes an atheist because in the local very weird language they have declinations to indicate the source of what you are saying, and saying things about Jesus just doesn't click.

The Ap Distribution

Monthly Roundup #21: August 2024

I still don't understand your "infinite limit" idea. If in your post I drop the following paragraph:

A way to think about the proposition $A_{p}$ is as a kind of limit. When we have little evidence, each bit of evidence has a potentially big impact on our overall probability of a given proposition. But each incremental bit of evidence shifts our beliefs less and less. The proposition $A_{p}$ can be thought of a shorthand for an infinite collection of evidences $F_{i}$ where the collection leads to an overall probability of $p$ given t

... (read more)

rotatingpaguro8mo30

I happened to have the same doubt as you. A deeper analysis of the sacred texts shows how your interpretation of the Golden Rule is amiss. You say:

“Do unto others as you would have them do unto you.” (Matthew 7:12)

But the correct version is:

Therefore whatever you desire for men to do to you, you shall also do to them; for this is the law and the prophets.

The verse speaks specifically of men, not generically of others. So if you are straight, it does not compel you to sexual acts on women, while if you are gay, you shall try to hit on all the men to your he... (read more)

Shortform

rotatingpaguro8mo21

There are too many nonpolar bears in the US to keep up the lie.

Shortform

rotatingpaguro8mo30

I guess the point of the official party line is to avoid kids going and trying to scare polar bears.

5eukaryote8mo

As opposed to other species of bear, which are safe for children to engage with?

The Ap Distribution

rotatingpaguro8mo20

I don't think this concept is useful.

What you are showing with the coin is a hierarchical model over multiple coin flips, and doesn't need new probability concepts. Let $F_{i}$ be the flips. All you need in life is the distribution $P (F_{1}, F_{2}, \dots)$ . You can decide to restrict yourself to distributions of the form $\int_{0}^{1} d p_{coin} P (F, G | p_{coin}) p (p_{coin})$ . In practice, you start out thinking about $p_{coin}$ as a variable atop all the $F_{i}$ in a graph, and then think in terms of $P (F, G | p_{coin})$ and $p (p_{coin})$ separately, because that... (read more)

2criticalpoints8mo

Thanks for the feedback. This isn't emphasized by Jaynes (though I believe it's mentioned at the very end of the chapter), but the Ap distribution isn't new as a formal idea in probability theory. It's based on De Finetti's representation theorem. The theorem concerns exchangeable sequences of random variables. A sequence of random variables {Xi} is exchangeable if the joint distribution of any finite subsequence is invariant under permutations. A sequence of coin flips is the canonical example. Note that exchangeability does not imply independence! If I have a perfectly biased coin where I don't know the bias, then all the random variables are perfectly dependent on each other (they all must obtain the same value). De Finetti's representation theorem says that any exchangeable sequence of random variables can be represented as an integral over identical and independent distributions (i.e binomial distributions). Or in other words, the extent to which random variables in the sequence are dependent on each other is solely due to their mutual relationship to the latent variable (the hidden bias of the coin). P(X1=x1,…,Xn=xn)=∫10(nk)θk(1−θ)n−kdF(θ) You are correct that all relevant information is contained in the joint distribution P(F1,F2,...). And while I have no deep familiarity with Bayesian hierarchical modeling, I believe your claim that the decomposition ∫10 dpcoinP(F,G|pcoin)p(pcoin) is standard in Bayesian modeling. But I think the point is that the Ap distribution is a useful conceptual tool when considering distributions governed by a time-invariant generating process. A lot of real-world processes don't fit that description, but many do fit that description. Yes, this is correct. The part about "the probability of assigning a probability" and the part about interpreting the proposition Ap as a shorthand for an infinite collection evidences are my own interpretations of what the Ap distribution "really" means. Specifically, the part about the "probabi

AI #76: Six Shorts Stories About OpenAI

A group of MR links led to a group of links that led to this list of Obvious Travel Advice. It seems like very good Obvious Travel Advice, and I endorse almost all points.
> A place that has staff trying to flag down customers walking past is almost certainly pursuing a get people in the door strategy.

To my great surprise, I found this to be false in Pisa (n_restaurant = 2).