All of Jalex S's Comments + Replies

doing anything requires extra effort proportional to the severity of symptoms

that's what the entire post is about?

1Archimedes
Yes, but with a very different description of the subjective experience -- kind of like getting a sunburn on your back feels very different than most other types of back pain.

Seems like a bad joke, and accordingly I have decreased trust that bhauth posts won't waste the reader's time in the future.

Getting oxygen from the moon to LEO requires less delta V than going from the Moon to LEO!

I think there might be a typo? 

2harsimony
Oops yes, that should read "Getting oxygen from the moon to LEO requires less delta V than going from the Earth to LEO!". I edited the original comment.

possibly an easier entry point to the topic is here
https://en.wikipedia.org/wiki/Chaitin%27s_constant

which is a specific construction with some relation to the one OP has in mind

MoviePass was paying full price for every ticket.

2Dzoldzaya
Ah, thanks, okay, I get it now. That's a very different proposition! Updated my post.

Well what's the appropriate way to act in the face of the fact that I AM sure I am right?

  1. Change your beliefs
  2. Convince literally one specific other person that you're right and your quest is important, and have them help translate for a broader audience

I agree that my suggestion was not especially helpful.

I think a generic answer is "read the sequences"? Here's a fun one
https://www.lesswrong.com/posts/qRWfvgJG75ESLRNu9/the-crackpot-offer

1MadHatter
I have read the sequences. Not all of them, because, who has time.  Here is a video of me reading the sequences (both Eliezer's and my own): https://bittertruths.substack.com/p/semi-adequate-equilibria

I think "read the sequences" is an incredibly unhelpful suggestion. It's an unrealistic high bar for entry. The sequences are absolutely massive. It's like saying "read the whole bible before talking to anyone at church", but even longer. And many newcomers already understand the vast bulk of that content. Even the more helpful selected sequences are two thousand pages.

We need a better introduction to alignment work, LessWrong community standards, and rationality. Until we have it, we need to personally be more helpful to aspiring community members.

See Th... (read more)

5MadHatter
And now I have my own Sequence! I predict that it will be as unpopular as the rest of my work.
8MadHatter
I agree that that is an extremely relevant post to my current situation and general demeanor in life. I guess I'm not willing to declare the alignment problem unsolvable just because it's difficult, and I'm not willing to let anyone else claim to have solved it before I get to claim that I've solved it? And that inherently makes me a crackpot until such time as consensus reality catches up with me or I change my mind about my most deeply held values and priorities. Are there any other posts from the sequences that you think I should read?

With regards to subsidizing: all the subsidizer needs to do in order to incentivize work on P is sell shares of P. If they are short P when P is proven, they lose money -- this money in effect goes to the people who worked to prove it. 

 

To be more concrete:
Suppose P is trading at $0.50. I think I can prove P with one hour of work. Then an action available to me is to buy 100 shares of P, prove it, and then sell them back for $50 of profit.
But my going fee is $55/hour, so I don't do it. 
Then a grantmaker comes along and offers to sell some shares at $0.40. Now the price is right for me, so I buy and prove and make $60/hr. 
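A quick sketch of that arithmetic (the prices, the 100 shares, and the one hour of work are just the numbers from the example above; `hourly_rate` is an illustrative helper, not anything from a real market API):

```python
# Profit per hour from buying `shares` of P at `price`, proving P, and selling
# the now-settled shares back at $1.00 each. Numbers follow the example above.
def hourly_rate(price, shares=100, hours_of_work=1.0):
    cost = shares * price
    payout = shares * 1.00  # a proven proposition settles at $1 per share
    return (payout - cost) / hours_of_work

print(hourly_rate(0.50))  # 50.0 -> $50/hr, below my $55/hr going fee, so I pass
print(hourly_rate(0.40))  # 60.0 -> $60/hr once the grantmaker sells at $0.40
```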

3boazbarak
I was asked about this on Twitter. Gwern's essay deserves a fuller response than a comment, but I'm not arguing for the position Gwern argues against. I don't argue that agent AIs are not useful or won't be built. I am not arguing that humans must always be in the loop. My argument is that tool vs. agent AI is not so much about competition as about specialization. Agent AIs have their uses, but if we consider the "deep learning equation" of turning FLOPs into intelligence, then it's hard to beat training for predictions on static data. So I do think that while RL can be used for AI agents, the intelligence "heavy lifting" (pun intended) would be done by non-agentic tools: very large static models. Even "hybrid models" like GPT-3.5 can best be understood as consisting of an "intelligence forklift" -- the pretrained next-token predictor on which 99.9% of the FLOPs were spent -- and an additional light "adapter" that turns this forklift into a useful chatbot, etc.

I think part of your point, translated to local language, is "GPTs are Tool AIs, and Tool AI doesn't necessarily become agentic".

Thank you! You’re right. Another point is that intelligence and agency are independent, and a tool AI can be (much) more intelligent than an agentic one.

IMO those issues are all very minor, even when summed.

Is that relevant? Imagine that we were discussing the replacement of a ramp with stairs. This has a very minor effect on my experience -- is that enough to conclude the change was benign?

This is an example where the true distribution of future prices is bimodal (with the average between the modes). If all you can do is buy or sell stock, then you actually have to disagree with the market about the distribution to make money. 

Without having information about the probability of default, there might still be something to do based on the vol curve.
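A toy illustration with made-up numbers (nothing here is from the actual market being discussed): when the price already equals the mean of the bimodal distribution, a plain long or short position has zero expected edge unless you disagree with the probabilities themselves.

```python
# Two modes: "default" (price -> 0) and "fine" (price -> 100), with the market
# price sitting at the probability-weighted average of the two.
p_default = 0.5                              # assumed, purely for illustration
price_if_default, price_if_fine = 0.0, 100.0
market_price = p_default * price_if_default + (1 - p_default) * price_if_fine

expected_pnl_of_long = (p_default * (price_if_default - market_price)
                        + (1 - p_default) * (price_if_fine - market_price))
print(market_price, expected_pnl_of_long)    # 50.0 0.0 -- no edge from direction alone
```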

Because the phenomenon happens at the tokenization level, GPT at runtime can't, like, "perceive" the letters. It has no idea that "SolidGoldMagikarp" looks similar to "SolidSoldMagikarp" (or the reverse)
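A rough way to see this for yourself (this assumes the `tiktoken` library; the exact token ids depend on the encoding and aren't claimed here):

```python
import tiktoken

# GPT-2/GPT-3-era BPE, the vocabulary where the glitch tokens were found.
enc = tiktoken.get_encoding("gpt2")
for s in [" SolidGoldMagikarp", " SolidSoldMagikarp"]:
    print(repr(s), "->", enc.encode(s))
# " SolidGoldMagikarp" is (famously) a single unusual token in this encoding,
# while the lookalike string breaks into several ordinary sub-word pieces --
# so at the level the model actually sees, the two inputs share almost no structure.
```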

it would be 3 lines

~all of the information is in lines 2 and 3, so you'd get all of the info on the first screen if you nix line 1.~

edit: not sure what I was thinking -- thanks, Slider

[This comment is no longer endorsed by its author]
3Slider
departure announcements are not a thing?

For those who care, it's open source and you can host your own server from a docker image. (In addition to the normal "just click buttons on our website and pay us some money to host a server for you" option)

I think that to get the type of the agent, you need to apply a fixpoint operator. This also happens inside the proof of Löb for constructing a certain self-referential sentence.
(As a breadcrumb, I've heard that this is related to the Y combinator.)
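As a concrete (and purely illustrative) version of that breadcrumb, here is a strict-language fixed-point combinator; the analogy to Löb is loose, just "self-reference obtained by applying something to itself":

```python
# Z combinator (the Y combinator adapted for an eagerly-evaluated language):
# it returns a fixed point of `f`, letting a function call itself without ever
# being given its own name -- the flavor of self-reference used to construct
# the self-referential sentence inside the proof of Loeb's theorem.
def fix(f):
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Usage: factorial defined via its one-step unfolding, then tied into a knot.
factorial = fix(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
assert factorial(5) == 120
```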

2Adam Jermyn
Good catch, fixed!

Who do you suppose will buy them?

5ChristianKl
FTX had $1 billion yearly revenue in 2021 which is a 1000% year-over-year improvement. 20 billion market cap for FTX is a 20x revenue multiple. 20x revenue multiples for startups aren't uncommon and 1000% year-over-year improvement means startup.  I think that revenue is mostly trading fees which are a reasonable service. The speculative profits seem to be made by Alameda. You can argue with all startups about whether or not they are overpriced but it's qualitatively different than Madoff hair which has zero market value. FTX was working on new financial products like prediction markets as well, so even in a crypto downturn, they could expand to new markets.
-2M. Y. Zuo
So what does money in a bank account (in electronic form?) have to do with the diminishing returns of intelligence?

My knee-jerk reaction to the argument was negative, but now I'm confused enough to say something.

If the contract is trading for M$p, then the "arbitrage" of "buy a share of yes and cause the contract to settle to 1" nets you M$(1-p) per share. Pushing up the price reduces the incentive for a new player to hit the arb.  

If you sell the contract, you are paying someone to press the button and hoping they do not act on their incentive. 

An interesting plot twist: after you buy the contract, your incentive has changed -- instead of the M$(1-p) availab... (read more)
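A minimal sketch of the payoffs described above (M$ amounts; `p` is the price you trade at, everything else is illustrative):

```python
# Buying YES at price p and then causing the market to resolve to 1 nets (1 - p)
# per share; whoever sold at p loses (1 - p) per share if the button gets pressed.
def buyer_profit(p):
    return 1 - p

def seller_profit(p):
    return p - 1

print(buyer_profit(0.50), seller_profit(0.50))  # 0.5 -0.5
print(buyer_profit(0.90), seller_profit(0.90))  # 0.1 -0.1 -- a higher price leaves
# less on the table for a new player to grab by hitting the "arb".
```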

Can you say more about what you mean by "steering mechanisms"? Is it something like "outer loop of an optimization algorithm"?

How complex of a computation do you expect to need in order to find an example where cut activations express something that's hard to find in layer activations? 

Money in the account per year is not fuzzy; it is literally a scalar for which the ground truth is literally a number stored in a computer.

-2M. Y. Zuo
Did you reply to the right comment? The last topic discussed was 13d ago, on the diminishing returns of intelligence.

If you convince people to live there, then there's more places for people to live and the population growth rate goes up. Many folks care about this goal, though idk whether it's interesting to you specifically.

The US isn't short on places to live; it's short on places to live that are short drives from the people and businesses you most want to interact with. If you want to found a new city, there are cheaper and more desirable places to do it; the difficulty comes from the fact that very few people want to go somewhere that doesn't already have a large critical mass of people, businesses, and infrastructure in place.

Writing my dating profile was a good use of my time before I shared it with anybody. I had an insufficiently strong sense of what kind of relationship I want and why other people might want to have it with me. The exercise of "make a freeform document capturing all of that" was very helpful for focusing my mind towards figuring it out -- much moreso than the exercise of "fill in dating app textboxes in a way that seems competitive for the swiping game". (This is just a special case of "writing an essay teaches you a lot" -- something I'd like to take advan... (read more)

Which trade are you advocating for? "long crypto"? Reversion (aka "buying the dip")? Long ETH vs. short BTC?

All of these are plausible opinions, and it's not crazy to allocate some of your portfolio based on them -- but a trade consists of a price and a size. Do you think you should have 0.1% of your net worth in ETH or 30%? Does that change if ETH goes to 100 or 3000 next week? Do your arguments apply equally well elsewhere? (solana?)

 

2mukashi
I am just saying long ETH. I am not giving any financial advice because everyone has a different risk profile. In my personal case, I am comfortable having around 30 or 40% of my net worth in cryptos (specifically ETH). I do not think the same arguments can be applied to other cryptos (I do have some SOL and I am not buying anymore for now)

It's a piece of fiction about someone using a funky language model tool to write autobiographical fiction.

Specifically, the punchline is using the repetition trap as an emotive ending bang. (Which is clever but also something that will be lost on most people who have not used large models personally, because users usually scrub or restart repetition-trap samples while tweaking the repetition penalty & other sampling parameters to minimize it as part of basic prompt engineering.)

If you launch the nukes, you also die, and we spend a lot of time worrying about that. Why?

1Not Relevant
We actually don't worry about that that much. Nothing close to the 60s, before the IAEA and second-strike capabilities. These days we mostly worry about escalation cycles, i.e. unpredictable responses by counterparties to minor escalations and continuously upping the ante to save face. There isn't an obvious equivalent escalation cycle for somebody debating with themselves whether to destroy themselves or not. (The closer we get to alignment, the less true this is, btw.)

So you have a crisp concept called "unbounded utility maximizer" so that some AI systems are, some AI systems aren't, and the ones that aren't are safe. Your plan is to teach everyone where that sharp conceptual boundary is, and then what? Convince them to walk back over the line and stay there? 

Do you think your mission is easier or harder than nuclear disarmament?

3Not Relevant
The alignment problem isn’t a political opinion, it’s a mathematical truth. If they understand it, they can and will want to work the line out for themselves, with the scientific community publicly working to help any who want it. Nuclear disarmament is hard because if someone else defects you die. But the point here is that if you defect you also die. So the decision matrix on the value of defecting is different, especially if you know other people also know their cost of defection is high.

I think I get what you're saying now; let me try to rephrase. We want to grow the "think good and do good" community. We have a lot of let's say "recruitment material" that appeals to people's sense of do-gooding, so unaligned people that vaguely want to do good might trip over the material and get recruited. But we have less of that on the think-gooding side, so there's a larger gap of unaligned people who want to think good that we could recruit. 

Does that seem right? 

Where does the Atlas fellowship fall on your scale of "recruits do-gooders" versus "recruits think-gooders"?

I think the most important claim you make here is that trying to fit into a cultural niche called "rationality" makes you a more effective researcher than trying to fit into a cultural niche called "EA". I think this is a plausible claim, (e.g. I feel this way about doing a math or philosophy undergrad degree over doing an economics or computer science undergrad degree) but I don't intuitively agree with it. Do you have any arguments in favor?

5Alex_Altair
Hm, so maybe a highly distilled version of my model here is that EAs tend to come from a worldview of trying to do the most good, whereas rationalists tend to come from a world view of Getting the Right Answer. I think the latter is more useful for preventing AI x-risk. (Though to be very clear, the former is also hugely laudable, and we need orders of magnitude more of both types of people active in the world; I'm just wondering if we're leaving value on the table by not having a rationalist funnel specifically).

Pushing which button? They're deploying systems and competing on how capable those systems are. How do they know the systems they're deploying are safe? How do they define "not-unbounded-utility-maximizers" (and why is it not a solution to the whole alignment problem)? What about your "alignment-pilled" world is different from today's world, wherein large institutions already prefer not to kill themselves?

1Not Relevant
Wait, there are lots of things that aren’t unbounded utility maximizers - just because they’re “uncompetitive” doesn’t mean that non-suicidal actors won’t stick to them. AlphaGo isn’t! The standard LessWrong critique is that such systems don’t provide pivotal acts, but the whole point of governance is not to need to rely on pivotal acts. The difference from this world is that in this world large institutions are largely unaware of alignment failure modes and will thus likely deploy unbounded utility maximizers.

How does that distinguish between AGI and not-yet-AGI? How does that prevent an arms race?

1Not Relevant
An arms race to what? If we alignment-pill the arms-racers, they understand that pushing the button means certain death. If your point is an arms race on not-unbounded-utility-maximizers, yeah afaict that’s inevitable… but not nearly as bad?

Is there any concrete proposal that meets your specification? "don't kill yourself with AGI, please"?

1Not Relevant
Prevent agglomerations of data center scale compute via supply chain monitoring, do mass expert education, create a massive social stigma (like with human bio experimentation), and I think we buy ourselves a decade easily.

I think the impact of little bits of "people engage with the problem" is not significantly positive. Maybe it rounds to zero. Maybe it is negative, if people engaging lightly flood serious people with noisy requests.

Hard research problems just don't get solved by people thinking for five minutes. There are some people who can make real contributions [0] by thinking for ~five hours per week for a couple of months, but they are quite rare.

(This is orthogonal to the current discussion, but: I had not heard of stampy.ai before your comment. Probably you should... (read more)

6Chris_Leong
I'm not suggesting that they contribute towards research, just that if they were able to reliably get things done they'd be able to find someone who'd benefit from a volunteer. But I'm guessing you think they'd waste people's time by sending them a bunch of emails asking if they need help? Or that a lot of people who volunteer then cause issues by being unreliable?

It seems like you're pointing at a model where society can make progress on safety by having a bunch of people put some marginal effort towards it. That seems insane to me -- have I misunderstood you?

7Chris_Leong
Sorry, I don't quite understand your objection? Is it that you don't think these are net-positive, that you think all of these little bits will merely add up to a rounding error or that you think timelines are too short for them to make a difference?

Holden Karnofsky writes about this here
https://www.cold-takes.com/future-proof-ethics/

2Davis_Kingsley
Yeah, I strongly disagree with some of his takes but agree he has a similar thing in mind.

you would be able to drop those activities quickly and find new work or hobbies within a few months. 

 

I don't see it. Literally how would I defend myself? Someone who doesn't like me tells you that I'm doing AI research. What questions do you ask them before investigating me? What questions do you ask me? Are there any answers I can give that meaningfully prove that I never did any such research (without you ransacking my house and destroying my computers?)


re q2: If you set up the bounty, then other people can use it to target whoever they want. ... (read more)

In this system, how do I defend myself from the accusation of "being an AI researcher"? I know some theorems, write some code, and sometimes talk about recent AI papers. I've never tried to train new AI systems, but how would you know?

Have you heard about McCarthyism? 
 

If you had the goal of maximizing the probability of unaligned AI, you could target only "AI researchers" that might contribute to the alignment problem. Since they're a much smaller target than AI researchers at large, you'll kill a much larger fraction of them and reduce their relative power over the future.

1aaq
OP here, talking from an older account because it was easier to log into on mobile.

Kill: I never said anything about killing them. Prisoners like this don't pose any immediate threat to anyone, and indeed are probably very skilled white collar workers who could earn a lot of money even behind bars. No reason you couldn't just throw them into a minimum security jail in Sweden or something and keep an eye on their Internet activity.

McCarthyism: Communism didn't take over in the US. That provides if anything weak evidence that these kinds of policies can work, even for suppressing much more controversial ideas than preventing the building of an unsafe AI.

q1: The hardcore answer would be "Sorry kid, nothing personal." If there was ever a domain where false positives were acceptable losses, stopping unaligned AI from being created in the first place would probably be it. People have waged wars for far less. The softcore answer, and the one I actually believe, is that you're probably a smart enough guy that if such a bounty were announced you would be able to drop those activities quickly and find new work or hobbies within a few months.

q2: I mean, you could. You can make a bounty to disincentivize any behavior. But who would have that kind of goal or support such a bounty, much less fund one? If you're worried about Goodhart's law here, just use a coarse enough metric like "gets paid to work on something AI-related" and accept there would be some false positives.

Thanks for all the detail, and for looking past my clumsy questions!

It sounds like one disagreement you're pointing at is about the shape of possible futures. You value "humanity colonizes the universe" far less than some other people do. (maybe rob in particular?) That seems sane to me.

The near-term decision questions that brought us here were about how hard to fight to "solve the alignment problem," whatever that means. For that, the real question is about the difference in total value of the future conditioned on "solving" it and conditioned on "not sol... (read more)

6jbash
In some philosophical sense, you have to multiply the expected value by the estimated chance of success. They both count. But I'm not sitting there actually doing multiplication, because I don't think you can put good enough estimates on either one to make the result meaningful.

In fact, I guess that there's a better than 1 percent chance of avoiding AI catastrophe in real life, although I'm not sure I'd want to (a) put a number on it, (b) guess how much of the hope is in "solving alignment" versus the problem just not being what people think it will be, (c) guess how much influence my or anybody else's actions would have on moving the probability [edited from "property"...], or even (d) necessarily commit to very many guesses about which actions would move the probability in which directions. I'm just generally not convinced that the whole thing is predictable down to 1 percent at all. In any case, I am not in fact working on it.

I don't actually know what values I would put on a lot of futures, even the 1000 year one. Don't get hung up on the billion dollars, because I also wouldn't take a billion dollars to singlemindedly dedicate the remainder of my life, or even my "working time", to anything in particular unless I enjoyed it. Enjoying life is something you can do with relative certainty, and it can be enough even if you then die. That can be a big enough "work of art". Everybody up to this point has in fact died, and they did OK.

For that matter, I'm about 60 years old, so I'm personally likely to die before any of this stuff happens... although I do have a child and would very much prefer she didn't have to deal with anything too awful. I guess I'd probably work on it if I thought I had a large, clear contribution to make to it, but in fact I have absolutely no idea at all how to do it, and no reason to expect I'm unusually talented at anything that would actually advance it.

If you ended up enacting a serious s-risk, I don't understand how you could sa

One thing I like about the "dignity as log-odds" framework is that it implicitly centers coordination.

I guess by "civilization" I meant "civilization whose main story is still being meaningfully controlled by humans who are individually similar to modern humans". Other than that, I just mean your current expectations about what that civilization is like, conditioned on it existing.

(It seems like you could be disagreeing with "a lot of people here" about what those futures look like or how valuable they are or both -- I'd be happy to get clarification on either front.)

3jbash
Hmm. I should have asked what the alternative to civilization was going to be. Nailing it down to a very specific question, suppose my alternatives are...

1. I get a billion dollars. My life goes on as normal otherwise. Civilization does whatever it's going to do; I'm not told what. Omega tells me that everybody will suddenly drop dead at some time within 1000 years, for reasons I don't get to know, with probability one minus one in ten million.

...versus...

2. I do not get a billion dollars. My life goes on as normal otherwise. Civilization does whatever it's going to do; I'm not told what. Omega tells me that everybody will suddenly drop dead at some time within 1000 years, for reasons I don't get to know, with probability one minus one in one million.

...then I don't think I take the billion dollars. Honestly the only really interesting thing I can think of to do with that kind of money would be to play around with the future of civilization anyway.

I think that's probably the question you meant to ask. However, that's a very, very specific question, and there are lots of other hypotheticals you could come up with.

The "civilization whose main story is still being meaningfully controlled by humans etc." thing bothers me. If a utopian godlike friendly AI were somehow on offer, I would actively pay money to take control away from humans and hand it to that AI... especially if I or people I personally care about had to live in that world. And there could also be valuable modes of human life other than civilization. Or even nonhuman things that might be more valuable. If those were my alternatives, and I knew that to be the case, then my answer might change.

For that matter, even if everybody were somehow going to die, my answer could depend on how civilization was going to end and what it was going to do before ending. A jerkass genie Omega might be withholding information and offering me a bum deal. Suppose I knew that civilization would end bec

Can you be more explicit about the arithmetic? Would increasing the probability of civilization existing 1000 years from now from 10^{-7} to 10^{-6} be worth more or less to you than receiving a billion dollars right now?

2jbash
Do I get any information about what kind of civilization I'm getting, and/or about what it would be doing during the 1000 years or after the 1000 years? On edit: Removed the "by how much" because I figured out how to read the notation that gave the answer.

One of Eliezer's points is that most people's judgements about adding 1e-5 odds (I assume you mean log odds and not additive probability?) are wrong, and even systematically have the wrong sign. 

I mostly know this idea as pre-rigor and post-rigor in mathematics:
https://terrytao.wordpress.com/career-advice/theres-more-to-mathematics-than-rigour-and-proofs/

Holden Karnofsky has written some about average quality of life, including talking about that chart.

https://www.cold-takes.com/has-life-gotten-better/

 I think he thinks that the zero point was crossed long before 1900, but I'm not sure.


I think the phrase "10,000 years of misery" is perfectly consistent with believing that the changes were net good due to population growth, and "misery" is pretty much equivalent to "average quality of life". 

I mostly agree with swarriner, and I want to add that writing out more explicit strategies for making and maintaining friends is a public good.

The "case clinic" idea seems good. This sometimes naturally emerges among my friends, and trying to do it more would probably be net positive in my social circles.

2Jarred Filmer
Thank you, I thought so too 😊 And yeah, case clinics have given me a lot of value. If something like it is emerging naturally among your friends, then they sound like great friends! If you do try to expressly instantiate a case clinic with the steps, I'd be curious to hear how it goes. I've been surprised at the effect setting an explicit format can have on how it feels to be in a group. Something about creating common knowledge on where we're all supposed to be directing our attention (and with what intention) can be really powerful. Thinking about it now, I suppose this is how DnD works 😄

Requiring ∑_i |x_i| to be finite is just part of assuming the x_i form a probability distribution over worlds. I think you're confused about the type difference between the x_i and the utility of A_i. (Where in the context of this post, the utility is just represented by an element of a poset.)

I'm not advocating for or making arguments about any fanciness related to infinitesimals or different infinite values or anything like that.

L is not equal to infinity; that's a type error. L is equal to 1/2 A_0 + 1/4 A_1 + 1/8 A_2 ...

ℓ^1 is a bona fide vector space -- addition behaves as you expect. The points are infinite sequences (x_i) such that ∑_i |x_i| is finite. This sum defines a norm, and the space is Banach with respect to that norm.

Concretely, our interpretation is that x_i is the probability of being in world A_i.

A utility function is a linear functional, i.e. a map from points to real numbers such that the map commutes with addition. The space of continuous linear functionals... (read more)
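For reference, here is a sketch of the standard definitions in play (nothing beyond textbook ℓ^1; not claiming anything specific about the post under discussion):

```latex
\[
\ell^1 = \Big\{ (x_i)_{i \ge 0} : x_i \in \mathbb{R},\ \sum_i |x_i| < \infty \Big\},
\qquad \lVert x \rVert_1 = \sum_i |x_i| .
\]
% A utility function in this picture is a continuous linear functional
% f : \ell^1 \to \mathbb{R}, i.e. f(ax + by) = a f(x) + b f(y) for scalars a, b
% and points x, y in \ell^1.
```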

1Slider
The presentation tries to deal with unbounded utilities. Assuming ∑_i |x_i| to be finite excludes the target of investigation from the scope. Supposedly there are multiple text input methods, but at least on the website I can highlight text and use a f(x) button to get math rendering. I don't know enough about the fancy spaces to say whether a version where the norm can take on transfinite or infinitesimal values makes sense, or whether the elements are just sequences without a condition to converge. Either (real number times an outcome) is a type for which a finiteness check doesn't make sense, or the allowable conversions from outcomes to real numbers force the sum to be bigger than any real number.

The sum we're rearranging isn't a sum of real numbers, it's a sum in ℓ^1. Ignoring details of what ℓ^1 means... the two rearrangements give the same sum! So I don't understand what your argument is.

Abstracting away the addition and working in an arbitrary topological space, the argument goes like this: lim x_n = lim y_n = L. For all n, f(x_n) = 0 and f(y_n) = 1. Therefore, f is not continuous (else 0 = 1).

2Slider
if ℓ^1 is something weird then I don't necessarily even know that x + y = y + x; it is not a given at all that rearrangement would be permissible. In order to sensibly compare lim x_n and lim y_n it would be nice if they both existed and were not infinities. L = lim x_n = lim y_n = ∞ is not useful for transiting equalities between x and y.