All of mtaran's Comments + Replies

Re: LLMs for coding: One lens on this is that LLM progress changes the Build vs Buy calculus. 

Low-power AI coding assistants were useful in both the "build" and "buy" scenarios, but they weren't impactful enough to change the actual border between build-is-better vs. buy-is-better. More powerful AI coding systems/agents can make a lot of tasks sufficiently easy that dealing with some components starts feeling more like buying than building. Different problem domains have different peak levels of complexity/novelty, so the easier domains will start bei... (read more)

Perhaps if you needed a larger number of ternary weights, but the paper claims to achieve the same performance with ternary weights as one gets with 16-bit weights using the same parameter count.

Answer by mtaran10-4

I think this could be a big boon for mechanistic interpretability, since it can be a lot more straightforward to interpret a bunch of {-1, 0, 1}s than reals. Not a silver bullet by any means, but it would at least peel back one layer of complexity.

6Thomas Kwa
It could also be harder. Say that 10 bits of each current 16-bit parameter are useful; then to match the capacity you would need ~6 ternary parameters, which could be hard to find or could interact in unpredictable ways.
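(A quick check of the capacity figure above, treating "useful bits" as raw information capacity; this is an assumption, since the reply's "useful bits" may not compose this simply:)

$$\log_2 3 \approx 1.585 \ \text{bits per ternary weight}, \qquad \frac{10 \ \text{bits}}{1.585 \ \text{bits/weight}} \approx 6.3 \ \text{ternary weights}$$

So roughly 6-7 ternary parameters per current parameter, consistent with the ~6 quoted above.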

Wouldn't the granularity of the action space also impact things? For example, even if a child struggles to pick up some object, you would probably do an even worse job if your action space was picking joint angles, or forces for muscles to apply, or individual timings of action potentials to send to separate nerves.

This is a cool model. I agree that in my experience it works better to study sentence pairs than single words, and that having fewer exact repetitions is better as well. Probably paragraphs would be even better, as long as they're tailored to be not too difficult to understand (e.g. with a limited number of unknown words/grammatical constructions).

One thing various people recommend for learning languages quickly is to talk with native speakers, and I also notice that this has an extremely large effect. I generally think of it as having to do with more of o... (read more)

2bhauth
Yes; try tracing the path of that data from conversations, and you should see what systems it would be training.

A few others have commented about how MSFT doesn't necessarily stifle innovation, and a relevant point here is that MSFT is generally pretty good at letting its subsidiaries do their own thing and have their own culture. In particular GitHub (where I work), still uses Google Workspace for docs/email, slack+zoom for communication, etc. GH is very much remote-first whereas that's more of an exception at MSFT, and GH has a lot less suffocating bureaucracy, and so on. Over the years since the acquisition this has shifted to some extent, and my team (Copilot) i... (read more)

It'd be great if one of the features of these "conversation" type posts was that they would get an LLM-generated summary or a non-conversation version of the content, because at least for me this format is super frustrating to read and ends up having a lower signal-to-noise ratio.

4habryka
I do think at least for a post like this, the likelihood the LLM would get any of the math right is pretty low. I do think some summary that allows people to decide whether to read the thing is pretty valuable, but I think it's currently out of reach to have a summary actually contain the relevant ideas/insights in a post.

You have a post about small nanobots being unlikely, but do you have similar opinions about macroscopic nanoassemblers? Non-microscopic ones could have a vacuum and lower temperatures inside, etc.

3mako yass
What can you do with macroscopic nanoassemblers? Usually for a nanostructure to have an effect on human scales, you need a lot of it. If the assemblers are big and expensive, you won't get a lot of it.
0bhauth
Different things have different optimal scales. In practice, lithography machines + molds + 3d printers + etc are better ways to make fine detail than controlling lots of tiny robotic arms. Big robotic arms are more cost-effective than small ones.

Strong upvote for the core point that brains Goodharting themselves is a relatively common failure mode. I honestly didn't read the second half of the post due to time constraints, but the first half rang true to me. I've only experienced something like social media addiction at the start of the Russian invasion last year, since most of my family is still back in Ukraine. I curated a Twitter list of the most "helpful" authors, etc., but eventually it was taking too much time and emotional energy and I stopped, although it was difficult.

I think this is related ... (read more)

Brief remarks:

  • For AIs we can use the above organizational methods in concert with existing AI-specific training methodologies, which we can't do with humans and human organizations.
  • It doesn't seem particularly fair to compare all human organizations to what we might build specifically when trying to make aligned AI. Human organizations have existed in a large variety of forms for a long time, they have mostly not been explicitly focused on a broad-based "promotion of human flourishing", and have had to fit within lots of ad hoc/historically conditional
... (read more)

I grew up in Arizona and live here again now. It has had a good system of open enrollment for schools for a long time, meaning that you could enroll your kid into a school in another district if they have space (though you'd need to drive them, at least to a nearby school bus stop). And there are lots of charter schools here, for which district boundaries don't matter. So I would expect the impact on housing prices to be minimal.

No super detailed references that touch on exactly what you mention here, but https://transformer-circuits.pub/2021/framework/index.html does deal with some similar concepts with slightly different terminology. I'm sure you've seen it, though.

Is the ordering intended to reflect your personal opinions, or the opinions of people around you/society as a whole, or some objective view? Because I'm having a hard time correlating the order to anything in my world model.

3jefftk
It's my attempt to order them based on how I think most people view them, though perhaps my model of most people's opinions isn't very good here
6Lone Pine
For me it's a spectrum from "not really wrong and shouldn't be seen in moral terms" to "not really wrong and shouldn't be seen in moral terms."

This is the trippiest thing I've read here in a while: congratulations!

If you'd like to get some more concrete feedback from the community here, I'd recommend phrasing your ideas more precisely by using some common mathematical terminology, e.g. talking about sets, sequences, etc. Working out a small example with numbers (rather than just words) will make things easier to understand for other people as well.

3Q Home
I'm bad at math. But I know a topic where you could formulate my ideas using math. I could try to formulate them mathematically with someone's help.

I can give a very abstract example. It's probably oversimplified (in a wrong way) and bad, but here it is: You got three sets, A {9, 1} and B {5, -3} and C {4, 4}. You want to learn something about the sets. Or maybe you want to explain why they're ordered A > C > B in your data. You make orders of those sets using some (arbitrary) rules. For example:

1. A {9} > B {5} > C {4}. This order is based on choosing the largest element.
2. A {10} > C {8} > B {2}. This order is based on adding elements.
3. A {10} > C {8} > B {5}. This order is based on this: you add the elements if the number grows bigger, you choose the largest element otherwise. It's a merge of the previous 2 orders.

If you want to predict A > C > B, you also may order the orders above:

* (2) > (3) > (1). This order is based on predictive power (mostly) and complexity.
* (2) > (1) > (3). This order is based on predictive power and complexity (complexity gives a bigger penalty).
* (3) > (2) > (1). This order is based on how large the numbers in the orders are.

This example is likely useless out of context. But you read the post: so, if there's something you haven't understood just because it was confusing without numbers, then this example should clarify something to you. For example, it may clarify what my post misses to be understandable/open to specific feedback. "No math, no feedback": if this is an irrational requirement, it's gonna put people at risk. Do you think there isn't any other way to share/evaluate ideas? For example, here're some notions:

* On some level our thoughts do consist of biases. See "synaptic weight". My idea says that "biases" exist on (almost) all levels of thinking and those biases are simple enough/interpretable enough. Also it says that some "high-level thinking" or "high-level knowledge" can be modeled by simple enou

My mental model here is something like the following:

  1. a GPT-type model is trained on a bunch of human-written text, written within many different contexts (real and fictional)
  2. it absorbs enough patterns from the training data to be able to complete a wide variety of prompts in ways that also look human-written, in part by being able to pick up on implications & likely context for said prompts and proceeding to generate text consistent with them

Slightly rewritten, your point above is that:

The training data is all written by authors in Context X. What we w

... (read more)
2johnswentworth
Yup, that's the part I disagree with. Prompting could potentially set GPT's internal representation of context to "A lesswrong post from 2050"; the training distribution has lesswrong posts generated over a reasonably broad time-range, so it's plausible that GPT could learn how the lesswrong-post-distribution changes over time and extrapolate that forward. What's not plausible is the "stable, research-friendly environment" part, and more specifically the "world in which AGI is not going to take over in N years" part (assuming that AGI is in fact on track to take over our world; otherwise none of this matters anyway).

The difference is that 100% of GPT's training data is from our world; it has exactly zero variation which would cause it to learn what kind of writing is generated by worlds-in-which-AGI-is-not-going-to-take-over. There is no prompt which will cause it to generate writing from such a world, because there is no string such that writing in our world (and specifically in the training distribution) which follows that string is probably generated by a different world.

(Actually, that's slightly too strong a claim; there does exist such a string. It would involve a program specifying a simulation of some researchers in a safe environment. But there's no such string which we can find without separately figuring out how to simulate/predict researchers in a safe environment without using GPT.)

Alas, querying counterfactual worlds is fundamentally not a thing one can do simply by prompting GPT.

Citation needed? There's plenty of fiction to train on, and those works are set in counterfactual worlds. Similarly, historical, mistaken, etc. texts will not be talking about the Current True World. Sure right now the prompting required is a little janky, e.g.:

 

But this should improve with model size, improved prompting approaches or other techniques like creating optimized virtual prompt tokens.

And also, if you're going to be asking the model for som... (read more)

7johnswentworth
Those works of fiction are all written by authors in our world. What we want is text written by someone who is not from our world. Not the text which someone writing on real-world Lesswrong today imagines someone in safer world would write in 2050, but the text which someone in a safer world would actually write in 2050. After all, those of us writing on Lesswrong today don't actually know what someone in a safer world would write in 2050; that's why simulating/predicting the researcher is useful in the first place.

Please consider aggregating these into a sequence, so it's easier to find the 1/2 post from this one and vice versa.

5Diffractor
Task completed.

Sounds similar to what this book claimed about some mental illnesses being memetic in certain ways: https://astralcodexten.substack.com/p/book-review-crazy-like-us

Answer by mtaran20

If you do get some good results out of talking with people, I'd recommend trying to talk to people about the topics you're interested in via some chat system and then go back and extract out useful/interesting bits that were discussed into a more durable journal. I'd have recommended IRC in the distant past, but nowadays it seems like Discord is the more modern version where this kind of conversation could be found. E.g. there's a slatestarcodex discord at https://discord.com/invite/RTKtdut

 

YMMV and I haven't personally tried this tactic :)

1MSRayne
I do this. I only started trying to do this extraction recently after accruing years of conversations though, and I quickly realized sorting through it was overwhelming and quit... but I need to suck it up and do more. I constantly drop thought-jewels on random friends and then forget them!

Well-written post that will hopefully stir up some good discussion :)

My impression is that LW/EA people prefer to avoid conflict, and when conflict is necessary don't want to use misleading arguments/tactics (with BS regulations seen as such).

I agree; I've felt something similar when having kids. I'd also read the relevant Paul Graham bit, and it wasn't really quite as sudden or dramatic for me. But it has had a noticeable effect long term. I'd previously been okay with kids, though I didn't especially seek out their company or anything. Now it's more fun playing with them, even apart from my own children. No idea how it compares to others, including my parents.

Love this! Do consider citing the fictional source in a spoiler formatted section (ctrl+f for spoiler in https://www.lesswrong.com/posts/2rWKkWuPrgTMpLRbp/lesswrong-faq)

Also small error "from the insight" -> "from the inside"

Answer by mtaran20

The most similar analysis tool I'm aware of is called an activation atlas (https://distill.pub/2019/activation-atlas/), though I've only seen it applied to visual networks. Would love to see it used on language models!

As it is now, this post seems like it would fit in better on Hacker News rather than LessWrong. I don't see how it addresses questions of developing or applying human rationality, broadly interpreted. It could be edited to talk more about how this applies more general principles of effective thinking, but I don't really see that here right now. Hence my downvote for the time being.

People can write personal posts on LW that don't meet Frontpage standards. For example, plenty of rationalists use LW as their personal blog in one way or another and those posts don't make it to the frontpage but they also end up fitting in here.

Came here to post something along these lines. One very extensive commentary with reasons for this is in https://twitter.com/kamilkazani/status/1497993363076915204 (warning: long thread). Will summarize when I can get to a laptop later tonight, or other people are welcome to do it.

Have you considered lasik much? I got it about a decade ago and have generally been super happy with the results. Now I just wear sunglasses when I expect to benefit from them and that works a lot better than photochromatic glasses ever did for me.

The main real downside has been slight halos around bright lights in the dark, but this is mostly something you get used to within a few months. Nowadays I only notice it when stargazing.

7Razied
The true downside of Lasik is the nontrivial risk of permanent eye dryness, this hurts like hell and doesn't really have a cure apart from constantly using eye drops. The bad cases are basically life destroying, my mom had a moderate case of chronic dry eyes and it made her life significantly more unpleasant (she couldn't sleep well and was basically in constant pain during the day).
4jefftk
I'm very nervous about my eyes, and deeply unsettled by the idea of eye surgery. Relatedly, I like having a protective layer between my eyes and the world.

This seems like something that would be better done as a Google form. That would make it easier for people to correlate questions + answers (especially on mobile) and it can be less stressful to answer questions when the answers are going to be kept private.

2Curious_Cruiser
Those are great points! Google forms added.

How is it that authors get reclassified as "harmful", as happened to Wright and Stross? Do you mean that later works become less helpful? How would earlier works go bad?

2RomanS
What I mean: the author's name on the cover can't be used anymore as an indicator of the book's harmfulness/helpfulness. An extreme example is the story of a certain American writer. He wrote some of the most beautiful transhumanist science fiction ever. But then he crashed his car and almost died. He came back wrong. He is now a religious nutjob who writes essays on how transhumans are soul-less children of Satan. And in his new fiction books, transhumanists are stock villains opposed by glorious Christian heroes.
Answer by mtaran382

Given that you didn't actually paste in the criteria emailed to Alcor, it's hard to tell how much of a departure the revision you pasted is from it. Maybe add that in for clarity?

My impression of Alcor (and CI, who I used to be signed up with before) is that they're a very scrappy/resource-limited organization, and thus that they have to stringently prioritize where to expend time and effort. I wish it weren't so, but that seems to be how it is. In addition, they have a lot of unfortunate first-hand experience with legal issues arising during cryopreservat... (read more)

8Jaevko
Thank you for your kind and thoughtful reply; I really appreciate it. Here's the quote: As you can see, it's a pretty big departure.  Given the excellent points you and others raise, I think I will try giving them the benefit of the doubt, and simplify my criteria for Alcor, putting the decision solely in my wife's hands, with the provision that I should be preserved if she is not present and cannot be immediately reached.  If I do not get any more seemingly underhanded pushback (them pushing back a little/stating their concerns is fine, but any more sneakily making huge changes would increase my concerns), then I'll write this off to the factors you suggest, and proceed.  Thank you!

+1 on the wording likely being because Alcor has dealt with resistant families a lot, and generally you stand a better chance of being preserved if Alcor has as much legal authority as possible to make that happen. You may have to explain that you're okay with your wife potentially doing something that would have been against your wishes (yes, I realize you don't expect that, but there's a more than 0% chance it will happen) and would result in no preservation when Alcor thinks you would have liked one.

This is actually why I went with Alcor: they have a long record of going to court to fight for patients in the face of families trying to do something else.

Downvoted for lack of standard punctuation, capitalization, etc., which makes the post unnecessarily hard to read.

Do you mean these to apply at the level of the federal government? At the level of that + a majority of states? Majority of states weighted by population? All states?

Downvoted for burying the lede. I assumed from the buildup that this was about something other than what it was, e.g. how a model that contains more useful information can still be bad (say, if you run out of resources for efficiently interacting with it). But I had to read to the end of the second section to find out I was wrong.

2tailcalled
Added TL;DR to the top of the post.

Came here to suggest exactly this, based on just the title of the question. https://qntm.org/structure has some similar themes as well.

Re: looking at the relationship between neuroscience and AI: lots of researchers have found that modern deep neural networks actually do quite a good job of predicting brain activation (e.g. fMRI) data, suggesting that they are finding some similar abstractions.

Examples: https://www.science.org/doi/10.1126/sciadv.abe7547 https://www.nature.com/articles/s42003-019-0438-y https://cbmm.mit.edu/publications/task-optimized-neural-network-replicates-human-auditory-behavior-predicts-brain

I'll make sure to run it when I get to a laptop. But if you ever get a chance to set the distill.pub article up to run on heroku or something, that'll increase how accessible this is by an order of magnitude.

4Igor Ostrovsky
I (not the OP) put it up here for now: https://igor0.github.io/hand/distill/ I'll take it down if MadHatter asks me or once there is an official site.

Sounds intriguing! You have a GitHub link? :)

2MadHatter
It's very, very rough, but: https://github.com/epurdy/hand
Answer by mtaran110

The biggest rationalist-ish issue for me has been my partners not being interested (or actively disinterested) in signing up for cryonics. This has been the case in three multi-year relationships.

4Randomized, Controlled
Oh, yeah, my mom, sister, and GF have all been entirely uninterested in cryo, but it hasn't caused any issues for me yet. Has it been actively problematic?

You'd be more likely to get a meaningful response if you sold the article a little bit more. E.g. why would we want to read it? Does it seem particularly good to you? Does it draw a specific interesting conclusion that you particularly want to fact-check?

3jamal
It's a long read, but you can skim it. Nicholas Wade is a serious science writer and he has smoking gun evidence. EcoHealth was getting US grants, subcontracted out to Dr Shi at the Wuhan Institute of Virology to insert spike proteins into bat viruses to see what makes them more infectious to humans. Also, here's another more detailed EcoHealth proposal to DARPA, that discusses Gain of Function Research, making bat viruses more infectious to humans, and proposes subcontracting work out to Dr Shi at the Wuhan Institute of Virology.  The DARPA proposal was rejected, but EcoHealth just got similar proposals funded through NIAID. This is really smoking gun evidence. They did it. https://drasticresearch.files.wordpress.com/2021/09/main-document-preempt-volume-1-no-ess-hr00118s0017-ecohealth-alliance.pdf  

I really loved the thorough writeup and working of examples. Thanks!

I would say I found the conclusion section the least generally useful, but I can see how it is the most personal (that's kinda why it has a YMMV feel to it for me).

3Alex Flint
Thank you for the kind words and feedback. I wonder if the last section could be viewed as a post-garbling of the prior sections...

Reverse osmosis filters will already be more common in some places that have harder water (and that decided softening it at the municipal level wouldn't be cost-effective). If there were fine-grained data available about water hardness and obesity levels, that might provide at least a little signal.

There's a more elaborate walkthrough of the last argument at https://web.stanford.edu/~peastman/statmech/thermodynamics.html#the-second-law-of-thermodynamics

It's part of a statistical mechanics textbook, so a couple of words of jargon may not make sense, but this section is highly readable even without those definitions. To me it's been the most satisfying resolution to this question.

1TAG
There are at least two questions here.

Nice video reviewing this paper at https://youtu.be/-buULmf7dec

In my experience it's reasonably easy to listen to such videos while doing chores etc.

The problem definition talks about clusters in the space of books, but to me it’s cleaner to look at regions of token-space, and token-sequences as trajectories through that space.

GPT is a generative model, so it can provide a probability distribution over the next token given some previous tokens. I assume that the basic model of a cluster can also provide a probability distribution over the next token.

With these two distribution generators in hand, you could generate books by multiplying the two distributions when generating each new token. This will bia... (read more)
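A minimal sketch of the multiply-and-renormalize step described above, assuming hypothetical callables gpt_next_token_probs and cluster_next_token_probs that each map a token context to a probability vector over the vocabulary (neither is a real API; this is only to make the idea concrete):

```python
import numpy as np

def sample_next_token(context, gpt_next_token_probs, cluster_next_token_probs, rng):
    """Sample one token whose probability is proportional to the product of
    the GPT distribution and the cluster model's distribution."""
    p_gpt = gpt_next_token_probs(context)          # shape: (vocab_size,)
    p_cluster = cluster_next_token_probs(context)  # shape: (vocab_size,)
    combined = p_gpt * p_cluster                   # elementwise product biases generation toward the cluster
    combined /= combined.sum()                     # renormalize into a proper distribution
    return rng.choice(len(combined), p=combined)

def generate(prompt_tokens, gpt_next_token_probs, cluster_next_token_probs, n_tokens, seed=0):
    """Generation loop: repeatedly sample from the product distribution and append."""
    rng = np.random.default_rng(seed)
    tokens = list(prompt_tokens)
    for _ in range(n_tokens):
        tokens.append(sample_next_token(tokens, gpt_next_token_probs,
                                        cluster_next_token_probs, rng))
    return tokens
```

In practice the raw product would probably need temperature or smoothing, so that one sharp distribution doesn't zero out tokens the other favors.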

2johnswentworth
If anyone tries this, I'd be interested to hear about the results. I'd be surprised if something that simple worked reliably, and it would likely update my thinking on the topic.

Ok, I misread one of gwern's replies. My original intent was to extract money from the fact that gwern gave (from my vantage point) too high a probability of this being a scam.

Under my original version of the terms, if his P(scam) was .1:

  • he would expect to get $1000 .1 of the time
  • he would expect to lose $100 .9 of the time
  • yielding an expected value of $10

Under my original version of the terms, if his P(scam) was .05:

  • he would expect to get $1000 .05 of the time
  • he would expect to lose $100 .95 of the time
  • yielding an expected value of -$45
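(Both lists compress into a single expected-value formula for gwern's side of the bet, writing $p$ for his P(scam):)

$$\mathbb{E}[\text{payoff}] = p \cdot \$1000 - (1-p) \cdot \$100, \qquad p = 0.1 \Rightarrow +\$10, \qquad p = 0.05 \Rightarrow -\$45$$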

In the s... (read more)

2gwern
Alright then, I accept. The wager is thus:

* On 1 January 2013, if CI confirms that she is really dying and has or is in the process of signing up with membership & life insurance, then I will pay you $52; if they confirm the opposite, confirm nothing, or she has made no progress, you will pay me $1000.
* In case of a dispute, another LWer can adjudicate; I nominate Eliezer, Carl Shulman, Luke, or Yvain (I haven't asked but I doubt all would decline).
* For me, paying with Paypal is most convenient, but if it isn't for you we can arrange something else (perhaps you'd prefer I pay the $52 to a third party or charity). I can accept Paypal or Bitcoin.

Well I still accept, since now it's a much better deal for me!

3philh
Um, the way I'm reading this it looks like gwern is taking the position you were originally trying to take?

Done. $100 from you vs $1000 from me. If you lose, you donate it to her fund. If I lose, I can send you the money or do with it what you wish.

6gwern
Wait, I'm not sure we're understanding each other. I thought I was putting up $100, and you'd put up $10; if she turned out to be a scam (as judged by CI), I lose the $100 to you - while if she turned out to be genuine (as judged by CI), you would lose the $10 to me.