All of Sandi's Comments + Replies

Transformers Represent Belief State Geometry in their Residual Stream

Yep, that's what I was trying to describe as well. Thanks!

Transformers Represent Belief State Geometry in their Residual Stream

We do this by performing standard linear regression from the residual stream activations (64 dimensional vectors) to the belief distributions (3 dimensional vectors) which are associated with them in the MSP.

I don't understand how we go from this to the fractal. The linear probe gives us a single 2D point for every forward pass of the transformer, correct? How do we get the picture with many points in it? Is it by sampling from the transformer while reading the probe after every token and then putting all the points from that on one graph?

Is this result equiva...

3Adam Shai1y

I should have explained this better in my post. For every input into the transformer (of every length up to the context window length), we know the ground truth belief state that comp mech says an observer should have over the HMM states. In this case, this is 3 numbers. So for each input we have a 3D ground truth vector. Also, for each input we have the residual stream activation (in this case a 64D vector). To find the projection we just use standard Linear Regression (as implemented in sklearn) between the 64D residual stream vectors and the 3D (really 2D) ground truth vectors. Does that make sense?
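A minimal sketch of the regression step described above, on synthetic data rather than real transformer activations. The belief states, the hidden embedding, the noise level and the simplex corner coordinates are all assumptions made up for illustration; only the 64D-to-3D shapes and the use of sklearn's `LinearRegression` come from the comment. Each input contributes one row, so plotting every row of `points_2d` is what would give a picture with many points in it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_inputs, d_resid, n_states = 2000, 64, 3

# Assumption: synthetic stand-in for the ground-truth belief states.
# Each row is a probability distribution over the 3 HMM states.
beliefs = rng.dirichlet(np.ones(n_states), size=n_inputs)

# Assumption: synthetic stand-in for residual stream activations,
# built as a hidden linear embedding of the beliefs plus small noise.
embed = rng.normal(size=(n_states, d_resid))
activations = beliefs @ embed + 0.01 * rng.normal(size=(n_inputs, d_resid))

# Ordinary least squares from 64D activations to 3D belief vectors.
probe = LinearRegression().fit(activations, beliefs)
predicted = probe.predict(activations)   # one 3D point per input
r2 = probe.score(activations, beliefs)   # fit quality on this toy data

# The 3 coordinates sum to 1, so they can be drawn in 2D by mapping
# each state to a triangle corner (an arbitrary choice of corners).
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
points_2d = predicted @ corners          # one 2D point per input
```

With real data, `activations` would hold the residual stream vector recorded for each input prefix and `beliefs` the matching ground-truth belief state; a scatter plot of `points_2d` is then the projected picture.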

larger language models may disappoint you [or, an eternally unfinished draft]

Sandi3y10

Very comprehensive, thank you!

larger language models may disappoint you [or, an eternally unfinished draft]

Sandi3y10

Epistemic status: I'm not familiar with the technical details of how LMs work, so this is more word association.

You can glide along almost thinking "a human wrote this," but soon enough, you'll hit a point where the model gives away the whole game. Not just something weird (humans can be weird) but something alien, inherently unfitted to the context, something no one ever would write, even to be weird on purpose.

What if the missing ingredient is a better sampling method, as in this paper? To my eye, the completions they show don't seem hugely better....

5nostalgebraist3y

I've tried the method from that paper (typical sampling), and I wasn't hugely impressed with it. In fact, it was worse than my usual sampler to a sufficient extent that users noticed the difference, and I switched back after a few days. See this post and these tweets. (My usual sampler is one I came up with myself, called Breakruns. It works the best in practice of any I've tried.) I'm also not sure I really buy the argument behind typical sampling. It seems to conflate "there are a lot of different ways the text could go from here" with "the text is about to get weird." In practice, I noticed it would tend to do the latter at points where the former was true, like the start of a sample or of a new paragraph or section. Deciding how you sample is really important for avoiding the repetition trap, but I haven't seen sampling tweaks yield meaningful gains outside of that area.

Quick Thoughts on A.I. Governance

Sandi3y30

How many of the decision makers in the companies mentioned care about or even understand the control problem? My impression was: not many.

Coordination is hard even when you share the same goals, but we don't have that luxury here.

An OpenAI team is getting ready to train a new model, but they're worried about its self-improvement capabilities getting out of hand. Luckily, they can consult MIRI's 2025 Reflexivity Standards when reviewing their codebase, and get 3rd-party auditing done by The Actually Pretty Good Auditing Group (founded 2023).

Current OpenAI ...

2Nicholas / Heather Kross3y

At one point (working off memory here), Sam Altman (leader of OpenAI) didn't quite agree with the orthogonality thesis. After some discussion and emailing with someone on the Eleuther discord (iirc), he shifted to agree with it more fully. I think. This ties into my overall point of "some of this might be adversarial, but first let's see if it's just straight-up neglected along some vector we haven't looked much at yet".

Humans pretending to be robots pretending to be human

Sandi3y110

TL;DR: Thought this post was grossly misleading. Then I saw that the GPT3 playground/API changed quite a lot recently in notable and perhaps worrying ways. This post is closer to the truth than I thought but I still consider it misleading.

Initially strongly downvoted since the LW post implies (to me) that humans provide some of the GPT3 completions in order to fool users into thinking it's smarter than it is. Was that interpretation of your post more in the eye of the beholder?

Nested three layers deep is one of two pieces of actual evidence:

InstructGPT is

...

2[comment deleted]3y

We got what's needed for COVID-19 vaccination completely wrong

Sandi4y10

The Kefauver-Harris Drug Amendments of 1962 coincide with a drop in the rate of life-span increase.

I believe that, but I couldn't find a source. Do you remember where you got it from?

3ChristianKl4y

https://www.statista.com/statistics/1040079/life-expectancy-united-states-all-time/ is the chart for the US lifespan. For the timeframe from 1880 to 1960 it looks like a straight line with the exception of the First World War / Spanish flu. Our medical system is now so broken that lifespan dropped from 2015 to 2020.

Inaccessible finely tuned RNG in humans?

Sandi5y20

I wonder if, in that case, your brain picks the stopping time, stopping point or "flick" strength using the same RNG source that is used when people just do it by feeling.

What if you tried a 50-50 slider on Aaronson's oracle, if it's not too exhausting to do it many times in a row? Or write down a sequence here and we can do randomness tests on it. Though I did see some tiny studies indicating that people can improve at generating random sequences.
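One of the randomness tests mentioned above could be a Wald-Wolfowitz runs test, which checks whether a 0/1 sequence switches between symbols too often or too rarely. This is only a sketch: the function name and the two sample sequences are made up for illustration, and a real analysis would want more than one test and a longer sequence.

```python
import math

def runs_test_z(bits):
    """Return the Wald-Wolfowitz runs test z-score for a 0/1 sequence.

    A z-score far above 0 means the sequence alternates more often than a
    random one would; far below 0 means it is clumped into long runs.
    """
    n1 = sum(bits)
    n0 = len(bits) - n1
    if n0 == 0 or n1 == 0:
        raise ValueError("sequence must contain both 0s and 1s")
    n = n0 + n1
    # A new run starts at every position where the symbol changes.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    expected = 2 * n0 * n1 / n + 1
    variance = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n * n * (n - 1))
    return (runs - expected) / math.sqrt(variance)

# Hypothetical inputs: a perfectly alternating and a fully clumped sequence.
alternating = [0, 1] * 20
clumped = [0] * 20 + [1] * 20
print(runs_test_z(alternating))  # strongly positive
print(runs_test_z(clumped))      # strongly negative
```

Under the usual normal approximation, a |z| above roughly 2 would be weak evidence that the sequence is not random.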

Inaccessible finely tuned RNG in humans?

Sandi5y10

Hm, could we tell apart yours and Zack's theories by asking a fixed group of people for a sequence of random numbers over a long period of time, with enough delay between each query for them to forget?

Inaccessible finely tuned RNG in humans?

Sandi5y20

I seriously doubt the majority of the participants in these casual polls are doing anything like that.

Inaccessible finely tuned RNG in humans?

Sandi5y10

This occurred to me, but I didn't see how it could work with different ratios. I guess if you have a sample from a variable with a big support (> 100 events) that's uniformly distributed, that would work (e.g. if x is your birth date in days, then x/365 < 20 would work).
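A small sketch of that idea, assuming the threshold is read as a fraction (so a 20% target means comparing x/365 against 0.2). The function name and the 0-based day index are hypothetical choices for illustration, and leap days are ignored.

```python
def poll_answer(day_index, p):
    """Answer "yes" with probability about p, using a birthday as the random source.

    day_index is the respondent's birthday as a 0-based day of the year
    (0..364), assumed to be uniformly distributed across respondents.
    """
    return day_index / 365 < p

# Count how many of the 365 possible birthdays answer "yes" at a 20% target.
yes_days = sum(poll_answer(d, 0.2) for d in range(365))
print(yes_days, yes_days / 365)
```

Any ratio works the same way as long as the underlying variable has many roughly equally likely values, which matches the "big support" condition above.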

It would be interesting to test this with a very large sample where you know a lot of information about the respondents and then trying to predict their choice.

Inaccessible finely tuned RNG in humans?

Sandi5y10

Well, I'm quite satisfied with that. Thank you!

Rationality for Kids?

Sandi5y10

Here's an Android game that works like Zendo but has colorful caterpillars, might be great for kids: https://play.google.com/store/apps/details?id=org.gromozeka1980.caterpillar_logic

1Alex Vermillion4y

I'm 2-3 times the age of the students involved and I love that game enough to recognize it from the name in the link. I highly recommend using this or even a version with beads to teach kids. Explanation on the beads comment: I imagine a game where you allow kids to put beads on a string and each strand is finished with either a gold or silver bead depending on whether it passes or fails. Tie them off and let them arrange them however they need to see the patterns. I would play this bead game. Further exploration: Try the bead game with Lego, which seems obvious in retrospect but might have flaws I don't see.

3ChristianKl5y

When teaching it to a group of children I would likely not do it via an app but with physical items and drawings. It makes sense to switch the kind of items with which you play it between sessions to increase the generalizability of the learning.

Open thread, July 31 - August 6, 2017

Sandi8y20

What would be the physical/neurological mechanism powering ego depletion, assuming it existed? What stops us from doing hard mental work all the time? Is it even imaginable to, say, study every waking hour for a long period of time, without ever having an evening of youtube videos to relax? I'm not asking what the psychology of willpower is, but rather if there's a neurology of willpower?

And beyond ego depletion, there's a very popular model of willpower where the brain is seen as a battery, used up when hard work is being done and charged when relaxing. I...

0gjm8y

I have a hazy memory that there's some discussion of exactly this in Keith Stanovich's book "What intelligence tests miss". Unfortunately, my memory is hazy enough that I don't trust it to say accurately (or even semi-accurately) what he said about it :-). So this is useful only to the following extent: if Sandi, or someone else interested in Sandi's question, has a copy of Stanovich's book or was considering reading it anyway, then it might be worth a look.

1IlyaShpitser8y

Imo: legislative gridlock of the congress inside your head (e.g. a software issue). Unclear if a problem or not.

2ChristianKl8y

I don't think either of those explanations is true but writing out my alternative theory and doing it full justice is a longer project. I think part of the problem is that "hard mental work" is a category that's very far from a meaningful category on the physical/neurological level. Bad ontology leads to bad problem modeling and understanding.

1Lumifer8y

The question about willpower depletion is different from the question about mental fatigue and you tend to conflate the two. Which one do you mean?

Open thread, May 8 - May 14, 2017

Sandi8y00

What does TapLog lack, besides a reminder feature? It seems pretty nifty from the few screenshots I just saw.

0HungryHippo8y

TapLog is very nifty, it's simply that it would be even better with a somewhat extended feature set. Here's one use case: I want to log my skin picking and skin care routine (morning/evening). The first is easy. I just add a button to my home screen that increments by one every time I click it (which is every time I touch my face with my fingers). After a while I can plot number of picks each day, or month, or cumulative, etc. It's very nice. Logging my skin care routine is more difficult, since TapLog does not support lists. (Only quantity, and/or text-input [with an optional prompt], and/or gps position, for a single entry) What I would like is for TapLog to let me predefine a list of items (shave, cleanse, moisturizer) then give me a push notification in the morning and/or evening requesting me to check off each item. (If you use something like Wunderlist with a daily repeat of the list, it is very fragile. If you miss a couple of days you have to reset the date for the reminder, because there's no way for unfinished lists to simply disappear unless you actually check them off. And in Wunderlist there's no way to analyze your list data to see how well you did last month, etc.)

0ChristianKl8y

TapLog is designed for entering one piece of data at a time. If you have a checklist with 10 items and on average 5 are "yes" you have to do 10 clicks. Basically "click 1 yes" "back" "click 2 yes" "back" "click 3 yes" "back" "click 1 yes" "back" "click 1 yes" "back" and "click 5 yes" "back". If you have a Google form it only takes half as many clicks. Besides pure click counting it's also nice to see the checklist of 10 items together before clicking send to make sure that everything is right.

Open thread, May 8 - May 14, 2017

Sandi8y00

Yeah, that's why I kept comparing it to a spreadsheet. Ease of use is a big point. I don't want to write SQL queries on my phone.

0Lumifer8y

The point is, this kind of problem is the wheel that every starting coder feels the need to reinvent. How much innovation is there in linking an on-screen UI element like a button with a predefined SQL query? (eh, don't answer that, I'm sure there is a patent for it :-/) Sure, you may want a personalized app that is set up just right for you, but in that case just pick the right framework and assemble your app out of Lego blocks. You don't need to build your own plastic injection moulding machinery.

Open thread, May 8 - May 14, 2017

Sandi8y00

Thanks! I didn't know this was such a developed concept already and that there are so many people trying to measure stuff about themselves. Pretty cool. I'll check out Quantified Self and what's linked.

0ChristianKl8y

I just went through my app list and found https://play.google.com/store/apps/details?id=org.odk.collect.android&hl=en

0ChristianKl8y

The Quantified Self facebook group is https://www.facebook.com/groups/quantifiedself/ . It might be another good place to get an answer to your question.

Introducing the Instrumental Rationality Sequence

Sandi8y10

That is indeed very low weight. My prior is pretty shaky as-is, but that evidence shouldn't move it much.

I thought about priming a lot while reading. Many of the results he lists are similar to priming, but priming being false doesn't mean all results similar to it are false. One could consider a broader hypothesis encompassing all that, namely "humans can be influenced by subtle clues to their subconsciousness to a significant degree". That's the similarity I see with priming: both it and many of Cialdini's hypotheses follow from this premise. Th...

Open thread, May 8 - May 14, 2017

Sandi8y90

I have a neat idea for a smartphone app, but I would like to know if something similar exists before trying to create it.

It would be used to measure various things in one's life without having to fiddle with spreadsheets. You could create documents of different types, each type measuring something different. Data would be added via simple interfaces that fill in most of the necessary information. Reminders based on time, location and other factors could be set up to prompt for data entry. The gathered data would then be displayed using various graphs and c...

0HungryHippo8y

Your post reads as if you read my mind. :) I currently use a mix between TapLog (for Android) and google forms (with an icon on my home screen so that it mimics a locally installed app). Neither feels as if they really solve my needs, though. E.g. both lack a reminder feature.

0whpearson8y

You could probably cobble something together with google forms. https://docs.google.com/forms/u/0/ This can record the time and date of when the form was filled in. And it can go into a spreadsheet for easy analysis

3ChristianKl8y

I used to be very active in the Quantified Self community, and I still follow the Facebook group. As far as I know there's no app that does a good job at this task. For background research you might check out: https://gyrosco.pe/ http://www.inputbox.co/#/start http://www.reporter-app.com/ http://brainaid.com/

2Viliam8y

Gleeo Time Tracker lets you define categories, and then use one click to start or stop tracking the category. You can edit the records and include more specific descriptions in them. You can export all data to spreadsheet. I use it to track my daily time, on very general level -- how much I sleep, meditate, etc. (Note: When you start integrating with other apps, there are almost unlimited options. You may want to make some kind of plugin system, write a few plugins yourself, and let other users write their own. Otherwise people will bother you endlessly to integrate with their favorite app.)

0Thomas8y

Not radical enough! You do not need to see a seagull, it is better to hear a seagull. Especially because your vision angle is not all around and behind the tree. Your hearing is. And when you hear a seagull, your phone hears it too, and there is no need for "a popup insert record into the database Android/Apple gesture widget" shit. It can be done automatically every time! You don't want me to continue, do you?

2Lumifer8y

Looks like a database with some input forms.

Introducing the Instrumental Rationality Sequence

Sandi8y00

Cialdini? I'm finishing "Influence" right now. I was extra skeptical during reading it since I'm freshly acquainted with the replication crisis, but googling each citation and reading through the paper is way too much work. He supports many of his claims with multiple studies and real-life anecdotes (for all that's worth). Could you point me to the criticism of Cialdini you have read?

0[anonymous]8y

Cialdini is based off a comment I think I saw by Scott Alexander along the lines of "everything in Cialdini now seems to be bunk". This is low confidence and I'm happy to revise in light of new info. My priors on Cialdini are mainly based on how priming, which seems similar to many of his claims, doesn't replicate well.

Open thread, Apr. 17 - Apr. 23, 2017

Sandi8y20

The SSC article about omega-6 surplus causing criminality brought to my attention the physiological aspect of mental health, and health in general. Up until now, I prioritized mind over body. I've been ignoring the whole "eat well" thing because 1) it's hard, 2) I didn't know how important it was and 3) there's a LOT of bullshit literature. But since I want to live a long life and I don't want my stomach screwing with my head, the reasonable thing to do would be to read up. I need book (or any other format, really) recommendations on nutrition 1...

2morganism8y

You Can’t Trust What You Read About Nutrition http://fivethirtyeight.com/features/you-cant-trust-what-you-read-about-nutrition/ "Some populations today thrive on very few vegetables, while others subsist almost entirely on plant foods. The takeaway, Archer said, is that our bodies are adaptable and pretty good at telling us what we need, if we can learn to listen."

0Viliam8y

How Not to Die, and the videos at https://nutritionfacts.org/

0Lumifer8y

Nutrition is pretty messy. I'd recommend self-experimentation (people are different), but if you want a book, something like Perfect Health Diet wouldn't be a bad start. It sounds a bit clickbaity, but it's a solid book.

0Benquo8y

Holden's Powersmoothie page is a decent short review, not comprehensive, not very detailed.

0MrCogmor8y

Nutrition is taught in colleges so that people can qualify as accredited dieticians. You should be able to find a decent undergrad textbook on Amazon. If you get it used and an edition behind the current one, it should be cheap as well. https://www.amazon.com/Nutrition-for-Foodservice-and-Culinary-Professionals/dp/0470052422/ref=cm_cr_dp_d_rvw_txt?ie=UTF8

Open thread, March 13 - March 19, 2017

Sandi8y00

I have two straight-forward empirical questions for which I was unable to find a definitive answer.

1) Does ego depletion exist? There was a recent meta-study that found a negligible effect, but the result is disputed.

2) Does visualizing the positive outcome of an endeavor help one achieve it? There are many popular articles confirming this, but I've found no studies in either direction. My prediction is no, it doesn't, since the mind would feel like it already reached the goal after visualizing it, so no action would be taken. It has been like this in my personal experience, although inferring from personal experience is incredibly unreliable.

0Sjcs8y

As a bit of a tangent to 2) Certainly using visualisation as practice has some evidence (especially high-fidelity visualisation increasing performance at comparable rates to actual practice; one course I've been to advocated for the PETLEPP model in the context of medical procedures/simulation) - in this sense it may help achieving an endeavor but 1. It's got nothing (much) to do with positive visualisation and 2. It feels like it's moving the goal-posts by interpreting the 'endeavor' as 'performing better'. I've definitely also heard people discussing positive and negative visualisation as tools for emotional stabilisation and motivation - although the more persuasive (read: not sounding like new age/low brow self help BS) usually favour using both together or just negative visualisation - see gjm's and Unnamed's posts

0MrMind8y

1) We still don't know yet. If we are not observing some statistical noise, then it's possible that it's either bimodal (some have it, some don't) or it has a very weak effect. 2) Visualizing only the positive outcome, as far as I know, doesn't work. There's an interesting book about it: Rethinking positive thinking, by G. Oettingen. I've only skimmed it though, and I don't know how sound the citations are.

0Elo8y

1. we don't know either way. It seems that believing it exists causes your ego to be depleted though. 2. it probably relates to the original context in which you do the visualisation. You have given one example of a context where conflicting results might come out, there are several similar situations, so it's hard to know. I would feel safe saying that it seems to work some of the time for some people.

5gjm8y

On #2, I've seen it claimed -- but have no idea how good the science behind it is -- that better than visualizing positive or negative outcomes alone is doing both and paying attention to the contrast. "If I do X, then the result will look like Y. If I don't do X, the result will look like Z. Wow, Y is much better than Z: better get on with doing X".

Welcome to Less Wrong! (11th thread, January 2017) (Thread B)

Sandi8y10

Depending on where you are in your life and education, you could consider enrolling in graduate school.

If I've managed to translate "graduate school" to our educational system correctly, then I currently am in undergraduate school. Our mileages vary by quite a bit, most people I meet aren't of the caliber. Also, it's hard to find out if they are. Social etiquette prevents me from bringing up the heavy hitting topics except on rare occasions.

I guess I should work on my social skills then cast a bigger net. The larger the sample, the better od...

0g_pepper8y

In that case, you could look for clubs and organizations to join at your university. If you are in engineering or natural sciences, there will probably be a professional/academic organization for your sub discipline you could join (e.g. IEEE for electrical engineers, ACS for chemistry majors, ACM for computer science, etc.) I would imagine that mathematics and liberal arts have similar organizations as well. And, attend the meetings and functions. You could also look for other organizations on campus such as political organizations, cultural organizations, a cinema society (if you are a film enthusiast), etc. No guarantees that these will lead to intellectual conversations, but the people who join and participate in these types of organizations tend to be (on average) more intellectual than those who do not. And, as Grothor suggested, look for nearby LessWrong meetups (if any).

Welcome to Less Wrong! (11th thread, January 2017) (Thread B)

Sandi8y00

I'm not 100% clear as to where the non-ambitious posts should go, so I will write my question here.

Do you know of a practical way of finding intellectual friends, so as to have challenging/interesting conversations more often? Not only is the social aspect of friendship in general invaluable (of course I wouldn't be asking here if that was the sole reason), but I assume talking about the topics I care and think about will force me to flesh them out and keep me closer to Truth, and is a great source of novelty. So, from a purely practical standpoint (althou...

0Richard Korzekwa 8y

(Also, the place to ask this sort of question might be the current Open Thread: http://lesswrong.com/r/discussion/lw/ol5/open_thread_feb_06_feb_12_2017/)

1g_pepper8y

Depending on where you are in your life and education, you could consider enrolling in graduate school. I found that I tended to have intellectual conversations with my fellow students and professors in graduate school. Plus you will have at least one common interest with your fellow students - whatever subject you are studying in school. Grad school is too big of a commitment just to find intellectual friends. But, if you have an interest in grad school to advance your education or career, then meeting intellectual friends is an added benefit. Finally, even if you are working and do not wish to go back to school full time, many universities offer a master's program that you can enroll in on a part-time basis. As a part-time student you will have less contact with your fellow students and therefore fewer chances to make friends, etc., but this can be overcome with a little effort to socialize, attend events, host small dinner parties, etc. I do this too. I don't think that it is abnormal - I agree with you that it can be a useful way to think through issues. I once worked with a more senior engineer who was also a personal friend and mentor. But, his job was demanding and he was always quite busy. So, when I needed his help to solve some problem, I would think about what sorts of questions he would ask, so that I could be prepared to answer them - basically, I would play out the (probable) conversation in my head ahead of time to avoid wasting his time. More often than not, this process would yield the answer to the problem, and I would end up not having to bother him at all.