Thanks. I think I get it now. (At least one of) my confusions was mixing up a "transformer run" with the "number of FLOPS".
And I get the thing about cost, that's what I meant but I articulated it poorly.
Got it, thanks!
But to process the 1001st input token, you also need to load all the 1000 tokens in memory, forming the cache (it does happen in one step though). And for each new output token, you surely don't dump all the existing KV cache after each generation, only to load it again to append the extra KV vectors for the last generated token. So isn't the extra work for output tokens just that the KV cache is accessed, generated, and expanded one token at a time, and that's where the "more work" comes from?
Is there any reason why this would imply the ratio of pricing of output:input tokens being commonly something like 3:1?
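To make the question above concrete, here is a toy accounting sketch of the two phases. It is only my own simplification, not how any provider computes prices: the function names are made up, it counts only sequential model runs and attention token pairs under causal attention, and it ignores everything else (weights, batching, hardware). The commonly cited reason decode costs more is that each sequential step has to re-read the model weights and the whole cache to produce a single token, but the sketch does not model that.

```python
def prefill_cost(n_input):
    """Toy model: all input tokens are processed in one parallel pass."""
    sequential_steps = 1
    # Causal attention: token i attends to tokens 0..i, i.e. i + 1 pairs.
    attention_pairs = n_input * (n_input + 1) // 2
    return sequential_steps, attention_pairs


def decode_cost(n_input, n_output):
    """Toy model: one sequential pass per output token, reusing the KV cache."""
    sequential_steps = n_output
    attention_pairs = 0
    cache_len = n_input  # the cache already holds K/V for the whole input
    for _ in range(n_output):
        cache_len += 1  # append K/V for the newest token only
        attention_pairs += cache_len  # the newest token attends to the full cache
    return sequential_steps, attention_pairs


if __name__ == "__main__":
    print("prefill:", prefill_cost(1000))
    print("decode: ", decode_cost(1000, 100))
```

In this toy model the cache is never recomputed, only extended, which matches the comment above. The difference that stands out is the number of sequential steps (1 for the input versus one per output token), not the raw amount of attention arithmetic.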
Thanks for the answer, I appreciate it!
Intuitively, it seems that output tokens should be more expensive. The autoregressive model has to run once for each output token, and as these runs progress, output tokens gradually become a part of the input (so the last token is generated with context being all input and almost all output).
I agree with the intuition, but I think that's where I am confused. Thanks to the KV cache we do not run the new input sequence (previous sequence + last generated token) through the encoders (as we do for the input sequence...
Even though some commenters mentioned some issues with the article, I really appreciate the attempt and the author being upfront about the estimates. It's very relevant for the thing I am now trying to figure out. As I have almost no intuitions about this except about some raw FLOPS, it pointed to important flaws my analysis would have. There are not many public sources that explain this [that are not a book or don't require me to read one-to-many to understand it]
Yes, but to defend (hehe) OP, he seems to be fully aware of that and addresses that explicitly in the linked article (which is also excellent, like this one):
...In part because of those aforementioned stats on the frequency of guilty pleas, public defenders have garnered a reputation for being trial-averse, for pressuring clients to cop a plea just to keep the machine humming along. I think this reputation is ill-deserved. It’s completely counter to my own experience, at least, as few things are talked about with as much awed respect among one’s public-defe
Thanks for the feedback and the encouragement, I will incorporate these.
Btw. for questions 2-4 there is an intentional redundancy.
(slightly tangential) I think people are making a terrible bucket error with competency: they overestimate how competent others are across all dimensions. I.e. it's often enough for a person to show great competency in providing vision, and we assume that person must also be great at leadership or management, and people are shocked when that's not the case. Other examples:
Hey. I decided on a private school with more of a "democratic approach". I unfortunately wasn't able to find suitable tutors etc.
I am also trying to process what ChatGPT-like platforms will do with the landscape. E.g. my partner is coding almost exclusively with ChatGPT and it's outstanding. Kids gonna follow IMHO.
Thanks Duncan, I really appreciate you posting this, even though you are unsure about how exactly it all fits together. I am still glad to read it in this version, likely because you are quite clear about it, and not "leaving it as an exercise for the reader" to figure out where things do fit together and where they don't (or worse, trying to make it more profound).
All of these might be stating the obvious to some of you, but I am trying to clarify my thoughts and maybe some people will find it useful or correct me. At least part of this relates to (by me endo...
Oh, I really enjoyed reading this, this is such a LW-rationality-curiosity-boggle-at-things post. Thanks!
Thanks, that's helpful.
Also, kudos to Lily for knowing active listening and being awesome.
I am curious about how you introduced money to your kids. Do you have some "framework" for that? I did a bit of research and didn't end up with any really novel ideas (I am happy to share my findings and conclusions, but it's a fairly small page in roam).
Basically, what I want to do with my daughter:
This is an excellent post. I have been doing bets with my 4-year-old daughter already as well (and I have been following your projects for quite some time already)!
Yeah, that's useful. Agree on the assessment, I want to give it a shot with one of those Bridgelux Vesta Thrive things, it sounds like a good hobby project I would like to try. If that happens, I would do a post about it here.
By the way, I asked about this setup on reddit. They also recommend some custom COBs, which seems to be the most powerful solution, but isn't as practical as strips.
These look very promising, and they ship to Europe too. Extra high-CRI, very powerful (up to 2600 lumens/m) and even dimmable? Wow. A bit unfortunate they are 5x as expensive as other high-CRI high-power LED strips.
I am glad there are more posts on this. Are there any reasons for not considering LED strips at all? When installed properly with the "milk" diffuser and as indirect lighting, it's IMO quite nice and effective. They seem to be powerful enough (20W are about 2k lumens/m), can also be found in high-CRI variants, are less expensive, come in various CCTs, are dimmable, etc. I am considering using them in a new house. Basically, multiple parallel LED strips (with different CCTs, like 3000, 4500, 6400) diffused against a wall/ceiling, controlled via smart relays and incorporated...
I am so glad this question is here, as it's very relevant to my post a few weeks back about Effective Children Education.
By the way, I recommend following Duncan Sabien (referenced in the post below) on Facebook, he has good posts about children edu, e.g. his speech for sixth-graders (referenced by someone else here - but she picked the good parts).
As mentioned below, Julia Galef also sometimes mentions something related, but I haven't found much
Hi! This is an excellent answer, thanks.
[...] I believe your questions relate to all three ways to different extents (although the title of the post leans towards the HOW types of questions), but I found it useful to differentiate between these issues in order to make sure my time, money and efforts are well spent.
Needless to say, I have updated significantly in the past month since I posted this question, and the "Why" and "What" have definitely expanded. I agree with you that it's a useful framework to have. I am also thankful for the practical bits.
...I
The contraception didn't work and it was too late for abortion (had we chosen it as an option)
It's great to see some support like this. Not just to help with motivation but also to see what the interest is.
I think there is a massive opportunity in creating a k-12 home schooling version of Lambda School but targeted at general knowledge. Why not start by working together on it?
I am interested, as is probably clear from my question, but I don't think I would be a good fit to actually put it together (or that it would be cost-effective). I would be happy to set up and run some structure which could do this. I should emphasize though that I am not try
...Hi. Thanks a lot for a really nice write-up.
It seems that the regulations in the Czech Republic are actually legally "workable", i.e. it's possible to teach kids close to self-directed without having to do a lot of "compulsory curriculum" (i.e. my estimate is <5%). It also seems there is a "subculture" of families doing this and I managed to get to some people who know how to deal with this.
My conclusion is that there is no simple answer.
I don't aim for a simple answer and I do not expect there is one. But as I said, the current system seems so b
...Thanks for the tip and links. Unfortunately, it doesn't show much in Prague (but still gives me a hint about what to look for even if some school isn't registered in the linked project).
The OP seems intent on designing/engineering the perfect education, when the answer from this perspective will require a lot of letting go.
Hm... I don't think I would have issues with having to do so. I am trying to understand how to think about this, and this simply didn't occur to me before. In fact, it seems that my girlfriend and I are currently rather on the side of trying to figure out how to do this in a self-directed way, but it's still in early stages.
Thanks Vil. I agree with Ericf's comment that you seem to try to take it more generically than I intended (i.e. I realize that I have resources 99% of the local population doesn't). That said, I fully agree with you on these points.
it takes about 1 hour to teach at home what they teach at school in 1 day
These are good datapoints, thanks.
And yeah, I would hope that with the internet and some good courses giving the kids a "library" of what they could learn, mixed with self-driven learning, they wouldn't need the full attention of a tutor.
Thanks for clarifying my questions.
The key point (I'm synthesizing this from How Children Learn and How Children Fail, by John Holt) [...]
It's very much similar to what @Raj mentioned above, am I right? Seems that Holt advocates for self-driven learning, e.g. from a goodreads review:
Holt believes that children learn best when they learn at their own pace and pursue their own interests--learning should never be forced or uniform, but spontaneous and dynamic. Children don't need to be "taught" -- they simply need to be given opportunities to LEARN
Thanks for t
...Hey Raj. Thanks a lot for an insightful post, it's definitely the sort of thing I was looking for, regardless of whether I immediately agree with it or not.
1-self learning: How I read it so far is that instead of selecting "the way" first and optimizing it later, it might be a good idea to focus on learning how to learn by yourself first, recognizing what's the most effective in any given case, be it via the internet or an actual human resource such as a tutor.
By the way, my main motivation for her to know English was the access to much better mat
...Thanks @ericf!
How neurotypical is your child?
She's a regular kid, so neurotypical. She goes to an English-speaking kindergarten (so she speaks two languages fluently + we are starting with Spanish) where, according to the teachers, she does above average with behavior and socializing, although she prefers playing with teachers and older kids. Thanks to the fact that one of us didn't have to work, we could spend a lot of time with her in the first 3 years and now she can read simple words and sentences (in English - how I hate its irregularities damn!) and do simple
...In a way that you never "go back" and edit the "immutable" previous writeups, right?
Sorry, a quick question: linear means something like:
while non-linear means
?
Yeah, seems that we use success system by default then. Thanks again!
Thanks, that's pretty interesting, it's good to get some inspiration from it and then replace "inappropriate" words by something from her vocabulary (like princesses, dogs, cats instead of murderer, undead, zombie, ...) :-D . We'll get there, eventually.
Thanks, that's stupid simple, love it. It seems that the little one likes cooperative storytelling a lot, but she doesn't understand the dice and the concept of opposing checks very well. I'm still hoping she picks it up, eventually...
Thanks! That's useful, didn't know about it.
Hello. Is it possible for the author to review this and possibly update it? It has already been 4 years. I wonder if anything has changed.
I replied in that thread.
And I think I wrote to you on Facebook. You should have my message in "others".
I was thinking about today - Tuesday. But that seems a bit too rushed for others to notice. Do you have some date which would suit you? For example, Wednesday the 11th next week?
I am so sorry about not showing up at the meeting - I got stuck on a train from the east for several hours. I should have at least posted it here when I knew that I couldn't make it. I am still really looking forward to meeting you guys.
What about meeting on November 3 (Tuesday)?
At this meetup there was a guy from Ostrava. We exchanged numbers and emails and I promised that I would keep in touch. Unfortunately, my mobile phone crashed and I had to reinstall it, losing my message history. I couldn't find him, since I do not remember his name. The only thing I remember is that he is doing his doctorate at VSB - FBI. Unfortunately, I couldn't find any name that would remind me of him.
So if you are reading this, contact me!
Can I know to who and where the money for the book goes?
From Amazon, 30% goes to Amazon and 70% goes to MIRI.
From e-junkie (the pay-what-you-want option): 100% goes to MIRI, minus PayPal transaction fees (a few %).
It's worth noting one more thing - I'm not really a skilled Bayesian and rationalist, but I do my best and I'm currently studying. So far I've finished HPMOR and An Abridged Introduction to Less Wrong, and now I'm working on the core sequences. There I've just finished Map and Territory.
For anyone interested, I've made ebook variants for myself (epub, mobi, PDF, odt). They are far from awesome, but at least readable on an e-book reader. https://www.dropbox.com/sh/6agp4otiukejb0g/AACO-5V1J8i0USBWUFL9nw74a
Hello.
I was searching more about my interests and I've found an opportunity with a nice Bachelor's topic in maths/informatics/neuroscience. I was offered two topics:
Both are connected with neuroscience (e.g. the correlation matrix is created from brain activity, the variables are the activities of different parts of the brain, etc.)
Does anyone have any information or advice on this?
Hi.
After some more research and digesting these answers (and some other sources), maybe this is just too heavy for me. But it is really interesting reading and thank you for that.
Thank you for the answer.
Could you point me to somewhere I could find what problems/directions you are talking about? Since I'm not such a brilliant mathematician, maybe I could contribute in these areas, which I find similarly interesting.
Thank you. I'm just going to go through the papers publishers. Great idea!
The "mainstream-friendly" stuff is maybe the middle path I'm looking for, since the response from Risto_Saarelma is pretty explanatory about my possibilities.
And I believe it would be possible to do a similar kind of Bachelor's thesis. That is not a problem. But, to be honest, I'd like to do some work which I find fulfilling, even in the tiniest amount. I'm doing a literature review in my free time.
Thanks for the answer.
I don't know how I could miss MIRI's course recommendation list. It looks great. Will definitely take a closer look at it.
The second part is a bit of a disappointment for me, since I'm not that kind of student. I'm in the stronger group of mathematicians at my university, but in that group I'm at or below average (they are among the best in my country).
Maybe I put too much weight on the maths part of AGI, which obviously isn't for me. And I'm also not sure about doing a PhD in it right now. Do I understand correctly that right now there are...
Hello,
I'd like to get some opinions about my future goals.
I'm 21 and I'm a second-year student of engineering in Prague, Czech Republic, focusing mainly on math and then physics.
My background is not stunning - I was born in 93, attended a sports-focused primary school and then a general high school. Until my second year of high school, I behaved like an idiot with below-average results in almost everything, paradoxically except for extraordinary "general study presupposes" (whatever it means). My not so bad IQ - according to IQ test I took when I was 15 ...
Interesting, thanks!