I disagree. You seem to think that the list of missing technologies sketched by Crawford is exhaustive, but it's not. One example that ties into your conclusions: paper. Maybe the Romans could have invented the printing press, I'm not sure, but printing on super-expensive vellum or papyrus is pointless.
And that's just one example. Here is another. The Romans spread and improved watermills, so they were interested in labor-saving technology, contra your argument. But their mills were not as good or as widespread as modern or even late medieval ones (mill technology was very important to the industrial revolution, as you mention too).
You could also try to fit an ML potential to some expensive method, but it's very easy to produce very wrong things if you don't know what you're doing (I wouldn't be able to, for one).
Ahh for MD I mostly used DFT with VASP or CP2K, but then I was not working on the same problems. For thorny issues (biggish and plain DFT fails, but no MD) I had good results using hybrid functionals and tuning the parameters to match some result of higher level methods. Did you try meta-GGAs like SCAN? Sometimes they are surprisingly decent where PBE fails catastrophically...
My job was doing quantum chemistry simulations for a few years, so I think I can comprehend the scale actually. I had access to one of the top-50 supercomputers and codes just do not scale to that number of processors for one simulation independently of system size (even if they had let me launch a job that big, which was not possible)
Isn't this a trivial consequence of LLMs operating on tokens as opposed to letters?
True, but this doesn't apply to the original reasoning in the post - he assumes constant probability while you need increasing probability (as with the balls) to make the math work.
Or decreasing benefits, which probably is the case in the real world.
Edit: misread the previous comment, see below
It seems very weird and unlikely to me that the system would go to the higher energy state 100% of the time
I think vibrational energy is neglected in the first paper; it would be implicitly accounted for in AIMD. Also, the higher energy state could be the lower free energy state - if the difference is big enough it could go there nearly 100% of the time.
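To make the free energy point concrete, here is a minimal sketch of a two-state system at equilibrium, assuming simple Boltzmann statistics with F = E - T*S. The function name and all the numbers are hypothetical, chosen only for illustration, and are not taken from either paper.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def population_of_state_b(delta_e_ev, delta_s_ev_per_k, temperature_k):
    """Equilibrium fraction found in state B of a two-state system A/B.

    delta_e_ev:       E_B - E_A (positive means B is higher in energy)
    delta_s_ev_per_k: S_B - S_A (e.g. a vibrational entropy difference)
    """
    # Free energy difference: F = E - T*S
    delta_f = delta_e_ev - temperature_k * delta_s_ev_per_k
    return 1.0 / (1.0 + math.exp(delta_f / (K_B_EV_PER_K * temperature_k)))


# Hypothetical case: B is 0.10 eV higher in energy but has more entropy.
for temperature in (100, 300, 600):
    fraction = population_of_state_b(0.10, 1.0e-3, temperature)
    print(f"T = {temperature} K: fraction in higher-energy state = {fraction:.4f}")
```

With an entropy difference this large, the higher energy state ends up almost fully populated at room temperature and above, while with zero entropy difference it stays nearly empty.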
Although they never take the whole supercomputer, so if you have the whole supercomputer for yourself and the calculations do not depend on each other you can run many in parallel
That's one simulation though. If you have to screen hundreds of candidate structures, and simulate every step of the process because you cannot run experiments, it becomes years of supercomputer time.
There are plenty of people on LessWrong who are overconfident in all their opinions (or maybe write as if they are, as a misguided rhetorical choice?). It is probably a selection effect of people who appreciate the sequences - whatever you think of his accuracy record, EY definitely writes as if he's always very confident in his conclusions.
Whatever the reason, (rhetorical) overconfidence is most often seen here as a venial sin, as long as you bring decently-reasoned arguments and are willing to change your mind in response to others'. Maybe it's not your...
(PhD in condensed matter simulation) I agree with everything you wrote where I know enough (for readers, I don't know anything about lead contacts and several other tricky experimental points, so my agreement should not be counted too much).
I just add on the simulation side (Q3): this is what you would expect to see in a room-T superconductor unless it relies on a completely new mechanism. But, this is something you see also in a lot of materials that superconduct at 20K or so. Even in some where the superconducting phase is completely suppressed by mage...
Is there something that would regularise the vectors towards constant norm? A helix would make a lot of sense in this case. Especially one with varying radius, like in some (not all) of the images
I don't think it would change your conclusion, but your kettle was not very scaly. Mine gets much worse than that, with the heating element entirely covered by a thick layer, despite descaling 3-4 times per year. It depends on the calcium content of your tap water. I still don't think it affects energy use (maybe?), but the taste can be noticeable and I feel tea is actually harder to digest if I put off the descaling.
Also, you can use citric acid instead of vinegar. Better for the environment, less damaging to the kettle and it doesn't smell :)
Well stated. I would go even further: the only short timeline scenario I can imagine involves some unholy combination of recursive LLM calls, hardcoded functions or non-LLM ML stuff, and API calls. There would probably be space to align such a thing. (sort of. If we start thinking about it in advance.)
Isn't that the point of the original transformer paper? I have not actually read it, just going by summaries read here and there.
If I don't misremember, RNNs should be especially difficult to train in parallel
That seems reasonable, but it will probably change a number of correct answers (to tricky questions) as well if asked whether it's certain. One should verify that the number of incorrect answers fixed is significantly larger than the number of errors introduced.
But it might be difficult to devise a set of equally difficult questions for which the first result is different. Maybe choose questions where different instances give different answers, and see if asking a double check changes the wrong answers but not the correct ones?
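The check I have in mind could be sketched like this. It counts answers fixed and answers broken by the double check, and adds a one-sided exact sign test on the changed answers as one possible way to judge "significantly larger" (that test is my assumption, not something from the parent comment). The function name and the toy data are hypothetical.

```python
from math import comb


def double_check_effect(first_answers, second_answers, correct_answers):
    """Compare answers before and after asking the model to double check."""
    fixed = broken = 0
    for first, second, truth in zip(first_answers, second_answers, correct_answers):
        if first != truth and second == truth:
            fixed += 1   # wrong answer turned into a correct one
        elif first == truth and second != truth:
            broken += 1  # correct answer turned into a wrong one
    changed = fixed + broken
    # One-sided exact sign test: chance of seeing at least `fixed` fixes
    # among `changed` flips if fixing and breaking were equally likely.
    p_value = sum(comb(changed, k) for k in range(fixed, changed + 1)) / 2 ** changed
    return {"fixed": fixed, "broken": broken, "p_value": p_value}


# Toy data, invented for illustration only.
first = ["a", "b", "c", "d", "e"]
second = ["a", "x", "z", "d", "q"]
truth = ["a", "x", "z", "d", "e"]
print(double_check_effect(first, second, truth))
```

Answers that stay wrong, or change from one wrong answer to another, are ignored, since they say nothing about whether the double check helps.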
Good post, thank you for it. Linking this will save me a lot of time when commenting...
However I think that the banking case is not a good application. When one bank fails, it makes much more likely that other banks will fail immediately after. So it is perfectly plausible that two banks are weak for unrelated reasons, and that when one fails this pushes the other under as well.
The second one does not even have to be that weak. The twentieth could be perfectly healthy and still fail in the panic (it's a full blown financial crisis at this point!)
It's not clear here, but if you read the linked post it's spelled out (the two are complementary really). The thesis is that it's easy to make a narrow AI that knows only about chess, but very hard to make an AGI that knows the world, can operate in a variety of situations, but only cares about chess in a consistent way.
I think this is correct at least with current AI paradigms, and it has both some reassuring and some depressing implications.
I always thought Hall's point about nanotech was trivially false. Nanotech research like he wanted it died out in the whole world, but he explains it by US-specific factors. Why didn't research continue elsewhere? Plus, other fields that got large funding in Europe or Japan are alive and thriving. How come?
That doesn't mean that a government program which sets up bad incentives cannot be worse than useless. It can be quite damaging, but not kill a technologically promising research field worldwide for twenty years.
The point about encouraging safe over innovative research is spot on though. Although the main culprits are not granting agencies but tying researcher careers to the number of peer reviewed papers imo. The main problem with the granting system is the amount of time wasted in writing grant applications.
That was quite different though (spoiler alert)
A benevolent conspiracy to hide a dangerous scientific discovery by lying about the state of the art and denying resources to anyone whose research might uncover the lie. Ultimately failing because apparently unrelated advances made rediscovering the true result too easy.
I always saw it as a reply to the idea that physicists could have hidden the possibility of an atomic bomb for more than a few years.
The example in the beginning is a perfect retelling of my interaction with transformers too :D
However, a word of caution: sometimes the efficient thing is actually to skim and move on. If you spend the effort to actually understand a topic which is difficult but limited in scope, but then you don't interact with it for a year or two, what you remember is just the high-level verbal summary (the same as if you stopped at the first step). For example, I have understood and forgotten MOSFET transistors at least three times in my life, and each time it was more or less the same effort. If I had to explain them now, I would retreat to a single shallow-level sentence.
They commented without reading the post I guess...
I think having an opinion on this requires much more technical knowledge than GPT4 or DALLE 3. I for one don't know what to expect. But I upvoted the post, because it's an interesting question.
I agree with you actually. My point is that in fact you are implicitly discounting EY's pessimism - for example, he didn't release a timeline but often said "my timeline is way shorter than that" with respect to 30-year ones and I think 20-year ones as well. The way I read him, he thinks we personally are going to die from AGI, and our grandkids will never be born, with 90+% probability, and that the only chances to avoid it are either someone having a plan already three years ago which has been implemented in secret and will come to fruition next ...
I like the idea! Just a minor issue with the premise:
"Either I’d find out he’s wrong, and there is no problem. Or he’s right, and I need to reevaluate my life priorities."
There is a wide range of opinions, and EY has one of the most pessimistic ones. It may be the case that he's wrong on several points, and we are way less doomed than he thinks, but that the problem is still there and a big one as well.
(In fact, if EY is correct we might as well ignore the problem, as we are doomed anyway. I know this is not what he thinks, but it's the conclusion I would draw from his predictions)
I think that you need to distinguish two different goals:
Addendum: if you want to bring legislation more in line with voters' preferences issue by issue, avoiding the distortion from coalition building, Swiss-style referenda seem to work to an acceptable degree http://www.lesswrong.com/posts/x6hpkYyzMG6Bf8T3W/swiss-political-system-more-than-you-ever-wanted-to-know-i
The biggest obstacle to your idea is, I think, the executive. In parliamentary systems the government answers to the parliament, and needs MPs' support to continue - indeed, the Israeli maneuvering that you cite is related to making the government collapse, not to political parties. So as a first thing, you need a presidential system. But even then, MPs would probably organize as for or against the president - I imagine that the president's role in drafting and proposing legislation would be even higher than in present day US, as the coordination of MPs via ...
You are correct (QM-based simulation of materials is what I do). The caveat is that exact simulations are so slow that they are impossible; that would not be the case with quantum computing, I think. Fortunately, we have different levels of approximation for different purposes that work quite well. And you can use QM results to fit faster atomistic potentials.
Note that there could still be some priors on some functions being more probable, or some more complex case being plainly impossible to fit because there's no way to get there from the meta-model that is the trained NN.
I am left wondering if when GPT3 does few-shot arithmetic, it is actually fitting a linear model on the examples to predict the next token. I.e. the GPT3 weights do not "know" arithmetic, but they know how to fit, and that's why they need a few examples before they can tell you the answer to 25+17: they need to know what function of 25 and 17 to return.
It is not that crazy given my understanding of what a transformer does, which is in some sense returning a function of the most recent input which depends on earlier inputs. Or am I confusing them with a different NN design?
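As a toy picture of what I mean by "fitting on the examples" (this is only an illustration of the hypothesis, not a claim about what GPT3 actually computes internally), here is a least-squares fit of c = w1*a + w2*b from a handful of examples. The function names and example triples are hypothetical.

```python
def fit_two_weights(examples):
    """Least-squares fit of c ~ w1*a + w2*b from (a, b, c) examples."""
    saa = sum(a * a for a, b, c in examples)
    sab = sum(a * b for a, b, c in examples)
    sbb = sum(b * b for a, b, c in examples)
    sac = sum(a * c for a, b, c in examples)
    sbc = sum(b * c for a, b, c in examples)
    # Solve the 2x2 normal equations with Cramer's rule.
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("the examples do not pin down the function")
    w1 = (sac * sbb - sbc * sab) / det
    w2 = (saa * sbc - sab * sac) / det
    return w1, w2


def predict(weights, a, b):
    w1, w2 = weights
    return w1 * a + w2 * b


# The same query (25, 17) gets a different answer depending on the "prompt".
addition_prompt = [(3, 4, 7), (10, 2, 12), (6, 9, 15)]
subtraction_prompt = [(9, 4, 5), (10, 2, 8), (6, 1, 5)]
print(predict(fit_two_weights(addition_prompt), 25, 17))
print(predict(fit_two_weights(subtraction_prompt), 25, 17))
```

The fitting routine itself knows nothing about addition or subtraction; which function it returns is determined entirely by the examples it is shown.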
Ahh sorry! Going back to read it, it was pretty clear from the text. I was tricked by the figure where the embedding is presented first. Again, good job! :)
Cool work!
Can I ask a couple of questions about the DR+clustering approach?
If I understand correctly, you do the clustering in a 2D space obtained with UMAP (ignore this if I am wrong). Are you sure you are not losing important information with such a low dimension? I say this because you show that one dimension is strongly correlated with style (academic vs forum/blog) and the second may be somewhat correlated with time. I remember that an argument exists for using n-1 dimensions when looking for n clusters, although that was probably using linear D...
I agree. In fact, you could say that Mélenchon and Le Pen are closer to each other on economic and possibly foreign policy, and very far from Macron. So not unreasonable that some votes would transfer from one to the other. Huge differences on everything else of course (immigration, but also law and order, education, culture, ...) I disagree on Hollande and generally center-left. Hollande had to juggle a very broad coalition as you say. He ended up hated by everyone because his way to handle it was not finding a middle ground, but campaigning as Mélenchon ...
I think if you look up antifragile investment you find a lot of discussion of exactly this problem. As far as I understand, the idea is that most investments have limited downsides (at most, you lose what you put in) but may have limitless upsides in low-probability scenarios. Then you can make many small investments of this kind, so that when one pays off, it's more than enough to pay you back for the loss of the rest. Taking your example of the nuclear bunker, if you could build one with 1% of your wealth or less, in this frame of mind probably you sho...
Interesting post! I like the picture you draw. But you should consider the possibility that it was not a Rome-unique factor, but the intersection of multiple things of which each one was true for multiple ancient states, but all of them only for Rome. In particular I have the impression that the subjects of the Persian empire were pretty happy with it and flourishing under its rule. To be clear, it was nothing like citizenship, because Persia was a kingdom and not a republican city-state. But between the investment model and the pillaging model that you ...
I like to think of it this way: the determinant is the product of the eigenvalues of a matrix, which you can conveniently compute without reducing the matrix to diagonal form. All interesting properties of the determinant are very easy (and often trivial!) to show for the product of the eigenvalues.
More in the spirit of your post, I don't remember how hard it is to show that the determinant is invariant under unitary transformation, but not too hard I think. It's not the only invariant of course (the trace is as well, I don't remember if there are others). But you could definitely start from the product of eigenvalues idea and make it invariant to get the formula for det.
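A quick numerical check of both statements for the 2x2 case, as a sketch. The helper names and the example matrix are hypothetical, and a rotation stands in for a general unitary transformation.

```python
import cmath
import math


def det2(m):
    """Determinant of a 2x2 matrix from the usual formula."""
    (a, b), (c, d) = m
    return a * d - b * c


def eigenvalues2(m):
    """Roots of the characteristic polynomial of a 2x2 matrix."""
    (a, b), (c, d) = m
    root = cmath.sqrt((a - d) ** 2 + 4 * b * c)
    return (a + d + root) / 2, (a + d - root) / 2


def matmul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rotate(m, theta):
    """Return R m R^T for the rotation matrix R(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s], [s, c]]
    r_t = [[c, s], [-s, c]]
    return matmul2(matmul2(r, m), r_t)


m = [[2.0, 1.0], [1.0, 3.0]]
l1, l2 = eigenvalues2(m)
print("det from formula:       ", det2(m))
print("product of eigenvalues: ", (l1 * l2).real)
print("det after rotation:     ", det2(rotate(m, 0.7)))
```

All three printed values should agree up to floating-point rounding. The trace (the sum of the eigenvalues) can be checked the same way.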
Interesting read, but I don't think the initial example and the following are very much connected. The shift of opinion about WW2 has presumably happened without fabricated evidence or misinformation about factual events. USSR and USA played very different roles in the defeat of Germany, so asking "which contributed the most" is sensitive to shifting narratives and highlighting of different events. Similar questions from more distant past: who was to blame for WW1? Was Napoleon spreading modernity and equality in Europe, or ruthlessly subjugating neighbor...
Or more generally, X sends a costly signal of his belief in P. If X is the state (as in example 2) a bet is probably impractical, but doing anything that would be costly if P is false should work. But for this, it makes a big difference in what sense Y does not trust X. If Y thinks X may deceive, costly signals are good. If Y thinks X is stupid or irrational or similar, showing belief in P is useless.
I mostly agree with the other commenters that the story does not show the qualitative changes we may expect to see from autonomous weapons. But I found it a very good short story nevertheless, and believable as well. I think it could serve well if broadly diffused, by getting someone to think about the topic for the first time before going into scenarios farther away from what they are used to.
I notice that while a lot of the answer is formal and well-grounded, "stories have the minimum level of internal complexity to explain the complex phenomena we experience" is itself a story :) Personally, I would say that any gear-level model will have gaps in the understanding, and trying to fill these gaps will require extra modeling which also has gaps, and so on forever. My guess is that part of our brain will constantly try to find the answers and fill the holes, like a small child asking "why x? ...and why y?". So if a more practical part of us wants to stop investigating, it plugs the holes with fuzzy stories which sound like understanding. Obviously, this is also a story, so discount it accordingly...
I agree it would be very good, and possibly an economic no-brainer. My point is just that what is discussed in the post works for a political no-brainer, by which I mean something that no one would bother to oppose. To get what you want you need a real political campaign, or a large scale economic education campaign. Even then it's difficult, imo, unless your proposals fit one of the cases I mention above.
That said, if you are thinking of the US there is an easy proposal to be made for medicine, which is making medical school equivalent to a college degree...
The problem is, licensed people have made an investment and expect to repay it by reaping profits from the protected market. Some have borrowed money to get in and may have to file for personal bankruptcy. So they will oppose the reform by any means at their disposal, for which I don't blame them (even if it is obviously against the general interest).
Such a reform would be doable in the following cases: (1) it compensates the losers in some way (2) it's so gradual that current licensees will mostly retire before it's fully implemented (3) it is decided by a ...
On Prussia:
On effectiveness and public health studies: the thread quoted says multiple times "in the US". I would be curious to know if this kind of thing is done more elsewhere or it's an implicit assumption that it could be done only in the US anyway (which could very well be true for all I know, drug profits are way higher in the US after all).
Does anybody know?
My feeling is that many of the people who did not benefit tend to "generalise from one example" and assume that's true for most kids. Actually, I (despite being generally pro-schooling) would say something stronger than you: there is a minority of people who are actually harmed by school compared to a reasonable counterfactual (e.g. home-schooling for some). Plus, many kids can see easily where the system is failing them, less easily where it's working.
Thanks for the review!
Regarding the "countering racism" doubts, I can see how the results should disprove at least some racist worldviews.
I think that an interpretation of human history among racists is the following: the population splits into clusters, these clusters diverge into different "races", eventually one emerges as "the best" and out-competes or replaces all others, before splitting again. Historically, this view was used to justify aggressive expansionism, opposition to intermarriage, and opposition to any policy that could slow this proce...
According to my understanding (which comes from popularized sources; I am not a doctor or a biologist) antibody counts are not the main drivers of long-term immunity. Lasting immunity is given by memory T and B cells, which are able to quickly escalate the immune response in case of new infection, including producing new antibodies. So while high antibody count means you're well protected, a low count some months after the vaccine could mean that the protection has reduced, but in almost all cases you will be protected for a much longer time. Note tha...
The point is that if the majority of the "cost of crime" is actually the cost of preventing potential crime, then it's not obvious at all that more crime prevention will help.
Sure, sometimes it's better to shift from private prevention (behavior change) to collective prevention (policing) at the margin, but not always.