All of rbv's Comments + Replies

rbv72

The vanilla Transformer architecture is horrifically computationally inefficient. I really thought it was a terrible idea when I learnt about it. For every single token, it processes ALL of the weights in the model and ALL of the context. And a token is less than a word, less than a concept. You generally don't need to consider trivia to fill in grammatical words. On top of that, implementations of it were very inefficient. I was shocked when I read the FlashAttention paper: I had assumed that everyone would have implemented attention that way in the first plac...

rbv10

tl;dr: For a hovering aircraft, upward thrust equals weight, but this isn't what determines engine power.

I'm no expert, but the important distinction is between power and force (thrust). Power is work done (energy transferred) per unit time. If you were just gliding slowly in a large, light unpowered glider at a fixed altitude (assuming negligible drag), or, to be more realistic, hovering in a blimp, with lift equalling weight, you'd be doing no work! (And neither would gravity.) On the other hand, when a helicopter hovers at a fixed altitude it's do...

rbv32

Fight the tyrant, not the Russian army. I believe the sort of thing that the OP is asking for, if we restrict ourselves to just Russia for the moment, is: is there any way to assist with getting rid of Putin, reducing the harm he causes, or preventing the next Putin after he's gone? Focusing in further on the first of those: Is it helpful to donate to democracy-enhancing initiatives in Russia? (Is it possible to help get Putin voted out? The answer is apparently no.) Can one help to get him overthrown? It seems possible, if he were to become unpopular enou...