Mikola Lysenko
Mikola Lysenko has not written any posts yet.

That's a great link, thanks!
Though it doesn't really address the point I made, they do briefly mention it:
> Interestingly, diamond has the highest known oxidative chemical storage density because it has the highest atom number (and bond) density per unit volume. Organic materials store less energy per unit volume, from ~3 times less than diamond for cholesterol, to ~5 times less for vegetable protein, to ~10–12 times less for amino acids and wood ...
>
> Since replibots must build energy-rich product structures (e.g. diamondoid) by consuming relatively energy-poor feedstock structures (e.g., biomass), it may not be possible for biosphere conversion to proceed entirely to completion (e.g., all carbon atoms incorporated into nanorobots) using...
It's good to hear from an actual expert on this subject. I've also been quite skeptical of the diamondoid nanobot apocalypse on feasibility grounds (though I am still generally worried about AI, this specific foom scenario seems very implausible to me).
Maybe you could also help answer some other questions I have about the scalability of nanomanufacturing. Specifically, wouldn't the energy involved in assembling nanostructures be much, much greater than snapping together ready-made proteins/nucleic acids to build proteins/cells? I am not convinced that runaway nanobots can self-assemble or be built in factories at planetary scale, due to simple thermodynamic limits. For example, if you are ripping apart atoms and sticking...
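A back-of-envelope sketch of the energy comparison being asked about here (rough textbook bond and ATP values, my own assumptions, not figures from the comment or the linked paper). It tallies the gross bond energy that has to be handled per assembly step, ignoring energy recovered when new bonds form:

```python
# Rough comparison: energy handled per "assembly step" for biological
# polymerization vs. covalent bond rearrangement toward diamondoid.
# All figures are textbook order-of-magnitude values, not measurements.

ATP_HYDROLYSIS_KJ_PER_MOL = 30.5   # standard free energy of ATP -> ADP + Pi
ATP_PER_PEPTIDE_BOND = 4           # approx. ATP equivalents per residue added by a ribosome
CC_BOND_KJ_PER_MOL = 347           # typical C-C bond dissociation energy
CH_BOND_KJ_PER_MOL = 413           # typical C-H bond dissociation energy

# Biology: cost of snapping one ready-made monomer onto a growing chain.
peptide_step = ATP_PER_PEPTIDE_BOND * ATP_HYDROLYSIS_KJ_PER_MOL   # ~120 kJ/mol

# Mechanosynthesis (very crude assumption): moving one carbon from biomass
# into a diamond lattice means breaking on the order of ~2 C-H bonds and
# ~1 C-C bond per atom placed; this is gross energy handled, not net cost.
diamond_step = 2 * CH_BOND_KJ_PER_MOL + CC_BOND_KJ_PER_MOL        # ~1170 kJ/mol

print(f"peptide-bond step : ~{peptide_step:.0f} kJ/mol")
print(f"diamondoid step   : ~{diamond_step:.0f} kJ/mol (crude tally of bonds touched)")
print(f"ratio             : ~{diamond_step / peptide_step:.0f}x")
```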
This is the same kind of thing as the Black-Scholes model for options pricing. As a prediction with a finite time horizon approaches its resolution date, its probability converges to a known value. In finance people use this to price derivatives like options contracts, but the same principle should apply to any information.
I think you can probably put some numbers on the ideas in this post using roughly the same sort of analysis.
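A minimal sketch of that sort of analysis (my own illustration, not from the post): under Black-Scholes, the risk-neutral probability that the underlying finishes above a strike is N(d2), which is the undiscounted price of a binary option. Treating that as a stand-in for "probability the prediction resolves true," shrinking the time horizon forces it toward 0 or 1:

```python
import math

def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_above_strike(spot: float, strike: float, sigma: float, horizon: float, rate: float = 0.0) -> float:
    """Black-Scholes risk-neutral probability that spot ends above strike
    after `horizon` years -- a stand-in for the probability that the
    'prediction' resolves true."""
    d2 = (math.log(spot / strike) + (rate - 0.5 * sigma**2) * horizon) / (sigma * math.sqrt(horizon))
    return norm_cdf(d2)

# As the horizon shrinks, the probability converges to a known value
# (here spot > strike, so it converges to 1).
for t in (1.0, 0.25, 0.05, 0.01, 0.001):
    print(f"T={t:6.3f}  P(resolve true) = {prob_above_strike(105, 100, sigma=0.2, horizon=t):.3f}")
```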
I disagree. In practice, diffusion models are autoregressive when generating non-trivial amounts of text. A better way to think about them is as a generalization of multi-token prediction (similar to how DeepSeek does it), where the number of tokens you get to predict in one shot is controllable and steerable. If you use a diffusion model over a longer generation, you end up running it autoregressively over blocks, and in the limit you could make it work like a normal one-token-at-a-time LLM or predict one big batch of N tokens at a time.
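A sketch of the block-wise decoding loop this describes (the `denoise_block` function is hypothetical, standing in for a full diffusion/denoising pass that fills one block conditioned on the context; the outer loop is plain autoregression over blocks):

```python
from typing import Callable, List

def generate(
    denoise_block: Callable[[List[int], int], List[int]],  # hypothetical: (context, block_size) -> new tokens
    prompt: List[int],
    num_tokens: int,
    block_size: int,
) -> List[int]:
    """Semi-autoregressive decoding with a text diffusion model (sketch).

    Each call to `denoise_block` runs the diffusion process to fill in
    `block_size` tokens conditioned on everything generated so far.
    block_size=1 recovers an ordinary one-token-at-a-time LLM;
    block_size=num_tokens is a single big batch of N tokens.
    """
    tokens = list(prompt)
    while len(tokens) - len(prompt) < num_tokens:
        remaining = num_tokens - (len(tokens) - len(prompt))
        block = denoise_block(tokens, min(block_size, remaining))
        tokens.extend(block)  # the outer loop is ordinary autoregression over blocks
    return tokens
```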