At some point I had the harebrained idea of writing a LW sequence on health, despite lacking all formal qualifications for doing so, because I've been confused about / disappointed with most of my experiences at doctors' offices throughout my life. Eventually I figured I wouldn't understand why until I did a deep dive into the topic myself.
Questions I'd like to help (myself and others) answer include "how can I think about this potential illness I have", "how could I tell whether I have a sleep problem when I'm unconscious while asleep", or "how good is healthcare, how do the incentives work, and in which situations are doctors likely to give good or poor advice", etc.
By now I have a huge bunch of disorganized notes, but no drafts or essays written up; maybe I'll manage to write one this week, but most likely not.
Rough intended outline:
Does that sound interesting / useful / valuable?
Regarding concrete blockers for this project (besides procrastination, of course):
Hi, I'm a lurker. I work on CPUs. This also motivated me to post!
This is a rather niche topic, but I want to share it, because I greatly enjoy seeing others ramble about their deep-work domain expertise, so maybe someone will find this interesting too? It's similar in spirit to the podcast [What's your problem?], in which engineers talk about ridiculously niche problems that are integral to their field.
Anyways-- here's my problem.
Fuzzing (also known as mutation-based testing, coverage-directed verification, or ten other different names) has, in my opinion, been revolutionary for the software security industry. [AFL] is probably the best and most successful example, and I think most people would agree that this tool has saved millions of man-hours in finding security vulnerabilities.
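To make "fuzzing" concrete for anyone who hasn't used AFL: below is a toy sketch of the coverage-guided mutate-and-keep loop, with a small Python function standing in for an instrumented binary. The target, the coverage bookkeeping, and the planted "bug" are all made up for illustration; real fuzzers do the same thing against compiled code with much smarter mutations.

```python
import random

def target(data: bytes) -> set:
    """Stand-in for an instrumented program: return the set of branch IDs
    the input exercised, and raise when the toy 'bug' is reached."""
    branches = {"entry"}
    for i, magic_byte in enumerate(b"FUZ"):
        if len(data) > i and data[i] == magic_byte:
            branches.add(f"magic_{i}")
        else:
            return branches
    raise RuntimeError("toy bug reached")

def mutate(data: bytes) -> bytes:
    """Cheap AFL-style byte mutations: bit flips, inserts, deletes."""
    buf = bytearray(data) or bytearray(b"\x00")
    i = random.randrange(len(buf))
    op = random.choice(("flip", "insert", "delete"))
    if op == "flip":
        buf[i] ^= 1 << random.randrange(8)
    elif op == "insert":
        buf.insert(i, random.randrange(256))
    elif len(buf) > 1:
        del buf[i]
    return bytes(buf)

def fuzz(seed: bytes, iterations: int = 200_000) -> None:
    corpus = [seed]
    seen = set()
    for n in range(iterations):
        candidate = mutate(random.choice(corpus))
        try:
            coverage = target(candidate)
        except RuntimeError:
            print(f"crash after {n} iterations with input {candidate!r}")
            return
        if not coverage <= seen:   # reached something new: keep as a seed
            seen |= coverage
            corpus.append(candidate)
    print("no crash found")

fuzz(b"hello")
```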
Why don't we have such tools in hardware? Well, my personal opinion is that EDA tools are rather monopolistic and cumbersome relative to e.g. GCC (imagine paying millions of dollars for a GCC release!), and a partial side effect of that is that the languages hardware is written in (Verilog, SystemVerilog) are so ingrained that we can't get out of them.
This is just barely starting to change.
[Here] is my personal favorite contender. What makes it cool is not any revolutionary new idea, but rather the sheer amount of effort put into making things just work, which is truly commendable.
The main perk of fuzzing is, frankly, finding low-ish hanging fruit. Just like how buffer overflows are, in some sense, a "known problem", there's a plethora of hardware vulnerabilities I've found that are unbelievably easy to find. And I firmly think this can be done by simple fuzzing.
My project plan? Convert the detected hardware vulnerabilities into exploitable software vulnerabilities. I think the above project can fit into the "detection" aspect. Honestly, it's still a WIP for me to evaluate how good it is, or how complicated the fuzzer is (most of the time it's just a wrapper around SMT solvers), but it's something I'm excited about.
(On the "exploitable vulnerabilities" end, there is similar work [here], but I've done some experimentation with this and still find it rather cumbersome for a variety of details I won't get in to.)
Oh, I guess while I'm on the topic of "bringing software paradigms into the hardware world", let me also talk briefly about CirctIR.
I also believe LLVM was a bit of a boon for the software security world, enabling some really cool symbolic execution and/or reverse engineering tools. CirctIR is an attempt to basically bring this "intermediate representation" idea to hardware.
This "generator for intermediate language representation", by the way, is similar to what Chisel currently does w.r.t generating verilog. But CirctIR is a little more generic, and frankly Chisel's generator (called FIRRTL) is annoying in many ways.
Chris Lattner worked at SiFive for a bit and made these same observations, so he spearheaded the CirctIR movement. Partially as a result, there are many similarities between FIRRTL and CirctIR (Chisel's goal is to make hardware design easier, and CirctIR's goal is to make designs portable and/or decouple these toolchain flows; related goals, but still distinct).
I've wanted to play with this for some time as well, but currently the fuzzing work has me more interested, and it's something I'm trying to build an MVP for at work.
I'm not sure whether you mean fuzzing of the synthesis tools (quick Google here) or fuzzing of the synthesized devices (e.g., here; corrected). I worked with FPGAs a while back, before even unit testing was established practice in software. I'm surprised that fuzzing isn't used much, especially as it seems much faster so close to the HW.
Those two links are the same. But yeah, I'm referring to the latter, i.e. fuzzing of the synthesized devices.
"Fuzzing" as a concept is used, but not very "block-level" (some some exceptions, e.g. you likely know about UVM's support for random data streams, coming from an FPGA background). The fuzzing analogue in hardware might be called "constrained random verification".
"Fuzzing" as I've heard it used is more of a term from the software security world, the aforementioned AFL fuzzer being one example.
I do agree, though, that it's rather surprising to me that traditional fuzzing isn't used in hardware.
Those two links are the same.
Oops, corrected.
Didn't know about UVM.
Maybe one reason fuzzing isn't used more is that it is harder to detect failure? You don't get page faults or exceptions or some such with hardware. What is your idea there?
What's the purpose? To have people encourage you to finish them? To look cool for having started something awesome?
Yes.
Ideas need fostering these days. As Scott Alexander describes, we live in a time where low-hanging fruits have been picked, and individuals rarely have the resources or time to pick any. But the picking could be distributed over time and over multiple people, and to do so, you need to get the ideas that are there out of people's heads into a space where collaborating on them is possible. And reduce the natural tendency to hold on to "your" idea.
Big Motivation: Biological systems are thought to stochastically sample from probability distributions. For instance, prey in the act of evading a predator might optimally want to act randomly, at least relative to the predator's model of the prey. Is it possible for such a system to actually generate a random output without explicitly stochastic mechanisms?
Actual Project Question: How can deterministic recurrent neural networks with fixed weights be trained to create random outputs?
Project Plan: Train a recurrent neural network to output a binary digit at random, with specified entropy. For instance, say I want an RNN that can output a 0 or a 1 at every timestep, and I'd like the bit to be (effectively) chosen at random from a uniform distribution.
Some initial work/thoughts on solutions: Training an RNN on any set of random outputs will not work, it will just memorize the random strings. Can I train directly on the entropy of the output? One way to get a working system is to have the RNN implement chaotic dynamics, and make sure the timescales work out such that the dynamics have evolved enough to randomly sample the ergodic distribution associated with the chaotic attractor. How exactly I can use this to generate a string with e.g. 0.7 bits of entropy instead of 1 bit of entropy, I'm not totally sure. I've implemented a Lorenz attractor and chosen different planes to separate the state space into two partitions. I assign one partition the symbol 0 and the other the symbol 1. Then I can run the system for N timesteps and see whether I output a 0 or a 1. Thus I get a symbol string. I can then plot the block-length entropy diagram to quantify the generation of structure/entropy in that system. The trick would be to get training working with this system somehow.
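A minimal sketch of that Lorenz symbolization plus block-entropy measurement; the integrator, the x = 0 partition plane, the sampling interval, and the block lengths are arbitrary illustrative choices, not necessarily the ones from the actual experiments:

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One crude forward-Euler step of the Lorenz system."""
    x, y, z = state
    deriv = np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    return state + dt * deriv

def symbol_string(n_symbols=5000, steps_per_symbol=100):
    """Partition state space with the plane x = 0 (symbol 1 if x > 0, else 0),
    emitting one symbol every `steps_per_symbol` integration steps."""
    state = np.array([1.0, 1.0, 1.0])
    bits = []
    for _ in range(n_symbols):
        for _ in range(steps_per_symbol):
            state = lorenz_step(state)
        bits.append(1 if state[0] > 0 else 0)
    return np.array(bits)

def block_entropy(bits, block_len):
    """Shannon entropy (in bits) of the empirical distribution over blocks."""
    blocks = np.array([bits[i:i + block_len] for i in range(len(bits) - block_len + 1)])
    _, counts = np.unique(blocks, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

bits = symbol_string()
for L in range(1, 6):
    h = block_entropy(bits, L)
    print(f"block length {L}: H = {h:.3f} bits ({h / L:.3f} bits per symbol)")
```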
Further Goals: How about outputting a string that has different amounts of structure and entropy? For instance, a string that goes ...01R01R01R01R..., where R is a bit with 1 bit of entropy?
Is it possible for such a system to actually generate a random output without explicitly stochastic mechanisms?
If your idea of an explicitly stochastic mechanism is something like a pseudo-random number generator, then it's possible, but complex, and so unlikely compared to the much easier alternative of using some kind of existing noise, i.e. failing to perfectly filter out noise.
What's your definition of 'explicitly stochastic mechanisms'? Every physical-world sensor is stochastic to some degree.
"All" you really need to be stochastic is to ensure that you have some level of stochastic input, plus a chaotic feedback loop of some sort.
Some initial work/thoughts on solutions: Training an RNN on any set of random outputs will not work, it will just memorize the random strings. Can I train directly on the entropy of the output?
Train directly on maximizing the Lyapunov exponent, perhaps? That is, repeatedly estimate the exponent and adjust the weights to increase it.
(...this seems suspiciously like 'calculate and maximize the gradient'...)
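A rough sketch of what that loop might look like for a tiny fixed-weight RNN, with the largest Lyapunov exponent estimated Benettin-style (evolve a small perturbation, accumulate its log growth, renormalize) and backpropagated through; everything here, from the architecture to the step counts, is an illustrative guess rather than a recipe:

```python
import torch

torch.manual_seed(0)

n_hidden = 32
# Weight matrix of a tiny vanilla RNN, h_{t+1} = tanh(h_t @ W), scaled so it
# starts somewhere near the edge of chaos.
W = (1.5 / n_hidden ** 0.5 * torch.randn(n_hidden, n_hidden)).requires_grad_()
opt = torch.optim.Adam([W], lr=1e-2)

def largest_lyapunov(W, steps=200, eps=1e-3):
    """Benettin-style estimate: evolve a state and a small perturbation,
    accumulate the per-step log growth of the perturbation, renormalize."""
    h = torch.tanh(torch.randn(n_hidden))
    d = torch.randn(n_hidden)
    d = eps * d / d.norm()
    log_growth = 0.0
    for _ in range(steps):
        h_next = torch.tanh(h @ W)
        d = torch.tanh((h + d) @ W) - h_next      # perturbed minus nominal
        log_growth = log_growth + torch.log(d.norm() / eps + 1e-12)
        d = eps * d / (d.norm() + 1e-12)          # renormalize
        h = h_next
    return log_growth / steps

for it in range(200):
    opt.zero_grad()
    lam = largest_lyapunov(W)
    loss = -lam    # "maximize the exponent"; for a target value, use e.g. (lam - 0.5) ** 2
    loss.backward()
    opt.step()
    if it % 20 == 0:
        print(f"iter {it}: estimated largest exponent {lam.item():.3f}")
```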
How about outputting a string that has different amounts of structure and entropy?
Instead of maximizing the Lyapunov exponent, try optimizing towards a specific value of Lyapunov exponent instead?
This isn't quite right, I don't think. Close though.
Great idea! My intuition says this won't work, as you'll just capture half of the mechanism of the type of chaotic attractor we want. This will give you the "stretching" of points close in phase space to some elongated section, but not by itself the folding over of that stretched section, which at least in my current thinking is necessary. But it's definitely worth trying, I could very well be wrong! Thanks for the idea :)
Similarly, it's not obvious to me that constraining the Lyapunov exponent to a certain value gives you the correct "structure": for instance, if instead of ...01R... I wanted to train on ...10R..., or ...11R..., etc. But maybe training the Lyapunov exponent would just be one part of the optimization, and then other factors could play into it.
This will give you the "stretching" of points close in phase space to some elongated section, but not by itself the folding over of that stretched section, which at least in my current thinking is necessary.
I mean, the only way to have "stretching" of 'most' points in phase space is to have some sort of 'folding over'.
Of course, it's an entirely different matter as to whether a standard optimizer can actually make any headway in figuring that out.
For a time, I was interested in a postulated obscure relativistic effect called Transverse Gravitational Redshift. I tried to check the calculations and, in 2019, wrote a simulation of relativistic motion. The simulator works in 3+1D special relativity, uses adaptive time steps, and can reproduce the twin paradox. The project is on GitHub here:
https://github.com/GunnarZarncke/timewarp
But simulating the Transverse Redshift requires general relativity, and extending the simulator to support that would be a huge effort, so I stopped working on it. Still, I think it is useful software for teaching special relativity.
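(Not code from the repo, just for flavor: in special relativity, integrating dtau = sqrt(1 - v(t)^2/c^2) dt along each worldline is already enough to see the twin-paradox asymmetry numerically. The trip profile below is an arbitrary toy example.)

```python
import math

C = 1.0  # units where c = 1

def proper_time(velocity_profile, t_total, dt=1e-4):
    """Integrate dtau = sqrt(1 - v(t)^2 / c^2) dt along a worldline,
    where v(t) is the coordinate-time velocity."""
    tau, t = 0.0, 0.0
    while t < t_total:
        v = velocity_profile(t)
        tau += math.sqrt(max(0.0, 1.0 - (v / C) ** 2)) * dt
        t += dt
    return tau

T = 10.0
# Travelling twin: out at 0.8c for half the coordinate time, then back at 0.8c.
travelling = lambda t: 0.8 if t < T / 2 else -0.8
staying_home = lambda t: 0.0

print(f"home twin ages {proper_time(staying_home, T):.3f}")
print(f"travelling twin ages {proper_time(travelling, T):.3f}  (factor 0.6 expected)")
```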
I don't recommend looking into the effect, because it was suggested by a quack as the cause of the (perceived) accelerated cosmic expansion. Read on only if you are still interested in this hobby horse of mine. The effect has been discussed on physics fora here and here. A charitable explanation is here. If the calculations are correct, the effect is very small (10^-16 at 1 g); if the calculations are wrong, it is zero.

The problem is that general relativity is hard: nobody does the calculations except the guy claiming the effect, and he doesn't do them in small steps. Everybody who checks makes errors, even physics profs. There are a lot of irrelevant tangents ("doesn't hold for high velocities") or ad hominems. The last reported error in the debunks went unanswered. Based on what you find online, nobody can know whether the thing holds or not. The other problem is that the guy is an asocial, arrogant quack. He does make testable predictions and provides detailed instructions to reproduce them with the SDSS dataset (but please start reading at page 10, or you will fall back over). If he is onto something, we will never know, because nobody will take the social risk of associating with him.
Neural style transfer algorithms that rely on optimizing an image by gradient descent are painfully slow. You can make them faster by training a network to map the original image to a stylized one that minimizes the loss. I made them even faster by stacking a few Perlin-noise channels onto the input, stacking in a channel from a very fast hand-written edge detection algorithm too, and then performing only local processing. Unfortunately this gives very poor results and is only tolerably fast for low-res inputs.
So I gave up on that and started reading about non-ML style transfer and texture synthesis algorithms; it's a fascinating topic. In the end I managed to create an algorithm that can perform real-time (though poor-quality) high-resolution style transfer, by using Numba to compile the Python code. Sadly, when I tried to turn that into an Android app, despite trying really hard, I found that there was no way to run Numba on Android, and using NumPy directly isn't fast enough, so I gave up on that too.
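To illustrate the Numba trick (a simplified toy, not the actual app code): a hand-rolled local pass like this edge-magnitude filter is hopeless as pure Python loops, but with `numba.njit` it compiles to near-C speed, and this kind of per-pixel local processing is what I was relying on.

```python
import numpy as np
from numba import njit

@njit(cache=True)
def edge_magnitude(gray):
    """Very simple local edge detector: gradient magnitude from central
    differences, computed pixel-by-pixel in explicit loops
    (slow in CPython, fast once Numba compiles it)."""
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.float32)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gray[y, x + 1] - gray[y, x - 1]
            gy = gray[y + 1, x] - gray[y - 1, x]
            out[y, x] = (gx * gx + gy * gy) ** 0.5
    return out

# Toy usage on a random "image"; the first call triggers compilation,
# later calls run at compiled speed.
img = np.random.rand(512, 512).astype(np.float32)
edges = edge_magnitude(img)
```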
Post about your unfinished projects, incomplete inventions, or not-so-crazy ideas here.
This is an easy-entry version of the old Crazy Ideas Threads.
I am posting this in the spirit of Good Heart Week, especially as a place for lurkers to offer their thoughts and maybe grow them.
Please post one project/idea per thread.