A lot of what you write is to the point and very valid. However, I think you are missing part of the story. Let’s start with
“Unlike drug development, where you’re trying to precisely hit some key molecular mechanism, assessing toxicity almost feels…brutish in nature”
I assume you don’t really believe this. Toxicity is often exactly about precisely hitting some key molecular mechanism. A mechanism that you may have no idea your chemistry is going to hit beforehand. A mechanism, moreover, that you cannot find with a straightforward ML model because your chemistry is not in any training set the model could access. It is very easy to underestimate the vastness of drug-like chemical space, and it is generally the case that any given biological target molecule (desired or undesired) can be inhibited or otherwise interfered with by a wide range of different chemical moieties (thus keeping medicinal chemists very well employed, and patent lawyers busy). There is unlikely to be toxicological data on any of them unless the target is quite old and there is publicly available data on some clinical candidates.
We look to AlphaFold as the great success for ML in the biological chemistry field, and so we should, but we need to remember that AlphaFold is working on an extremely small portion of chemical space, not much more than that covered by the 20 natural amino acids. So AlphaFold’s predictions can be comfortably within distribution of what is already established by structural biology. ML models for toxicology, on the other hand, are very frequently predicting out of distribution.
In point of fact, the most promising routes to avoiding toxicity reside in models that are wholly or partially physics-based. If we are targeting a particular kinase, say, we can create models (using AlphaFold if necessary) of all the most important kinases we don’t want to hit and, using physics-based modelling, predict whether we would get unwanted activity against any of these targets. We still have the problem of hitting unrelated protein targets, but even here we could, in principle, screen for similarities in binding cavities across a wide range of off-targets and use physics-based modelling to assess cases where there is a close enough match.
Needless to say, that requires an awful lot of compute, and no one is really doing this at scale yet. It is a very difficult problem.
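To make the shape of such an anti-target screen concrete, here is a minimal sketch of the bookkeeping around it, with the expensive physics-based scoring (docking plus FEP, say) assumed to have been run elsewhere; the target names, energies and threshold are hypothetical illustrations, not real data.

```python
# Minimal sketch of an anti-target (selectivity) screen. The physics-based
# scoring itself is treated as a black box; everything below is hypothetical.

# Predicted binding free energies (kcal/mol) of one candidate compound
# against a hypothetical panel of kinases we do NOT want to inhibit.
predicted_dG = {
    "KINASE_A": -6.2,   # weak predicted binding
    "KINASE_B": -10.5,  # strong predicted binding -> potential liability
    "KINASE_C": -7.8,
}

LIABILITY_THRESHOLD = -9.0  # kcal/mol; hypothetical cut-off

def flag_off_target_liabilities(dG_by_target, threshold=LIABILITY_THRESHOLD):
    """Return anti-targets predicted to bind more tightly than the cut-off."""
    return {t: dG for t, dG in dG_by_target.items() if dG <= threshold}

for target, dG in flag_off_target_liabilities(predicted_dG).items():
    print(f"Predicted off-target activity at {target}: {dG:.1f} kcal/mol")
```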
Yes, I agree, I think it is pretty unlikely. But not completely impossible. As I said, it should be pretty easy to find them, if they are in the lysate, via high-performance liquid chromatography. Brain-penetrant cyclic peptides should on the whole be significantly less polar than acyclic polypeptides of similar mass.
An excellent analysis, and I’m almost sure your mistrust in the pharmaceutical efficacy of Cerebrolysin is well founded. However, having some experience working in the field of brain-penetrant drugs, I can comment that your restrictions on molecular weight and properties are too conservative. Small molecules of >550 Da are capable of crossing the blood-brain barrier if very well tailored. Also, small cyclic peptides can hide their polar backbones within buried intramolecular hydrogen-bond networks and become membrane permeable. The bicyclic peptide SFTI-1, a 14-mer, has been shown to be brain penetrant in rats in what looks to me like a reasonable study. So, playing devil’s advocate, there is a hypothesis that the lysis procedure generates certain cyclic peptides of 500-1000 Da that could penetrate the BBB and have a biological effect.
I don’t believe this hypothesis, but it does need to be discounted. Such cyclic peptides should be straightforward to detect by HPLC/MS, I’d have thought, through their significantly less polar nature. Has anyone published work looking for these in Cerebrolysin?
There is an additional important point that needs to be made. AlphaFold3 is trained predominantly on “positive” data. By this I mean the training data encapsulates considerable knowledge of favourable atom-atom or group-group interactions, and relative propensities can be deduced. But “negative” data, in other words repulsive electrostatic or van der Waals interactions, are only encoded by absence, because these are naturally not often found in stable biochemical systems. There are no relative propensities available for these interactions. So AF3 can be expected not to perform as well when applied to real-world drug design problems, where such interactions have to be taken into account and balanced against each other and against favourable interactions. Again, this issue can be mitigated by creating hybrid physics-compliant models.
It is worth also noting that ligand docking is not generally considered a high-accuracy technique and, these days, is often used as a first-pass screen of large molecular databases. The hits from docking are then further assessed using a more accurate physics-based method such as Free Energy Perturbation.
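As a toy illustration of that funnel (library size, scores and cut-off are invented for the purpose), the workflow looks something like this:

```python
import random

# Toy sketch of the docking -> FEP funnel: cheap, low-accuracy docking scores
# screen a large library, and only the best fraction is passed on to the
# expensive physics-based stage. The scores here are random placeholders.
random.seed(0)
library = {f"cmpd_{i:06d}": random.uniform(-12.0, -4.0) for i in range(100_000)}

top_fraction = 0.001  # keep only the best-scoring 0.1% from docking
ranked = sorted(library.items(), key=lambda kv: kv[1])
docking_hits = ranked[: int(len(ranked) * top_fraction)]

fep_queue = [name for name, _ in docking_hits]  # candidates for the FEP stage
print(f"{len(library)} compounds docked, {len(fep_queue)} passed on to FEP")
```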
I have similar concerns regarding the ligand sets used to test AlphaFold3. I’ve had a cursory look at them and it seemed to me there were a lot of phosphate-containing molecules, a fair few sugars, and also some biochemical co-factors. I haven’t done a detailed analysis, so some caveats. But if true, there are two points here. Firstly, there will be a lot of excellent crystallographic training material available on these essentially biochemical entities, so AlphaFold3 is more likely to get these particular ones right. Secondly, these are not drug-like molecules, and docking programs are generally parameterized to dock drug-like molecules correctly, so they are likely to have a lower success rate on these structures than on drug-like molecules.
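For what it’s worth, that cursory look could be made quantitative fairly easily. Below is a sketch of the sort of check I have in mind, assuming the test ligands are available as SMILES; the two example structures are placeholders, not the actual AlphaFold3 evaluation set.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

# Placeholder SMILES; in practice, read the actual AF3 evaluation ligands.
smiles_list = [
    "OP(=O)(O)OCC1OC(O)C(O)C1O",   # phosphorylated sugar-like fragment
    "CC(=O)Nc1ccc(O)cc1",          # paracetamol, a small drug-like molecule
]

for smi in smiles_list:
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        continue
    contains_phosphorus = mol.HasSubstructMatch(Chem.MolFromSmarts("[#15]"))
    # Crude drug-likeness flag via Lipinski's rule of five.
    ro5_ok = (Descriptors.MolWt(mol) <= 500
              and Descriptors.MolLogP(mol) <= 5
              and Lipinski.NumHDonors(mol) <= 5
              and Lipinski.NumHAcceptors(mol) <= 10)
    print(f"{smi}: phosphorus={contains_phosphorus}, rule-of-five ok={ro5_ok}")
```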
I think a more in-depth analysis of the performance of AF3 on the validation data is required, as the OP suggests. The problem here is that biochemical chemical space, which is very well represented by experimental 3D structures, is much smaller than potential drug-like chemical space, which, comparatively speaking, is poorly represented by experimental 3D structures. So, for any new drug series, AF3 will inevitably often be operating beyond its zone of applicability. There are ways of getting round this data restriction, including creating physics-compliant hybrid models (and thereby avoiding clashing atoms). I’d be very surprised if such approaches are not currently being pursued.
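One crude way of putting a number on “beyond the zone of applicability” is nearest-neighbour fingerprint similarity between a new compound and the ligands the model has effectively seen; a low maximum Tanimoto similarity suggests an out-of-distribution prediction. A sketch, with placeholder SMILES standing in for the real training ligands and candidate:

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# Placeholder structures: two "biochemical" ligands standing in for the
# training set, and ibuprofen standing in for a novel drug candidate.
training_smiles = ["OC1OC(CO)C(O)C(O)C1O",   # sugar-like ligand
                   "Nc1ncnc2[nH]cnc12"]      # adenine (co-factor fragment)
candidate_smiles = "CC(C)Cc1ccc(cc1)C(C)C(=O)O"

def fingerprint(smi):
    return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smi), 2, nBits=2048)

candidate_fp = fingerprint(candidate_smiles)
similarities = [DataStructs.TanimotoSimilarity(candidate_fp, fingerprint(s))
                for s in training_smiles]
print(f"Nearest-neighbour Tanimoto similarity: {max(similarities):.2f}")
```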
So, after tearing my hair out trying to generate increasingly complex statistical analyses of scientific data in Excel, my world changed completely when I started using KNIME to process and transform data tables. It is perfect for a non-programmer such as myself, allowing the creation of complex yet easily broken-down workflows that use spreadsheet input and output. Specialist domain tools are easily accessible (e.g. chemical structure handling and access to the RDKit toolkit for my own speciality), and there is a thriving community generating free-to-use functionality. Best of all, it is free to the single desktop user.
Useful post. I can expand on one point and make a minor correction. Single-particle cryo-EM is indeed a new(ish) powerful method of protein structure elucidation that is starting to make an impact in drug design. It is especially useful when a protein cannot easily be crystallised to allow more straightforward X-ray structure determination; this is usually the case with transmembrane proteins, for example. However, it is actually best if the protein molecules are completely unaligned in any preferred direction, as the simplest application of the refinement software assumes a perfectly random 3D orientation of the many thousands of protein copies imaged on the grid. In practice this is not so easy to achieve, and corrections for unwanted preferred orientation need to be made.
I think this is an interesting point of view. The OP is interested in how this concept of checked democracy might work within a corporation. From a position of ignorance, can I ask whether anyone familiar with German corporate governance recognises this mode of democracy within German organisations? I choose Germany because large German companies have historically incorporated significant worker representation within their governance structures and have, on the whole, tended to perform well.
The other day, during an after-symposium discussion on detecting BS AI/ML papers, one of my colleagues suggested doing a text search for “random split” as a good test.
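For what it’s worth, the test is trivial to automate; a minimal sketch, assuming the papers have already been converted to plain-text files in a hypothetical papers/ directory:

```python
import pathlib

# Tongue-in-cheek screen: flag papers whose text mentions "random split".
for path in pathlib.Path("papers").glob("*.txt"):
    if "random split" in path.read_text(errors="ignore").lower():
        print(f"possible red flag: {path.name}")
```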