Ponder Stibbons - LessWrong

I agree with the OP that the search for a broad-spectrum anti-cancer drug is still a worthwhile endeavour. But I think it would be wrong to hold back on research into the more specific cancer remedies, because the most effective current therapies are very often combination therapies in which a type-specific anti-cancer drug is co-administered or post-administered alongside a classical broad-spectrum anticancer agent such as cisplatin, or an anti-hormone agent for hormone-dependent cancers. It is unlikely that new broad-spectrum treatments will be so effective that this situation changes.

Having said that, there is a huge amount of research into targets that are ubiquitous across many cancers but were formerly considered undruggable, such as p53 and Myc. We are getting better at finding ways to tackle these problem proteins. One approach with great general promise is proximity-induced degradation. A binder for the target is found, which doesn’t need to be at the main active site if that site is unattractive (for example, if it is highly polar). This binder is then attached by a chemical linker to a molecule that binds strongly to an E3 ligase. This enzyme then recruits an E2 ubiquitin-conjugating enzyme, which ubiquitinylates the target protein preferentially on account of their proximity. The ubiquitinylated target protein is then recognised by the proteasome for degradation.

AOH1996 has, according to some accounts, been oversold as a cancer cure-all by the media. However, even if that is true, it could still have value as part of a combination therapy.

Three Subtle Examples of Data Leakage

Ponder Stibbons8mo105

The other day, during an after-symposium discussion on detecting BS AI/ML papers, one of my colleagues suggested doing a text search for “random split” as a good test.
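To illustrate why “random split” is a red flag: when near-duplicate items (for example, compounds from the same chemical series) land on both sides of a random split, the test score flatters the model. Below is a minimal sketch of a group-aware split, assuming each record already carries some group label. The function name `grouped_split`, the toy compound IDs, and the scaffold labels are all hypothetical.

```python
import random
from collections import defaultdict


def grouped_split(records, group_of, test_fraction=0.2, seed=0):
    """Split records so that no group appears in both train and test."""
    groups = defaultdict(list)
    for rec in records:
        groups[group_of(rec)].append(rec)

    # Shuffle whole groups, not individual records.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    target = test_fraction * len(records)
    train, test = [], []
    for key in keys:
        # Fill the test set with whole groups until it reaches the target size.
        (test if len(test) < target else train).extend(groups[key])
    return train, test


# Hypothetical toy data: (compound_id, scaffold_label), 5 scaffolds of 10 compounds.
compounds = [(f"cpd{i}", f"scaffold{i % 5}") for i in range(50)]
train, test = grouped_split(compounds, group_of=lambda r: r[1])

train_groups = {scaffold for _, scaffold in train}
test_groups = {scaffold for _, scaffold in test}
```

Here `train_groups` and `test_groups` never overlap, which is the property a plain random split cannot guarantee.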

A primer on why computational predictive toxicology is hard

Ponder Stibbons10mo70

A lot of what you write is to the point and very valid. However, I think you are missing part of the story. Let’s start with

“Unlike drug development, where you’re trying to precisely hit some key molecular mechanism, assessing toxicity almost feels…brutish in nature”

I assume you don’t really believe this. Toxicity is often exactly about precisely hitting some key molecular mechanism. A mechanism that you may have no idea your chemistry is going to hit beforehand. A mechanism, moreover, that you cannot use a straightforward ML model to find, because your chemistry is not in any training set that an ML model could access. It is very easy to underestimate the vastness of drug-like chemical space, and it is generally the case that any given biological target molecule (desired or undesired) can be inhibited or otherwise interfered with by a wide range of different chemical moieties (thus keeping medicinal chemists very well employed, and patent lawyers busy). There is unlikely to be toxicological data on any of them unless the target is quite old and there is publicly available data on some clinical candidates.

We look to AlphaFold as the great success for ML in the biological chemistry field, and so we should, but we need to remember that AlphaFold is working on an extremely small portion of chemical space, not much more than that covered by the 20 natural amino acids. So AlphaFold’s predictions can be comfortably within distribution of what is already established by structural biology. ML models for toxicology, on the other hand, are very frequently predicting out of distribution.
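One common way to make “out of distribution” concrete is to ask how similar a new compound is to its nearest neighbour in the training set. The sketch below is only an illustration under stated assumptions: fingerprints are represented as plain sets of integer feature IDs, the example fingerprints are invented, and the 0.4 cut-off is an arbitrary hypothetical threshold, not an established standard.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature IDs."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def nearest_training_similarity(query, training_set):
    """Similarity of the query to its closest training compound."""
    return max(tanimoto(query, fp) for fp in training_set)


# Invented toy fingerprints (sets of feature IDs), for illustration only.
training = [
    frozenset({1, 2, 3, 4}),
    frozenset({2, 3, 4, 5}),
    frozenset({10, 11, 12}),
]
familiar = frozenset({1, 2, 3, 5})     # shares most features with training compounds
novel = frozenset({3, 20, 21, 22})     # shares almost nothing

THRESHOLD = 0.4  # hypothetical applicability cut-off

sim_familiar = nearest_training_similarity(familiar, training)
sim_novel = nearest_training_similarity(novel, training)

for name, sim in [("familiar", sim_familiar), ("novel", sim_novel)]:
    status = "in domain" if sim >= THRESHOLD else "out of distribution"
    print(f"{name}: nearest-neighbour similarity {sim:.2f} ({status})")
```

A prediction for the `novel` compound would deserve much less trust than one for `familiar`, which is the situation toxicology models find themselves in for most new chemistry.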

In point of fact the most promising routes to avoiding toxicity reside in models that are wholly or partially physics-based. If we are targeting a particular kinase (say) we can create models (using AlphaFold if necessary) of all the most important kinases we don’t want to hit and, using physics-based modelling, predict whether we could get unwanted activity against any of these targets. We still have the problem of hitting unrelated protein targets but even here we could, in principle, screen for similarities in binding cavities over a wide range of off-targets and use physics-based modelling to assess cases where there is a close enough match.

Needless to say that requires an awful lot of compute and no-one is really doing this to scale yet. It is a very difficult problem.

WTH is Cerebrolysin, actually?

Ponder Stibbons10mo21

Yes, I agree, I think it is pretty unlikely. But not completely impossible. As I said, it should be pretty easy to find them if they are in the lysate, via high-performance liquid chromatography (HPLC). Brain-penetrant cyclic peptides should on the whole be significantly less polar than acyclic polypeptides of similar mass.

WTH is Cerebrolysin, actually?

Ponder Stibbons10mo102

An excellent analysis, and I’m almost sure your mistrust in the pharmaceutical efficacy of Cerebrolysin is well founded. However, having some experience working in the field of brain-penetrant drugs, I can comment that your restrictions on molecular weight and properties are too conservative. Small molecules of >550 Dalton are capable of crossing the blood-brain barrier (BBB) if very well tailored. Also, small cyclic peptides can hide their polar backbones within buried intramolecular hydrogen bond networks and become membrane permeable. The bicyclic peptide SFTI-1, a 14-mer peptide, has been shown to be brain penetrant in rat in what looks to me a reasonable study. So, playing devil’s advocate, there is a hypothesis that the lysis procedure generates certain cyclic peptides of 500-1000 Dalton that could penetrate the BBB and have a biological effect.

I don’t believe this hypothesis, but it does need to be discounted. Such cyclic peptides should be straightforward to detect by HPLC/MS, I’d have thought, through their significantly less polar nature. Has anyone published work looking for these in Cerebrolysin?

I'm a bit skeptical of AlphaFold 3

Ponder Stibbons1y10

There is an additional important point that needs to be made. AlphaFold3 is using predominantly “positive” data. By this I mean the training data encapsulates considerable knowledge of favourable atom-atom or group-group interactions, and relative propensities can be deduced. But “negative” data, in other words repulsive electrostatic or van der Waals interactions, are only encoded by absence, because these are naturally not often found in stable biochemical systems. There are no relative propensities available for these interactions. So AF3 can be expected to not perform as well when applied to real-world drug design problems where such interactions have to be taken into account and balanced against each other and against favourable interactions. Again, this issue can be mitigated by creating hybrid physics-compliant models.

It is also worth noting that ligand docking is not generally considered a high-accuracy technique and, these days, is often used as a first-pass screen of large molecular databases. The hits from docking are then further assessed using an accurate physics-based method such as Free Energy Perturbation.
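The docking-then-rescoring workflow described above can be sketched as a simple two-stage funnel: rank everything with the cheap score, keep a shortlist, and spend the expensive method only on that shortlist. This is a toy sketch, not a real docking or FEP pipeline. The function `two_stage_screen`, the ligand names, and all score values are made up for illustration, with lower (more negative) scores taken to mean better predicted binding.

```python
def two_stage_screen(library, cheap_score, accurate_score, keep=3):
    """Shortlist by a cheap score, then re-rank the shortlist accurately.

    Lower scores are treated as better (more favourable binding).
    """
    shortlist = sorted(library, key=cheap_score)[:keep]
    return sorted(shortlist, key=accurate_score)


# Made-up placeholder scores standing in for docking and FEP results.
docking = {"A": -9.0, "B": -8.5, "C": -7.0, "D": -6.0, "E": -8.8}
rescored = {"A": -7.5, "B": -10.2, "C": -9.9, "D": -11.0, "E": -8.0}

expensive_calls = []


def accurate(ligand):
    # Record each call to show how few ligands reach the expensive stage.
    expensive_calls.append(ligand)
    return rescored[ligand]


ranked = two_stage_screen(list(docking), docking.get, accurate, keep=3)
print(ranked, "expensive evaluations:", len(expensive_calls))
```

Note that ligand "D" has the best made-up second-stage score but never reaches that stage, which is the price of a low-accuracy first pass: false negatives from docking are simply lost.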

I'm a bit skeptical of AlphaFold 3

Ponder Stibbons1y52

I have similar concerns regarding the ligand sets used to test AlphaFold3. I’ve had a cursory look at them and it seemed to me there were a lot of phosphate-containing molecules, a fair few sugars, and also some biochemical co-factors. I haven’t done a detailed analysis, so some caveats. But if true, there are two points here. Firstly, there will be a lot of excellent crystallographic training material available on these essentially biochemical entities, so AlphaFold3 is more likely to get these particular ones right. Secondly, these are not drug-like molecules, and docking programs are generally parameterized to dock drug-like molecules correctly, so they are likely to have a lower success rate on these structures than on drug-like molecules.

I think a more in-depth analysis of the performance of AF3 on the validation data is required, as the OP suggests. The problem here is that biochemical chemical space, which is very well represented by experimental 3D structures, is much smaller than potential drug-like chemical space, which is comparatively poorly represented by experimental 3D structures. So inevitably AF3 will often be operating beyond its zone of applicability for any new drug series. There are ways of getting round this data restriction, including creating physics-compliant hybrid models (and thereby avoiding clashing atoms). I’d be very surprised if such approaches are not currently being pursued.

Level up your spreadsheeting

Ponder Stibbons1y10

So after tearing my hair out trying to generate increasingly complex statistical analyses of scientific data in Excel, my world changed completely when I started using KNIME to process and transform data tables. It is perfect for a non-programmer such as myself, allowing the creation of complex yet easily broken-down workflows that use spreadsheet input and output. Specialist domain tools are easily accessible (e.g. chemical structure handling and access to the RDKit toolkit for my own speciality) and there is a thriving community generating free-to-use functionality. Best of all, it is free to the single desktop user.

introduction to cancer vaccines

Ponder Stibbons1y82

Useful post. I can expand on one point and make a minor correction. Single-particle cryo-EM is indeed a new(ish) powerful method of protein structure elucidation that is starting to make an impact in drug design. It is especially useful when a protein cannot easily be crystallised to allow more straightforward X-ray structure determination. This is usually the case with transmembrane proteins, for example. However, it is actually best if the protein molecules are completely unaligned in any preferred direction, as the simplest application of the refinement software assumes a perfectly random 3D orientation of the many thousands of protein copies imaged on the grid. In practice this is not so easy to achieve, and corrections for unwanted preferred orientation need to be made.
