1) Only a handful of amino acids in an enzyme structure were highly conserved. (Out of hundreds, generally fewer than ten.)
2) Those were generally in the reaction center.
3) Almost all single sequence replacements had no measurable effect on protein structure and function.
4) Across species the "same" protein can diverge in sequence by up to 40%, while keeping the same structure. Sometimes this goes as far as 80%.
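To make the "diverge by 40%" figure in point 4 concrete, here is a minimal sketch of how percent identity between two already-aligned sequences is commonly computed. The function names and the toy sequences are made up for illustration, and treating a gap opposite a residue as a mismatch is an assumption (tools differ on that convention).

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned columns whose residues are identical.

    Assumes the two sequences are already aligned (same length, '-' for gaps).
    Columns where both sequences have a gap are skipped; a gap opposite a
    residue counts as a mismatch (an assumption, conventions vary).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" and b == "-":
            continue  # nothing to compare in a gap-gap column
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def percent_divergence(seq_a: str, seq_b: str) -> float:
    """Divergence expressed as 100 minus percent identity."""
    return 100.0 - percent_identity(seq_a, seq_b)


# Hypothetical toy fragments, not real proteins: 3 of 10 positions differ.
print(percent_divergence("MKTAYIAKQR", "MKSAYLAKQK"))  # 30.0
```

Real comparisons need an alignment step first (the sequences above are assumed to be aligned already), and that step is where most of the subtlety lives.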
Given these basic facts, the findings in the paper aren't really surprising to anyone who studies proteins.
[Note: As with everything in biology, you can find counterexamples. The histone proteins involved in DNA packing have an incredibly conserved sequence.]
Are there any folds and patterns that evolution has not discovered that are also useful? I think the Baker group created a bunch of new folds. I'm not sure if they are as useful as the ones discovered by evolution. After all, evolution had more compute power than us.
i did neuroscience for grad school, and i was always amazed by how often complex neural activity could be well represented by lower dimensional representations--clean manifolds, attractor dynamics, etc. i think, in general, biology (evolution) doesn't penalize against redundancy too hard (hence things like genetic drift, neutral theory of evolution, etc.).
anyway, super cool stuff. agree with you that probs more useful to explore the search space via 'less natural' structures, given how forgiving evolution is to redundancy. probs where the most information can be found
We don’t even know if this is like body plans (four legs for mammals, why not six?), i.e. is this about physical limitations of the folding space? Did evolution explore most of the space and hold onto the most useful folds, or is the common set of folds one of those accident-of-history results? Then there’s the issue that folding takes place as the protein chain exits the ribosomal tunnel, so that’s a whole other constraint on what kinds of folds might be selected. For that matter, why not other genetically encoded complex amino acids instead of just the canonical set?
Also, a common evolutionary process in eukaryotes is duplication of protein sequences and shuffling of code blocks that might represent folding domains. That might tend to lock in the existing collection of folds rather than generate novel ones, though that’s not so clear.
This weakness of AlphaFold has some modern practical relevance, since non-canonical amino acids and modified proteins are increasingly used medically, and their structures mostly seem to be determined using direct experimental methods, e.g.:
https://pmc.ncbi.nlm.nih.gov/articles/PMC10296201/
“Non-Canonical Amino Acids as Building Blocks for Peptidomimetics: Structure, Function, and Applications” (2023)
(note: there are bigger proteins, including ones so big you can see them with the naked eye (e.g. a hair), but they consist of multiple repeats of the same small building block. There are many such building blocks. And the very few exceptions to that are "not really" part of eukaryotic cells, but of cell organelles that have their own DNA)
But even if you just take the first 4 amino acids, there are 20^4 = 160,000 possible combinations (counting only the 20 canonical amino acids). Life uses fewer than 1000 of those.
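As a rough check on the scale here, assuming an alphabet of just the 20 canonical amino acids: the number of distinct sequences of length n is 20^n, which for n = 4 is in the same ballpark as the figure quoted above and blows up very quickly after that. A small sketch (the function name is made up for illustration):

```python
ALPHABET_SIZE = 20  # assumption: only the 20 canonical amino acids


def sequence_space(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct sequences of a given length over the alphabet."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


print(sequence_space(4))  # 160000

# Python integers are arbitrary precision, so longer chains are fine too.
for n in (10, 50, 100):
    digits = len(str(sequence_space(n)))
    print(f"length {n}: a number with {digits} digits")
```

Passing `alphabet_size=22` (adding selenocysteine and pyrrolysine) gives 234,256 for length 4, so the order of magnitude does not change.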
In other words: DNA and evolution, even with billions of years to think about it, are really still beginners when it comes to protein design. Or at least, it is pretty obvious that it's possible to do A LOT better than natural selection.