Check out this paper in eLife, in which the authors use machine learning applied to facial images to determine whether people have genetic disorders. So cool! From what I can gather, they use a training set of just under 3000 images of faces (1300 or so of them showing a genetic disorder) and then use facial recognition software to quantify those images. Using that quantification, they can cluster different disorders based on these facial features; check out this cool animation showing the morphing of an average normal face into an average face for various disorders. Although they started with a training set of 8 syndromes, the resulting characteristics they used (the “Clinical Face Phenotype Space”) were sufficiently rich to distinguish 90 different syndromes with reasonable accuracy.
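To make the idea concrete, here is a toy sketch of the final classification step. This is purely illustrative and not the authors’ method: their pipeline detects facial feature points with computer-vision models and uses more sophisticated metrics in their phenotype space, whereas this just classifies a made-up 2-D feature vector by the nearest syndrome “average face” (the syndrome names and numbers below are invented):

```python
import math

def centroid(vectors):
    """Average of a list of equal-length feature vectors (an 'average face')."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_syndrome(face, centroids):
    """Return the syndrome whose average face is closest in feature space."""
    return min(centroids, key=lambda s: math.dist(face, centroids[s]))

# Hypothetical training vectors (think: normalized distances between
# facial landmarks), grouped by an invented syndrome label.
training = {
    "syndrome_A": [[0.1, 0.9], [0.2, 0.8]],
    "syndrome_B": [[0.9, 0.1], [0.8, 0.2]],
}
centroids = {label: centroid(faces) for label, faces in training.items()}

print(nearest_syndrome([0.15, 0.85], centroids))  # → syndrome_A
```

The appeal of this kind of scheme is that adding a new syndrome is just adding another centroid, which is presumably part of how a space trained on 8 syndromes can generalize to many more.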
“Reasonable accuracy” being a key point. The authors are quick to point out that their accuracy (which varies, but can be around 95%) is not sufficient for diagnostic purposes, where you really want 100% certainty (or as close to it as possible). Rather, it can assist clinicians by giving them some idea of what the potential disorders might be. The advantage is that pictures are so easy to take and share. With modern cell phone cameras having penetrated virtually every market in the world, and the classification being purely computational, a pretty big fraction of the world’s population could easily participate. I think this is one of the highlights of their work, because they note that previous approaches relied on 3D scans of people, which are obviously considerably harder to get your hands on.
This approach will have to compete with sequencing, which is both definitive for genetic disorders and getting cheaper and cheaper (woe to the imagers among us!). It doesn’t feel like a stretch to imagine sequencing a person for, say, $10 or $1 in the not so distant future, at which point sequencing’s advantages would be hard to beat.
That said, I feel like the approach in this paper has a lot of implications, even in a future where sequencing is much cheaper and more accessible. Firstly, there are diseases that are genetic but have no simple or readily discernible genetic basis, in which case sequencing may not reveal the answer (although as the number of available genome sequences increases, this may change).
Secondly, and perhaps more importantly, images are ubiquitous in ways that sequences are not. If you want someone’s sequence, you still have to get a physical sample. Not so for images, which are just a click away on Facebook. Will employers and insurers be able to discriminate based on a picture? Matchmakers? Can Facebook run the world’s largest genetic analyses? Will Facebook suggest friends with shared disorders? Can a family picture turn into a genetic pedigree? The authors even tried to diagnose Abraham Lincoln with Marfan syndrome from an old picture, and got a partial match. I’m sure a lot will depend on the ultimate limitations of image-based phenotyping, but still, this paper definitely got my mind whirring.