All of those administrative and metadata and peripheral problems with AI systems... I'm not sure I buy that story. Because all of those things are fixable, slowly but surely. The problem is that the AI isn't worth it yet. Even when it performs at its best, it's not massively outperforming radiologists, which is what it would need to do to motivate everyone to fix the little problems.
The issue is that radiologists are pretty good at their jobs. They are not the big problem in medicine, and AI choosing to compete with those guys is just an example of looking for your car keys under the streetlight. AI is relatively good at image data, so they thought they'd try their hand at radiology; but in practical terms no one needs Google to do that.
The big entry of AI into medicine will come when AI offers a killer app that doctors can't currently do well. It might be very early diagnosis of disease from changes in lifestyle noticed by a phone or wearable. Or micro-keyhole surgery. Or something else I can't even imagine yet. But AI engines at the moment are just like the very earliest steam engines: not actually able to go faster than a horse. Definitely worth investing in as a long-term project, but the answer to the question "why isn't this technology in widespread use?" is not because hospital data isn't tidy enough. It's because we already have horses.
^ Absolutely. 100% agree. Hospitals are not motivated to fix the issues with image quality, coding, and metadata because there is no economic incentive.
Yes, the value prop for applications that seek to replace what radiologists do is generally quite low. That's why the radiologists and other insiders I talk to are most excited about triage applications (FDA requirements are also a bit lower for triage). Because radiologists are so overworked, it can often take 12-24 hours before a radiologist looks at an image. Two of the most commercially successful AI-for-radiology companies - AiDoc and viz.ai - have been making a lot of money with their systems for detecting hemorrhage in head CT scans. Obviously, it's important that hemorrhages are treated as fast as possible!
The other area where I think radiology AI can bring value, which I didn't get into at all in this post, is in developing countries. But there are a host of other challenges there - old scanning tech, slow internet connections, distributional shift in demographics and disease presentation, the need to maintain high trust in the outputs, and so on. Maybe I'll do another post on AI for radiology applications in developing countries in the near future...
Another area is digital pathology, which I'm not as familiar with but where I get the sense AI has a lot of potential. Pathology has always lagged behind radiology in terms of tech adoption and digitization, however, for reasons that aren't exactly clear. Some people say it's because the pathologists were too attached to their microscopes.
Two issues with digital pathology are that the images are far larger and that the larger vendors work with proprietary formats. Clinical digital radiology grew in the 1990s and 2000s, and DICOM provided a standard way for images to be shared. The computers and networks of that era couldn't gracefully handle the much larger pathology slide images (a digitized film was typically around 2K x 2K pixels, while a digitized slide might be 100K x 100K). Storage costs for pathology were very large.
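As a rough back-of-envelope (my own numbers, assuming an uncompressed 16-bit grayscale film and an uncompressed 24-bit RGB slide):

    # Uncompressed size of a digitized film vs. a digitized pathology slide
    film_bytes = 2_000 * 2_000 * 2        # 2K x 2K at 16 bits/pixel -> ~8 MB
    slide_bytes = 100_000 * 100_000 * 3   # 100K x 100K, 24-bit RGB  -> ~30 GB
    print(slide_bytes / film_bytes)       # roughly 3750x more storage per image

Compression narrows the gap, but it's still orders of magnitude per image.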
Although DICOM standards for pathology were created, most of the vendors chose to build their own formats. Partly this was for performance, but I suspect that part of it was the vendors' motivation to keep hospitals from migrating to another vendor's product (very similar to how radiology vendors support GSPS inconsistently and have avoided widespread support for DICOM SR and SEG).
"Although DICOM standards for pathology were created most of the vendors chose to build their own formats. " -- wow, I didn't know that. That's unfortunate because AI systems are all being built around the DICOM standard for input/output, as they should be. And obviously standards are helpful when they are widely adopted.
Side comment: if anyone here works in radiology or pathology, they may find this library useful for manipulating DICOM images and generating overlays for AI outputs: https://github.com/herrmannlab/highdicom Many pathology people are using it.
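For anyone curious what that looks like in practice, here's a minimal sketch (not from the post; the file paths, model name, and identifiers are placeholders, and exact argument names may differ slightly across highdicom versions) of wrapping an AI model's binary mask as a standard DICOM Segmentation that viewers can overlay on the source series:

    import numpy as np
    import pydicom
    import highdicom as hd
    from pydicom.sr.codedict import codes

    # Source CT series the model was run on (paths are placeholders)
    source_images = [pydicom.dcmread(p) for p in ["ct/slice_001.dcm", "ct/slice_002.dcm"]]

    # Binary mask from the AI model: one frame per source slice (all zeros here as a stand-in)
    mask = np.zeros((len(source_images), 512, 512), dtype=bool)

    # Describe what the single segment represents (liver is just an example)
    segment = hd.seg.SegmentDescription(
        segment_number=1,
        segment_label="liver",
        segmented_property_category=codes.SCT.Organ,
        segmented_property_type=codes.SCT.Liver,
        algorithm_type=hd.seg.SegmentAlgorithmTypeValues.AUTOMATIC,
        algorithm_identification=hd.AlgorithmIdentificationSequence(
            name="demo-model",
            version="0.1",
            family=codes.DCM.ArtificialIntelligence,
        ),
    )

    # Build a standards-compliant DICOM SEG object and save it
    seg = hd.seg.Segmentation(
        source_images=source_images,
        pixel_array=mask,
        segmentation_type=hd.seg.SegmentationTypeValues.BINARY,
        segment_descriptions=[segment],
        series_instance_uid=hd.UID(),
        series_number=100,
        sop_instance_uid=hd.UID(),
        instance_number=1,
        manufacturer="ExampleVendor",
        manufacturer_model_name="demo-model",
        software_versions="0.1",
        device_serial_number="0000",
    )
    seg.save_as("liver_seg.dcm")

The appeal of emitting a standard SEG rather than a vendor-specific overlay is that any viewer which supports the standard can display it, which ties back to the interoperability point above.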
Excellent article, but few people will want to believe it. From my view on the ground interacting with colleagues, the overwhelming belief among medical personnel is that AI is coming and that Hinton will, in fact, be right; meanwhile, the old-timers who never accepted EMRs are skeptical about AI and say it won't work, but only for vague reasons, so their logic is sort of invalid. The problem is the big lie equating deep learning with AI. Deep learning is not AI and never will be. It is incapable of causal reasoning. You can force solutions via brute strength (to wit, self-driving applications and possibly CXRs and Pap smears, etc.), but until real AI is developed (e.g., my work, but I won't give a narcissistic plug for it) you will need human doctors. No one will want to read or believe what you wrote above, but it is true.
Very true Howard, almost no one wants to hear this, but "speak the truth, even if your voice shakes" is sort of my motto.
Yes, a lot of doctors and radiologists are already preparing for AI, especially younger ones.
Causality is important in medical imaging... but what I stress is the 'mechanistic/physical modeling' aspect. In other words, interpreting things physically -- understanding how fluid may move around under the influence of gravity, or how tissues change appearance after IV contrast injection. Humans have a physical model of how different parts of the body deform and interact with each other (for instance, how plaque can restrict blood flow, leading to upstream effects). This holistic, high-level knowledge of "what's actually going on" seems important for robust interpretation of images that may be corrupted or noisy. My impression is that deep learning systems, at least right now, lack such abilities.
There is a term that describes "mechanistic/physical modeling" -- "causality" :)
(Approximately 600K years ago, give or take a few hundred thousand years, a small set of mutations allowed causality (plus psychosis) to emerge in pre-modern humans, something that does *not* exist in chimps, in other animals, or in deep learning.)
Very thorough and well-considered article, Dan. While it's good to be excited about the potential of AI for radiology, we need to be measured and realistic concerning its present potential.
When the next disruptive breakthrough occurs, we can then ratchet up our expectations. Keep us posted!
The Thailand retinopathy paper and the BMJ review seem to validate the hype. The Thailand paper ran into problems based on serious material constraints. They had poor-quality images because they didn't have time for the 60-second waiting period between eyes, because they couldn't turn off the lights in the imaging room (it was shared with other services), or because the camera equipment was broken(!). A Western ophthalmology clinic would not have these issues.
The BMJ article says that 2/36 models are better than a single radiologist for routine mammogram evaluations! The market is young and full of failures, but there are AI products that can already eat a good slice of the radiology pie. To capture most of the market, it doesn't need to be better than the best radiologists, only the median.
I agree AI is breaking through in mammography, but what has been striking to me is how long it has taken. The amount of effort that has gone into AI for mammography has been monumental, yet a bunch of reviews from just the past year have all stressed that it still isn't quite ready. All of this is happening against a backdrop of fierce debate about how mammography screening should be conducted, which shows there is a lot of room for improvement in how screening is done (there are a lot of false negatives and false positives currently). So I hope AI can make a difference soon.
I'm not down on applications of AI in medical imaging looking out a few years; my point here is just that it's overhyped *right now*. I think AI will break through for mammography soon, and I'm also excited about AI being used for cancer risk prediction in mammography, which will help physicians determine if and when to order a follow-up screening scan. See https://www.science.org/doi/10.1126/scitranslmed.aba4373 https://www.nature.com/articles/s41591-021-01599-w https://news.mit.edu/2021/robust-artificial-intelligence-tools-predict-future-cancer-0128
Another point is that comparing to radiologists is tricky. Most studies only compare against 1-3 radiologists, who are not necessarily a good sample - radiologists vary enormously in their sensitivity/specificity based on their years of experience and specialization. So with mammography AI flirting with human level, it's not surprising a few studies did better than "a single radiologist". But you are absolutely right - technically it doesn't have to be better than the median radiologist to bring value, although people do generally want to see that.
Regarding the retinopathy paper, I don't see how it validates the hype, but your point otherwise seems valid. I still think it is striking that one of the companies with the top AI talent (Alphabet / Verily) had so much trouble. It leaves me a bit nervous for all the startups that don't have top talent developing their models.
I think our disagreement is just on the level of hype. I’ve met multiple med students pursuing radiology, and they’ve been totally blasé about the chance their job prospects get markedly worse over the next 20+ years, so their hype seems too low. But amongst tech/admin circles you’re probably well calibrated.
It seems most overhyped among tech circles, yes, such as among data scientists and ML engineers. But I also think it's overhyped in the mind of the random person on the street. A lot of people have seen ads from IBM about Watson, and maybe from other companies like Microsoft too, and gotten the impression that AI is being used by doctors regularly, that "AI is transforming healthcare", and that all their data is being fed into some sort of AI system.
What's more likely: that 2/36 of the models are better than humans, or that 2/36 of the papers are fraudulent? Or, more charitably, that those papers were written in a way that is very favorable to the models' performance.
Excellent post! I think I am more bullish than you given the rapid progress on LLMs; especially when combined with foundation models and vision, I think you might get something approaching a model of the human body. But you've convinced me that radiology is a lot harder to automate than I thought.
Excellent post!
Excellent. As a radiologist working on AI, I find this one of the most honest pieces on this topic I've read in a while. As an example, it takes me a glance to detect a tumor in the liver, but the best deep learning models are only around 60% accurate (roughly speaking), and that is with the deepest networks I can train on the latest GPUs. I am beginning to wonder whether deep learning may have peaked already and we ought to look for new methods.
It's really great to hear your comment!
I have also gotten two wonderful emails from medical doctors (one a radiologist, one a physician) who thanked me for my piece. One of them said they are afraid to share it because everyone around them is so excited about AI. He said it's very much an "emperor has no clothes" type of thing right now.
What you're saying sounds about right. I personally don't think deep learning has peaked - I think it can improve as we add more data and as new techniques are perfected (such as transfer learning and few-shot learning). However, I do think it's massively overhyped *right now*.
The time it takes algorithms to run is absolutely an overlooked part of the equation here. Runtime can be painfully slow if the scan has to be sent to the cloud for processing, which is what companies like Nuance are pushing for. Raw scans in DICOM format are hundreds of megabytes, and the size of scans is increasing over time as scanners push towards thinner slice thicknesses.
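For a rough sense of the numbers (my own back-of-envelope, assuming uncompressed 16-bit 512x512 slices):

    # Approximate uncompressed size of a thin-slice CT series
    num_slices = 600                            # e.g. a chest/abdomen study at ~1 mm slices
    bytes_per_slice = 512 * 512 * 2             # 512 x 512 at 16 bits/pixel -> ~0.5 MB
    print(num_slices * bytes_per_slice / 1e6)   # ~315 MB to push to the cloud per scan

Halve the slice thickness and that number roughly doubles, before you even start running the model.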
People envision AI/ML/DL as being very fast, and in principle it can be fast, but it requires an eye towards code optimization throughout development, specialized "ML devOPs" talent which is in short supply, and specialized infrastructure to get things to run fast. In my experience speed is an area that many developers, in their rush to commercialize, have overlooked.