I'm going to assume that machine learning in radiology basically followed this rough model:
- An exhaustive data set of radiological scans (of whatever type they worked with) was built.
- This data set obviously didn't just use raw scans; the scans were annotated with tags identifying the various anomalies and outcomes the actual patients had. I.e., if something was found to actually be a tumor when removed, it was tagged as a tumor in the data set.
- The machine learning models were trained on this very large data set until they could be presented with any of the images, stripped of their tags, and identify what they were supposed to identify with high accuracy.
- The models were then tested on novel untagged scans they had NOT been trained on, to see if their accuracy held up.
- Once a sufficient level of accuracy was achieved, it became a tool that radiologists can now use to augment their own training--because sometimes the model catches something they would have missed. But sometimes it flags something they passed over because it really was nothing. So the radiologist has to be able to use their own judgment both to recognize their own false negatives and to exclude the model's false positives.
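
In code terms, I'm picturing something like this toy sketch (made-up feature vectors standing in for the scans, and scikit-learn standing in for whatever they actually use, so every name and number here is just illustrative):

```python
# Toy sketch of the supervised pipeline described above.
# Fake "scan features" stand in for real images; labels stand in for
# the outcomes that were eventually confirmed (1 = tumor, 0 = clean).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# 1. Build a tagged data set: each row is a (pretend) scan, each label
#    is what the follow-up ultimately showed.
X = rng.normal(size=(1000, 20))          # 1000 scans, 20 features each
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = tumor, 0 = clean (made up)

# 2. Hold some scans out so the model is later tested on images it never saw.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 3. Train on the tagged portion.
model = LogisticRegression().fit(X_train, y_train)

# 4. Check whether accuracy holds up on the novel, untagged scans.
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

The real thing trains a deep network on the pixel data rather than a logistic regression on hand-made features, but the train / hold-out / test shape is what I'm describing above.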
Somewhat close on that?
Not far off.
It's worth noting that within narrowly defined contexts, ML models already diagnose with a higher degree of accuracy than radiologists. There are obvious "real-world" hurdles, such as getting medical boards on the bandwagon and getting insurance companies to pay for such tests.
But even setting all that aside, we're not yet in Radiology Utopia, for purely AI-related reasons.
For one thing, ground truth for radiology images is hard to come by. brad and utee will know this, but for anybody unfamiliar, "ground truth" just means the objective, brute, real fact about something, no matter who thinks differently. In this case, we're usually dealing with some kind of classification algorithm, i.e., the model spits out a label: "You have a tumor, bruh" or "Nah, ur good." It can do this, as brad mentioned, because it has trained on gobs and gobs of data that was labeled for it, before it was given the task of deciding for itself about an unlabeled image. The first and obvious question is, who decides how the training images are labeled? In this case.....radiologists did. But wait, how can AI surpass humans if it learned everything it knows from fallible humans? Thing is,
real ground truth for radiology images can only be verified with biopsies, lab tests, and other things that are often out of the question. You can't biopsy somebody with a healthy image to prove they're healthy, and much of the time you can't biopsy somebody whose image suggests a problem, for various reasons, both medical and ethical. That's just one of the hurdles. The solution, in most cases I researched, was that ground truth was established--both for the training sets and for the test sets--by a group of radiologists. The idea is that multiple heads are better than one, and indeed, while a radiologist can and does misdiagnose an image, the chances go way down if a group of them reaches a consensus. When a model is evaluated for published research, it's usually being weighed against that human consensus (and the stats describe how it performs compared to individual doctors, not the collective). So ground truth in this case is not like teaching an algorithm to classify images of cats and dogs, where the training labels can be taken as a given.
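
To make that concrete, here's a toy sketch of what "consensus as ground truth" looks like (three pretend radiologists, made-up reads, and a made-up model, so none of the numbers mean anything):

```python
# Toy sketch: a panel of radiologists each labels the same scans,
# the majority vote becomes the reference label, and the model is
# scored against that consensus rather than against any one reader.
import numpy as np

rng = np.random.default_rng(1)
n_scans = 10

# Each row is one radiologist's reads (1 = tumor, 0 = clean) -- made-up data.
panel = rng.integers(0, 2, size=(3, n_scans))

# Majority vote across the panel becomes the working "ground truth".
consensus = (panel.sum(axis=0) >= 2).astype(int)

# A (pretend) model's calls on the same scans.
model_calls = rng.integers(0, 2, size=n_scans)

print("model vs. panel consensus:", (model_calls == consensus).mean())

# For comparison: how often each individual reader matches the consensus.
for i, reads in enumerate(panel, start=1):
    print(f"radiologist {i} vs. consensus:", (reads == consensus).mean())
```

The point is just the structure: the majority vote becomes the reference label, and the model and each individual reader both get scored against it.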
Another problem is that AI doesn't yet know which tests are appropriate for which kinds of findings, and it's more convoluted than it sounds, though I would think that part will eventually get sorted out. Yet another problem is with rare tumors or conditions. AI will never perform well when it doesn't have much data to train on, while a radiologist with years of experience and thousands of hours of med school has an advantage with rare cases. A human can be shown a few images of a rare condition and begin nailing it pretty quickly. An ML model can't.
There's a lot more to go into, but I'm probably boring you to tears.
The only thing I'd push back on is the bit about radiologists augmenting their training with AI. They will certainly augment their practice, but for them to train in the first place, I don't know that they'll ever use anything but their eyes and brains. Radiologists learn to do what they do very similarly to AI: a radiologist is shown hundreds of thousands of normal, disease-free images before they're ever shown any kind of pathology. That's because they have to be able to identify normal images in their sleep, so that when they see an image with a problem, it jumps out at them. I'd assume that using AI to help them train would work against the very skill they're trying to develop. But that's probably a better question for my medical wife.