When it comes to the diagnosis of pigmented skin lesions, artificial intelligence is superior to humans. In a study conducted under the supervision of MedUni Vienna, human experts “competed” against computer algorithms. The algorithms achieved clearly better results, yet their current abilities cannot replace humans. The results were published in the journal “The Lancet Oncology”.
The International Skin Imaging Collaboration (ISIC) and MedUni Vienna organized an international challenge to compare the diagnostic skills of 511 physicians with those of 139 computer algorithms (from 77 different machine learning labs). A database of more than 10,000 images, established by the team led by Harald Kittler at the Department of Dermatology of MedUni Vienna in cooperation with the University of Queensland (Australia), was used as a training set for the machines. This database includes benign pigmented lesions (moles, sun spots, senile warts, angiomas and dermatofibromas) and malignant ones (melanomas, basal cell carcinomas and pigmented squamous cell carcinomas).
Each participant had to diagnose 30 randomly selected images out of a test set of 1,511 images. The result was unequivocal. While the best humans diagnosed 18.8 out of 30 cases correctly, the best machines achieved 25.4 correct diagnoses. This did not surprise first author Philipp Tschandl from MedUni Vienna: “Two thirds of all participating machines were better than humans; this result had been evident in similar trials during the past years.”
Not a substitute for human beings
Although the algorithms were clearly superior in this experiment, this does not mean that the machines will replace humans in the diagnosis of skin cancer. Philipp Tschandl: “The computer only analyzes an optical snapshot and is really good at it. In real life, however, the diagnosis is a complex task. Physicians usually examine the entire patient and not just single lesions. When humans make a diagnosis, they also take additional information into account, such as the duration of the disease, whether the patient is at high or low risk, and the age of the patient, none of which was provided in this study.”
Despite the impressive performance of artificial intelligence, there is still room for improvement. The machines were significantly less accurate in diagnosing lesions that came from centres that had not provided training images.
With regard to human performance, experience was important. The most experienced participants, those with at least ten years of experience in the diagnosis of pigmented skin lesions, performed best.