In a collaboration between MIT and the Dana-Farber Cancer Institute, researchers have used machine learning to tackle a difficult problem in cancer treatment. For a small group of patients, the site where the cancer originated cannot be determined, which makes it difficult to choose the right treatment. The researchers developed a computational model that predicts where such a tumor likely originated, which could help doctors select more targeted therapies.
Many cancer drugs are tailored to a specific type of cancer, which is what makes them effective. However, in approximately 3 to 5 percent of cases where cancer has spread throughout the body, it’s hard to determine where the cancer originated. These tumors are classified as cancers of unknown primary (CUP). They have long been a challenge for oncologists, because without a known origin the treatment options are limited.
The MIT and Dana-Farber team built a computational model that analyzes the sequences of around 400 genes commonly associated with cancer. Using machine learning, the model predicts a tumor’s site of origin from these gene sequences. Applied to tumors of unknown origin, it produced a high-confidence classification for about 40 percent of them. This opens up the possibility of guiding treatment by the predicted origin.
The model’s potential impact on treatment decisions is significant. By pointing doctors toward therapies matched to the predicted cancer type, it could widen the options available to CUP patients.
The researchers trained the model, called OncoNPC, on a large dataset of genetic sequences from nearly 30,000 patients with 22 types of cancer. On tumors it had not seen during training, OncoNPC predicted the origin with 80 percent accuracy. For high-confidence predictions, accuracy rose to approximately 95 percent.
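The difference between overall accuracy and accuracy on high-confidence predictions can be illustrated with a minimal sketch. It assumes a classifier that outputs one probability per cancer type; the function name, the 0.5 threshold, and the toy data are hypothetical and are not taken from OncoNPC.

```python
from typing import Sequence


def accuracy_by_confidence(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    threshold: float,
) -> tuple[float, float, float]:
    """Return (overall accuracy, high-confidence accuracy, high-confidence fraction).

    probs[i][k] is the predicted probability that sample i belongs to class k.
    A prediction counts as high-confidence when its top probability is at
    least `threshold` (an illustrative cutoff, not the one used in the study).
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equally long")

    correct = 0
    confident = 0
    confident_correct = 0
    for row, label in zip(probs, labels):
        # The predicted class is the one with the highest probability.
        predicted = max(range(len(row)), key=row.__getitem__)
        hit = predicted == label
        correct += hit
        if row[predicted] >= threshold:
            confident += 1
            confident_correct += hit

    overall = correct / len(labels)
    high_conf = confident_correct / confident if confident else float("nan")
    return overall, high_conf, confident / len(labels)


# Toy example: four tumors, three hypothetical cancer types.
toy_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
    [0.30, 0.30, 0.40],
]
toy_labels = [0, 1, 1, 2]
overall, high_conf, fraction = accuracy_by_confidence(toy_probs, toy_labels, 0.5)
print(overall, high_conf, fraction)
```

Raising the threshold generally trades coverage for reliability: fewer tumors receive a confident call, but the calls that remain are more often correct.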
The researchers then applied the model to a dataset of around 900 tumors from CUP patients at Dana-Farber. The model made a high-confidence prediction of the origin for 40 percent of these tumors.
Because the true origin of a CUP tumor is unknown, the predictions were checked against germline mutation analysis, which reveals inherited genetic predispositions to specific cancers. Encouragingly, the model’s predictions tended to match the cancer type most strongly predicted by the germline mutations.
In addition to prediction accuracy, the model showed promise in terms of clinical impact. The survival times of CUP patients were consistent with the typical prognosis of the cancer type the model predicted: patients predicted to have poor-prognosis cancers had shorter survival times. Patients whose treatment was consistent with the cancer type the model predicted fared better than those who received treatment typically given for a different cancer type.
One of the most promising results is that the model identified an additional 15 percent of patients (a 2.2-fold increase) who could have benefited from existing targeted treatments if their cancer type had been known. This finding could open the door to wider use of precision therapies, making fuller use of treatments that already exist.
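The relationship between the two figures can be made explicit with a little arithmetic. The sketch below assumes that "an additional 15 percent" means 15 percentage points of the patient cohort, and that "2.2-fold" is the ratio of the share eligible with the model to the share eligible without it. Both readings are assumptions, and the function and variable names are illustrative.

```python
def implied_shares(additional_points: float, fold_increase: float) -> tuple[float, float]:
    """Solve baseline + additional = fold * baseline for the baseline share.

    Returns (baseline share, share with the model), both in percent of patients.
    """
    if fold_increase <= 1.0:
        raise ValueError("fold_increase must be greater than 1")
    baseline = additional_points / (fold_increase - 1.0)
    return baseline, baseline + additional_points


# Figures quoted in the article; the interpretation is an assumption.
baseline_share, with_model_share = implied_shares(15.0, 2.2)
print(f"{baseline_share:.1f}% -> {with_model_share:.1f}%")
```

Under these assumptions, the share of patients eligible for an existing targeted treatment would rise from roughly 12.5 percent to roughly 27.5 percent.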
Looking to the future, the researchers plan to improve their model by incorporating additional data modalities, such as pathology and radiology images. With a fuller view of each tumor, the model could not only sharpen its predictions but also help guide treatment choices, bringing more personalized care to patients whose cancer has no known origin.
For more information, check out the research paper and MIT blog.