Big data analytics is having a significant positive impact on the healthcare industry and holds promise for a wide range of medical applications. Big data applications are enabling researchers and clinicians to exploit data generated by healthcare systems globally. In this interview, Dr. Ravi Gupta, Vice President, Bioinformatics, at MedGenome Labs, India’s leading genomics and clinical data-driven diagnostics and drug discovery research company, shares how the genetic testing landscape is rapidly evolving with the introduction of new and advanced technologies that allow for more efficient and affordable solutions. Read the full interview.
How are genomic companies shaping the diagnostic landscape with the use of data-driven technologies?
The genetic testing landscape is rapidly evolving with the introduction of new and advanced technologies that allow for more efficient and affordable solutions. As a result, these advancements are rapidly generating large and complex volumes of genetic data. Advanced bioinformatics and AI/ML techniques are being applied to extract meaning from existing data and transform how we use it. The applications are vast, ranging from personalized therapy and drug design to population screening and electronic health record mining. There are several diseases that can only be identified through genetic testing. Today, some cancer therapeutics are prescribed only after a companion diagnostic test has been completed.
Machine Learning (ML) techniques can be used to identify disease patterns, classify patients with ease, develop clinical forecasting models, and inspect the impact of medication on disease, in addition to boosting the speed of analysis. Gene sequencing techniques provide crucial insights into an individual’s current and future health that may have implications for family members as well, saving valuable time in clinical decision-making and devising health and medication plans. In a nutshell, personalized medicine complemented by AI/ML techniques has the potential to revolutionize healthcare. We are now looking at multi-omics testing and further lowering data analysis costs.
How is big data helping MedGenome in collecting patient data? Can you cite an example where big data helped MedGenome solve a problem?
Large genomic data is helping us develop better interpretation systems. With the help of both population and clinical data we can make better sense of the diagnostics data, and at the same time it helps us fine-tune our recommendations for the patient. Using large genomics data, we could build an India-centric or South Asia-centric machine learning model for identifying actionable and/or pathogenic variants in clinical samples. We are implementing a few approaches for our big data journey. As the first step, we are working to structure and clean up the genetic and clinical metadata. We are spending time on understanding the genotype-phenotype relationship to understand the underlying disease better. We are also investing in acquiring data through regulated and controlled studies to cover different types of diseases. All these steps will help MedGenome understand the disease better and offer a solution to patients.
Can you explain the role of big data in diagnostics? How is it transforming the medical diagnostics sector in India?
The patient healthcare process, including record keeping, patient diagnostics and care, compliance, and regulatory requirements, generates a large amount of structured and unstructured data. Big data analytical methodologies that integrate all of this patient data are enabling more accurate diagnostics by providing better insights, while also supporting better management and treatment of patients at lower cost and higher quality. Big data analytics is having a significant positive impact on the healthcare industry and holds promise for a wide range of medical applications, including patient decision support systems, disease surveillance, and population health. Big data applications are enabling researchers and clinicians to exploit data generated by healthcare systems globally.
With the emergence of better communication networks and the rapid digitisation of patient records and clinical data, the medical diagnostics field is rapidly transforming. The Ayushman Bharat Digital Mission is one such big step taken by the government of India to transform the way healthcare services are delivered to citizens. Under this mission the government of India has developed an ecosystem that encourages many new-age companies to develop their solutions and provide better healthcare services. One of the best examples of big data in action is how COVID-19 was handled in India. The Indian Council of Medical Research (ICMR) quickly developed a national COVID-19 testing data management tool that holds more than 550 million records. This centralized data management and analytical tool helped in better surveillance and management of COVID-19 in the country.
What are the trends today that encourage the healthcare industry to embrace big data?
With the emergence of new and better deep learning (DL) algorithms and their successful application in various diagnostic fields, the healthcare industry is moving towards AI-based methods to make more accurate decisions. Further, there is a significant rise in digital health applications, specifically within doctor and clinical settings. Clinicians are using the latest digital technologies to build better relationships with patients and provide better care. Many hospitals and healthcare organizations are actively exploring AI and big data solutions in many areas, including radiology, genomics, and telemedicine.
How can AI outperform humans in the interpretation of medical images from various diseases? Can you discuss the various applications and challenges?
Accurate interpretation of radiology images plays an important role in clinical diagnosis and treatment planning. In the last few years there have been many studies that show how AI can outperform humans in the interpretation of medical images from various diseases. Chest X-rays, with over 2 billion scans worldwide every year, are one of the most used medical scans for the diagnosis of several thoracic diseases. CheXNet, a deep learning program developed by Rajpurkar et al. in 2017, uses a 121-layer Convolutional Neural Network (CNN) trained on a large publicly available chest X-ray dataset with over 100,000 records covering 14 diseases, and can detect pneumonia from chest X-rays better than trained radiologists. The sensitivity of manual identification of cancerous pulmonary nodules by the clinical community has not been satisfactory and ranges from 36% to 84% depending upon tumor size and cohort. Recently a deep neural network (DNN) was applied to detect cancerous pulmonary nodules from chest X-rays (Nam et al., 2018). This method yielded much better results than manual identification, and its overall performance was better than that of 16 out of 18 clinicians. The clinicians who performed better than the AI method had over 13 years of experience. Another interesting application of AI has been the identification of bone fractures from images: for example, using a DNN method for wrist fracture detection increased accuracy from 81% to 92% and reduced misinterpretation by 47% compared with human interpretation alone (Lindsey et al., 2018).
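Sensitivity, the metric quoted above for nodule detection, is the fraction of truly positive cases that a reader (human or model) correctly flags. A minimal sketch of the calculation follows; the function name and the counts are hypothetical and chosen only for illustration, not taken from any of the cited studies.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly positive cases that were correctly flagged."""
    total_positives = true_positives + false_negatives
    if total_positives == 0:
        raise ValueError("no positive cases to evaluate")
    return true_positives / total_positives


# Hypothetical counts for illustration only (not from any study):
# 100 scans contain a nodule; a reader catches 84 and misses 16.
reader_sensitivity = sensitivity(true_positives=84, false_negatives=16)
print(f"sensitivity = {reader_sensitivity:.2f}")
```

A reader who catches 36 of those 100 nodules would sit at the low end of the range mentioned above, which is why sensitivity is usually reported alongside the cohort and tumor size it was measured on.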
Apart from radiology, big data analysis is also widely used in genomics. MedGenome applies a big data analysis approach to identify causal variants accurately and rapidly from the patient’s genomics data. Identifying causal variants in large genomics data is a needle-in-a-haystack problem. By building a machine learning model trained on a large dataset, we can speed up the identification of causal variants and at the same time do it accurately. Big data approaches are also used to handle volumes of genomics data that are too large for traditional methodologies. Recently, deep learning models have also been used for identifying early cancer signals in healthy adults.
Most of the available models are trained on Western populations. This can lead to biased results. There is a need to develop population-specific computational models for better and unbiased prediction.
How can big data enable diagnostics players to expand their services?
Big data methodologies are playing a crucial role in developing the next level of services and applications and, at the same time, improving our existing manual processes. Using the predictive power of big data, the preventive healthcare industry can lower the cost of treatment significantly. Using data-driven forecasting systems, hospitals can streamline patient care processes, which can lead to better utilization and planning of resources. Wearable devices and mobile phones are further helping to expand diagnostic services. The continuous monitoring of data from wearable devices has led to the growth of preventive medicine applications in which the clinician can monitor health parameters in real time. Such devices are saving many lives, including by predicting cardiac arrests early.
What are the latest technologies that are being used in big data for the diagnostics sector?
Big data technologies can be grouped into four different types: data storage, data mining, data analytics, and data visualization. In medical imaging the most popular framework used is Hadoop. The Hadoop framework helps in processing and storing large amounts of different kinds of data efficiently. It consists of three specific components: the Hadoop Distributed File System (HDFS), MapReduce, and Yet Another Resource Negotiator (YARN). HDFS is used for storing the data efficiently, MapReduce helps in faster processing of the data, and YARN handles efficient allocation of the resources required to process data and manage the cluster. To overcome some of the challenges faced by the Hadoop framework, the Apache Spark framework was developed at the AMPLab at the University of California, Berkeley. Spark supports several programming languages, including Python, Java, R, and Scala. Several deep learning-based algorithms have been developed in the last few years and are now being applied in diagnostics, especially in radiology and genomics.
How can big data give more scope to precision medicine? Can you give an example?
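The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is a toy, single-machine illustration of the map, shuffle, and reduce phases, not the Hadoop or Spark API; the record layout, gene names, and function names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical records: (sample_id, gene) pairs, standing in for rows
# that a real cluster would read from many distributed storage blocks.
RECORDS = [
    ("S1", "BRCA1"),
    ("S1", "TP53"),
    ("S2", "BRCA1"),
    ("S3", "TP53"),
    ("S3", "BRCA1"),
]


def map_phase(records: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, int]]:
    """Emit one (gene, 1) pair per record."""
    for _sample_id, gene in records:
        yield gene, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_variants_per_gene(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    return reduce_phase(shuffle_phase(map_phase(records)))


counts = count_variants_per_gene(RECORDS)
print(counts)
```

In a real deployment the map and reduce steps would run in parallel on different nodes, with the framework handling the shuffle and the scheduling; the sketch only shows the shape of the computation.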
The electronic patient health record (EHR), which contains both structured and unstructured data, is a source of big data. It contains information related to socio-demographics, medical conditions, genetics, and treatments. Big data techniques help in developing computer models that help the clinician organize the data, recognize patterns, interpret results, and set thresholds for actions. The EHR system will become an integral part of clinical decision support systems. This is already the practice in many Western countries for many clinical decisions. It also increases the productivity of clinicians by reducing mundane jobs. Big data for health will play an important role in pharmacogenetics and stratified healthcare for precision medicine. Patients with a similar cancer subtype often respond differently when challenged with the same chemotherapeutics. Precision medicine is already becoming a part of regular diagnostics, where a patient is tested for certain gene mutations before a drug is prescribed and/or a clinical decision is made. For example, CYP2D6 polymorphism is associated with response to Tamoxifen, BRAF mutations (Y472C) have been linked to Dasatinib response in non-small cell lung cancer, and many more such genes have recently been associated with the response of rectal cancer to chemoradiotherapy.
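The idea of checking a patient's test results against known gene-drug associations before a prescribing decision can be sketched as a simple lookup. This is an illustrative toy only, not a clinical tool: the table holds just the two associations named above, and the data layout, function name, and example patient are hypothetical.

```python
from typing import Dict, List, Tuple

# Associations named in the interview; illustrative only,
# not a curated clinical knowledge base.
ASSOCIATIONS: Dict[Tuple[str, str], str] = {
    ("CYP2D6", "polymorphism"): "Tamoxifen",
    ("BRAF", "Y472C"): "Dasatinib",
}


def flag_for_review(patient_variants: List[Tuple[str, str]]) -> List[str]:
    """Return a note for each patient variant with a listed drug association."""
    notes = []
    for gene, variant in patient_variants:
        drug = ASSOCIATIONS.get((gene, variant))
        if drug is not None:
            notes.append(f"{gene} {variant}: associated with response to {drug}")
    return notes


# Hypothetical patient with one listed and one unlisted variant.
print(flag_for_review([("BRAF", "Y472C"), ("TP53", "R175H")]))
```

A production decision support system would draw on a maintained knowledge base and leave the final call with the clinician; the sketch only shows how test results can be matched against known associations.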