
Doctor Penguin Weekly

Welcome to the thirteenth week of the Doctor Penguin newsletter! This week, the following papers caught our attention:

1. Natarajan et al. validate the performance of an offline automated analysis algorithm for Diabetic Retinopathy that runs directly off a smartphone.

2. Banerjee et al. develop a patient-specific risk score for pulmonary embolism diagnosis using ML on longitudinal clinical data and evaluate it in both multi-institutional inpatient and outpatient settings.

3. Coley et al. train a neural network to predict the transformation rules most applicable to a target molecule based on its molecular structure, in order to design and validate synthetic pathways for complex organic molecules.

4. Mercan et al. develop machine learning methods for the automated diagnosis of preinvasive and invasive lesions of the breast.

5. Engelhard et al. identify environments and environmental features associated with smoking in an effort to optimize a smoker’s environment during a quit attempt or to study environmental correlates of other behaviors.

6. Bera et al. review AI approaches for digital pathology, providing a broad framework for incorporating ML tools into clinical oncology. 
-- Eric Topol & Pranav Rajpurkar  

Quick Links:

Diagnostic Accuracy of Community-Based Diabetic Retinopathy Screening With an Offline Artificial Intelligence System on a Smartphone.


In JAMA Ophthalmology

Importance : Offline automated analysis of retinal images on a smartphone may be a cost-effective and scalable method of screening for diabetic retinopathy; however, to our knowledge, assessment of such an artificial intelligence (AI) system is lacking.

Objective : To evaluate the performance of Medios AI (Remidio), a proprietary, offline, smartphone-based, automated system of analysis of retinal images, to detect referable diabetic retinopathy (RDR) in images taken by a minimally trained health care worker with Remidio Non-Mydriatic Fundus on Phone, a smartphone-based, nonmydriatic retinal camera. Referable diabetic retinopathy is defined as any retinopathy more severe than mild diabetic retinopathy, with or without diabetic macular edema.

Design, Setting, and Participants : This prospective, cross-sectional, population-based study took place from August 2018 to September 2018. Patients with diabetes mellitus who visited various dispensaries administered by the Municipal Corporation of Greater Mumbai in Mumbai, India, on a particular day were included.

Interventions : Three fields of the fundus (the posterior pole, nasal, and temporal fields) were photographed. The images were analyzed by an ophthalmologist and the AI system.

Main Outcomes and Measures : To evaluate the sensitivity and specificity of the offline automated analysis system in detecting referable diabetic retinopathy on images taken on the smartphone-based, nonmydriatic retinal imaging system by a health worker.

Results : Of 255 patients seen in the dispensaries, 231 patients (90.6%) consented to diabetic retinopathy screening. The major reasons for not participating were unwillingness to wait for screening and the blurring of vision that would occur after dilation. Images from 18 patients were deemed ungradable by the ophthalmologist and hence were excluded. In the remaining 213 participants (110 female patients [51.6%] and 103 male patients [48.4%]; mean [SD] age, 53.1 [10.3] years), the sensitivity and specificity of the offline AI system in diagnosing referable diabetic retinopathy were 100.0% (95% CI, 78.2%-100.0%) and 88.4% (95% CI, 83.2%-92.5%), respectively, and in diagnosing any diabetic retinopathy were 85.2% (95% CI, 66.3%-95.8%) and 92.0% (95% CI, 87.1%-95.4%), respectively, compared with ophthalmologist grading using the same images.

Conclusions and Relevance : These pilot study results show promise in the use of an offline AI system in community screening for referable diabetic retinopathy with a smartphone-based fundus camera. The use of AI would enable screening for referable diabetic retinopathy in remote areas where services of an ophthalmologist are unavailable. This study was done on patients with diabetes who were visiting a dispensary that provides curative services to the population at the primary level. A study with a larger sample size may be needed to extend the results to general population screening, however.
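The headline numbers above are sensitivity and specificity with 95% CIs. As a rough illustration of how such figures are derived from a screening study's confusion matrix, here is a minimal Python sketch using the Wilson score interval (the paper's exact CI method is not stated here, and the counts below are hypothetical, not the study's data):

```python
import math

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score confidence interval for a binomial proportion."""
    if n == 0:
        return (0.0, 0.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, centre - margin), min(1.0, centre + margin))

def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts for illustration only (not the study's raw data):
tp, fn, tn, fp = 15, 0, 175, 23
sens, spec = sens_spec(tp, fn, tn, fp)
lo, hi = wilson_ci(tp, tp + fn)  # CI for sensitivity
```

Note that with zero false negatives the point estimate is 100% but the interval's lower bound stays well below it, which is why small case counts produce the wide CIs seen in this pilot.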

Natarajan Sundaram, Jain Astha, Krishnan Radhika, Rogye Ashwini, Sivaprasad Sobha


Development and Performance of the Pulmonary Embolism Result Forecast Model (PERFORM) for Computed Tomography Clinical Decision Support.


In JAMA Network Open

Importance : Pulmonary embolism (PE) is a life-threatening clinical problem, and computed tomographic (CT) imaging is the standard for diagnosis. Clinical decision support rules based on PE risk-scoring models have been developed to compute pretest probability but are underused and tend to underperform in practice, leading to persistent overuse of CT imaging for PE.

Objective : To develop a machine learning model to generate a patient-specific risk score for PE by analyzing longitudinal clinical data as clinical decision support for patients referred for CT imaging for PE.

Design, Setting, and Participants : In this diagnostic study, the proposed workflow for the machine learning model, the Pulmonary Embolism Result Forecast Model (PERFORM), transforms raw electronic medical record (EMR) data into temporal feature vectors and develops a decision analytical model targeted toward adult patients referred for CT imaging for PE. The model was tested on holdout patient EMR data from 2 large, academic medical practices. A total of 3397 annotated CT imaging examinations for PE from 3214 unique patients seen at Stanford University hospitals and clinics were used for training and validation. The models were externally validated on 240 unique patients seen at Duke University Medical Center. The comparison with clinical scoring systems was done on 100 randomly selected outpatient samples from Stanford University hospitals and clinics and 101 outpatient samples from Duke University Medical Center.

Main Outcomes and Measures : Prediction performance of diagnosing acute PE was evaluated using ElasticNet, artificial neural networks, and other machine learning approaches on holdout data sets from both institutions, and performance of models was measured by area under the receiver operating characteristic curve (AUROC).

Results : Of the 3214 patients included in the study, 1704 (53.0%) were women from Stanford University hospitals and clinics; mean (SD) age was 60.53 (19.43) years. The 240 patients from Duke University Medical Center used for validation included 132 women (55.0%); mean (SD) age was 70.2 (14.2) years. In the samples for clinical scoring system comparisons, the 100 outpatients from Stanford University hospitals and clinics included 67 women (67.0%); mean (SD) age was 57.74 (19.87) years, and the 101 patients from Duke University Medical Center included 59 women (58.4%); mean (SD) age was 73.06 (15.3) years. The best-performing model achieved an AUROC performance of predicting a positive PE study of 0.90 (95% CI, 0.87-0.91) on intrainstitutional holdout data with an AUROC of 0.71 (95% CI, 0.69-0.72) on an external data set from Duke University Medical Center; superior AUROC performance and cross-institutional generalization of the model of 0.81 (95% CI, 0.77-0.87) and 0.81 (95% CI, 0.73-0.82), respectively, were noted on holdout outpatient populations from both intrainstitutional and extrainstitutional data.

Conclusions and Relevance : The machine learning model, PERFORM, may consider multitudes of applicable patient-specific risk factors and dependencies to arrive at a PE risk prediction that generalizes to new population distributions. This approach might be used as an automated clinical decision-support tool for patients referred for CT PE imaging to improve CT use.
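PERFORM's models are compared by AUROC. For readers unfamiliar with the metric, it equals the Mann-Whitney statistic: the probability that a randomly chosen positive case is scored above a randomly chosen negative case. A minimal sketch on toy data (not the study's; real pipelines would typically use a library such as scikit-learn):

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney statistic: the probability that a random
    positive outscores a random negative, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy labels and model scores, for illustration only:
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
value = auroc(labels, scores)  # 11/12, since one positive ranks below one negative
```

This O(P*N) formulation is fine for small examples; a rank-based version is preferred at scale.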

Banerjee Imon, Sofela Miji, Yang Jaden, Chen Jonathan H, Shah Nigam H, Ball Robyn, Mushlin Alvin I, Desai Manisha, Bledsoe Joseph, Amrhein Timothy, Rubin Daniel L, Zamanian Roham, Lungren Matthew P



In Science

The synthesis of complex organic molecules requires several stages, from ideation to execution, that require time and effort investment from expert chemists. Here, we report a step toward a paradigm of chemical synthesis that relieves chemists from routine tasks, combining artificial intelligence-driven synthesis planning and a robotically controlled experimental platform. Synthetic routes are proposed through generalization of millions of published chemical reactions and validated in silico to maximize their likelihood of success. Additional implementation details are determined by expert chemists and recorded in reusable recipe files, which are executed by a modular continuous-flow platform that is automatically reconfigured by a robotic arm to set up the required unit operations and carry out the reaction. This strategy for computer-augmented chemical synthesis is demonstrated for 15 drug or drug-like substances.

Coley CW, Thomas DA 3rd, Lummiss JAM, Jaworski JN, Breen CP, Schultz V, Hart T, Fishman JS, Rogers L, Gao H, Hicklin RW, Plehiers PP, Byington J, Piotti JS, Green WH, Hart AJ, Jamison TF, Jensen KF



In JAMA Network Open

Importance : Following recent US Food and Drug Administration approval, adoption of whole slide imaging in clinical settings may be imminent, and diagnostic accuracy, particularly among challenging breast biopsy specimens, may benefit from computerized diagnostic support tools.

Objective : To develop and evaluate computer vision methods to assist pathologists in diagnosing the full spectrum of breast biopsy samples, from benign to invasive cancer.

Design, Setting, and Participants : In this diagnostic study, 240 breast biopsies from Breast Cancer Surveillance Consortium registries that varied by breast density, diagnosis, patient age, and biopsy type were selected, reviewed, and categorized by 3 expert pathologists as benign, atypia, ductal carcinoma in situ (DCIS), and invasive cancer. The atypia and DCIS cases were oversampled to increase statistical power. High-resolution digital slide images were obtained, and 2 automated image features (tissue distribution feature and structure feature) were developed and evaluated according to the consensus diagnosis of the expert panel. The performance of the automated image analysis methods was compared with independent interpretations from 87 practicing US pathologists. Data analysis was performed between February 2017 and February 2019.

Main Outcomes and Measures : Diagnostic accuracy defined by consensus reference standard of 3 experienced breast pathologists.

Results : The accuracy of machine learning tissue distribution features, structure features, and pathologists for classification of invasive cancer vs noninvasive cancer was 0.94, 0.91, and 0.98, respectively; the accuracy of classification of atypia and DCIS vs benign tissue was 0.70, 0.70, and 0.81, respectively; and the accuracy of classification of DCIS vs atypia was 0.83, 0.85, and 0.80, respectively. The sensitivity of both machine learning features was lower than that of the pathologists for the invasive vs noninvasive classification (tissue distribution feature, 0.70; structure feature, 0.49; pathologists, 0.84) but higher for the classification of atypia and DCIS vs benign cases (tissue distribution feature, 0.79; structure feature, 0.85; pathologists, 0.72) and the classification of DCIS vs atypia (tissue distribution feature, 0.88; structure feature, 0.89; pathologists, 0.70). For the DCIS vs atypia classification, the specificity of the machine learning feature classification was similar to that of the pathologists (tissue distribution feature, 0.78; structure feature, 0.80; pathologists, 0.82).

Conclusion and Relevance : The computer-based automated approach to interpreting breast pathology showed promise, especially as a diagnostic aid in differentiating DCIS from atypical hyperplasia.

Mercan Ezgi, Mehta Sachin, Bartlett Jamen, Shapiro Linda G, Weaver Donald L, Elmore Joann G



In JAMA Network Open

Importance : Environments associated with smoking increase a smoker's craving to smoke and may provoke lapses during a quit attempt. Identifying smoking risk environments from images of a smoker's daily life provides a basis for environment-based interventions.

Objective : To apply a deep learning approach to the clinically relevant identification of smoking environments among settings that smokers encounter in daily life.

Design, Setting, and Participants : In this cross-sectional study, 4902 images of smoking (n = 2457) and nonsmoking (n = 2445) locations were photographed by 169 smokers from Durham, North Carolina, and Pittsburgh, Pennsylvania, areas from 2010 to 2016. These images were used to develop a probabilistic classifier to predict the location type (smoking or nonsmoking location), thus relating objects and settings in daily environments to established smoking patterns. The classifier combines a deep convolutional neural network with an interpretable logistic regression model and was trained and evaluated via nested cross-validation with participant-wise partitions (ie, out-of-sample prediction). To contextualize model performance, images taken by 25 randomly selected participants were also classified by smoking cessation experts. As secondary validation, craving levels reported by participants when viewing unfamiliar environments were compared with the model's predictions. Data analysis was performed from September 2017 to May 2018.

Main Outcomes and Measures : Classifier performance (accuracy and area under the receiver operating characteristic curve [AUC]), comparison with 4 smoking cessation experts, contribution of objects and settings to smoking environment status (standardized model coefficients), and correlation with participant-reported craving.

Results : Of 169 participants, 106 (62.7%) were from Durham (53 [50.0%] female; mean [SD] age, 41.4 [12.0] years) and 63 (37.3%) were from Pittsburgh (31 [51.7%] female; mean [SD] age, 35.2 [13.8] years). A total of 4902 images were available for analysis, including 3386 from Durham (mean [SD], 31.9 [1.3] images per participant) and 1516 from Pittsburgh (mean [SD], 24.1 [0.5] images per participant). Images were evenly split between the 2 classes, with 2457 smoking images (50.1%) and 2445 nonsmoking images (49.9%). The final model discriminated smoking vs nonsmoking environments with a mean (SD) AUC of 0.840 (0.024) (accuracy [SD], 76.5% [1.6%]). A model trained only with images from Durham participants effectively classified images from Pittsburgh participants (AUC, 0.757; accuracy, 69.2%), and a model trained only with images from Pittsburgh participants effectively classified images from Durham participants (AUC, 0.821; accuracy, 75.0%), suggesting good generalizability between geographic areas. Only 1 expert's performance was a statistically significant improvement compared with the classifier (α = .05). Median self-reported craving was significantly correlated with model-predicted smoking environment status (ρ = 0.894; P = .003).

Conclusions and Relevance : In this study, features of daily environments predicted smoking vs nonsmoking status consistently across participants. The findings suggest that a deep learning approach can identify environments associated with smoking, can predict the probability that any image of daily life represents a smoking environment, and can potentially trigger environment-based interventions. This work demonstrates a framework for predicting how daily environments may influence target behaviors or symptoms that may have broad applications in mental and physical health.
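A methodological point worth highlighting above is the nested cross-validation with participant-wise partitions: all images from a given smoker go into the same fold, so the model is always evaluated on people it has never seen. A minimal sketch of that grouped splitting (hypothetical helper names; the study's actual pipeline is not specified here):

```python
from collections import defaultdict

def participant_folds(participant_ids, n_folds=5):
    """Assign whole participants to folds so no participant's images appear
    in both a training and a test split (out-of-sample prediction)."""
    by_participant = defaultdict(list)
    for idx, pid in enumerate(participant_ids):
        by_participant[pid].append(idx)
    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place the largest participant into the smallest fold.
    for pid, idxs in sorted(by_participant.items(), key=lambda kv: -len(kv[1])):
        min(folds, key=len).extend(idxs)
    return folds

def train_test_splits(participant_ids, n_folds=5):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = participant_folds(participant_ids, n_folds)
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, f in enumerate(folds) if j != k for i in f]
        yield train_idx, test_idx
```

Splitting by image instead of by participant would leak each smoker's recurring locations between train and test and inflate the reported accuracy, which is what this scheme guards against.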

Engelhard Matthew M, Oliver Jason A, Henao Ricardo, Hallyburton Matt, Carin Lawrence E, Conklin Cynthia, McClernon F Joseph


Artificial intelligence in digital pathology - new tools for diagnosis and precision oncology.


In Nature Reviews Clinical Oncology

In the past decade, advances in precision oncology have resulted in an increased demand for predictive assays that enable the selection and stratification of patients for treatment. The enormous divergence of signalling and transcriptional networks mediating the crosstalk between cancer, stromal and immune cells complicates the development of functionally relevant biomarkers based on a single gene or protein. However, the result of these complex processes can be uniquely captured in the morphometric features of stained tissue specimens. The possibility of digitizing whole-slide images of tissue has led to the advent of artificial intelligence (AI) and machine learning tools in digital pathology, which enable mining of subvisual morphometric phenotypes and might, ultimately, improve patient management. In this Perspective, we critically evaluate various AI-based computational approaches for digital pathology, focusing on deep neural networks and 'hand-crafted' feature-based methodologies. We aim to provide a broad framework for incorporating AI and machine learning tools into clinical oncology, with an emphasis on biomarker development. We discuss some of the challenges relating to the use of AI, including the need for well-curated validation datasets, regulatory approval and fair reimbursement strategies. Finally, we present potential future opportunities for precision oncology.

Bera Kaustav, Schalper Kurt A, Rimm David L, Velcheti Vamsidhar, Madabhushi Anant

