
Doctor Penguin Weekly


Welcome to Week #14 of the Doctor Penguin newsletter! This week, the following papers caught our attention:

1. Chen et al. propose an augmented reality microscope (ARM) for real-time AI assistance in cancer diagnosis, demonstrating utility in detection of metastatic breast cancer and prostate cancer.

2. Winkler et al. demonstrate a limitation of current CNNs in the diagnosis of melanoma, showing that skin markings significantly interfered with a CNN’s predicted probability of melanoma.

3. Ruzzo et al. propose a random forest classifier to distinguish true rare de novo variants from data artifacts in WGS data, which often contain mutations that are introduced and propagated during cell line transformation and are unrelated to disease biology.

4. Davis et al. describe a procedure to select a method for updating clinical prediction models in scenarios of population drift (in environments with changing outcome rates, shifting patient populations, and evolving clinical practice).
-- Eric Topol & Pranav Rajpurkar  

Quick Links:

An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis.


In Nature medicine

The microscopic assessment of tissue samples is instrumental for the diagnosis and staging of cancer, and thus guides therapy. However, these assessments demonstrate considerable variability and many regions of the world lack access to trained pathologists. Though artificial intelligence (AI) promises to improve the access and quality of healthcare, the costs of image digitization in pathology and difficulties in deploying AI solutions remain as barriers to real-world use. Here we propose a cost-effective solution: the augmented reality microscope (ARM). The ARM overlays AI-based information onto the current view of the sample in real time, enabling seamless integration of AI into routine workflows. We demonstrate the utility of ARM in the detection of metastatic breast cancer and the identification of prostate cancer, with latency compatible with real-time use. We anticipate that the ARM will remove barriers towards the use of AI designed to improve the accuracy and efficiency of cancer diagnosis.

Chen Po-Hsuan Cameron, Gadepalli Krishna, MacDonald Robert, Liu Yun, Kadowaki Shiro, Nagpal Kunal, Kohlberger Timo, Dean Jeffrey, Corrado Greg S, Hipp Jason D, Mermel Craig H, Stumpe Martin C


Association Between Surgical Skin Markings in Dermoscopic Images and Diagnostic Performance of a Deep Learning Convolutional Neural Network for Melanoma Recognition.


In JAMA dermatology

Importance : Deep learning convolutional neural networks (CNNs) have shown a performance at the level of dermatologists in the diagnosis of melanoma. Accordingly, further exploring the potential limitations of CNN technology before broadly applying it is of special interest.

Objective : To investigate the association between gentian violet surgical skin markings in dermoscopic images and the diagnostic performance of a CNN approved for use as a medical device in the European market.

Design and Setting : A cross-sectional analysis was conducted from August 1, 2018, to November 30, 2018, using a CNN architecture trained with more than 120 000 dermoscopic images of skin neoplasms and corresponding diagnoses. The association of gentian violet skin markings in dermoscopic images with the performance of the CNN was investigated in 3 image sets of 130 melanocytic lesions each (107 benign nevi, 23 melanomas).

Exposures : The same lesions were sequentially imaged with and without the application of a gentian violet surgical skin marker and then evaluated by the CNN for their probability of being a melanoma. In addition, the markings were removed by manually cropping the dermoscopic images to focus on the melanocytic lesion.

Main Outcomes and Measures : Sensitivity, specificity, and area under the curve (AUC) of the receiver operating characteristic (ROC) curve for the CNN's diagnostic classification in unmarked, marked, and cropped images.

Results : In all, 130 melanocytic lesions (107 benign nevi and 23 melanomas) were imaged. In unmarked lesions, the CNN achieved a sensitivity of 95.7% (95% CI, 79%-99.2%) and a specificity of 84.1% (95% CI, 76.0%-89.8%). The ROC AUC was 0.969. In marked lesions, an increase in melanoma probability scores was observed that resulted in a sensitivity of 100% (95% CI, 85.7%-100%) and a significantly reduced specificity of 45.8% (95% CI, 36.7%-55.2%, P < .001). The ROC AUC was 0.922. Cropping images led to the highest sensitivity of 100% (95% CI, 85.7%-100%), specificity of 97.2% (95% CI, 92.1%-99.0%), and ROC AUC of 0.993. Heat maps created by vanilla gradient descent backpropagation indicated that the blue markings were associated with the increased false-positive rate.

Conclusions and Relevance : This study's findings suggest that skin markings significantly interfered with the CNN's correct diagnosis of nevi by increasing the melanoma probability scores and consequently the false-positive rate. A predominance of skin markings in melanoma training images may have induced the CNN's association of markings with a melanoma diagnosis. Accordingly, these findings suggest that skin markings should be avoided in dermoscopic images intended for analysis by a CNN.

Trial Registration : German Clinical Trial Register (DRKS) Identifier: DRKS00013570.
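The confidence intervals in the Results above can be checked with a short calculation. The abstract does not say which interval method was used; the sketch below assumes a Wilson score interval, and the raw counts (22 of 23 melanomas, 90 of 107 nevi, 49 of 107 nevi) are back-calculated from the reported percentages rather than taken from the paper. The function name `wilson_interval` is ours.

```python
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * (p * (1 - p) / total + z**2 / (4 * total**2)) ** 0.5 / denom
    return center - half, center + half


# Assumed counts, inferred from the reported percentages (not stated in the abstract).
sens_unmarked = wilson_interval(22, 23)   # sensitivity 95.7%, unmarked lesions
spec_unmarked = wilson_interval(90, 107)  # specificity 84.1%, unmarked lesions
spec_marked = wilson_interval(49, 107)    # specificity 45.8%, marked lesions

for label, (low, high) in [
    ("sensitivity, unmarked", sens_unmarked),
    ("specificity, unmarked", spec_unmarked),
    ("specificity, marked", spec_marked),
]:
    print(f"{label}: {low:.1%} to {high:.1%}")
```

Under these assumptions the computed bounds agree with the reported intervals to one decimal place, which suggests (but does not confirm) that a Wilson-type interval was used.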

Winkler Julia K, Fink Christine, Toberer Ferdinand, Enk Alexander, Deinlein Teresa, Hofmann-Wellenhof Rainer, Thomas Luc, Lallas Aimilios, Blum Andreas, Stolz Wilhelm, Haenssle Holger A



Inherited and De Novo Genetic Risk for Autism Impacts Shared Networks.


In Cell

We performed a comprehensive assessment of rare inherited variation in autism spectrum disorder (ASD) by analyzing whole-genome sequences of 2,308 individuals from families with multiple affected children. We implicate 69 genes in ASD risk, including 24 passing genome-wide Bonferroni correction and 16 new ASD risk genes, most supported by rare inherited variants, a substantial extension of previous findings. Biological pathways enriched for genes harboring inherited variants represent cytoskeletal organization and ion transport, which are distinct from pathways implicated in previous studies. Nevertheless, the de novo and inherited genes contribute to a common protein-protein interaction network. We also identified structural variants (SVs) affecting non-coding regions, implicating recurrent deletions in the promoters of DLG2 and NR3C2. Loss of nr3c2 function in zebrafish disrupts sleep and social function, overlapping with human ASD-related phenotypes. These data support the utility of studying multiplex families in ASD and are available through the Hartwell Autism Research and Technology portal.

Ruzzo Elizabeth K, Pérez-Cano Laura, Jung Jae-Yoon, Wang Lee-Kai, Kashef-Haghighi Dorna, Hartl Chris, Singh Chanpreet, Xu Jin, Hoekstra Jackson N, Leventhal Olivia, Leppä Virpi M, Gandal Michael J, Paskov Kelley, Stockham Nate, Polioudakis Damon, Lowe Jennifer K, Prober David A, Geschwind Daniel H, Wall Dennis P


ASD, autism, de novo, genetics, inherited, machine learning, multiplex families


A nonparametric updating method to correct clinical prediction model drift.


In Journal of the American Medical Informatics Association : JAMIA

OBJECTIVE : Clinical prediction models require updating as performance deteriorates over time. We developed a testing procedure to select updating methods that minimizes overfitting, incorporates uncertainty associated with updating sample sizes, and is applicable to both parametric and nonparametric models.

MATERIALS AND METHODS : We describe a procedure to select an updating method for dichotomous outcome models by balancing simplicity against accuracy. We illustrate the test's properties on simulated scenarios of population shift and 2 models based on Department of Veterans Affairs inpatient admissions.

RESULTS : In simulations, the test generally recommended no update under no population shift, no update or modest recalibration under case mix shifts, intercept correction under changing outcome rates, and refitting under shifted predictor-outcome associations. The recommended updates provided superior or similar calibration to that achieved with more complex updating. In the case study, however, small update sets lead the test to recommend simpler updates than may have been ideal based on subsequent performance.

DISCUSSION : Our test's recommendations highlighted the benefits of simple updating as opposed to systematic refitting in response to performance drift. The complexity of recommended updating methods reflected sample size and magnitude of performance drift, as anticipated. The case study highlights the conservative nature of our test.

CONCLUSIONS : This new test supports data-driven updating of models developed with both biostatistical and machine learning approaches, promoting the transportability and maintenance of a wide array of clinical prediction models and, in turn, a variety of applications relying on modern prediction tools.
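To make one of the updating methods named above concrete, here is a minimal sketch of intercept correction, the update the test tended to recommend under changing outcome rates. It is not the authors' selection procedure, which compares several candidate updates; it only shows what the simplest update does. The function names and the toy data are hypothetical, and the sketch assumes predictions strictly between 0 and 1 and a shift that lies within the search range.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def intercept_correction(predicted, outcomes, iterations=100):
    """Find the constant logit shift that makes the mean updated
    prediction equal the observed event rate in the update sample."""
    if not predicted or len(predicted) != len(outcomes):
        raise ValueError("predicted and outcomes must be non-empty and equal length")
    events = sum(outcomes)
    if events == 0 or events == len(outcomes):
        raise ValueError("need both events and non-events to estimate a shift")
    logits = [logit(p) for p in predicted]
    low, high = -30.0, 30.0  # assumed search range for the shift
    for _ in range(iterations):
        mid = (low + high) / 2
        expected = sum(sigmoid(x + mid) for x in logits)
        # Expected events increase with the shift, so bisection applies.
        if expected < events:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def apply_shift(predicted, delta):
    """Apply the estimated shift on the logit scale."""
    return [sigmoid(logit(p) + delta) for p in predicted]


# Toy update sample (made up): the model under-predicts a rising outcome rate.
old_predictions = [0.10, 0.20, 0.30, 0.40]
recent_outcomes = [0, 1, 1, 1]

delta = intercept_correction(old_predictions, recent_outcomes)
updated = apply_shift(old_predictions, delta)
print(f"shift = {delta:.3f}, mean updated prediction = {sum(updated) / len(updated):.3f}")
```

Intercept correction leaves the ranking of patients unchanged and moves only the overall level of predicted risk, which is why it is among the simpler updates; recalibration would also rescale the logits, and refitting would re-estimate the whole model.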

Davis Sharon E, Greevy Robert A, Fonnesbeck Christopher, Lasko Thomas A, Walsh Colin G, Matheny Michael E


calibration, model updating, predictive analytics
