Machine learning and computer vision approaches for phenotypic profiling
