Summary of origin features reported in mammalian genome-wide mapping studies
| Study | Origin purification | Detection | Genome span (Mb) | Cell type | Origin number | Mean origin spacing (kb) | Mean origin size (kb) | Main origin features |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Human | ||||||||
| Lucas et al., 2007 | SSS | Microarray | 1,425 | 11365 | 32 | <50 | <5 | 80% within transcription units. Correlated with chromatin acetylation. |
| Cadoret et al., 2008 | λ-SNS | Microarray | 30 (ENCODE) | HeLa | 283 | 63 | <5 | Clustered in GC-rich regions. Rare in GC-poor regions. Associated with CGI, TRE, and c-JUN and c-FOS BSs, with open chromatin due to CGIs, with DNase HSSs and with evolutionarily conserved regions. |
| Karnani et al., 2010 | λ-SNS/BrdU-SNS | Microarray | 30 (ENCODE) | HeLa | 150 | 58/28 | 1.4/1.7 | AT-rich but within GC-rich regions, associated with conserved evolutionary elements. λ-SNS peaks and peaks shared by λ-SNS and BrdU-SNS enriched in TSSs, but BrdU-SNS-specific peaks depleted in TSSs. |
| Mesner et al., 2011 | Bubble trap | Microarray | 30 (ENCODE) | Early S HeLa/HeLa/GM06990 | 111 (646)/128 (657)/177 (988) | 58/69/41 | 15.2/18.1/14.5 | Broad initiation zones covering 15–22% of the genome, within intergenic regions as well as within or overlapping active and inactive genes. 20% encompass 5′ end of or entire active genes and activating histone marks. Overlap by only ∼1/3 between cell types and affected by synchronization. |
| Valenzuela et al., 2011 | SSS, λ-SNS/λ-SNS | Microarray/sequencing | 34/3,000 (WG) | MCF-7, BT474, H520/MCF-7 | 8,281 | 4 | NR | >70% conserved in all cell lines. Enriched at active TSSs and in H3K4me3 and Pol-II. Associated with conserved evolutionary elements. |
| Martin et al., 2011 | λ-SNS | Sequencing | 3,000 (WG) | K562, MCF-7 | NR | NR | NR | Clustered near regions of moderate transcription. Rare in highly transcribed or nontranscribed regions. Excluded from TSSs but enriched ∼0.5 kb downstream. Strongly associated with meCpGs and DNase HSSs. Weakly associated with umCpGs, miRNA transcripts, CTCF, Pol-II and c-JUN BSs, H3K4me, H3K9ac, and H3K27ac. |
| Besnard et al., 2012 | λ-SNS | Sequencing | 3,000 (WG) | HeLa, IMR-90, hESC H9, iPSCs from IMR-90 | 250,000 | 11 | 0.5 | Often grouped in clusters (mean size of 11 kb). At saturation cover ∼10% of the genome. One half within genes, <18% with TSS and CGI. 65–84% pairwise overlap between cell lines, few and inefficient cell type–specific origins. Density correlated with percentage of GC, timing, and efficiency. 91% associated with G4. Strand asymmetric distribution of G, C, and G4. |
| Mesner et al., 2013 | Bubble trap | Sequencing | 3,000 (WG) | GM06990 | 72,812 (123,297) | NR | 20 | Broad initiation zones covering 24% of the genome. 17,999 early, 25,735 mid-, and 29,020 late-replicating zones of mean size 27 kb, 18 kb, and 16 kb, respectively. Early zones more focused and efficient than late zones. Majority in nontranscribed DNA regardless of firing time. Early but not mid- and late zones associated with transcribed genes and activating marks: DNase I HSSs (58%), H3K4me3, H3K27me3, H3K36me3, and CTCF BSs. At megabase scale, late zones anticorrelated with both activating and repressive marks. Densities were highest in both highly accessible and highly compact chromatin. |
| Dellino et al., 2013 | ORC1-ChIP | Sequencing | 3,000 (WG) | HeLa | 13,600 | NR | NR | Mostly associated with TSSs of coding and noncoding RNAs. 39% of all expressed TSSs in HeLa cells. Most and least transcribed sites associated with coding and noncoding RNAs, respectively. No consensus sequence. |
| Picard et al., 2014 | λ-SNS | Sequencing | 3,000 (WG) | K562 | 59,185 | NR | 3.4 | Reanalyzed data of Besnard et al. (2012) and found 80,000–90,000 origins for each cell type. Early origins shared between cell types. Cell type–specific origins replicate late. 80% of CGI-associated origins are constitutive. 76% of CGIs are origins. Efficiency correlated with H4K20me1 + H3K27me3. |
| Mukhopadhyay et al., 2014 | λ-SNS/BrdU-SNS | Sequencing | 3,000 (WG) | Primary basophilic erythroblasts | 100,000 | NR | NR | Association with G4 (37%), CGIs (7%), and TSSs (13%). DNase I HSSs associated with but not required for origin formation. |
| Mouse | ||||||||
| Sequeira-Mendes et al., 2009 | λ-SNS | Microarray | 10.1 | mESC PGK12, MEFs, NIH-3T3 | 97 | 103 | NR | Most within transcription units. Half at CGI promoters. Efficiency conserved across cell types and correlated with embryonic TSSs. |
| Cayrou et al., 2011 | λ-SNS | Microarray | 60.4/118.3 | mESC GCR8, mTC P19, MEFs/Kc (Drosophila) | 2,748/6,184 | 21/19 | NR | 44% conserved between cell types. Spacing fivefold smaller than IOD on combed DNA. Inferred firing efficiency 20%. Preferentially intragenic. Bimodal distribution of SNS around CGI. G-rich motifs and local nucleotide skew. Drosophila origins correlated with HP1 BSs. |
WG, whole genome; NR, not reported; iPSCs, induced pluripotent stem cells; mESC, mouse embryonic stem cell; hESC, human embryonic stem cell; MEFs, mouse embryonic fibroblasts; mTC, mouse teratocarcinoma; CGI, CpG island; TRE, transcriptional regulatory element; ChIP, chromatin immunoprecipitation; HSS, hypersensitive site; CTCF, CCCTC-binding factor; Pol-II, RNA polymerase II; meCpG, methylated CpG dinucleotide; umCpG, unmethylated CpG dinucleotide; G4, G-quadruplex element; BS, binding site; IOD, inter-origin distance; HP1, heterochromatin protein 1. For Mesner et al. (2011, 2013), the numbers in parentheses indicate the number of individual EcoRI fragments clustering into the indicated number of initiation zones.