DNA is not found as a naked polymer in vivo. Instead, it exists as part of a protein complex called chromatin, which is tightly packed into the nucleus. Chromatin is folded into higher-order conformations in a non-random manner through the actions of many proteins, some of which are listed in Table 1. The mechanisms that govern this progressive folding, as well as the downstream consequences of the final conformation, have been important lines of investigation in the field of genome regulation. The significance of genome organization for cell identity, gene expression programmes, cancer, and cell division is among the central topics of inquiry.
Chemical modifications to DNA and the proteins surrounding it are important regulators of genome organization. These modifications, a layer of regulation termed “epigenetics”, exist in various forms and at characteristic sites in various cellular contexts. For example, in embryonic stem cells, methylation of lysine 27 on histone H3 (H3K27me3) causes compaction of chromatin and mediates long-range interactions between similarly marked chromatin regions. This mark and its effect on the structure are dynamic throughout development [6, 7]. H3K9me3 is also associated with a specialized chromatin structure termed heterochromatin, which is stably repressed and structurally compact [8].
Sym | Protein | Top three suppliers | Reference |
---|---|---|---|
ABL1 | ABL proto-oncogene 1, non-receptor tyrosine kinase | BD Biosciences 554148 (12), Santa Cruz Biotechnology sc-23 (9), Cell Signaling Technology 2865 (8) | [9] |
AIFM1 | apoptosis inducing factor mitochondria associated 1 | Santa Cruz Biotechnology sc-13116 (15), Cell Signaling Technology 5318 (13), Invitrogen MA5-15880 (5) | [10] |
AKAP8L | A-kinase anchoring protein 8 like | Santa Cruz Biotechnology sc-376630 (1) | [11] |
ASF1A | anti-silencing function 1A histone chaperone | Cell Signaling Technology 2990 (9), Santa Cruz Biotechnology sc-53171 (2) | [12, 13] |
ASF1B | anti-silencing function 1B histone chaperone | Santa Cruz Biotechnology sc-53171 (2), Invitrogen MA5-14836 (1) | [13] |
ASH1L | ASH1 like histone lysine methyltransferase | Abcam ab50981 (2) | [14] |
ATRX | ATRX, chromatin remodeler | Santa Cruz Biotechnology sc-55584 (5), Dianova DIA-AX1 (1), Cell Signaling Technology 14820 (1) | [15, 16] |
BLM | Bloom syndrome RecQ like helicase | Santa Cruz Biotechnology sc-13584 (1) | [17-19] |
BRD2 | bromodomain containing 2 | Cell Signaling Technology 5848 (11), Abcam ab139690 (7) | [20] |
CDCA5 | cell division cycle associated 5 | Abcam ab192237 (2) | [21] |
CDKN2A | cyclin dependent kinase inhibitor 2A | Invitrogen MA5-14260 (19), Abcam ab54210 (16), BD Biosciences 550834 (13) | [22] |
CENPV | centromere protein V | BioLegend 647201 (1) | [23] |
CHAF1A | chromatin assembly factor 1 subunit A | Novus Biologicals NB500-207 (4), Abcam ab126625 (3), Santa Cruz Biotechnology sc-133105 (1) | [13, 24] |
CHAF1B | chromatin assembly factor 1 subunit B | Novus Biologicals NB500-212 (6), Santa Cruz Biotechnology sc-393662 (1) | [13, 24] |
CHMP1A | charged multivesicular body protein 1A | Santa Cruz Biotechnology sc-271617 (2) | [25] |
CTCF | CCCTC-binding factor | Cell Signaling Technology 3418 (10), BD Biosciences 612149 (3), Abcam ab37477 (1) | [26] |
DAXX | death domain associated protein | Cell Signaling Technology 4533 (3), Santa Cruz Biotechnology sc-70952 (2), Bio-Rad MCA2143 (1) | [15, 27] |
DFFB | DNA fragmentation factor subunit beta | Santa Cruz Biotechnology sc-5295 (1), Novus Biologicals NB120-13592 (1) | [28] |
DHX36 | DEAH-box helicase 36 | Santa Cruz Biotechnology sc-377485 (1) | [29, 30] |
DHX9 | DExH-box helicase 9 | Abcam ab183731 (3), Santa Cruz Biotechnology sc-137198 (2), MilliporeSigma WH0001660M1 (2) | [31-33] |
ERCC3 | ERCC excision repair 3, TFIIH core complex helicase subunit | Santa Cruz Biotechnology sc-377301 (1), Cell Signaling Technology 8746 (1) | [34, 35] |
ERN2 | endoplasmic reticulum to nucleus signaling 2 | Abcam ab124945 (7) | [36] |
GATAD1 | GATA zinc finger domain containing 1 | Santa Cruz Biotechnology sc-81092 (1) | [37] |
H3F3B | H3 histone family member 3B | Abcam ab14955 (33), Active Motif 39763 (18), MilliporeSigma H9908 (15) | [13, 38, 39] |
HAT1 | histone acetyltransferase 1 | Santa Cruz Biotechnology sc-376200 (2), Abcam ab194296 (1) | [13, 40] |
HHEX | hematopoietically expressed homeobox | Abcam ab117864 (1), MilliporeSigma SAB1403914 (1) | [41] |
HIRA | histone cell cycle regulator | Active Motif 39557 (4), Abcam ab129169 (1) | [13] |
HIST1H3F | histone cluster 1 H3 family member f | Cell Signaling Technology 9733 (115), Abcam ab10799 (18), MilliporeSigma H9908 (15) | [13, 38, 39, 42, 43] |
HIST2H3C | histone cluster 2 H3 family member c | Novus Biologicals NBP1-30141 (4), Invitrogen AHO1432 (2), Active Motif 61623 (1) | [38, 39] |
HIST3H3 | histone cluster 3 H3 | Abcam ab12209 (10), BioLegend 641005 (1), Santa Cruz Biotechnology sc-518011 (1) | [39, 42] |
HIST4H4 | histone cluster 4 H4 | Cell Signaling Technology 2935 (7), Active Motif 39671 (4), Abcam ab17036 (2) | [13, 38, 39, 42, 43] |
HMGA1 | high mobility group AT-hook 1 | Abcam ab129153 (5), Cell Signaling Technology 7777 (1) | [22] |
HMGA2 | high mobility group AT-hook 2 | Cell Signaling Technology 8179 (7), Santa Cruz Biotechnology sc-130024 (1), Abcam ab207301 (1) | [22] |
MRE11 | MRE11 homolog, double strand break repair nuclease | GeneTex GTX70212 (16), Abcam ab214 (14), Cell Signaling Technology 4847 (5) | [44] |
NAA10 | N(alpha)-acetyltransferase 10, NatA catalytic subunit | Santa Cruz Biotechnology sc-373920 (1) | [45] |
NBN | nibrin | GeneTex GTX70224 (5), BD Biosciences 611871 (3), Abcam ab32074 (2) | [44] |
NPM1 | nucleophosmin 1 | Invitrogen 32-5200 (86), Abcam ab10530 (28), MilliporeSigma B0556 (19) | [46] |
PARP10 | poly(ADP-ribose) polymerase family member 10 | Santa Cruz Biotechnology sc-71851 (1), Aviva Systems Biology ARP42810_P050 (1) | [47] |
POT1 | protection of telomeres 1 | Santa Cruz Biotechnology sc-81711 (1) | [48] |
RAD21 | RAD21 cohesin complex component | Santa Cruz Biotechnology sc-166973 (1), Active Motif 39383 (1) | [49] |
RAD50 | RAD50 double strand break repair protein | GeneTex GTX70228 (9), Abcam ab124682 (5), Invitrogen MA1-23269 (2) | [44] |
RAD51 | RAD51 recombinase | Invitrogen MA5-14419 (26), Abcam ab213 (22), Cell Signaling Technology 8875 (12) | [50] |
RBBP4 | RB binding protein 4, chromatin remodeling factor | Abcam ab79416 (5), Invitrogen MA1-23277 (1), LifeSpan Biosciences LS-C53331 (1) | [13, 24] |
RECQL | RecQ like helicase | Santa Cruz Biotechnology sc-166388 (1), Cell Signaling Technology 4839 (1) | [51] |
RECQL4 | RecQ like helicase 4 | Strategic Diagnostics 2547.00.02 (1) | [51, 52] |
RSF1 | remodeling and spacing factor 1 | Abcam ab109002 (1) | [53, 54] |
SIRT6 | sirtuin 6 | Cell Signaling Technology 12486 (14), Abcepta AP6245a (1) | [55] |
SMARCA5 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5 | Active Motif 39543 (2) | [53, 54] |
SMC1A | structural maintenance of chromosomes 1A | Cell Signaling Technology 4805 (7), Abcam ab133643 (1) | [49] |
SMC2 | structural maintenance of chromosomes 2 | Cell Signaling Technology 5329 (1) | [56] |
SMC3 | structural maintenance of chromosomes 3 | Cell Signaling Technology 5696 (2), Santa Cruz Biotechnology sc-376352 X (1), Abcam ab128919 (1) | [49] |
SMC4 | structural maintenance of chromosomes 4 | Cell Signaling Technology 5547 (2) | [56] |
SOX9 | SRY-box 9 | Abcam ab185966 (15), Santa Cruz Biotechnology sc-166505 (3), Abnova H00006662-M04 (1) | [57] |
SRPK1 | SRSF protein kinase 1 | Sino Biological 12249-MM03 (2), BD Biosciences 611072 (2) | [58] |
SYCP3 | synaptonemal complex protein 3 | Santa Cruz Biotechnology sc-74569 (24) | [59] |
TOP1 | DNA topoisomerase I | BD Biosciences 556597 (9), Santa Cruz Biotechnology sc-32736 (7), Abcam ab109374 (5) | [60-65] |
TOP2A | DNA topoisomerase II alpha | Abcam ab52934 (11), Santa Cruz Biotechnology sc-166934 (4), MBL International M042-3 (3) | [28, 31, 66] |
TOP2B | DNA topoisomerase II beta | Santa Cruz Biotechnology sc-25330 (7), R&D Systems MAB6348 (3) | [67] |
TP53 | tumor protein p53 | Santa Cruz Biotechnology sc-126 (394), Cell Signaling Technology 2524 (165), Invitrogen MA5-12557 (137) | [68, 69] |
TPR | translocated promoter region, nuclear basket protein | Santa Cruz Biotechnology sc-101294 (3), Abnova H00007175-M01 (2) | [70] |
WRN | Werner syndrome RecQ like helicase | MilliporeSigma W0393 (4), Cell Signaling Technology 4666 (3) | [17, 71] |
Inducing gene expression also causes changes to genome architecture, and artificially activating gene expression is sufficient to open chromatin and facilitate movement of genes in the nucleus [72]. Conversely, changes in chromatin structure are thought to directly regulate, maintain or fine-tune downstream gene expression profiles [73]. Deciphering the functional impact of chromatin architecture changes is therefore complicated by the difficulty of separating cause from consequence between structural changes and gene expression programmes.

Nuclear organization refers to the sequential folding of chromatin shown in Figure 1. The multiple hierarchies of folding make it challenging to study the functional consequences of perturbing chromatin structure at individual layers. However, it is known that chromatin conformation is an important layer of gene regulation, from the contact of a gene promoter with a regulatory element [74], to the positioning of genes within the nucleus [75]. As a reflection of this, chromatin organization is dynamic, and changes throughout development.
The effects of chromatin structure on disease outcomes are not always clear, but epigenetics has been frequently implicated [76, 77]. One way in which nuclear organization is likely to be important in studying disease is the effect of mutations in non-coding sequences. For example, mutations in regulatory elements could cause alterations to gene expression profiles if they are no longer able to contact their genomic targets in the nuclear space. An understanding of the establishment and maintenance of global chromatin organization in different cell types and conditions will be crucial in uncovering this layer of regulation in development and disease [78].
Many of the limiting factors in understanding chromatin organization are due to the methodologies available to study such a dynamic and complex fiber. Most of our historical knowledge of chromatin organization in the nucleus, including the existence of chromosome territories, clusters of active transcription, and gene looping, comes from directly visualizing structures using in situ hybridization technologies followed by microscopy. Whilst these technologies were, and remain, extremely important in dissecting chromatin architecture, a need remained for a higher-throughput, higher-resolution methodology to interrogate genome structure.
Both visual and molecular techniques that exist to analyze chromatin structure are continually being developed and improved, giving researchers new tools with which to probe the 3D structure of the genome. Among the most important of these new tools are Chromosome Conformation Capture (3C) technologies, a family of methods that infer chromosome structure from the interaction frequencies of DNA fragments in a population of cells.
This review will summarize chromosome conformation capture technologies available to investigate chromosome structure and their applications.
C-technologies aim to investigate the frequency with which regions of DNA interact with each other in the nucleus. The technologies rely on the principle that loci that are spatially close to one another will be more frequently cross-linked together than regions located further away in 3D space [79, 80]. C-technologies are based on the detection of hybrid DNA fragments that are generated by cross-linking proximal regions of DNA and ligating them together, generating unique DNA sequences. These hybrid DNA sequences no longer reflect the linear sequence of the DNA, but instead reflect its conformation at the time of fixation [81]. The frequency of contacts between two loci is thought to reflect the average chromatin conformation in a cell population at the time of fixation [3, 81].
Practically, the C-assays involve cross-linking chromatin, cleaving it by restriction digest, and re-ligating the digested fragments such that two fragments of DNA that were close enough to be cross-linked now form a hybrid sequence. This is now most frequently performed in situ, that is, the restriction and ligation reactions are performed in the intact nucleus rather than following the isolation of the DNA. A library of hybrid sequences can then be analyzed to determine pairwise contacts [82]. The relative abundance of a particular hybrid sequence measured during analysis can be used to ascertain average contacts within a population of cells. A graphic of this technology is shown in Figure 2. Importantly, different variations of C-technologies exist, and the decision of which to employ depends on what is being interrogated [81]. A summary of these techniques is displayed in Table 2.
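The idea that the relative abundance of each hybrid sequence reports on average contacts can be illustrated with a minimal Python sketch. It assumes each sequenced ligation product has already been reduced to a pair of fragment identifiers; the identifiers, function names, and the choice to discard self-ligations are illustrative assumptions rather than part of any published pipeline.

```python
from collections import Counter


def count_contacts(ligation_pairs):
    """Count how often each unordered pair of fragments was ligated together."""
    counts = Counter()
    for frag_a, frag_b in ligation_pairs:
        if frag_a == frag_b:
            # Assumption: a fragment ligated to itself is treated as uninformative.
            continue
        counts[tuple(sorted((frag_a, frag_b)))] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw counts to the fraction of all informative ligation products."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical fragment identifiers, for illustration only.
pairs = [("f1", "f2"), ("f2", "f1"), ("f1", "f3"), ("f2", "f2")]
counts = count_contacts(pairs)
print(counts)
print(relative_frequencies(counts))
```

Because the counts are pooled over all cells in the library, a high relative frequency indicates a contact that is common in the population, not necessarily one present in every cell.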

The first of the C-technologies to be established was termed 3C [82]. 3C relies on the detection of ligation products by PCR followed by gel electrophoresis, or quantitative PCR (qPCR). Contacts within specific regions are investigated using locus-specific primers that recognize the restriction fragments of interest. Therefore, it is a candidate-based approach, usually used to assay the interaction between a pair of sequences, for example between a promoter and upstream regulatory elements [83]. To carry out this experiment, cells are crosslinked with formaldehyde and lysed to release their nuclei. The nuclei are subjected to digestion with a restriction enzyme of choice. The digested fragments are then ligated, de-crosslinked and purified. The DNA undergoes quality control checks and is then ready for analysis. Most of the derivatives of the C-technologies have these key steps in common [84].
In 3C, qPCR is normally used to quantify whether interaction frequencies between fragments of interest are higher than the interaction between surrounding fragments. Careful analysis and evaluation using comprehensive controls must be carried out to distinguish non-meaningful, background interactions from physiologically relevant interactions [85]. Background interactions can be assessed by correlating interaction frequencies to linear distance. Many important discoveries have been made using 3C. For example, it has been employed to study the long-range interactions of the mouse β-globin locus, with regulatory elements found to interact via a loop formation when the gene was expressed [83]. The 3C method can only generate data for a small number of candidate loci, owing to the laborious PCR-based detection method. Thus, it is best used when there is a small pool of candidate interactions to investigate. Additionally, the sequences of potential interactors must be known before the technique is carried out, so it is only suitable for candidate-based approaches. Finally, the efficiency of primer annealing can bias the contact frequency, and great care must be taken to ensure that the effectiveness of each primer in the PCR reaction is controlled for [84].
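One way to picture the comparison of interaction frequencies against linear distance is the Python sketch below. It is a simplified stand-in, not the control scheme described in [85]: it assumes the background at a given distance can be approximated by the mean frequency of all tested fragments falling in the same distance bin, and the bin size, values, and function name are hypothetical.

```python
from collections import defaultdict


def observed_over_expected(contacts, bin_size):
    """Compare each interaction frequency with the mean of its distance bin.

    contacts: list of (distance_bp, frequency) measured from one anchor fragment.
    Returns a list of (distance_bp, frequency, ratio) in the input order.
    """
    by_bin = defaultdict(list)
    for distance_bp, frequency in contacts:
        by_bin[distance_bp // bin_size].append(frequency)

    # Assumed background model: average frequency of fragments at similar distance.
    background = {b: sum(v) / len(v) for b, v in by_bin.items()}

    results = []
    for distance_bp, frequency in contacts:
        expected = background[distance_bp // bin_size]
        ratio = frequency / expected if expected > 0 else float("nan")
        results.append((distance_bp, frequency, ratio))
    return results


# Made-up values: (distance from the anchor in bp, normalized qPCR signal).
example = [(10_000, 8.0), (12_000, 12.0), (50_000, 2.0), (55_000, 6.0)]
for row in observed_over_expected(example, bin_size=20_000):
    print(row)
```

A ratio well above 1 flags a fragment that contacts the anchor more often than its neighbours at a similar distance, which is the pattern expected for a specific loop rather than for random collisions along the fiber.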
Various modifications on the original 3C have now been developed that allow the identification of interactions in an unbiased way, and more recently have taken advantage of next-generation sequencing to generate global interaction data.
4C was the next derivative of 3C, developed to improve the limits of scale and resolution of 3C experiments [86]. 4C aims to detect regions interacting with a specific locus of interest. Thus, it removes the candidate-based selection of potential interacting partners required for 3C, and also allows higher-throughput analysis [87]. 4C involves a standard 3C library preparation, and the library is then amplified from the locus of interest. This is achieved using inverse PCR, which requires the hybrid fragments interacting with the candidate region to be circularized. Circularization normally takes place through an additional cleavage and ligation reaction. The inverse PCR is designed such that primers complementary to the region of interest anneal in the opposite direction from one another. This allows the amplification of unknown sequences that have successfully formed ligation events with the known fragment. Any interacting partners can therefore be amplified, and they are normally detected by high-throughput sequencing, making 4C much more sensitive and high-throughput than 3C [84]. The sequencing data can be processed and analyzed, and information about the conformation of the region of interest, as well as the long-range contacts it takes part in, can then be inferred. The processed data is normally viewed as a single track on a genome browser, representing the signal from the viewpoint.
4C has been used to uncover various biological phenomena, including the effect of pluripotency factors in shaping the embryonic stem cell genome [88], and the impact of duplications on chromatin organization [89].
Whilst 4C allows the identification of all interaction partners from one viewpoint, it does not allow the interrogation of interactions from any other restriction fragments. Thus, it cannot be used to assess chromatin organization across complex regions such as topologically associating domains (TADs) [81]. 5C allows for the analysis of all the interactions within a pool of genomic fragments covering a particular locus of up to a few megabases [90]. This is carried out using pools of PCR primers with complementary 5’ sequences followed by multiplexed PCR amplification. 5C is frequently used to characterize all chromosomal contacts within several megabases simultaneously. Thus, it is useful for building a detailed picture of complex chromatin interactions between gene clusters and regulatory domains [81]. Mechanistically, 5C achieves this sensitivity by combining 3C with ligation-mediated amplification. Unlike in 3C, where the library is amplified by single primer pairs, 5C involves the addition of an oligo pool containing a series of primers (that anneal to either side of restriction sites within the 3C library) complementary to the locus of interest [91]. Primers located across the 3C ligation junction are then themselves ligated, generating a 5C library. This 5C library (i.e., a “carbon copy” of the 3C library) can then be amplified through the use of universal primers, which are complementary to a common sequence present on all the oligos [91]. The resulting fragments are analyzed by sequencing. 5C data generates sufficient resolution to visualize interactions beyond TADs confirmed by the early genome-wide technologies (see below).
5C was used to observe TAD formation on the inactive X chromosome [92] and also facilitated the discovery of sub-TADs, which are found within TADs. These are cell-type specific in their interaction patterns and form an important link between transcription regulation and genome architecture [93]. In the extensively studied Hox clusters, 5C has been used to show that regulatory elements within the HoxA cluster do not interact via a distinct looping mechanism, but form sub-TAD structures that show tissue-specificity in their interaction frequencies [94].
For a 5C experiment to be reliable, PCR has to be tightly controlled, as again primer binding efficiency is likely to vary among the pool of oligos. Additionally, whilst the potential interactions are greatly enhanced compared to 3C, researchers are limited to the interactions covered by the primer library. Furthermore, 5C is becoming outcompeted by Capture-C, a method that can be employed to answer similar questions, as discussed below [84].
Hi-C is a genome-wide extension of 3C, developed to probe the 3D organization of entire genomes. Hi-C protocols require a modification to the 3C library preparation: biotinylated nucleotides are incorporated into the restriction overhangs before blunt-end ligation is carried out [95]. The crosslinks are reversed, the library is then fragmented, and the biotinylated ligation junctions are pulled down with streptavidin beads. This enriches the sample for informative, hybrid molecules on a genome-wide scale [95]. The pulled-down fragments are then analyzed by high-throughput sequencing. For example, Zhang H et al incorporated biotin-14-dATP (Thermo Fisher Scientific, 19524016) with DNA polymerase I (NEB, M0210) into digested, crosslinked chromatin, ligated the blunt-end DNA in situ with T4 DNA ligase (NEB, M0202M), reversed crosslinks with proteinase K (3115879 BMB) and 10% SDS, digested RNA with DNase-free RNase, extracted DNA with phenol chloroform, fragmented DNA with Epishear from Active Motif, purified DNA with AMPure XP beads (Beckman Coulter), and enriched biotinylated DNA with Dynabeads MyOne Streptavidin C1 beads (Thermo Fisher Scientific, 65002) before library construction using streptavidin beads with NEBNext DNA Library Prep Master Mix Set for Illumina (NEB E6040, M0543L, E7335S) and sequencing [96]. A similar process is also detailed by Rhodes JDP et al [49] and SE Johnstone et al [97].
Early studies using Hi-C uncovered additional properties of chromatin folding, including the partitioning of the genome into TADs [98] and compartments [95], and the structure of the Drosophila genome [99]. Higher resolution Hi-C data was generated in 2014 following the development of in situ Hi-C, in which the ligation is performed in intact nuclei, and a more frequently cutting restriction enzyme is used. This allows the visualization of small-scale structures, including genome-wide analysis of regulatory interactions and the visualization of contact domains previously too small to be detected [100]. For example, Rhodes JDP et al generated in situ Hi-C libraries from low numbers of embryonic stem cells to investigate the role of cohesin in polycomb-dependent chromosome interactions [49]. Meharena et al used the Arima Hi-C kit to generate Hi-C libraries from 2 million iPSCs and NPCs [101].
Single-cell Hi-C has been carried out in an attempt to overcome the population-averaging data generated by C-technologies, and decipher chromatin conformation in a single cell. Although the resulting contact maps are sparse in their coverage, it is possible to appreciate some key features.
At the scale of a megabase domain, chromatin organization is consistent between cells. However, larger scale chromosome interactions both in trans and in cis are more variable within cell populations [102, 103]. Further advances are being made to this technique, with increasing resolutions and the ability to perform experiments in rare cell types and combine the process with imaging to resolve structures using a dual mechanism [103].
Hi-C is unmatched in its ability to resolve large-scale structures and very long-range contacts. However, the global nature of Hi-C means sacrificing the analysis of fine-scale structures that is possible with other methods, as the number of contacts for any given locus is around 100-fold lower than for 5C [81]. This is true even with the most recent data sets, which are now at a resolution of approximately a kilobase [100]. The increase in sensitivity of Hi-C assays is directly proportional to sequencing depth, and further efforts are largely limited by cost. In principle, however, Hi-C could be employed to analyze the interactions of regulatory elements and genes on a genome-wide scale, given sufficient sequencing data and computational power [81]. Delaneau O et al investigated the effect of genetic variations on 12,583 cis-regulatory domains and 30 trans-regulatory hubs using Hi-C data [104]. Barisic D et al studied the effect of SNF2H on the formation of chromatin loops and insulation of topologically associating domains in cultured cells [105]. Chromosome tracing via multiplexed DNA FISH can be used to study entire individual chromosomes [106].
One aspect of Hi-C experiments is the huge volume and complexity of sequencing data that is generated. Thus, computational tools and bioinformatics skills are required to process and analyze the data. The main data analysis strategies have been previously reviewed and summarised [80]. Ultimately, the data is most often visualized as contact maps, for which user-friendly analysis software is available [107, 108].
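As a rough picture of what a contact map is, the following Python sketch bins mapped read pairs from a single chromosome into a symmetric matrix. It is a toy under stated assumptions (0-based positions, one chromosome, no filtering or normalization), and the function name and inputs are hypothetical; it does not reproduce any of the tools cited above.

```python
def build_contact_map(read_pairs, chrom_length, bin_size):
    """Bin read-pair positions into a symmetric contact matrix.

    read_pairs: iterable of (pos_a, pos_b), 0-based positions on one chromosome.
    Returns a square list-of-lists with one row and column per genomic bin.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in read_pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the map symmetric
    return matrix


# Made-up read pairs on a 3 kb toy chromosome, binned at 1 kb.
toy_pairs = [(100, 2500), (2600, 150), (1200, 1300)]
contact_map = build_contact_map(toy_pairs, chrom_length=3000, bin_size=1000)
for row in contact_map:
    print(row)
```

The bin size plays the role of the map resolution: halving it quadruples the number of matrix cells, so proportionally more read pairs are needed to keep each cell populated, which is one way to see why resolution is tied to sequencing depth.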
Incorporation of a genome capture step is a useful way to enrich the sequencing library for informative reads. Capture-C utilizes oligo capture technology to visualize interactions from hundreds of viewpoints on a genome-wide scale, allowing purification of sequences of interest using biotinylated capture probes [109]. This is achieved through standard 3C library preparation, followed by sonication of the fragments and addition of sequencing adaptors. Samples are pooled, purified by pull-down, and PCR-amplified. Unique molecular identifiers present in the sequencing adaptor primers allow PCR duplicates to be identified and removed from the data. Capture-C is unrivaled in its ability to detect weak or rare long-range interactions [81]. The pooling step also reduces experimental biases between samples and increases experimental throughput.
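The duplicate-removal step can be sketched in a few lines of Python. The sketch assumes each read has been summarized as a tuple of its molecular identifier, viewpoint, and interacting fragment, and that two reads sharing all three derive from the same original molecule; the tuple layout and names are hypothetical and real pipelines apply additional criteria.

```python
def remove_pcr_duplicates(reads):
    """Keep the first read for each (identifier, viewpoint, partner) combination.

    reads: iterable of (umi, viewpoint, partner) tuples.
    Assumption: identical tuples are PCR copies of one ligation product.
    """
    seen = set()
    unique_reads = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            unique_reads.append(read)
    return unique_reads


# Made-up reads: the first two are copies of the same molecule.
reads = [
    ("ACGT", "vp1", "fragA"),
    ("ACGT", "vp1", "fragA"),
    ("TTGC", "vp1", "fragA"),
    ("ACGT", "vp1", "fragB"),
]
print(remove_pcr_duplicates(reads))
```

Reads with the same viewpoint and partner but different identifiers are kept, because they are taken to represent independent ligation events and therefore contribute genuinely to the contact frequency.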
Feature | 3C | 4C | 5C | Hi-C | Capture-C |
---|---|---|---|---|---|
Regions | One-to-one | One-to-many | Many-to-many | All-to-all | Many-to-many |
Best application | Assessing whether expected interactions within a locus occur | Investigating unknown interactions with a region of interest on a genome-wide scale | Creating detailed interaction maps of one region of interest | Generating a global view of chromatin structure at moderate resolution | Analysing hundreds of different loci with a very high resolution |
Benefits | Requires no advanced analysis | Drastically increases the scale of 3C | Multiplexing allows mapping of finer detail interaction matrices | Gives an unbiased, genome-wide overview of chromatin architecture | High resolution and high throughput |
Caveats | Labour intensive | Restricted to a single viewpoint | Oligo annealing efficiencies could introduce bias | To achieve high-resolution datasets, sequencing depth must be improved, increasing costs | Capture efficiencies for different probes could bias the library |
The chromosome conformation capture technologies represent a collection of powerful tools that can be used to examine the spatial organization of chromatin. The choice of which technique to use hugely depends on the biological questions that are being investigated [81]. Before beginning an experiment, some other technical considerations must be taken into account.
In chromosome conformation capture technologies, chromatin is usually cleaved with restriction enzymes to generate smaller fragments. Thus, the selection of the initial restriction enzyme is one of the key limiting factors for the resolution of the experiment, as contacts can only be determined at ligation junctions. The amount of sequencing required increases substantially when a more frequently cutting enzyme is used, because more fragments give rise to more possible ligation junctions. Initial libraries were generated using restriction enzymes with a 6 base pair recognition sequence, which cut on average every 4,096 bp (about 4 kb) in the human genome. Later experiments used 4 base pair cutters, which cut on average every 256 bp, generating much higher resolution data with more complexity [81].
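The average cutting intervals follow from simple arithmetic: a recognition site of n bases is expected once every 4^n bp if the four bases are equally frequent and independent. The short Python sketch below works through this under that idealized assumption, which real genomes only approximate; the genome size used is a round illustrative figure, not a measured value.

```python
def expected_cut_interval(site_length):
    """Expected spacing in bp between sites for a given recognition length.

    Assumes equal, independent base frequencies (an idealization).
    """
    return 4 ** site_length


def expected_fragment_count(genome_size_bp, site_length):
    """Rough number of restriction fragments expected in a genome."""
    return genome_size_bp // expected_cut_interval(site_length)


# Illustrative round figure, not a measured genome size.
GENOME_SIZE_BP = 3_000_000_000

for site_length in (6, 4):
    print(
        site_length,
        expected_cut_interval(site_length),
        expected_fragment_count(GENOME_SIZE_BP, site_length),
    )
```

Under these assumptions a 4 bp cutter yields sixteen times as many fragments as a 6 bp cutter, and the number of possible fragment pairs grows with roughly the square of the fragment count, which is why finer digestion demands deeper sequencing.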
For the techniques requiring oligo annealing (3C, 5C, Capture-C), the efficiency of the primers is hugely important and must be evaluated. Testing efficiencies is challenging, because the oligos are designed to recognize sequences not present in genomic DNA. Control PCR amplification can be carried out using digested and religated BAC clones across the region of interest [84].
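One common way to express primer efficiency in qPCR generally is from the slope of a standard curve, using E = 10^(-1/slope) - 1, where the slope comes from plotting threshold cycle against log10 of template amount. The Python sketch below applies that general formula to a dilution series of a control template; it is not a procedure taken from [84], and the dilution values and function names are hypothetical.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def amplification_efficiency(template_amounts, ct_values):
    """Efficiency from a standard curve: E = 10 ** (-1 / slope) - 1.

    An efficiency of 1.0 corresponds to perfect doubling in every cycle.
    """
    log_amounts = [math.log10(a) for a in template_amounts]
    slope = fit_slope(log_amounts, ct_values)
    return 10 ** (-1 / slope) - 1


# Synthetic tenfold dilution series constructed to show perfect doubling.
dilutions = [1, 10, 100, 1000]
cts = [30 - math.log2(a) for a in dilutions]
print(round(amplification_efficiency(dilutions, cts), 3))
```

Comparing such values across primer pairs on the same control template gives a simple basis for correcting interaction frequencies for differences in amplification.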
Depending on your biological question, you might decide that genome-wide analysis is the best way to interrogate chromatin structure in your system. However, the time and expertise required to carry out high-quality bioinformatic analysis can be extensive. Considerable expertise is required to process and analyze such large volumes of data. Additionally, interpreting the meaning of the results also requires advanced technical and statistical knowledge.
It could be wise to utilize another technique such as fluorescence in situ hybridization to verify findings from chromatin conformation capture experiments. This alternative analysis would improve the reliability of your findings, and reduce concerns about PCR biases or experimental artifacts. This has been done in several studies [110]. However, care must be taken when interpreting data from the two techniques, as they do not necessarily provide the same information.
Since the development of chromosome conformation capture technologies, there have been consistent improvements and augmentations to the methodologies. These modifications aim to enhance the capabilities of the technologies, such as enriching the libraries for specific genomic regions [111], and applying the technologies to single cells [112] and rare cell types [113]. However, one important aspect of 3D genome regulation that has not yet been directly probed in Hi-C experiments is gene transcription. This is because sequencing methods that are used to study gene regulation (such as RNA sequencing) lack the spatial information provided by microscopy or chromosome conformation capture methods. Additionally, while single-cell Hi-C approaches exist and are continually improving, they often lack the resolution of conventional approaches, and do not retain spatial information of cells in a population. Therefore, uncovering chromatin organization in complex tissues or subpopulations is a novel challenge. Two recent technologies harness microscopy to close some of the gaps in our knowledge: Hi-M [5] and ORCA [114].
To overcome the limitations of chromosome conformation capture technologies and analyze the interplay between transcriptional activation and chromatin organization while maintaining spatial information, Cardozo Gizzi and colleagues developed a novel technique which they termed Hi-M [5]. Hi-M relies on the principles of microfluidics coupled to microscopy, with sequential rounds of labelling and imaging. It is therefore distinct from chromosome conformation capture technologies.
In Hi-M, DNA is sequentially labelled with an Oligopaint approach [115], which uses a library of short oligos that span a region of the genome from several kilobases to several megabases, depending on the number of oligos present. These oligos are labelled at the 5’ end and are hybridized to the nucleus to obtain spatial information from the region of interest [115]. In Hi-M, thousands of oligonucleotides are present in a single primary library, spanning hundreds of kilobases. Each group of oligos corresponding to a single region of interest contains a unique barcode that can be recognized and bound by complementary, fluorescent oligonucleotide probes. These probes are called the readout probes. An additional barcode (termed the fiducial barcode) is also present for downstream image analysis [111].
To perform multiple rounds of imaging in single cells, a microfluidics approach is used in conjunction with a fully automated microscope. Multiple cycles of hybridization, imaging and washing are performed to capture each barcode. These steps are performed after carrying out 4 color imaging to capture embryo morphology, nuclear staining, RNA expression, and the fiducial barcode.
The researchers employed their technique to study the link between zygotic genome activation (ZGA) and chromatin architecture in Drosophila embryos. They selected the sna-esg locus, as the genes contained within are among those expressed in the initial wave of ZGA in Drosophila embryos [116]. They were able to track the gene expression at this locus using RNA-FISH to label transcribed sna mRNA.
Hi-M can be used to calculate contact probabilities and pairwise distance distributions, as well as to generate normalized contact and distance maps similar to those derived from Hi-C data. Importantly, the Hi-M results were highly comparable to previously published Hi-C datasets, which supports the validity of both methods. In the context of the Drosophila embryo, the researchers suggest that chromatin organization into TADs is dramatically altered by active transcription at the locus, consistent with previous reports [117, 118].
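As a rough illustration of how contact probabilities and distance maps can be derived from imaging data of this kind, the sketch below starts from the 3D positions of each barcode in each cell. It is not the published Hi-M analysis pipeline. The array layout, the function names, the use of the median as the summary distance and the contact threshold are all assumptions made for the example; undetected barcodes are assumed to be encoded as NaN.

```python
import numpy as np


def pairwise_distances(traces: np.ndarray) -> np.ndarray:
    """Euclidean distance between every pair of barcodes in every cell.

    traces: array of shape (n_cells, n_barcodes, 3); NaN marks a barcode
    that was not localized in that cell (assumed convention).
    """
    diff = traces[:, :, None, :] - traces[:, None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def contact_and_distance_maps(traces: np.ndarray, threshold: float):
    """Return (contact probability map, median distance map).

    A pair counts as a contact when its distance is below `threshold`
    (a hypothetical cutoff, in the same units as the coordinates).
    """
    dist = pairwise_distances(traces)
    detected = ~np.isnan(dist)             # both barcodes localized
    n_pairs = detected.sum(axis=0)         # cells contributing to each pair
    with np.errstate(invalid="ignore", divide="ignore"):
        contacts = ((dist < threshold) & detected).sum(axis=0)
        contact_prob = contacts / n_pairs  # NaN where a pair was never seen
    median_dist = np.nanmedian(dist, axis=0)
    return contact_prob, median_dist


if __name__ == "__main__":
    nan = np.nan
    # Two toy cells, three barcodes; coordinates in arbitrary units.
    traces = np.array([
        [[0, 0, 0], [100, 0, 0], [400, 0, 0]],
        [[0, 0, 0], [300, 0, 0], [nan, nan, nan]],
    ], dtype=float)
    prob, med = contact_and_distance_maps(traces, threshold=250.0)
    print(prob)
    print(med)
```

Normalizing each pair by the number of cells in which both barcodes were detected keeps missing detections from being counted as non-contacts.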
Hi-M establishes microscopy as a valid approach for characterising chromatin structure at this scale at the single-cell level, allowing many cells to be analysed with relative ease and at low cost. Hi-M also provides information on RNA expression, which broadens the scope of the technology.
Mateo et al. sought to develop a method that would combine the resolution of Hi-C data with the spatial information of microscopy. Specifically, they wanted to apply their technology to study TADs and enhancer-promoter contacts whilst being able to distinguish between subpopulations of cells and detect specific mRNAs in single cells.
The concept of ORCA (optical reconstruction of chromatin architecture) is highly similar to that of Hi-M [5], but the authors point out that ORCA achieves a better resolution (2 kb). ORCA also takes advantage of Oligopaint technologies [115], using a library of primary oligos that carry unique barcodes. Each barcode is imaged by hybridizing a complementary readout oligo that carries a fluorescent label. The readout oligo is washed away after imaging, and a new cycle of hybridization, imaging and washing begins for the next barcode.
In this study, ORCA was employed to study the regulatory elements in the bithorax complex (BX-C) in Drosophila embryos. The results were also comparable to previously published Hi-C data, and the authors were able to confirm this across other cell types. The researchers used ORCA to study cell-type-specific behaviors, first by labelling RNAs of interest and then by hybridizing the ORCA probes as previously described. They observed distinct differences in chromatin structure that correlated with differences in transcription state at the BX-C locus. Their results revealed that disrupting the border between active and repressed chromatin at the BX-C locus led to aberrant enhancer-promoter contacts, inappropriate gene activation and developmental defects.
ORCA provides additional convincing evidence that microscopy is a viable and important tool for analyzing chromatin structure in large numbers of single cells, and can be used to dissect the interplay between mRNA expression, chromatin structure, cellular state and developmental state. Together, ORCA and Hi-M represent substantial progress towards consolidating microscopy and chromosome conformation capture data, and resolving the interplay between different levels of genomic regulation.
Considerable progress has been made in recent years towards understanding the mechanisms that establish nuclear architecture and deciphering its effects on gene expression. Much of this progress is owed to the rapid expansion of the tools available to probe these questions. Further development of these techniques will allow finer mapping of chromatin interactions, enabling researchers to investigate their biological significance more reliably.
- David Cordonnier M, Hamdane M, Bailly C, D Halluin J. The DNA binding domain of the human c-Abl tyrosine kinase preferentially binds to DNA sequences containing an AAC motif and to distorted DNA structures. Biochemistry. 1998;37:6065-76
- Martins S, Eide T, Steen R, Jahnsen T, Skålhegg B S, Collas P. HA95 is a protein of the chromatin and nuclear matrix regulating nuclear envelope dynamics. J Cell Sci. 2000;113 Pt 21:3703-13
- Mello J, Sillje H, Roche D, Kirschner D, Nigg E, Almouzni G. Human Asf1 and CAF-1 interact and synergize in a repair-coupled nucleosome assembly pathway. EMBO Rep. 2002;3:329-34
- Tagami H, Ray Gallet D, Almouzni G, Nakatani Y. Histone H3.1 and H3.3 complexes mediate nucleosome assembly pathways dependent or independent of DNA synthesis. Cell. 2004;116:51-61
- Nakamura T, Blechman J, Tada S, Rozovskaia T, Itoyama T, Bullrich F, et al. huASH1 protein, a putative transcription factor encoded by a human homologue of the Drosophila ash1 gene, localizes to both nuclei and cell-cell tight junctions. Proc Natl Acad Sci U S A. 2000;97:7284-9
- Li J, Harrison R, Reszka A, Brosh R, Bohr V, Neidle S, et al. Inhibition of the Bloom's and Werner's syndrome helicases by G-quadruplex interacting ligands. Biochemistry. 2001;40:15194-202
- Rankin S, Ayad N, Kirschner M. Sororin, a substrate of the anaphase-promoting complex, is required for sister chromatid cohesion in vertebrates. Mol Cell. 2005;18:185-200
- Narita M, Narita M, Krizhanovsky V, Nuñez S, Chicas A, Hearn S, et al. A novel role for high-mobility group a proteins in cellular senescence and heterochromatin formation. Cell. 2006;126:503-14
- Verreault A, Kaufman P, Kobayashi R, Stillman B. Nucleosome assembly by a complex of CAF-1 and acetylated histones H3/H4. Cell. 1996;87:95-104
- Stauffer D, Howard T, Nyun T, Hollenberg S. CHMP1 is a novel nuclear matrix protein affecting chromatin structure and cell-cycle progression. J Cell Sci. 2001;114:2383-93
- Durrieu F, Samejima K, Fortune J, Kandels Lewis S, Osheroff N, Earnshaw W. DNA topoisomerase IIalpha interacts with CAD nuclease and is involved in chromatin condensation during apoptotic execution. Curr Biol. 2000;10:923-6
- Vaughn J, Creacy S, Routh E, Joyner Butt C, Jenkins G, Pauli S, et al. The DEXH protein product of the DHX36 gene is the major source of tetramolecular quadruplex G4-DNA resolving activity in HeLa cell lysates. J Biol Chem. 2005;280:38117-20
- Creacy S, Routh E, Iwamoto F, Nagamine Y, Akman S, Vaughn J. G4 resolvase 1 binds both DNA and RNA tetramolecular quadruplex with high affinity and is the major source of tetramolecular quadruplex G4-DNA and G4-RNA resolving activity in HeLa cell lysates. J Biol Chem. 2008;283:34626-34
- Zhou K, Choe K, Zaidi Z, Wang Q, Mathews M, Lee C. RNA helicase A interacts with dsDNA and topoisomerase IIalpha. Nucleic Acids Res. 2003;31:2253-60
- Zhang S, Grosse F. Domain structure of human nuclear DNA helicase II (RNA helicase A). J Biol Chem. 1997;272:11487-94
- Coin F, Oksenych V, Egly J. Distinct roles for the XPB/p52 and XPD/p44 subcomplexes of TFIIH in damaged DNA opening during nucleotide excision repair. Mol Cell. 2007;26:245-56
- Hwang J, Moncollin V, Vermeulen W, Seroz T, van Vuuren H, Hoeijmakers J, et al. A 3' --> 5' XPB helicase defect in repair/transcription factor TFIIH of xeroderma pigmentosum group B affects both DNA repair and transcription. J Biol Chem. 1996;271:15898-904
- Iwawaki T, Hosoda A, Okuda T, Kamigori Y, Nomura Furuwatari C, Kimata Y, et al. Translational control by the ER transmembrane kinase/ribonuclease IRE1 under ER stress. Nat Cell Biol. 2001;3:158-64
- Verreault A, Kaufman P, Kobayashi R, Stillman B. Nucleosomal DNA regulates the core-histone-binding subunit of the human Hat1 acetyltransferase. Curr Biol. 1998;8:96-108
- Lee J, Paull T. ATM activation by DNA double-strand breaks through the Mre11-Rad50-Nbs1 complex. Science. 2005;308:551-4
- Tribioli C, Mancini M, Plassart E, Bione S, Rivella S, Sala C, et al. Isolation of new genes in distal Xq28: transcriptional map and identification of a human homologue of the ARD1 N-acetyl transferase of Saccharomyces cerevisiae. Hum Mol Genet. 1994;3:1061-7
- Okuwaki M, Matsumoto K, Tsujimoto M, Nagata K. Function of nucleophosmin/B23, a nucleolar acidic protein, as a histone chaperone. FEBS Lett. 2001;506:272-6
- Yu M, Schreek S, Cerni C, Schamberger C, Lesniewicz K, Poreba E, et al. PARP-10, a novel Myc-interacting protein with poly(ADP-ribose) polymerase activity, inhibits transformation. Oncogene. 2005;24:1982-93
- Opresko P, Mason P, Podell E, Lei M, Hickson I, Cech T, et al. POT1 stimulates RecQ helicases WRN and BLM to unwind telomeric DNA substrates. J Biol Chem. 2005;280:32069-80
- Benson F, Stasiak A, West S. Purification and characterization of the human Rad51 protein, an analogue of E. coli RecA. EMBO J. 1994;13:5764-71
- Loyola A, Huang J, LeRoy G, Hu S, Wang Y, Donnelly R, et al. Functional analysis of the subunits of the chromatin assembly factor RSF. Mol Cell Biol. 2003;23:6759-68
- LeRoy G, Orphanides G, Lane W, Reinberg D. Requirement of RSF and FACT for transcription of chromatin templates in vitro. Science. 1998;282:1900-4
- Kimura K, Cuvier O, Hirano T. Chromosome condensation by a human condensin complex in Xenopus egg extracts. J Biol Chem. 2001;276:5417-20
- Miyamoto T, Hasuike S, Yogev L, Maduro M, Ishikawa M, Westphal H, et al. Azoospermia in patients heterozygous for a mutation in SYCP3. Lancet. 2003;362:1714-9
- Interthal H, Quigley P, Hol W, Champoux J. The role of lysine 532 in the catalytic mechanism of human topoisomerase I. J Biol Chem. 2004;279:2984-92
- Christensen M, Barthelmes H, Boege F, Mielke C. Residues 190-210 of human topoisomerase I are required for enzyme activity in vivo but not in vitro. Nucleic Acids Res. 2003;31:7255-63
- Bharti A, Olson M, Kufe D, Rubin E. Identification of a nucleolin binding site in human topoisomerase I. J Biol Chem. 1996;271:1993-7
- Stewart L, Ireton G, Parker L, Madden K, Champoux J. Biochemical and biophysical analyses of recombinant forms of human topoisomerase I. J Biol Chem. 1996;271:7593-601
- Stewart L, Ireton G, Champoux J. The domain organization of human topoisomerase I. J Biol Chem. 1996;271:7602-8
- Labourier E, Rossi F, Gallouzi I, Allemand E, Divita G, Tazi J. Interaction between the N-terminal domain of human DNA topoisomerase I and the arginine-serine domain of its substrate determines phosphorylation of SF2/ASF splicing factor. Nucleic Acids Res. 1998;26:2955-62
- West K, Meczes E, Thorn R, Turnbull R, Marshall R, Austin C. Mutagenesis of E477 or K505 in the B' domain of human topoisomerase II beta increases the requirement for magnesium ions during strand passage. Biochemistry. 2000;39:1223-33
- Hublitz P, Kunowska N, Mayer U, Müller J, Heyne K, Yin N, et al. NIR is a novel INHAT repressor that modulates the transcriptional activity of p53. Genes Dev. 2005;19:2912-24
- Brain R, Jenkins J. Human p53 directs DNA strand reassociation and is photolabelled by 8-azido ATP. Oncogene. 1994;9:1775-80
- Kouzarides T. Chromatin modifications and their function. Cell. 2007;128:693-705
- Dekker J, Rippe K, Dekker M, Kleckner N. Capturing chromosome conformation. Science. 2002;295:1306-11
- Tolhuis B, Palstra R, Splinter E, Grosveld F, de Laat W. Looping and interaction between hypersensitive sites in the active beta-globin locus. Mol Cell. 2002;10:1453-65
- Dekker J. The three 'C' s of chromosome conformation capture: controls, controls, controls. Nat Methods. 2006;3:17-21
- Zhao Z, Tavoosidana G, Sjölinder M, Göndör A, Mariano P, Wang S, et al. Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat Genet. 2006;38:1341-7
- Dostie J, Richmond T, Arnaout R, Selzer R, Lee W, Honan T, et al. Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res. 2006;16:1299-309
- Dostie J, Dekker J. Mapping networks of physical interactions between genomic elements using 5C technology. Nat Protoc. 2007;2:988-1002
- Materials and Methods [ISSN: 2329-5139] is a unique online journal with regularly updated review articles on laboratory materials and methods.