Introduction: Structural Variation as a Central Feature of the Human Genome
Structural variants (SVs) are genomic alterations typically larger than 50 base pairs, encompassing copy number variants (CNVs), inversions, translocations, and complex rearrangements. They are a major source of genetic diversity and disease. Whereas point mutations change individual nucleotides, SVs reshape entire genomic segments, sometimes altering megabases of DNA.
Carvalho and Lupski argue that understanding SVs requires a mechanistic perspective: how do the physical structure of DNA and its repair machinery lead to rearrangements? Their review organizes the various mutational mechanisms into a unified framework that links genome architecture, replication dynamics, and DNA repair pathways to both recurrent and nonrecurrent SVs.
The authors emphasize that these mechanisms are not random accidents but consequences of the genome’s inherent design—rich in repeated sequences, fragile sites, and structural motifs that predispose to rearrangement.
Genome Architecture and Recombination Substrates
The human genome contains abundant repetitive elements, including Alu elements, LINE-1 sequences, and segmental duplications (also known as low-copy repeats, or LCRs). These duplications, typically 10–500 kilobases in size and sharing 90–99% sequence identity, provide substrates for ectopic recombination—recombination between similar sequences located in non-allelic positions.
The existence of these homologous regions underpins the most common mechanism of recurrent structural variation: non-allelic homologous recombination (NAHR).
When homologous chromosomes misalign during meiosis because of repeated sequences, recombination between mispaired LCRs can delete, duplicate, or invert the intervening genomic segment.
Genomic disorders such as Charcot–Marie–Tooth disease type 1A (CMT1A) and hereditary neuropathy with liability to pressure palsies (HNPP) exemplify this mechanism. Both arise from NAHR between the same flanking repeats on chromosome 17p12 but result in reciprocal duplication (CMT1A) or deletion (HNPP). This “recurrent reciprocal rearrangement” pattern is a hallmark of NAHR-mediated SVs.
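The reciprocal outcome of NAHR can be illustrated with a toy model. The following is a minimal sketch, not a biological simulation: a chromatid is represented as a list of segment labels, and the labels `A`, `REP`, `GENE`, and `B`, as well as the function name `nahr_unequal_crossover`, are illustrative assumptions rather than anything taken from the review.

```python
def nahr_unequal_crossover(chromatid, repeat="REP"):
    """Toy model of NAHR between two misaligned copies of the same chromatid.

    Assumes exactly two directly oriented copies of `repeat`. The proximal
    repeat of one copy pairs with the distal repeat of the other, and a
    crossover inside the paired repeats yields two reciprocal products.
    Returns (deletion_product, duplication_product).
    """
    positions = [i for i, segment in enumerate(chromatid) if segment == repeat]
    if len(positions) != 2:
        raise ValueError("expected exactly two copies of the repeat")
    proximal, distal = positions

    # Product 1: sequence up to the proximal repeat, then everything
    # after the distal repeat. The intervening segment is lost.
    deletion = chromatid[:proximal + 1] + chromatid[distal + 1:]

    # Product 2: sequence up to the distal repeat, then everything
    # after the proximal repeat. The intervening segment is present twice.
    duplication = chromatid[:distal + 1] + chromatid[proximal + 1:]
    return deletion, duplication


# Hypothetical locus: a gene flanked by two copies of the same repeat.
locus = ["A", "REP", "GENE", "REP", "B"]
deletion, duplication = nahr_unequal_crossover(locus)
print(deletion)     # ['A', 'REP', 'B']
print(duplication)  # ['A', 'REP', 'GENE', 'REP', 'GENE', 'REP', 'B']
```

In this simplified picture, the first product corresponds to an HNPP-like deletion and the second to a CMT1A-like duplication; both have the same size change because both are defined by the same pair of flanking repeats.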
Mechanistic Categories of Structural Variant Formation
Carvalho and Lupski categorize SV mechanisms into three broad classes:
- Recombination-based mechanisms, such as NAHR and single-strand annealing (SSA).
- Replication-based mechanisms, including FoSTeS (Fork Stalling and Template Switching) and MMBIR (Microhomology-Mediated Break-Induced Replication).
- Repair-based mechanisms, primarily non-homologous end joining (NHEJ) and microhomology-mediated end joining (MMEJ).
Each mechanism leaves distinct molecular signatures at the rearrangement breakpoints—patterns of homology or microhomology that serve as “mutational fingerprints.”
Non-Allelic Homologous Recombination (NAHR)
NAHR is mediated by long stretches of high sequence identity, typically greater than 300 base pairs, often found in segmental duplications. When homologous chromosomes or sister chromatids misalign, crossing over between these repeats produces recurrent rearrangements of predictable size and orientation.
Because NAHR relies on the same repeat pairs in different individuals, breakpoints cluster within those repeats and the resulting rearrangements are nearly identical in size across patients. This explains the recurrent nature of syndromes such as:
- Williams–Beuren syndrome (7q11.23 deletion)
- Smith–Magenis and Potocki–Lupski syndromes (17p11.2)
- CMT1A/HNPP (17p12)
NAHR can occur in meiosis or mitosis. When the exchange takes place between sister chromatids or homologous chromosomes, it generates reciprocal products: a deletion on one product and a duplication on the other. The mechanism depends heavily on genomic architecture: without homologous repeats, NAHR cannot occur.
Non-Homologous End Joining (NHEJ)
NHEJ repairs DNA double-strand breaks (DSBs) without requiring extensive homology. It simply ligates DNA ends after limited end processing. This mechanism can create unique, nonrecurrent rearrangements characterized by blunt or microhomologous joins (1–5 bp) and occasional small insertions.
NHEJ operates throughout the cell cycle, especially in G1, and contributes to structural variation in somatic cells, including in cancer. In the germ line, it can produce deletions, duplications, and translocations at fragile sites where replication stress leads to DSBs. Because it lacks sequence constraints, NHEJ-mediated SVs are highly heterogeneous in size and position.
Replication-Based Mechanisms (FoSTeS and MMBIR)
One of the review’s most influential contributions is its description of replication-based mechanisms, first proposed by Lupski’s group to explain complex, nonrecurrent CNVs.
When replication forks stall—due to secondary structures, DNA lesions, or transcription conflicts—the lagging strand can disengage and anneal to a new template elsewhere in the genome using microhomology (2–15 bp). Replication then resumes from this ectopic template, creating duplications, triplications, or complex rearrangements. This is the essence of Fork Stalling and Template Switching (FoSTeS).
A related mechanism, Microhomology-Mediated Break-Induced Replication (MMBIR), occurs when a collapsed replication fork invades another DNA molecule using short homology tracts, similarly generating complex rearrangements.
Replication-based mechanisms explain SVs that are nonrecurrent, complex, and contain multiple templated segments, such as triplication-within-duplication structures or inverted insertions. Breakpoint sequencing in patients with genomic disorders often reveals these signatures.
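To make the notion of junction microhomology concrete, here is a small sketch of how it might be measured for a simple deletion. It assumes a single reference string and 0-based breakpoint coordinates; the function name, the example sequences, and the coordinates are hypothetical. Real junction analysis must also handle insertions, inversions, and tandem-repeat ambiguity, all of which this sketch ignores.

```python
def junction_microhomology(ref, left_end, right_start):
    """Count bases shared by both breakpoints of a simple deletion.

    The deletion joins ref[:left_end] to ref[right_start:]. Bases that are
    identical at both breakpoints could be assigned to either side, so the
    junction can "slide" across them; their number is the microhomology.
    """
    if not 0 <= left_end < right_start <= len(ref):
        raise ValueError("need 0 <= left_end < right_start <= len(ref)")

    # Slide to the right: compare the first deleted bases with the
    # first retained bases downstream of the junction.
    forward = 0
    while (left_end + forward < right_start
           and right_start + forward < len(ref)
           and ref[left_end + forward] == ref[right_start + forward]):
        forward += 1

    # Slide to the left: compare the last retained bases upstream of the
    # junction with the last deleted bases.
    backward = 0
    while (left_end - 1 - backward >= 0
           and right_start - 1 - backward >= left_end
           and ref[left_end - 1 - backward] == ref[right_start - 1 - backward]):
        backward += 1

    return forward + backward


# Made-up reference: flank, "ACG", deleted interior, "ACG", flank.
ref = "TTGCA" + "ACG" + "TTTTT" + "ACG" + "CCGAT"
print(junction_microhomology(ref, 5, 13))   # 3
print(junction_microhomology(ref, 8, 16))   # 3 (same junction, shifted call)
print(junction_microhomology("AACCGGTT", 2, 6))  # 0 (blunt join)
```

The two calls on `ref` describe the same junction sequence with different coordinates, which is why the count is taken in both directions: the result should not depend on where a caller happened to place the breakpoint within the shared bases.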
Complex and Chromothriptic Rearrangements
The authors discuss chromothripsis—a catastrophic event in which a chromosome shatters into dozens of fragments that are then stitched back together in random order. Although originally observed in cancer, chromothripsis-like events have been found in congenital disorders.
Such complex rearrangements likely involve multiple DSBs and replication stress, engaging combinations of NHEJ and MMBIR mechanisms. These phenomena blur the distinction between “simple” SVs and massive genome restructuring, highlighting the continuum of mutational complexity.
Determinants of Structural Variant Hotspots
Genome architecture strongly influences where SVs form:
- Segmental duplications promote NAHR.
- AT-rich regions and replication origins are prone to fork stalling.
- Palindromic or repetitive motifs can form secondary structures (e.g., cruciforms, hairpins) that trigger DSBs.
- Replication timing and chromatin state affect accessibility and susceptibility to breakage.
Furthermore, some regions are “reused” in multiple independent rearrangements across different syndromes, emphasizing that genomic context, not just random error, defines mutational landscapes.
Recurrent vs. Nonrecurrent Rearrangements
The review distinguishes:
- Recurrent SVs: the same size in unrelated individuals, with breakpoints clustering within the same flanking repeats (mediated by NAHR).
- Nonrecurrent SVs: variable size and breakpoint positions (arising from NHEJ, FoSTeS, or MMBIR).
This distinction is clinically relevant. Recurrent events cause well-defined genomic syndromes with consistent phenotypes, whereas nonrecurrent events often underlie sporadic or unique cases.
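As a rough illustration of how this distinction might be operationalized, the sketch below labels a set of rearrangements from unrelated individuals as recurrent when both breakpoints cluster within a chosen tolerance. The function name, the tolerance value, and all coordinates are hypothetical; in practice the tolerance would depend on the size of the flanking repeats and on assay resolution.

```python
def classify_recurrence(cnvs, tolerance_bp):
    """Label CNVs from unrelated individuals as recurrent or nonrecurrent.

    `cnvs` is a list of (start, end) coordinates on the same chromosome.
    The set is called "recurrent" when all start breakpoints lie within
    `tolerance_bp` of one another and the same holds for all end breakpoints.
    """
    if len(cnvs) < 2:
        raise ValueError("need at least two CNVs to compare")
    starts = [start for start, _ in cnvs]
    ends = [end for _, end in cnvs]
    start_spread = max(starts) - min(starts)
    end_spread = max(ends) - min(ends)
    if start_spread <= tolerance_bp and end_spread <= tolerance_bp:
        return "recurrent"
    return "nonrecurrent"


# Made-up coordinates: breakpoints cluster tightly in the first set only.
clustered = [(1000, 2500), (1020, 2490), (990, 2510)]
scattered = [(1000, 2500), (400, 3100), (1500, 2600)]
print(classify_recurrence(clustered, tolerance_bp=50))  # recurrent
print(classify_recurrence(scattered, tolerance_bp=50))  # nonrecurrent
```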
Clinical and Evolutionary Implications
From a medical standpoint, these mechanisms explain both genomic disorders (due to recurrent rearrangements) and individual pathogenic CNVs (due to replication-based events). Understanding mechanism can guide diagnostic interpretation: for example, if breakpoints fall within LCRs, NAHR is likely; if the breakpoints are unique and the rearrangement is complex, a replication-based mechanism is suspected.
From an evolutionary perspective, the same mechanisms that cause disease also generate beneficial variation. NAHR-driven duplications can create raw material for new gene functions, while replication-based processes contribute to gene family expansion. Thus, genome instability is both a source of disease and of innovation.
Molecular Signatures for Mechanism Inference
Carvalho and Lupski outline diagnostic clues visible at breakpoint junctions:
- >100 bp homology → NAHR
- 2–15 bp microhomology → FoSTeS/MMBIR
- blunt or minimal overlap → NHEJ
- insertions of non-templated bases → end joining errors
Sequencing of CNV junctions thus reveals the underlying mutational pathway, offering mechanistic insights in both research and clinical diagnostics.
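The clues listed above can be written down as a simple rule-based lookup. This is a minimal sketch using the thresholds from the list; the function name and parameters are hypothetical, and treating "blunt or minimal overlap" as 0 to 1 bp is an assumption made here for illustration. Because short microhomology (roughly 2 to 5 bp) is compatible with both end joining and replication-based repair, the output should be read as a candidate label, not a definitive call.

```python
def candidate_mechanisms(homology_bp, inserted_bp=0, nahr_min_bp=100):
    """Map junction features to candidate mechanisms using the listed clues.

    homology_bp: length of shared sequence at the breakpoint junction.
    inserted_bp: number of non-templated bases inserted at the junction.
    nahr_min_bp: homology above this length suggests NAHR (100 bp per the list).
    """
    if homology_bp < 0 or inserted_bp < 0:
        raise ValueError("lengths must be non-negative")

    candidates = []
    if homology_bp > nahr_min_bp:
        candidates.append("NAHR")
    elif 2 <= homology_bp <= 15:
        candidates.append("FoSTeS/MMBIR")
    elif homology_bp <= 1:
        # Assumption: "blunt or minimal overlap" is taken to mean 0-1 bp.
        candidates.append("NHEJ")

    if inserted_bp > 0:
        candidates.append("end-joining error (non-templated insertion)")

    # Homology of 16 bp up to nahr_min_bp is not covered by the listed clues.
    return candidates or ["unclassified"]


print(candidate_mechanisms(1200))             # ['NAHR']
print(candidate_mechanisms(4))                # ['FoSTeS/MMBIR']
print(candidate_mechanisms(0, inserted_bp=3))
# ['NHEJ', 'end-joining error (non-templated insertion)']
print(candidate_mechanisms(40))               # ['unclassified']
```

Returning a list rather than a single label keeps the ambiguity visible: a junction can carry more than one clue, and some homology lengths fall outside every listed category.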
Unifying Model: Replication–Recombination–Repair (RRR) Interplay
The review concludes by proposing an integrative Replication–Recombination–Repair (RRR) model, recognizing that these processes operate concurrently and sometimes sequentially. DNA replication stress can initiate DSBs, recombination resolves them, and repair pathways finalize the rearrangement. The balance among these processes determines whether genome maintenance succeeds or mutational catastrophe ensues.
Conclusions
Carvalho and Lupski provide a comprehensive mechanistic framework for understanding how structural variants form and why certain genomic regions are unstable. Their key messages are:
- Genome architecture predisposes to rearrangement by providing homologous or repetitive substrates.
- Multiple molecular mechanisms—NAHR, NHEJ, and replication-based pathways—underlie SV formation.
- Breakpoint signatures enable inference of mutational origin.
- Recurrent and nonrecurrent SVs represent two ends of a mechanistic continuum.
- The same processes drive both pathology and evolution, reflecting the dual nature of genome instability.
This review transformed cytogenetic thinking: structural variation is no longer viewed as a random anomaly but as an intrinsic feature of a dynamic genome shaped by its own repair and replication machinery. It laid the conceptual groundwork for modern mechanistic cytogenomics, where the origin of a rearrangement is as important as its outcome.