Omni-ATAC: Updated and Optimized ATAC-seq Protocol

Previously, we shared a comprehensive overview that described various tools for each step of ATAC-seq data analysis, see “Overview: A Complete Guide to ATAC-seq Data Analysis Tools.” This time, we introduce another overview that presents an updated and optimized ATAC-seq protocol called Omni-ATAC. The literature information is as follows:

Title: Chromatin accessibility profiling by ATAC-seq

Published in: Nat Protoc. 2022 Apr 27;17(6):1518–1552.

DOI: 10.1038/s41596-022-00692-9

Link: https://pmc.ncbi.nlm.nih.gov/articles/PMC9189070/

Overview of the Omni-ATAC Protocol

ATAC-seq requires relatively low input cells and does not require prior knowledge of the dynamic epigenetic marks or transcription factors of the regulatory system. Here, the authors describe an updated and optimized ATAC-seq protocol called Omni-ATAC, suitable for a wide range of cell and tissue types. This protocol details the steps for generating and sequencing ATAC-seq libraries and offers suggestions for sample preparation and downstream bioinformatics analysis. The ATAC-seq workflow mainly includes five steps:

  • Sample preparation
  • Transposition
  • Library preparation
  • Sequencing
  • Data analysis

As shown in the figure below, it also includes the approximate time required for each step:

Omni-ATAC: Updated and Optimized ATAC-seq Protocol
Figure 2: Schematic overview of ATAC-seq protocol

Comparison with Other Technologies

There are many existing technologies for mapping DNA regulatory elements, making it challenging to choose the most appropriate and informative technology for specific applications. In the table below, the authors compare some of the most commonly used technologies for mapping DNA regulatory elements: ATAC-seq, DNase-seq, MNase-seq, ChIP-seq, and targeted CUT&TAG, to help new users decide which detection method is most suitable for their specific applications.

Selection principles:

  • (i) What type of information is needed to answer specific research questions
  • (ii) What type of input material is available.

Generally, epigenomic analysis is suitable for answering questions about how or why a cell type or tissue may exhibit changes in gene regulation. For questions primarily involving “what changes occurred,” we recommend starting with RNA sequencing.

ATAC-seq DNase-seq MNase-seq CUT&TAG or related ChIC techniques
Type of enzyme Tn5 endonuclease endonuclease and exonuclease Tn5 conjugated to an antibody via Protein A.
Is there sequencing bias? Yes; complex, Tn5 insertion bias, with preference for A/Ts in insertion site and C/Gs flanking133-135 Yes; complex, partially dependent on enzyme concentration and on methylation status of CpGs85,136 Yes; preferential cutting upstream of A/T compared to G/C137,138 Yes; dictated by antibody used to guide Tn5 and by Tn5 bias.
Input cell/nucleus number in standard analysis 500-50,000 1-10 million 10,000-100,000 100,000-500,000
Are low starting amount/single-cell methods available? Yes86,87; commercial solutions available. Yes67 Yes66 Yes62,64,139-141
Sample type Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. Formaldehyde cross-linked or formalin-fixed paraffin-embedded samples. Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues. Formaldehyde cross-linked samples. Fresh or cryopreserved cells or nuclei. Fresh or frozen tissues.
Library preparation time ~10 hours for 12 samples (this protocol) 1-3 days ~ 2-days 1-2 days
Technical considerations Library quality is highly dependent on cell viability. Protocol alterations are required for use on fixed cells and data quality is often reduced for those samples. Enzyme concentration and digestion duration may need to be optimized to sample type. Size of fragments selected affects downstream analysis.28 Enzyme concentration and digestion duration may need to be optimized to sample type. Apparent nucleosome occupancy is a function of MNase concentration. The amount of antibody used must be titrated for the cell type or sample. This will be a function of the strength of the antibody and the abundance of the target protein. The assay is as specific as the primary antibody used. Additionally, this is a targeted technique, so additional libraries must be made of each modification or protein tested.
Sequencing type Paired-end Single-end Single-end Single-end or paired-end
Sequencing depth Low; 10 million read-pairs per sample with Omni-ATAC. Medium/high: 20-50 million uniquely mapping reads per sample; 200 million for TF footprinting. High; 150-200 million reads per sample (human)142 Very low; 3 million read-pairs per sample.
Data yield Tn5-accessible chromatin; DNase-accessible chromatin; TF footprinting. Nucleosome positioning, inaccessible chromatin. Location of target on DNA.
Main advantages Links labeling of accessible regions and NGS library preparation, making preparation of library straightforward. Footprinting analysis. Method of choice for nucleosome positioning and quantitative nucleosome dynamics. Enables mapping of specific TF or histone modification in low cell numbers. Some histone modifications, like H3K27ac, can be used to look for active enhancers.

Comparison with Previous ATAC-seq Methods

There are still several shortcomings in early ATAC-seq methods. For example,

  • Due to the lack of chromatinization of mitochondrial DNA, if lysed mitochondria are present in the ATAC-seq reaction, it will lead to a large number of ATAC-seq sequencing reads mapping to mitochondrial DNA.
  • In many cell types and contexts, low signal-to-noise ratio makes it difficult or even impossible to apply ATAC-seq to certain experimental systems.

To address some of the above situations, the authors previously developed a universal and optimized ATAC-seq method called Omni-ATAC, which resolves many of the cell or context-specific issues that limit the widespread application of ATAC-seq.

Development of the Omni-ATAC Protocol

The Omni-ATAC protocol improves the original ATAC-seq method by reducing reads mapped to mitochondrial DNA and increasing the signal-to-noise ratio in various cell lines, tissues, and frozen samples. This improvement is achieved through the optimization of cell lysis, nuclear separation, and transposition reactions. The optimizations in the Omni-ATAC protocol enable lysis of various cell types by adding Tween-20 and digitonin, along with the traditional Nonidet P40 (NP40).

Omni-ATAC: Updated and Optimized ATAC-seq Protocol
Figure 1: Schematic of the ATAC-seq transposition reaction and library preparation

Experimental Design

1. Preparation of Input Materials

  • Applicable to various mammalian cell and tissue types:
  • As low as 500 cells (or nuclei), with optimal results at 50,000 cells
  • Samples should preferably be: fresh or cryopreserved intact cells or nuclei
  • Not applicable to formalin-fixed paraffin-embedded (FFPE) tissues
  • Biological replicates and technical replicates: When resources are limited, biological replicates are recommended instead of technical replicates; if biological replicates are limited, it is best to perform 2-3 technical replicates.

2. Quality Control of ATAC-seq Libraries

The authors strongly recommend determining the quality of the final ATAC-seq library through low-depth sequencing (50,000 to 100,000 read pairs per sample). The success of ATAC-seq library generation depends on four key factors:

  • (i) Enrichment of transposase insertions in known chromatin-accessible regions (signal-to-noise ratio)
  • (ii) Total number of unique fragments (library complexity)
  • (iii) Mapping rate to the nuclear genome versus mitochondrial genome
  • (iv) Distribution of library insert fragment sizes

As shown in the figures below:

  • (e) A successful ATAC-seq library: with high transcription start site (TSS) enrichment scores, but the nucleosome periodicity observed in the Bioanalyzer electropherogram is not obvious.
  • (f) An unsuccessful ATAC-seq library: with low TSS enrichment scores, and no obvious nucleosome periodicity observed in the Bioanalyzer electropherogram.
Omni-ATAC: Updated and Optimized ATAC-seq Protocol
Figure 3: Assessing ATAC-seq library quality

3. Sequencing Parameter Guidance

Sequencing Application Insight Gained Minimum Read Length† Index Length* Paired or Single-End Sequencing Data Amount (reads/sample)
Gene regulatory landscape profiling Peaks, differential peaks between samples, motif analysis of peaks 36 bp 8 Paired 10M
Genotyping Gene regulatory landscape + genotype of sample; useful for patient samples and to determine if sequence variants affect a peak. 100 bp 8 Paired 10M
Footprinting Analysis Footprinting of different TFs to determine binding sequence at base-pair resolution 36 bp 8 Paired 200M
Nucleosome occupancy Location of nucleosomes along DNA 36 bp 8 Paired 60M

More detailed requirements can be found in the original text.

Data Analysis

After sequencing, the authors recommend using publicly available analysis workflows to perform alignment and downstream analysis, such as:

Omni-ATAC: Updated and Optimized ATAC-seq Protocol
Figure 4: Overview of the steps of ATAC-seq data analysis

Comparison of the Above Three Analysis Pipelines:

Step/Process ENCODE ATAC-seq PEPATAC nf-core atacseq
Version for Comparison v1.10.0 v0.10.0 v1.2.1
Running Environment Cromwell/caper Pypiper Nextflow
Adapter Removal, Alignment, and Deduplication Cutadapt, bowtie2, Picard TRIMMOMATIC, skewer, bowtie2, BWA, samblaster, Picard TrimGalore, BWA, Picard
Tn5 Offset Correction Yes Yes No
Mitochondrial Gene Filtering Yes Yes Yes
Peak Calling Method MACS2 MACS2 (default), F-seq, Genrich MACS2
Method Based on the irreproducible discovery rate (IDR) for replicates – does not merge for a whole set of samples Fixed-width, iterative overlap Raw peak overlap using bedtools109 merge
Output Results BAM files, bigwig files (one representing fold enrichment over expected background and the other representing statistical significance), BED file of peaks for each file and for the merged peak set QC plots including alignment scoring, TSS scores and library complexity, BED peaks and counts, bam files, bigwig files (nucleotide resolution and smoothed) QC html report, bam files, normalized bigwig files, BED peaks, annotation of peaks (HOMER), merged peak set, differential accessibility (DESeq2), IGV output.
Code Repository https://github.com/ENCODE-DCC/atac-seq-pipeline https://github.com/databio/pepatac https://github.com/nf-core/atacseq

Peak Merging Strategy in Downstream Analysis:

Omni-ATAC: Updated and Optimized ATAC-seq Protocol
Figure 5: Schematic of peak merging strategies and the resulting merged peak sets

Single-cell ATAC-seq

Omini-ATAC is specifically designed for bulk ATAC-seq; single-cell ATAC-seq can refer to mature commercial applications such as 10X Genomics.

Friendly Announcement:

Online live course on bioinformatics introduction & data mining January 2025 class

After a 5-year hiatus, our bioinformatics skill tree VIP apprentices continue enrollment!

Affordable solutions to meet your bioinformatics analysis and computational needs.

Leave a Comment