8. ATAC-seq analysis#

This page describes how to analyze ATAC-seq data for open chromatin analysis with Churros. Churros includes HMMRATAC bundled in MACS3 and TOBIAS. The sample scripts are also available at Churros GitHub site.

Note

This tutorial assumes using the Churros singularity image (churros.sif). Please add apptainer exec churros.sif before each command below.
Example: apptainer exec churros.sif download_genomedata.sh

8.1. Get data#

Here we use three ATAC-seq samples, which are paired-end.

mkdir -p fastq
for id in SRR2453157 SRR2453159 SRR2453158
do
    $sing pfastq-dump -t 4 -s $id -O fastq/ --gzip --split-files
done
Then download and generate the reference dataset including genome, gene annotation and index files. Chuross contains scripts for that: download_genomedata.sh and build-index.sh.
Here we specify hg38 for genome build. See Appendix for the detail of genome build.
mkdir -p log
build=hg38      # genome build
Ddir=Referencedata_$build   # output directory
ncore=12    # number of CPUs
# download the genome
download_genomedata.sh -s $build $Ddir
# make Bismark index
build-index.sh -p $ncore bowtie2 $Ddir

8.2. Running Churros#

churros has the --atac` option for ATAC-seq analysis, where MACS3 uses hmmratac command in addition to callpeak.

build=hg38
Ddir=/work/Database/Database_fromDocker/Referencedata_$build
ncore=12

churros --atac -p $ncore samplelist.txt samplepairlist.txt $build $Ddir

The results of HMMRATAC are output in Churros_result/hg38/hmmratac/.

8.3. Running TOBIAS#

After running churros, you can use churros_tobias.sh to apply TOBIAS for ATAC-seq footprinting analysis.

churros_tobias.sh requires BAM and peak files for input.

churros_tobias.sh [Options] <bam> <refpeak> <genome> <label>
    <bam>: BAM of the ATAC-seq sample
    <refpeak>: Reference peak file (BED format)
    <genome>: Genome fasta file
    <label>: Label of the sample
    Options:
        -o <str>: Output directory (default: "tobias")
        -p <int>: Number of cores to use (default: 4)

churros_tobias.sh executes TOBIAS ATACorrect, TOBIAS ScoreBigwig, and TOBIAS BINDetect. It also uses TOBIAS PlotAggregate to plot top three target motifs from JASPAR2026 database.

This is a example script to execute TOBIAS to all samples included in samplelist.txt.

build=hg38
Ddir=Referencedata_$build
genome=$Ddir/genome.fa

samplelist=samplelist.txt
chdir=Churros_result/$build/
odir=$chdir/tobias

while read -r LINE || [ -n "$LINE" ]; do
    LINE=($LINE)
    label=${LINE[0]}

    bam=$chdir/bam/$label.sort.bam
    peak=$chdir/macs/${label}_peaks.narrowPeak
    echo -e "\nTOBIAS: $bam and $peak.."
    $sing churros_tobias.sh -o $odir $bam $peak $genome $label
done < $samplelist

The results are output in Churros_result/hg38/tobias/. See TOBIAS User Guide for more detail.