Visualize NGS Data Using Genomics Viewer App
The Genomics Viewer app lets you view and explore integrated genomic data with an embedded version of the Integrative Genomics Viewer (IGV) [1][2]. The genomic data include NGS read alignments, genome variants, and segmented copy number data.
The first part of this example gives a brief overview of the app and its supported file formats. The second part explores a single nucleotide variation in the cytochrome P450 gene (CYP2C19).
Open the App
At the command line, type genomicsviewer. Alternatively, click the Genomics Viewer app icon on the Apps tab. The app requires an internet connection.
By default, the app loads Human (GRCh38/hg38) as the reference sequence and RefSeq Genes as the annotation file. There are two main panels in the app. The left panel is the Tracks panel and the right panel is the embedded IGV app. The Tracks panel is a read-only area that displays the track names, source file names, and track types. It updates as you configure the tracks in the embedded IGV app.
The Reset button restores the app to the default view with two tracks (hg38 with RefSeq Genes) and removes any other existing tracks. Before resetting, you can save the current view as a session (.json) file and restore it later.
Add Tracks by Importing Data
Import Reference Sequence
You can import a single reference sequence. The reference sequence must be in a FASTA file. Select Import Reference on the Home tab. You can also import a corresponding cytoband file that contains cytogenetic G-banding data. You can add local files or specify external URLs. A URL must start with either https or gs. Other file transfer protocols, such as FTP, are not supported.
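The URL rule above can be checked before you paste a link into the app. The following Python sketch is not part of the app; the function name is_supported_url and the example URLs are hypothetical, and it only tests the scheme named in the text (https or gs).

```python
from urllib.parse import urlparse

# Schemes that the text above lists as accepted for remote files.
SUPPORTED_SCHEMES = {"https", "gs"}


def is_supported_url(url: str) -> bool:
    """Return True if the URL uses one of the schemes named in the text.

    Hypothetical helper for illustration; it does not contact the server
    or check that the file exists.
    """
    return urlparse(url).scheme.lower() in SUPPORTED_SCHEMES


if __name__ == "__main__":
    # Example URLs are placeholders, not real data locations.
    for candidate in ("https://example.org/ref.fa",
                      "gs://example-bucket/ref.fa",
                      "ftp://example.org/ref.fa"):
        print(candidate, is_supported_url(candidate))
```

The same check applies to the alignment, annotation, and genomic data URLs described in the following sections.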
Import Sequence Read Alignment Data
You can import multiple sets of sequence read alignment data. The alignment data must be in a BAM or CRAM file. The corresponding index file (.bai or .crai) does not have to be in the same location as your BAM or CRAM file. However, the app runs slower without the index file.
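Because a missing index slows the app down, it can help to confirm that the index sits next to the alignment file before importing it. The sketch below is an illustration only: find_index is a hypothetical helper, and the two file-naming patterns it tries (sample.bam.bai and sample.bai, or the .crai equivalents) are common conventions assumed here, not something the app documentation specifies.

```python
from pathlib import Path
from typing import Optional

# Index extension expected for each alignment format.
INDEX_SUFFIX = {".bam": ".bai", ".cram": ".crai"}


def find_index(alignment_path: str) -> Optional[Path]:
    """Return the index file next to a BAM/CRAM file, or None if absent."""
    path = Path(alignment_path)
    index_suffix = INDEX_SUFFIX.get(path.suffix.lower())
    if index_suffix is None:
        raise ValueError(f"not a BAM or CRAM file: {alignment_path}")

    # Assumed naming conventions: sample.cram.crai, then sample.crai.
    candidates = [
        path.with_name(path.name + index_suffix),
        path.with_suffix(index_suffix),
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

A return value of None means the app can still load the file, but more slowly.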
You can add read alignment files using the Add tracks from file and Add tracks from URL options from the Add Tracks button. If you are specifying a URL, the URL must start with either https or gs. Other file transfer protocols, such as FTP, are not supported.
Import Feature Annotations and Other Genomic Data
You can import multiple sets of feature annotations from several files that contain data for a single reference sequence. The supported annotation files are: .bed, .gff, .gff3, and .gtf.
You can also import structural variants (.vcf) and visualize genetic alterations, such as insertions and deletions.
You can view segmented copy number data (.seg) and quantitative genomic data (.wig, .bigwig, and .bedgraph), such as ChIP peaks and alignment coverage.
You can add annotation and genomic data files using the Add tracks from file and Add tracks from URL options from the Add Tracks button. If you are specifying a URL, the URL must start with either https or gs. Other file transfer protocols, such as FTP, are not supported.
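The file formats listed in this section can be summarized as a lookup from file extension to the kind of data it carries. The following Python sketch is only a summary aid: the category labels and the track_category function are hypothetical names chosen for illustration, not identifiers used by the app, and compressed files (for example, names ending in .gz) are not handled.

```python
from pathlib import Path
from typing import Optional

# Extensions named in this section, grouped by the kind of data they hold.
# The category labels are illustrative, not the app's own track types.
TRACK_CATEGORIES = {
    ".bam": "alignment",
    ".cram": "alignment",
    ".bed": "annotation",
    ".gff": "annotation",
    ".gff3": "annotation",
    ".gtf": "annotation",
    ".vcf": "variant",
    ".seg": "copy number",
    ".wig": "quantitative",
    ".bigwig": "quantitative",
    ".bedgraph": "quantitative",
}


def track_category(filename: str) -> Optional[str]:
    """Return the data category for a file name, or None if unsupported."""
    return TRACK_CATEGORIES.get(Path(filename).suffix.lower())
```

For example, track_category("genes.gtf") returns "annotation", while a file with an extension that is not listed returns None.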
Visualize Single Nucleotide Variation in Cytochrome P450
The CYP2C19 gene is a member of the cytochrome P450 gene family. Enzymes produced from cytochrome P450 genes are involved in the metabolism of various molecules and chemicals within cells. The CYP2C19 enzyme plays a role in metabolizing at least 10 percent of commonly prescribed drugs [3]. Polymorphisms in the cytochrome P450 family may cause adverse drug responses in individuals. One example of a single nucleotide variation is rs4986893 at position chr10:94,780,653, where G is replaced by A. This allelic variant is also known as CYP2C19*3.
The following steps show how to visualize this variation in the app using both low coverage and high coverage data.
Load Session File
For the purposes of this example, start with a session file (rs4986893.json) that has some preloaded tracks. After downloading the file, load it in the app. Click Open and select rs4986893.json.
Explore Low Coverage Data
The session contains three tracks:
Human (GRCh38/hg38) as a reference
NA18564 as low coverage alignment data
RefSeq Genes
The low coverage alignment data comes from a female Han Chinese individual from Beijing, China. The sample ID is NA18564, and the sample has been identified as carrying the CYP2C19*3 mutation [4].
The alignment data is centered on the location of the mutation in the CYP2C19 gene.
Click the orange bar in the coverage area to look at the position and allele distribution information.
It shows that 71% of the reads have G while 29% have A at the location chr10:94,780,653. Because this is low coverage data, it may not show all occurrences of this mutation. High coverage data is explored later in the example.
Close the data tip window.
You can customize various aspects of the data display in the app. For example, you can change the track height to make more room for later tracks. Click the second gear icon. Select Set track height. Enter 200.
For details on the embedded IGV app and its available options, see the IGV documentation.
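The allele percentages reported in the coverage data tip are the share of reads that carry each base at the selected position. The short Python sketch below illustrates that arithmetic only. The read counts are hypothetical (the example does not state the actual read depth) and were chosen so that the result matches the 71% and 29% reported above; allele_percentages is an illustrative name, not an app function.

```python
from collections import Counter
from typing import Dict, Iterable


def allele_percentages(bases: Iterable[str]) -> Dict[str, int]:
    """Return the rounded percentage of reads carrying each base."""
    counts = Counter(bases)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: round(100 * n / total) for base, n in counts.items()}


# Hypothetical pileup at chr10:94,780,653: five reads with G, two with A.
reads_at_site = ["G"] * 5 + ["A"] * 2
print(allele_percentages(reads_at_site))  # {'G': 71, 'A': 29}
```

With so few reads, one additional read changes the percentages noticeably, which is why low coverage data gives only a rough picture of the allele distribution.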
Explore High Coverage Data
You can look at the high coverage data from the same sample to see the occurrences of this mutation.
Go to the International Genome Sample Resource website.
Search for the sample NA18564.
Download the exome alignment file that is in the .cram format.
Also download the corresponding index file that is in the .crai format. Save the index file in the same location as the source .cram file.
On the Home tab, click Add Tracks and choose the Add tracks from file option. Select the downloaded .cram file and click Open.
The high coverage data appears as track3. You can now see many occurrences of the mutation in several reads.
Click the orange bar in the coverage area to see the allele distribution. It shows that G is replaced by A in almost 50% of the reads.
References
[1] Robinson, J., H. Thorvaldsdóttir, W. Winckler, M. Guttman, E. Lander, G. Getz, J. Mesirov. 2011. Integrative Genomics Viewer. Nature Biotechnology. 29:24–26.
[2] Thorvaldsdóttir, H., J. Robinson, J. Mesirov. 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics. 14:178–192.
[3]
[4]