Visualize NGS Data Using Genomics Viewer App

The Genomics Viewer app lets you view and explore integrated genomic data with an embedded version of the Integrative Genomics Viewer (IGV) [1][2]. The genomic data include NGS read alignments, genome variants, and segmented copy number data.

The first part of this example gives a brief overview of the app and its supported file formats. The second part explores a single nucleotide variation in the cytochrome P450 gene CYP2C19.

Open the App

At the command line, type genomicsViewer. Alternatively, click the app icon on the Apps tab. The app requires an internet connection.

By default, the app loads Human (GRCh38/hg38) as the reference sequence and Refseq Genes as the annotation file. The app has two main panels. The left panel is the Tracks panel and the right panel is the embedded IGV app. The Tracks panel is a read-only area that displays the track names, source file names, and track types. It updates as you configure the tracks in the embedded IGV app.

Default view of the Genomics Viewer app. The toolstrip is at the top. The Tracks panel is on the left. The embedded Integrative Genomics Viewer (IGV) is on the right.

The Reset button restores the app to the default view with two tracks (hg38 with Refseq Genes) and removes any other existing tracks. Before resetting, you can save the current view as a session (.json) file and restore it later.

Add Tracks by Importing Data

Import Reference Sequence

You can import a single reference sequence. The reference sequence must be in a FASTA file. Select Import Reference on the Home tab. You can also import a corresponding cytoband file that contains cytogenetic G-banding data. You can add local files or specify external URLs. A URL must start with either https or gs. Other file transfer protocols, such as FTP, are not supported.
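The URL rule above can be checked before handing a link to the app. The following is a minimal sketch in Python, not part of the app itself. The function name `is_supported_url` and the example URLs are hypothetical, and the sketch assumes that "starts with https or gs" means the URL scheme is `https` or `gs`.

```python
from urllib.parse import urlparse

# Schemes the text above says are accepted for remote files.
ALLOWED_SCHEMES = {"https", "gs"}


def is_supported_url(url: str) -> bool:
    """Return True if the URL uses an accepted scheme and names a host or bucket."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)


if __name__ == "__main__":
    # Hypothetical URLs, used only for illustration.
    for candidate in (
        "https://example.org/ref/genome.fa",
        "gs://example-bucket/ref/genome.fa",
        "ftp://example.org/ref/genome.fa",
    ):
        print(candidate, is_supported_url(candidate))
```

A check like this only looks at the form of the URL. It does not confirm that the file exists or that the server allows the app to read it.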

Import Sequence Read Alignment Data

You can import multiple sets of sequence read alignment data. The alignment data must be in a BAM or CRAM file. The corresponding index file (.bai or .crai) does not have to be in the same location as the BAM or CRAM file, but the app is slower without it.
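Because a missing index slows the app down, it can help to confirm that the index sits next to the alignment file before loading it. The sketch below is a Python illustration under stated assumptions: the helper name `find_index` is hypothetical, and the two naming patterns it tries (`sample.bam.bai` and `sample.bai`, likewise for CRAM) are common conventions rather than a documented lookup rule of the app.

```python
from pathlib import Path
from typing import Optional

# Index extension expected for each alignment format.
INDEX_SUFFIX = {".bam": ".bai", ".cram": ".crai"}


def find_index(alignment_path: str) -> Optional[Path]:
    """Return the index file stored next to a BAM/CRAM file, or None if absent."""
    path = Path(alignment_path)
    index_ext = INDEX_SUFFIX.get(path.suffix.lower())
    if index_ext is None:
        raise ValueError(f"not a BAM or CRAM file: {alignment_path}")
    # Assumed naming conventions: sample.bam.bai first, then sample.bai.
    candidates = [
        path.with_name(path.name + index_ext),
        path.with_suffix(index_ext),
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_index("reads/sample.cram")` returns the path of `reads/sample.cram.crai` if that file exists, and `None` if no index is found in the same folder.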

You can add read alignment files using the Add tracks from file and Add tracks from URL options of the Add tracks button. A URL must start with either https or gs. Other file transfer protocols, such as FTP, are not supported.

Import Feature Annotations and Other Genomic Data

You can import multiple sets of feature annotations from several files that contain data for a single reference sequence. The supported annotation file formats are .bed, .gff, .gff3, and .gtf.

You can also import structural variants (.vcf) and visualize genetic alterations, such as insertions and deletions.

You can view segmented copy number data (.seg) and quantitative genomic data (.wig, .bigwig, and .bedgraph), such as ChIP peaks and alignment coverage.

You can add annotation and genomic data files using the Add tracks from file and Add tracks from URL options of the Add tracks button. A URL must start with either https or gs. Other file transfer protocols, such as FTP, are not supported.
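The file formats listed in this section can be summarized as a lookup from file extension to the kind of data it holds. The following Python sketch is only an illustration. The function name `track_kind` and the category labels are hypothetical and are not names used by the app. Compressed files, such as those with a .gz suffix, are not handled.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions named in this section, grouped by the kind of data they hold.
# The labels on the right are illustrative, not the app's own track types.
TRACK_KINDS = {
    ".bam": "alignment",
    ".cram": "alignment",
    ".bed": "annotation",
    ".gff": "annotation",
    ".gff3": "annotation",
    ".gtf": "annotation",
    ".vcf": "variant",
    ".seg": "copy number",
    ".wig": "quantitative",
    ".bigwig": "quantitative",
    ".bedgraph": "quantitative",
}


def track_kind(name_or_url: str):
    """Return the data kind for a file name or URL, or None if unrecognized."""
    # For URLs, look only at the path part so query strings are ignored.
    path = urlparse(name_or_url).path or name_or_url
    suffix = PurePosixPath(path).suffix.lower()
    return TRACK_KINDS.get(suffix)
```

For example, `track_kind("genes.gff3")` gives `"annotation"`, and a file with an extension not listed above gives `None`.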

Visualize Single Nucleotide Variation in Cytochrome P450

The CYP2C19 gene is a member of the cytochrome P450 gene family. Enzymes produced from cytochrome P450 genes are involved in the metabolism of various molecules and chemicals within cells. The CYP2C19 enzyme plays a role in metabolizing at least 10 percent of commonly prescribed drugs [3]. Polymorphisms in the cytochrome P450 family may cause adverse drug responses in individuals. One example of a single nucleotide variation is rs4986893 at position chr10:94,780,653, where G is replaced by A. This allelic variant is also known as CYP2C19*3. The following steps show how to visualize this variation in the app using both low coverage and high coverage data.
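Positions such as chr10:94,780,653 are written with thousands separators, which scripts usually need to strip before doing arithmetic. The following Python sketch parses such a string into a chromosome name and an integer position. The names `Locus` and `parse_locus` are hypothetical, and the sketch handles only single positions, not ranges.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    chrom: str
    position: int


# A chromosome name, a colon, then digits with optional comma separators.
_LOCUS_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<pos>\d[\d,]*)$")


def parse_locus(text: str) -> Locus:
    """Parse a string such as 'chr10:94,780,653' into a Locus."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a single-position locus: {text!r}")
    return Locus(match.group("chrom"), int(match.group("pos").replace(",", "")))


if __name__ == "__main__":
    print(parse_locus("chr10:94,780,653"))
    # prints: Locus(chrom='chr10', position=94780653)
```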

Load Session File

For the purposes of this example, start with a session file (rs4986893.json) that has some preloaded tracks. After downloading the file, load it in the app. Click Open and select rs4986893.json.

Explore Low Coverage Data

The session contains three tracks:

  • Human (GRCh38/hg38) as a reference

  • NA18564 as low coverage alignment data

  • Refseq Genes

The low coverage alignment data comes from a female Han Chinese individual from Beijing, China. The sample ID is NA18564, and the sample has been identified as carrying the CYP2C19*3 mutation [4].

The Tracks panel has three tracks: the hg38.fa sequence, the NA18564 alignment data, and the Refseq Genes annotation. IGV shows the aligned reads graphically.

The alignment data is centered on the location of the mutation in the CYP2C19 gene.

  1. Click the orange bar in the coverage area to look at the position and allele distribution information.

    Image of aligned reads with overlapping context menu in IGV. The context menu shows the counts of A, C, G, T, and N.

    It shows that 71% of the reads have G while 29% have A at the location chr10:94,780,653. This is low coverage data and may not show all occurrences of this mutation. High coverage data is explored later in the example.

    Close the data tip window.

  2. You can customize various aspects of the data display in the app. For example, you can change the track height to make more room for later tracks. Click the second gear icon. Select Set track height. Enter 200.

    Image of the context menu. It has various options to change the appearance of the track, such as track color, track name, and track height.

    For details on the embedded IGV app and its available options, see the IGV documentation.

Explore High Coverage Data

You can look at high coverage data from the same sample to see more occurrences of this mutation.

  1. Go to the International Genome Sample Resource.

  2. Search for the sample NA18564.

  3. Download the exome alignment file that is in the .cram format.

  4. Also download the corresponding index file that is in the .crai format. Save the file in the same location as the source .cram file.

  5. Click the Add tracks icon on the Home tab. Select the downloaded .cram file and click Open.

    The Tracks panel now shows a fourth track for the alignment data that was loaded. IGV also shows an additional track for the added alignment data.

    The high coverage data appears as track3. You can now see many occurrences of the mutation in several reads.

  6. Click the orange bar in the coverage area to see the allele distribution. It shows that G is replaced by A in almost 50% of the reads.

    Image of the context menu, which shows the counts for A, C, G, T, and N. The count for A is 79 (49%) and the count for G is 82 (51%). The other counts are zero.
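The percentages in the data tip follow directly from the base counts. As a worked check, the Python sketch below recomputes them from the counts reported above (79 reads with A, 82 with G, and zero for the others). The function name `allele_percentages` is hypothetical, and rounding to whole percentages is an assumption about how the values are displayed.

```python
def allele_percentages(counts: dict) -> dict:
    """Convert per-base read counts into whole-number percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    return {base: round(100 * n / total) for base, n in counts.items()}


if __name__ == "__main__":
    # Counts taken from the high coverage data tip described above.
    high_coverage = {"A": 79, "C": 0, "G": 82, "T": 0, "N": 0}
    print(allele_percentages(high_coverage))
    # prints: {'A': 49, 'C': 0, 'G': 51, 'T': 0, 'N': 0}
```

With 161 reads in total, 79/161 is about 49.1% and 82/161 is about 50.9%, which matches the 49% and 51% shown in the data tip.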

References

[1] Robinson, J., H. Thorvaldsdóttir, W. Winckler, M. Guttman, E. Lander, G. Getz, J. Mesirov. 2011. Integrative Genomics Viewer. Nature Biotechnology. 29:24–26.

[2] Thorvaldsdóttir, H., J. Robinson, J. Mesirov. 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics. 14:178–192.

[3]

[4]
