28.02.2014 Views

An Integrated Data Analysis Suite and Programming ... - TOBIAS-lib


2.4. A FLEXIBLE SEQUENCING READ DEMULTIPLEXING SYSTEM

 1  #?lane sample barcode_group read barcode extbarcode barcode_type
 2
 3  # Bar codes for the Ler-1 sample.
 4  1      Ler-1  0             1    AACT    TGCAG      5prime
 5  1      Ler-1  0             1    TAGC    TGCAG      5prime
 6  1      Ler-1  0             2    AACT    *          read
 7  1      Ler-1  0             2    CCCT    *          read
 8
 9  # Bar codes for the Col-0 sample.
10  1      Col-0  0             1    AACTT   GCAG       5prime
11  1      Col-0  0             2    GGAC    *          read
12  1      Col-0  1             1    TTGCT   GCAG       5prime
13  1      Col-0  1             2    CCCT    *          read
14
15  # Third read has the same bar codes for all valid samples.
16  1      *      *             3    GGAC    *          5prime
17  1      *      *             3    TTGC    *          5prime
18
19  # Lane 2 is not multiplexed, discard the index read.
20  2      Bur-0  0             2    *       *          read

Listing 2.2: Example of a Full Demultiplexing Sheet

lane (e.g. lines 16-17).

For certain applications, the identity of several nucleotides immediately following the 5′ bar code sequence is known, for example the restriction site sequence in RAD-Seq (section 1.3). While such sequences can be exploited to correctly assign each read to the appropriate sample, they usually should not, in contrast to the bar code oligomers, be removed from the output. Parts of the recognition sequence that should not be clipped from the read can be specified in the extended bar code (extbarcode) field of the table. Internally, the sequence to be recognized is constructed by concatenating the barcode and extbarcode fields, while the division of the sequence between the two fields is translated into a bar code cut position. For bar code types other than 5prime, the split into bar code and extended bar code has no effect. The bar code cut position is determined in the context of the respective bar code tuple, i.e. for the same recognition sequence, different samples may define a different split between bar code and extended bar code, as demonstrated by lines 4 and 10 of listing 2.2. If no part of the recognition sequence is to be removed from the output, then the entire oligomer can be provided as extended bar code, with the barcode column either omitted or set to *.
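The relationship between the barcode and extbarcode fields and the cut position can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the text: the function names `recognition_and_cut` and `clip_5prime` are hypothetical, matching is exact (any mismatch tolerance the actual tool may apply is not modeled), and only the 5prime bar code type is considered.

```python
WILDCARD = "*"

def recognition_and_cut(barcode, extbarcode):
    """Return (recognition sequence, cut position) for one sheet entry.

    A field that is omitted (None or empty) or set to "*" contributes
    no nucleotides. Hypothetical helper, for illustration only.
    """
    bc = "" if barcode in (None, "", WILDCARD) else barcode
    ext = "" if extbarcode in (None, "", WILDCARD) else extbarcode
    # The recognized sequence is the concatenation of both fields;
    # only the barcode part is clipped, so the cut position is its length.
    return bc + ext, len(bc)

def clip_5prime(read, barcode, extbarcode):
    """Clip the bar code from a matching read; return None if no match.

    Assumes exact prefix matching, which is a simplification.
    """
    recognition, cut = recognition_and_cut(barcode, extbarcode)
    if not read.startswith(recognition):
        return None
    return read[cut:]

# Entries from lines 4 and 10 of listing 2.2: same recognition
# sequence, different split between barcode and extbarcode.
read = "AACTTGCAGGATTACA"  # made-up read for illustration
ler = clip_5prime(read, "AACT", "TGCAG")
col = clip_5prime(read, "AACTT", "GCAG")
print(ler)  # TGCAGGATTACA
print(col)  # GCAGGATTACA
```

Both entries recognize AACTTGCAG, but the first clips four nucleotides and the second five, which is the per-tuple cut position described above.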

If neither bar code nor extended bar code is given a value other than *, then the respective sample sheet entry will match any read sequence. With bar code type either none or 5prime, such an entry can be used to assign a certain sample identifier to an entire sequencing lane. On the other hand, this property may be exploited to completely remove all reads with a certain read index from the output (e.g. listing 2.2, line 20).
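The match-anything rule can be expressed as a small predicate. The sketch below assumes that each sample sheet entry has been read into a dictionary keyed by column name; the names `is_unset` and `matches_any_read` are hypothetical. How a match is then acted upon (sample assignment or removal from the output) depends on the bar code type and is not modeled here.

```python
WILDCARD = "*"

def is_unset(value):
    """True if a field is omitted (None or empty) or set to the wildcard."""
    return value is None or value == "" or value == WILDCARD

def matches_any_read(entry):
    """True if neither barcode nor extbarcode restricts the read sequence."""
    return is_unset(entry.get("barcode")) and is_unset(entry.get("extbarcode"))

# Line 20 of listing 2.2: lane 2 is not multiplexed.
line_20 = {"lane": "2", "sample": "Bur-0", "barcode_group": "0", "read": "2",
           "barcode": "*", "extbarcode": "*", "barcode_type": "read"}
# Line 4 of listing 2.2: a regular 5prime bar code entry.
line_4 = {"lane": "1", "sample": "Ler-1", "barcode_group": "0", "read": "1",
          "barcode": "AACT", "extbarcode": "TGCAG", "barcode_type": "5prime"}

print(matches_any_read(line_20))  # True
print(matches_any_read(line_4))   # False
```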

The sample sheet column lane allows independent demultiplexing specifications for different sequencing lanes in a single sample sheet file. Rows with differing sequencing lane fields are completely independent of each other. If the sequencing lane column is omitted, then the entire demultiplexing specification is considered valid for all lanes of the instrument run.
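As an illustration of the lane handling, the following sketch reads a sample sheet and groups its entries by lane. Several details are assumptions inferred from listing 2.2 rather than stated in the text: columns are taken to be whitespace-separated, the line starting with `#?` is taken to name the columns, and other lines starting with `#` are treated as comments. The names `parse_sheet`, `entries_for_lane` and `ALL_LANES` are hypothetical.

```python
from collections import defaultdict

ALL_LANES = None  # key used when the sheet has no lane column

def parse_sheet(text):
    """Group sample sheet entries by lane (assumed format, see lead-in)."""
    columns = None
    by_lane = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#?"):
            # Assumed header marker: the remaining tokens name the columns.
            columns = line[2:].split()
            continue
        if line.startswith("#"):
            continue  # ordinary comment line
        if columns is None:
            raise ValueError("data row found before the '#?' header line")
        fields = line.split()
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields: {line!r}")
        entry = dict(zip(columns, fields))
        # Without a lane column, the entry applies to every lane.
        by_lane[entry.get("lane", ALL_LANES)].append(entry)
    return dict(by_lane)

def entries_for_lane(by_lane, lane):
    """Entries for one lane, plus entries that are valid for all lanes."""
    return by_lane.get(lane, []) + by_lane.get(ALL_LANES, [])

SHEET = """\
#?lane sample barcode_group read barcode extbarcode barcode_type
# Bar codes for the Ler-1 sample.
1 Ler-1 0 1 AACT TGCAG 5prime
1 Ler-1 0 2 AACT * read
# Lane 2 is not multiplexed, discard the index read.
2 Bur-0 0 2 * * read
"""

by_lane = parse_sheet(SHEET)
print(sorted(by_lane))                                  # ['1', '2']
print(entries_for_lane(by_lane, "2")[0]["sample"])      # Bur-0
```

Keeping the lanes in separate lists mirrors the statement that rows with differing lane fields are independent of each other; a sheet without a lane column ends up entirely under `ALL_LANES` and is therefore returned for every lane.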
