Supporting Information (SI) Appendix - Proceedings of the National ...

Supplementary Methods

Genotype data quality filters

To verify genotypes, we compared genotype calls from the BeadStudio software to Illumina GA sequence data from 17 accessions of diverse origin, for which a strict set of rules was used to call genotypes [3]. We only considered genotypes above the following SNP quality thresholds: GenTrain Score ≥ 0.3 and GenCall ≥ 0.2. At these thresholds, we observed 93.18% concordance for 60411 genotypes called on both platforms. This is an underestimate of genotyping accuracy, because reduced representation libraries (RRLs) were sequenced with the Illumina GA, which results in heterozygotes being called as homozygotes at an unknown rate. In addition, lower concordance rates are expected in highly diverse species, where numerous unknown flanking polymorphisms cause hybridization issues on the genotyping arrays. Based on 145 pairwise comparisons between replicate samples genotyped with the Vitis9KSNP array, we discarded SNPs with replication rates < 97%. The mean replication rate for the remaining 6507 SNPs was 0.9981. We discarded 307 excessively heterozygous SNPs with Hardy-Weinberg equilibrium (HWE) p-values < 1e-4 within a group of vinifera that was pruned so that no two accessions had an Identity-by-State (IBS) > 0.95. SNPs with significant excess homozygosity were retained after visual inspection of cluster plots. An additional 727 SNPs were removed because they were monomorphic in the sample analysed here, and 86 poor-quality SNPs were removed after visual inspection of cluster plots. The total number of SNPs remaining for analysis was 5387.
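The sequence of SNP filters described above can be sketched as follows. This is only an illustrative sketch, not the pipeline actually used: the `SnpQC` record, its field names and the `filter_snps` function are hypothetical, and the per-SNP statistics (replication rate, HWE p-value, direction of the HWE deviation, visual-inspection outcome) are assumed to have been computed beforehand. The final assertion simply checks that the counts reported in the text are mutually consistent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SnpQC:
    """Hypothetical per-SNP quality record; all fields are assumed inputs."""
    name: str
    replication_rate: float  # fraction of concordant calls between replicates
    hwe_p: float             # HWE test p-value in the IBS-pruned vinifera group
    excess_het: bool         # True if the HWE deviation is excess heterozygosity
    monomorphic: bool        # True if only one allele is seen in the sample
    failed_visual: bool      # True if the cluster plot was judged poor quality


def filter_snps(
    snps: List[SnpQC],
    min_replication: float = 0.97,
    hwe_alpha: float = 1e-4,
) -> Tuple[List[SnpQC], Dict[str, int]]:
    """Apply the filters in the order given in the text and count removals."""
    removed = {"replication": 0, "excess_het_hwe": 0, "monomorphic": 0, "visual": 0}
    kept: List[SnpQC] = []
    for snp in snps:
        if snp.replication_rate < min_replication:
            removed["replication"] += 1
        elif snp.excess_het and snp.hwe_p < hwe_alpha:
            # Only excess heterozygosity triggers removal here; SNPs with
            # excess homozygosity fall through to the later checks.
            removed["excess_het_hwe"] += 1
        elif snp.monomorphic:
            removed["monomorphic"] += 1
        elif snp.failed_visual:
            removed["visual"] += 1
        else:
            kept.append(snp)
    return kept, removed


# Consistency check on the counts reported in the text:
# 6507 SNPs after the replication filter, minus 307, 727 and 86 removals.
assert 6507 - 307 - 727 - 86 == 5387
```

In this sketch a SNP is counted under the first filter it fails, so the per-filter counts add up to the total number of SNPs removed.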

The species, cultivar name and cultivar type (wine or table grape) of each sample were obtained from the Germplasm Resources Information Network (GRIN) database of the USDA (http://www.ars-grin.gov/). Twenty-three samples labeled as vinifera in GRIN were excluded from analysis because they were identified as wild Vitis species or wild/vinifera hybrids based on multi-dimensional scaling (MDS) plots of an IBS matrix that included hundreds of samples from numerous wild Vitis species and wild/vinifera hybrids. A geographic region of origin was assigned to 811 vinifera accessions based mostly on information from GRIN, and the geographic origin of each sylvestris accession was assigned based on its collection location (Table S1). Samples with genotype call rates < 0.7 were excluded. In total, 950 vinifera and 59 sylvestris remained for analysis.
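As a rough illustration of the two sample-level quantities used above, the sketch below computes a genotype call rate and a pairwise IBS proportion. It assumes genotypes are coded as the number of copies of one allele (0, 1 or 2) with `None` for a missing call, and it uses the common definition of IBS as the mean proportion of alleles shared over sites called in both samples. The function names and this coding are assumptions made for the example; the source does not specify the software or formula used.

```python
from typing import List, Optional, Sequence

Genotype = Optional[int]  # 0, 1 or 2 allele copies; None means no call


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of SNPs with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def passes_call_rate(genotypes: Sequence[Genotype], min_rate: float = 0.7) -> bool:
    """Samples with call rates below min_rate are excluded."""
    return call_rate(genotypes) >= min_rate


def ibs(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Mean proportion of alleles shared identical-by-state.

    Each site called in both samples contributes 2, 1 or 0 shared alleles.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    shared: List[int] = [
        2 - abs(x - y) for x, y in zip(a, b) if x is not None and y is not None
    ]
    if not shared:
        raise ValueError("no SNPs called in both samples")
    return sum(shared) / (2 * len(shared))


sample_a = [0, 1, 2, None, 2]
sample_b = [0, 2, 2, 1, 0]
pair_ibs = ibs(sample_a, sample_b)  # uses the four sites called in both samples
```

A matrix of such pairwise values over all samples is the kind of input an MDS analysis would take; pairs with IBS above a chosen cutoff (0.95 in the text) can be flagged as near-duplicates.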
