Sequence variations are the nucleotide differences between two or more sequences at the same locus, usually between a reference sequence and another sequence. Three main types of sequence variation have been reported in plant genomes: single-nucleotide polymorphisms (SNPs), insertions and deletions (indels), and short tandem repeats (STRs).
SNPs are currently the most widely available sequence variations for wheat.
Recommendations
Summary
For variant (e.g. SNP) calling performed by bioinformaticians:
- Use a reference wheat genome sequence
- Data format: use the VCF
- Provide associated metadata
1. Reference sequence
The most commonly used reference sequence for bread wheat is currently the IWGSC survey sequence (cv Chinese Spring), available at the IWGSC Sequence Repository and EBI.
When available, we encourage the use of the chromosome reference sequence.
2. Data format
We recommend using the latest VCF file format.
Description
The Variant Call Format (VCF) is a text file format used in bioinformatics for storing gene sequence variations. It was developed with the advent of large-scale genotyping and DNA sequencing projects, such as the 1000 Genomes Project. VCF format specifications can be found here.
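To make the record layout concrete, the following Python sketch splits one tab-separated VCF data line into its eight fixed columns. The names `VcfRecord`, `parse_info`, and `parse_record` are hypothetical helpers introduced for this illustration only; they are not part of any VCF library, and a dedicated parser is likely a better choice for real pipelines.

```python
from typing import Dict, List, NamedTuple


class VcfRecord(NamedTuple):
    """The eight fixed columns of a VCF data line (illustrative helper)."""
    chrom: str
    pos: int
    id: str
    ref: str
    alt: List[str]
    qual: str
    filter: str
    info: Dict[str, str]


def parse_info(field: str) -> Dict[str, str]:
    """Turn 'AC=18;AF=0.196' into {'AC': '18', 'AF': '0.196'}."""
    info: Dict[str, str] = {}
    if field == ".":
        return info
    for item in field.split(";"):
        key, sep, value = item.partition("=")
        # Flag-style keys without '=' are stored with an empty value.
        info[key] = value if sep else ""
    return info


def parse_record(line: str) -> VcfRecord:
    """Parse one tab-separated data line; header lines are rejected."""
    cols = line.rstrip("\n").split("\t")
    if line.startswith("#") or len(cols) < 8:
        raise ValueError("not a VCF data line")
    chrom, pos, vid, ref, alt, qual, flt, info = cols[:8]
    return VcfRecord(chrom, int(pos), vid, ref, alt.split(","),
                     qual, flt, parse_info(info))


# Usage with a shortened, hypothetical record (sample columns omitted).
line = "3929455_1al\t1623\t.\tT\tC\t245.53\t.\tAC=18;AF=0.196;AN=92;DP=48"
record = parse_record(line)
print(record.chrom, record.pos, record.ref, record.alt, record.info["DP"])
```

The sketch keeps QUAL and the INFO values as strings, because VCF allows "." for missing values and leaves the typing of INFO keys to the header.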
Warning: The VCF files generated for exome capture need to be labeled as such and cannot be merged with those from IWGSC context.
Convert data format
You can convert different formats to VCF using the Bioconvert tool.
3. Metadata
We recommend providing a minimal set of metadata that describes the provenance of the SNPs and the SNP quality analysis.
Data sharing
For data sharing, the following information should be provided in the header section of the VCF file (header lines have to be preceded by “##” characters) or as a separate tabulated file.
Name | Description |
RUN NAME | Name of the sequencing run that produced the data we are interested in. |
RUN DESCRIPTION | Description of this run. |
SUB RUN NAME | Part of a sequencing run that produced the data we are interested in. Depending on the sequencing technology involved, the sub run can be a lane (for 454 sequencers), a flowcell (for Illumina sequencers)… |
ANALYSIS NAME | Name of the SNP calling analysis |
ANALYSIS SOFTWARE NAME | Software used for the SNP calling analysis |
ANALYSIS CONTACT NAME | Person who performed the analysis |
PROTOCOL NAME | Name of the sequencing protocol |
MAPPING GENOME NAME | Name and version of the reference genome used to call the variations |
MAPPING GENOME TAXON NAME | Taxon of the reference genome used to call the variations |
MAPPING GENOME DESCRIPTION | Description of the reference genome used to call the variations |
GENOTYPE NAME | Name of the sample/individual that has been sequenced. |
GENOTYPE TAXON | Taxon of the sample/individual that has been sequenced. |
PROJECT NAME | Name of the project that funded the sequencing |
FILTERS | Filters applied to call SNPs (e.g. DP > 10) |
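As a minimal sketch of how the fields above could be written into a VCF header, the Python example below turns a dictionary of metadata into "##" lines and inserts them just before the #CHROM column header. Replacing spaces in field names with underscores is an assumption made here to keep the keys simple, and the function names and sample values are hypothetical.

```python
from typing import Dict, List


def metadata_header_lines(metadata: Dict[str, str]) -> List[str]:
    """Build '##KEY=value' lines; spaces in keys become underscores (assumption)."""
    lines = []
    for name, value in metadata.items():
        if "\n" in value:
            raise ValueError(f"metadata value for {name!r} must be a single line")
        key = name.strip().replace(" ", "_")
        lines.append(f"##{key}={value}")
    return lines


def add_metadata(vcf_lines: List[str], metadata: Dict[str, str]) -> List[str]:
    """Insert the metadata lines right before the '#CHROM' column header."""
    out: List[str] = []
    inserted = False
    for line in vcf_lines:
        if line.startswith("#CHROM") and not inserted:
            out.extend(metadata_header_lines(metadata))
            inserted = True
        out.append(line)
    if not inserted:
        raise ValueError("no #CHROM header line found")
    return out


# Usage with hypothetical values for two of the recommended fields.
vcf = ["##fileformat=VCFv4.1", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
meta = {"ANALYSIS SOFTWARE NAME": "example-caller", "FILTERS": "DP > 10"}
for line in add_metadata(vcf, meta):
    print(line)
```

Keeping the metadata inside the header means it travels with the file; the same dictionary could instead be written to the separate tabulated file mentioned above.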
Warning: BAM/SAM files are not suitable for sharing, but they should be kept to ensure the traceability of further analyses.
Data submission
For data submission in international repositories (EBI, NCBI), we advise filling in the dedicated XML format (http://www.ebi.ac.uk/ena/submit/preparing-xmls#vcf).
Most popular Tools
Identifying sequence variations involves three steps:
- Mapping the reads on the reference genome
- Calling the sequence variations
- Filtering out unreliable results, mainly on the basis of depth, sequence quality and mapping quality.
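The filtering step can be sketched in a few lines of Python. The example below keeps header lines and drops data lines whose QUAL, depth (INFO key DP) or mapping quality (INFO key MQ) fall under a threshold. The thresholds and function names are illustrative assumptions, not recommended values; dedicated filter tools offer far more options.

```python
from typing import Iterable, Iterator, Optional


def info_number(info: str, key: str) -> Optional[float]:
    """Return the numeric value of one INFO key, or None if it is absent."""
    for item in info.split(";"):
        name, sep, value = item.partition("=")
        if sep and name == key:
            return float(value)
    return None


def passes_filters(line: str, min_depth: float = 10.0,
                   min_qual: float = 30.0, min_mapq: float = 40.0) -> bool:
    """Check QUAL, DP and MQ of one tab-separated VCF data line."""
    cols = line.rstrip("\n").split("\t")
    qual, info = cols[5], cols[7]
    if qual == ".":  # missing quality is treated as a failure here
        return False
    depth = info_number(info, "DP")
    mapq = info_number(info, "MQ")
    return (float(qual) >= min_qual
            and depth is not None and depth > min_depth
            and mapq is not None and mapq >= min_mapq)


def filter_vcf(lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or passes_filters(line):
            yield line


# Usage with hypothetical records: only the first data line is kept.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "contig1\t10\t.\tA\tC\t245.53\t.\tDP=48;MQ=100.00",
    "contig1\t20\t.\tA\tG\t50.00\t.\tDP=8;MQ=100.00",
    "contig1\t30\t.\tT\tG\t12.00\t.\tDP=40;MQ=100.00",
]
for kept in filter_vcf(example):
    print(kept)
```

Whatever thresholds are applied, they should be reported in the FILTERS metadata field so that the call set can be reproduced.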
Mapping tools
SNP calling tools
Filter tools
Example
Example of a VCF file dedicated to wheat data:
##fileformat=VCFv4.1
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 102 403 407-IV_60 93 ACBarrie Alabasskaja CS Estacao M6 Marquis Neepawa PI153785 PI166180 PI166333 PI177943 PI185715 PI192001 PI192147 PI192569 PI210945 PI222669 PI245368 PI262611 PI278297 PI349512 PI366716 PI366905 PI382150 PI406517 PI445736 PI470817 PI477870 PI481718 PI481923 PI565213 PI82469 PI8813 PR267 Roemer Taxi Utmost acc1 acc2 acc3 acc4 acc5 berkut chakwal86 cham6 clear_white dharwar_dry hidhab klein_chamaco opata pavon pbw343 rac875 vorobey
3929455_1al 1623 . T C 245.53 . AC=18;AF=0.196;AN=92;BaseQRankSum=0.079;DP=48;Dels=0.00;FS=0.000;HaplotypeScore=0.1087;InbreedingCoeff=0.2057;MLEAC=18;MLEAF=0.196;MQ=100.00;MQ0=0;MQRankSum=-1.426;QD=27.28;ReadPosRankSum=-0.158 GT:AD:DP:GQ:PL 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,41 1/1:0,1:1:3:41,3,0 1/1:0,1:1:3:41,3,0 ./. 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 ./. ./. 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 ./. 1/1:0,1:1:3:38,3,0 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 ./. 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:3,0:3:6:0,6,84 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 ./. 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:41,3,0 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 ./. ./. 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39
3929455_1al 1625 . A C 417.94 . AC=28;AF=0.304;AN=92;BaseQRankSum=-1.418;DP=48;Dels=0.00;FS=0.000;HaplotypeScore=0.1087;InbreedingCoeff=0.2887;MLEAC=28;MLEAF=0.304;MQ=100.00;MQ0=0;MQRankSum=-0.261;QD=29.85;ReadPosRankSum=-1.077 GT:AD:DP:GQ:PL 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,41 1/1:0,1:1:3:41,3,0 1/1:0,1:1:3:41,3,0 ./. 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 ./. ./. 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 ./. 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:3,0:3:6:0,6,84 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:41,3,0 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 ./. ./. 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39
3967255_1al 9694 . C G 654.37 . AC=41;AF=0.477;AN=86;BaseQRankSum=1.762;DP=45;Dels=0.00;FS=0.000;HaplotypeScore=0.1860;InbreedingCoeff=0.3008;MLEAC=41;MLEAF=0.477;MQ=100.00;MQ0=0;MQRankSum=0.141;QD=29.74;ReadPosRankSum=0.540 GT:AD:DP:GQ:PL ./. 0/0:1,0:1:3:0,3,39 ./. ./. ./. 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,37 ./. ./. 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,37 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,35 0/0:1,0:1:3:0,3,39 ./. 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,34 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,37 0/0:1,0:1:3:0,3,37 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:35,3,0 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:41,3,0 0/0:1,0:1:3:0,3,39 ./. ./. 1/1:0,1:1:3:39,3,0 ./. 0/0:1,0:1:3:0,3,28 1/1:0,1:1:3:25,3,0 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:37,3,0 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,37 1/1:0,1:1:3:39,3,0 ./. 1/1:0,1:1:3:37,3,0 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/1:1,1:2:33:33,0,33
3967255_1al 9864 . A G 663.37 . AC=41;AF=0.477;AN=86;BaseQRankSum=0.141;DP=45;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;InbreedingCoeff=0.3008;MLEAC=41;MLEAF=0.477;MQ=100.00;MQ0=0;MQRankSum=-0.658;QD=30.15;ReadPosRankSum=4.417 GT:AD:DP:GQ:PL ./. 0/0:1,0:1:3:0,3,41 ./. ./. ./. 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,40 ./. ./. 0/0:1,0:1:3:0,3,40 ./. 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,40 ./. 1/1:0,1:1:3:40,3,0 ./. 1/1:0,1:1:3:40,3,0 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,40 0/0:1,0:1:3:0,3,40 0/0:1,0:1:3:0,3,40 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:34,3,0 1/1:0,1:1:3:40,3,0 1/1:0,1:1:3:40,3,0 ./. 1/1:0,1:1:3:41,3,0 0/0:1,0:1:3:0,3,41 ./. ./. 1/1:0,1:1:3:40,3,0 ./. 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:25,3,0 1/1:0,1:1:3:40,3,0 1/1:0,1:1:3:31,3,0 0/0:1,0:1:3:0,3,40 ./. 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:40,3,0 ./. 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,40 1/1:0,1:1:3:40,3,0 0/0:1,0:1:3:0,3,40 0/0:1,0:1:3:0,3,40 0/1:1,1:2:29:34,0,29
3967255_1al 9908 . C T 752.26 . AC=43;AF=0.489;AN=88;BaseQRankSum=0.125;DP=45;Dels=0.00;FS=0.000;HaplotypeScore=0.0000;InbreedingCoeff=0.3019;MLEAC=43;MLEAF=0.489;MQ=100.00;MQ0=0;MQRankSum=0.534;QD=32.71;ReadPosRankSum=-5.506 GT:AD:DP:GQ:PL ./. 0/0:1,0:1:3:0,3,41 ./. ./. ./. 1/1:0,1:1:3:41,3,0 0/0:1,0:1:3:0,3,42 ./. ./. 0/0:1,0:1:3:0,3,42 ./. 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,42 ./. 1/1:0,1:1:3:42,3,0 ./. 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:36,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:34,3,0 1/1:0,1:1:3:41,3,0 0/0:1,0:1:3:0,3,40 ./. ./. 1/1:0,1:1:3:42,3,0 ./. 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:35,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 ./. 0/0:1,0:1:3:0,3,41 1/1:0,1:1:3:42,3,0 ./. 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,42 0/1:1,1:2:32:36,0,32
3967255_1al 12120 . C T 179.18 . AC=12;AF=0.125;AN=96;BaseQRankSum=-1.123;DP=97;Dels=0.00;FS=0.000;HaplotypeScore=0.0417;InbreedingCoeff=0.1143;MLEAC=12;MLEAF=0.125;MQ=100.00;MQ0=0;MQRankSum=-0.893;QD=14.93;ReadPosRankSum=0.082 GT:AD:DP:GQ:PL ./. ./. ./. ./. 0/0:2,0:2:3:0,3,45 ./. 0/0:2,0:2:3:0,3,45 ./. 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 ./. 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 1/1:0,2:2:3:45,3,0 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 1/1:0,2:2:3:45,3,0 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 1/1:0,2:2:3:45,3,0 1/1:0,2:2:3:45,3,0 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 1/1:0,2:2:3:45,3,0 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 1/1:0,2:2:3:45,3,0 ./. ./. ./. 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:3,0:3:6:0,6,77 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45 0/0:2,0:2:3:0,3,45
3956094_1al 2036 . A C 1241.41 . AC=66;AF=0.717;AN=92;BaseQRankSum=1.244;DP=46;Dels=0.00;FS=0.000;HaplotypeScore=0.0213;InbreedingCoeff=0.2712;MLEAC=66;MLEAF=0.717;MQ=100.00;MQ0=0;MQRankSum=0.927;QD=37.62;ReadPosRankSum=-0.927 GT:AD:DP:GQ:PL 0/0:1,0:1:3:0,3,25 1/1:0,1:1:3:41,3,0 1/1:0,1:1:3:41,3,0 1/1:0,1:1:3:41,3,0 ./. 0/0:1,0:1:3:0,3,41 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:41,3,0 ./. ./. 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 ./. 1/1:0,1:1:3:37,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 ./. 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:37,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 ./. ./. ./. 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 ./. 1/1:0,1:1:3:42,3,0 ./. 0/0:1,0:1:3:0,3,42 ./. 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 0/0:1,0:1:3:0,3,41 ./. 1/1:0,1:1:3:41,3,0 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,37 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,41 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 1/1:0,1:1:3:42,3,0 0/0:1,0:1:3:0,3,42 1/1:0,1:1:3:42,3,0
3956094_1al 2093 . C T 488.53 . AC=32;AF=0.348;AN=92;BaseQRankSum=-1.073;DP=54;Dels=0.00;FS=1.522;HaplotypeScore=0.0000;InbreedingCoeff=0.3038;MLEAC=32;MLEAF=0.348;MQ=100.00;MQ0=0;MQRankSum=-1.385;QD=27.14;ReadPosRankSum=-0.523 GT:AD:DP:GQ:PL 0/0:2,0:2:3:0,3,45 1/1:0,2:2:3:45,3,0 1/1:0,2:2:3:45,3,0 0/0:2,0:2:3:0,3,45 ./. 0/0:2,0:2:3:0,3,45 0/0:1,0:1:3:0,3,38 0/0:2,0:2:3:0,3,45 ./. ./. 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:38,3,0 ./. 0/0:1,0:1:3:0,3,30 1/1:0,1:1:3:38,3,0 1/1:0,1:1:3:39,3,0 ./. 0/0:1,0:1:3:0,3,38 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,36 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,38 ./. ./. ./. 1/1:0,1:1:3:39,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,39 ./. 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,30 0/0:2,0:2:3:0,3,45 ./. 0/0:2,0:2:3:0,3,45 1/1:0,1:1:3:38,3,0 0/0:1,0:1:3:0,3,39 1/1:0,1:1:3:38,3,0 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:38,3,0 0/0:1,0:1:3:0,3,36 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,38 0/0:1,0:1:3:0,3,38 1/1:0,1:1:3:39,3,0 1/1:0,1:1:3:38,3,0 1/1:0,1:1:3:38,3,0 0/0:1,0:1:3:0,3,39 0/0:1,0:1:3:0,3,38 1/1:0,1:1:3:38,3,0
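The AC (alternate allele count) and AN (number of called alleles) values in the INFO column summarize the per-sample genotypes. As a hedged illustration, the Python sketch below recomputes both from a list of sample fields that use the GT:AD:DP:GQ:PL format of the example. The function name `allele_counts` and the four sample fields are hypothetical; they are not taken from the file above.

```python
from typing import List, Tuple


def allele_counts(sample_fields: List[str],
                  fmt: str = "GT:AD:DP:GQ:PL") -> Tuple[int, int]:
    """Return (alternate allele count, called allele count) for one record."""
    gt_index = fmt.split(":").index("GT")
    alt_count = 0
    called = 0
    for field in sample_fields:
        genotype = field.split(":")[gt_index]
        # Accept both unphased ('/') and phased ('|') genotypes.
        for allele in genotype.replace("|", "/").split("/"):
            if allele == ".":  # missing call, e.g. './.'
                continue
            called += 1
            if allele != "0":
                alt_count += 1
    return alt_count, called


# Usage with four hypothetical samples: hom-ref, hom-alt, missing, het.
samples = [
    "0/0:1,0:1:3:0,3,41",
    "1/1:0,1:1:3:41,3,0",
    "./.",
    "0/1:1,1:2:33:33,0,33",
]
ac, an = allele_counts(samples)
print(f"AC={ac} AN={an} AF={ac / an:.3f}")
```

Samples with a missing genotype do not contribute to AN, which is why AN can be lower than twice the number of samples in a diploid call set.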
Writing: WDI working group Creation date: 02 October 2014 Update: 30 June 2015