Physical maps are built using molecular biology techniques, such as fingerprinting, that examine DNA molecules in order to show the positions of sequence features. Physical distance between landmarks is measured in base pairs.
Recommendations for data managers who display physical maps in genome browsers:
- Use the FPC format for physical map raw data
- Use the GFF3 format for data integration
1. Raw data format and submission process
- File formats: We recommend using the FPC file format (generated by the FPC and LTC software).
- Data submission: INRA URGI is the official repository for IWGSC physical maps. We recommend using the following submission support: http://wheat-urgi.versailles.inra.fr/Seq-Repository/Support-to-assembly-and-data-submission
2. Data integration file format
GFF3 is the recommended format for integrating the data that feed GBrowse to display physical maps. See the “Genome annotations” recommendation for more details about the GFF3 format. See also how to produce GFF3.
Warning: GFF2 to GFF3 conversion
Converting a file from GFF2 to GFF3 format is problematic for several reasons. There are several GFF2 to GFF3 converters available on the web, but each makes specific assumptions about the GFF2 data that limit its applicability. GMOD does not endorse (or disparage) any particular converter. If you have GFF2 data from an external source, and they do not also provide it in GFF3 format, then you may be stuck with GFF2.
If the GFF2 file does not use Sequence Ontology (SO) terms in column 3, the feature types in the GFF2 file will need to be translated to SO terms during conversion.
Another big problem is that GFF2 supports only one level of feature nesting. While you can certainly reproduce this minimal nesting in GFF3, it would be better to also convert your feature representations to be multi-level at the time you migrate the data to GFF3.
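The points above can be illustrated with a minimal Python sketch that converts a single GFF2 record to a flat GFF3 record. It is not an endorsed converter: the function names, the type-to-SO mapping, and the choice to use the first GFF2 group value as the GFF3 ID are all assumptions made for illustration, and the SO terms should be checked against the current Sequence Ontology release. It also shows the kind of assumption that limits any converter: text in column 9 that does not match the tag "value" pattern is silently dropped, and no multi-level nesting is created.

```python
import re

# Illustrative mapping from the feature types in a GFF2 file to Sequence
# Ontology (SO) terms. The terms chosen here are assumptions to verify
# against the current SO release before use.
TYPE_TO_SO = {
    "contig": "contig",
    "BAC": "BAC_cloned_genomic_insert",
    "marker": "genetic_marker",
}

# GFF2 attributes are assumed to look like: tag "value"; tag "value"
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')

# Characters that must be percent-encoded inside GFF3 attribute values.
RESERVED = ";=&,%\t\n\r"


def escape_gff3(value):
    """Percent-encode the characters reserved in GFF3 column 9."""
    return "".join("%{:02X}".format(ord(c)) if c in RESERVED else c for c in value)


def gff2_to_gff3(line, type_map=TYPE_TO_SO):
    """Convert one tab-separated GFF2 record to a flat GFF3 record."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    if cols[2] not in type_map:
        raise ValueError("no SO term configured for type %r" % cols[2])
    pairs = ATTR_RE.findall(cols[8])
    if not pairs:
        raise ValueError('no tag "value" pairs found in column 9')
    # Assumption: the first GFF2 pair is the group and is unique enough for ID.
    attrs = ["ID=" + escape_gff3(pairs[0][1])]
    for tag, value in pairs[1:]:
        # Name is a reserved GFF3 tag; other tags are lowercased because
        # GFF3 reserves tags that start with an uppercase letter.
        key = tag if tag == "Name" else tag.lower()
        attrs.append(key + "=" + escape_gff3(value))
    return "\t".join(cols[:2] + [type_map[cols[2]]] + cols[3:8] + [";".join(attrs)])


if __name__ == "__main__":
    record = ('ctg110\tFPC\tBAC\t820801\t938401\t.\t.\t.\t'
              'BAC "TaaCsp3BFhA_0290A06"; Name "TaaCsp3BFhA_0290A06"; Contig_hit "110"')
    print(gff2_to_gff3(record))
```

A type that is missing from the mapping raises an error instead of being passed through, so untranslated types are noticed before the file is loaded into a browser.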
Convert data format
You can convert different formats to GFF3 using the Bioconvert tool.
Most popular tools
Physical map building
GFF3 sample of the 3B physical map browser:
ctg110 assembly contig 1 1041601 . . . Sequence "ctg110"; Name "ctg110"
ctg110 FPC BAC 820801 938401 . . . BAC "TaaCsp3BFhA_0290A06"; Name "TaaCsp3BFhA_0290A06"; Contig_hit "110"
ctg110 FPC BAC 835201 912001 . . . BAC "TaaCsp3BFhA_0130L06"; Name "TaaCsp3BFhA_0130L06"; Contig_hit "110"
ctg110 FPC BAC 261601 468001 . . . BAC "TaaCsp3BFhA_0117E07"; Name "TaaCsp3BFhA_0117E07"; Contig_hit "110"
ctg110 FPC BAC 55201 327601 . . . BAC "TaaCsp3BFhA_0111D21"; Name "TaaCsp3BFhA_0111D21"; Contig_hit "110"
ctg110 FPC marker 808801 808801 . . . marker "Ta#S32641420-3B"; Name "Ta#S32641420-3B"; Contig_hit "ctg110 - 1" (TaaCsp3BFhA_0347M21)
ctg110 FPC marker 345601 345601 . . . marker "Xcfp1207-3B"; Name "Xcfp1207-3B"; Contig_hit "ctg110 - 1" (TaaCsp3BFhA_0017C12)
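To show how such records can be read, here is a short Python sketch that parses the sample above and lists the BACs whose coordinates span a given position on the contig. The names `Feature`, `parse_record` and `bacs_covering` are hypothetical helpers, not part of any browser or of FPC. The sketch assumes whitespace-separated columns as displayed on this page (files loaded into a browser are normally tab-separated) and 1-based, inclusive coordinates.

```python
import re
from collections import namedtuple

Feature = namedtuple("Feature", "seqid source type start end name")

# The sample records from this page, one record per line.
SAMPLE = """\
ctg110 assembly contig 1 1041601 . . . Sequence "ctg110"; Name "ctg110"
ctg110 FPC BAC 820801 938401 . . . BAC "TaaCsp3BFhA_0290A06"; Name "TaaCsp3BFhA_0290A06"; Contig_hit "110"
ctg110 FPC BAC 835201 912001 . . . BAC "TaaCsp3BFhA_0130L06"; Name "TaaCsp3BFhA_0130L06"; Contig_hit "110"
ctg110 FPC BAC 261601 468001 . . . BAC "TaaCsp3BFhA_0117E07"; Name "TaaCsp3BFhA_0117E07"; Contig_hit "110"
ctg110 FPC BAC 55201 327601 . . . BAC "TaaCsp3BFhA_0111D21"; Name "TaaCsp3BFhA_0111D21"; Contig_hit "110"
ctg110 FPC marker 808801 808801 . . . marker "Ta#S32641420-3B"; Name "Ta#S32641420-3B"; Contig_hit "ctg110 - 1" (TaaCsp3BFhA_0347M21)
ctg110 FPC marker 345601 345601 . . . marker "Xcfp1207-3B"; Name "Xcfp1207-3B"; Contig_hit "ctg110 - 1" (TaaCsp3BFhA_0017C12)
"""

NAME_RE = re.compile(r'Name "([^"]*)"')


def parse_record(line):
    """Split one record into its first eight columns plus the attribute text."""
    fields = line.split(None, 8)  # at most 9 fields; column 9 keeps its spaces
    if len(fields) != 9:
        raise ValueError("expected 9 columns: %r" % line)
    match = NAME_RE.search(fields[8])
    return Feature(fields[0], fields[1], fields[2],
                   int(fields[3]), int(fields[4]),
                   match.group(1) if match else None)


def bacs_covering(features, position):
    """Return the names of BAC features whose span includes the position."""
    return [f.name for f in features
            if f.type == "BAC" and f.start <= position <= f.end]


features = [parse_record(line) for line in SAMPLE.splitlines() if line.strip()]

if __name__ == "__main__":
    for marker in (f for f in features if f.type == "marker"):
        print(marker.name, marker.start, bacs_covering(features, marker.start))
```

A marker can come back with no covering BAC here, because the BAC named in parentheses on its line is not among the sample records.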
Written by: WDI working group
Published on: 02 October 2014
Updated on: 09 July 2015