Conversion of DGRP genotype data from reference assembly dm3 to dm6.

Last updated 26th September 2016

Data on genomic variation generated and made publicly available by the Drosophila Genetic Reference Panel (DGRP) have been used in many published studies. The reference genome for Drosophila melanogaster has since changed, and the positions of many genetic variants have changed with it in an unpredictable manner.

This directory contains files used in the conversion ('lift-over') of genetic variants, including genotype information, from reference assembly dm3/r5.9 (April 2006) to dm6/r6 (August 2014). The in-house unique identifiers for each variant locus have been replaced with those from NCBI dbSNP. This alteration enables better links with other NCBI databases, e.g. for gene function.

The original genotypes file is dgrp2.vcf.gz
The lifted-over and renamed genotypes file is dgrp2_dm6_dbSNP.vcf.gz
The file lifted.annotated.dgrp2.vcf.gz is an intermediate file in the processing. Its variant IDs are a concatenation of the DGRP and dbSNP identifiers, so it is potentially useful.

Other input meta-data are:
i. The dm3 and dm6 reference genome assemblies and their associated index files, located in local_ref/
ii. The chain file, mod_dm3ToDm6.over.chain, which provides information on which parts of the genome have moved between the two assembly versions.

dgrp2_dm6_dbSNP.vcf.vareval.txt contains counts of different genotypes and types of variant for each fly.
MD5 values for each file are presented in the run log file.
variant_counts.txt provides a count of how many variant loci there are in each VCF genotypes file.

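The per-file variant counts and MD5 values described above can be re-checked independently. The following is a minimal Python sketch, not the method used to produce variant_counts.txt or the run log. The function names count_variant_records and md5_of_file are hypothetical, and the sketch assumes that each non-header line of a VCF file represents one variant locus.

```python
import gzip
import hashlib
import sys


def count_variant_records(vcf_path):
    """Count data lines in a plain or gzip-compressed VCF file.

    Assumption: one non-header line corresponds to one variant locus.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    count = 0
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Header lines start with '#'; blank lines are ignored.
            if line.strip() and not line.startswith("#"):
                count += 1
    return count


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Usage (hypothetical script name):
    #   python check_vcf.py dgrp2.vcf.gz dgrp2_dm6_dbSNP.vcf.gz
    for name in sys.argv[1:]:
        print(name, count_variant_records(name), md5_of_file(name), sep="\t")
```

If the counts differ between the original and lifted-over files, that is not necessarily an error, since some loci may fail to lift over between assemblies.
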
Contact:
http://www.sussex.ac.uk/lifesci/morrowlab/
w.gilks@sussex.ac.uk
wpgilks@gmail.com

Links:
DGRP: http://dgrp2.gnets.ncsu.edu/
dbSNP variant identifiers: ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/fruitfly_7227/VCF/
Chain file: http://hgdownload-test.cse.ucsc.edu/goldenPath/dm3/liftOver/dm3ToDm6.over.chain

There were problems with the original chain file, probably caused by operating system conflicts with text file formatting. Some combination of reading the file into Python and copy+paste resolved the issue. Methods that failed were iconv, dos2unix, tr and perl.

Perl scripts for sorting a VCF file by position:
https://git.lumc.nl/rig-framework/magpie/raw/24273d59ab34d2398a86e811c381e0ac60699abf/scripts/vcfsort.pl
https://code.google.com/p/vcfsorter/

End of readme.