Genotyping by Sequencing for Crop Improvement. Group of authors.
SNP. The number of SNPs that can be genotyped ranges from 10K to 2000K. GeneChip assays require high‐quality DNA, and for them to work efficiently the complexity of the genomic DNA must be reduced through digestion with restriction endonucleases and fractionation. In previous years, researchers used high‐density oligonucleotide probe arrays to genotype the entire genome, paving the way for genome‐wide association studies (GWAS) (Kennedy et al. 2003; Matsuzaki et al. 2004; McGall and Christians 2002). The Affymetrix GeneChip array contains a large number of synthetic 25‐mer oligonucleotide probes immobilized on a solid substrate. First, denatured single‐stranded DNA (ssDNA) is hybridized to these probes; hybridization is highly specific because noncomplementary alleles do not hybridize. Subsequently, the unbound noncomplementary fragments are washed away. The hybridized fragments are stained, and the chip is then scanned with a CCD imaging device. Each SNP is represented by a probe set that contains multiple probe pairs. The probe pairs differ in the location of the SNP within the oligonucleotide sequence (five locations are selected). For each position, probes are included from both the sense and antisense strands. Hence, across both alleles, a total of 40 probes interrogate each SNP.
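The probe count can be checked by enumerating the probe set described above. This is a minimal sketch: the text does not name the two members of a probe pair, so the perfect‐match/mismatch pairing is an assumption, and the offset values and the name `build_probe_set` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Dimensions of the probe set as described in the text.
SNP_OFFSETS = [-2, -1, 0, 1, 2]              # hypothetical: five SNP locations within the 25-mer
STRANDS = ["sense", "antisense"]             # both strands are probed
ALLELES = ["A", "B"]                         # the two alleles of a biallelic SNP
PROBE_TYPES = ["perfect_match", "mismatch"]  # assumption: the two members of a probe pair


def build_probe_set():
    """Enumerate every combination of offset, strand, allele and probe type."""
    return [
        {"offset": offset, "strand": strand, "allele": allele, "probe_type": probe_type}
        for offset, strand, allele, probe_type in product(
            SNP_OFFSETS, STRANDS, ALLELES, PROBE_TYPES
        )
    ]


probe_set = build_probe_set()
probes_per_allele = sum(1 for probe in probe_set if probe["allele"] == "A")

print(len(probe_set))      # 5 offsets x 2 strands x 2 alleles x 2 probe types = 40
print(probes_per_allele)   # 20 probes target each allele
```

Under these assumptions the 40 probes split evenly, with 20 probes targeting each allele.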
Although DNA arrays have revolutionized transcript profiling and expression analysis, they are comparatively less used for SNP genotyping, mainly because of the difficulty of obtaining a good signal/noise ratio in allele‐specific hybridization. Discrimination between perfectly matched and mismatched oligonucleotides on the array depends entirely on a high‐throughput fluorescence detection system, and it becomes harder as the number of SNPs genotyped under a single hybridization condition increases. This limited specificity compromises genotype call rates and accuracy. Generally, the effectiveness of allele differentiation depends on the length and sequence of the probe, the location of the SNP in the probe, and the hybridization conditions (Table 2.1).
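To illustrate how a poor signal/noise ratio lowers the call rate, the following minimal sketch calls a genotype from two allele‐specific intensities. The contrast statistic, the threshold values, and the names `call_genotype` and `call_rate` are illustrative assumptions, not the calling algorithm of any particular array platform.

```python
def call_genotype(intensity_a, intensity_b, min_signal=200.0,
                  hom_cutoff=0.5, het_cutoff=0.2):
    """Call AA, AB, BB or NoCall from two allele-specific intensities.

    All thresholds are hypothetical values chosen for illustration.
    """
    total = intensity_a + intensity_b
    if total < min_signal:
        return "NoCall"            # signal too weak to separate from background
    contrast = (intensity_a - intensity_b) / total   # ranges from -1 to +1
    if contrast >= hom_cutoff:
        return "AA"
    if contrast <= -hom_cutoff:
        return "BB"
    if abs(contrast) <= het_cutoff:
        return "AB"
    return "NoCall"                # ambiguous zone between the clusters


def call_rate(calls):
    """Fraction of samples that received a genotype call."""
    return sum(1 for call in calls if call != "NoCall") / len(calls)


# Hypothetical intensity pairs (allele A, allele B) for five samples.
samples = [(1000, 50), (500, 480), (60, 40), (700, 300), (40, 900)]
calls = [call_genotype(a, b) for a, b in samples]

print(calls)             # ['AA', 'AB', 'NoCall', 'NoCall', 'BB']
print(call_rate(calls))  # 0.6
```

In this toy data set one sample fails because its total signal is too low and another because its contrast falls between the clusters, which are the two ways in which weak allele discrimination reduces the call rate.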
Table 2.1 Customized SNP array details in plant species.
Crop | Species | Array size | References | Trait |
---|---|---|---|---|
Pigeonpea | (Cajanus cajan) | 62K | Singh et al. (2020) | Genetic diversity |
Wheat | (Triticum) | 55K | Zhang et al. (2019) | Adult‐plant resistance to leaf and stripe rust |
Apple | (Malus domestica) | 480K | Bianco et al. (2016) | Phenology, fruit quality, disease resistance, or drought tolerance |
Apple | (Malus domestica) | 8K | Chagné et al. (2012) | Quantitative traits |
Rice | (Oryza sativa) | 50K | Singh et al. (2015) | Genetic diversity |
Rice | (Oryza sativa) | 44K | Zhao et al. (2011) | Plant morphology, grain quality, plant development |
Rye | (Secale cereale) | 5K | Li et al. (2011) | Diversity analysis |
Barley | (Hordeum vulgare) | 50K | Bayer et al. (2017) | Evaluation and use of barley genetic resources |
Sweet Cherry | (Prunus avium) | 6K | Peace et al. (2012) | Fruit taste |
Pear | (Pyrus) | 70K | Montanari et al. (2019) | Genetic diversity studies |
Potato | (Solanum tuberosum) | 22K | Khlestkin et al. (2019) | Starch phosphorylation |
Pear | (Pyrus) | 200K | Li et al. (2019) | Flowering time and candidate genes linked to the size of fruit |
Walnut | (Juglans regia) | 700K | Marrano et al. (2019) | Genetic diversity of germplasm |
Cotton | (Gossypium barbadense) | 63K | Kumar et al. (2019) | Fiber quality |
2.2.2.1.4 Sequencing‐based Platforms
The simplest method of genotyping is whole‐genome resequencing of genotypes followed by reference‐based assembly for variant calling. However, because the majority of the genome is noncoding, variants from these noncoding regions also appear in the variant file generated by read mapping, which increases the complexity of both analysis and interpretation. Reduced‐representation approaches such as RAD‐seq and GBS address this issue: instead of sequencing the whole genome, they are confined largely to coding/genic regions. For this, the DNA is first digested with a methylation‐sensitive restriction enzyme, on the hypothesis that DNA is more heavily methylated in noncoding regions, so that digestion is limited mostly to coding regions. The restricted ends generated are then ligated to adapters and sequenced. Other widely used sequencing‐based technologies are exome sequencing (Exome‐seq) and double‐digest RAD‐seq (ddRAD‐seq).
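The digestion step of a reduced‐representation protocol can be sketched in silico. The example below is a minimal illustration under stated assumptions: PstI (recognition site CTGCAG, cutting after the A) is used only as an example enzyme, the toy sequence and size window are invented, methylation sensitivity is not modelled, and `digest` and `select_by_size` are hypothetical helper names.

```python
import re


def digest(sequence, site="CTGCAG", cut_offset=5):
    """Cut a sequence at every occurrence of a recognition site.

    cut_offset is the number of bases from the start of the site to the cut
    on the top strand (5 for PstI, which cuts CTGCA^G).
    """
    sequence = sequence.upper()
    # A lookahead pattern finds every site, including overlapping ones.
    pattern = "(?=" + re.escape(site) + ")"
    cuts = [match.start() + cut_offset for match in re.finditer(pattern, sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[start:end] for start, end in zip(bounds, bounds[1:]) if end > start]


def select_by_size(fragments, min_len, max_len):
    """Keep only fragments inside the size window that would be sequenced."""
    return [frag for frag in fragments if min_len <= len(frag) <= max_len]


# Toy sequence with two PstI sites (illustrative only).
toy_genome = "AAATCTGCAGGGGTTTCCCTGCAGTT"
fragments = digest(toy_genome)
selected = select_by_size(fragments, min_len=10, max_len=20)
fraction = sum(len(frag) for frag in selected) / len(toy_genome)

print(fragments)  # ['AAATCTGCA', 'GGGGTTTCCCTGCA', 'GTT']
print(selected)   # ['GGGGTTTCCCTGCA']
print(fraction)   # 14 of 26 bases fall in the selected fraction
```

Only the size‐selected fragments would be sequenced, so the fraction of the genome represented depends on the enzyme chosen and on the size window.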
Restriction Site‐Associated DNA Sequencing (RAD‐seq)
Restriction site‐associated DNA sequencing (RAD‐seq) is a next‐generation sequencing‐based SNP genotyping method that allows simultaneous discovery and scoring of thousands of SNP markers in hundreds of individuals. The basic principle behind RAD‐seq is the identification, through sequencing, of genomic SNPs adjacent to restriction enzyme sites. It requires genomic DNA, restriction enzymes, a P1 adaptor containing the Illumina forward sequencing primer and a barcode, and a P2 adaptor containing the reverse complement of the reverse amplification primer site. The RAD sequencing library is prepared by digesting the genomic DNA of each genotype with restriction enzymes and ligating the P1 adaptor to the compatible ends of the fragments. The adaptor‐ligated fragments are subsequently pooled and randomly sheared, and a specific size fraction is selected following electrophoresis. The DNA fragments are then ligated to the P2 adaptor and sequenced on the Illumina platform (