SNP Discovery of Chicken Liver with Divergent Unsaturated Fatty Acid using Next Generation RNA Sequencing

RNA sequencing (RNA-Seq) offers new opportunities for SNP discovery in different tissues with divergent phenotypes. The objective of this study was to characterize the SNP profile of chicken liver with divergent unsaturated fatty acid levels using RNA-Seq. Six liver samples, selected from 62 chickens and classified as 3 high and 3 low in unsaturated fatty acids, were analyzed using RNA-Seq. SNP identification revealed 1208 SNPs in the chicken samples, and a large number of those corresponded to differences between the high and low groups when mapped to the chicken genome assembly Gallus gallus (GGA) v4.0. About 91% of genes had multiple polymorphisms, and 5 genes (SCD, COL6A2, CYP2J2L4, HSD17B4, and SLC23A3) were highly polymorphic. HSD17B4, SLC23A3, and SCD contained the largest number of mutations, with 18, 13, and 12 SNPs, respectively. Combining the significance level of the SNPs with gene functions related to fatty acid composition allows us to suggest SCD, SLC23A3, and HSD17B4 as three novel and promising candidate genes for selecting for unsaturated fatty acids. However, further validation is required to confirm the effect of these candidate genes in larger chicken populations.


INTRODUCTION
Fatty acid composition plays an important role in meat quality, not only in nutritional value but also in flavor (Yu et al., 2013). Fatty acids, especially unsaturated fatty acids (UFA), which consist of monounsaturated fatty acids (MUFAs) and some polyunsaturated fatty acids (PUFAs), are recognized as beneficial to health. The ability to genetically select animals that better suit consumers' needs in terms of high, predictable UFA content would benefit the chicken industry. A candidate gene approach through single nucleotide polymorphism (SNP) analysis has already been applied successfully to identify several DNA markers (Van der Steen & Plastow 2005). SNPs are co-dominantly inherited, robust markers that are sufficient for studying diversity and are suitable and adaptable for genome-scan association studies (Salem et al., 2012; Wang et al., 2008). SNPs are the most abundant source of genetic variation, and SNP characterization can detect variation connected with heritable differences among individuals (Aslam et al., 2012; Suh et al., 2005). Additionally, SNPs are important because they may modify protein function and be associated with the phenotype. Detection of SNPs within genes affecting fatty acid composition, together with association analysis, linkage analysis, and gene expression analysis, is an important and commonly applied set of tools for characterizing candidate genes. Recent developments in next-generation sequencing technologies play an important role in fast transcriptome sequencing (RNA-Seq) and SNP discovery at an accurate and affordable scale (Wang et al., 2008).
RNA-Seq generates sequences on a very large scale at a fraction of the cost required for traditional Sanger sequencing, allowing sequencing approaches to be applied to biological questions that would not have been economically or logistically practical before (Mortazavi et al. 2008). RNA-Seq platforms have been applied in different studies to achieve highly redundant coverage of the genome, a necessity for high-quality genome-wide SNP discovery in the complex genomes of animals and plants (Gunawan et al., 2013a, 2013b; Kilian & Graner 2012; You et al. 2011). Only a few published studies have explored genetic variants in chicken genetic resources, especially those related to fatty acids. Genetic variation is responsible for variation in phenotypic traits; as a result, determining the genetic variation in traits of economic importance, such as fatty acid composition, is considered one of the main targets of livestock genomic research (Lin et al., 2015). Taking this into account, we applied this novel approach to identify SNPs in the expressed coding regions of the chicken liver transcriptome in birds with divergent unsaturated fatty acid levels. The aim of this study was to characterize SNPs in the unsaturated fatty acid transcriptome using RNA-Seq. Novel polymorphisms identified by RNA deep sequencing could reveal potential candidate genes affecting unsaturated fatty acids in chicken. In future, these polymorphisms could presumably be used as markers for divergent unsaturated fatty acid levels in chicken.

Animal and Phenotypic Data Collection
The carcass and meat quality data were collected according to the guidelines of the Indonesian performance test. Sixty-two Indonesian crossbred chickens (F2 Kampung × Broilers) were used in this study for fatty acid composition analysis. They were reared under the same feeding conditions until 12 weeks of age and had a slaughter weight of approximately 1.6 kg per chicken. Tissue samples from muscle and liver were frozen in liquid nitrogen immediately after slaughter and stored at -80°C until used for RNA extraction. Total lipids in each sample were extracted from breast muscle (BM) using chloroform-methanol (2:1) according to the procedure of Folch et al. (1957). Fatty acid (FA) methyl esters were prepared from the extracted lipids with BF3-methanol (Sigma-Aldrich, St. Louis, MO, USA) and separated on an HP-6890N gas chromatograph (Hewlett-Packard, Palo Alto, CA, USA) as described previously (Jeong et al., 2010).

RNA-Seq Library Construction
Six chickens with extremely high (n=3) and low (n=3) fatty acid levels were selected from the 62 chickens. RNA was isolated from the liver tissues of 3 chickens with high levels of PUFA and MUFA (HUSFA, high unsaturated fatty acids group; 57.06 ± 2.16%) and 3 chickens with low levels of PUFA and MUFA (LUSFA, low unsaturated fatty acids group; 51.34 ± 1.25%) (Table 1). Total RNA was extracted using the RNeasy Mini Kit according to the manufacturer's recommendations (Qiagen). Total RNA was treated using the on-column RNase-Free DNase set (Promega) and quantified using a spectrophotometer (NanoDrop, ND8000, Thermo Scientific). RNA quality was assessed using an Agilent 2100 Bioanalyser and the RNA Nano 6000 Labchip kit (Agilent Technologies). RNA deep sequencing technology was used for differential expression, polymorphism, and alternative splicing detection. For this purpose, a full-length cDNA library was constructed from 1 μg of RNA using the SMART cDNA Library Construction Kit (Clontech, USA), according to the manufacturer's instructions. Libraries of amplified RNA for each sample were prepared following the Illumina mRNA-Seq protocol. The library preparations were sequenced on an Illumina HiSeq 2500 as 100-bp single reads using 1 lane per sample on the same flow cell (first sequencing run) at Macrogen, South Korea. All sequences were analyzed using CASAVA v1.7 (Illumina, USA).
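The selection of the two extreme groups can be illustrated with a minimal sketch. The bird identifiers and UFA percentages below are hypothetical placeholders, not the measurements from this study, and the function name `select_extremes` is an assumption introduced only for illustration.

```python
# Minimal sketch: pick the n highest and n lowest animals by UFA percentage.
# All identifiers and values here are hypothetical, not data from the study.

def select_extremes(ufa_by_bird, n=3):
    """Return (high_ids, low_ids): the n highest and n lowest birds by UFA %."""
    if len(ufa_by_bird) < 2 * n:
        raise ValueError("need at least 2*n animals to form two disjoint groups")
    ranked = sorted(ufa_by_bird, key=ufa_by_bird.get)  # ascending by UFA %
    low_ids = ranked[:n]
    high_ids = ranked[-n:][::-1]  # highest first
    return high_ids, low_ids


# Hypothetical UFA (MUFA + PUFA) percentages for ten birds.
ufa_percent = {
    "bird_01": 54.2, "bird_02": 58.9, "bird_03": 50.1, "bird_04": 52.7,
    "bird_05": 57.3, "bird_06": 55.0, "bird_07": 51.6, "bird_08": 59.4,
    "bird_09": 52.0, "bird_10": 56.1,
}

high_group, low_group = select_extremes(ufa_percent, n=3)
print("High UFA group:", high_group)
print("Low UFA group:", low_group)
```

In practice the ranking would be applied to all 62 animals, and the group means and standard deviations would then be reported as in Table 1.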

RNA-Seq Analysis and SNP Detection
The Gallus gallus 4.0 genome (http://www.ncbi.nlm.nih.gov/genome/guide/cow/index.html) was used as the annotated reference for mapping and assembly of the short sequence reads (36-40 bp) using CLC Genomics Workbench software (CLC Bio, Aarhus, Denmark). RNA-Seq and SNP discovery analyses were conducted on the sequencing reads from each sample (n=6 samples, pooled). Stringent criteria were implemented in order to reduce the rate of false-positive SNP identification. For the assembly step, the sequences were aligned to the consensus genome, allowing a maximum of two gaps or mismatches in each sequence. SNP identification used the quality and significance filters described by Abuzahraa et al. (2018).

Gene Variation Analysis
Gene variation analysis was performed on the mapping files generated by the TopHat algorithm, using the samtools mpileup command and associated algorithms. Variants with a minimum root mean square (RMS) mapping quality of 20 and a minimum read depth of 100 were chosen for further analysis. The selected variants were cross-checked against the dbSNP database to identify mutations that have already been studied. To determine whether polymorphisms occurred in the high or low unsaturated fatty acid samples, or in both groups, we calculated the coverage and quality depth of polymorphisms detected in highly polymorphic differentially expressed genes (DEGs).
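The two thresholds stated above (RMS mapping quality of at least 20 and read depth of at least 100) can be expressed as a simple filter. The sketch below is illustrative only: the variant records are hypothetical, the keys `MQ` and `DP` follow the usual VCF INFO naming for RMS mapping quality and depth, and treating both thresholds as inclusive is an assumption.

```python
# Minimal sketch of the variant filter described in the text.
# Records are hypothetical; "MQ" = RMS mapping quality, "DP" = read depth
# (VCF INFO-style names). Inclusive thresholds are an assumption.

MIN_RMS_MAPPING_QUALITY = 20
MIN_READ_DEPTH = 100


def passes_filters(variant,
                   min_mq=MIN_RMS_MAPPING_QUALITY,
                   min_dp=MIN_READ_DEPTH):
    """Return True if the variant meets both quality thresholds."""
    return variant["MQ"] >= min_mq and variant["DP"] >= min_dp


# Hypothetical variant calls for illustration.
variants = [
    {"chrom": "6", "pos": 101, "MQ": 42.0, "DP": 350},   # passes both
    {"chrom": "6", "pos": 250, "MQ": 18.5, "DP": 900},   # fails mapping quality
    {"chrom": "Z", "pos": 77, "MQ": 60.0, "DP": 99},     # fails read depth
    {"chrom": "13", "pos": 12, "MQ": 20.0, "DP": 100},   # passes at the boundary
]

kept = [v for v in variants if passes_filters(v)]
for v in kept:
    print(v["chrom"], v["pos"], v["MQ"], v["DP"])
```

Variants that survive this step would then be compared against known entries, as described for the dbSNP cross-check.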

Genetic Variation and SNP Detection
Variant discovery was applied to the pooled chicken samples and revealed 1208 SNPs in 116 candidate genes. Of these 1208 SNPs, the largest share were classified as downstream gene variants (Table 2). The RNA-Seq data allowed enhanced detection of SNP variants, relative to whole-genome sequencing, in coding exons (12%), intergenic regions (16%), downstream regions (30%), 3′ untranslated regions (3′UTRs; 9%), upstream regions (28%), and introns (2%). Specifically, sequencing on the Illumina HiSeq 2500 assigned 143 SNPs to exons. Only 47 of these exonic SNPs were nonsynonymous. The resulting SNPs were highly abundant in these eight categories (Table 2). Although libraries for RNA-Seq are supposed to be enriched for mRNAs via poly(A)+ selection, a certain amount of immature pre-mRNA, which carries the introns, usually infiltrates the libraries. Exonic sequence makes up a much larger fraction of the sequenced transcriptome than intronic sequence. These two facts largely explain why more SNPs were found in exonic regions than in intronic regions. Low coverage of short reads usually leads to low quality in intergenic and intronic regions. Only a small fraction, 16%, of SNPs fell into intergenic regions, which compose 84% of the chicken genome.
Beyond quantifying the transcriptome, the RNA-Seq approach supplies useful information on gene polymorphisms that may correspond directly with the relevant phenotype (Gunawan et al. 2013a, 2013b). After SNP discovery in chicken, a comparative analysis was carried out. In total, 1208 SNPs were concordant in chicken (Table 2). After SNP calling, a functional class was assigned to each SNP and, where applicable, fields of information describing the affected transcripts and proteins were provided. The annotated SNPs identified in this study can serve as useful genetic tools and as candidates in searches for phenotype-altering DNA and transcript differences (Stothard et al. 2011). According to the present findings, SNPs located within the 3′UTR or 5′UTR have the potential to be causative mutations. Designing a focused SNP panel with causative mutations may be advantageous over the current SNP-chip platforms, in which SNPs are selected to be equally spaced across the chromosomes: such mutations are not affected by linkage disequilibrium and can be helpful for selection in subsequent generations and across breeds (Fortes et al. 2014). It should also be noted that the Gallus gallus reference genome was used in the present study; consequently, further development of the Gallus gallus reference genome could greatly improve results from SNP discovery studies.
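As a quick arithmetic check on the counts reported above, the exonic share of all SNPs and the nonsynonymous share of exonic SNPs can be recomputed from the three stated numbers (1208 total, 143 exonic, 47 nonsynonymous). This is a minimal sketch; the helper name `percent` is introduced here only for illustration.

```python
# Recompute proportions from the counts stated in the text.
TOTAL_SNPS = 1208               # all SNPs detected
EXONIC_SNPS = 143               # SNPs assigned to exons
NONSYNONYMOUS_EXONIC_SNPS = 47  # exonic SNPs that change the amino acid


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


exonic_share = percent(EXONIC_SNPS, TOTAL_SNPS)
nonsynonymous_share = percent(NONSYNONYMOUS_EXONIC_SNPS, EXONIC_SNPS)

print(f"Exonic SNPs as % of all SNPs: {exonic_share}")
print(f"Nonsynonymous SNPs as % of exonic SNPs: {nonsynonymous_share}")
```

The exonic share computed this way agrees with the roughly 12% given for coding exons, and about one third of the exonic SNPs are nonsynonymous.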

Variant Characterization and Candidate Genes
The distribution of SNP numbers and the SNPs selected as potential candidate genes for selecting for unsaturated fatty acids are shown in Figures 1 and 2, respectively. The results showed that about 91% of genes had multiple polymorphisms (Figure 1). Combining the significance level of the SNPs with gene functions related to fatty acid composition allows us to suggest SCD, COL6A2, CYP2J2L4, HSD17B4, and SLC23A3 as the 5 novel and promising candidate genes for selecting for unsaturated fatty acids (Figure 2). This study revealed 52 SNPs in 5 highly polymorphic DEGs from chicken liver tissues (Table 3). The average number of SNPs per gene was approximately 10 (Figure 2). COL6A2 had the fewest SNPs, with four polymorphisms detected. Interestingly, the HSD17B4, SLC23A3, and SCD genes showed the largest differences in polymorphic SNP count between the high and low unsaturated fatty acid groups. The HSD17B4, SLC23A3, and SCD genes contained the largest number of mutations, with 18, 13, and 12 SNPs, respectively. Because a large amount of data was generated in this study, a detailed description of the SNPs is available from the authors upon request.
SCD is the rate-limiting enzyme that catalyzes the synthesis of monounsaturated fatty acids (MUFA) from saturated fatty acids (SFA) (Dujková et al., 2015; García-Fernández et al., 2009). The composition of fatty acids stored in fat depots reflects the previous action of SCD on substrates such as palmitic acid and stearic acid in chicken (Maharani et al., 2013). Some SFAs commonly found in meat, particularly palmitic and myristic acids, are risk factors for heart disease (Erkkila et al. 2008). Diets high in MUFA tend to decrease blood cholesterol levels, while diets high in SFA tend to raise them. Cholesterol is transported in the bloodstream as lipoproteins (Oh et al., 2013). High-density lipoprotein (HDL) cholesterol is considered the good cholesterol because high HDL levels are related to less heart disease, while low-density lipoprotein (LDL) cholesterol is considered the bad cholesterol because high LDL levels are connected with an increased risk of heart disease (Jiang et al., 2008).
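The per-gene SNP counts reported for the five candidate genes can be tallied with a short sketch. The counts for HSD17B4, SLC23A3, SCD, and COL6A2 and the total of 52 are taken from the text; the count for CYP2J2L4 is not stated here, so the remainder computed below is only what the stated total would imply and should be checked against Table 3.

```python
# Tally the per-gene SNP counts stated in the text.
reported_counts = {
    "HSD17B4": 18,
    "SLC23A3": 13,
    "SCD": 12,
    "COL6A2": 4,
}
TOTAL_SNPS_IN_CANDIDATES = 52  # stated total across the 5 candidate genes
NUM_CANDIDATE_GENES = 5

reported_sum = sum(reported_counts.values())

# Assumption: if the stated total is consistent, the remainder would belong
# to CYP2J2L4, whose count is not given in the text.
implied_remainder = TOTAL_SNPS_IN_CANDIDATES - reported_sum

mean_per_gene = TOTAL_SNPS_IN_CANDIDATES / NUM_CANDIDATE_GENES

print(f"Sum of the four reported counts: {reported_sum}")
print(f"Implied remainder for the fifth gene: {implied_remainder}")
print(f"Mean SNPs per gene: {mean_per_gene:.1f}")
```

The mean computed this way is consistent with the average of about 10 SNPs per gene given in the text.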
Collagen VI (COL6) could be specific for adipocytes because it is the primary collagen expressed by adipocytes (Mariman & Wang 2010; Nakajima et al., 2002). An in vitro study using BIP cells indicated that the COL6 gene was able to increase lipid synthesis (Nakajima et al. 2002). The SLC (solute carrier) family is involved in the influx transport of various drugs, steroid hormones, and other substrates. SLC23A3 was highly expressed in the ovary during the growing follicle stage, and its expression was elevated in granulosa cells at that stage (Lee and Chang-Eun 2017). CYP2J2 is one of the major P450 enzymes that metabolize arachidonic acid (AA), predominantly via NADPH-dependent olefin epoxidation, to 20-HETE and regioisomeric cis-epoxyeicosatrienoic acids (Zanger & Schwab, 2013; Xu et al., 2011).
HSD17B4 was the gene with the largest number of SNPs in chicken in this study. HSD17B4 (hydroxysteroid (17-beta) dehydrogenase 4) is a bifunctional enzyme mediating anhydration and dehydrogenation during β-oxidation of long-chain fatty acids (Pierce et al. 2010). HSD17B4 might work as a regulator of muscle development, and its identification should help in selecting for improved economic traits of Berkshire pigs, such as backfat thickness, drip loss, and carcass weight (Jo et al., 2016). In Duroc pigs, HSD17B4 leads to high levels of androstenone in the testes (Leung et al., 2010; Moe et al., 2007) and in the liver (Moe et al., 2008), because the liver contains many proteins that regulate androstenone and plays a role in steroid homeostasis. According to Jo et al. (2016), the liver is closely associated with meat traits and male odor; accordingly, it could be concluded that HSD17B4 regulates steroid activity.
Most of the genes that carried SNPs in chicken in the present study were associated with unsaturated fatty acids at the gene expression level in a previous study (Maharani et al., 2013). Biologically relevant polymorphisms are used to design a focused or low-density panel of selected SNPs (Fortes et al., 2014). The present study is the first to identify candidate genes associated with unsaturated fatty acids in chicken, and the SNPs discovered should be useful for increasing unsaturated fatty acids in chicken.

CONCLUSIONS
In the present study, an RNA-Seq association study was carried out to identify candidate genes associated with unsaturated fatty acids in chicken. SNPs found in several genes in this study (SCD, COL6A2, CYP2J2L4, HSD17B4, and SLC23A3) could be included as suitable markers in genotyping platforms to perform association analyses in commercial populations and to apply genomic selection protocols in chicken production. However, further investigation is required to confirm the effect of these genetic markers in other chicken populations.

Figure 1. Distribution of the number of SNPs detected in the DEGs

Table 2. Descriptive statistics from the Ensembl Variant Effect Predictor (VEP) tool for SNPs identified in chicken