They also suggest that genes with relatively stable expression are more likely to evolve slowly than VEG.

Gene expression level and functional necessity independently influence core genome stabilization

It is well established that the core genome contains more indispensable genes that play central metabolic roles [11, 12]. This results in a lower mutation rate than in the flexible genome. To define essential genes, we searched for homologs of all MED4 coding
genes in the Database of Essential Genes (DEG8.0), a database that collects all indispensable genes identified experimentally to date [39]. Using BLASTx (E-value = 1 × 10⁻⁴), we found homologs for 871 MED4 coding genes. Of these, 767 genes were in the core genome, representing 61.3% of core genes. This proportion was significantly higher than that in the flexible genome (14.6%; P < 0.001; Figure 4a). These data support the hypothesis that core genes are responsible for central cell metabolism in Prochlorococcus.

Figure 4 Gene necessity analysis and COG functional enrichment of HEG. All coding-sequence genes were searched against the Database of Essential Genes (DEG8.0 [39]) using BLASTx (E-value = 1 × 10⁻⁴). (a) Comparison of the DEG-hit genes in the core and flexible genomes. (b) Comparison of gene expression subclasses
between DEG-hit and DEG-miss genes. (c) COG functional enrichment of HEG in the core genome. Statistical significance was assessed by Fisher’s exact test (one-tailed). P-values ≤ 0.05 are indicated in the figure. COG, clusters of orthologous groups; Core, the core genome; DEG-hit, genes with homologs identified in the database; DEG-miss, genes without any known homologs; Flexible, the flexible genome; Unk, unknown function.

We also compared the expression levels of the core MED4 genes that had homologs in the DEG database (DEG-hit) with those of genes that did not have any known homologs (DEG-miss). The HEG, LEG, and NEG subclasses showed no enrichment for either DEG-hit or DEG-miss genes (P > 0.1; Figure 4b). Although the MEG subclass had a significantly higher rate of DEG-hit genes (P < 0.001; Figure 4b), the mean expression level of the DEG-hit genes (mean RPKM = 602.62) was not significantly different from that of the DEG-miss genes (mean RPKM = 874.81; Student’s t-test, two-tailed P = 0.084). Consistent with previous reports [14, 40, 41], this suggests that essential genes are not necessarily highly expressed and that gene expression level affects sequence evolution relatively independently of gene essentiality in Prochlorococcus MED4.

We also performed functional enrichment analysis on each gene expression subclass. Because most genes in the flexible genome have no COG category [42], we focused mainly on the expression subclasses of the core genes, especially the HEG.
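
The comparison of DEG-hit proportions between the core and flexible genomes (Figure 4a) can be illustrated with a one-tailed Fisher's exact test. The following is a minimal sketch and not the analysis code used in this study. The function name `fisher_one_tailed` is hypothetical. The genome totals (1,251 core and 712 flexible genes) are not stated in the text; they are back-calculated from the reported 767 hits at 61.3% and 104 hits at 14.6%, and should be treated as approximate.

```python
from math import comb


def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the
    number of "hits" in row 1 given fixed row and column totals.
    """
    row1 = a + b          # size of group 1 (e.g. core genes)
    col1 = a + c          # total hits across both groups
    n = a + b + c + d     # all genes
    upper = min(row1, col1)
    # Sum exact integer counts of tables at least as extreme as observed.
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, row1)


# Counts from the text: 871 DEG-hit genes, 767 of them in the core genome.
core_hit = 767
flex_hit = 871 - core_hit          # 104
# ASSUMPTION: totals back-calculated from 61.3% and 14.6%; approximate.
core_miss = 1251 - core_hit        # 484
flex_miss = 712 - flex_hit         # 608

p_core = fisher_one_tailed(core_hit, core_miss, flex_hit, flex_miss)
print(f"core DEG-hit fraction: {core_hit / (core_hit + core_miss):.3f}")
print(f"flexible DEG-hit fraction: {flex_hit / (flex_hit + flex_miss):.3f}")
print(f"one-tailed P = {p_core:.3g}")
```

The same function could in principle be applied to the per-category tables behind Figure 4b and 4c, with one 2x2 table per expression subclass or COG category, although those counts are not given in this section.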