Brassica napus pan-genome Information Resource (BnPIR)

HOW TO CITE

Song, J. M., Guan, Z., Hu, J., Guo, C., Yang, Z., Wang, S., Liu, D., Wang, B., Lu, S., Zhou, R., Xie, W. Z., Cheng, Y., Zhang, Y., Liu, K., Yang, Q. Y., Chen, L. L., & Guo, L. (2020). Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus. Nature Plants, 6(1), 34–45. https://doi.org/10.1038/s41477-019-0577-7

Song, J. M., Liu, D. X., Xie, W. Z., Yang, Z., Guo, L., Liu, K., Yang, Q. Y., & Chen, L. L. (2021). BnPIR: Brassica napus pan-genome information resource for 1689 accessions. Plant Biotechnology Journal, 19(3), 412–414. https://doi.org/10.1111/pbi.13491

PAN BROWSER

We provide 1,771 tracks for selective display, including genes, transposable elements (TEs), expression profile data, presence frequency, and coverage of different accessions in the pan-genome. For the best access speed and performance, we recommend selecting fewer than 30 tracks at a time. Users can also filter the tracks in batches by region, country, subgroup, sequencing quality, and so on.

The track selection page:

JBrowse is still under continuous development. If an "Error message()" prompt appears while you are using the Pan Browser, it may be caused by an incompatibility with your browser's cookie rules. We recommend refreshing the page twice to clear the cache, or opening the page in another browser. Microsoft Edge and Chrome are the recommended browsers for viewing the Pan Browser.

SEARCH

The SEARCH option provides several features: Genes, for information on rapeseed genes; Species, for information on 1688 rapeseed accessions; Gene Expression, for the expression levels of ~100 thousand genes; Transposable Elements, for rapeseed repeat data; Population Variation, for ~50 million variants; and NLR Genes, for ~3,000 NLRs.

Genes

We collected information on 773,064 B. napus genes. Users can retrieve orthologous gene information across the different genomes by searching with a gene locus, a gene symbol, or an Arabidopsis gene name (for example, FT). The results are shown in a table, and we recommend clicking each link for more detail on the genes of interest.

Species

In the Species section, the map shows the distribution of the 1688 rapeseed accessions. The ecotype, region, and sequencing depth of each accession are provided in the refreshed table.

Gene Expression

In the Gene Expression section, users can search with a gene locus to visualize its expression levels in 40 tissues throughout the flowering life cycle of eight rapeseed accessions. For a given gene, the expression profile in all available tissues is shown in a line chart.

Transposable Elements

In the Transposable Elements section, users can retrieve repeat sequences by searching with a transposon name, and can download the Rape Repeat Database.

Population Variation

In this section, you can choose a chromosome and a chromosome interval of interest, for example scaffoldA01. The SNPs in the interval and the SNP density are shown on the page.
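
As a rough illustration of what an SNP density summary contains, the sketch below counts SNPs in fixed-size windows along an interval. The positions, the 100 kb window size, and all variable names are assumptions made for illustration; the window size used by BnPIR is not stated here.

```r
# Made-up SNP positions (bp) within one chromosome interval; not BnPIR data
snp_pos <- c(1200, 5300, 5400, 99000, 150500, 151000, 260000)
window <- 100000        # assumed window size (100 kb)
interval_end <- 300000  # assumed end of the interval (bp)

n_windows <- ceiling(interval_end / window)

# 1-based window index of each SNP, then the number of SNPs per window
window_id <- (snp_pos - 1) %/% window + 1
snp_count <- tabulate(window_id, nbins = n_windows)

# Density expressed as SNPs per kb
snp_density <- snp_count / (window / 1000)

data.frame(start = (seq_len(n_windows) - 1) * window + 1,
           end = seq_len(n_windows) * window,
           snp_count = snp_count,
           snp_density = snp_density)
```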

NLR Genes

The bar chart shows the percentage of NLR genes in each of the eight accessions. The number of genes in each NLR class is shown in a violin plot.

GBROWSE

The GBROWSE option provides two features. Gbrowse Synteny, the main function of this section, shows the synteny among genomes together with the gene index. Genome Browser provides a browser for each of the 8 genomes, for example ZS11, Gangan, and Westar.

Gbrowse Synteny

In this section, you can choose a genome and a chromosome interval of interest, then click the "Search" button. The page shows the collinearity between the other genomes and the genome you selected.

Genome Browser for ZS11

TOOLS

This section has nine main functions, including Gene Index, Blast, KEGG/GO Enrichment, Homologous Regions, Orthologous, and Phylogenetic Tree. In Gene Index, we constructed a unique gene index across the nine B. napus genomes, so that you can retrieve genes of interest among different rapeseed lines. In Blast, you can obtain a summary of the matches to your sequence by selecting the type of BLAST and the reference genomes. In KEGG/GO Enrichment, you can upload a list of genes of interest from different datasets to perform enrichment analysis based on the hypergeometric test. In Homologous Regions, the A subgenomes of the eight rapeseed accessions were aligned to the C subgenomes; the homologous regions in the eight genomes are illustrated in a Circos chart, and you can download the details of the syntenic links. In Orthologous, you can search orthologous groups with a locus from Arabidopsis, Brassica rapa, Brassica oleracea, or the eight Brassica napus accessions, and download the list of orthologous groups. In Phylogenetic Tree, all genes were classified into core genes, subspecies imbalance genes, subspecies specific genes, and random genes according to their presence frequency in the different subspecies, and you can view the distribution of gene presence across the 1688 accessions.

Gene Index

Gene Index of B. napus genomes. To make it easier to compare genes among rapeseed lines and to retrieve genes of interest, we constructed a unique gene index across the nine B. napus genomes. The figure shows an example, the gene index HUBnaA01G0071, in ZS11 and the other eight genomes. The exon-intron structure of HUBnaA01G0071 is conserved in all nine genomes.

Blast

BLAST finds regions of similarity between biological sequences. The program compares nucleotide or protein sequences to sequence databases (B. rapa, B. oleracea, and eight B. napus accessions) and calculates the statistical significance of the matches.

KEGG/GO Enrichment

One of the main uses of GO/KEGG is enrichment analysis of gene sets. For example, given a set of genes that are up-regulated under certain conditions, an enrichment analysis finds which GO/KEGG terms are over-represented (or under-represented) using the annotations for that gene set. BnPIR collected the GO and KEGG gene annotations of the eight rapeseed genomes, and you can upload an "Interest gene List" to perform enrichment analysis based on the hypergeometric test.
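
The over-representation test described above can be sketched in base R with phyper(). This is a minimal illustration, not the BnPIR server code: the helper name enrichment_p, the counts, and the Benjamini-Hochberg correction step are assumptions added for the example.

```r
# Hypothetical helper: one-sided hypergeometric test for over-representation.
#   k: genes in the uploaded list that are annotated with the term
#   n: size of the uploaded gene list
#   K: genes in the background that are annotated with the term
#   N: size of the background (all annotated genes)
enrichment_p <- function(k, n, K, N) {
  # P(X >= k) for X ~ Hypergeometric(white = K, black = N - K, draws = n)
  phyper(k - 1, K, N - K, n, lower.tail = FALSE)
}

# Made-up counts for two terms, for illustration only
p_values <- c(term_a = enrichment_p(k = 12, n = 200, K = 300, N = 20000),
              term_b = enrichment_p(k = 3,  n = 200, K = 500, N = 20000))

# When many terms are tested, p-values are usually adjusted for multiple
# testing; Benjamini-Hochberg is one common choice (an assumption here)
p.adjust(p_values, method = "BH")
```

A small p-value for a term means that the uploaded list contains more genes with that annotation than would be expected when drawing n genes at random from the background.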

Orthologous

You can search orthologous groups with a locus from any of the species (for example AT5G52390, BraA02g06253Z, or BnaA02G0145800ZS) by submitting your query in the search box above. A total of 1,009,733 representative peptide sequences from Arabidopsis (release 10), Brassica rapa (NP), Brassica oleracea (NP), and eight B. napus accessions (release 1.0) were used to identify putative orthologous groups of each gene in Arabidopsis using OrthoMCL with default parameters. A total of 27,628 putative orthologous groups, including in-paralogs, were identified.

Phylogenetic Tree

Compared with the ZS11 RefSeq, the pan-genome adds 781.9 Mb of DNA sequence and 21,020 coding genes. All genes were classified into core genes and distributed genes by their presence in each variety. Distributed genes were further divided into subspecies imbalance genes (frequency in one subspecies significantly higher than in the other subspecies, P value < 0.05), subspecies specific genes (>95% in one subspecies), and random genes, according to the presence frequency of the genes in the different subspecies. Breeders are advised to focus on the accessions in which a gene is present when selecting breeding donors. The phylogenetic tree was constructed in RStudio using the following script.

# Load R packages
library("ggtree")
library("ggplot2")

# setwd("path/to/your/data")  # Optional: set the working directory first
tree <- read.tree("yourtree.nhk")  # Read your tree file
# Ecotype file: accession names in the first column, ecotype in a "Group" column
group_file <- read.table("species_ecotype.txt", header = TRUE, row.names = 1)
groupInfo <- split(row.names(group_file), group_file$Group)
tree <- groupOTU(tree, groupInfo)  # Attach the ecotype groups to the tree

# Generate the tree, colored by ecotype group
ggtree(tree, layout = "circular", ladderize = TRUE, aes(color = group)) +
  theme(legend.position = "top",
        legend.title = element_blank(),
        legend.key.width = unit(2, "cm"),
        legend.text = element_text(size = 12)) +
  # The number of colors corresponds to the number of ecotypes
  scale_color_manual(values = c("#C5773F", "#B32D2C", "#D4D4D4", "#3C5293"))
ggsave("Tree.png", width = 3, height = 3)
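
The gene classification described in this section can be sketched on a toy presence/absence matrix. Everything below is an assumption made for illustration: the matrix, the subspecies labels, the helper name classify_gene, treating "core" as present in every accession, and reading ">95% in one subspecies" as more than 95% of the carrying accessions belonging to a single subspecies. The statistical test for subspecies imbalance genes is not shown.

```r
# Toy presence/absence matrix: rows = genes, columns = accessions (1 = present)
pav <- rbind(
  geneA = c(1, 1, 1, 1, 1, 1),
  geneB = c(1, 1, 1, 0, 0, 0),
  geneC = c(1, 0, 1, 0, 1, 0)
)
colnames(pav) <- paste0("acc", 1:6)

# Hypothetical subspecies label of each accession, in column order
subspecies <- c("winter", "winter", "winter", "spring", "spring", "semi")

classify_gene <- function(presence, groups, specific_cutoff = 0.95) {
  if (sum(presence) == 0) return("absent")
  if (all(presence == 1)) return("core")  # assumed definition of a core gene
  # Share of the carrying accessions that falls in each subspecies
  share <- tapply(presence, groups, sum) / sum(presence)
  if (max(share) > specific_cutoff) return("subspecies-specific")
  "other distributed"  # imbalance vs. random would need a significance test
}

classes <- apply(pav, 1, classify_gene, groups = subspecies)
classes
```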

PATHWAY

In this section, you can obtain all the KEGG pathways of the eight rapeseed accessions. You can search for a KO ID of interest, and the corresponding metabolic pathways are shown on the page.