Using RCircos for High-Quality Genomic Data Plots: A Step-by-Step Guide

Introduction to RCircos Package for Plotting Genomic Data

The RCircos package is an R tool for drawing Circos-style circular plots of genomic data. It is particularly useful for visualizing chromosome structure and for showing links between genomic positions. This article walks through preparing genomic data for RCircos and gives an overview of how to create high-quality plots.

Installing and Loading the RCircos Package

Before we dive into the details, ensure that you have installed the RCircos package in R using the following command:

install.packages("RCircos")

To load the package, use the following command:

library(RCircos)

Understanding the Data Requirements for RCircos

RCircos expects genomic data in a specific tabular layout. The ideogram table that defines the chromosomes starts with Chromosome, ChromStart, and ChromEnd columns, followed by Band and Stain. Data tracks such as gene labels likewise begin with a chromosome name and the start and end positions of each feature on that chromosome.

Positioning Information for Genes

In the provided example, the pts3 dataset contains positional information for genes on chromosomes. To use this data with RCircos, it needs columns that position each chromosome or gene in the plot:

  • ChromStart: The start position of a chromosome or gene.
  • ChromEnd: The end position of a chromosome or gene.
  • startLoc and endLoc: Optional columns that can be used to reposition genes along their respective chromosomes.
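
As a rough illustration of the layout described above, the sketch below builds a small gene table and checks that the positional columns are present. The genes data frame, its values, and the helper has_position_columns() are hypothetical stand-ins; the actual pts3 dataset is not shown here.

```r
# Hypothetical gene table; names and positions are made up for illustration.
genes <- data.frame(
  Chromosome = c("chr1", "chr1", "chr2"),
  ChromStart = c(33, 120, 45),
  ChromEnd   = c(80, 310, 150),
  Gene       = c("geneA", "geneB", "geneC")
)

# Hypothetical helper: TRUE when the positional columns exist, are numeric,
# and no start position exceeds its end position.
has_position_columns <- function(df) {
  required <- c("Chromosome", "ChromStart", "ChromEnd")
  all(required %in% names(df)) &&
    is.numeric(df$ChromStart) && is.numeric(df$ChromEnd) &&
    all(df$ChromStart <= df$ChromEnd)
}

has_position_columns(genes)
```

A check like this is cheap to run before handing data to any plotting function, and it tends to catch missing or swapped columns early.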

Preparing Data for RCircos

To prepare our data, we need to ensure that it conforms to the required format. In this case, we have already formatted our data to include ChromStart and ChromEnd columns, which will serve as the core of our plot.

Converting ChromStart Positions to 0

As mentioned in the provided example, one potential issue arises when the start position of a chromosome is not set to 0, but rather continues from the previous chromosome. To resolve this, we need to reformat the data so that all ChromStart positions are 0.

For instance, suppose chromosome 2 is recorded as running from 2343 to 4683 because its coordinates continue from the end of chromosome 1. A gene at cumulative position 2376 actually lies 33 bases into chromosome 2. After rebasing, chromosome 2 runs from 0 to 2340 and that gene sits at position 33, so every position is measured from the start of its own chromosome.
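
The rebasing step can be sketched in base R as follows. The cumulative table reuses the chromosome boundaries from this article, while the example gene and the helper rebase_to_zero() are hypothetical and only illustrate the arithmetic.

```r
# Chromosome boundaries in cumulative coordinates (each start continues
# from the previous chromosome's end).
cumulative <- data.frame(
  Chromosome = c("chr1", "chr2", "chr3", "chr4", "chr5", "chr6"),
  ChromStart = c(0, 2343, 4684, 6918, 8711, 10277),
  ChromEnd   = c(2342, 4683, 6917, 8710, 10276, 11735)
)

# Hypothetical helper: subtract each chromosome's cumulative start so
# that positions are measured from 0 on every chromosome.
rebase_to_zero <- function(df, chrom_table) {
  idx <- match(as.character(df$Chromosome), as.character(chrom_table$Chromosome))
  offset <- chrom_table$ChromStart[idx]
  df$ChromStart <- df$ChromStart - offset
  df$ChromEnd   <- df$ChromEnd - offset
  df
}

# Rebase the chromosomes themselves, then a made-up gene on chr2.
rebased <- rebase_to_zero(cumulative, cumulative)
gene <- data.frame(Chromosome = "chr2", ChromStart = 2376, ChromEnd = 2400)
rebase_to_zero(gene, cumulative)
```

After this step, ChromEnd in rebased holds each chromosome's length, and the same helper can be applied to any table of features that uses the cumulative coordinates.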

Setting Up RCircos

To create our plot using RCircos, we’ll use the RCircos.Set.Core.Components() function to initialize the package with our data. This function takes in several parameters, including:

  • cyto.info: A data frame of chromosome ideogram data (chromosome names, start and end positions, band names, and stain values).
  • chr.exclude: A vector of chromosome names to exclude from the plot, or NULL to keep all chromosomes.
  • tracks.inside and tracks.outside: The number of data tracks to reserve inside and outside the chromosome ideogram.
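
A minimal sketch of how these parameters fit together, assuming the human ideogram table UCSC.HG19.Human.CytoBandIdeogram that ships with RCircos. The track counts and the excluded chromosomes are arbitrary choices for illustration, and the RCircos call is guarded so the snippet still runs where the package is not installed.

```r
# Illustrative argument values; adjust them to your own data.
chr.exclude    <- c("chrX", "chrY")  # chromosomes to leave out of the plot
tracks.inside  <- 5                  # data tracks inside the ideogram
tracks.outside <- 0                  # data tracks outside the ideogram

if (requireNamespace("RCircos", quietly = TRUE)) {
  library(RCircos)
  # Example ideogram table bundled with the package.
  data(UCSC.HG19.Human.CytoBandIdeogram)
  RCircos.Set.Core.Components(
    cyto.info      = UCSC.HG19.Human.CytoBandIdeogram,
    chr.exclude    = chr.exclude,
    tracks.inside  = tracks.inside,
    tracks.outside = tracks.outside
  )
}
```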

Initial Setup

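Let’s create a basic setup using our prepared data. Each chromosome starts at 0, ChromEnd holds its length, and placeholder Band and Stain columns complete the five-column ideogram layout:

core.chrom <- data.frame(
  "Chromosome" = c("chr1", "chr2", "chr3", "chr4", "chr5", "chr6"),
  "ChromStart" = rep(0, 6),                            # every chromosome starts at 0
  "ChromEnd" = c(2342, 2340, 2233, 1792, 1565, 1458),  # chromosome lengths
  "Band" = paste0("band", 1:6),                        # placeholder band names
  "Stain" = "gneg"                                     # placeholder stain value
)

# Initialize RCircos (for chromosomes this short, the base.per.unit plot parameter may need lowering).
RCircos.Set.Core.Components(cyto.info = core.chrom, chr.exclude = NULL,
                            tracks.inside = 10, tracks.outside = 0)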
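
Once the core components are set, the ideogram still has to be drawn on a graphics device. The sketch below shows one possible sequence, using the human ideogram bundled with RCircos rather than the table above. The output file name is an arbitrary choice, and the RCircos calls are guarded so the snippet runs even where the package is missing.

```r
# Arbitrary output path in the session's temporary directory.
out.file <- file.path(tempdir(), "rcircos_ideogram.pdf")

if (requireNamespace("RCircos", quietly = TRUE)) {
  library(RCircos)
  data(UCSC.HG19.Human.CytoBandIdeogram)
  RCircos.Set.Core.Components(
    cyto.info = UCSC.HG19.Human.CytoBandIdeogram,
    chr.exclude = NULL, tracks.inside = 5, tracks.outside = 0
  )

  pdf(file = out.file, height = 8, width = 8)  # open the device first
  RCircos.Set.Plot.Area()                      # set up the empty plot area
  RCircos.Chromosome.Ideogram.Plot()           # draw the chromosomes
  dev.off()                                    # close the device to write the file
}
```

Writing to a file instead of the screen keeps the plot dimensions fixed, which makes the result easier to reproduce.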

Customizing the Plot

We can customize our plot further by adjusting various parameters and adding additional features.

Adding Additional Features

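For example, we might want to draw links between positions on different chromosomes. RCircos expects link data with six columns, three for each end of the link, and draws it with RCircos.Link.Plot():

# One link per row: the first three columns give one end, the last three the other.
# Positions are illustrative and relative to each chromosome's own start.
core.link <- data.frame(
  "Chromosome" = "chr1", "chromStart" = 100, "chromEnd" = 200,
  "Chromosome.1" = "chr2", "chromStart.1" = 300, "chromEnd.1" = 400
)

# Draw the link in the first inside track (requires an initialized, open plot).
RCircos.Link.Plot(link.data = core.link, track.num = 1, by.chromosome = FALSE)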
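
Before plotting links, it can help to confirm that both ends of every link fall within their chromosomes. The sketch below assumes RCircos's six-column link layout (chromosome, start, and end for each end of the link) and per-chromosome coordinates that start at 0. The chrom.len lengths are derived from the chromosome boundaries in this article, while the links table and the helper links_in_range() are hypothetical.

```r
# Chromosome lengths (end minus start of each chromosome).
chrom.len <- c(chr1 = 2342, chr2 = 2340, chr3 = 2233,
               chr4 = 1792, chr5 = 1565, chr6 = 1458)

# Made-up links; the second one runs past the end of chr6 on purpose.
links <- data.frame(
  Chromosome   = c("chr1", "chr3"),
  chromStart   = c(100, 500),
  chromEnd     = c(200, 600),
  Chromosome.1 = c("chr2", "chr6"),
  chromStart.1 = c(300, 1400),
  chromEnd.1   = c(400, 1500)
)

# Hypothetical helper: one logical per link, TRUE when both ends lie
# between 0 and the length of their chromosome.
links_in_range <- function(links, chrom.len) {
  end_ok <- function(chr, start, end) {
    start >= 0 & start <= end & end <= chrom.len[as.character(chr)]
  }
  unname(
    end_ok(links[[1]], links[[2]], links[[3]]) &
      end_ok(links[[4]], links[[5]], links[[6]])
  )
}

links_in_range(links, chrom.len)  # the second link fails the check
```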

Conclusion

In this article, we’ve covered the basics of preparing genomic data for use with RCircos and walked through the process of creating high-quality plots using the package. By following these steps and experimenting with different parameters, you can create informative and visually striking plots to help understand the structure of your chromosomes.

Remember that customizing your plot requires some trial and error, so be patient and don’t hesitate to experiment until you achieve the desired results!


Last modified on 2024-04-30