regionplot() displays the association results for a smaller genetic region within one chromosome. It requires at least one dataset (dataframe) containing the association data (with columns CHROM, POS, P in upper- or lowercase) and either a variant ID, a gene name, or a genetic region given as a chromosome together with start and stop positions (either as a single string or as three separate arguments).

All other input parameters are optional.
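
The ways of selecting the region described above can be sketched as follows. This is a minimal sketch, assuming the topr package and the CD_UKBB dataset used in the Examples section are available; the coordinates are the ones given in the description of the region argument and serve only as an illustration.

```r
# Coordinates taken from the example in the `region` argument description.
chrom <- "chr1"
start <- 67038906L
end   <- 67359979L

# Single-string form: "<chromosome>:<start>-<end>"
region_string <- sprintf("%s:%d-%d", chrom, start, end)
print(region_string)

# Only attempt the plotting calls when topr is installed.
if (requireNamespace("topr", quietly = TRUE)) {
  library(topr)
  # 1) Region given as a single string
  regionplot(CD_UKBB, region = region_string)
  # 2) The same region given as three separate arguments
  regionplot(CD_UKBB, chr = chrom, xmin = start, xmax = end)
  # 3) Region derived from a gene name (widened by gene_padding)
  regionplot(CD_UKBB, gene = "IL23R")
}
```

The variant argument can be used in the same way when a variant ID rather than a gene or region is known.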

regionplot(
  df,
  ntop = 10,
  annotate = NULL,
  xmin = 0,
  size = 2,
  shape = 19,
  alpha = 1,
  label_size = 4,
  annotate_with = "ID",
  color = get_topr_colors(),
  axis_text_size = 11,
  axis_title_size = 12,
  title_text_size = 13,
  show_genes = NULL,
  show_overview = TRUE,
  show_exons = FALSE,
  max_genes = 200,
  sign_thresh = 5e-09,
  sign_thresh_color = "red",
  sign_thresh_label_size = 3.5,
  xmax = NULL,
  ymin = NULL,
  ymax = NULL,
  protein_coding_only = FALSE,
  region_size = 1e+06,
  gene_padding = 1e+05,
  angle = 0,
  legend_title_size = 12,
  legend_text_size = 11,
  nudge_x = 0.01,
  nudge_y = 0.01,
  rsids = NULL,
  variant = NULL,
  rsids_color = NULL,
  legend_name = "",
  legend_position = "right",
  chr = NULL,
  vline = NULL,
  show_gene_names = NULL,
  legend_labels = NULL,
  gene = NULL,
  title = NULL,
  label_color = NULL,
  locuszoomplot = FALSE,
  region = NULL,
  legend_nrow = NULL,
  gene_label_size = NULL,
  scale = 1,
  show_legend = TRUE,
  sign_thresh_linetype = "dashed",
  sign_thresh_size = 0.5,
  rsids_with_vline = NULL,
  annotate_with_vline = NULL,
  show_gene_legend = TRUE,
  unit_main = 7,
  unit_gene = 2,
  unit_overview = 1.25,
  verbose = NULL,
  gene_color = NULL,
  segment.size = 0.2,
  segment.color = "black",
  segment.linetype = "solid",
  max.overlaps = 10,
  unit_ratios = NULL,
  extract_plots = FALSE,
  label_fontface = "plain",
  label_family = "",
  gene_label_fontface = "plain",
  gene_label_family = "",
  build = 38,
  label_alpha = 1
)

Arguments

df

Dataframe or a list of dataframes of association results (required columns are CHROM, POS, P, in upper- or lowercase).

ntop

An integer, number of datasets (GWASes) to show on the top plot

annotate

A number (p-value). Display annotation for variants with p-values below this threshold

xmin, xmax

Integer, setting the chromosomal range to display on the x-axis

size

A number or a vector of numbers, setting the size of the plot points (default: size=2)

shape

A number or a vector of numbers, setting the shape of the plotted points

alpha

A number or a vector of numbers setting the transparency of the plotted points

label_size

A number to set the size of the plot labels (default: label_size=4)

annotate_with

A string. Annotate the variants with either Gene_Symbol or ID (default: "ID")

color

A string or a vector of strings, for setting the color of the datapoints on the plot

axis_text_size

A number, size of the x and y axis tick labels (default: 11)

axis_title_size

A number, size of the x and y axis titles (default: 12)

title_text_size

A number, size of the plot title (default: 13)

show_genes

A logical scalar, show genes instead of exons (default show_genes=FALSE)

show_overview

A logical scalar, shows/hides the overview plot (default= TRUE)

show_exons

Deprecated: A logical scalar, show exons instead of genes (default show_exons=FALSE)

max_genes

An integer, only label the genes if there are fewer than max_genes (default value is 200).

sign_thresh

A number or vector of numbers, setting the horizontal significance threshold (default: sign_thresh=5e-09). Set to NULL to hide the significance threshold.

sign_thresh_color

A string or vector of strings to set the color/s of the significance threshold/s

sign_thresh_label_size

A number setting the text size of the label for the significance thresholds (default text size is 3.5)

ymin, ymax

Integer, min and max of the y-axis (default values: ymin=0, ymax=max(-log10(df$P)))

protein_coding_only

A logical scalar, if TRUE, only protein coding genes are used for annotation

region_size

An integer (default = 1000000), or a string such as 100kb or 1MB, indicating the window size for variant labeling. Increase this number for sparser annotation and decrease it for denser annotation.

gene_padding

An integer representing the size of the region shown around the gene when the gene argument is used (default = 100000)

angle

A number, the angle of the text label

legend_title_size

A number, size of the legend title

legend_text_size

A number, size of the legend text

nudge_x

A number to horizontally adjust the starting position of each gene label (this is a ggrepel parameter)

nudge_y

A number to vertically adjust the starting position of each gene label (this is a ggrepel parameter)

rsids

A string (rsid) or vector of strings to highlight on the plot, e.g. rsids=c("rs1234", "rs45898")

variant

The variant to zoom in on. Can be either a string (an rsid) or a dataframe (with the columns CHROM, POS, P)

rsids_color

A string, the color of the variants given in rsids (default color is red)

legend_name

A string, used to change the name of the legend (default: no name)

legend_position

A string, one of top, bottom, left or right (default: right)

chr

A string or integer, the chromosome to plot (e.g. chr15). Only required if the input dataframe contains results from more than one chromosome

vline

A number or vector of numbers to add a vertical line to the plot at a specific chromosomal position, e.g. vline=204000066. Multiple values can be provided in a vector, e.g. vline=c(204000066,100500188)

show_gene_names

A logical scalar, if set to TRUE, gene names are shown even if the number of genes exceeds max_genes

legend_labels

A string or vector of strings representing legend labels for the input datasets

gene

A string representing the gene to zoom in on (e.g. gene="FTO")

title

A string to set the plot title

label_color

A string or a vector of strings, used to change the color of the gene or variant labels

locuszoomplot

A logical scalar (default: FALSE). It is only set to TRUE when the plot is created by calling the locuszoom function

region

A string representing a genetic region, e.g. chr1:67038906-67359979

legend_nrow

An integer, sets the number of rows allowed for the legend labels

gene_label_size

A number setting the size of the gene labels shown at the bottom of the plot

scale

A number, to change the size of the title and axes labels and ticks at the same time (default = 1)

show_legend

A logical scalar, set to FALSE to hide the legend (default value is TRUE)

sign_thresh_linetype

A string, the line-type of the horizontal significance threshold (default = dashed)

sign_thresh_size

A number, sets the size of the horizontal significance threshold line (default = 0.5)

rsids_with_vline

A string (rsid) or vector of strings to highlight on the plot with their rsids and vertical lines further highlighting their positions

annotate_with_vline

A number (p-value). Display annotation and vertical lines for variants with p-values below this threshold

show_gene_legend

A logical scalar, set to FALSE to hide the gene legend (default value is TRUE)

unit_main

The height unit of the main plot (default = 7)

unit_gene

The height unit of the gene plot (default = 2)

unit_overview

The height unit of the overview plot (default = 1.25)

verbose

Logical, set to FALSE to suppress printed information

gene_color

A string representing a color, can be used to change the color of the genes/exons on the geneplot

segment.size

Line segment thickness (ggrepel argument)

segment.color

Line segment color (ggrepel argument)

segment.linetype

Line segment type: solid, dashed, etc. (ggrepel argument)

max.overlaps

Exclude text labels that overlap too many things. Defaults to 10 (ggrepel argument)

unit_ratios

A string of three numbers separated by ":", giving the height ratios of the overview, main and gene plots, e.g. 1.25:7:2

extract_plots

Logical, FALSE by default. Set to TRUE to extract the three plots separately in a list

label_fontface

A string or a vector of strings. Label font “plain”, “bold”, “italic”, “bold.italic” (ggrepel argument)

label_family

A string or a vector of strings. Label font name (default ggrepel argument is "")

gene_label_fontface

Gene label font “plain”, “bold”, “italic”, “bold.italic” (ggrepel argument)

gene_label_family

Gene label font name (default ggrepel argument is "")

build

A number representing the genome build. Set to 37 to use build 37 (GRCh37). The default is build 38 (GRCh38).

label_alpha

A number or vector of numbers to set the transparency of the plot labels (default: label_alpha=1)
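
Several of the arguments above accept vectors, with one element per dataset, threshold or variant. The following is a hedged sketch of how such vectors might be built and passed. It assumes topr and its CD_UKBB dataset are installed; the second dataset (cd_rescaled), the second threshold (1e-06) and the colors are hypothetical and chosen only for illustration, and the rsids are the placeholders from the rsids description.

```r
# One rsid per vector element, as opposed to one comma-separated string.
rsids <- c("rs1234", "rs45898")
length(rsids)

# Significance thresholds are given as p-values. The y-axis shows -log10(P),
# so each threshold line sits at -log10(threshold).
thresholds  <- c(5e-09, 1e-06)   # 1e-06 is a hypothetical second threshold
threshold_y <- -log10(thresholds)
print(threshold_y)

if (requireNamespace("topr", quietly = TRUE)) {
  library(topr)
  # Hypothetical second dataset: a copy of CD_UKBB with inflated p-values,
  # used only to have two inputs with the same columns.
  cd_rescaled <- transform(CD_UKBB, P = pmin(1, P * 10))

  regionplot(
    list(CD_UKBB, cd_rescaled),               # df as a list of dataframes
    gene              = "IL23R",
    legend_labels     = c("CD_UKBB", "rescaled copy"),
    color             = c("darkblue", "orange"),
    sign_thresh       = thresholds,
    sign_thresh_color = c("red", "blue")
  )
  # A vector such as `rsids` above would be passed as rsids = rsids,
  # using identifiers that are actually present in the data.
}
```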

Value

plots within ggplotGrobs, arranged with egg::gtable_frame
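
As a rough illustration of how the plot heights relate to unit_ratios, the sketch below splits a ratio string into its three parts and, if topr is installed, passes it to regionplot and also requests the separate plots with extract_plots. The parsing code is only an illustration of the "overview:main:gene" order given in the unit_ratios description, not topr's own implementation, and the structure of the returned list is inspected rather than assumed.

```r
# Ratio string in the documented order: overview, main, gene.
unit_ratios <- "1.25:7:2"
heights <- as.numeric(strsplit(unit_ratios, ":", fixed = TRUE)[[1]])
names(heights) <- c("overview", "main", "gene")

# Share of the total height taken by each plot.
print(round(heights / sum(heights), 3))

if (requireNamespace("topr", quietly = TRUE)) {
  library(topr)
  # Combined figure with explicit height ratios
  regionplot(CD_UKBB, gene = "IL23R", unit_ratios = unit_ratios)

  # Separate plots returned in a list instead of one arranged figure
  plots <- regionplot(CD_UKBB, gene = "IL23R", extract_plots = TRUE)
  str(plots, max.level = 1)
}
```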

Examples

if (FALSE) {
# Plot the association results around the IL23R gene
library(topr)
regionplot(CD_UKBB, gene="IL23R")
}
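
A further sketch, showing how the annotate threshold selects variants. The toy dataframe and its variant IDs are hypothetical and only illustrate the "p-value below threshold" rule; which variants topr actually labels may also depend on region_size, since labeling is done per window. The plotting call assumes topr and CD_UKBB are available.

```r
# Hypothetical toy results with the required columns plus an ID column.
toy <- data.frame(
  CHROM = "chr1",
  POS   = c(67100000, 67150000, 67200000),
  P     = c(1e-12, 3e-07, 0.02),
  ID    = c("var_a", "var_b", "var_c")
)

# Variants with a p-value below the threshold are candidates for annotation.
annotate_threshold <- 5e-09
to_label <- toy$ID[toy$P < annotate_threshold]
print(to_label)

if (requireNamespace("topr", quietly = TRUE)) {
  library(topr)
  # Label variants below the threshold with their ID
  regionplot(CD_UKBB, gene = "IL23R",
             annotate = annotate_threshold, annotate_with = "ID")
}
```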