Introduction

R is an open-source programming language and environment for statistical computing, data science, visualization, and reproducible research. It is especially common in statistics, bioinformatics, econometrics, social science, machine learning, and academic research workflows.

R is built around vectorized data structures, interactive exploration, package-based extension, and strong graphics support. Its ecosystem is useful for the full research cycle: importing data, cleaning it, fitting statistical models, producing plots, running experiments, and publishing reports.
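
As a small illustration of the vectorized style, the sketch below applies arithmetic and comparisons to a whole vector without an explicit loop. The variable names and values are invented for the example.

```r
# Hypothetical measurements; names and values are illustrative only
heights_cm <- c(150, 165, 180)

# Arithmetic applies element-wise to the whole vector
heights_m <- heights_cm / 100

# Comparisons return a logical vector of the same length
is_tall <- heights_cm > 160

# A logical vector can be used to subset another vector
mean_tall_cm <- mean(heights_cm[is_tall])
```

The same pattern extends to data frame columns, which are themselves vectors.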

Typical Workflow

  1. Install R from CRAN or the system package manager.
  2. Create a project directory for code, data, figures, and reports.
  3. Use scripts, notebooks, or Quarto/R Markdown documents for reproducible work.
  4. Install packages with install.packages and load them with library.
  5. Keep project dependencies reproducible with tools such as renv.
  6. Record environment details with sessionInfo when sharing results.
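
Steps 2 and 4 above might be sketched as follows. This is a minimal example under stated assumptions: the directory layout, the project name, and the package list are hypothetical, and a temporary directory is used so the sketch does not touch real work.

```r
# Assumption: a throwaway location; a real project would use a permanent path
project_dir <- file.path(tempdir(), "my-analysis")

# Hypothetical layout for code, data, figures, and reports
subdirs <- c("R", "data", "figures", "reports")
for (d in subdirs) {
  dir.create(file.path(project_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# Illustrative package list; base packages are used here so the sketch
# runs offline. A real project might list "tidyverse" or "data.table".
needed <- c("stats", "utils")

# Check which packages are already installed
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!is_installed]

# Install only what is missing; this branch needs network access
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

Checking before installing avoids reinstalling packages each time a script runs. For step 5, renv::init() sets up a project-local library and renv::snapshot() records the package versions in use.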

R IDEs and Editors

RStudio is the most common IDE for R. It provides an editor, console, environment browser, plotting pane, package tools, project support, and integration with notebooks and reports.

Posit is the company behind RStudio and related data-science tooling.

Packages

R packages extend the language with functions, datasets, modeling tools, plotting systems, and interfaces to external software. CRAN is the central package repository, while Bioconductor is widely used for bioinformatics and computational biology.

Common package families:

  • tidyverse: data import, cleaning, transformation, and visualization.
  • ggplot2: grammar-of-graphics plotting.
  • dplyr and tidyr: data manipulation and reshaping.
  • data.table: high-performance tabular data processing.
  • knitr, rmarkdown, and quarto: reproducible reports.
  • shiny: interactive web applications.
  • tidymodels: modeling and machine learning workflows.
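
To give a feel for what such packages add, the sketch below computes the same grouped summary with dplyr and with base R. It uses the built-in mtcars dataset; the column name mean_mpg is chosen for the example, and the base R branch is a fallback for the case where dplyr is not installed.

```r
# Mean fuel economy per cylinder count, using the built-in mtcars dataset
if (requireNamespace("dplyr", quietly = TRUE)) {
  # dplyr version: group the rows, then summarise each group
  summary_tbl <- dplyr::summarise(
    dplyr::group_by(mtcars, cyl),
    mean_mpg = mean(mpg),
    .groups = "drop"
  )
} else {
  # Base R fallback: formula interface to aggregate()
  summary_tbl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
  names(summary_tbl)[2] <- "mean_mpg"
}

print(summary_tbl)
```

Calling functions as dplyr::summarise() rather than attaching the package with library() keeps the example explicit about where each function comes from.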

Basic Commands

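# Install once (skipped if already installed), then load for this session
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}
library(tidyverse)

# Inspect the working directory and the R and package versions in use
getwd()
sessionInfo()

# Read and write CSV files (paths are relative to the working directory)
df <- read.csv("data/input.csv")
write.csv(df, "data/output.csv", row.names = FALSE)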

Reproducibility Notes

Prefer project-relative paths, keep raw data separate from derived data, and place reusable code in scripts or package-style functions. For research projects, save both the analysis code and the package environment so results can be rerun later.

Useful habits:

  • Use one project directory per analysis.
  • Avoid manual edits to intermediate data files.
  • Keep figures, tables, and reports generated from source code.
  • Use version control for scripts and notebooks.
  • Capture package versions with renv or sessionInfo.
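
The habits above can be sketched in a few lines of base R. This is an illustrative example, not a prescribed layout: the directory names and file names are assumptions, a temporary directory stands in for the project root, and the built-in cars dataset stands in for raw data.

```r
# Assumption: a temporary directory stands in for the project root
project_dir <- file.path(tempdir(), "repro-demo")
raw_dir <- file.path(project_dir, "data", "raw")
derived_dir <- file.path(project_dir, "data", "derived")
dir.create(raw_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(derived_dir, recursive = TRUE, showWarnings = FALSE)

# Stand-in raw data: the built-in cars dataset (speed in mph, dist in ft)
raw_path <- file.path(raw_dir, "cars.csv")
write.csv(cars, raw_path, row.names = FALSE)

# Derived data is produced by code, never edited by hand
raw <- read.csv(raw_path)
derived <- transform(raw, speed_kmh = speed * 1.609344)
derived_path <- file.path(derived_dir, "cars_kmh.csv")
write.csv(derived, derived_path, row.names = FALSE)

# Save the R version and loaded packages next to the derived outputs
session_path <- file.path(derived_dir, "session-info.txt")
writeLines(capture.output(sessionInfo()), session_path)
```

Because every derived file is written by the script, deleting the derived directory and rerunning the code should reproduce it.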
