Chemometrics and computational physics are concerned with the analysis
of data arising in chemistry and physics experiments, as well as the
simulation of physico-chemical systems. Many of the functions in base
R are useful for these ends.
Chemometrics with R
by Ron Wehrens,
ISBN: 978-3-642-17840-5, Springer, 2011, provides an introduction to
multivariate statistics in the life sciences, as well as coverage of several
specific topics from the area of chemometrics; the examples in the book can be
reproduced using the packages
Modern Statistical Methods for Astronomy With R Applications
by Eric D. Feigelson and G. Jogesh Babu, ISBN-13: 9780521767279, Cambridge, 2012,
provides an introduction to statistics for astronomers and an
overview of the foremost methods being used in astrostatistical analysis,
illustrated by examples in R.
The book by Kurt Varmuza and Peter Filzmoser,
Introduction to Multivariate Statistical Analysis in Chemometrics,
ISBN 978-1-420-05947-2, CRC Press, 2009, is associated with the
A special issue of R News with a focus on
R in Chemistry
was published in August 2006. A special volume of Journal of Statistical Software (JSS) dedicated to
Spectroscopy and Chemometrics in R
was published in January 2007.
Please let us know if
we have omitted something of importance, or if a new package or function
should be mentioned here.
Linear Regression Models
Linear models can be fitted (via OLS) with
(from stats). A least squares solution for
Ax = b
can also be computed as
provides a means of constraining
to non-negative or non-positive values; the package
allows other bounds on
to be applied.
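As a minimal sketch of the two unconstrained routes mentioned above, assuming the stats function meant is lm() and that the QR-based solution is computed with qr.coef(qr(A), b). Both are base R functions, but the exact calls, the variable names and the simulated data are assumptions made for illustration:

```r
# Simulated data: y depends linearly on two predictors plus small noise
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.1)

# Route 1: OLS through the model-formula interface
fit     <- lm(y ~ x1 + x2)
coef_lm <- coef(fit)

# Route 2: the same problem written as A x = b, solved via a QR decomposition
A       <- cbind(1, x1, x2)          # design matrix with an intercept column
coef_qr <- qr.coef(qr(A), y)

# Both routes minimise the same residual sum of squares
print(rbind(lm = unname(coef_lm), qr = unname(coef_qr)))
```

The formula interface is usually more convenient for statistical work (standard errors, diagnostics), while the matrix form is closer to how the problem is stated in numerical texts.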
Functions for isotonic regression are available in the package
and are useful to determine the unimodal vector that is closest to
a given vector
under least squares criteria.
Heteroskedastic linear models can be fit using the
function of the
Nonlinear Regression Models
(from stats) as well as the package
allow the solution of nonlinear
least squares problems.
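A nonlinear least squares fit can be sketched as follows, assuming the stats function meant is nls(). The first-order decay model, the parameter names a and k, and the simulated data are hypothetical and only serve as an illustration:

```r
# Simulated first-order decay: conc = a * exp(-k * time) plus noise
set.seed(42)
time_pts <- seq(0, 10, by = 0.5)
conc     <- 5 * exp(-0.3 * time_pts) + rnorm(length(time_pts), sd = 0.05)

# nls() needs starting values reasonably close to the solution
fit <- nls(conc ~ a * exp(-k * time_pts),
           start = list(a = 4, k = 0.2))

est <- coef(fit)   # named vector with the estimates of a and k
print(est)
```

Poor starting values are the most common reason for nls() to fail to converge, so it is worth plotting the data and the starting curve first.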
unequal variances can be modeled using the
function of the
package provides functions for
Principal Tensor Analysis on k modes.
The package also includes some other multiway methods:
PCAn (Tucker-n) and PARAFAC/CANDECOMP.
Multivariate curve resolution alternating least squares (MCR-ALS)
is implemented in the package
package provides MCR-ALS support for liquid chromatography with photodiode array detection
(LC-DAD) data with
many injections, with features for peak alignment and identification.
provides functions for the analysis
of one or multiple non-linear curves with focus on models for
concentration-response, dose-response and time-response data.
Partial Least Squares
Partial Least Squares Regression (PLSR) and Principal
Component Regression (PCR).
least squares-partial least squares (LS-PLS) method.
Penalized Partial Least Squares is implemented in the
Sparse PLS is implemented in the package
generalized partial least squares, based on the Iteratively
ReWeighted Least Squares (IRWLS) method of Brian Marx.
contains, in addition to the
usual functions for PLS regression, also functions for
Principal Component Analysis
Principal component analysis (PCA) is in the package stats as functions
princomp(). Some graphical PCA representations can be
found in the
package provides nonlinear
PCA and, by defining sets, nonlinear canonical
correlation analysis (models of the Gifi-family).
A desired number of robust principal components can be computed
package. The package
is applicable to sparse PCA. The package
can be applied to restricted MLE for functional PCA.
task view for further packages dealing with
PCA and other projection methods.
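A basic PCA with the stats package can be sketched as below. The example assumes prcomp() (the SVD-based counterpart of princomp() in stats) and uses the built-in iris measurements purely as stand-in data:

```r
# Four numeric measurements from the built-in iris data set
X <- as.matrix(iris[, 1:4])

# Centre and scale the variables, then compute the principal components
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Proportion of total variance carried by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scores on the first two components, e.g. for a score plot
scores <- pca$x[, 1:2]

print(round(var_explained, 3))
```

Scaling matters when variables are measured in different units; for spectra recorded on a common scale, centring alone is often preferred.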
Factor analysis (FA) is in the package stats as functions
task view for details on extensions.
Compositional Data Analysis
provides functions for the consistent analysis of compositional data (e.g. portions of substances) and positive numbers (e.g. concentrations).
See also the book,
Analyzing Compositional Data with
by K. Gerald van den Boogaart and Raimon Tolosana-Delgado,
ISBN: 978-3-642-36808-0, Springer, 2013.
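To illustrate what a consistent treatment of compositions involves, the following base R sketch applies the closure operation and the centred log-ratio (clr) transform. It does not use the package described above; the helper names closure() and clr() and the example data are hypothetical:

```r
# Hypothetical compositions: parts of three substances, one sample per row
comp <- rbind(c(20, 30, 50),
              c(10, 10, 80))

# Closure: rescale each sample so that its parts sum to 1
closure <- function(x) x / rowSums(x)

# Centred log-ratio transform: log parts minus the row mean of the log parts
clr <- function(x) {
  lx <- log(x)
  lx - rowMeans(lx)
}

p <- closure(comp)
z <- clr(p)

print(z)
```

The clr coordinates of each sample sum to zero and do not change when a sample is multiplied by a constant, which is why they are a common starting point for multivariate analysis of compositions.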
Independent Component Analysis
Independent component analysis (ICA) can be computed using
task view provides a list of packages that can be
used for clustering problems.
Stepwise variable selection for linear models, using AIC, is available
selection, by default using Mallows' Cp.
stepwise variable selection for penalized logistic regression.
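AIC-based stepwise selection for a linear model can be sketched as follows, assuming the function meant is step() from stats. The mtcars data and the choice of candidate predictors are illustrative only:

```r
# Full model with several candidate predictors (built-in mtcars data)
full <- lm(mpg ~ wt + hp + disp + qsec + drat, data = mtcars)

# Stepwise search in both directions; trace = 0 suppresses the progress log
sel <- step(full, direction = "both", trace = 0)

# step() compares models with extractAIC(); element 2 is the AIC value
aic_full <- extractAIC(full)[2]
aic_sel  <- extractAIC(sel)[2]

print(formula(sel))
print(c(full = aic_full, selected = aic_sel))
```

Stepwise searches are greedy, so the selected model is not guaranteed to be the best subset; exhaustive or leaps-and-bounds searches address that at higher computational cost.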
Variable selection based on evolutionary
algorithms is available in package and
subselect. The latter also provides simulated annealing and
leaps-and-bounds algorithms, as well as local refinements.
provides variable selection methods for random
forests. Cross-validation-based variable selection using Wilcoxon rank
sum tests is available in package
WilcoxCV, focused on
binary classification in microarrays. Package
implements variable selection for model-based clustering.
implements two meta-methods for variable selection: stability selection (applying a primary selection method such as a t-test, VIP value or PLSDA regression coefficient to different subsets of the data), and higher criticism, which provides a data-driven choice of significance cutoffs in statistical testing.
package implements self-organizing maps as well as
some extensions for supervised pattern recognition and data fusion.
package provides functions for self-organizing maps.
package facilitates calibration/inverse
estimation with linear and nonlinear regression models.
package provides functions for plotting
linear calibration functions and estimating standard errors for
package provides functions for
statistical evaluation of calibration curves by different regression
package and the
package are useful for nonlinear calibration models.
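The idea of calibration followed by inverse estimation can be sketched in base R for the simplest case, a straight-line calibration function. This does not use any of the packages listed above; the standards, the simulated signal and the new reading y0 are hypothetical:

```r
# Calibration standards: known concentrations and simulated instrument signal
conc <- c(0, 1, 2, 4, 8, 16)
set.seed(7)
signal <- 0.5 + 2 * conc + rnorm(length(conc), sd = 0.05)

# Step 1: fit the calibration function signal = b0 + b1 * conc
cal <- lm(signal ~ conc)
b   <- coef(cal)

# Step 2: inverse estimation of the concentration behind a new reading y0
y0 <- 10.5
x0 <- (y0 - b[["(Intercept)"]]) / b[["conc"]]

print(x0)
```

This gives only a point estimate; a proper analysis would also report an interval for x0, which is the part the dedicated packages are meant to handle.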
calculates the 'representativity'
of two multidimensional
datasets, which involves comparison of the similarity of principal component
analysis loading patterns, variance-covariance matrix structures,
and data set centroid locations.
package includes functions for cellular automata
modeling. One-dimensional cellular automata can also be modeled with
package provides functions
for calculating the standard Gibbs energies and
other thermodynamic properties, and chemical affinities of reactions
between species contained in a thermodynamic database.
Interfaces to External Libraries
the user to access functionality in the
Chemistry Development Kit (CDK),
a Java framework for cheminformatics. This allows the
user to load molecules, evaluate fingerprints, calculate molecular
descriptors and so on. In addition, the CDK API allows the user to
view structures in 2D. The
package provides the CDK
libraries for use in R.
package gives access
(compounds, substance, assays).
a cheminformatics toolkit for analyzing small molecules in R. Its add-on
mismatch tolerant maximum common substructure matching,
accelerated structure similarity searching;
for analyzing bioactivity data, and
functionalities from R.
package allows analysis of
plus further information such as spatial information, time,
concentrations, etc. Such data are frequently encountered in the
analysis of Raman, IR, NIR, UV/VIS, NMR, etc., spectroscopic data sets.
package collects user-friendly
functions for plotting spectra (NMR, IR, etc) and
carrying out top-down exploratory data analysis, such as HCA, PCA
and model-based clustering.
GitHub package: HyperChemoBridge
package implements functions for spectrum
manipulation, ported from the ROOT/TSpectrum class.
package implements the hierarchical Cluster-based Peak Alignment (CluPA) and may be used for aligning NMR spectra.
Software for the book by Donald B. Percival and Andrew T. Walden,
Spectral Analysis for Physical Applications,
Cambridge University Press, 1993, is found in the package
provides a problem solving environment for fitting
separable nonlinear models in physics and chemistry applications, and has been
extensively applied to time-resolved spectroscopy data.
provides functions for pretreatment and sample selection of visible and near infrared diffuse reflectance spectra.
includes functions for spectral dissimilarity analysis and memory-based learning (a.k.a. local modeling) for non-linear modeling in spectral datasets.
defines infrastructure for
mass spectrometry-based proteomics data handling,
plotting, processing and quantification.
provides tools for quantitative analysis
of MALDI-TOF mass spectrometry data, with support for
baseline correction, peak detection and plotting of mass spectra.
is for organic/biological mass spectrometry, with a focus on
graphical display, quantification
using stable isotope dilution, and protein hydrogen/deuterium
package provides functions
for analyzing Fourier transform-ion cyclotron
resonance mass spectrometry data.
provides a GUI to analyze mass spectrometric data
on the relative abundance of two substances from a titration series.
The Bioconductor packages
are designed for the analysis of mass spectrometry data.
package is designed for the processing of LC/MS based metabolomics data.
package allows merging
sample processing results from multiple sets of parameter
settings, among other features.
package is for post-processing of metabolomic data, including summarization of replicates, filtering, imputation, and normalization.
package is an MS-based metabolomics data processing and compound annotation pipeline.
Functional Magnetic Resonance Imaging
Functions for I/O, visualization and analysis of functional
Magnetic Resonance Imaging (fMRI) datasets stored in the ANALYZE
or NIFTI format are available in the package
contains functions to analyze fMRI data using
adaptive smoothing procedures.
Fluorescence Lifetime Imaging Microscopy
Functions for visualization and analysis of
Fluorescence Lifetime Imaging Microscopy (FLIM)
datasets are available in the package
chronologies based on radiocarbon and non-radiocarbon dated depths.
baseline identification and peak decomposition for x-ray
Astronomy and astrophysics
package collects 19 datasets from
contemporary astronomy research, many of which are described in the aforementioned textbook ‘Modern Statistical Methods for Astronomy with R Applications’.
package presents an R interface to low-level utilities and codes from the
Interactive Data Language (IDL) Astronomy Users Library
photometric redshift estimation using generalized linear models.
collects R functions for cosmological research, with
its main functions being similar to the Python library, cosmolopy.
package calculates periodograms based on (robustly) fitting periodic functions to light curves.
contains functions for reading and writing N-body snapshots from the GADGET code for cosmological N-body/SPH simulations.
performs unsupervised photometric membership assignment in stellar clusters using, e.g., photometry and spatial
functions for basic astronomical calculations.
package provides functions to determine the movement of the sun from
the earth and to determine incident solar radiation.
package provides utilities to read and write files in the FITS (Flexible Image Transport System) format, a standard format in astronomy.
package manages and displays stellar tracks and isochrones from the Pisa low-mass database.
provides miscellaneous astronomy functions, utilities, and data.
contains standard expressions for
distances, times, luminosities, and other quantities useful in
observational cosmology, including molecular line observations.
package provides tools for astronomy; functions
provided may be grouped into 4 main areas: cosmology, FITS file manipulation, the Sersic function and general (plotting and scripting) tools.
package includes a number of common astronomy conversion routines, particularly the HMS and degrees schemes.
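The HMS and degrees schemes are related by a factor of 15 degrees per hour of right ascension. A base R sketch of the conversion follows; it does not use the package above, and the function names hms_to_deg() and deg_to_hms() are hypothetical:

```r
# Right ascension in hours, minutes, seconds -> decimal degrees
# (24 h correspond to 360 degrees, i.e. 15 degrees per hour)
hms_to_deg <- function(h, m, s) 15 * (h + m / 60 + s / 3600)

# Decimal degrees -> hours, minutes, seconds
deg_to_hms <- function(deg) {
  hours <- (deg %% 360) / 15
  h <- floor(hours)
  m <- floor((hours - h) * 60)
  s <- ((hours - h) * 60 - m) * 60
  c(h = h, m = m, s = s)
}

ra_deg <- hms_to_deg(12, 30, 0)   # 12h 30m 00s
ra_hms <- deg_to_hms(ra_deg)

print(ra_deg)
print(ra_hms)
```

Declination uses degrees, arcminutes and arcseconds instead and needs separate handling of the sign, which this sketch leaves out.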
is used to
estimate stellar mass and radius given observational data of effective
temperature, [Fe/H], and asteroseismic parameters.
Astrostatistics and Astroinformatics Portal Software Forum
is an R-centric collection of information and discussion regarding software for statistical analysis in astronomy.
Optics and Scattering Approximations
package provides code to simulate
reflection and transmission at a multilayer planar interface.
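As a much simplified illustration of the underlying physics, the base R sketch below evaluates the Fresnel reflection at a single planar interface under normal incidence. It is not the multilayer calculation performed by the package; the function name fresnel_normal() and the refractive indices are assumptions chosen for the example:

```r
# Fresnel reflection at one planar interface, normal incidence.
# n1, n2: refractive indices of the incident and transmitting media
# (complex values are allowed for absorbing media).
fresnel_normal <- function(n1, n2) {
  r <- (n1 - n2) / (n1 + n2)   # amplitude reflection coefficient
  R <- Mod(r)^2                # reflected fraction of the incident power
  list(r = r, R = R, Tr = 1 - R)  # Tr: fraction of power not reflected
}

# Air (n = 1) to glass (n = 1.5): about 4 % of the power is reflected
res <- fresnel_normal(1, 1.5)
print(res$R)
```

A multilayer stack additionally requires tracking the phase accumulated in each layer and the dependence on angle and polarisation, typically through a transfer-matrix formulation.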
package calculates the polarizability tensor for the dipoles associated with a set of ellipsoidal nanoparticles, and solves the coupled-dipole equations by direct inversion of the interaction matrix.
package defines some physical constants and dielectric functions commonly used in optics and plasmonics.
package provides functions to simulate and model systems involved in
the capture and use of solar energy, including
Positron Emission Tomography
implements different analytic/direct and iterative reconstruction methods
for positron emission tomography (PET) data.
Water and Soil Chemistry
package is a toolbox for aquatic
chemical modelling focused on (ocean) acidification and CO2 air-water
task view for further
packages related to water and soil chemistry.