*Leonardo Ramirez-Lopez & Antoine Stevens*

*Last update: 26.06.2019 :::: 10:05 GMT+2*

The current (released) version of the `resemble` package can be downloaded and installed directly from the CRAN repository. Just type the following in your R console:

```
install.packages('resemble')
```

If you do not have the following packages installed, in some cases it is better to install them first:
```
install.packages('Rcpp')
install.packages('RcppArmadillo')
install.packages('foreach')
install.packages('iterators')
```
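To avoid reinstalling packages that are already present, the prerequisites can be checked first. The following is a minimal sketch in base R; `missing_packages` is a hypothetical helper written for this illustration and is not part of `resemble`.

```r
# Prerequisite packages listed above
needed <- c("Rcpp", "RcppArmadillo", "foreach", "iterators")

# Hypothetical helper (not part of resemble): returns the entries of `pkgs`
# that do not appear in the vector of installed package names
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
} else {
  message("All prerequisite packages are already installed")
}
```

The installation call is left commented out so that the check itself has no side effects.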

NOTE: Apart from these packages we strongly recommend downloading and installing Rtools (directly from here or from CRAN https://cran.r-project.org/bin/windows/Rtools/). This is important for obtaining the proper C++ toolchain that you might need for using `resemble`.

Then install `resemble` with the `install.packages('resemble')` command shown above.

On this website you can also get the latest development version of the `resemble` package. You can download the binary (.zip) file or the source file (.tar.gz) by selecting the corresponding option in the left panel. Remember that you need R >= 3.2.2. Suppose you downloaded the binary file to 'C:/MyFolder/'; you should then be able to install the package as follows:

```
install.packages('C:/MyFolder/resemble-1.2.2.zip', repos = NULL)
```

or

```
install.packages('C:/MyFolder/resemble-1.2.2.tar.gz', type = 'source', repos = NULL)
```
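Before installing from a local file, the R version requirement mentioned above can be checked from the console. This is only a small sketch; the variable names are illustrative.

```r
# Minimum R version stated for the development version
min_version <- "3.2.2"

# getRversion() returns the running R version as a comparable version object
r_ok <- getRversion() >= min_version

if (!r_ok) {
  warning("R ", getRversion(), " is older than the required ", min_version)
}
```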

You can also install the `resemble` package directly from GitHub using `devtools` (with a properly installed version of Rtools):

```
library(devtools)
# Current devtools versions expect the repository as "username/repo"
install_github("l-ramirez-lopez/resemble")
```

After installing `resemble` you should also be able to run the following lines:

```
require(resemble)
help(mbl)
#install.packages('prospectr')
require(prospectr)
data(NIRsoil)
Xu <- NIRsoil$spc[!as.logical(NIRsoil$train),]
Yu <- NIRsoil$CEC[!as.logical(NIRsoil$train)]
Yr <- NIRsoil$CEC[as.logical(NIRsoil$train)]
Xr <- NIRsoil$spc[as.logical(NIRsoil$train),]
Xu <- Xu[!is.na(Yu),]
Xr <- Xr[!is.na(Yr),]
Yu <- Yu[!is.na(Yu)]
Yr <- Yr[!is.na(Yr)]
# Example of the mbl function
# An mbl approach (the spectrum-based learner) as implemented in Ramirez-Lopez et al. (2013)
# An example where Yu is supposed to be unknown, but Xu (the spectral variables) is known
ctrl1 <- mblControl(sm = 'pc', pcSelection = list('opc', 40),
                    valMethod = 'NNv', center = TRUE)
sbl.u <- mbl(Yr = Yr, Xr = Xr, Yu = NULL, Xu = Xu,
             mblCtrl = ctrl1,
             dissUsage = 'predictors',
             k = seq(40, 150, by = 10),
             method = 'gpr')
getPredictions(sbl.u)
plot(sbl.u)
```

`resemble` implements a function dedicated to non-linear modelling of complex visible and infrared spectral data based on memory-based learning (MBL, *a.k.a.* instance-based learning or local modelling in the chemometrics literature). The package also includes functions for: computing and evaluating spectral similarity/dissimilarity matrices; projecting the spectra onto low dimensional orthogonal variables; removing irrelevant spectra from a reference set; etc.

The functions for computing and evaluating spectral similarity/dissimilarity matrices can be summarized as follows:

- `fDiss`: Euclidean and Mahalanobis distances as well as the cosine dissimilarity
- `corDiss`
- `sid`
- `orthoDiss`
- `simEval`
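As a rough illustration of the kind of measures involved, the base R sketch below computes Euclidean, cosine and correlation dissimilarities on toy data. It does not call `fDiss` or `corDiss` and is not the package's implementation; the toy matrix `X` and the `(1 - r) / 2` scaling of the correlation dissimilarity are assumptions made for this example.

```r
set.seed(1)
# Toy "spectra": 5 samples (rows) x 20 variables (columns)
X <- matrix(runif(100), nrow = 5)

# Euclidean distance between every pair of rows
d_euclid <- as.matrix(dist(X, method = "euclidean"))

# Cosine dissimilarity: angle (in radians) between every pair of rows
norms   <- sqrt(rowSums(X^2))
cos_sim <- tcrossprod(X) / outer(norms, norms)
cos_sim <- pmin(pmax(cos_sim, -1), 1)  # guard against rounding outside [-1, 1]
d_cos   <- acos(cos_sim)

# Correlation dissimilarity scaled to [0, 1] (assumed scaling)
d_cor <- (1 - cor(t(X))) / 2

round(d_euclid, 3)
```

Each result is a symmetric 5 x 5 matrix with (near) zero values on the diagonal, which is the form a nearest-neighbour search needs.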

The functions for projecting the spectra onto low dimensional orthogonal variables are:

- `pcProjection`: projects the spectra onto a principal component space
- `plsProjection`: projects the spectra onto a partial least squares component space
- `orthoProjection`: reproduces either the `pcProjection` or the `plsProjection` functions

The projection functions also offer different options for optimizing/selecting the number of components involved in the projection.
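The idea of projecting spectra and then choosing how many components to keep can be sketched with base R's `prcomp`. This is not how `pcProjection` is implemented, and the 95% cumulative variance threshold used here is an assumption for illustration only (the package offers its own selection options, such as the `'opc'` setting used in the example above).

```r
set.seed(1)
# Toy data: 30 "spectra" (rows) with 50 variables (columns)
X <- matrix(rnorm(30 * 50), nrow = 30)

# Principal component analysis on centred data
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# Proportion of variance explained by each component
expl <- pca$sdev^2 / sum(pca$sdev^2)

# Keep the smallest number of components reaching 95% cumulative variance
# (assumed threshold)
n_comp <- which(cumsum(expl) >= 0.95)[1]
scores <- pca$x[, seq_len(n_comp), drop = FALSE]

# Project a new sample onto the same component space
x_new      <- matrix(rnorm(50), nrow = 1)
scores_new <- predict(pca, newdata = x_new)[, seq_len(n_comp), drop = FALSE]

dim(scores)
```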

The functions for modelling the spectra using memory-based learning are:

- `mblControl`: controls some modelling aspects of the `mbl` function
- `mbl`: models the spectra using memory-based learning

Some additional miscellaneous functions are:

- `print.mbl`: prints a summary of the results obtained by the `mbl` function
- `plot.mbl`: plots the results obtained by the `mbl` function
- `print.localOrthoDiss`: prints objects produced by the `orthoDiss` function

To expand a little on the explanation of the `mbl` function, let's first define the basic input datasets:

- **Reference (training) set**: Dataset with *n* reference samples (e.g. spectral library) to be used in the calibration of spectral models. Xr represents the matrix of samples (containing the spectral predictor variables) and Yr represents a given response variable corresponding to Xr.
- **Prediction set**: Dataset with *m* samples where the response variable (Yu) is unknown. However, it can be predicted by applying a spectral model (calibrated by using Xr and Yr) on the spectra of these samples (Xu).

In order to predict each value in Yu, the `mbl` function takes each sample in Xu and searches in Xr for its *k*-nearest neighbours (most spectrally similar samples). Then a (local) model is calibrated with these (reference) neighbours and it immediately predicts the corresponding value in Yu from Xu. In the function, the *k*-nearest neighbour search is performed by computing spectral similarity/dissimilarity matrices between samples. The `mbl` function offers the following regression options for calibrating the (local) models:

- `'gpr'`: Gaussian process with linear kernel
- `'pls'`
- `'wapls1'`
- `'wapls2'`
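The neighbour search followed by a local fit can be sketched in a few lines of base R. This is a simplified illustration and not the `mbl` implementation: `local_predict` is a hypothetical helper, plain Euclidean distance stands in for the package's dissimilarity options, and an ordinary least squares model (`lm`) is swapped in for the regression options listed above.

```r
set.seed(42)
# Toy data: 100 reference samples and 5 prediction samples, 10 predictors
p  <- 10
Xr <- matrix(rnorm(100 * p), ncol = p)
b  <- runif(p)
Yr <- as.vector(Xr %*% b + rnorm(100, sd = 0.1))
Xu <- matrix(rnorm(5 * p), ncol = p)

# Hypothetical helper (not the resemble implementation)
local_predict <- function(Xr, Yr, Xu, k = 30) {
  preds <- sapply(seq_len(nrow(Xu)), function(i) {
    # Euclidean distance from prediction sample i to every reference sample
    d  <- sqrt(rowSums(sweep(Xr, 2, Xu[i, ])^2))
    # Indices of the k nearest reference samples
    nn <- order(d)[seq_len(k)]
    # Local model calibrated with the neighbours only (OLS as a stand-in)
    local_data <- data.frame(y = Yr[nn], Xr[nn, , drop = FALSE])
    fit <- lm(y ~ ., data = local_data)
    # Immediately predict the response of sample i
    predict(fit, newdata = data.frame(Xu[i, , drop = FALSE]))
  })
  unname(preds)
}

Yu_hat <- local_predict(Xr, Yr, Xu, k = 30)
Yu_hat
```

A separate model is fitted for every sample in Xu, which is why this family of methods is also described as lazy learning: nothing is calibrated until a prediction is requested.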

*Infrared spectroscopy*, *Chemometrics*, *Local modelling*, *Spectral library*, *Lazy learning*, *Soil spectroscopy*

- 2019-06: Two videos (video 1 and video 2) where a renowned NIR scientist talks about local calibrations.
- 2019-03: Another paper using `resemble`... I just published a scientific paper where we used memory-based learning (MBL) for digital soil mapping. Here we use MBL to remove local calibration outliers rather than using this approach to overcome the typical complexity of large spectral datasets. (Ramirez‐Lopez, L., Wadoux, A. C., Franceschini, M. H. D., Terra, F. S., Marques, K. P. P., Sayão, V. M., & Demattê, J. A. M. (2019). Robust soil mapping at the farm scale with vis–NIR spectroscopy. European Journal of Soil Science. 70, 378–393).
- 2018-11: Here the authors predicted brix values in different food products using memory-based learning implemented with `resemble`. (Kopf, M., Gruna, R., Längle, T. and Beyerer, J., 2017, March. Evaluation and comparison of different approaches to multi-product brix calibration in near-infrared spectroscopy. In OCM 2017-Optical Characterization of Materials-conference proceedings (p. 129). KIT Scientific Publishing).
- 2016-05: In this recent scientific paper the authors successfully used `resemble` to predict soil organic carbon content at national scale in France.
- 2016-04: This paper shows some interesting results on applying memory-based learning to predict soil properties.
- 2016-04: In some recent entries of this blog the author shows some examples on the use of `resemble`.
- 2016-02: As promised, the version 1.2 (alma-de-coco) is now available on CRAN!
- 2016-01: The version 1.2 (alma-de-coco) has been submitted to CRAN and is available from the github repository!
- 2015-11: A pre-release of the version 1.2.0 (1.2.0.9000 'alma-de-coco') is now available! `resemble` is now faster! Some critical functions (e.g. pls and Gaussian process regressions) were re-written in C++ using Rcpp. This time the new version will be available at CRAN very soon! (we promise).
- 2015-11: Well, the version 1.1.3 was never released on CRAN since we decided to carry out major improvements in terms of computational performance.
- 2014-10: A pre-release of the version 1.1.3 of the package is already available at this website. We hope it will be available at CRAN very soon!
- 2014-04: A short note on the `resemble` and `prospectr` packages was published in this newsletter. There we provide some examples on representative subset selection and on how to reproduce the LOCAL and spectrum-based learner algorithms. In those examples the dataset of the Chemometric challenge of 'Chimiométrie 2006' (included in the `prospectr` package) is used.
- 2014-03: The package was released on CRAN!

- Check out our other project called `prospectr`.
- Check this presentation in which we used the `resemble` package to predict soil attributes from large scale soil spectral libraries.

If you detect bugs, or if you have a question or request, you can send an e-mail to Leo, the package maintainer (ramirez.lopez@gmail.com), or create an issue on GitHub. We would be happy to hear from you!