vignettes/MsBackendTimsTof.Rmd
Package: MsBackendTimsTof
Authors: Johannes Rainer [aut, cre] (https://orcid.org/0000-0002-6977-7147), Andrea Vicini [aut] (https://orcid.org/0000-0001-9438-6909), Steffen Neumann [ctb] (https://orcid.org/0000-0002-7899-7192), Carolin Huber [ctb] (https://orcid.org/0000-0002-9355-8948)
Compiled: Fri Apr 8 13:01:26 2022
The Spectra package provides a central infrastructure for handling Mass
Spectrometry (MS) data. It supports the interchangeable use of
different backends to import MS data from a variety of sources
(such as mzML files). The MsBackendTimsTof package adds
support for Bruker TimsTOF raw data files. This vignette shows how data
can be retrieved from such files, and which data are available.
The package depends on the OpenTIMS C++ library, which is provided by the
opentimsr R package, to access data in the timsTOF Pro data format (TDF).
The MsBackendTimsTof package can be installed with:
BiocManager::install("RforMassSpectrometry/MsBackendTimsTof")
To retrieve some of the variables from the data files, an additional library from the manufacturer is needed. This library can be downloaded with:
so_folder <- tempdir()
library(opentimsr)
so_file <- download_bruker_proprietary_code(so_folder)
This downloads the shared library to a temporary folder. Note, however,
that at present this shared library is only available for
Windows and Linux (i.e. there is no macOS support). Next, the
library has to be registered with the opentimsr package:
setup_bruker_so(so_file)
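Since the shared library is only available for Windows and Linux, the download could be guarded by a check of the operating system. The following is a minimal sketch in base R; the helper `bruker_so_supported` is hypothetical and is not part of opentimsr or MsBackendTimsTof.

```r
## Hypothetical helper (not part of opentimsr or MsBackendTimsTof): report
## whether the operating system is one for which the shared library is
## stated to be available (Windows and Linux).
bruker_so_supported <- function(sysname = Sys.info()[["sysname"]]) {
    sysname %in% c("Windows", "Linux")
}

## macOS reports "Darwin" as its system name.
on_macos <- bruker_so_supported("Darwin")
on_linux <- bruker_so_supported("Linux")
on_windows <- bruker_so_supported("Windows")
```

The calls to `download_bruker_proprietary_code` and `setup_bruker_so` could then be wrapped in `if (bruker_so_supported()) { ... }`.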
These steps would be necessary for every new R session. To avoid
that, it is suggested to copy the downloaded shared library to a
permanent directory on the computer and to define a variable called
TIMSTOF_LIB that holds the full path to this file (i.e. a character
string with the directory and the file name). This variable can either
be defined system wide (as an environment variable) or within
the .Rprofile file. An example entry in a .Rprofile, which sets the
path as an R option, could be:
options(TIMSTOF_LIB = "/Users/jo/lib/libtimsdata.so")
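To check that the path is picked up in a new session, it can be looked up and validated before any data is loaded. The sketch below uses base R only. The helper `find_timstof_lib` is hypothetical, and because the text above mentions both an environment variable and an R option, it checks both; this is an assumption for illustration, not a description of what the package does internally.

```r
## Hypothetical helper: look up the path to the shared library, first in the
## R option "TIMSTOF_LIB", then in the environment variable of the same name.
## Returns NA if neither of them points to an existing file.
find_timstof_lib <- function() {
    candidates <- c(getOption("TIMSTOF_LIB", default = ""),
                    Sys.getenv("TIMSTOF_LIB", unset = ""))
    candidates <- candidates[nzchar(candidates) & file.exists(candidates)]
    if (length(candidates)) candidates[[1L]] else NA_character_
}

so_file <- find_timstof_lib()
if (is.na(so_file)) {
    message("TIMSTOF_LIB is not set or does not point to an existing file")
}
```

If a path is found, it could be passed to `setup_bruker_so` as shown earlier.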
The MsBackendTimsTof package adds support for Bruker
TimsTOF files to Spectra-based analysis workflows. Below we
load the package and, in addition, fetch the required shared library and
store it in a temporary folder.
library(MsBackendTimsTof)
## Load the opentimsr package and download and register the shared library
library(opentimsr)
so_folder <- tempdir()
so_file <- download_bruker_proprietary_code(so_folder, method = "wget")
## [1] "Downloading 64-bit Linux binary."
## [1] "Downloading from: https://raw.githubusercontent.com/MatteoLacki/opentims_bruker_bridge/main/opentims_bruker_bridge/libtimsdata.so"
setup_bruker_so(so_file)
As detailed in the installation section, the code to download the
shared library is only needed once if the path to this file is
defined in an environment variable TIMSTOF_LIB.
We next load the TDF test file which is bundled within this package.
fl <- system.file("ddaPASEF.d", package = "MsBackendTimsTof")
be <- backendInitialize(MsBackendTimsTof(), fl)
In a real use case, however, we would directly load the data into a
Spectra object:
sps <- Spectra(fl, source = MsBackendTimsTof())
sps
## MSn data (Spectra) with 9120 spectra in a MsBackendTimsTof backend:
## msLevel precursorMz polarity
## <integer> <numeric> <integer>
## 1 1 NA 1
## 2 1 NA 1
## 3 1 NA 1
## 4 1 NA 1
## 5 1 NA 1
## ... ... ... ...
## 9116 1 NA 1
## 9117 1 NA 1
## 9118 1 NA 1
## 9119 1 NA 1
## 9120 1 NA 1
## ... 35 more variables/columns.
## Use 'spectraVariables' to list all of them.
We thus have access to all spectra variables within the file:
spectraVariables(be)
## [1] "msLevel" "rtime"
## [3] "acquisitionNum" "scanIndex"
## [5] "mz" "intensity"
## [7] "dataStorage" "dataOrigin"
## [9] "centroided" "smoothed"
## [11] "polarity" "precScanNum"
## [13] "precursorMz" "precursorIntensity"
## [15] "precursorCharge" "collisionEnergy"
## [17] "isolationWindowLowerMz" "isolationWindowTargetMz"
## [19] "isolationWindowUpperMz" "tof"
## [21] "inv_ion_mobility" "frameId"
## [23] "ScanMode" "MsMsType"
## [25] "TimsId" "MaxIntensity"
## [27] "SummedIntensities" "NumScans"
## [29] "NumPeaks" "MzCalibration"
## [31] "T1" "T2"
## [33] "TimsCalibration" "PropertyGroup"
## [35] "AccumulationTime" "RampTime"
## [37] "Pressure" "file"
The full data can be retrieved with spectraData:
all <- spectraData(be)
all
## DataFrame with 9120 rows and 38 columns
## msLevel rtime acquisitionNum scanIndex mz intensity
## <integer> <numeric> <integer> <integer> <NumericList> <NumericList>
## 1 1 0.218626 NA 218 1221.98 52
## 2 1 0.218626 NA 220 1222 93
## 3 1 0.218626 NA 221 1221.99 108
## 4 1 0.218626 NA 222 1222.00,1224.01 38,57
## 5 1 0.218626 NA 223 1222.00,1222.02 61,34
## ... ... ... ... ... ... ...
## 9116 1 6.7812 NA 719 425.83 82
## 9117 1 6.7812 NA 723 425.84 78
## 9118 1 6.7812 NA 726 425.84 89
## 9119 1 6.7812 NA 727 425.837 50
## 9120 1 6.7812 NA 730 738.739 77
## dataStorage dataOrigin centroided smoothed polarity
## <character> <character> <logical> <logical> <integer>
## 1 /__w/_temp/Library/M.. NA NA NA 1
## 2 /__w/_temp/Library/M.. NA NA NA 1
## 3 /__w/_temp/Library/M.. NA NA NA 1
## 4 /__w/_temp/Library/M.. NA NA NA 1
## 5 /__w/_temp/Library/M.. NA NA NA 1
## ... ... ... ... ... ...
## 9116 /__w/_temp/Library/M.. NA NA NA 1
## 9117 /__w/_temp/Library/M.. NA NA NA 1
## 9118 /__w/_temp/Library/M.. NA NA NA 1
## 9119 /__w/_temp/Library/M.. NA NA NA 1
## 9120 /__w/_temp/Library/M.. NA NA NA 1
## precScanNum precursorMz precursorIntensity precursorCharge collisionEnergy
## <integer> <numeric> <numeric> <integer> <numeric>
## 1 NA NA NA NA NA
## 2 NA NA NA NA NA
## 3 NA NA NA NA NA
## 4 NA NA NA NA NA
## 5 NA NA NA NA NA
## ... ... ... ... ... ...
## 9116 NA NA NA NA NA
## 9117 NA NA NA NA NA
## 9118 NA NA NA NA NA
## 9119 NA NA NA NA NA
## 9120 NA NA NA NA NA
## isolationWindowLowerMz isolationWindowTargetMz isolationWindowUpperMz
## <numeric> <numeric> <numeric>
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## ... ... ... ...
## 9116 NA NA NA
## 9117 NA NA NA
## 9118 NA NA NA
## 9119 NA NA NA
## 9120 NA NA NA
## tof inv_ion_mobility frameId ScanMode MsMsType TimsId
## <NumericList> <numeric> <integer> <integer> <integer> <integer>
## 1 315091 1.39278 1 8 0 0
## 2 315094 1.39053 1 8 0 0
## 3 315092 1.38940 1 8 0 0
## 4 315093,315457 1.38827 1 8 0 0
## 5 315094,315097 1.38715 1 8 0 0
## ... ... ... ... ... ... ...
## 9116 134284 0.831408 52 8 0 208896
## 9117 134287 0.826949 52 8 0 208896
## 9118 134287 0.823604 52 8 0 208896
## 9119 134286 0.822490 52 8 0 208896
## 9120 216904 0.819146 52 8 0 208896
## MaxIntensity SummedIntensities NumScans NumPeaks MzCalibration T1
## <integer> <integer> <integer> <integer> <integer> <numeric>
## 1 734 44601 927 348 1 25.143
## 2 734 44601 927 348 1 25.143
## 3 734 44601 927 348 1 25.143
## 4 734 44601 927 348 1 25.143
## 5 734 44601 927 348 1 25.143
## ... ... ... ... ... ... ...
## 9116 742 42368 927 304 1 25.143
## 9117 742 42368 927 304 1 25.143
## 9118 742 42368 927 304 1 25.143
## 9119 742 42368 927 304 1 25.143
## 9120 742 42368 927 304 1 25.143
## T2 TimsCalibration PropertyGroup AccumulationTime RampTime
## <numeric> <integer> <integer> <numeric> <numeric>
## 1 23.9875 1 1 100.008 100.008
## 2 23.9875 1 1 100.008 100.008
## 3 23.9875 1 1 100.008 100.008
## 4 23.9875 1 1 100.008 100.008
## 5 23.9875 1 1 100.008 100.008
## ... ... ... ... ... ...
## 9116 23.9898 1 1 100.008 100.008
## 9117 23.9898 1 1 100.008 100.008
## 9118 23.9898 1 1 100.008 100.008
## 9119 23.9898 1 1 100.008 100.008
## 9120 23.9898 1 1 100.008 100.008
## Pressure file
## <numeric> <integer>
## 1 2.57282 1
## 2 2.57282 1
## 3 2.57282 1
## 4 2.57282 1
## 5 2.57282 1
## ... ... ...
## 9116 2.57283 1
## 9117 2.57283 1
## 9118 2.57283 1
## 9119 2.57283 1
## 9120 2.57283 1
The data is organized by individual spectra. All spectra measured
within the same frame have the same value in the spectra
variable "frameId". The spectra variable "inv_ion_mobility"
provides the inverse ion mobility information. It is available as a spectra
variable, but also as a peaks variable, along with e.g. "tof".
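The relationship between frames and spectra can be illustrated by grouping spectra on "frameId". The sketch below uses base R on a small toy `data.frame` whose values are copied from rows 1 to 5 and 9116 to 9120 of the `spectraData` output above; it does not read the TDF file, and the object names are chosen for illustration only.

```r
## Toy data: rows 1-5 and 9116-9120 of the spectraData(be) output above.
toy <- data.frame(
    frameId = c(1L, 1L, 1L, 1L, 1L, 52L, 52L, 52L, 52L, 52L),
    rtime = c(rep(0.218626, 5), rep(6.7812, 5)),
    inv_ion_mobility = c(1.39278, 1.39053, 1.38940, 1.38827, 1.38715,
                         0.831408, 0.826949, 0.823604, 0.822490, 0.819146)
)

## Group the inverse ion mobility values by frame.
im_by_frame <- split(toy$inv_ion_mobility, toy$frameId)
lengths(im_by_frame)

## Number of distinct retention times per frame in this excerpt.
n_rtime <- vapply(split(toy$rtime, toy$frameId),
                  function(z) length(unique(z)), integer(1))
n_rtime
```

In this excerpt, every frame has a single retention time, while each spectrum within the frame has its own inverse ion mobility value.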
Below we subset the backend to a range of spectra and extract their
peaksData.
be_sub <- be[218:226]
peaksData(be_sub, columns = c("mz", "intensity", "tof", "inv_ion_mobility",
"retention_time"))
## [[1]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 1221.984 9 315091 1.350001 0.344953
##
## [[2]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 1221.995 38 315093 1.346626 0.344953
##
## [[3]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 1221.99 81 315092 1.3455 0.344953
##
## [[4]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 921.9951 73 257110 1.19156 0.344953
##
## [[5]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 922.0047 83 257112 1.190438 0.344953
##
## [[6]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 922.0047 43 257112 1.189316 0.344953
## [2,] 923.0103 71 257321 1.189316 0.344953
##
## [[7]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 921.9999 123 257111 1.188193 0.344953
##
## [[8]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 921.9999 47 257111 1.187071 0.344953
## [2,] 923.0248 18 257324 1.187071 0.344953
##
## [[9]]
## mz intensity tof inv_ion_mobility retention_time
## [1,] 923.0248 83 257324 1.185949 0.344953
Note, however, that both "inv_ion_mobility" and "retention_time"
have the same value for all peaks within a spectrum. These variables
should therefore be accessed not through peaksData, but through
spectraData, the $ operator or, for retention times, the dedicated
rtime function. Below we extract the inverse ion mobility values and
display the first 6 of them.
head(sps$inv_ion_mobility)
## [1] 1.392779 1.390527 1.389401 1.388274 1.387148 1.386022
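That a peaks variable is constant within each spectrum can be verified directly on the peak matrices. The following base R sketch uses two toy matrices copied from elements 6 and 8 of the `peaksData` output above; the helper `is_constant` is hypothetical and only serves the illustration.

```r
## Toy peak matrices copied from elements 6 and 8 of the peaksData output.
pks <- list(
    cbind(mz = c(922.0047, 923.0103), intensity = c(43, 71),
          inv_ion_mobility = c(1.189316, 1.189316),
          retention_time = c(0.344953, 0.344953)),
    cbind(mz = c(921.9999, 923.0248), intensity = c(47, 18),
          inv_ion_mobility = c(1.187071, 1.187071),
          retention_time = c(0.344953, 0.344953))
)

## Hypothetical helper: TRUE if a column holds a single distinct value.
is_constant <- function(x, column) length(unique(x[, column])) == 1L

im_constant <- vapply(pks, is_constant, logical(1), column = "inv_ion_mobility")
mz_constant <- vapply(pks, is_constant, logical(1), column = "mz")

## One value per spectrum is enough to represent such a column.
im <- vapply(pks, function(x) x[1L, "inv_ion_mobility"], numeric(1))
```

Here `im_constant` is `TRUE` for both spectra while `mz_constant` is not, which is why a single value per spectrum, as returned by `sps$inv_ion_mobility`, carries the same information.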
sessionInfo()
## R Under development (unstable) (2022-03-14 r81896)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
##
## Matrix products: default
## BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] opentimsr_1.0.13 MsBackendTimsTof_0.1.1 Spectra_1.5.16
## [4] ProtGenerics_1.27.2 BiocParallel_1.29.20 S4Vectors_0.33.17
## [7] BiocGenerics_0.41.2 BiocStyle_2.23.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.8.3 bslib_0.3.1 compiler_4.2.0
## [4] BiocManager_1.30.16 jquerylib_0.1.4 tools_4.2.0
## [7] bit_4.0.4 digest_0.6.29 RSQLite_2.2.12
## [10] jsonlite_1.8.0 evaluate_0.15 memoise_2.0.1
## [13] clue_0.3-60 pkgconfig_2.0.3 rlang_1.0.2
## [16] DBI_1.1.2 cli_3.2.0 yaml_2.3.5
## [19] parallel_4.2.0 pkgdown_2.0.2.9000 xfun_0.30
## [22] fastmap_1.1.0 cluster_2.1.3 stringr_1.4.0
## [25] knitr_1.38 vctrs_0.4.0 desc_1.4.1
## [28] fs_1.5.2 sass_0.4.1 systemfonts_1.0.4
## [31] IRanges_2.29.1 MsCoreUtils_1.7.4 bit64_4.0.5
## [34] rprojroot_2.0.3 R6_2.5.1 textshaping_0.3.6
## [37] rmarkdown_2.13 bookdown_0.25 blob_1.2.2
## [40] purrr_0.3.4 magrittr_2.0.3 htmltools_0.5.2
## [43] MASS_7.3-56 ragg_1.2.2 stringi_1.7.6
## [46] cachem_1.0.6