The open field test was developed initially as a test for anxiety, in which the readiness of nocturnal rodents to explore the centre of a brightly-lit field was the primary outcome. The advent of sophisticated video tracking technologies means that the open field can also be used to identify more subtle variations in behaviour. A multivariable analysis of the exploratory path, such as that provided by Rtrack, can turn this technically very simple test into a rich source of phenomics data.
The Rtrack package aims to be an easy-to-use interface for data management and analysis of spatial tracking data. This workflow describes the analysis of data from the open field test. Although the open field is a very simple paradigm, experiments often accumulate many tracks, each with associated metadata. Managing all these data can be overwhelming and can lead to confusion and errors. Unfortunately, because of the potential complexity of the experiments and the range of computational abilities of the researchers (who are typically experimentalists with little background in programming or data analysis), many existing software solutions have not found their way into laboratory workflows. The Rtrack package is built on the popular and ubiquitous R platform, which runs on all commonly-used operating systems, is free and open-source (so that the cost of licensing is not a limitation) and integrates well into existing analysis environments. Data preparation involves exporting path data from the acquisition software (outputs from a number of software platforms are supported and more are being added), defining the arenas in which the experiment was performed and creating a spreadsheet with information about each track. Once the data and the information spreadsheet have been prepared, the actual analysis runs in very little time (processing 1000 tracks takes under a minute on a typical modern computer) and the results are available in R for further analysis, or can be exported for analysis using other software if desired. Publication-quality graphs can be generated and exported directly from the functions in Rtrack. In addition, the raw path data can be saved to a standardised format to allow data sharing and to enhance reproducibility.
The data provided together with this vignette are from a published study (Zocher et al. 2020) in which mice were housed in one of three environments: standard laboratory cages (STD); enriched cages with many cage mates and toys (ENR); or enrichment housing followed by a return to standard cages (ENR-STD). There was a single trial of 5 min in a 60 × 60 cm field.
In this example, an archived experiment (raw data that have been saved in the portable trackxf format) is read in directly from a URL. The experiment is reconstructed, the metrics are calculated and one of them (the path length) is plotted by housing group for a visual overview. To explore the further functionality of the package, please work through the tutorial examples below.
experiment = Rtrack::read_experiment("https://rupertoverall.net/Rtrack/examples/OFT_example.trackxf")
#> Restoring archived experiment.
#> Processing tracks.
Rtrack::plot_variable("path.length", experiment = experiment, factor = "Group",
las = 1)
An ‘arena’ is a description of the open field and the recording parameters for each session. This means that any change in the layout of the field (for example, if the camera position is moved) will require a separate arena file. The description files are simple and consist of only three lines: the experiment type (type), the time units (time.units) and the bounds of the field (arena.bounds).
There are actually some other ways of describing the field shape, but this is the preferred method.
The units for the x and y coordinates do not need to be specified, but these must be the same units used in the raw track files.
For example, the description for one of the arenas in the example file (‘Arena_OF_1.txt’) is:
type = oft
time.units = s
arena.bounds = square -1.42 126.91 60.34 120.23 55.19 58.12 -8.1 65.15
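If you would like to create such a file from within R, the following is a minimal sketch using only base R. It writes the three lines shown above to a temporary file and then splits them back into keys and values. The parsing here is purely illustrative and is not Rtrack's own reader; in a real workflow the file would simply be passed to Rtrack::read_arena. The temporary file location is a placeholder.

```r
# Write an arena description with the three lines described above.
# The file location is a placeholder (a temporary file).
arena.file = tempfile(fileext = ".txt")
writeLines(c(
  "type = oft",
  "time.units = s",
  "arena.bounds = square -1.42 126.91 60.34 120.23 55.19 58.12 -8.1 65.15"
), arena.file)

# Illustrative parser (not Rtrack's own): split each line at the first '='.
lines = readLines(arena.file)
keys = trimws(sub("=.*$", "", lines))
values = trimws(sub("^[^=]*=", "", lines))
arena.description = setNames(as.list(values), keys)

# The bounds line holds a shape keyword followed by numeric coordinates.
bounds = strsplit(arena.description[["arena.bounds"]], "\\s+")[[1]]
shape = bounds[1]
coordinates = as.numeric(bounds[-1])
```

Checking that the coordinates are all numeric (no NA values) is a quick way to catch typing errors before the file is used in an analysis.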
The key task before analysing an experiment is to gather together all the information you need for the analysis. This is always necessary for any analysis, and is always a nasty task. Nevertheless, Rtrack uses a straightforward spreadsheet format to make this task less tedious and less confusing.
Several columns are required; all of these must begin with an underscore ‘_’:
[…] project.dir in the read_experiment function. See the note on relative paths below.
[…] data.dir in the read_experiment function. See the note on relative paths below.
[…] (?Rtrack::identify_track_format) for a list of the supported file formats.
You can also add any other columns of factors. In the example, there is a factor ‘Group’, which records the type of housing the animals were in.
If your analysis will be done in the same directory as the raw data files are in, then you can ignore this comment. If, however, your raw data are large, you may have them stored on an external disc or network volume. By specifying the data.dir parameter, you can keep these raw data anywhere you like and even move them without having to update the experiment description spreadsheet. All file paths in the experiment description are relative to the data.dir directory.
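The idea behind relative paths can be sketched in base R as follows. The directory on the external disc and the first track file name are hypothetical, and how Rtrack resolves paths internally is not shown here; the sketch only illustrates why moving the raw data changes a single argument and nothing in the spreadsheet.

```r
# Hypothetical locations: only data.dir changes when the raw data move.
data.dir.local = file.path("OFT_example", "Data")
data.dir.external = file.path("/Volumes", "ExternalDisc", "OFT_example", "Data")

# Relative track file names as they might appear in the spreadsheet
# (the first name is made up for illustration).
track.files = c("Trial_1_Arena_1.txt", "Trial_2_Arena_1.txt")

# The same relative names resolve against whichever data.dir is supplied.
local.paths = file.path(data.dir.local, track.files)
external.paths = file.path(data.dir.external, track.files)
```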
Look at one track to get a feel for the workflow. Firstly, an arena definition must be read in. The resulting object has the class Rtrack_arena.
arena = Rtrack::read_arena("OFT_example/Arena_OF_1.txt")
There are many different raw data formats. The format of the data files depends on the software they were recorded with, the locale and (sometimes) the computer system they were recorded with. Each format supported by Rtrack has a code, which must be given to the read_path function. Run the function identify_track_format with one of the raw track files to help you determine the appropriate format code for your data.
track.format = Rtrack::identify_track_format("OFT_example/Data/Trial_2_Arena_1.txt")
#> ✔ This track seems to be in the format 'ethovision.xt.csv2'.
The tracks for the example are in the format ethovision.xt.csv2, so we need to pass this information on to the reader function. The arena is also required for reading in the path (to provide calibration information).
path = Rtrack::read_path("OFT_example/Data/Trial_2_Arena_1.txt", arena, id = "test",
track.format = "ethovision.xt.csv2")
The path (of class Rtrack_path) can now be used to collect a range of metrics. This results in a list of various secondary variables which can be used for plotting and analysis.
metrics = Rtrack::calculate_metrics(path, arena)
The path (the coordinates of the animal in the arena during the experiment) can be plotted. This representation shows the path as a black line and some informative areas of the field in shades of blue. Rtrack calls these areas ‘zones’; the zones in an open field experiment are the wall, the corners and the centre.
Rtrack::plot_path(metrics)
Paths can also be plotted as a density heatmap.
Rtrack::plot_density(metrics)
Feel free to play with the colours (just please don’t use a garish ‘rainbow’ scheme). The colour scales are best defined using the ‘colorRampPalette’ function.
Rtrack::plot_density(metrics, col = colorRampPalette(c("yellow", "orange", "red"))(256))
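If you have not met colorRampPalette before, it may help to look at what it produces on its own. The sketch below uses only base R (the grDevices package) with the same three anchor colours as the call above; the step counts are arbitrary choices for illustration.

```r
# colorRampPalette() returns a function; calling that function with n
# gives n interpolated colours as hex strings.
make.palette = grDevices::colorRampPalette(c("yellow", "orange", "red"))

fine.scale = make.palette(256)  # smooth gradient, as in the call above
coarse.scale = make.palette(5)  # few steps, easier to inspect by eye

length(fine.scale)
coarse.scale
```

Because the result is an ordinary character vector of colours, it can be stored, inspected or reused across several plots.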
You can use any of the colour definitions provided in R, and reducing the number of colours in the palette gives a contour effect.
Rtrack::plot_density(metrics, col = colorRampPalette(c(rgb(1, 1, 0.2), "orange",
"#703E3E"))(8))
Usually, an experiment will consist of multiple subjects/animals and possibly more than one trial per subject. Running each track separately by hand as we have done above would be tedious and error-prone. Rtrack allows you to set up a batch processing workflow to make this task easier. A description of the experiment is filled out with all the required data and passed to the read_experiment function to be processed automatically.
The experiment information is read in using metadata in a spreadsheet. See ‘Preparing the input files’ above for details on how to properly construct this file. The raw data are read in, and the metrics are calculated and returned in a list object of class Rtrack_experiment. This is the most processor-intensive part of the workflow and an experiment will typically consist of many hundreds of tracks. Depending on the size of the experiment and the speed of your computer, this step may take several minutes (a friendly progress bar will let you know if there is time for a coffee at this step; the software is fast though, so it may be an espresso!).
experiment = Rtrack::read_experiment("OFT_example/Experiment.xlsx", data.dir = "OFT_example/Data")
#> Processing tracks.
By default, processing the experiment will run as one single process (see footnote 2), but it is trivial to parallelise this potentially time-consuming step (if perhaps you have run out of coffee). Rtrack version 2 will take care of parallelising the code and all you need to do is adjust the threads parameter. The simple option is to specify threads = 0, which tells Rtrack to use as much processing power as it can. Now try running the read_experiment code again and see if this makes a difference in processing time.
experiment = Rtrack::read_experiment("OFT_example/Experiment.xlsx", data.dir = "OFT_example/Data",
threads = 0)
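To judge whether parallel processing helps on your machine, it can be useful to know how many cores are available and to time each run. The following base R sketch shows one way to do both. Whether threads = 0 corresponds exactly to the number reported here is an assumption, and the helper name time.it is made up for this example.

```r
# Number of logical cores reported by the operating system (may be NA on
# some platforms).
n.cores = parallel::detectCores()

# Generic helper: elapsed wall-clock seconds for any expression.
time.it = function(expr) unname(system.time(expr)["elapsed"])

# Stand-in workload; replace with the read_experiment call to compare timings.
elapsed = time.it(Sys.sleep(0.1))
```

Wrapping the single-process call and the parallel call in time.it in turn gives two numbers that can be compared directly. For small experiments the overhead of starting the extra processes may outweigh the gain.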
Once the experiment object has been constructed, you can use it to start analysing the results. Individual metrics might be of interest for separate analysis; for example, the number of times each subject crossed the centre zone, traditionally an indicator of boldness in mice. The built-in plotting function allows you to quickly inspect your data and includes the ability to split the results by a grouping factor (here, the housing group).
Rtrack::plot_variable("centre.zone.crossings", experiment = experiment, factor = "Group")
title(main = "Crossings of the centre zone")
The ‘summary.variables’ element shows all the metrics available.
experiment$summary.variables
#> [1] "path.length" "total.time"
#> [3] "velocity" "immobility"
#> [5] "coverage" "distance.from.centre"
#> [7] "distance.from.wall" "distance.from.corner"
#> [9] "roaming.entropy" "velocity.in.centre.zone"
#> [11] "velocity.in.wall.zone" "velocity.in.corner.zone"
#> [13] "immobility.in.centre.zone" "immobility.in.wall.zone"
#> [15] "immobility.in.corner.zone" "latency.to.centre.zone"
#> [17] "latency.to.wall.zone" "latency.to.corner.zone"
#> [19] "time.in.centre.zone" "time.in.wall.zone"
#> [21] "time.in.corner.zone" "centre.zone.crossings"
#> [23] "wall.zone.crossings" "corner.zone.crossings"
It is also possible to create a density heatmap for many tracks together.
Rtrack::plot_density(experiment$metrics)
#> Warning in Rtrack::plot_density(experiment$metrics): Multiple arena definitions
#> have been used. A merged plot may not make sense.
The warning tells us that there are data from tracks using different arenas in our ‘metrics’ list. This almost certainly does not make sense.
However, it might be interesting to compare all the tracks from each of the different groups.
std.metrics = experiment$metrics[experiment$factors$Group == "STD"]
enr.metrics = experiment$metrics[experiment$factors$Group == "ENR"]
enrstd.metrics = experiment$metrics[experiment$factors$Group == "ENR-STD"]
par(mfrow = c(1, 3))
Rtrack::plot_density(std.metrics, title = "STD")
#> Warning in Rtrack::plot_density(std.metrics, title = "STD"): Multiple arena
#> definitions have been used. A merged plot may not make sense.
Rtrack::plot_density(enr.metrics, title = "ENR")
#> Warning in Rtrack::plot_density(enr.metrics, title = "ENR"): Multiple arena
#> definitions have been used. A merged plot may not make sense.
Rtrack::plot_density(enrstd.metrics, title = "ENR-STD")
#> Warning in Rtrack::plot_density(enrstd.metrics, title = "ENR-STD"): Multiple
#> arena definitions have been used. A merged plot may not make sense.
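With more than a handful of groups, writing out one subset per group by hand becomes tedious. A possible alternative, sketched here with toy data in base R, is to let split() build one sub-list per group. The objects toy.metrics and toy.group are stand-ins for experiment$metrics and experiment$factors$Group and contain no real data.

```r
# Toy stand-ins for experiment$metrics and experiment$factors$Group.
toy.metrics = list(t1 = "track 1", t2 = "track 2", t3 = "track 3", t4 = "track 4")
toy.group = c("STD", "ENR", "ENR-STD", "STD")

# split() returns one sub-list per group, equivalent to the three
# logical subsets written out by hand above.
by.group = split(toy.metrics, toy.group)
names(by.group)
lengths(by.group)
```

Each element of by.group could then be plotted in a loop over names(by.group), using the group name as the plot title.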
To get a data.frame containing all the experiment metadata, metrics and strategies for each track, it is possible to export the experiment results. The function export_results is really intended for saving to a file, but if no filename is given, then you get the data as a data.frame.
results = Rtrack::export_results(experiment)
The results can be written to file in any one of several formats. The format will be determined from the filename extension. The default, and most likely to be used in an experimental workflow, is the Excel ‘.xlsx’ format.
Rtrack::export_results(experiment, file = "Results/OFT_results.xlsx")
Also supported are tab-delimited text (recommended for maximum portability; the file extension can be any of ‘.tsv’, ‘.txt’ or ‘.tab’) and comma-delimited values (‘.csv’, or ‘.csv2’ where decimal commas are needed). You can actually use any file extension, but in that case the file will be written as tab-delimited text and you’ll get a warning.
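The extension rules described above can be summarised as a small lookup. The function below is only an illustration of those rules in base R; it is not how Rtrack is implemented, and the function name and the returned labels are invented for this sketch.

```r
# Illustrative only: mirrors the rules described above, not Rtrack's code.
guess.export.format = function(file) {
  extension = tolower(tools::file_ext(file))
  switch(extension,
    xlsx = "excel",
    tsv = , txt = , tab = "tab-delimited",
    csv = "comma-delimited",
    csv2 = "comma-delimited (decimal commas)",
    {
      # Any other extension falls back to tab-delimited text with a warning.
      warning("Unrecognised extension; falling back to tab-delimited text.")
      "tab-delimited"
    }
  )
}

guess.export.format("Results/OFT_results.xlsx")
guess.export.format("Results/OFT_results.tsv")
```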
It is also possible to only export some of the results. To do this, just specify the indices or names of the tracks you would like to export.
# Export just the data for the standard-housed animals.
std = experiment$factors$Group == "STD"
Rtrack::export_results(experiment, tracks = std, file = "Results/OFT_results_STD.xlsx")
It is worthwhile noting that the order of the exported results is also determined by the order of the values given to the tracks parameter.
results = Rtrack::export_results(experiment) # Get the results as a data frame.
ordered = order(results$path.length, decreasing = TRUE) # Then sort by path length (highest to lowest).
Rtrack::export_results(experiment, tracks = ordered, file = "Results/OFT_results_ordered.xlsx")
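If the behaviour of order() is unfamiliar, this toy example in base R may help. The track names and path lengths are made up; the point is only that order() returns positions rather than sorted values, which is why its result can be used as a set of track indices.

```r
# Toy path lengths for four hypothetical tracks.
path.length = c(track_a = 1520, track_b = 2310, track_c = 980, track_d = 1875)

# order() returns the positions that would sort the values,
# here from longest to shortest path.
ordered = order(path.length, decreasing = TRUE)
ordered
names(path.length)[ordered]
```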
The entire Rtrack_experiment object can easily be saved and reloaded into a later R session. The .RData format is a compressed version of the Rtrack_experiment object and requires very little space.
save(experiment, file = "Results/OFT_experiment.RData")
Load the file again (not necessary in this session, but the following line demonstrates the command needed to read in the .RData file we just created).
load("Results/OFT_experiment.RData")
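One detail of save() and load() that is easy to overlook is that the object comes back under the name it was saved with, replacing any object of that name in the current session. The round trip below demonstrates this with a toy list standing in for an Rtrack_experiment and a temporary file as a placeholder location.

```r
# Toy object standing in for an Rtrack_experiment.
toy.experiment = list(info = list(note = "example"), values = c(1.5, 2.5))

# Save to a temporary file, remove the object, then restore it.
rdata.file = tempfile(fileext = ".RData")
save(toy.experiment, file = rdata.file)
original = toy.experiment
rm(toy.experiment)
restored.names = load(rdata.file)  # load() returns the names it restored

# The restored object is identical to the one that was saved.
identical(toy.experiment, original)
```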
We have also developed a format for saving the raw data so that they can be accessed by other software. This allows sharing with other people and archiving in a way that is more likely to be readable in the future. The command below will create a file with the extension .trackxf. You do not need to add the extension yourself (in fact it is better not to), as Rtrack will take care of naming the file correctly.
Rtrack::export_data(experiment, file = "Results/OFT_Experiment")
#> Creating trackxf archive.
#> Compressing trackxf archive.
Data saved in this way can be read back into Rtrack using the read_experiment function (with the format trackxf, although Rtrack will work this out for you). Because only the raw data are saved in trackxf, recreating an experiment in this way will re-calculate all of the Rtrack-specific metrics.
recreated.experiment = Rtrack::read_experiment("Results/OFT_Experiment.trackxf",
threads = 0)
#> Restoring archived experiment.
#> Initialising cluster.
#> Processing tracks using 8 threads.
This experiment object recreated from the saved trackxf file is (almost) identical to the original object. Only the export information will obviously be different.
# If we set 'export.note' back to empty, then the objects are the same.
recreated.experiment$info$export.note = experiment$info$export.note
all.equal(recreated.experiment, experiment)
#> [1] TRUE
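When two objects are not identical, all.equal() returns a description of the differences rather than FALSE, which is worth knowing before using it in a condition. The toy lists below (not real experiment objects; the note text is invented) differ only in an export note, mirroring the comparison above.

```r
# Toy objects differing only in an export note.
original = list(info = list(export.note = ""), metrics = c(10.2, 8.7))
recreated = list(info = list(export.note = "restored from archive"),
                 metrics = c(10.2, 8.7))

# all.equal() returns TRUE or a description of the differences,
# so wrap it in isTRUE() when a single logical is needed.
before = isTRUE(all.equal(recreated, original))

# After aligning the note, the two objects compare as equal.
recreated$info$export.note = original$info$export.note
after = isTRUE(all.equal(recreated, original))
```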
1. The full range of supported codes is: ‘us’ or ‘micros’ for microseconds, ‘ms’ for milliseconds, ‘s’ for seconds, ‘min’ for minutes, ‘h’ for hours, ‘d’ for days and ‘y’ for years.
2. Modern computers can multi-task and run several jobs side by side at the same time. These separate processes are called ‘threads’ in computer terminology. Running multiple parallel threads can make optimal use of the multiple ‘cores’ of your CPU (the microchip at the heart of your computer) and allow programs to process data more quickly.