This report was created using behavseqanalyser v0.2.1; behaviours were grouped using the Berlin categorisation (date: 2021-06-07).

# Ro_testdata_mbr, home cage monitoring results

The data are grouped by treatment. Data transformation: the data (percentage of time spent doing the behavior) were transformed using the square root method.

##  first second 
##     10     10
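
As a minimal sketch of the square-root transformation mentioned above, assuming a data frame `dat` with a `group` factor (treatment) and numeric behaviour columns expressed as percentage of time (object and column names are illustrative, not the pipeline's own):

```r
# Illustrative only: `dat` holds one row per animal, a `group` factor and
# numeric behaviour columns expressed as % of time
behav_cols      <- setdiff(names(dat), "group")
dat[behav_cols] <- sqrt(dat[behav_cols])   # square-root transformation
```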

We grouped the variables following the Berlin categorisation to get 19 behavior categories. We used the following time windows and obtained 18 x 1 = 18 variables:

| time_reference | windowstart (min) | windowend (min) | windowname |
|----------------|------------------:|------------------:|----------------|
| Bintodark      | -120              | 864                | full recording |

Note that the last window may be truncated if not every dataset reaches 900 min after lights on.

We then ran a random forest to rank the variables by how well they distinguish the groups. We then took the 20 best variables and ran the random forest again (so that the Gini scores obtained do not depend on the initial number of variables). We plot here the table of variables ordered by weight:
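
A minimal sketch of this two-pass ranking, using the randomForest package on the same illustrative `dat` data frame as above (this is a sketch under those assumptions, not necessarily the code behavseqanalyser runs):

```r
library(randomForest)

set.seed(1)
# First pass: rank all variables by Gini importance
rf1   <- randomForest(group ~ ., data = dat, importance = TRUE)
gini1 <- sort(importance(rf1)[, "MeanDecreaseGini"], decreasing = TRUE)

# Second pass on the 20 best variables, so that the Gini scores
# no longer depend on the initial number of variables
top20 <- names(gini1)[seq_len(min(20, length(gini1)))]
rf2   <- randomForest(group ~ ., data = dat[, c("group", top20)],
                      importance = TRUE)
gini2 <- sort(importance(rf2)[, "MeanDecreaseGini"], decreasing = TRUE)
gini2   # variables ordered by weight
```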

Let’s take a threshold of importance (Gini > 0.95) and keep all variables satisfying the filter, or at least 8 variables:

digforage1, Drink1, Eat1, RemainLow1, ComeDown1, walk1, Sniffing1 and Chew1
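
A possible expression of this selection rule, reusing the illustrative `gini2` object from the random-forest sketch above:

```r
# Keep every variable above the Gini threshold, but never fewer than 8
threshold <- 0.95
selected  <- names(gini2)[gini2 > threshold]
if (length(selected) < 8) selected <- names(gini2)[1:8]
selected
```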

# Plotting

First, let's plot the 2 most discriminative variables according to the random forest:

Here, we plot the first two or three components obtained after an ICA performed on the reduced data:
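
One way such an ICA could be run, sketched with the fastICA package on the selected variables; the objects `dat` and `selected` are assumptions carried over from the sketches above, and the actual report code may differ:

```r
library(fastICA)

# X_sel: animals x selected behavioural variables (numeric matrix)
X_sel <- as.matrix(dat[selected])
ica   <- fastICA(X_sel, n.comp = 3)

# ica$S holds the estimated independent components, one column per component
plot(ica$S[, 1], ica$S[, 2],
     col = as.integer(dat$group), pch = 19,
     xlab = "IC1", ylab = "IC2")
```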

# PCA strategy

The PCA strategy shows that the behavior profiles of the two groups of animals are not identical.

We performed a PCA on the data and tested whether the groups differ in their first component score, using a Mann-Whitney test or, if more than 2 groups exist, a Kruskal-Wallis rank sum test. We plot here the first component in a boxplot:
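
A rough sketch of this step, again on the illustrative `dat` data frame (not the report's actual code):

```r
# PCA on the behavioural variables, then a rank test on the first PC score
behav_cols <- setdiff(names(dat), "group")
pca <- prcomp(dat[behav_cols], scale. = TRUE)
pc1 <- pca$x[, 1]

if (nlevels(dat$group) > 2) {
  test <- kruskal.test(pc1 ~ dat$group)   # more than 2 groups
} else {
  test <- wilcox.test(pc1 ~ dat$group)    # Mann-Whitney
}
test$p.value

boxplot(pc1 ~ dat$group, ylab = "First principal component score")
```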

NB: This strategy is quite robust against type I errors. On the other hand, it may well overlook existing differences.


# Metadata used for the analysis

project metadata


all metadata (downloadable)


# Machine learning attempt: SVM

We performed an SVM on the total data. We used a 2-out validation strategy. We calculated the accuracy of the output (kappa score) for the real data grouping and for permuted groups. This accuracy (0.2) was tested for significance using a permutation strategy with 200 permutations.
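
A hedged sketch of this procedure using the e1071 package, interpreting "2-out" as folds of two held-out animals (the actual validation scheme may differ) and reusing the object names `Accuracyreal` and `Acc_sampled` that appear in the p-value code below:

```r
library(e1071)   # svm()

# Cohen's kappa between true and predicted labels (0 = chance, 1 = perfect)
kappa_score <- function(truth, pred) {
  tab <- table(truth, pred)
  p_o <- sum(diag(tab)) / sum(tab)
  p_e <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2
  (p_o - p_e) / (1 - p_e)
}

# Cross-validated radial-kernel SVM predictions, holding out 2 animals
# per fold (one reading of the "2-out" strategy)
cv_predict <- function(d) {
  folds <- split(sample(nrow(d)), ceiling(seq_len(nrow(d)) / 2))
  pred  <- factor(rep(NA, nrow(d)), levels = levels(d$group))
  for (idx in folds) {
    fit       <- svm(group ~ ., data = d[-idx, ], kernel = "radial")
    pred[idx] <- predict(fit, d[idx, ])
  }
  pred
}

set.seed(1)
Accuracyreal <- kappa_score(dat$group, cv_predict(dat))   # real grouping

# Same score on 200 permutations of the group labels
Acc_sampled <- replicate(200, {
  shuffled       <- dat
  shuffled$group <- sample(shuffled$group)
  kappa_score(shuffled$group, cv_predict(shuffled))
})
```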

We use a binomial confidence interval to calculate a p-value.

The SVM procedure could not tell the two groups apart.


Details: [1] “17 variables: Accuracy of the prediction with radial kernel (Kappa index: 0 denotes chance level, maximum is 1):0.2”

Distribution of the accuracy scores with permuted labels, with a vertical line added at the score obtained using the real groups:
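
Such a plot could be produced from the same objects used in the p-value code below, for example:

```r
hist(Acc_sampled, breaks = 20,
     xlab = "Kappa score with permuted labels", main = "")
abline(v = Accuracyreal, col = "red", lwd = 2)   # score with the real groups
```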

P-value calculation:

library(Hmisc)                                     # Exports `binconf`
k <- sum(abs(Acc_sampled) >= abs(Accuracyreal))    # Two-tailed test
R <- binconf(k, length(Acc_sampled), method = 'exact')
print(zapsmall(R))                                 # 95% CI by default
##  PointEst     Lower     Upper
##     0.285 0.2235544 0.3529427
save.image(file = "results.rdata")