As mentioned in previous posts, FastTrack on Windows is slow compared to the Linux and macOS versions. Since version 6.2.5, the tracking speed has dramatically improved on Windows.
One user reported a memory leak on Windows for a specific video format. We investigated and traced it to the OpenCL library: OpenCL was unable to share a buffer, leading to multiple deep copies of images and ultimately to RAM exhaustion. This bug was restricted to the tracking class and reproducible only with a specific video. A hotfix was deployed by deactivating OpenCL.
As usual, we ran the performance benchmark and no changes were seen... except for Windows (see graph). Surprisingly, deactivating OpenCL increases tracking performance by 52% on Windows.
"OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms" such as CPUs, GPUs, DSPs, and FPGAs. OpenCV uses OpenCL by means of the transparent API, which adds hardware acceleration with a minimal change in the code (use UMat instead of Mat to store images). Hardware acceleration can increase performance when expensive operations are applied to the image; otherwise, the overhead of moving the data to the GPU dominates.
There are numerous posts (1,2) on the internet that discuss performance issues with OpenCL. Only one thing is certain: deactivating OpenCL in FastTrack leads to consistent performance across platforms.
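The trade-off between transfer overhead and accelerated computation can be sketched with a toy cost model. This is only an illustration: the function names and all the numbers are hypothetical and were not measured on FastTrack.

```python
def accelerated_time(cpu_time_ms, speedup, transfer_ms):
    """Time per image when offloading: transfer overhead plus accelerated compute."""
    return transfer_ms + cpu_time_ms / speedup


def offloading_pays_off(cpu_time_ms, speedup, transfer_ms):
    """True when the accelerated path is faster than staying on the CPU."""
    return accelerated_time(cpu_time_ms, speedup, transfer_ms) < cpu_time_ms


# Hypothetical numbers, for illustration only.
cheap = offloading_pays_off(cpu_time_ms=0.5, speedup=10.0, transfer_ms=1.0)
expensive = offloading_pays_off(cpu_time_ms=20.0, speedup=10.0, transfer_ms=1.0)
print(cheap, expensive)
```

With these made-up values, the cheap operation is slower once offloaded because the transfer alone costs more than the whole CPU computation, whereas the expensive operation benefits from the acceleration.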
In this post, we will use Hyperfine to compare the performance of several versions of FastTrack to see how FastTrack performance has improved or degraded over time.
The graph below displays the results of the benchmark, with the version of FastTrack on the horizontal axis (the most recent on the left) and the mean time to perform 50 tracking analyses of the test dataset on the vertical axis (less time is better). We can see two interesting performance breakpoints.
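For readers who want to reproduce this kind of measurement without Hyperfine, the idea of timing repeated runs of a command can be sketched with the Python standard library. This is not how Hyperfine is implemented (it also handles warmup runs and outlier detection), and the command used here is a placeholder rather than an actual FastTrack invocation.

```python
import statistics
import subprocess
import sys
import time


def time_command(command, runs=50):
    """Run `command` `runs` times and return (mean, stdev) of wall-clock seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    spread = statistics.stdev(durations) if runs > 1 else 0.0
    return statistics.mean(durations), spread


# Placeholder command; replace it with the tracking command to benchmark.
mean_s, stdev_s = time_command([sys.executable, "-c", "pass"], runs=5)
print(f"mean {mean_s:.3f} s, stdev {stdev_s:.3f} s")
```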
The fastest version is by far 6.2.4 (the latest at the time of writing). This is due to the optimized rewrite of a core function of the tracking. This function computes the object's direction and is called ~nObject*nImage times, so even a slight gain can greatly impact the overall performance.
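To get a feel for how a small per-call gain adds up, here is a back-of-the-envelope calculation. The workload and the per-call saving are hypothetical values chosen for illustration, not measurements of the rewritten function.

```python
def total_saving_seconds(n_objects, n_images, saving_per_call_us):
    """Total time saved when a per-object, per-image function gets faster."""
    calls = n_objects * n_images
    return calls * saving_per_call_us / 1_000_000


# Hypothetical workload: 14 objects, 2000 images, 50 microseconds saved per call.
saved = total_saving_seconds(n_objects=14, n_images=2000, saving_per_call_us=50)
print(f"{saved:.2f} s saved per analysis")
```

In this made-up case, the function is called 28,000 times, so a saving of 50 microseconds per call removes 1.4 seconds from each analysis.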
We see a degradation of performance between versions 6.0.0 and 6.1.2. This degradation was introduced when FastTrack started to use an SQLite database as a backend. In version 6.0.0 and prior, tracking data were saved directly as a plain text file. This was fine for the tracking, but loading the data for reviewing consumed a lot of RAM and was very slow. Versions 6.1.0 and later store the tracking data in an SQLite database but still write the plain text file for compatibility. This development choice increased performance for the tracking review but slightly degraded the tracking time: inserting data in the database is faster, but generating and writing the text file needed for compatibility introduces a small time overhead. Overall, tracking plus reviewing is faster.
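The dual-output approach described above can be sketched as follows. The table layout is an assumption made for this example (it reuses column names from the tracking file) and is not FastTrack's actual schema; the rows are invented.

```python
import os
import sqlite3
import tempfile

# Hypothetical rows: (xBody, yBody, tBody, imageNumber, id).
rows = [(508.3, 330.9, 5.94, 0, 0), (458.1, 328.3, 0.24, 0, 1), (507.6, 330.7, 5.95, 1, 0)]

workdir = tempfile.mkdtemp()
db_path = os.path.join(workdir, "tracking.db")
txt_path = os.path.join(workdir, "tracking.txt")

# Database backend: a single transaction for the whole batch keeps inserts fast.
connection = sqlite3.connect(db_path)
with connection:
    connection.execute(
        "CREATE TABLE tracking "
        "(xBody REAL, yBody REAL, tBody REAL, imageNumber INTEGER, id INTEGER)"
    )
    connection.executemany("INSERT INTO tracking VALUES (?, ?, ?, ?, ?)", rows)

# Compatibility output: the same rows as a tab-separated text file.
with open(txt_path, "w", encoding="utf-8") as handle:
    handle.write("xBody\tyBody\ttBody\timageNumber\tid\n")
    for row in rows:
        handle.write("\t".join(str(value) for value in row) + "\n")

# Reviewing can query only what it needs instead of loading the whole file.
object_zero = connection.execute(
    "SELECT imageNumber, xBody, yBody FROM tracking WHERE id = ? ORDER BY imageNumber",
    (0,),
).fetchall()
connection.close()
print(object_zero)
```

The second write is the overhead mentioned above: every row goes to both the database and the text file, while the review step only reads from the database.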
Less significantly, we see a slight increase in performance between versions 6.1.2 and 6.2.3 caused by small optimizations in the code. We also see that migrating from Qt5 (FastTrack 5.3.5 and prior) to Qt6 (FastTrack 5.3.5 and later) doesn't change the performance.
A tracking analysis is the repetition of a few functions on a lot of images. Marginal gains on these functions can accumulate into a large increase in tracking speed. We work to increase the overall performance with each release of FastTrack and there are still gains to be found.
Since version 6.2.0, FastTrack has been compiled using MinGW_w64 instead of MSVC2019.
MinGW_w64 is a fork of the MinGW project that provides the GCC compiler for Windows. With a "better-conforming and faster math support compared to VisualStudio's" and a pthreads library, this compiler yields better performance for the OpenCV library and thus for FastTrack.
Compiling FastTrack using MinGW_w64 provides several improvements. First, it provides the getopt.h header required by FastTrack-Cli. From version 6.2.0, the command line interface of FastTrack is available natively on Windows. Secondly, OpenCV compiled using MinGW_w64 performs better than with MSVC, and Qt seems more responsive. Finally, the bundle (executable plus DLLs) is lighter than its MSVC counterpart (42.7 MB vs 62.8 MB).
Compiling FastTrack using MinGW_w64 comes with some challenges. The main dependency of FastTrack is OpenCV, which does not provide pre-built binaries for MinGW_w64; therefore, we need to compile OpenCV from source. This compilation is done once in this GitHub repository and the files are downloaded at compile time to save processing energy.
Conveniently, Qt provides pre-built binaries and the whole MinGW_w64 toolchain in its archives. Installing Qt and MinGW_w64 can be done very easily without external sources. The windeployqt Qt tool takes care of the DLLs (Qt and MinGW_w64) needed at runtime and the resulting bundle is very light.
The MinGW_w64 version of Qt does not provide QtWebEngine; thus, the in-software documentation is no longer available.
To conclude, the MinGW_w64 version of FastTrack has better performance and a lighter footprint, with only one drawback: OpenCV must be recompiled when newer versions become available. For developers, the environment is easier to set up, with only three commands necessary.
Copyright (C) FastTrack. Permission is granted to copy, distribute and/or modify this document. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
FastTrack performance comparison between Linux and Windows
FastTrack is a multi-platform application available for Linux, macOS, and Windows. In this post, we will compare the performance of the Linux and Windows versions. We will see that the performance depends a lot on how OpenCV was built and how to build it for performance.
The benchmark will be performed using FastTrack version 6.1.1 on a computer with an Intel(R) Core(TM) i7-8565U and 16 GB of RAM.
The Linux version was compiled using the GCC compiler with the default release flags of Qt. We used two Windows compilers: MSVC 2019, used for the FastTrack stable release, and MinGW_64 (GCC for Windows). MinGW_64 is not used for binary releases because it lacks the QtWebEngine package, but a lighter version of FastTrack can be compiled using the NO_WEB compilation flag.
We use OpenCV 4.5.5 and Qt 6.2.2 to perform the benchmark. We chose the test dataset of FastTrack ZFJ_001 with the parameters included with it and the PAR_001 from the two-dimensional dataset.
The results of the benchmark are displayed in Figure 1 for ZFJ_001 and Figure 2 for PAR_001. We see that the tracking is significantly slower on Windows than on Linux and that the MinGW_64 compiler yields better performance than MSVC2019.
Figure 1. Benchmark for ZFJ_001.
Figure 2. Benchmark for PAR_001.
These results can be explained by several factors. First, compiler optimizations are not the same and it seems that out-of-the-box Qt and OpenCV are generally faster with GCC. Another point is that FastTrack writes heavily on the disk using both the SQLite database and plain text files. I/O performance varies widely depending on operating system and hardware and is generally better on Linux.
In our case, we can pinpoint a large part of the performance difference to the core operations of the tracking (object detection and ellipse computation) powered by OpenCV that are significantly slower on Windows.
Performance can be improved by tweaking compiler optimization flags and compiling OpenCV using system-specific optimizations if available.
Figure.3 presents the performance for the pre-built OpenCV library and the optimized version compiled with MSVC2019. Optimized OpenCV was compiled with TBB, OpenMP, and IPP enabled and is 1.7 times faster than the pre-built version but still 1.6 times slower than the Linux version.
Figure 3. Pre-built vs optimized OpenCV library (PAR_001).
On Linux, OpenCV is compiled as packaged by the Linux distribution. For example, ArchLinux and Ubuntu do not package it with the same flags enabled, and there is still room for performance improvement. In Figure 4, we compare the performance of the AppImage packaged on Ubuntu with the ArchLinux version available on AUR. We see that the native package is slightly faster than the AppImage, which still performs very well.
Figure 4. AppImage vs Arch Linux package from AUR (PAR_001).
Pre-built binaries of FastTrack will most likely perform better on Linux than on Windows for equivalent hardware. Most Linux distributions provide a well-optimized pre-built OpenCV library, whereas FastTrack for Windows is built against the pre-built OpenCV library for MSVC.
A custom compilation of OpenCV and FastTrack with platform-specific optimizations will provide maximum performance in any case.
Ultimately, switching to MinGW_64 will be the only way to start closing the performance gap on Windows. In the next post, we will see how to compile OpenCV and (light) FastTrack with MinGW_64 and whether it is possible to reach performance as good as the standard Linux version.
import warnings

import fastanalysis as fa
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

plt.style.use("fivethirtyeight")
warnings.filterwarnings('ignore')
# Distribution of vertical positions for all the objects
p0 = sns.histplot(data=data.getDataframe(), x="yBody", kde=True)
p0.set_xlabel("Vertical position")

# Distribution of vertical positions for each individual
p1 = sns.displot(data=data.getDataframe(), x="yBody", hue="id", kind="kde")
p1.ax.set_xlabel("Vertical position")
# For each individual, compute a preference index
pi = []
for i in range(data.getObjectNumber()):
    dat = data.getObjects(i)
    dat.loc[:, "diffTime"] = dat.imageNumber.diff().values
    up = dat[dat.yBody > 100]
    down = dat[dat.yBody <= 100]
    pi.append((up.diffTime.sum() - down.diffTime.sum()) / (up.diffTime.sum() + down.diffTime.sum()))
p3 = sns.boxplot(y=pi)
p3.set_ylim(-1, 1)
p3.set_ylabel("Preference index")
p3.set_title("Objects preference to the upper side")
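The preference index computed above is the difference between the time spent in the upper and lower zones divided by the total time, so it ranges from -1 (always down) to 1 (always up). The following standalone sketch reproduces the formula without pandas or FastAnalysis. The function name, the default threshold argument, and the sample values are illustrative assumptions.

```python
def preference_index(y_positions, durations, threshold=100):
    """Return (T_up - T_down) / (T_up + T_down) for one object.

    y_positions: vertical position at each observation.
    durations: time spent at each observation (in images).
    """
    time_up = sum(d for y, d in zip(y_positions, durations) if y > threshold)
    time_down = sum(d for y, d in zip(y_positions, durations) if y <= threshold)
    total = time_up + time_down
    if total == 0:
        return float("nan")  # No observation: the index is undefined.
    return (time_up - time_down) / total


# Hypothetical object: three images above the threshold, one below.
index = preference_index([150, 120, 80, 130], [1, 1, 1, 1])
print(index)
```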
p4 = sns.kdeplot(data=data.getDataframe(), x=np.random.normal(size=data.getDataframe().values.shape[0]), y="yBody", fill=True)
p4.set_xlabel("Horizontal position randomized")
p4.set_ylabel("Vertical position")
p4.set_title("Distribution of presence")

# For each individual, compute the preference index as a function of time
pi = []
for i in range(data.getObjectNumber()):
    dat = data.getObjects(i)
    dat.loc[:, "diffTime"] = dat.imageNumber.diff().values
    pref = []
    for l, __ in enumerate(dat.yBody.values):
        up = dat[0:l][dat[0:l].yBody.values > 100]
        down = dat[0:l][dat[0:l].yBody.values <= 100]
        pref.append((up.diffTime.sum() - down.diffTime.sum()) / (up.diffTime.sum() + down.diffTime.sum()))
    pi.append(pref)

for i, j in enumerate(pi):
    p5 = sns.lineplot(x=np.arange(len(j)), y=j, label=str(i))
p5.set_xlabel("Time (images)")
p5.set_ylabel("Preference index")
p5.set_title("Preference index function of time")
p5.legend(title="id", bbox_to_anchor=(1.05, 1))
Get started with FastAnalysis
FastAnalysis is a Python library that simplifies the import of a tracking analysis performed with FastTrack. It lets you easily select data for a given object or a given timepoint.
The Julia documentation and installation guide can be found at https://julialang.org/. We provide here a simple example that details how to import the tracking.txt file from FastTrack and how to extract basic information such as the number of objects and the number of images.
using DataFrames
using CSV
using PyPlot
using Plots
using StatsPlots
We are going to make basic plots using the Plots, StatsPlots, and PyPlot modules (PyPlot requires a valid matplotlib installation). For more information about plotting, see https://docs.juliaplots.org/latest/tutorial/.
objectsByImage = zeros(numImages)
for i in 1:numImages
    objectsByImage[i] = length(Set(data.id[data.imageNumber .== i-1]))
end
Plots.plot(1:numImages, objectsByImage; title="Number of detected objects by frame", xlabel="Frames", ylabel="Objects", label=false)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Import the data
data = pd.read_csv("tracking.txt", sep='\t')
data
2,475 rows × 23 columns (omitted printing of 14 columns)

Row   xHead       yHead       tHead       xTail       yTail       tTail       xBody       yBody       tBody
      Float64     Float64     Float64     Float64     Float64     Float64     Float64     Float64     Float64
1     514.327     333.12      5.81619     499.96      327.727     6.10226     508.345     330.876     5.94395
2     463.603     327.051     0.301279    449.585     330.323     0.245547    458.058     328.346     0.238877
3     23.9978     287.715     3.70646     34.9722     278.836     3.99819     29.2056     283.505     3.84844
4     372.536     230.143     0.194641    354.226     231.604     6.08737     364.822     230.759     0.0515087
5     480.58      213.482     1.28236     478.125     228.52      1.53303     479.428     220.543     1.42567
6     171.682     143.55      6.09077     155.507     140.116     6.1146      164.913     142.113     6.08216
7     498.151     121.32      6.00177     483.712     119.285     0.0223247   492.683     120.55      6.15298
8     329.56      123.418     6.08726     312.526     119.042     5.9098      322.531     121.614     6.01722
9     465.256     115.045     4.44359     470.057     99.911      4.40559     467.106     109.205     4.40862
10    423.663     66.3789     0.0888056   409.105     67.2971     6.12053     417.615     66.7623     0.0292602
11    424.487     40.4232     5.48198     411.594     30.3912     5.88869     418.96      36.1192     5.64923
12    370.591     35.2147     5.99688     354.672     29.5633     5.89121     364.007     32.8767     5.94008
13    498.502     20.2527     5.66339     487.254     9.19499     5.39497     493.758     15.5781     5.5026
14    367.791     5.03034     6.05933     352.076     6.75603     0.653641    361.12      5.75904     0.152688
15    512.965     332.575     5.86617     499.435     327.759     6.052       507.626     330.673     5.95102
16    463.385     324.659     0.707       451.431     332.193     0.246265    458.959     327.443     0.542368
17    19.4579     293.022     4.28861     25.5579     281.206     4.18379     21.8962     288.302     4.23379
18    379.037     230.527     6.10571     361.728     229.616     0.199343    371.74      230.144     6.25939
19    478.884     206.712     1.27832     475.454     221.757     1.40929     477.197     214.108     1.35472
20    173.923     143.042     0.00732468  157.261     142.182     6.00453     167.066     142.689     6.20403
21    498.561     122.687     5.83253     486.357     118.196     6.13893     493.718     120.906     5.95151
22    328.812     124.134     6.05932     312.848     119.605     5.98617     322.331     122.294     6.00901
23    461.738     116.731     4.47649     466.371     101.736     4.40285     463.615     110.656     4.41641
24    428.631     69.2715     5.87139     415.665     64.6444     6.13862     423.218     67.3364     5.96558
25    425.821     44.9942     5.59983     414.84      33.2028     5.37159     421.248     40.0897     5.461
26    368.362     35.6219     5.97427     353.22      30.4625     5.88261     362.109     33.4891     5.94605
27    503.484     22.7293     5.76026     489.632     16.6315     5.92136     497.924     20.2857     5.86668
28    369.184     5.84074     6.15994     352.622     4.25328     6.24787     362.144     5.16766     6.19236
29    510.519     331.417     5.88883     495.784     327.366     6.12889     504.484     329.758     6.02088
30    464.242     323.533     0.290639    451.756     328.194     0.532686    459.432     325.326     0.37736
⋮     ⋮           ⋮           ⋮           ⋮           ⋮           ⋮           ⋮           ⋮           ⋮
# Count the number of detected objects
objectNumber = len(set(data["id"].values))
objectNumber
2
# Count the number of images
imageNumber = np.max(data["imageNumber"]) + 1
imageNumber
2000
# Plot the number of objects detected by frame
objectByFrame = np.zeros(imageNumber)
for i in range(imageNumber):
    objectByFrame[i] = data[data["imageNumber"] == i].shape[0]
plt.scatter(range(imageNumber), objectByFrame)
<matplotlib.collections.PathCollection at 0x7fa41831c6d8>
# Plot the trajectory of the first object and its orientation
dataObject0 = data[data["id"] == 0]

# Velocity from the displacement between consecutive detections
distance = np.sqrt(np.diff(dataObject0["xBody"].values)**2 + np.diff(dataObject0["yBody"].values)**2)
framerate = 50
time = np.diff(dataObject0["imageNumber"].values) / framerate
velocity = distance / time

fig, ax = plt.subplots(1, 2)
fig.subplots_adjust(right=2)

# Left panel: trajectory colored by velocity
ax[0] = plt.subplot(121)
plot = ax[0].scatter(dataObject0["xBody"][0:-1], dataObject0["yBody"][0:-1], c=velocity, s=1)
ax[0].set_xlabel("x-position")
ax[0].set_ylabel("y-position")
bar = fig.colorbar(plot)
bar.set_label("Velocity")

# Right panel: object direction on a polar plot
ax[1] = plt.subplot(122, projection='polar')
ax[1].scatter(range(dataObject0["tBody"].shape[0]), dataObject0["tBody"], s=0.4)
ax[1].set_title("Object direction")
Fast Track is now integrating feature-based registration alongside the other registration methods.
Feature-based registration consists of finding stable points in an image, called key points, and computing their descriptors. The same key points are then found in the second image. From these key points, a homography is computed between the two images and the transformation is applied to all the pixels of the image to register.
Fast Track uses an automatic algorithm to find the key points and their descriptors (~500) in the two images. This algorithm, called the ORB feature detector, was introduced by Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary R. Bradski in 2011.
The key points are matched pairwise between the two images using the Hamming distance. The Hamming distance measures the minimum number of substitutions (bit flips, for binary descriptors) required to transform one feature descriptor into another.
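As an illustration of the matching criterion, the Hamming distance between two binary descriptors can be computed by counting the bits that differ. This is a minimal sketch and not Fast Track's implementation: the helper names are hypothetical and the descriptors are shortened to 2 bytes for readability (ORB descriptors are 32 bytes long by default).

```python
def hamming_distance(descriptor_a: bytes, descriptor_b: bytes) -> int:
    """Number of differing bits between two equal-length binary descriptors."""
    if len(descriptor_a) != len(descriptor_b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(a ^ b).count("1") for a, b in zip(descriptor_a, descriptor_b))


def best_match(query: bytes, candidates: list) -> int:
    """Index of the candidate descriptor closest to `query`."""
    return min(range(len(candidates)),
               key=lambda i: hamming_distance(query, candidates[i]))


# Hypothetical 2-byte descriptors.
query = bytes([0b10110010, 0b00001111])
candidates = [bytes([0b01001101, 0b11110000]), bytes([0b10110011, 0b00001111])]
print(hamming_distance(query, candidates[1]), best_match(query, candidates))
```

The second candidate differs from the query by a single bit while the first differs in all 16, so the second is selected as the match.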
The homography is computed between the matching key points. It is possible that more than 30% of the matched features are incorrect. To reduce errors when finding the homography, Fast Track uses the Random Sample Consensus (RANSAC) estimation technique introduced by Fischler and Bolles in 1981.
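The principle of RANSAC can be sketched on a deliberately simplified model. The example below estimates a pure 2D translation instead of a full homography, so that a single match is enough to form a hypothesis (a homography needs four). The function name, the threshold, and the data are assumptions made for the illustration.

```python
import random


def ransac_translation(matches, threshold=2.0, iterations=100, seed=0):
    """Estimate a 2D translation from point matches containing outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs.
    Returns (dx, dy, inliers) for the hypothesis with the most inliers,
    refined by averaging over those inliers.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation needs a single match as minimal sample.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        inliers = [
            (a, b) for a, b in matches
            if abs(b[0] - a[0] - dx) <= threshold and abs(b[1] - a[1] - dy) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the model on the consensus set.
    dx = sum(b[0] - a[0] for a, b in best_inliers) / len(best_inliers)
    dy = sum(b[1] - a[1] for a, b in best_inliers) / len(best_inliers)
    return dx, dy, best_inliers


# Hypothetical matches: 7 correct ones shifted by (5, -3) and 3 wrong ones (30%).
good = [((x, 2 * x), (x + 5, 2 * x - 3)) for x in range(7)]
bad = [((0, 0), (40, 40)), ((1, 1), (-20, 9)), ((2, 2), (7, 60))]
dx, dy, inliers = ransac_translation(good + bad)
print(dx, dy, len(inliers))
```

The wrong matches never gather a consensus larger than themselves, so the estimate is computed only from the seven consistent matches.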
This new registration method will be available in the 4.8 Fast Track release. It can be tested in the nightly build and on the dev branch on GitLab.
Fast Track is now integrating a new method of registration: the so-called ECC registration. This method was developed by Georgios D. Evangelidis and Emmanouil Z. Psarakis. It consists of maximizing the Enhanced Correlation Coefficient function to find the parameters that best describe the transformation between the two images. This is done by iteratively solving a sequence of nonlinear optimization problems.
This method has several advantages:
It is invariant with respect to photometric distortions, i.e., changes in contrast and brightness.
The solution of the optimization problem is linear, i.e., the computing time is acceptable.
It performs well in noisy conditions.
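The photometric invariance listed above can be checked numerically on a toy example. The sketch below computes the correlation between two zero-mean, unit-norm pixel vectors, which is the quantity the ECC criterion builds on. It does not perform any registration, and the function names and pixel values are illustrative assumptions.

```python
import math


def normalize_zero_mean(pixels):
    """Subtract the mean and scale to unit norm."""
    mean = sum(pixels) / len(pixels)
    centered = [p - mean for p in pixels]
    norm = math.sqrt(sum(c * c for c in centered))
    return [c / norm for c in centered]


def correlation_coefficient(image_a, image_b):
    """Correlation between two zero-mean, unit-norm pixel vectors."""
    a = normalize_zero_mean(image_a)
    b = normalize_zero_mean(image_b)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical flattened image and a copy with changed contrast and brightness.
reference = [10.0, 50.0, 30.0, 80.0, 20.0]
brighter = [1.5 * p + 25.0 for p in reference]
scrambled = [80.0, 10.0, 50.0, 20.0, 30.0]

same = correlation_coefficient(reference, brighter)
different = correlation_coefficient(reference, scrambled)
print(round(same, 6), different < same)
```

Changing the contrast (gain) and brightness (offset) leaves the coefficient at its maximum, whereas rearranging the pixels lowers it, which is what makes the coefficient usable as an alignment score.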
For the moment, only one mode of registration is integrated into Fast Track. The Euclidean mode can correct the translation and rotation of the image. For example, this mode can correct small camera vibrations.
Currently in testing, this registration mode will be available in version 4.8.0. You can test this feature in the nightly release or on the dev branch on GitLab.