This repo contains baseline code for:
- Multi-Object Tracking (MOT): detecting (YOLOv5) and tracking (DeepSORT, ByteTrack) objects in video streams.
- Determining object attributes: e.g. color and type for vehicles, or speed estimation if camera calibration is performed.
- Multi-target multi-camera tracking (MTMC): matching tracks across cameras after running MOT in a multi-camera system.
- Evaluation: calculating MOT/MTMC metrics (MOTA, IDF1) automatically if ground truth annotations are provided.
- Express run: running all of the above in one go.
 
- Nvidia drivers have to be installed (check with `nvidia-smi`), preferably supporting CUDA >11.0.
- Tested on Python 3.7 to 3.10.
- `requirements.txt` contains a working configuration with torch 1.13.0, but installing the packages manually with different versions can work too (older Python versions may require older torch versions).
- A working C++ compiler toolchain, `python-devel` headers, and `wheel` are required for installing torch, numpy, and scipy with `pip`; otherwise the installation will only work with conda.

Creating a virtual environment is highly recommended, except when working in a disposable environment (Kaggle, Colab, etc.).
Clone the repo including the submodules:
```bash
git clone --recurse-submodules git@github.com:regob/vehicle_mtmc.git
```
Before installing `requirements.txt`, cython needs to be installed:
```bash
pip install cython "numpy>=1.18.5,<1.23.0"
```
then install the rest:
```bash
pip install -r requirements.txt
```
Some pretrained models can be downloaded from Google Drive. Create a `models` subdirectory, and unzip the models there. It contains:
- A resnet50-ibn re-id model trained on VeRi-Wild, CityFlow, VRIC, and some private data.
- SVM classifiers for vehicle color/type running on the re-id embeddings.
 
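Before running anything, it is worth checking that torch was installed with GPU support; a quick optional sanity check (not part of the repo):

```python
import torch

# detection and re-id fall back to CPU (very slow) if CUDA is not visible
print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```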
Running single-cam tracking requires at a minimum a video, a re-id model and a configuration file. A fairly minimal configuration file for the highway.mp4 example video and pretrained re-id model is below:
```yaml
OUTPUT_DIR: "output/mot_highway"
MOT:
  VIDEO: "datasets/highway.mp4"
  REID_MODEL_OPTS: "models/resnet50_mixstyle/opts.yaml"
  REID_MODEL_CKPT: "models/resnet50_mixstyle/net_19.pth"
  DETECTOR: "yolov5x6"
  TRACKER: "bytetrack_iou"
  SHOW: false
  VIDEO_OUTPUT: true
```
Any car traffic video should be fine for testing; the video from the screenshots can be downloaded as:
```bash
$ yt-dlp -f mp4 -o datasets/highway.mp4 https://www.youtube.com/watch?v=nt3D26lrkho
```
Install `yt-dlp` or `youtube-dl` for downloading youtube videos (the former bypasses rate limits).
The example configuration is at `config/examples/mot_highway.yaml`. Tracking can be run from the repo root as follows (`PYTHONPATH` needs to be set to the root folder):
```bash
$ export PYTHONPATH=$(pwd)
$ python3 mot/run_tracker.py --config examples/mot_highway.yaml
```
The required parameters for MOT are (paths can be relative to the repo root, or absolute):
- `OUTPUT_DIR`: Directory where the outputs will be saved.
- `MOT.VIDEO`: Path to the video input.
- `MOT.REID_MODEL_OPTS`: Path to the `opts.yaml` of the re-id model.
- `MOT.REID_MODEL_CKPT`: Path to the checkpoint of the re-id model.
Other important parameters:
- `MOT.DETECTOR`: yolov5 versions are supported.
- `MOT.TRACKER`: Choose between ByteTrack ("bytetrack_iou") or DeepSORT ("deepsort").
- `MOT.SHOW`: Show tracking online in a window (cv2 needs to connect to a display for this, or it crashes).
- `MOT.VIDEO_OUTPUT`: Save the tracked video in the output folder.
- `MOT.STATIC_ATTRIBUTES`: Configure attribute extraction models.
- `MOT.CALIBRATION`: Camera calibration file (described below).
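Before launching a long run, it can help to verify that a config file contains the required keys; a minimal sketch using PyYAML (a hypothetical helper, not part of the repo, and unaware of any defaults the tracker may merge in):

```python
import yaml

# the required keys from the list above; nested keys are written as tuples
REQUIRED = [("OUTPUT_DIR",), ("MOT", "VIDEO"),
            ("MOT", "REID_MODEL_OPTS"), ("MOT", "REID_MODEL_CKPT")]

def check_config(path):
    """Raise KeyError if a required key is missing from the config file."""
    with open(path) as f:
        cfg = yaml.safe_load(f)
    for key in REQUIRED:
        node = cfg
        for part in key:
            if not isinstance(node, dict) or part not in node:
                raise KeyError("missing config key: " + ".".join(key))
            node = node[part]

check_config("config/examples/mot_highway.yaml")
```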
Determining static attributes (e.g. type, color) can be configured as:
```yaml
MOT:
  STATIC_ATTRIBUTES:
    - color: "models/color_svm.pkl"
    - type: "models/type_svm.pkl"
```
Models can be the following:
- a pytorch CNN that gets the image in the bounding box as input,
- a pytorch fully-connected NN that predicts the attribute from the re-id embedding,
- pickled sklearn/xgboost/etc. models that have a `predict(x)` method predicting from the re-id embedding as a numpy array.
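For the third option, any estimator exposing `predict(x)` works once pickled; a minimal sketch with scikit-learn (toy random data; the embedding dimension is assumed to be 512 here, but it depends on the re-id model):

```python
import pickle
import numpy as np
from sklearn.svm import SVC

# toy stand-ins for (re-id embedding, color label) training pairs
X = np.random.randn(200, 512)      # one embedding per vehicle crop
y = np.random.randint(0, 5, 200)   # color class indices

clf = SVC()
clf.fit(X, y)

# the tracker only needs pickle.load(...).predict(embeddings) to work
with open("models/color_svm.pkl", "wb") as f:
    pickle.dump(clf, f)
```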
When adding a new attribute besides color and type, its possible values have to be configured in `mot/attributes.py`.
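The exact contents of `mot/attributes.py` are not reproduced here; conceptually, the new attribute's possible values need to be registered, along these lines (purely illustrative, the real structure of the file may differ):

```python
# hypothetical illustration only: each attribute maps to its possible values
ATTRIBUTE_VALUES = {
    "color": ["black", "white", "red", "blue", "gray"],
    "type": ["car", "truck", "bus", "van"],
    "my_new_attribute": ["value_a", "value_b"],
}
```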
Camera calibration has to be performed with the Cal_PnP package to get a homography matrix; the path to the homography matrix then has to be configured in `MOT.CALIBRATION`. An example homography matrix file is provided for `highway.mp4` at `config/examples/highway_calibration.txt`.
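The homography maps image-plane points to ground-plane coordinates, which is what makes speed estimation possible; a small illustration with OpenCV (the matrix below is made up, in practice load the one produced by Cal_PnP):

```python
import cv2
import numpy as np

# made-up 3x3 homography for illustration; load the real matrix from the
# calibration file instead
H = np.array([[1.8e-2, -3.1e-3, -4.0],
              [1.2e-3,  2.4e-2, -6.5],
              [1.0e-5,  4.0e-4,  1.0]])

pixel = np.array([[[640.0, 480.0]]])          # shape (1, 1, 2), as cv2 expects
ground = cv2.perspectiveTransform(pixel, H)   # ground-plane coordinates
print(ground)

# speed can then be estimated from the distance between a track's ground-plane
# positions in consecutive frames, divided by the frame interval
```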
Express multi-camera tracking runs MOT on all cameras and then hierarchical clustering on the single-camera tracks. Temporal constraints are also considered; these have to be pre-configured in the `MTMC.CAMERA_LAYOUT` parameter. An example config for CityFlow S02 (4 cameras at a crossroad) is at `config/cityflow/express_s02.yaml`. The part describing the MTMC config is:
```yaml
MTMC:
  CAMERA_LAYOUT: 'config/cityflow/s02_camera_layout.txt'
  LINKAGE: 'average'
  MIN_SIM: 0.5
EXPRESS:
  FINAL_VIDEO_OUTPUT: true
  CAMERAS:
    - "video": "datasets/cityflow_track3/validation/S02/c006/vdo.avi"
      "detection_mask": "assets/cityflow/c006_mask.jpg"
      "calibration": "datasets/cityflow_track3/validation/S02/c006/calibration.txt"
    - "video": "datasets/cityflow_track3/validation/S02/c007/vdo.avi"
      "detection_mask": "assets/cityflow/c007_mask.jpg"
      "calibration": "datasets/cityflow_track3/validation/S02/c007/calibration.txt"
    - "video": "datasets/cityflow_track3/validation/S02/c008/vdo.avi"
      "detection_mask": "assets/cityflow/c008_mask.jpg"
      "calibration": "datasets/cityflow_track3/validation/S02/c008/calibration.txt"
    - "video": "datasets/cityflow_track3/validation/S02/c009/vdo.avi"
      "detection_mask": "assets/cityflow/c009_mask.jpg"
      "calibration": "datasets/cityflow_track3/validation/S02/c009/calibration.txt"The MOT config is the same for all cameras, but for each camera, at least the "video" key has to be given in EXPRESS.CAMERAS, the meaning of the keys is the same as in the MOT config.
In the MTMC config there are only a few parameters:
- `MTMC.LINKAGE` chooses the linkage for hierarchical clustering from ['single', 'complete', 'average'].
- `MTMC.MIN_SIM` is the minimal similarity between multi-cam tracks above which they can be merged.
- `MTMC.CAMERA_LAYOUT` stores the mandatory camera constraints file. The camera layout file for CityFlow S02 is at `config/cityflow/s02_camera_layout.txt`.

On CityFlow S02, express MTMC can be run as:
```bash
$ export PYTHONPATH=$(pwd)
$ python3 mtmc/run_express_mtmc.py --config cityflow/express_s02.yaml
```
For running the example config, the S02 scenario of the CityFlow dataset needs to be unzipped to `datasets/cityflow_track3/validation`.
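Conceptually, the merging step is agglomerative clustering over track embeddings with the configured linkage, cut at the similarity threshold; a simplified sketch with scipy (assuming cosine similarity between mean track embeddings, and ignoring the temporal/camera constraints the real implementation also enforces):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def cluster_tracks(embeddings, min_sim=0.5, method="average"):
    """Group single-camera tracks into multi-camera identities."""
    # cosine distance = 1 - cosine similarity
    dists = pdist(embeddings, metric="cosine")
    Z = linkage(dists, method=method)
    # cut the dendrogram so that merged tracks keep similarity >= min_sim
    return fcluster(Z, t=1.0 - min_sim, criterion="distance")

tracks = np.random.randn(10, 512)   # toy mean embeddings, one row per track
print(cluster_tracks(tracks))       # cluster id per track
```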
Models trained by my `reid/vehicle_reid` repo are supported out-of-the-box in the configuration. Other torch models could be integrated by modifying the model loading in `mot/run_tracker.py`, which currently looks like this:
```python
# initialize reid model
reid_model = load_model_from_opts(cfg.MOT.REID_MODEL_OPTS,
                                  ckpt=cfg.MOT.REID_MODEL_CKPT,
                                  remove_classifier=True)
```
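Any replacement only has to map a batch of image crops to embedding vectors; a sketch with a torchvision backbone (assuming the rest of the pipeline uses the model purely as an embedding extractor, which should be verified in `mot/run_tracker.py`):

```python
import torch
import torchvision

# example stand-in: an ImageNet ResNet-50 with the classifier head removed,
# so it outputs a 2048-d embedding per crop
backbone = torchvision.models.resnet50(weights="DEFAULT")
backbone.fc = torch.nn.Identity()
reid_model = backbone.eval()

with torch.no_grad():
    emb = reid_model(torch.zeros(1, 3, 224, 224))   # -> shape (1, 2048)
```

Note that an ImageNet backbone will give much weaker re-id embeddings than a model trained on vehicle re-id data; this only illustrates the interface.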
If you reuse this work, please consider citing our paper:

Szűcs, G., Borsodi, R., Papp, D. (2023). Multi-Camera Trajectory Matching based on Hierarchical Clustering and Constraints. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-023-17397-0
Some parts are adapted from other repositories:
- nwojke/deep_sort: Original DeepSORT code.
- theAIGuysCode/yolov4-deepsort: Enhanced version of DeepSORT.
- ifzhang/ByteTrack: Original ByteTrack tracker code.
The yolov5 and vehicle_reid repos are used as submodules.
