Hi, I am Claire

Claire Labit-Bonis

A.I. Engineer at ACTIA

I am a software architect specializing in artificial intelligence applied to computer vision. I earned my PhD in 2021 with the LAAS-CNRS and ACTIA teams, working on multi-object detection and tracking for public transportation use cases.

Involvement
Goodwill
Curiosity

Publications

Intercorrélation basée apprentissage profond pour le suivi multi-cibles
Pierre Marigo, Jérôme Thomas, Claire Labit-Bonis, Frédéric Lerasle

Tracking multiple moving targets in a video stream requires localizing and re-identifying them in varied and sometimes cluttered scenes. We present a tracking-by-detection method with multi-hypothesis visual pose estimation based on deep learning cross-correlation. In addition to a single all-in-one neural network for target detection and tracking, we also put forward a strategy for managing trajectories by characterizing their states. We demonstrate the quality of our approach and the gains it brings through a quantitative and qualitative comparison with the literature on the MOT17 benchmark.

Visual and automatic bus passenger counting based on a deep tracking-by-detection system

In this paper, we address the industrial constraints of automatic passenger counting in city buses through a deep architecture able to handle images taken from low-cost 2D cameras placed above the doorstep, from a zenithal point of view. The challenge is to handle highly variable scenes due to passenger appearance (hair color, hats, height), bus population density at rush hour, and changes in scene illumination. Recent breakthroughs in deep learning applied to computer vision, together with the embedding requirements of this task, motivate us to integrate a lightweight convolutional multi-object tracker that was designed specifically for embedded applications and performed well on the MOT Challenge. Here we evaluate it in an industrial context on our large-scale in-situ dataset, labelled for detection, multi-target tracking and counting, and present a complete embedded counting system meeting the requirements of our application.

Compact and siamese multi-object tracking-by-detection applied to on-board passenger counting in city buses
PhD thesis supervised by Frédéric Lerasle and Jérôme Thomas, defended at the University of Toulouse 3 on June 21, 2021

In this thesis, we address the industrial constraints of passenger counting in city buses through a deep architecture able to handle images taken from low-cost 2D sensors placed above the doorstep, from a zenithal point of view. The challenge is to handle highly variable scenes due to passenger appearance (hair color, hats, height), bus population density at rush hour, and changes in scene illumination. Recent breakthroughs in deep learning applied to computer vision, together with the system embedding requirements, motivate us to propose a unified and lightweight convolutional architecture for multi-object tracking-by-detection and trajectory reconstruction. We evaluate our approach against the literature on a public reference dataset, the MOT Challenge, and also in an industrial context on our large in-situ database, labeled for detection, multi-target tracking and counting. We show state-of-the-art results on the public database in both tracking accuracy and execution speed, and present a complete embedded counting system that respects the industrial constraints defined by the specifications.

Compact and discriminative multi-object tracking with siamese CNNs

Following the tracking-by-detection paradigm, multiple object tracking has to deal with challenging scenarios, occlusions, and even missing detections; priority is often given to quality measures over speed, and a good trade-off between the two is hard to achieve. Building on recent work, we propose a fast, lightweight tracker able to predict the targets' positions and re-identify them in a single step, where this is usually done in two sequential steps. To do so, we combine a bounding box regressor with a target-oriented appearance learner in a newly designed, unified architecture. This way, our tracker can infer the targets' image pose and also provide a confidence level about target identity. It is also common to filter the detector outputs in a preprocessing step, which throws away precious information about what has been seen in the image. We propose a track management strategy that balances detection and tracking outputs and their associated likelihoods efficiently. Simply put, we present a fully Siamese single object tracker able to predict both position and appearance features at once with a lightweight, all-in-one architecture, within a balanced overall multi-target management strategy. We demonstrate the efficiency and speed of our system with respect to the literature on the well-known MOT17 challenge benchmark, and present qualitative evaluations as well as state-of-the-art quantitative results.

Fast tracking-by-detection of bus passengers with Siamese CNNs

Knowing the exact number of passengers across city bus fleets allows public transport operators to distribute their vehicles optimally in traffic. However, interpreting overcrowded scenes at rush hour, with day/night illumination changes, can be tricky. Based on the visual tracking-by-detection paradigm, we use the video streams provided by cameras placed above the doors to infer people's trajectories and deduce the number of passengers entering and leaving at every bus stop. A person detector estimates the location of the passengers in each image; a tracker then matches detections between successive frames based on cues such as appearance or motion, and infers trajectories over time. This paper proposes a fast and embeddable framework that performs person detection using relevant state-of-the-art CNN detectors, and couples the best one (in our application context) with a newly designed Siamese network for real-time tracking and data association. Evaluations on our own large-scale in-situ dataset are very promising in terms of performance and the real-time constraint expected for on-board processing.

In this paper, we present a comparative study of tracking-by-detection approaches applied to passenger counting in city buses. A detector locates passengers in each frame; a tracker then matches detections over time to produce trajectories. We compare three deep learning detectors still under-explored in our context, and couple them with a real-time tracker for global evaluation on our large-scale in-situ dataset. The results we present are very encouraging in terms of detection, tracking rate, and the speed expected for our embedded use.

WithU - un robot low-cost de téléprésence
Yorian Delmas, Claire Labit-Bonis, Jérémy Ouanély, Yannick Traoré, Aurélien Veillard, Michel Taïx, Philippe Truillet

Being able to interact with a remote environment, to “act as if you were there” without having to move, is made possible by techniques that have emerged and evolved since the 1960s. From simple teleconferencing, which lets people take part in meetings on the other side of the world, to the telemanipulation of medical robots in meticulous surgical operations, telepresence allows a user to be virtually present in a remote location. We present the design and development of a working prototype built with low-cost tools.