Persistent target tracking using likelihood fusion in wide-area and full motion video sequences
Vehicle tracking in airborne wide-area motion imagery (WAMI) for monitoring urban environments is considerably more challenging for current state-of-the-art tracking algorithms than object tracking in full motion video (FMV). The characteristics that limit WAMI performance to relatively short tracks range from limitations of the camera sensor array, including low frame rate and georegistration inaccuracies, to small target support size, numerous shadows and occlusions from buildings, the continuously changing vantage point of the platform, and the presence of distractors and clutter, among other confounding factors. We describe our Likelihood of Features Tracking (LoFT) system, which fuses multiple sources of information about the target and its environment in a manner akin to a track-before-detect approach. LoFT uses image-based feature likelihood maps derived from a template-based target model, object and motion saliency, and track prediction and management, combined with a novel adaptive target appearance update model. Quantitative measures of performance are presented using a set of manually marked objects in both WAMI sequences, namely Columbus Large Image Format (CLIF), and several standard FMV sequences. Comparison with a number of single-object tracking systems shows that LoFT outperforms other visual trackers, including state-of-the-art sparse representation and learning-based methods, by a significant margin on the CLIF sequences and is competitive on the FMV sequences.
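To make the idea of fusing feature likelihood maps concrete, the following is a minimal illustrative sketch and not the LoFT implementation. The abstract does not state the fusion rule, so the weighted sum used here is an assumption, as are the function names `normalize`, `fuse_likelihood_maps`, and `locate_target`, the min-max normalization, the equal weights, and the synthetic 5x5 maps. Each map is rescaled to [0, 1], the maps are combined with normalized weights, and the peak of the fused map is taken as the estimated target location within the search window.

```python
import numpy as np


def normalize(likelihood_map):
    """Min-max scale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(likelihood_map, dtype=float)
    span = m.max() - m.min()
    if span == 0:
        return np.zeros_like(m)
    return (m - m.min()) / span


def fuse_likelihood_maps(maps, weights):
    """Weighted-sum fusion of same-shaped likelihood maps (assumed rule)."""
    if len(maps) == 0 or len(maps) != len(weights):
        raise ValueError("need one weight per likelihood map")
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # make the weights sum to one
    fused = np.zeros_like(np.asarray(maps[0], dtype=float))
    for wi, m in zip(w, maps):
        fused += wi * normalize(m)
    return fused


def locate_target(fused):
    """Return the (row, col) index of the fused map's maximum."""
    return np.unravel_index(np.argmax(fused), fused.shape)


# Synthetic example: the template map has a strong distractor at (0, 0),
# while the motion map supports only the true location at (2, 3).
template_map = np.zeros((5, 5))
template_map[0, 0] = 1.0   # distractor that resembles the template
template_map[2, 3] = 0.9   # true target
motion_map = np.zeros((5, 5))
motion_map[2, 3] = 1.0     # only the true target is moving

fused_map = fuse_likelihood_maps([template_map, motion_map], [0.5, 0.5])
row, col = locate_target(fused_map)
```

In this toy case the template map alone would select the distractor, whereas the fused map peaks at the location supported by both cues, which is the motivation for combining several information sources rather than relying on appearance alone.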