A video surveillance sequence generally contains a great deal of scattered information regarding several objects in cluttered scenes. Especially when hand-held digital cameras are used, the overall quality is very low due to unstable motion and low resolution, even if multiple shots of the desired target are available. To overcome these limitations, we propose a novel Bayesian framework based on image super-resolution that integrates all the informative fragments of a target and condenses away the redundancy. We call this process distillation. In the traditional formulation of the image super-resolution problem, (1) the observed target is always the same, (2) it is acquired by a camera making small movements, and (3) the number of available images is sufficient for recovering high-frequency information. These hypotheses obviously do not hold in the concrete situations described above. In this paper, we extend and generalize the image super-resolution task, embedding it in a structured framework that accurately distills the necessary information. In short, our approach consists of two phases. First, a transformation-invariant video clustering coarsely groups and registers the frames, also defining a similarity concept among them. Second, a novel Bayesian super-resolution method uses this concept to selectively combine all the pixels of similar frames, yielding a highly informative super-resolved image of the desired target. Our approach is first tested on synthetic data, obtaining encouraging results in comparison with known super-resolution techniques and showing marked robustness against noise. Second, real data from videos taken by a hand-held camera are considered, attempting to resolve the salient details of a person in motion, a typical setting of video surveillance applications.
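The second phase described above can be illustrated with a minimal sketch of similarity-weighted Bayesian (MAP) super-resolution. The sketch assumes known integer shifts between frames (standing in for the registration produced by the clustering phase), block-average downsampling as the degradation model, and a Gaussian smoothness prior; the per-frame weights `weights` stand in for the paper's frame-similarity concept. All function names and parameters here are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def degrade(hr, shift, factor):
    """Toy degradation model: shift the high-res image, then downsample
    by block averaging (wrap-around borders, for simplicity)."""
    s = np.roll(hr, shift, axis=(0, 1))
    h, w = s.shape
    return s.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def sr_map(frames, shifts, weights, factor, lam=0.1, lr=0.5, iters=200):
    """MAP super-resolution by gradient descent on
    sum_k w_k ||degrade(x, s_k) - y_k||^2 + lam * smoothness(x).
    Frames judged more similar to the target get larger weights w_k."""
    h, w = frames[0].shape
    x = np.zeros((h * factor, w * factor))
    for _ in range(iters):
        grad = np.zeros_like(x)
        for y, sh, wk in zip(frames, shifts, weights):
            r = degrade(x, sh, factor) - y                 # residual in low-res space
            # adjoint of block averaging: replicate and rescale the residual
            up = np.repeat(np.repeat(r, factor, 0), factor, 1) / factor**2
            # adjoint of the shift: roll back by the opposite offset
            grad += wk * np.roll(up, (-sh[0], -sh[1]), axis=(0, 1))
        # gradient of the Gaussian smoothness prior (discrete Laplacian)
        lap = (-4 * x + np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1))
        grad -= lam * lap
        x -= lr * grad
    return x
```

In this toy setting, frames deemed dissimilar (e.g. a different object or a badly registered clip) would simply receive weight near zero, so they contribute nothing to the distilled image.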