Center and Scale Prediction: Anchor-free Approach for Pedestrian and Face Detection
Hasan Irtiza
2023-01-01
Abstract
Object detection traditionally requires sliding-window classifiers, or anchor-based predictions in modern deep-learning-based approaches. However, both of these approaches require tedious configurations of bounding boxes. Generally speaking, single-class object detection is to tell where the object is and how big it is. Traditional methods combine the "where" and "how" subproblems into a single one through the overall judgement of bounding boxes of various scales. In view of this, we are interested in whether the "where" and "how" subproblems can be separated into two independent subtasks to ease the problem definition and the difficulty of training. Accordingly, we provide a new perspective where detecting objects is approached as a high-level semantic feature detection task. Like edges, corners, blobs and other feature detectors, the proposed detector scans for feature points all over the image, for which convolution is naturally suited. However, unlike these traditional low-level features, the proposed detector goes for a higher-level abstraction: we are looking for central points where there are objects, and modern deep models are already capable of such a high-level semantic abstraction. Like blob detection, we also predict the scales of the central points, which is also a straightforward convolution. Therefore, in this paper, pedestrian and face detection is simplified as a straightforward center and scale prediction task through convolutions. This way, the proposed method enjoys an anchor-free setting, considerably reducing the difficulty of training configuration and hyper-parameter optimization. Though structurally simple, it achieves competitive accuracy on several challenging benchmarks, including pedestrian detection and face detection. Furthermore, a cross-dataset evaluation is performed, demonstrating the superior generalization ability of the proposed method.
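As an illustration of the idea summarized in the abstract, the following is a minimal sketch of how a center heatmap ("where") and a scale map ("how big") could be predicted with plain convolutions on top of a backbone feature map. This is not the authors' reference implementation: the backbone resolution, channel widths, layer names, and the choice of a single-channel scale output are assumptions made purely for illustration.

```python
# Minimal sketch of an anchor-free center-and-scale prediction head.
# NOTE: illustrative assumption, not the paper's reference code; the
# channel widths, 1/4-resolution feature map, and single-channel scale
# output are hypothetical choices.
import torch
import torch.nn as nn


class CenterScaleHead(nn.Module):
    def __init__(self, in_channels: int = 256):
        super().__init__()
        # Shared 3x3 convolution before the two prediction branches.
        self.shared = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # "Where": one-channel heatmap of object-center probabilities.
        self.center = nn.Conv2d(256, 1, kernel_size=1)
        # "How big": one-channel map regressing the object scale.
        self.scale = nn.Conv2d(256, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor):
        x = self.shared(feats)
        center_heatmap = torch.sigmoid(self.center(x))  # values in (0, 1)
        scale_map = self.scale(x)                        # regressed scale
        return center_heatmap, scale_map


if __name__ == "__main__":
    # Hypothetical backbone output: 1/4-resolution features of a 512x512
    # image (batch=1, channels=256, 128x128 spatial grid).
    feats = torch.randn(1, 256, 128, 128)
    head = CenterScaleHead(in_channels=256)
    centers, scales = head(feats)
    print(centers.shape, scales.shape)  # both: torch.Size([1, 1, 128, 128])
```

A full detector would additionally need a ground-truth heatmap encoding, suitable losses (e.g., a focal-style loss for the center heatmap and an L1-style loss for the scales), and a decoding step that turns local maxima of the heatmap plus the predicted scales into bounding boxes; those pieces are omitted from this sketch.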
File | Size | Format | Access
---|---|---|---
1-s2.0-S0031320322005519-main.pdf | 3.08 MB | Adobe PDF | Open access (Creative Commons license)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.