
Center and Scale Prediction: Anchor-free Approach for Pedestrian and Face Detection

Hasan Irtiza
2023-01-01

Abstract

Object detection generally requires sliding-window classifiers in traditional approaches or anchor-box-based predictions in modern deep learning approaches. However, either of these approaches requires tedious configurations of bounding boxes. Generally speaking, single-class object detection is to tell where the object is and how big it is. Traditional methods combine the "where" and "how" subproblems into a single one through the overall judgement of various scales of bounding boxes. In view of this, we are interested in whether the "where" and "how" subproblems can be separated into two independent subtasks, to ease both the problem definition and the difficulty of training. Accordingly, we provide a new perspective in which detecting objects is approached as a high-level semantic feature detection task. Like edges, corners, blobs and other feature detectors, the proposed detector scans for feature points all over the image, for which convolution is naturally suited. However, unlike these traditional low-level features, the proposed detector goes for a higher-level abstraction, that is, we are looking for central points where there are objects, and modern deep models are already capable of such a high-level semantic abstraction. Like blob detection, we also predict the scales of the central points, which is also a straightforward convolution. Therefore, in this paper, pedestrian and face detection is simplified as a straightforward center and scale prediction task through convolutions. This way, the proposed method enjoys an anchor-free setting, considerably reducing the difficulty of training configuration and hyper-parameter optimization. Though structurally simple, it presents competitive accuracy on several challenging benchmarks, including pedestrian detection and face detection. Furthermore, a cross-dataset evaluation is performed, demonstrating a superior generalization ability of the proposed method.
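To make the center-and-scale idea concrete, the following is a minimal sketch (not the authors' released implementation) of such a prediction head in PyTorch. The module name CenterScaleHead, the 256-channel feature width, and the use of the scale channel as a log-height map are illustrative assumptions; only the overall structure (a shared convolution followed by two 1x1 convolutions, one for "where" and one for "how big") reflects the idea described in the abstract.

```python
# Minimal sketch, assuming an upstream backbone that outputs a feature map.
# Not the authors' implementation; names and channel sizes are hypothetical.
import torch
import torch.nn as nn

class CenterScaleHead(nn.Module):
    """Predicts a per-pixel center heatmap and a scale map via convolutions."""
    def __init__(self, in_channels: int = 256):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # 1x1 convolutions: one channel for "where" (center), one for "how big" (scale)
        self.center = nn.Conv2d(256, 1, kernel_size=1)
        self.scale = nn.Conv2d(256, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor):
        x = self.reduce(feats)
        center_map = torch.sigmoid(self.center(x))  # probability of an object center
        scale_map = self.scale(x)                    # e.g. log-height at each location
        return center_map, scale_map

# Usage on a dummy feature map (batch 1, 256 channels, a downsampled grid)
feats = torch.randn(1, 256, 160, 160)
centers, scales = CenterScaleHead()(feats)
print(centers.shape, scales.shape)  # both torch.Size([1, 1, 160, 160])
```

Because both outputs are dense maps produced by plain convolutions, no anchor boxes or box-matching rules are needed; detections can be read off as local maxima of the center map together with the scale predicted at the same location.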
2023
Object Detection, Convolutional Neural Networks, Feature Detection, Anchor-free
Files in this item:
File: 1-s2.0-S0031320322005519-main.pdf (open access)
License: Creative Commons
Size: 3.08 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1155347
Citations
  • Scopus: 33
  • Web of Science (ISI): 25