ADAGSS: Automatic Dataset Generation for Semantic Segmentation
L. Palladino; B. Maris; P. Fiorini
2020-01-01
Abstract
A common issue in medical deep learning research is the creation of datasets for training neural networks. Medical data collection is constrained by privacy laws, and even when a large amount of medical data is available, processing it is often time-consuming. This problem can be mitigated by using neural network architectures that achieve good predictive precision with few images (e.g. U-Net). For semantic segmentation, dataset generation is even more cumbersome, since it requires segmentation masks to be created manually. Automatic ground-truth creation techniques such as filtering, thresholding and Self-Organizing Maps (SOM) may be employed. These automatic methods can be very powerful and useful, but they always have a bottleneck phase: data validation. Because the algorithms can sometimes fail, their output must be validated manually before it can be included in a training dataset. In this work, we propose a method to automate this phase by moving manual intervention to an easier task: instead of creating masks and then validating them manually, we train a convolutional neural network to classify segmentation quality, so that validation is performed automatically. An initial manual phase is still required, but the classification task needs a smaller dataset to train the classification network. After this phase, creating similar datasets will require less effort. The procedure is based on the observation that high classification precision requires less data than high precision in semantic segmentation. A high classification score makes it possible to automate the validation step of dataset creation by discarding failure cases. Being able to produce larger datasets in less time can lead to higher precision in semantic segmentation.

File | Access | Type | License | Size | Format
---|---|---|---|---|---
Automatic Dataset Generation for Semantic Segmentation (ADAGSS).pdf | Open access | Pre-print | Public domain | 384.9 kB | Adobe PDF
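The validation idea described in the abstract can be illustrated with a minimal sketch: an automatic method proposes a mask for each image, a quality classifier scores the (image, mask) pair, and only pairs that pass are kept. Everything named here is an assumption made for illustration and does not come from the paper: `threshold_mask`, `build_dataset`, the `min_score` cut-off, and above all `foreground_fraction`, which is a trivial stand-in for the trained convolutional classifier.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]  # grayscale image, pixel values in [0, 1]
Mask = List[List[int]]     # binary mask, 1 = foreground


def threshold_mask(image: Image, level: float = 0.5) -> Mask:
    """Propose a ground-truth mask automatically by simple thresholding."""
    return [[1 if px >= level else 0 for px in row] for row in image]


def foreground_fraction(image: Image, mask: Mask) -> float:
    """Placeholder quality score, NOT the CNN from the abstract.

    Returns 1.0 when 10% to 60% of the pixels are foreground, else 0.0.
    A real pipeline would call a trained classifier here instead.
    """
    total = sum(len(row) for row in mask)
    foreground = sum(sum(row) for row in mask)
    fraction = foreground / total if total else 0.0
    return 1.0 if 0.1 <= fraction <= 0.6 else 0.0


def build_dataset(
    images: List[Image],
    propose_mask: Callable[[Image], Mask],
    quality_score: Callable[[Image, Mask], float],
    min_score: float = 0.5,
) -> Tuple[List[Tuple[Image, Mask]], List[int]]:
    """Keep (image, mask) pairs that pass validation; report rejected indices."""
    kept: List[Tuple[Image, Mask]] = []
    rejected: List[int] = []
    for index, image in enumerate(images):
        mask = propose_mask(image)
        if quality_score(image, mask) >= min_score:
            kept.append((image, mask))
        else:
            rejected.append(index)
    return kept, rejected


# Two toy 2x2 images: the second is uniformly bright, so thresholding
# marks every pixel as foreground and the placeholder score rejects it.
images = [
    [[0.9, 0.1], [0.2, 0.1]],
    [[0.9, 0.8], [0.7, 0.9]],
]
kept, rejected = build_dataset(images, threshold_mask, foreground_fraction)
```

Passing the scorer as a callable keeps the filtering loop independent of the model: swapping the placeholder for a trained network only requires a function with the same `(image, mask) -> float` shape.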
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.