Extracting a statistically significant result from video of natural phenomenon can be difficult for two reasons: (i) there can be considerable natural variation in the observed behaviour and (ii) computer vision algorithms applied to natural phenomena may not perform correctly on a significant number of samples. This study presents one approach to clean a large noisy visual tracking dataset to allow extracting statistically sound results from the image data. In particular, analyses of 3.6 million underwater trajectories of a fish with the water temperature at the time of acquisition are presented. Although there are many false detections and incorrect trajectory assignments, by a combination of data binning and robust estimation methods, reliable evidence for an increase in fish speed as water temperature increases are demonstrated. Then, a method for data cleaning which removes outliers arising from false detections and incorrect trajectory assignments using a deep learning-based clustering algorithm is proposed. The corresponding results show a rise in fish speed as temperature goes up. Several statistical tests applied to both cleaned and not-cleaned data confirm that both results are statistically significant and show an increasing trend. However, the latter approach also generates a cleaner dataset suitable for other analysis.

Extracting statistically significant behaviour from fish tracking data with and without large dataset cleaning

Beyan, C.;
2018-01-01

Abstract

Extracting a statistically significant result from video of natural phenomenon can be difficult for two reasons: (i) there can be considerable natural variation in the observed behaviour and (ii) computer vision algorithms applied to natural phenomena may not perform correctly on a significant number of samples. This study presents one approach to clean a large noisy visual tracking dataset to allow extracting statistically sound results from the image data. In particular, analyses of 3.6 million underwater trajectories of a fish with the water temperature at the time of acquisition are presented. Although there are many false detections and incorrect trajectory assignments, by a combination of data binning and robust estimation methods, reliable evidence for an increase in fish speed as water temperature increases are demonstrated. Then, a method for data cleaning which removes outliers arising from false detections and incorrect trajectory assignments using a deep learning-based clustering algorithm is proposed. The corresponding results show a rise in fish speed as temperature goes up. Several statistical tests applied to both cleaned and not-cleaned data confirm that both results are statistically significant and show an increasing trend. However, the latter approach also generates a cleaner dataset suitable for other analysis.
2018
fish, big data analysis, restricted Boltzmann machine
File in questo prodotto:
File Dimensione Formato  
IJ10_Extracting Statistically Significant Behaviour from Fish Tracking Data With and Without Large Dataset Cleaning.pdf

solo utenti autorizzati

Tipologia: Versione dell'editore
Licenza: Copyright dell'editore
Dimensione 2.16 MB
Formato Adobe PDF
2.16 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/1121845
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 21
  • ???jsp.display-item.citation.isi??? ND
social impact