CATALOGO DEI PRODOTTI DELLA RICERCA

Recent advances in DNA microarray technology have made it possible to measure the expression level of several thousand of genes simultaneously. The gene expression profiles obtained from microarray techniques have provided the opportunity of early diagnosis of cancer with the use of supervised learning algorithms. As a simple, effective and nonparametric classification method, k-Nearest Neighbor (k-NN) algorithm has recently been applied for the problem of cancer diagnosis and categorization. An obvious problem of traditional k-NN algorithm is that, when the density of training data is uneven, the precision of classification may reduce due to the consideration of first k nearest neighbors but not the differences of distances. A recent solution for this problem is adopting the theory of fuzzy sets and constructing a new membership function based on the similarities. This study has been conducted to demonstrate in what degree the fuzzification of k-NN algorithm can improve the prediction accuracy of cancer classification based on gene expression data. According to the results of the experiments over a six distinct benchmarking dataset spanning 27 diagnostic categories, it reveals that the fuzzy k-NN algorithm promotes the accuracy of cancer classification to a certain degree. Results also encourage the use of this fuzzification technique on similar problems in computational biology.

A fuzzy k-NN approach for cancer diagnosis with microarray gene expression data

C. Beyan;H. Ogul

2008-01-01

Abstract

Recent advances in DNA microarray technology have made it possible to measure the expression level of several thousand of genes simultaneously. The gene expression profiles obtained from microarray techniques have provided the opportunity of early diagnosis of cancer with the use of supervised learning algorithms. As a simple, effective and nonparametric classification method, k-Nearest Neighbor (k-NN) algorithm has recently been applied for the problem of cancer diagnosis and categorization. An obvious problem of traditional k-NN algorithm is that, when the density of training data is uneven, the precision of classification may reduce due to the consideration of first k nearest neighbors but not the differences of distances. A recent solution for this problem is adopting the theory of fuzzy sets and constructing a new membership function based on the similarities. This study has been conducted to demonstrate in what degree the fuzzification of k-NN algorithm can improve the prediction accuracy of cancer classification based on gene expression data. According to the results of the experiments over a six distinct benchmarking dataset spanning 27 diagnostic categories, it reveals that the fuzzy k-NN algorithm promotes the accuracy of cancer classification to a certain degree. Results also encourage the use of this fuzzification technique on similar problems in computational biology.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2008
			
	Parole Chiave
	
				fuzzy k-nearest neighbor, machine learning, microarray data anlaysis
			
	Appare nelle tipologie:
	
				04.01 Contributo in atti di convegno

File in questo prodotto:

File	Dimensione	Formato
IC01_A Fuzzy K-NN Approach For Cancer Diagnosis with Microarray Gene Expression Data.pdf solo utenti autorizzati Tipologia: Versione dell'editore Licenza: Copyright dell'editore Dimensione 135.32 kB Formato Adobe PDF Visualizza/Apri Richiedi una copia	135.32 kB	Adobe PDF	Visualizza/Apri Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11562/1121927

Citazioni

ND

ND

ND

social impact