Discovering Evolving Temporal Information: Theory and Application to Clinical Databases

Sala, Pietro; Combi, Carlo; Mantovani, Matteo; Rizzi, Romeo
2020-01-01

Abstract

Functional dependencies (FDs) allow us to represent database constraints corresponding to requirements such as “patients having the same symptoms undergo the same medical tests.” Some research efforts have focused on extending such dependencies to also consider temporal constraints such as “patients having the same symptoms undergo the same medical tests in the next period.” Temporal functional dependencies can represent this kind of temporal constraint in relational databases. Another extension of FDs allows one to represent approximate functional dependencies (AFDs), such as “patients with the same symptoms generally undergo the same medical tests.” An AFD allows data to deviate from the defined constraint up to a user-defined percentage. Approximate temporal functional dependencies (ATFDs) merge the concepts of temporal functional dependency and approximate functional dependency. Among the different kinds of ATFDs, Approximate Pure Temporally Evolving Functional Dependencies (APE-FDs for short) allow one to detect patterns in the evolution of data in the database and to discover dependencies such as “For most patients with the same initial diagnosis, the same medical test is prescribed after the occurrence of the same symptom.” Mining ATFDs from large databases may be computationally expensive. In this paper, we focus on APE-FDs and prove that, unfortunately, verifying a single APE-FD over a given database instance is in general NP-complete. To cope with this problem, we propose a framework for mining complex APE-FDs in real-world data collections. In the framework, we designed and applied sound and advanced model-checking techniques. To prove the feasibility of our proposal, we used real-world databases from two medical domains (namely, psychiatry and pharmacovigilance) and tested the running prototype we developed on such databases.
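As an illustration of the approximate (non-temporal) case only, the following minimal sketch checks a plain AFD such as "symptom → test" against a small in-memory table. It assumes the error of a dependency is measured as the smallest fraction of rows that must be removed for the dependency to hold exactly, which is a common choice but is not confirmed by this abstract. The function names, attribute names, and sample rows are all hypothetical. It is not the APE-FD verification procedure of the paper, which the abstract states is NP-complete in general.

```python
from collections import Counter, defaultdict


def afd_error(rows, lhs, rhs):
    """Smallest fraction of rows to remove so that lhs -> rhs holds exactly.

    Assumption: this "fraction of rows to delete" measure is used as the
    error of an approximate FD; the abstract does not name a measure.
    """
    if not rows:
        return 0.0
    # For each distinct lhs value, count how often each rhs value occurs.
    groups = defaultdict(Counter)
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        groups[x][y] += 1
    # In each group, keep only the rows carrying the most frequent rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1 - kept / len(rows)


def holds_approximately(rows, lhs, rhs, epsilon):
    """True if the dependency is violated by at most a fraction epsilon of rows."""
    return afd_error(rows, lhs, rhs) <= epsilon


# Hypothetical sample data: one "fever" patient deviates from the majority.
rows = [
    {"symptom": "fever", "test": "blood"},
    {"symptom": "fever", "test": "blood"},
    {"symptom": "fever", "test": "xray"},
    {"symptom": "cough", "test": "xray"},
]

error = afd_error(rows, ["symptom"], ["test"])
print(error, holds_approximately(rows, ["symptom"], ["test"], epsilon=0.25))
```

With this sample, removing the single deviating "fever" row makes the dependency exact, so a user-defined threshold of 25% accepts it while a stricter threshold rejects it.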
Keywords: Temporal data mining, Temporal functional dependencies, Temporal databases, Distributed algorithms, Complexity, Pharmacovigilance, Psychiatric case register
Files in this item:
SNComputer2020.pdf (open access)
Type: Publisher's version
License: Public domain
Size: 8.99 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1023542