Fuzzy Techniques for XML Data Smushing

Damiani, E.; Oliboni, Barbara; Tanca, L.

The recently proposed notion of a Semantic Web requires XML/RDF processing techniques that can locate, extract and organise heterogeneous information contained in XML documents from different sites, while dealing flexibly with differences in structure and tag vocabulary. Such techniques should work even when tagging follows non-informative schemata, and even when no schema is available at all. In this paper, we review the main problems in processing and restructuring large amounts of XML-based data, and propose some solutions within a flexible query and processing model for well-formed XML documents.