Title: V-DIF: Virtual Data Integration Framework
Year of Publication: 2018
Publisher: International Journal of Computer Systems (IJCS)
ISSN: 2394-1065
Series: Volume 05, Number 5, May 2018
Authors: Ali Zidane El Qutaany, Ali Hamid El Bastawissy, Osman Hegazi


Data integration is the process of combining data residing at distributed, autonomous, and heterogeneous data sources and providing users with a unified global schema (GS). Users pose their queries in terms of the GS and expect accurate, complete, and unambiguous answers. A data integration system processes users’ queries transparently by translating each query into a set of sub-queries over the participating local sources (LSs) through the mappings defined between the GS and the LSs. Even if none of the participating data sources has internal inconsistencies, mutual inconsistencies appear in the answers to users’ queries as a result of the integration process. To ensure unambiguous answers, the data integration process should be followed by detecting and resolving such inconsistencies. Most of the data integration frameworks introduced in the literature concentrate mainly on the data integration process and avoid or ignore the other two processes (inconsistency detection and resolution). A few frameworks consider detecting and resolving inconsistencies but do not consider the interfacing, or linkage, between the three processes. Interfacing means that each process serves its successor by preparing the parameters that successor needs. We developed a Virtual Data Integration Framework (V-DIF) and tested it over 8 heterogeneous information sources. V-DIF meets most of the users’ expectations. In this article, the theoretical part of the framework is introduced to ensure the interfacing between the three processes.
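The three processes named in the abstract can be illustrated with a small Python sketch. It is not the V-DIF implementation: the two in-memory sources, the attribute mappings, the function names, and the "prefer source A" resolution rule are all assumptions made for illustration. The sketch shows a global-schema query translated into per-source sub-queries through mappings, a mutual inconsistency that appears only after the answers are combined, and each step handing its successor the parameters it needs (source provenance, then grouped duplicates).

```python
from collections import defaultdict

# Hypothetical local sources (LSs); each uses its own attribute names.
SOURCES = {
    "A": [
        {"emp_id": 1, "full_name": "Sara Ali", "town": "Cairo"},
        {"emp_id": 2, "full_name": "Omar Nabil", "town": "Giza"},
    ],
    "B": [
        {"id": 1, "name": "Sara Ali", "city": "Alexandria"},
        {"id": 3, "name": "Mona Adel", "city": "Cairo"},
    ],
}

# Hypothetical mappings: global-schema attribute -> local attribute, per source.
MAPPINGS = {
    "A": {"id": "emp_id", "name": "full_name", "city": "town"},
    "B": {"id": "id", "name": "name", "city": "city"},
}


def run_subquery(source, attrs):
    """Rewrite a projection over the GS into the source's own attribute names."""
    mapping = MAPPINGS[source]
    return [
        {**{g: row[mapping[g]] for g in attrs}, "_source": source}
        for row in SOURCES[source]
    ]


def integrate(attrs):
    """Integration: union of sub-query answers, keeping provenance for detection."""
    rows = []
    for source in SOURCES:
        rows.extend(run_subquery(source, attrs))
    return rows


def detect(rows, key="id"):
    """Detection: group duplicates by key and record attributes whose values differ."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    conflicts = {}
    for k, group in groups.items():
        for attr in group[0]:
            if attr == "_source":
                continue
            values = {r[attr] for r in group}
            if len(values) > 1:
                conflicts[(k, attr)] = sorted(values)
    return groups, conflicts


def resolve(groups, preferred="A"):
    """Resolution: one row per key, taken from the preferred source when present."""
    resolved = []
    for k in sorted(groups):
        # False sorts before True, so rows from the preferred source come first.
        winner = sorted(groups[k], key=lambda r: r["_source"] != preferred)[0]
        resolved.append({a: v for a, v in winner.items() if a != "_source"})
    return resolved


rows = integrate(["id", "name", "city"])
groups, conflicts = detect(rows)
answer = resolve(groups)
```

Neither source is inconsistent on its own, yet the combined answer holds two different cities for the entity with id 1. A source-preference rule is only one possible resolution strategy; it is used here because it needs nothing beyond the provenance tag that the integration step already attaches.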


Data Integration, virtual integration, detectors, data fusion, duplicate and inconsistency detection, duplicate and inconsistency resolution.