Big Data Integration - State of the Art & Challenges

Presented on 23 Oct at IC3K 2014

Keynote Lecturer: Sonia Bergamaschi

Abstract: Big data is a popular term for describing the exponential growth, availability and use of information, both structured and unstructured. Much has been written on the big data trend and its potential for innovation and growth of enterprises. The advice of IDC (one of the premier advisory firms specializing in information technology) for organizations and IT leaders is to focus on the ever-increasing volume, variety and velocity of information that forms big data. In most cases, such a huge volume of data comes from multiple sources and across heterogeneous systems; thus, data have to be linked, matched, cleansed and transformed. Moreover, it is necessary to determine how disparate data relate to common definitions and how to systematically integrate structured and unstructured data assets to produce useful, high-quality and up-to-date information. The research area of Data Integration, active since the 1990s, has provided good techniques for facing the above issues in a unifying framework, Relational Databases (RDB), with reference to a less complex scenario (smaller volume, variety and velocity). Moreover, simpler forms of integration among different databases can be efficiently resolved by the Data Federation technologies used for DBMSs today. One possible choice is to adopt RDB as a general framework for big data integration and to address the issues above, namely volume, variety, variability and velocity, by using more powerful RDBMS technologies enhanced with data integration techniques. On the other hand, new technologies have come into play: NoSQL systems and technologies, data warehouse appliance platforms provided by the major software players, data governance platforms, etc. In this talk, Prof. Sonia Bergamaschi will provide an overview of this exciting field, which will become more and more important.