Mining Big and Complex Data
Increasingly often, data mining has to learn predictive models from big data, which may have many examples or many input/output dimensions and may be streaming at very high rates. Contemporary predictive modeling problems may also be complex in a number of other ways: they may involve (a) structured data, both as input and output of the prediction process, (b) incompletely/partially labelled data, and (c) data placed in a spatio-temporal or network context. Data mining methods that can handle such problems are being developed within the EU-funded MAESTRA project (http://maestra-project.eu/).
The talk will first introduce the different tasks encountered when learning from big and complex data. It will then present some methods for solving such tasks, focusing on structured-output prediction, semi-supervised learning (from incompletely annotated data), and learning from data streams. Some illustrative applications of these methods will also be described.
About the lecturer
Sašo Džeroski is a scientific councillor at the Jožef Stefan Institute and the Centre of Excellence for Integrated Approaches in Chemistry and Biology of Proteins, both in Ljubljana, Slovenia. He is also a full professor at the Jožef Stefan International Postgraduate School and the Faculty of Computer and Information Science, University of Ljubljana. His research group investigates machine learning and data mining (including structured output prediction and automated modeling of dynamic systems) and their applications (in environmental sciences, incl. ecology, and life sciences, incl. systems biology).
He has co-authored/co-edited more than ten books, including “Inductive Logic Programming”, “Relational Data Mining”, “Learning Language in Logic”, “Computational Discovery of Scientific Knowledge” and “Inductive Databases and Constraint-Based Data Mining”. He has participated in many international research projects and coordinated two of them in the past. He currently leads the FET XTrack project MAESTRA (Learning from Massive, Incompletely annotated, and Structured Data) and is one of the principal investigators in the FET Flagship Human Brain Project.
Sašo Džeroski received his Ph.D. degree in computer science from the University of Ljubljana in 1995. He was awarded a Jožef Stefan Golden Emblem Prize for his outstanding doctoral dissertation. Immediately thereafter, he received a fellowship from ERCIM, the European Research Consortium for Informatics and Mathematics, awarded to 5% of applicants. In 2008, he was awarded the title of ECCAI Fellow by the European Association for Artificial Intelligence (at that time called the European Coordinating Committee on Artificial Intelligence) for “Pioneering Work in the field of AI and Outstanding Service for the European AI community”. In 2015, he became a foreign member of the Macedonian Academy of Sciences and Arts.