We are pleased to announce this year's joint EDBT/ICDT keynote speakers:

Dependence, Independence, and Incomplete Information

Erich Grädel
Mathematical Foundations of Computer Science Group
RWTH Aachen University, Germany


We discuss how notions such as dependence and independence can be incorporated into logical systems, and how they are connected to imperfect information in the associated evaluation games.

An important feature of logics of dependence and independence is a semantics that, unlike Tarski semantics, is not based on single assignments (mapping variables to elements of a structure) but on sets of assignments. There are interesting connections to database dependency theory.

We design model-checking games for logics with this kind of semantics in a general and systematic way and study their algorithmic properties. A number of examples are provided that show how such logics are used to reason about dependence and independence and how to express familiar combinatorial problems in a somewhat unexpected way. Based on our games, we also provide a complexity analysis of these logics.


Erich Grädel got his PhD in Mathematics at the University of Basel (Switzerland). He had research fellowship and lecturer positions in Pisa, Zürich, Berkeley, and Basel before he became Professor for Mathematical Foundations of Computer Science at RWTH Aachen University, Germany. He has had positions as visiting professor in Paris, Bordeaux, and Vienna.

His main research interests are in mathematical logic, mathematical foundations of computer science, and the theory of infinite games. He has written more than 80 research papers in these areas and co-authored four books. He is leading the European Research Networking Programme on Games for Design and Verification (GAMES).

Towards an Ecosystem for Structured Data on the Web

Alon Halevy
Structured Data Management Research Group
Google Inc., USA


The World-Wide Web contains vast quantities of structured data on a variety of domains, such as hobbies, products and reference data. Moreover, the Web provides a platform that can encourage publishing more data sets from governments and other public organizations and support new data management opportunities, such as effective crisis response, data journalism and crowd-sourcing data sets. To enable such widespread dissemination and use of structured data on the Web, we need to create an ecosystem that makes it easier for users to discover, manage, visualize and publish structured data on the Web.

I will describe some of the efforts we are conducting at Google towards this goal and the technical challenges they raise. In particular, I will describe Google Fusion Tables, a service that makes it easy for users to contribute data and visualizations to the Web and to perform data integration. I will then describe the WebTables project, which attempts to discover high-quality tables on the Web and recover their semantics to provide effective search over the resulting collection of 200 million tables.


Alon Halevy heads the Structured Data Management Research group at Google. Prior to that, he was a professor of Computer Science at the University of Washington in Seattle, where he founded the database group. In 1999, Dr. Halevy co-founded Nimble Technology, one of the first companies in the Enterprise Information Integration space, and in 2004, he founded Transformic, a company that created search engines for the deep web and was acquired by Google. Dr. Halevy is a Fellow of the Association for Computing Machinery, received the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2000, and was a Sloan Fellow (1999-2000). He received his Ph.D. in Computer Science from Stanford University in 1993. Halevy is also a coffee culturalist and published the book "The Infinite Emotions of Coffee", bringing together stories about coffee culture from 30 countries. He is a co-author of the book "Principles of Data Integration", to be published in the summer of 2012.

Inside "Big Data Management": Ogres, Onions, or Parfaits?

Michael J. Carey
Computer Science Department
University of California, Irvine, USA


The year is 2012, and everyone everywhere is buzzing about "Big Data". Virtually everyone, ranging from big Web companies to traditional enterprises to physical science researchers to social scientists, is either already experiencing or anticipating unprecedented growth in the amount of data available in their world and the new opportunities and great untapped value that successfully taming the "Big Data" beast will hold. It's almost impossible to pick up an issue of anything from the trade press, or even the popular press, without hearing something about "Big Data". It's a new era! Or is it...?

The database community has been all about "Big Data" since its inception, although the meaning of "Big" has obviously changed since ~1980 when work on parallel databases as we know them today was getting underway. That work continued until "Shared Nothing" parallel database systems were deployed commercially (by Teradata, Tandem, and others) and widely accepted by ~1990. Researchers in the database community then moved on. "Big Data" has been reborn in the 2000s, with massive, Web-driven challenges of scale driving Google, Yahoo!, Facebook, Microsoft, and others to develop new architectures for storing, accessing, and analyzing "Big Data". The database community has renewed energy, and it is quickly bringing its expertise in storage, indexing, set-oriented processing, and declarative languages to bear on the problem of "Big Data Management".

This talk (and its accompanying paper) will take a look at the history of "Big Data" as well as at today's activities and architectures from the (perhaps biased) perspective of a "database guy" who has been watching over the years and is now working on "Big Data" problems. The focus will be on architectural issues, and particularly on the various components and layers that have been developed recently (in open source and elsewhere) and on how they are being used (or abused) to tackle the various challenges posed by "Big Data". Also covered will be the architectural approach being taken at UC Irvine, in the context of the ASTERIX project, where we are developing our own answers to the question of the "right" components and the "right" set of layers for taming the "Big Data" beast. We will also share our views on what some of the big open questions are and how the emerging data-intensive computing community might best go about studying and answering them.


Michael J. Carey is currently a Bren Professor of Information and Computer Sciences at UC Irvine. Prior to rejoining academia in 2008, he worked at BEA Systems as the chief architect and an engineering director for the AquaLogic Data Services Platform team. Carey also spent a number of years as a Professor at the University of Wisconsin-Madison, at IBM Almaden as a database researcher/manager, and as a Fellow at e-commerce software startup Propel Software. Carey is an ACM Fellow, a member of the National Academy of Engineering, and a recipient of the ACM SIGMOD Edgar F. Codd Innovations Award. His current research interests are centered around data-intensive computing and scalable data management.

Graph Pattern Matching Revised for Social Network Analysis

Wenfei Fan
LFCS, School of Informatics
University of Edinburgh, UK


Graph pattern matching is fundamental to social network analysis. Traditional techniques are subgraph isomorphism and graph simulation. However, these notions often impose too strong a topological constraint on graphs to find meaningful matches. Worse still, graphs in the real world are typically large, with millions of nodes and billions of edges, and it is often prohibitively expensive to compute matches in such graphs. These challenges call for revising the notions of graph pattern matching and for developing techniques for querying large graphs, in order to identify social communities or groups effectively and efficiently.

This talk aims to provide an overview of recent advances in the study of graph pattern matching in social networks. We introduce several revisions of the traditional notions of graph pattern matching, to find sensible matches in social networks. We also look into techniques for coping with the sheer size of social graphs, based on bounded incremental graph pattern matching, match preserving graph compression, and distributed pattern matching. We present an account of results and open problems in connection with these issues.


Wenfei Fan is Professor (Chair) of Web Data Management in the School of Informatics, University of Edinburgh, UK. He is a Fellow of the Royal Society of Edinburgh, UK, a National Professor of the Thousand-Talent Program and a Yangtze River Scholar, China. He received his PhD from the University of Pennsylvania, and his MS and BS from Peking University. He is a recipient of the Alberto O. Mendelzon Test-of-Time Award of ACM PODS 2010, the Best Paper Award for VLDB 2010, the Roger Needham Award in 2008 (UK), the Best Paper Award for ICDE 2007, the Outstanding Overseas Young Scholar Award in 2003, the Best Paper of the Year Award for Computer Networks in 2002, and the Career Award in 2001 (USA). His current research interests include database theory and systems, in particular data quality, data integration, distributed query processing, query languages, recommender systems, social networks and Web services.
