The computational complexity of query answering under updates
Abstract:
Query evaluation is one of the most fundamental tasks in databases, and a vast amount of literature is devoted to the complexity of this problem. This talk will focus on query evaluation in the “dynamic setting”, where the database may be updated by inserting or deleting tuples. In this setting, an evaluation algorithm receives a query Q and an initial database D and starts with a preprocessing phase that computes a suitable data structure to represent the result of evaluating Q on D. After every database update, the data structure is updated so that it represents the result of evaluating Q on the updated database. The data structure shall be designed in such a way that it quickly provides the query result, preferably in constant time (i.e., independent of the database size). We focus on the following flavours of query evaluation.
(1) Testing: Decide whether a given tuple t is contained in Q(D).
(2) Counting: Compute the number of tuples that belong to Q(D).
(3) Enumeration: Enumerate Q(D) with a bounded delay between the output tuples. Here, as usual, Q(D) denotes the k-ary relation obtained by evaluating a k-ary query Q on a relational database D. For Boolean queries, all three tasks boil down to
(4) Answering: Decide if Q(D) is non-empty.
Compared to the dynamic descriptive complexity framework introduced by Patnaik and Immerman (1997), which focuses on the expressive power of first-order logic on dynamic databases and has led to a rich body of literature, we are interested in the computational complexity of query evaluation. We say that a query evaluation algorithm is efficient if the update time is either constant or at most polylogarithmic in the size of the database. In this talk I want to give an overview of recent results in this area.