String Operations in Query Languages
paper (PODS 2001)
Michael Benedikt, Leonid Libkin, Thomas Schwentick and Luc Segoufin
We study relational calculi with support for string operations. Most prior
proposals were based on adding the operation of concatenation to first-order
logic. Such an extension is problematic as the relational calculus becomes
computationally complete, which in turn implies strong limits on the ability to
perform optimization and static analysis of properties such as query safety.
In contrast, we look at extensions of relational calculus that have nice
expressiveness, decidability, and safety properties, while corresponding to
sets of string operations used in SQL. We start with an extension based on the
string ordering and LIKE predicates. We then extend this basic model to include
string length comparison. While both of these share some of the attractive
properties of relational calculus (low data complexity for generic queries,
effective syntax for safe queries, correspondence with an algebra), there is a
large gap between these calculi in expressive power and complexity. The smaller
`basic model' has low data complexity for all queries, but is quite limited in
expressive power. The latter handles many natural string operations, but
includes queries with high data complexity. We thus also explore the space
between these two extremes. We consider two intermediate languages: the first
extends our base language with functions that trim/add leading characters, and
the other extends it by adding the full power of regular-expression pattern
matching. We show that both these extensions inherit many of the attractive
properties of the basic model: they both have corresponding algebras expressing
safe queries, and low complexity of query evaluation.
Back to Luc Segoufin's home page.