The QUAGGA Project
The QUAGGA project has moved into phase 2 as the new Data Linkage project.



Real data are often dirty. Despite active research on integrity constraint enforcement and data cleaning, data in real database applications remain dirty. To make matters worse, both the diverse formats and usages of modern data and the demand for large-scale data handling make the problem even harder. The research goal of the QUAGGA project is, therefore, to contribute to the effective improvement of data quality by investigating foundational theories, developing efficient, effective and scalable algorithms, and building tools and systems. To surmount the challenges for which conventional solutions no longer work, we identify four tasks: (1) context-aware error detection; (2) scalable error detection; (3) fixing errors with tools or in systems; and (4) prototyping and evaluation.

In particular, we revisit the following well-defined and well-studied (but greatly overlapping and related) problems from a database perspective and with scalability in mind. Eventually, we aim to build an integrated solution and toolkit in SQL/RDBMS that is general enough to be used for various problems across diverse domains.

The quagga (Equus quagga) is a recently extinct mammal, closely related to horses and zebras. It was a yellowish-brown zebra with stripes only on its head, neck and forebody. The quagga was native to desert areas of the African continent until it was exterminated in the wild in the 1870s. The last captive quaggas died in Europe in the 1880s.

The project logo is derived from a figure from The Quagga Project at the South African Museum. The Quagga Project is an attempt by a group of dedicated people in South Africa to bring an animal back from extinction and reintroduce it into reserves in its former habitat.

The figure on the right shows the only quagga ever photographed alive, a mare at the London Zoo. Five photographs of her are known, taken by Frederick York and Frank Haes circa 1870.