Data Democratisation with Deep Learning: An Analysis of Text-to-SQL Systems

Presented at the 2022 Web Conference (TWC/WWW)

Abstract

In the age of the Digital Revolution, almost all human activities, from industrial and business operations to medical and academic research, rely on the constant integration and utilisation of ever-increasing volumes of data. However, the explosive volume and increasing complexity of data make querying all the more challenging, even for experts. For this reason, numerous text-to-SQL systems have been developed that enable querying relational databases using natural language. Recent advances in deep neural networks, along with the creation of two large datasets made specifically for training text-to-SQL systems, have paved the way for a novel and very promising research area. This tutorial takes a deep dive into this area, covering state-of-the-art techniques for natural language representation in neural networks, the benchmarks that sparked research and competition, recent text-to-SQL systems that use deep learning techniques, and open problems and research opportunities.

Outline

  1. The Text-to-SQL Problem
  2. Available Benchmarks
  3. Natural Language Representation
  4. A Taxonomy of Text-to-SQL Deep Learning Systems
  5. Key Text-to-SQL Systems
  6. Challenges and Research Opportunities
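
As a rough illustration of the text-to-SQL problem named in the first outline item, the sketch below pairs a natural language question with one SQL query that answers it, and runs that query on an in-memory SQLite database. The table, its rows, and the question are hypothetical and invented for this example; they are not taken from the tutorial or from any benchmark. The SQL is written by hand here, whereas a text-to-SQL system would have to predict it from the question and the schema.

```python
import sqlite3

# Hypothetical toy schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ada', 'Research', 120),
        ('Bob', 'Research', 100),
        ('Cy',  'Sales',     90);
""")

# What a text-to-SQL system receives: a question (plus the database schema).
question = "What is the average salary in the Research department?"

# What it should produce: a SQL query whose result answers the question.
# Hand-written here; a real system would generate this string.
target_sql = "SELECT AVG(salary) FROM employees WHERE department = 'Research'"

# Executing the query yields the answer to the original question.
answer = conn.execute(target_sql).fetchone()[0]
print(question, "->", answer)
```

Several different SQL strings can answer the same question, so comparing the results of executing the queries is one way to judge whether a predicted query is correct.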

Material

Feel free to download the slides of the tutorial here.