DARE Lab | Data Democratisation with Deep Learning: The Anatomy of a Natural Language Data Interface

Abstract

In the age of the Digital Revolution, almost all human activities, from industrial and business operations to medical and academic research, rely on the constant integration and utilisation of ever-increasing volumes of data. However, the explosive volume and complexity of data make querying and exploration challenging even for experts, and make the need to democratise access to data, even for non-technical users, all the more evident. It is time to lift all technical barriers by empowering users to access relational databases through conversation. We consider three main research areas that a natural language data interface is based on: Text-to-SQL, SQL-to-Text, and Data-to-Text. The purpose of this tutorial is a deep dive into these areas, covering state-of-the-art techniques and models, and explaining how progress in deep learning has led to impressive advancements. We will present benchmarks that sparked research and competition, and discuss open problems and research opportunities, one of the most important challenges being the integration of these three research areas into one conversational system.
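To make the three research areas named above concrete, here is a minimal sketch of what each step consumes and produces. It is not taken from the tutorial: the `employees` table, its rows, the question, and the template wording are all hypothetical, and the SQL query and its explanation are written by hand where a real system would use learned models.

```python
import sqlite3

# Hypothetical toy schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "Research", 120), ("Bob", "Sales", 90), ("Cleo", "Research", 110)],
)

# The user's natural language question (the input to Text-to-SQL).
question = "How many employees work in Research?"

# Text-to-SQL: a real system would predict this query from the question
# and the schema; here it is simply written by hand.
sql = "SELECT COUNT(*) FROM employees WHERE department = 'Research'"

# SQL-to-Text: a natural language explanation of the query, so the user
# can check that it matches the intent. Also hand-written here.
explanation = "Count the employees whose department is Research."

# Run the query against the database.
(count,) = conn.execute(sql).fetchone()

# Data-to-Text: verbalise the result. A fixed template stands in for a
# learned generation model.
answer = f"There are {count} employees in Research."

print(question)
print(sql)
print(explanation)
print(answer)
```

Even in this toy form, the three steps are separate components that have to agree with one another, which hints at why integrating them into one conversational system is listed as an open challenge.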

Outline

Text-to-SQL
- The Text-to-SQL problem
- Benchmarks
- A Taxonomy for Deep Learning Text-to-SQL Systems
- Key Systems
- Research Challenges
SQL-to-Text
- The SQL-to-Text problem
- Challenges
- Key Systems
- Research Challenges
Data-to-Text
- What is Data-to-Text
- Subfields of Data-to-Text
- Table-to-Text
- Graph-to-Text
- Evaluation
- Research Challenges
Bringing it all together
- What do we mean?
- Why is it not trivial?
- Challenges
- Demo

Presenters

George Katsogiannis

Mike Xydas

Georgia Koutrika

Material

Feel free to download the slides of the tutorial here.