Semantic Parsing English to GraphQL
For the last few months I've been eagerly working on building a Natural Language Processing model. I present Semantic Parsing English-to-GraphQL (SPEGQL).
In this post, I'll cover some of the highlights, and include my presentation at OpenAI.
Background
GraphQL
GraphQL is a query language for your API.
GraphQL has recently gained a lot of popularity. Among its main benefits: it represents the API's schema as a graph, nested relations of any depth can be queried easily, it can aggregate data over multiple data sources, and responses are predictable.
Semantic Parsing
Semantic parsing is the task of converting a natural language utterance to a logical form: a machine-understandable representation of its meaning. In this case, I wanted to semantically parse from English to GraphQL.
Why this project?
I had a few reasons to work on this project:
- I wanted to understand the limits of general language models for Semantic Parsing
- This project could potentially ease the learning curve for new developers of GraphQL
- It has potential use as tooling for non-technical data users, such as managers, to gain insights into their data
Objective
Given an English prompt:
“What is the name and date of the song released most recently?”
And some GraphQL schema:
type song {
  artist: artist
  artist_name: String
  country: String
  f_id: Int
  file: files
  genre: genre
  genre_is: String
  languages: String
  rating: Int
  releasedate: String
  resolution: Int
  song_name: String
}
...
Find a corresponding GraphQL Query:
query {
  song(limit: 1, order_by: {releasedate: desc}) {
    song_name
    releasedate
  }
}
This objective could be tested by passing the prompt and schema through a model (in this case T5) to output a query. The process is as follows:
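As a rough illustration of the first half of that process, here is a minimal Python sketch of how a prompt and a schema might be combined into a single text input for an encoder-decoder model such as T5. The serialization format, the task prefix, and the helper names `serialize_schema` and `build_model_input` are all assumptions made for this example; they are not necessarily what SPEGQL uses.

```python
from typing import Dict

# Hypothetical schema representation: type name -> {field name -> field type}.
Schema = Dict[str, Dict[str, str]]


def serialize_schema(schema: Schema) -> str:
    """Flatten a schema into one line, e.g. 'song { song_name: String, ... }'."""
    parts = []
    for type_name, fields in schema.items():
        field_text = ", ".join(f"{name}: {ftype}" for name, ftype in fields.items())
        parts.append(f"{type_name} {{ {field_text} }}")
    return " ".join(parts)


def build_model_input(prompt: str, schema: Schema) -> str:
    """Join a task prefix, the English prompt, and the flattened schema.

    The prefix text is an assumption, modeled on the task prefixes T5 was
    pretrained with (such as 'translate English to German: ').
    """
    return f"translate English to GraphQL: {prompt} schema: {serialize_schema(schema)}"


# A trimmed-down version of the `song` type from the example above.
schema = {"song": {"song_name": "String", "releasedate": "String", "artist": "artist"}}
model_input = build_model_input(
    "What is the name and date of the song released most recently?", schema
)
print(model_input)
```

The resulting string would then be tokenized and passed to the model, which generates the GraphQL query as its output text.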
Methods
The process required multiple steps:
- Create an English-to-GraphQL dataset
- Run experiments on Encoder-Decoder Transformer models (BART and T5)
- Collect data and results
- Implement a graphical interface to interact with the model
Results
- 46-50% exact set matching accuracy on the GraphQL validation dataset
A couple of example videos will help show results as well.
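To give a feel for what an exact set matching score measures, here is a simplified Python sketch. It treats each selection set as an unordered set of fields, so two queries that request the same fields in a different order still count as a match. This is one plausible reading of the metric, not the project's actual evaluation code. The names `parse_query`, `exact_set_match`, and `accuracy` are hypothetical, and the tiny parser only handles plain queries like the one in the Objective section (no aliases, fragments, or variables). Argument order is still compared literally, which a real evaluator might relax.

```python
import re

# Names, integers, quoted strings, and punctuation. Commas are skipped,
# since GraphQL treats them as insignificant.
TOKEN = re.compile(r'[A-Za-z_]\w*|-?\d+|"[^"]*"|[{}():\[\]]')


def parse_selection_set(tokens, pos):
    """Parse '{ field ... }' starting at tokens[pos]; return (frozenset, next_pos)."""
    if tokens[pos] != "{":
        raise ValueError("expected '{'")
    pos += 1
    fields = set()
    while tokens[pos] != "}":
        name = tokens[pos]
        pos += 1
        args = ()
        if tokens[pos] == "(":
            # Keep the raw argument tokens up to the matching ')'.
            start, depth = pos, 0
            while True:
                if tokens[pos] == "(":
                    depth += 1
                elif tokens[pos] == ")":
                    depth -= 1
                pos += 1
                if depth == 0:
                    break
            args = tuple(tokens[start:pos])
        children = frozenset()
        if tokens[pos] == "{":
            children, pos = parse_selection_set(tokens, pos)
        fields.add((name, args, children))
    return frozenset(fields), pos + 1


def parse_query(query):
    """Turn a query string into nested frozensets of (name, args, children)."""
    tokens = TOKEN.findall(query)
    pos = 1 if tokens and tokens[0] == "query" else 0
    selection, _ = parse_selection_set(tokens, pos)
    return selection


def exact_set_match(predicted, gold):
    """True if both queries select the same fields, ignoring field order."""
    try:
        return parse_query(predicted) == parse_query(gold)
    except (ValueError, IndexError):
        return False  # a malformed prediction counts as a miss


def accuracy(pairs):
    """Fraction of (predicted, gold) pairs that match."""
    return sum(exact_set_match(p, g) for p, g in pairs) / len(pairs)


gold = "query { song(limit: 1, order_by: {releasedate: desc}) { song_name releasedate } }"
reordered = "query { song(limit: 1, order_by: {releasedate: desc}) { releasedate song_name } }"
wrong = "query { song(limit: 1) { song_name } }"
score = accuracy([(reordered, gold), (wrong, gold)])
print(score)
```

In this toy run, the reordered query counts as a match and the query with missing arguments and fields does not, so half of the predictions are scored as correct.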
Other Notes
Here is the main repo for creating and validating the dataset:
https://github.com/acarrera94/sql-to-graphql
I also created an example notebook for anyone who wants to try out the model. This model is fine-tuned on GraphQL and SQL and can create queries for both languages:
https://colab.research.google.com/drive/1l1h8RlEl-IS0XfkDh66qikH4UsD19KF6?usp=sharing
I'm currently working on a paper that details the whole process, linked here.
And finally, I also gave a presentation about my project at OpenAI: