Most importantly, you can't have the system confidently answer incorrectly.
I just tested this query to evaluate joins: "What is the name of the series with the highest rating?" The answer: 5.8 :/
A typical user will be even less clear than this, asking the question as: "What's the top tv series?"
The ideal system should be able to confirm with the user that 1) they'd actually like to evaluate tv series based on rating. And 2) it should be able to guess without user confirmation that I really want the 'name' of the tv series.
It's a hard problem but massively valuable if it can be solved.
The way our system worked, broadly, was with a computational linguistics approach rather than an NLP one. We used an augmented bottom-up parsing algorithm to assemble a user's query into an AST, then translated that AST into both SQL (though that changed post-acquisition) and back into a canonical English-like query.
For ambiguous inputs, the parser would produce multiple candidate ASTs which we could offer to the user in the form of search results, for them to select the expression they preferred (solving your point 1).
The parser was constructed to use semantic information about the database schema to infer fields in the case of underspecification (handling your point 2). For example, we might understand about the `Series` table that `name` was the "display column", and should be the default field by which record-level results were displayed. Semantic information could, to various degrees, be inferred - always with the option to be manually overridden.
It seemed clear to us that a schema-agnostic approach to SQL generation would be untenable for a real product. Annotation of the schema over time (starting with basic information like field names, synonyms, and types, then adding semantic information like currency or unit labels) is key to letting a query generator produce sensible results to underspecified queries. For this reason alone, I'm skeptical of the quality of results that something like Photon can produce. Every schema differs in syntax and semantics. Perhaps a big enough corpus of schemas can allow for a generalized model. The more you can constrain the domain in building such a system, the better the results. (look at Statmuse : pretty impressive query capability over sports facts).
It was reasonably cool but I was a terrible CEO and slowly killed it: