-
Notifications
You must be signed in to change notification settings - Fork 5
DT-1701: Improve Snapshot by Query error message #1974
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…ified TODO: add unit test
…able names in snapshot by query
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great!!!! thanks
@@ -16,6 +17,12 @@ public BigQueryVisitor(Map<String, DatasetModel> datasetMap) { | |||
} | |||
|
|||
public String generateAlias(String datasetName, String tableName) { | |||
if (datasetName == null || tableName == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's possible for datasetName
to be null
. It's possible for ctx.dataset_name()
to be null
but since the name is always prefixed by prefixDatasetName()
before it gets here, it will never be null
in this method. If it's possible that it is null
in practice, we need to do this check in the caller before the prefix is added.
If this check is done in the caller, then we have Table_exprContext
or Column_exprContext
which would let us include the specific part of the query in the error message, which we could use to make a more helpful error message.
assertThrows( | ||
InvalidQueryException.class, | ||
() -> parsedQuery.translateSql(bqVisitor), | ||
"All column and table names must be qualified with a dataset/table name. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I wouldn't assert the entire text, or any of it unless it includes something specific based on the method parameters.
@@ -16,6 +17,12 @@ public BigQueryVisitor(Map<String, DatasetModel> datasetMap) { | |||
} | |||
|
|||
public String generateAlias(String datasetName, String tableName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that this is public
is a bit confusing in that it has no meaning outside this class. It looks like we need it to perform string matching in tests, if so it should have @VisibleForTesting
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes the null tableName
case but, if datasetName
can be null
, that case isn't fixed. I would either come up with a test case for null
datasetName
and fix it, or remove that check for null
.
Doing the check in the caller does lead to some duplication but it would be nice to have the parser context, which includes the user's source text and could be a big help in tracking down where in the query the problem is.
String testQuery, boolean queryHasQualifiedNames) { | ||
BigQueryVisitor bqVisitor = new BigQueryVisitor(datasetMap); | ||
Query parsedQuery = Query.parse(testQuery); | ||
if (!queryHasQualifiedNames) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a small thing but you could flip the if
branches and remove the !
since there's no need to order the branches in a particular way.
@@ -16,6 +17,12 @@ public BigQueryVisitor(Map<String, DatasetModel> datasetMap) { | |||
} | |||
|
|||
public String generateAlias(String datasetName, String tableName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't need any state in BigQueryVisitor
so it can be changed to static
public String generateAlias(String datasetName, String tableName) { | |
public static String generateAlias(String datasetName, String tableName) { |
Jira ticket: https://broadworkbench.atlassian.net/browse/DT-1701
Addresses
It would be nice to give users a more helpful error message if they did not correctly format their query for snapshot by query.
After a very long debugging session, we found that a user did not "correctly" format their query for a snapshot by query error message. It turns out, if you do not properly qualify both column and table names, then the alias generated for the table doesn't match up with the alias generated for the column. This left the user with an
BQ does not recognize table with alias "alias123"
error message. Instead, we should provide users with an actionable error message with instructions for how to format their table and column names in the query.For example,
Select column1 FROM tdrDataset1.table1 where column1 = true
will not successfully get translated into a BigQuery query.The query would end up getting translated to something like the following:
Select alias1.column1 FROM tdrDataset1.table1 AS alias2 where alias1.column1 = true
Which is invalid because alias1 and alias2 do not line up.
Instead, the user should provide something like the following:
Select tdrDataset1.table1.column1 FROM tdrDataset1.table1 where tdrDataset1.table1.column1 = true
There is an example of this sort of query in the terra support docs, but we can help by providing a more direct error message.

https://support.terra.bio/hc/en-us/articles/4407243163291-How-to-create-snapshots-in-TDR-GCP
Summary of changes
Testing Strategy