-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-4417] Support BigQuery's NUMERIC type using Java #5755
Conversation
This LGTM as far as it goes. Should I hope for a test that does some end-to-end thing and get the numeric out? |
The test failures is a flake. I'll kick it. |
run java precommit |
run java postcommit |
RE end-to-end testing, that's what I'm not sure about. Is there an integration test that I can update or some way of using this code to read from a BigQuery table with a NUMERIC column to verify that it works? I'd hate to push something through that looks fine in unit tests but falls apart in practice. |
Good question. Pinging @chamikaramj @lgajowy perhaps? |
Looks like we do have some tests here: https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/941/testReport/org.apache.beam.sdk.io.gcp.bigquery/ I think these are relatively small tests that use DirectRunner. For DataflowRunner+BQ I think we currently rely on internal Google testing. |
Technically, I think you could try to write BigQueryIOIT in a fashion described here and then run it to see if everything works as expected. Note that the test itself does not require any kubernetes scripts (as described in docs) and could be run on some existing bigQuery instance. I can help with this if you choose this way. :) |
We have turned on autoformatting of the codebase, which causes small conflicts across the board. You can probably safely rebase and just keep your changes. Like this:
Please ping me if you run into any difficulty. |
I think (hope) that I fixed the conflicts. I haven't been able to figure out how to run the integration test, though. Are there specific instructions about how to run
The example indicates using the |
The instructions are nonexistent I think. Since you left the "Allow edits by maintainers" box checked I've taken the liberty of pushing an autoformat commit to get that to pass. I will now say the magic words to run the appropriate postcommit tests on this PR. |
run java postcommit |
Someone broke head for an unfortunate minute. |
retest this please |
run java postcommit |
OK it passed before & after. Since you will no longer need your local copy to be in sync with this PR, I am going to fix up the commit history then merge this. |
For reference, this is the equivalent change in google-cloud-java: https://github.com/GoogleCloudPlatform/google-cloud-java/pull/3110/files
@ElliottBrossard I think you've found an issue in the gradle code.
in the build.gradle file. For some reason, those are not added in examples' build.gradle, so this is why you are unable to use it and run the test. We'll have to fix it. Thanks for reporting this. JIRA for this: https://issues.apache.org/jira/browse/BEAM-4706 |
Thanks, Kenneth! And thanks, Łukasz, for filing an issue. |
Hi @ElliottBrossard can We Test the Numeric DataType After Building 2.6.0-SNAPSHOT from master? |
@KumarKishan It looks like you deleted your original post, but I'm guessing the issue is that this new code expects for the
The problem is that |
Yes @ElliottBrossard, I got this Exception. |
@ElliottBrossard we fixed the JIRA mentioned above (link). FWIW, we run the test successfully on Dataflow and direct runners (see the comment in the JIRA issue) after your change was merged. |
I was able to reproduce the problem using a modified BigQueryTornadoes :D For the sake of posterity, this was the command I used:
I'm working on a PR to fix the error from my comment above. |
Great! :) |
Hi all! Is this now fixed? Does the Tornadoes test verify it? |
Yes, this should be fixed now. The current version of the Tornadoes test doesn't exercise NUMERIC, but the unit tests now use the proper Avro encoding (BYTES+DECIMAL) in correspondence with NUMERIC to verify that BigQueryIO is able to read those values. |
FWIW I'll note that I've implemented a test that verifies this functionality and runs internally at Google. |
For comparison, this is the PR that added support in google-cloud-java: googleapis/google-cloud-java#3110.
I will add Python support separately, unless it's preferable to include that in this PR as well.