
Commit f2c291c

Merge pull request #25 from mzumi/merge_upstream

Merge upstream

2 parents 61fade9 + d699d01

12 files changed: 161 additions, 43 deletions

CHANGELOG.md

Lines changed: 17 additions & 0 deletions
@@ -1,3 +1,20 @@
+## 0.6.8 - 2022-10-12
+* [enhancement] Support JSON type (thanks to @civitaspo)
+* [maintenance] Add an error message in order to retry (thanks to @mzumi)
+
+## 0.6.7 - 2021-09-10
+* [enhancement] Add an expiration option of temporary table to clean up (thanks to @TKNGUE)
+
+## 0.6.6 - 2021-06-10
+
+* [maintenance] Fix network retry function (thanks to @case-k-git)
+* [enhancement] Allow to specify the billing project and the project to which the data will be loaded separately (thanks to @ck-fm0211)
+* [enhancement] Include original error message on json parse error (thanks to @k-yomo)
+
+## 0.6.5 - 2021-06-10
+* [maintenance] Fix failed tests (thanks to @kyoshidajp)
+* [maintenance] Lock representable version for avoiding requiring Ruby 2.4 (thanks to @hiroyuki-sato)
+
 ## 0.6.4 - 2019-11-06
 
 * [enhancement] Add DATETIME type conveter (thanks to @kekekenta)

README.md

Lines changed: 8 additions & 0 deletions
@@ -33,6 +33,7 @@ OAuth flow for installed applications.
 | auth_method | string | optional | "application\_default" | See [Authentication](#authentication) |
 | json_keyfile | string | optional | | keyfile path or `content` |
 | project | string | required unless service\_account's `json_keyfile` is given. | | project\_id |
+| destination_project | string | optional | `project` value | A destination project to which the data will be loaded. Use this if you want to separate a billing project (the `project` value) and a destination project (the `destination_project` value). |
 | dataset | string | required | | dataset |
 | location | string | optional | nil | geographic location of dataset. See [Location](#location) |
 | table | string | required | | table name, or table name with a partition decorator such as `table_name$20160929`|
@@ -78,6 +79,13 @@ Options for intermediate local files
 | delete_from_local_when_job_end | boolean | optional | true | If set to true, delete generate local files when job is end |
 | compression | string | optional | "NONE" | Compression of local files (`GZIP` or `NONE`) |
 
+
+Options for intermediate tables on BigQuery
+
+| name | type | required? | default | description |
+|:-------------------------------------|:------------|:-----------|:-------------------------|:-----------------------|
+| temporary_table_expiration | integer | optional | | Temporary table's expiration time in seconds |
+
 `source_format` is also used to determine formatter (csv or jsonl).
 
 #### Same options of bq command-line tools or BigQuery job's property
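For reference, the two options this PR documents can be combined in an `out:` section like the sketch below. This is a minimal illustration with placeholder project, dataset, and table names, not a file from this diff:

  out:
    type: bigquery
    mode: replace
    auth_method: service_account
    json_keyfile: example/your-project-000.json
    project: your_billing_project_name
    destination_project: your_destination_project_name
    dataset: your_dataset_name
    table: your_table_name
    temporary_table_expiration: 3600   # temporary table is cleaned up after one hour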

embulk-output-bigquery.gemspec

Lines changed: 8 additions & 3 deletions
@@ -1,6 +1,6 @@
 Gem::Specification.new do |spec|
   spec.name = "embulk-output-bigquery"
-  spec.version = "0.7.0"
+  spec.version = "0.7.1"
   spec.authors = ["Satoshi Akama", "Naotoshi Seo"]
   spec.summary = "Google BigQuery output plugin for Embulk"
   spec.description = "Embulk plugin that insert records to Google BigQuery."
@@ -16,11 +16,16 @@ Gem::Specification.new do |spec|
 
   # TODO
   # signet 0.12.0 and google-api-client 0.33.0 require >= Ruby 2.4.
-  # Embulk 0.9 use JRuby 9.1.X.Y and It compatible Ruby 2.3.
-  # So, Force install signet < 0.12 and google-api-client < 0.33.0
+  # Embulk 0.9 use JRuby 9.1.X.Y and it's compatible with Ruby 2.3.
+  # So, force install signet < 0.12 and google-api-client < 0.33.0
+  # Also, representable version >= 3.1.0 requires Ruby version >= 2.4
   spec.add_dependency 'signet', '~> 0.7', '< 0.12.0'
   spec.add_dependency 'google-api-client','< 0.33.0'
   spec.add_dependency 'time_with_zone'
+  spec.add_dependency "representable", ['~> 3.0.0', '< 3.1']
+  # faraday 1.1.0 require >= Ruby 2.4.
+  # googleauth 0.9.0 requires faraday ~> 0.12
+  spec.add_dependency "faraday", '~> 0.12'
 
   spec.add_development_dependency 'bundler', ['>= 1.10.6']
   spec.add_development_dependency 'rake', ['>= 10.0']
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+in:
+  type: file
+  path_prefix: example/example.csv
+  parser:
+    type: csv
+    charset: UTF-8
+    newline: CRLF
+    null_string: 'NULL'
+    skip_header_lines: 1
+    comment_line_marker: '#'
+    columns:
+      - {name: date, type: string}
+      - {name: timestamp, type: timestamp, format: "%Y-%m-%d %H:%M:%S.%N", timezone: "+09:00"}
+      - {name: "null", type: string}
+      - {name: long, type: long}
+      - {name: string, type: string}
+      - {name: double, type: double}
+      - {name: boolean, type: boolean}
+out:
+  type: bigquery
+  mode: replace
+  auth_method: service_account
+  json_keyfile: example/your-project-000.json
+  project: your_project_name
+  destination_project: your_destination_project_name
+  dataset: your_dataset_name
+  table: your_table_name
+  source_format: NEWLINE_DELIMITED_JSON
+  compression: NONE
+  auto_create_dataset: true
+  auto_create_table: true
+  schema_file: example/schema.json
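Assuming this new example config is saved somewhere under example/ (its filename is not visible in this view; the path below is a placeholder), it would be run with the standard Embulk CLI:

  embulk run example/your_config.yml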

lib/embulk/output/bigquery.rb

Lines changed: 13 additions & 5 deletions
@@ -36,6 +36,7 @@ def self.configure(config, schema, task_count)
         'auth_method' => config.param('auth_method', :string, :default => 'application_default'),
         'json_keyfile' => config.param('json_keyfile', LocalFile, :default => nil),
         'project' => config.param('project', :string, :default => nil),
+        'destination_project' => config.param('destination_project', :string, :default => nil),
         'dataset' => config.param('dataset', :string),
         'location' => config.param('location', :string, :default => nil),
         'table' => config.param('table', :string),
@@ -89,6 +90,8 @@ def self.configure(config, schema, task_count)
         'clustering' => config.param('clustering', :hash, :default => nil), # google-api-ruby-client >= v0.21.0
         'schema_update_options' => config.param('schema_update_options', :array, :default => nil),
 
+        'temporary_table_expiration' => config.param('temporary_table_expiration', :integer, :default => nil),
+
         # for debug
         'skip_load' => config.param('skip_load', :bool, :default => false),
         'temp_table' => config.param('temp_table', :string, :default => nil),
@@ -135,12 +138,13 @@ def self.configure(config, schema, task_count)
           json_key = JSON.parse(task['json_keyfile'])
           task['project'] ||= json_key['project_id']
         rescue => e
-          raise ConfigError.new "json_keyfile is not a JSON file"
+          raise ConfigError.new "Parsing 'json_keyfile' failed with error: #{e.class} #{e.message}"
         end
       end
       if task['project'].nil?
         raise ConfigError.new "Required field \"project\" is not set"
       end
+      task['destination_project'] ||= task['project']
 
       if (task['payload_column'] or task['payload_column_index']) and task['auto_create_table']
         if task['schema_file'].nil? and task['template_table'].nil?
@@ -166,7 +170,7 @@ def self.configure(config, schema, task_count)
         begin
           JSON.parse(File.read(task['schema_file']))
         rescue => e
-          raise ConfigError.new "schema_file #{task['schema_file']} is not a JSON file"
+          raise ConfigError.new "Parsing 'schema_file' #{task['schema_file']} failed with error: #{e.class} #{e.message}"
         end
       end
 
@@ -298,19 +302,23 @@ def self.auto_create(task, bigquery)
         end
       end
 
+      temp_table_expiration = task['temporary_table_expiration']
+      temp_options = {'expiration_time' => temp_table_expiration}
+
       case task['mode']
       when 'delete_in_advance'
        bigquery.delete_table_or_partition(task['table'])
        bigquery.create_table_if_not_exists(task['table'])
       when 'replace'
-       bigquery.create_table_if_not_exists(task['temp_table'])
+       bigquery.create_table_if_not_exists(task['temp_table'], options: temp_options)
        bigquery.create_table_if_not_exists(task['table']) # needs for when task['table'] is a partition
       when 'append'
-       bigquery.create_table_if_not_exists(task['temp_table'])
+       bigquery.create_table_if_not_exists(task['temp_table'], options: temp_options)
        bigquery.create_table_if_not_exists(task['table']) # needs for when task['table'] is a partition
       when 'replace_backup'
-       bigquery.create_table_if_not_exists(task['temp_table'])
+       bigquery.create_table_if_not_exists(task['temp_table'], options: temp_options)
        bigquery.create_table_if_not_exists(task['table'])
+
        bigquery.create_table_if_not_exists(task['table_old'], dataset: task['dataset_old']) # needs for when a partition
       else # append_direct
        if task['auto_create_table']
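For context on the options hash introduced in auto_create above: a rough sketch of how an expiration given in seconds can be applied at table-creation time with google-api-client's BigQuery V2 API. This is an illustration only, assuming an already-authorized Google::Apis::BigqueryV2::BigqueryService; create_table_with_expiration and its parameters are made-up names, not the plugin's actual helper (the plugin's create_table_if_not_exists handles this internally).

  require 'google/apis/bigquery_v2'

  # Illustrative sketch: BigQuery's Table#expiration_time is expressed in
  # milliseconds since the Unix epoch, so an expiration given in seconds
  # is added to "now" and converted before the table is inserted.
  def create_table_with_expiration(service, project, dataset, table_id, expiration_seconds)
    table = Google::Apis::BigqueryV2::Table.new(
      table_reference: Google::Apis::BigqueryV2::TableReference.new(
        project_id: project, dataset_id: dataset, table_id: table_id
      )
    )
    # A nil expiration leaves the table permanent, matching the option's default.
    if expiration_seconds
      table.expiration_time = ((Time.now.to_f + expiration_seconds) * 1000).to_i
    end
    service.insert_table(project, dataset, table)
  end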
