embulk-parser-csv_guessable guesses the schema at runtime and parses CSV files that have a schema in the header.
CSV files sometimes carry their schema in a header line. embulk-parser-csv_guessable parses such files by using the header fields as column names. This plugin is useful when the schema of the target CSV changes frequently.
It behaves like the original csv parser when the embulk-parser-csv_guessable configs (schema_file and schema_line) are not defined.
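The header-based guessing described above can be pictured with a short Python sketch. It is not the plugin's implementation (the plugin itself runs on Java inside Embulk). The function name `guess_columns`, the trimming of whitespace around header fields, and the choice to type every column as string are assumptions made for illustration.

```python
import csv
import io


def guess_columns(schema_text, schema_line=1):
    """Build a csv-parser-style column list from a header line.

    schema_line is 1-based, mirroring the plugin option of the same name.
    Assumptions of this sketch: header fields are stripped of surrounding
    whitespace and every column is typed as string.
    """
    lines = schema_text.splitlines()
    header = lines[schema_line - 1]
    # Let the csv module split the header so quoted fields are handled.
    names = next(csv.reader(io.StringIO(header)))
    return [{"name": name.strip(), "type": "string"} for name in names]


sample = "id, title, description\n1, awesome-title, awesome-description\n"
print(guess_columns(sample))
```

Because the column list is derived from the file on every run, a change to the header needs no matching change in the configuration.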
- Plugin type: parser
- Guess supported: no
- java: 1.8+
- embulk: 0.9+
- schema_file: Name of the file that contains the schema. (string, default: null)
- schema_line: Line number of the schema in the header. (integer, default: 1)
- columns: Column attributes for parsing. embulk-parser-csv_guessable uses this config only when schema_file is set. If schema_file isn't set, this is the same as the original csv parser's columns. (hash, default: null)
  - value_name: Name of the column in the header. The column is renamed to name.
  - name: Name of the column
  - type: Type of the column
  - format: Format of the timestamp if type is timestamp
  - date: Set date part if the format doesn't include date part
- any other csv configs: see www.embulk.org
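How the columns option combines with the header can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's code: the function name `apply_column_overrides` is hypothetical, header columns without a matching value_name are assumed to keep their header name with type string, and entries whose value_name does not appear in the header are assumed to be ignored.

```python
def apply_column_overrides(header_names, columns=None):
    """Merge header-derived column names with user-supplied overrides.

    header_names: column names read from the schema line.
    columns: list of dicts shaped like the plugin's columns option
             (value_name, name, type, and optionally format / date).
    """
    # Index the overrides by the header name they refer to.
    overrides = {c["value_name"]: c for c in (columns or [])}
    result = []
    for header_name in header_names:
        override = overrides.get(header_name, {})
        column = {
            # Assumption: fall back to the header name and string type.
            "name": override.get("name", header_name),
            "type": override.get("type", "string"),
        }
        # Carry over the timestamp-related attributes when present.
        for key in ("format", "date"):
            if key in override:
                column[key] = override[key]
        result.append(column)
    return result


print(apply_column_overrides(
    ["id", "title", "description"],
    [{"value_name": "id", "name": "number", "type": "long"}],
))
```

Keying the overrides on value_name rather than on position means the configuration stays valid when columns in the file are reordered.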
test.csv (The schema is on the first line.)
id, title, description
1, awesome-title, awesome-description
2, shoddy-title, shoddy-description
config.yml
in:
  type: any file input plugin type
  parser:
    type: csv_guessable
    schema_file: test.csv
    schema_line: 1
(For comparison) The equivalent config.yml for the original csv parser
in:
  type: any file input plugin type
  parser:
    type: csv
    skip_header_lines: 1
    columns:
    - {name: id, type: string}
    - {name: title, type: string}
    - {name: description, type: string}
Example: rename columns and set their types
in:
  type: any file input plugin type
  parser:
    type: csv_guessable
    schema_file: test.csv
    schema_line: 1
    columns:
    - {value_name: 'id', name: 'number', type: long}
    - {value_name: 'title', name: 'description', type: string}
    - {value_name: 'status', name: 'ok?', type: string}
$ embulk gem install embulk-parser-csv_guessable
$ cd samples/sample2
$ embulk run -L ../../ config_rename.yml -l debug
$ ./gradlew gem # -t to watch change of files and rebuild continuously
$ ./gradlew test