-
-
Notifications
You must be signed in to change notification settings - Fork 571
RegExpImport
The RegExp import plugin reads data from text files using regular expression.
Files are parsed regardless of new line characters, so the regular expression can match strings being spread across several text lines.
The plugin uses Qt's QRegularExpression
class for matching, so you can read Qt documentation about that class to learn more about possible patterns.
We will import into existing table with 3 columns. Table was created earlier by the statement:
CREATE TABLE test_table (col1, col2, col3);
Input text file contents:
xxxxxx3yyyyyy54zzzzzz5aaaaaa23bbbb4cc54565xxxxx
yyy333zzzzzzzzzz154aaaaaa35bbb87
Regular expression configured for the RegExp import plugin:
\D*?(\d+)\D+?(\d+)\D+?(\d+)\D*?
The expression says, that we want to match 3 numbers separated by anything that is not a number. All 3 numbers are enclosed in capture groups, so they will be used as the data to import.
In result table test_table
will have 3 new rows:
3 | 54 | 5 |
23 | 4 | 54565 |
333 | 154 | 35 |
The last number will be ignored, as our expression expected all 3 numbers to be present, so one number doesn't fulfill the expression.
Use the same data input file, but change the expression to:
\D*?(\d+)\D+?(\d+)(?
\D+?)(\d+)\D*?
This expression uses named group, which captures word between 3rd and 4th groups. Then select option "Import only following groups" and type following groups:
3, word, 1
We use "word" as a group, cause we named it earlier. Using name that is not defined in the regular expression pattern will result in validation error and you will be notified about it.
Unnamed groups are referred with index number, starting with 1. The 0 index refers to the entire matched string. Note that in this example we switched group 3 and 1 with each other, so we will import 3rd group into 1st column and 1st group into 3rd column. It's just for showing the possibilities.
You should have the import dialog configured like this:
As a result we will import followin 3 rows:
5 | zzzzzz | 3 |
54565 | cc | 23 |
35 | aaaaaa | 333 |
Description of options that can be defined for the plugin, when executing import()
:
Option | Valid values | Description |
---|---|---|
RegExpImport.Pattern | An expression | Regular Expression pattern, as described previously on this page. |
RegExpImport.GroupsMode |
all / custom
|
If all , then any RegExp groups declared in the pattern will be used for subsequent columns in the output table. If custom is used, then only groups listedn in RegExpImport.CustomGroupList will be used for importing into subsequent table columns. |
RegExpImport.CustomGroupList | comma-separated list of group indexes or names | List of groups from the pattern to be imported. It's a comma separated list and may contain extra whitespaces, which are ignored. |