Annotations
CSVeed currently has the following annotations:
- @CsvFile; generic instructions for parsing the CSV file and converting to Rows and Beans
- @CsvCell; custom instructions for properties, allowing mappings to column index or names and whether the value is required
- @CsvIgnore; orders CSVeed to ignore a property
- @CsvDate; allows a custom date format to be applied to a property
- @CsvConverter; sets a custom PropertyEditor to be applied to a property
For the annotations to work, the Bean class must be passed to CsvReader:
CsvReader<Bean> csvReader = new CsvReaderImpl<Bean>(reader, Bean.class);
The @CsvFile annotation is set on the Bean class. It contains the generic instructions for parsing the CSV file and converting it to Rows and Beans. The following settings are supported by @CsvFile:
- parse instructions; escape, quote, separator, end-of-line and comment, which together determine what your CSV file looks like
- use header; whether the CSV file contains a header and must be read as such. Using the header is essential for employing the ColumnNameMapper strategy.
- start row; the line from where to start reading the CSV file, zero-based
- skip lines; whether empty lines and comment lines must be skipped or parsing must be attempted
- mapping strategy; by default, this will be ColumnIndexMapper, which maps to Bean properties on the basis of the column index. Alternatively, this could be ColumnNameMapper, which maps to Bean properties on the basis of the name of the column (i.e., the header name).
Parse instructions help CsvReader to read and interpret the CSV file. Assume the following CSV:
first name, surname, street, city, trademark
% First a line on mr Hawking
'Stephen', 'Hawking', '110th Avenue', 'New York', 'History of the \'Universe\''
% Then on mr Einstein
'Albert', 'Einstein', 'Leipzigerstrasse', 'Berlin', '\'E=mc2\''
The Bean header can be annotated as follows:
@CsvFile(comment = '%', quote='\'', escape='\\', separator=',')
public class Bean {
The following parse instructions are available:
- separator; the character used to separate two cells. This is usually a semicolon ; (northern Europe), a comma , (USA), a tab or a pipe |. Default is ;.
- quote; the character used to signal the start and the end of a cell. Within a cell thus delimited, it is possible to have newlines and to use the quote symbol, if escaped. Default is ".
- escape; the character used to escape a quote symbol within a quoted field. This one is contentious, since RFC 4180 states that the escape symbol is the same as the quote symbol, so you use it twice to get one. Sometimes it is desirable to have a custom escape character, which you can set here. Default is ".
- end of line; the characters indicating that the end of a line has been reached. Default is \r and \n.
- comment; if a line starts with the comment character, it is assumed to be a comment line. Only used if skip comment lines is true (the default). The default is #.
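To make the interplay of separator, quote and escape more concrete, here is a minimal sketch of a single-line cell splitter. It is not CSVeed's parser: the class LineSplitter and its split method are hypothetical, comments, end-of-line handling and multi-line cells are left out, and dropping whitespace outside quotes is an assumption of this sketch only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CSVeed: splits one line into cells.
class LineSplitter {

    static List<String> split(String line, char separator, char quote, char escape) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean quoted = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            boolean hasNext = i + 1 < line.length();
            if (quoted && c == escape && hasNext && line.charAt(i + 1) == quote) {
                // Escaped quote inside a quoted cell: emit one literal quote.
                cell.append(quote);
                i++;
            } else if (c == quote) {
                // Unescaped quote: opens or closes the quoted cell.
                quoted = !quoted;
            } else if (c == separator && !quoted) {
                // Separator outside quotes: the current cell is complete.
                cells.add(cell.toString());
                cell.setLength(0);
            } else if (quoted || !Character.isWhitespace(c)) {
                // Assumption of this sketch: whitespace outside quotes is dropped.
                cell.append(c);
            }
        }
        cells.add(cell.toString());
        return cells;
    }
}
```

With quote and escape both set to ", the same loop handles the RFC 4180 style of doubling the quote; with quote ' and escape \ it handles the Hawking and Einstein lines shown earlier.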
Suppose your CSV file does not have a header:
"line 1";1
"line 2";2
"line 3";3
You need to disable useHeader in @CsvFile:
@CsvFile(useHeader = false)
public class Bean {
Note: it is now impossible to use ColumnNameMapper since there is no header to supply the column names.
Some CSV files contain a lot of non-essential information before the actual content starts, without those lines being marked as comment lines:
Roses are red,
Violets are blue,
And some more of that
"Here";"We";"Go"
If you are in the lucky position that you can identify the exact start row, you could pass that information on in @CsvFile:
@CsvFile(startRow = 3)
public class Bean {
There are two skip instructions:
- skip empty lines; it can be useful to convert empty lines into single-column rows. By default, empty lines will be skipped.
- skip comment lines; it can be useful to disable the skipping of comment lines when the comment symbol can be a legitimate symbol in your CSV file. By default, comment lines will be skipped.
Example of a file where you may want to include empty lines:
Alpha
Beta
Gamma
Example of a file where you may want to keep lines that start with the comment symbol:
issue number; description
#12;Some error somewhere
#31;NPE
In these cases, make sure to instruct @CsvFile properly:
@CsvFile(skipCommentLines = false, skipEmptyLines = false)
public class Bean {
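The effect of the two skip flags can be sketched as a simple line filter. This is an illustration only, not CSVeed's code: SkipSketch and linesToParse are hypothetical names, and the sketch merely decides which lines are handed on for parsing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: decides which raw lines are handed to the parser.
class SkipSketch {

    static List<String> linesToParse(List<String> lines, char comment,
                                     boolean skipEmptyLines, boolean skipCommentLines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (skipEmptyLines && line.isEmpty()) {
                continue; // empty line is dropped
            }
            if (skipCommentLines && !line.isEmpty() && line.charAt(0) == comment) {
                continue; // line starting with the comment character is dropped
            }
            kept.add(line);
        }
        return kept;
    }
}
```

With both flags set to false, a line such as #12;Some error somewhere is kept and parsed as a regular row.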
For converting Rows to Beans, the mapping strategy is the most important setting of @CsvFile. There are two mapping strategies currently supported:
- ColumnIndexMapper; maps cells based on their column index to Bean properties
- ColumnNameMapper; maps cells based on their column name (ie, header name) to Bean properties
ColumnIndexMapper is the default strategy, employed if none is passed. Cells will be mapped to Bean properties by their column index. When no instructions are passed to a property (using @CsvCell#columnIndex), CSVeed will take the declared order of the properties and use that order to self-assemble the index.
The following Bean properties (assuming they have public getters and setters):
private String name;
private Date birthdate;
private Integer creditRating;
Will lead to the following index:
0 -> name
1 -> birthdate
2 -> creditRating
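How an index like this could be assembled from the declared order can be sketched with plain reflection. This is a hypothetical illustration, not CSVeed's implementation: IndexSketch and indexOf are made-up names, and the sketch relies on getDeclaredFields returning fields in declaration order, which common JVMs do but the Java specification does not guarantee.

```java
import java.lang.reflect.Field;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not CSVeed's implementation.
class IndexSketch {

    // The same three properties as in the example above.
    static class Bean {
        private String name;
        private Date birthdate;
        private Integer creditRating;
    }

    // Assigns consecutive column indexes to the fields, in the order returned by reflection.
    static Map<Integer, String> indexOf(Class<?> beanClass) {
        Map<Integer, String> index = new LinkedHashMap<>();
        int column = 0;
        for (Field field : beanClass.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            index.put(column++, field.getName());
        }
        return index;
    }

    public static void main(String[] args) {
        indexOf(Bean.class).forEach((column, property) ->
                System.out.println(column + " -> " + property));
    }
}
```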
With ColumnNameMapper, cells will be mapped to Bean properties by their column name (i.e., the header name). When no instructions are passed to a property (using @CsvCell#columnName), CSVeed will take the property name and use that to self-assemble the index.
The following Bean properties (assuming public getters/setters):
private String name;
private Date birthdate;
private Integer creditRating;
Will lead to the following index:
name -> name
birthdate -> birthdate
creditrating -> creditRating
Note that the key creditrating is all lower-case. Property names are lower-cased before they are stored in the index. Lookups are also done with keys that are first lower-cased. Therefore ColumnNameMapper is case-insensitive.
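The lower-casing described above can be illustrated with a small map-based index. This is a sketch under assumptions, not CSVeed's code: NameIndexSketch, register and lookup are hypothetical names, and Locale.ROOT is chosen here to keep the lower-casing independent of the platform locale.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a case-insensitive column name index.
class NameIndexSketch {

    // Key: lower-cased column name, value: the Bean property name.
    private final Map<String, String> index = new HashMap<>();

    void register(String propertyName) {
        // Keys are lower-cased before they are stored.
        index.put(propertyName.toLowerCase(Locale.ROOT), propertyName);
    }

    String lookup(String columnName) {
        // Lookup keys are lower-cased the same way, so case no longer matters.
        return index.get(columnName.toLowerCase(Locale.ROOT));
    }
}
```

A header cell written as CreditRating, creditrating or CREDITRATING would then resolve to the same creditRating property.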
The @CsvCell annotation is set on a Bean property. It offers three tools:
- columnIndex; maps a column index to this Bean property
- columnName; maps a column name to this Bean property
- required; when this value is true, the content of the cell must not be empty
When not all columns in a CSV file are needed, columnIndex may be of great help. We have a CSV file here with multiple columns and no headers:
L1C1;L1C2;L1C3;L1C4;valuable info 1;l1C6
L2C1;L2C2;L2C3;L2C4;valuable info 2;l2C6
The Bean property can now be annotated as follows:
@CsvCell(columnIndex = 4)
private String valuableInfo;
Note that columnIndex is zero-based. Also note that Bean properties following valuableInfo will use the set columnIndex as their starting point. In other words, the next property will automatically be mapped to column index 5.
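The "starting point" behaviour can be sketched as a counter that jumps to an explicitly set index and then keeps counting. This is a hypothetical illustration: ExplicitIndexSketch and assign are made-up names, and the explicit indexes are passed in as a plain map rather than read from annotations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: explicit column indexes reset the running counter.
class ExplicitIndexSketch {

    // properties: property names in declared order
    // explicit: property name -> explicitly set column index (stand-in for @CsvCell#columnIndex)
    static Map<String, Integer> assign(String[] properties, Map<String, Integer> explicit) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int next = 0;
        for (String property : properties) {
            Integer set = explicit.get(property);
            if (set != null) {
                next = set; // jump to the explicitly set index
            }
            result.put(property, next++); // following properties continue from here
        }
        return result;
    }
}
```

For the properties id, valuableInfo and remark, with valuableInfo set to 4, this yields the indexes 0, 4 and 5.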
It is possible to set up your own mapping for ColumnNameMapper, which is especially useful if the CSV header is verbose, contains lots of special characters or otherwise has names that translate badly to property names:
the first column; my my, how verbose; isn't it?
@CsvCell(columnName = "the first column")
private String first;
@CsvCell(columnName = "my my, how verbose")
private String second;
@CsvCell(columnName = "isn't it?")
private String third;
Although validation is not the province of CSVeed, this annotation does a little bit to help you along. When a Bean property is marked as required and its value is found to be null or "", an exception will be thrown.
first name, surname, street, city, trademark
'Stephen', 'Hawking', '110th Avenue', 'New York', 'History of the \'Universe\''
'Albert', 'Einstein', '', 'Berlin', '\'E=mc2\''
Note how Einstein's street cell is empty.
@CsvCell(required = true)
private String street;
This will result in the following error:
Exception in thread "main" nl.tweeenveertig.csveed.report.CsvException: Bean property "street" is required and may not be empty or null
2: 'Albert', 'Einstein', '', 'Berlin', '\'E=mc2\''[EOF]
2: ^
Marking Bean properties with @CsvIgnore means they will not be automatically picked up for indexing, neither by ColumnIndexMapper nor by ColumnNameMapper. The Bean property will be completely ignored.
private String name;
@CsvIgnore
private Date birthdate;
private Integer creditRating;
Will lead to the following index:
0 -> name
1 -> creditRating
Converting to java.util.Date from String brings its own challenges. The @CsvDate annotation lets you determine the date format to employ. The default format is "yyyy-MM-dd" (for example 2013-02-28), a date format that also sorts very well.
name;date
Jane;21-03-2011
Jill;03-11-2013
So the date format is day-month-year, or "dd-MM-yyyy".
@CsvDate(format = "dd-MM-yyyy")
private Date date;
Be sure to check the docs on Java SDK's SimpleDateFormat for a better understanding of the syntax involved.
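To see what the pattern "dd-MM-yyyy" does with the sample data, here is a small stand-alone sketch using SimpleDateFormat directly. It only illustrates the pattern syntax; DateFormatSketch and parse are hypothetical names, and whether CSVeed applies the format in exactly this way is an assumption.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical sketch: parses a cell value with a given date pattern.
class DateFormatSketch {

    static Date parse(String text, String pattern) {
        try {
            return new SimpleDateFormat(pattern).parse(text);
        } catch (ParseException e) {
            // Wrap the checked exception so callers get a plain runtime error.
            throw new IllegalArgumentException("Cannot parse '" + text + "' as " + pattern, e);
        }
    }

    public static void main(String[] args) {
        Date date = parse("21-03-2011", "dd-MM-yyyy");
        // Re-format with the default pattern mentioned above.
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(date)); // prints 2011-03-21
    }
}
```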
It is conceivable that you want to bring your own String-to-property conversion into the game, which is why the @CsvConverter annotation exists. First make sure that you create or supply your converter, based on the PropertyEditor interface of the Java SDK.
If you want to create your own PropertyEditor, you are well-advised to make use of PropertyEditorSupport, which leaves you only the essentials to implement:
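import java.beans.PropertyEditorSupport;

public class BeanSimplePropertyEditor extends PropertyEditorSupport {

    // Bean -> String: renders the current value as text
    @Override
    public String getAsText() {
        return ((BeanSimple) getValue()).getName();
    }

    // String -> Bean: converts the text into the property value
    @Override
    public void setAsText(String text) {
        BeanSimple bean = new BeanSimple();
        bean.setName(text);
        setValue(bean);
    }
}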
As you can see, it is basically a matter of supplying a way to go from String to a Class and vice versa. Nothing much to it, really.
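A round trip through such an editor might look like the sketch below. The wrapper class EditorRoundTrip and its roundTrip method are hypothetical, and the nested BeanSimple is only a stand-in for the class used above, which this page does not show.

```java
import java.beans.PropertyEditor;
import java.beans.PropertyEditorSupport;

// Hypothetical sketch: exercises a PropertyEditor in both directions.
class EditorRoundTrip {

    // Stand-in for the BeanSimple class referenced above (assumed shape).
    static class BeanSimple {
        private String name;
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    static class BeanSimplePropertyEditor extends PropertyEditorSupport {
        @Override
        public String getAsText() {
            return ((BeanSimple) getValue()).getName();
        }

        @Override
        public void setAsText(String text) {
            BeanSimple bean = new BeanSimple();
            bean.setName(text);
            setValue(bean);
        }
    }

    // String -> BeanSimple -> String through the editor.
    static String roundTrip(String text) {
        PropertyEditor editor = new BeanSimplePropertyEditor();
        editor.setAsText(text);                           // String to Bean
        BeanSimple bean = (BeanSimple) editor.getValue(); // the converted value
        editor.setValue(bean);
        return editor.getAsText();                        // Bean back to String
    }
}
```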