Have you ever had to read CSV files in order to create Java objects of some domain type in your app? It is a boring and cumbersome task. And if the data is incomplete, e.g. some CSV files have more or fewer fields than those you care about, it becomes even more boring and cumbersome. It should not be! Java offers the means to do this easily. All you have to do is create a Java domain type (class) with the fields of interest and match Java fields to CSV header fields. Leave the rest to csv4j!
Brief Summary
Csv4j is a standalone library written in Java 8. Annotations are optionally used to match a Java field to one or more CSV header fields. In the absence of annotations, csv4j matches Java fields to CSV fields of the same name. Csv4j then makes extensive use of reflection to set the proper values on the proper fields.
Basic assumptions follow:
- The first line of a CSV input file has to be the header line that defines the fields.
- The remaining lines contain data, and each line corresponds to an instance of a domain type.
- A Java domain type has to be given as input, with:
  - a public, no-argument constructor;
  - non-final fields to be set with values from CSV data (additional fields are allowed and may or may not be final; this is irrelevant to csv4j);
  - a standard setter method for each field that needs to be set;
  - a static factory "valueOf" method defined by each field's type. Primitive types are allowed, as they are wrapped by csv4j (well, by Guava :)).
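To make these assumptions concrete, here is a minimal sketch of how a header line and a data line could be turned into an object with plain reflection. It is not csv4j's actual code: the class ReflectiveHydrationSketch, the Point type and the convert and hydrate methods are all made up for illustration, primitives are wrapped with a hand-written map instead of Guava, and lines are split naively on commas with no support for quoting.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

public class ReflectiveHydrationSketch {

    // Hypothetical domain type that satisfies the assumptions listed above.
    public static class Point {
        private int x;
        private double y;
        private String label;

        public Point() {
        }

        public void setX(int x) { this.x = x; }
        public void setY(double y) { this.y = y; }
        public void setLabel(String label) { this.label = label; }

        public int getX() { return x; }
        public double getY() { return y; }
        public String getLabel() { return label; }
    }

    // Primitive types have no valueOf method, so map them to their wrappers first.
    private static final Map<Class<?>, Class<?>> WRAPPERS = new HashMap<>();
    static {
        WRAPPERS.put(int.class, Integer.class);
        WRAPPERS.put(long.class, Long.class);
        WRAPPERS.put(double.class, Double.class);
        WRAPPERS.put(float.class, Float.class);
        WRAPPERS.put(boolean.class, Boolean.class);
        WRAPPERS.put(short.class, Short.class);
        WRAPPERS.put(byte.class, Byte.class);
    }

    // Converts raw CSV text to the field's type through a static valueOf(String).
    public static Object convert(Class<?> type, String raw) throws ReflectiveOperationException {
        if (type == String.class) {
            return raw; // strings need no conversion
        }
        Class<?> target = WRAPPERS.getOrDefault(type, type);
        Method valueOf = target.getMethod("valueOf", String.class);
        return valueOf.invoke(null, raw);
    }

    // Builds one instance from a header line and a data line.
    public static <T> T hydrate(Class<T> type, String headerLine, String dataLine)
            throws ReflectiveOperationException {
        String[] names = headerLine.split(",");
        String[] values = dataLine.split(",");
        T instance = type.getConstructor().newInstance();
        for (int i = 0; i < names.length && i < values.length; i++) {
            Field field;
            try {
                field = type.getDeclaredField(names[i].trim());
            } catch (NoSuchFieldException e) {
                continue; // CSV column without a matching Java field: skip it
            }
            String name = field.getName();
            String setterName = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
            Method setter = type.getMethod(setterName, field.getType());
            setter.invoke(instance, convert(field.getType(), values[i].trim()));
        }
        return instance;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Point p = hydrate(Point.class, "x,ignored,y,label", "3,foo,1.5,origin");
        System.out.println(p.getX() + " " + p.getY() + " " + p.getLabel()); // 3 1.5 origin
    }
}
```

The sketch shows why each assumption matters: the no-argument constructor creates the instance, the wrapper lookup gives primitives a valueOf method, and the setter is what actually receives the converted value. Columns with no matching Java field are skipped, and fields with no matching column keep their default values.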
License and Source Code
Csv4j is an open source project licensed under Apache License, Version 2.0.
The code is hosted on GitHub in the csv4j repository.
Comments, pull requests and any other sort of contribution are more than welcome.
Maven is used for building and dependency management.
Dependencies
Csv4j's only compile-time dependency is Guava. It is used for wrapping primitive types into wrapper types (see com.google.common.primitives.Primitives) and also for checking method preconditions.
For testing, csv4j uses TestNG.
Release
Csv4j is released in the central Maven repository. Add it as a dependency in your project with Maven:

    <dependency>
        <groupId>org.ytheohar</groupId>
        <artifactId>csv4j</artifactId>
        <version>1.0</version>
    </dependency>

or Gradle:

    'org.ytheohar:csv4j:1.0'

Use Cases
Use case 1
Let's say you have the following 3 CSV files.
field0 | field1 | field2
---|---|---
0 | csv | 3.14
1 | 4 | 2.71
2 | j | 1.61
3 | is awesome | 1.41

field0 | field1 | field4 | field2
---|---|---|---
0 | csv | 5 | 3.14
1 | 4 | 76 | 2.71
2 | j | 23 | 1.61
3 | is awesome | -1 | 1.41

field0 | field2
---|---
0 | 3.14
1 | 2.71
2 | 1.61
3 | 1.41

You would model this info in a domain type, let's call it SimpleDomainType, as follows:

    public class SimpleDomainType {
        private int field0;
        private String field1;
        private double field2;

        public SimpleDomainType() {
        }

        public void setField0(int field0) {
            this.field0 = field0;
        }

        public void setField1(String field1) {
            this.field1 = field1;
        }

        public void setField2(double field2) {
            this.field2 = field2;
        }

        // any other methods you wish go here
        // ...
    }

The following code maps a CSV file into a list of SimpleDomainType objects:

    Path p = Paths.get("path/to/file.csv"); // replace with the path to your CSV file
    Hydrator<SimpleDomainType> hydrator = Hydrator.of(SimpleDomainType.class);
    List<SimpleDomainType> objects = hydrator.fromCSV(p);

Use case 2
Java fields are not restricted to String and primitive types, as shown in use case 1, but can be of any type T as long as T defines a static factory method valueOf: String => T. This method is needed to map the string read from the CSV file into an instance of type T. The types of field0 and field2 in use case 1 were wrapped into Integer and Double respectively, both of which define a proper valueOf method. To make this clear, you could write another domain type, let's call it ComplexDomainType, as follows:

    public class ComplexDomainType {
        private MyInt field0;
        private String field1;
        private double field2;

        public ComplexDomainType() {
        }

        public void setField0(MyInt field0) {
            this.field0 = field0;
        }

        public void setField1(String field1) {
            this.field1 = field1;
        }

        public void setField2(double field2) {
            this.field2 = field2;
        }

        // any other methods you wish go here
        // ...
    }

MyInt wraps an int and defines the required valueOf method:

    public class MyInt {
        private final int n;

        public MyInt(int n) {
            this.n = n;
        }

        // static factory used to convert the CSV string into a MyInt
        public static MyInt valueOf(String s) {
            return new MyInt(Integer.parseInt(s));
        }
    }

The following code maps a CSV file into a list of ComplexDomainType objects:

    Path p = Paths.get("path/to/file.csv"); // replace with the path to your CSV file
    Hydrator<ComplexDomainType> hydrator = Hydrator.of(ComplexDomainType.class);
    List<ComplexDomainType> objects = hydrator.fromCSV(p);

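If you are unsure whether a type meets the valueOf requirement, a quick reflective check can tell you. The sketch below is only an illustration: ValueOfCheck, hasStringFactory and the Color enum are hypothetical names, and whether csv4j performs the lookup in exactly this way is an assumption.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.math.BigDecimal;
import java.time.LocalDate;

public class ValueOfCheck {

    // Hypothetical enum; every enum gets a compiler-generated static valueOf(String).
    public enum Color { RED, GREEN }

    // Returns true if the type exposes a public static valueOf(String) that returns that type.
    public static boolean hasStringFactory(Class<?> type) {
        try {
            Method m = type.getMethod("valueOf", String.class);
            return Modifier.isStatic(m.getModifiers())
                    && type.isAssignableFrom(m.getReturnType());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasStringFactory(Integer.class));    // true
        System.out.println(hasStringFactory(Color.class));      // true
        System.out.println(hasStringFactory(BigDecimal.class)); // false: valueOf takes long or double
        System.out.println(hasStringFactory(LocalDate.class));  // false: the factory is parse(CharSequence)
    }
}
```

Types that fail the check, such as BigDecimal or LocalDate, can still be used by wrapping them in a small class with its own valueOf method, exactly as MyInt wraps an int.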
Use case 3
Consider now that you have the following 2 CSV files, data.csv and data2.csv:
field0 | field1 | field2
---|---|---
0 | csv | 3.14
1 | 4 | 2.71
2 | j | 1.61
3 | is awesome | 1.41

field0 | field3 | field2
---|---|---
0 | csv | 3.14
1 | 4 | 2.71
2 | j | 1.61
3 | is awesome | 1.41

And let's say you have a domain type AnnotatedDomainType with fields:
- int field0;
- String att1;
- double att2;
Obviously, Java field field0 will be matched with CSV field field0. You want to match Java field att2 with CSV field field2. Finally, you know that field1 in data.csv and field3 in data2.csv both refer to the same entity, so you want Java field att1 to match both field1 and field3. You can achieve this by defining AnnotatedDomainType as follows:

    public class AnnotatedDomainType {
        // no annotation, so it will be matched with the "field0" CSV field
        private int field0;

        // match att1 with both the "field1" and "field3" CSV fields
        @CsvFields({ "field1", "field3" })
        private String att1;

        // match att2 with "field2"
        @CsvFields({ "field2" })
        private double att2;

        public AnnotatedDomainType() {
        }

        public void setField0(int field0) {
            this.field0 = field0;
        }

        public void setAtt1(String att1) {
            this.att1 = att1;
        }

        public void setAtt2(double att2) {
            this.att2 = att2;
        }

        // any other methods you wish go here
        // ...
    }

The following code maps a CSV file into a list of AnnotatedDomainType objects:

    Path p = Paths.get("path/to/file.csv"); // replace with the path to your CSV file
    Hydrator<AnnotatedDomainType> hydrator = Hydrator.of(AnnotatedDomainType.class);
    List<AnnotatedDomainType> objects = hydrator.fromCSV(p);

The current assumption is that each CSV file contains either field1 or field3, but not both. If a file contains both, the setAtt1 method will be called twice and att1 will end up with the value of whichever of the two CSV fields appears last in the CSV header line.
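The "last one wins" behavior follows naturally from walking the header from left to right. Here is a self-contained sketch of that idea. It does not use csv4j: the Columns annotation is a hypothetical stand-in for @CsvFields, HeaderMatchingSketch, Row, columnIndex and read are invented names, and values are written to fields directly for brevity, whereas csv4j goes through setters.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class HeaderMatchingSketch {

    // Hypothetical stand-in for csv4j's @CsvFields, declared here so the sketch compiles alone.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Columns {
        String[] value();
    }

    public static class Row {
        private String field0;

        @Columns({ "field1", "field3" })
        private String att1;

        public String getField0() { return field0; }
        public String getAtt1() { return att1; }
    }

    // Maps each CSV header name to the Java field it should populate.
    public static Map<String, Field> columnIndex(Class<?> type) {
        Map<String, Field> index = new HashMap<>();
        for (Field f : type.getDeclaredFields()) {
            Columns c = f.getAnnotation(Columns.class);
            if (c == null) {
                index.put(f.getName(), f); // no annotation: match by field name
            } else {
                for (String column : c.value()) {
                    index.put(column, f);
                }
            }
        }
        return index;
    }

    // Walks the header left to right, so a later column overwrites an earlier one.
    public static Row read(String headerLine, String dataLine) throws IllegalAccessException {
        Map<String, Field> index = columnIndex(Row.class);
        String[] names = headerLine.split(",");
        String[] values = dataLine.split(",");
        Row row = new Row();
        for (int i = 0; i < names.length && i < values.length; i++) {
            Field f = index.get(names[i].trim());
            if (f != null) {
                f.setAccessible(true); // direct field access keeps the sketch short
                f.set(row, values[i].trim());
            }
        }
        return row;
    }

    public static void main(String[] args) throws IllegalAccessException {
        Row row = read("field0,field1,field3", "0,first,second");
        System.out.println(row.getAtt1()); // second
    }
}
```

With the header field0,field1,field3 the value of field3 is assigned after that of field1, so att1 ends up as "second". Swapping the two columns in the header would leave att1 as the field1 value instead.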
Find the complete code for this example at AnnotatedDomainType.java and HydratorTest.java (annotatedDomainType test).
Outro
That's it. I hope you enjoy csv4j as much as I do. Until the next post, have fun and take a look at Sane.java, but that deserves a separate post. Stay tuned.