Bart Simons

Parsing CSV files in C# with CsvHelper

Bart Simons

CSV, short for comma-separated values, is a widely used open file format that I often use for exporting data from Microsoft Excel. Data points are separated by a separation character, most often a comma or a semicolon, which makes CSV files easy to read programmatically. But what is the best approach? As the title suggests, I am going to demonstrate a state-of-the-art implementation of a CSV reader in C#.

The simple approach (and why it's so bad)

The most straightforward method is to read the CSV file line by line and split each line on the separation character, which yields an array of strings. There are lots of situations where this goes wrong. One example: the separation character itself can occur inside a quoted field, in which case a plain string split cuts that field in two and shifts every value that follows it.
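To make the failure concrete, here is a minimal sketch of the naive approach. The sample line and the NaiveSplit helper are made up for illustration; they are not taken from the demo data.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Splits a line on the delimiter with no awareness of quoting.
    public static string[] NaiveSplit(string line, char delimiter)
    {
        return line.Split(delimiter);
    }

    public static void Main()
    {
        // Hypothetical record with three fields; the second one contains
        // the delimiter inside a quoted value.
        string line = "A2;\"Utrecht; richting Amsterdam\";12";
        string[] fields = NaiveSplit(line, ';');

        // The record has 3 fields, but the split produces 4 pieces.
        Console.WriteLine(fields.Length);
        foreach (string field in fields)
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```

The quoted value ends up in two separate array slots, and the stray quote characters stay attached to both halves.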

Do not reinvent the wheel - a better implementation is already available!

CsvHelper is a C# library that handles CSV parsing for you. It was created by Josh Close, and I'm a big fan! You can easily add it as a dependency to your projects through NuGet:

Install-Package CsvHelper

To begin with an example of a CsvHelper implementation, we first need some CSV data. I have made sample data for this demo available as a gist on GitHub.

Based on the headers of the file, we can now create a class whose properties mirror the columns of the CSV file:

class Traffic
{
    public String Datum        { get; set; }
    public String Jaar         { get; set; }
    public String Mnd          { get; set; }
    public String Dag          { get; set; }
    public String Ticvanri     { get; set; }
    public String Ticvan       { get; set; }
    public String Richt        { get; set; }
    public String Hm           { get; set; }
    public String Oorz         { get; set; }
    public String Begt         { get; set; }
    public String StUur        { get; set; }
    public String StMin        { get; set; }
    public String Eindt        { get; set; }
    public String EindUur      { get; set; }
    public String EindMin      { get; set; }
    public String Zwaarte      { get; set; }
    public String GemLeng      { get; set; }
    public String Duur         { get; set; }
    public String Dagnr        { get; set; }
    public String Weeknr       { get; set; }
    public String Dagsoort     { get; set; }
    public String G_L          { get; set; }
    public String Provinci     { get; set; }
    public String Routelet     { get; set; }
    public String Routenum     { get; set; }
    public String Routeoms     { get; set; }
    public String Naam_Van     { get; set; }
    public String Naam_Naa     { get; set; }
    public String Hm_Van       { get; set; }
    public String Hm_Naar      { get; set; }
    public String Traj_Van     { get; set; }
    public String Traj_Naa     { get; set; }
    public String Flricht      { get; set; }
    public String FilesAgvWerk { get; set; }
    public String IdWerk       { get; set; }
}
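CsvHelper matches columns to properties by name, so the property names above have to line up with the header row. If you would rather use more descriptive property names, the Name attribute from CsvHelper.Configuration.Attributes maps a header to a differently named property. A small sketch, assuming a recent CsvHelper release; the TrafficSummary class, its property names, and the two-column sample data are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

// Hypothetical class: maps two headers to friendlier property names.
public class TrafficSummary
{
    [Name("Datum")]
    public string Date { get; set; }

    [Name("Provinci")]
    public string Province { get; set; }
}

public static class NameAttributeDemo
{
    public static List<TrafficSummary> Parse(string csvText)
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture) { Delimiter = ";" };
        var result = new List<TrafficSummary>();

        using (var reader = new StringReader(csvText))
        using (var csv = new CsvReader(reader, config))
        {
            // Read the header row so columns can be matched by name.
            csv.Read();
            csv.ReadHeader();
            while (csv.Read())
            {
                result.Add(csv.GetRecord<TrafficSummary>());
            }
        }
        return result;
    }

    public static void Main()
    {
        // Made-up sample data, not the real gist contents.
        string sample = "Datum;Provinci\n2017-01-02;Utrecht\n";
        foreach (TrafficSummary row in Parse(sample))
        {
            Console.WriteLine(row.Date + " -> " + row.Province);
        }
    }
}
```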

We can now implement a list to store all of our Traffic instances:

List<Traffic> ListTraffic = new List<Traffic>();

And this is how we iterate through all records in the CSV file:

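// Requires: using System.Globalization; using System.IO; using CsvHelper; using CsvHelper.Configuration;
// Recent CsvHelper releases take their settings through a CsvConfiguration object.
CsvConfiguration config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = ";",           // the file uses semicolons as separation character
    MissingFieldFound = null   // do not throw when a row has fewer fields than expected
};

using (TextReader reader = File.OpenText(@"/Users/bart/Downloads/Work.csv"))
using (CsvReader csv = new CsvReader(reader, config))
{
    // Read the header row first so fields can be matched to properties by name.
    csv.Read();
    csv.ReadHeader();

    while (csv.Read())
    {
        Traffic Record = csv.GetRecord<Traffic>();
        ListTraffic.Add(Record);
    }
}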

The ListTraffic list is now filled with traffic information read from the CSV file. Don't forget to show your support to Josh Close and his awesome CsvHelper. It has helped me a lot!
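As an alternative to the explicit loop, CsvHelper can also hand back all rows in one call through GetRecords<T>(), which returns a lazily evaluated IEnumerable<T>. The sketch below assumes a recent CsvHelper release and a semicolon delimiter; the TrafficRow class is a hypothetical two-column stand-in for the full Traffic class, and the sample text is made up. One of the sample values contains a semicolon inside quotes, the case that trips up a plain string split.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

// Hypothetical stand-in for the Traffic class, reduced to two columns.
public class TrafficRow
{
    public string Datum { get; set; }
    public string Provinci { get; set; }
}

public static class TrafficLoader
{
    public static List<TrafficRow> Load(TextReader reader)
    {
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            Delimiter = ";",
            MissingFieldFound = null
        };

        using (var csv = new CsvReader(reader, config))
        {
            // GetRecords is lazy, so materialize the rows before the reader is disposed.
            return csv.GetRecords<TrafficRow>().ToList();
        }
    }

    public static void Main()
    {
        // Made-up sample data; the last value holds a quoted semicolon.
        string sample = "Datum;Provinci\n2017-01-02;Utrecht\n2017-01-03;\"Noord; Holland\"\n";
        List<TrafficRow> rows = Load(new StringReader(sample));

        foreach (TrafficRow row in rows)
        {
            Console.WriteLine(row.Datum + " | " + row.Provinci);
        }
    }
}
```

Because Load accepts any TextReader, the same method works with File.OpenText for a file on disk.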
