Bart Simons


Bart Simons
Author


Parsing CSV files in C# with CsvHelper

CSV, short for comma-separated values, is a widely used open file format that I often use for exporting data from Microsoft Excel. Data points are separated by a delimiter character (commas and semicolons are most often used for this), which makes it easy to read CSV files programmatically. But what is the best approach? As the title suggests, I am going to demonstrate a state-of-the-art implementation of a CSV reader in C#.

The simple approach (and why it's so bad)

The most straightforward method is to read the CSV file line by line and split each line on the delimiter character, which gives you an array of strings. There are lots of situations where this can go wrong. One example is a field that contains the delimiter itself: CSV allows this when the field is wrapped in double quotes, but a plain string split knows nothing about quoting and cuts the field in two, which corrupts the row.
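To make the problem concrete, here is a minimal sketch of the naive approach. The sample line and the `NaiveSplitDemo` class are made up for illustration and are not part of the demo data.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Splits on every delimiter and ignores CSV quoting rules entirely.
    public static string[] NaiveSplit(string line, char delimiter)
    {
        return line.Split(delimiter);
    }

    public static void Main()
    {
        // Three logical fields; the second one is quoted and contains the delimiter.
        string line = "1;\"Doe; John\";42";
        string[] parts = NaiveSplit(line, ';');

        // The row has 3 fields, but the split produces 4 pieces.
        Console.WriteLine(parts.Length);
        foreach (string part in parts)
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

The quoted name is cut in two at the semicolon, and the quote characters are left in the output, so every column after it shifts by one position.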

Do not reinvent the wheel - a better implementation is already available!

CsvHelper is a C# library that handles CSV parsing for you. It was created by Josh Close, and I'm a big fan! You can easily add it as a dependency to your projects through NuGet:

Install-Package CsvHelper

To begin with an example of a CsvHelper implementation, we first need to have some CSV information available. I have made some sample data for this demo available as a gist on GitHub.

Based on the headers of the file, we can now create a class whose properties mirror the column names of the CSV file:

class Traffic
{
    public String Datum        { get; set; }
    public String Jaar         { get; set; }
    public String Mnd          { get; set; }
    public String Dag          { get; set; }
    public String Ticvanri     { get; set; }
    public String Ticvan       { get; set; }
    public String Richt        { get; set; }
    public String Hm           { get; set; }
    public String Oorz         { get; set; }
    public String Begt         { get; set; }
    public String StUur        { get; set; }
    public String StMin        { get; set; }
    public String Eindt        { get; set; }
    public String EindUur      { get; set; }
    public String EindMin      { get; set; }
    public String Zwaarte      { get; set; }
    public String GemLeng      { get; set; }
    public String Duur         { get; set; }
    public String Dagnr        { get; set; }
    public String Weeknr       { get; set; }
    public String Dagsoort     { get; set; }
    public String G_L          { get; set; }
    public String Provinci     { get; set; }
    public String Routelet     { get; set; }
    public String Routenum     { get; set; }
    public String Routeoms     { get; set; }
    public String Naam_Van     { get; set; }
    public String Naam_Naa     { get; set; }
    public String Hm_Van       { get; set; }
    public String Hm_Naar      { get; set; }
    public String Traj_Van     { get; set; }
    public String Traj_Naa     { get; set; }
    public String Flricht      { get; set; }
    public String FilesAgvWerk { get; set; }
    public String IdWerk       { get; set; }
}

We can now implement a list to store all of our Traffic instances:

List<Traffic> ListTraffic = new List<Traffic>();

And this is how we iterate through all records in the CSV file:

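using (TextReader reader = File.OpenText(@"/Users/bart/Downloads/Work.csv"))
{
    CsvReader csv = new CsvReader(reader);
    // The sample file is separated by semicolons instead of commas.
    csv.Configuration.Delimiter = ";";
    // Do not throw when a row has fewer fields than the class has properties.
    csv.Configuration.MissingFieldFound = null;
    // Read() advances to the next row and returns false at the end of the file.
    while (csv.Read())
    {
        // Map the current row onto a Traffic instance by header name.
        Traffic Record = csv.GetRecord<Traffic>();
        ListTraffic.Add(Record);
    }
}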

The ListTraffic list is now filled with traffic records read from the CSV file. Don't forget to show your support to Josh Close and his awesome CsvHelper. It has helped me a lot!
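If you are on a recent CsvHelper release, the reader takes its settings through a `CsvConfiguration` object passed to the constructor, so the code above may need adjusting. The following is a rough sketch under that assumption, not a drop-in replacement: the `TrafficSample` class is a hypothetical three-column stand-in for `Traffic`, and the inline data is invented so the example runs without the demo file.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;

// Hypothetical cut-down record; the real Traffic class has many more columns.
public class TrafficSample
{
    public string Datum { get; set; }
    public string Oorz { get; set; }
    public string Duur { get; set; }
}

public static class CsvHelperSketch
{
    public static List<TrafficSample> Parse(TextReader reader)
    {
        // Same two settings as in the post, supplied up front.
        var config = new CsvConfiguration(CultureInfo.InvariantCulture)
        {
            Delimiter = ";",
            MissingFieldFound = null
        };

        using (var csv = new CsvReader(reader, config))
        {
            // GetRecords is lazy, so materialize it before the reader is disposed.
            return csv.GetRecords<TrafficSample>().ToList();
        }
    }

    public static void Main()
    {
        // Invented sample data: the quoted field contains the delimiter.
        string data = "Datum;Oorz;Duur\n" +
                      "2018-01-02;\"a; b\";35\n";

        List<TrafficSample> rows = Parse(new StringReader(data));
        foreach (TrafficSample row in rows)
        {
            Console.WriteLine(row.Datum + " | " + row.Oorz + " | " + row.Duur);
        }
    }
}
```

Unlike the plain string split, the quoted field comes back as a single value with its semicolon intact.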
