Working with CSV
Learn how to read and write Comma-Separated Values (CSV) files using asynchronous file operations and string manipulation, and understand the limitations of manual parsing.
CSV (Comma-Separated Values) is a widely used plain-text format for storing and exchanging tabular data. Spreadsheets, databases, and reporting tools commonly rely on CSV files for data imports and exports.
The structure of a CSV file is highly predictable. The file consists of multiple lines. The first line is typically a header row defining the column names. Every subsequent line represents a single data record. Within each line, individual data fields are separated by a comma.
Id,Name,Price
1,Laptop,999.99
2,Wireless Mouse,25.50
3,Mechanical Keyboard,85.00
How to read a CSV file
To process a CSV file, we must read the text from the disk and parse it into strongly typed C# objects. Instead of reading the entire file as one massive string using File.ReadAllTextAsync, we use File.ReadAllLinesAsync(). This method reads the file and returns an array of strings, where each element represents a single line.
We iterate through this array, skipping the header row. For each data row, we use the string.Split(',') method to divide the text into an array of individual fields.
Line 7: We asynchronously retrieve an array of strings representing every line in the file.
Line 10: We begin our for loop at index 1. Index 0 contains the “Id,Name,Price” header string, which cannot be parsed into numerical product data.
Line 13: We separate the raw string by its commas, generating a new string[] containing the three distinct values.
Lines 16–18: We parse the string values into their appropriate C# data types.
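The reading steps described above can be sketched roughly as follows. This is a minimal illustration, not the lesson's original listing, so the line numbers in the notes above do not map onto it. The Product class, the CsvReaderSketch class, and the ReadProductsAsync method are hypothetical names chosen for the example. The use of CultureInfo.InvariantCulture and the blank-line check are also assumptions, added so that a value like 999.99 parses the same way regardless of the machine's regional settings.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Threading.Tasks;

// Hypothetical model matching the Id,Name,Price columns.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public double Price { get; set; }
}

public static class CsvReaderSketch
{
    public static async Task<List<Product>> ReadProductsAsync(string path)
    {
        // Read every line of the file into a string array.
        string[] lines = await File.ReadAllLinesAsync(path);

        var products = new List<Product>();

        // Start at index 1 to skip the header row at index 0.
        for (int i = 1; i < lines.Length; i++)
        {
            // Assumption: ignore blank lines, such as a trailing empty line.
            if (string.IsNullOrWhiteSpace(lines[i]))
            {
                continue;
            }

            // Naive split: correct only when no field contains a comma.
            string[] fields = lines[i].Split(',');

            products.Add(new Product
            {
                Id = int.Parse(fields[0], CultureInfo.InvariantCulture),
                Name = fields[1],
                Price = double.Parse(fields[2], CultureInfo.InvariantCulture)
            });
        }

        return products;
    }
}
```

A caller would await CsvReaderSketch.ReadProductsAsync("products.csv"), where the file name is only a placeholder.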
How to write a CSV file
Writing data to a CSV file requires the reverse process. We must take a collection of C# objects, format their properties into comma-separated strings, and write those strings to the disk.
We construct a List<string>, manually append the header row first, and then iterate through our objects using string interpolation to format each data row. Finally, we use File.WriteAllLinesAsync() to safely write the entire collection to a file.
Line 14: We explicitly add the column headers as the first entry in our string collection.
Lines 17–20: We iterate over the Product list. We use string interpolation to insert the commas between the property values.
Line 23: We pass our fully constructed List<string> directly into File.WriteAllLinesAsync. This method opens the file, writes each string on its own line, and closes the file when it finishes.
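The writing steps described above can be sketched roughly as follows. Again, this is an illustration under stated assumptions rather than the lesson's original listing, so the line numbers in the notes above do not map onto it. Product, CsvWriterSketch, ToCsvLines, and WriteProductsAsync are hypothetical names. The sketch also assumes prices should be written with two decimals, and it uses FormattableString.Invariant so that the decimal separator is always a period. On a machine whose culture uses a decimal comma, plain interpolation could emit 25,50 and silently add an extra column.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical model matching the Id,Name,Price columns.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public double Price { get; set; }
}

public static class CsvWriterSketch
{
    // Builds the header row followed by one formatted row per product.
    public static List<string> ToCsvLines(IEnumerable<Product> products)
    {
        var lines = new List<string> { "Id,Name,Price" };

        foreach (Product p in products)
        {
            // Invariant formatting keeps '.' as the decimal separator.
            // Note: Name is written as-is, so a comma inside it would
            // corrupt the row.
            lines.Add(FormattableString.Invariant($"{p.Id},{p.Name},{p.Price:F2}"));
        }

        return lines;
    }

    // Writes all rows to the file, creating or overwriting it.
    public static Task WriteProductsAsync(string path, IEnumerable<Product> products)
    {
        return File.WriteAllLinesAsync(path, ToCsvLines(products));
    }
}
```

Separating ToCsvLines from the file write is a design choice for the sketch: the formatting logic can be checked without touching the disk.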
Limitations of manual parsing
Using string.Split(',') works perfectly for simple datasets. However, CSV data in the real world is rarely perfectly clean. The most common edge case occurs when the data itself contains a comma.
Consider a product with the following name: Monitor, 27-inch.
If we format this into a CSV string, it looks like this: 1,Monitor, 27-inch,199.99
The string.Split method separates the string at every comma without evaluating the surrounding context. If we call Split(',') on this row, it will produce four fields instead of three. The price parsing logic will immediately throw an exception because it attempts to parse the string “ 27-inch” into a double.
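One way to make this failure visible instead of letting it surface as an exception is to check the field count and use double.TryParse before trusting a row. The sketch below is a hypothetical helper (SplitPitfall and TryReadPrice are names invented for this example), and it assumes the three-column Id,Name,Price layout used throughout this lesson.

```csharp
using System.Globalization;

public static class SplitPitfall
{
    // Returns true only when the row splits into exactly three fields
    // and the third field is a valid number.
    public static bool TryReadPrice(string row, out double price)
    {
        price = 0;

        // Split ignores context, so "1,Monitor, 27-inch,199.99"
        // yields four fields rather than three.
        string[] fields = row.Split(',');
        if (fields.Length != 3)
        {
            return false;
        }

        return double.TryParse(
            fields[2],
            NumberStyles.Float,
            CultureInfo.InvariantCulture,
            out price);
    }
}
```

This check only detects the malformed row; it does not recover the intended three fields.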
CSV has no single binding standard, but RFC 4180, the most widely cited description of the format, states that fields containing commas should be enclosed in double quotes (e.g., 1,"Monitor, 27-inch",199.99). Manually writing C# logic to detect quotes and ignore internal commas is complex and error-prone. To handle these edge cases reliably, developers use third-party NuGet packages like CsvHelper. These libraries handle the parsing logic automatically, allowing developers to map complex CSV files directly to C# models.
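To give a sense of the state tracking that quoted fields require, here is a minimal sketch of a quote-aware splitter for a single line. QuotedCsvSketch and SplitLine are hypothetical names, and this is not how CsvHelper is implemented. It handles only commas inside double quotes and doubled quotes ("") as an escaped quote. It does not handle quoted fields that span multiple lines or malformed quoting, which is part of why a maintained library is usually the safer choice.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuotedCsvSketch
{
    // Splits one CSV line into fields, treating commas inside
    // double quotes as part of the field.
    public static List<string> SplitLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];

            if (inQuotes)
            {
                if (c == '"')
                {
                    // A doubled quote inside a quoted field is a literal quote.
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        current.Append('"');
                        i++;
                    }
                    else
                    {
                        inQuotes = false; // closing quote
                    }
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true; // opening quote
            }
            else if (c == ',')
            {
                // An unquoted comma ends the current field.
                fields.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // Add the final field after the last comma.
        fields.Add(current.ToString());
        return fields;
    }
}
```

Even this reduced version needs a flag, a lookahead, and a buffer, compared with the single Split(',') call it replaces.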