Tuesday, August 23, 2016

ADO.NET: Reading CSV Files with ADO.NET fails when a Filename contains Special Characters

Most of my blog postings are about success meaning I demonstrate how do perform a certain task. This is a blog posting about failure, namely it is impossible to use ADO.NET to read CSV file into a DataTable if the file name has special characters. This posting is to demonstrate one of pitfalls associated with a previously discussed top ADO.NET: Reading CSV Files.

To understand the limitation of ADO.NET parting a CSV file using the Microsfot Jet OLEDB provider  (Microsoft.Jet.OLEDB.4.0) consider the following CSV file:
    AdventureWorks.Person.Person.csv

Do not mistake the file for a table in AdventureWorks using a name qualified by database, schema and database object name  (Database.Schema.DBObject). AdventureWorks.Person.Person.csv is a file residing on disk.

The code that should be able to generate a DataTable from the AdventureWorks.Person.Person.csv CSV file is as follows:

public static DataTable GetDataTableFromCSV(
                          string folderPath, string csvFileName)
{
    DataTable table = new DataTable();
    string connectionText = String.Format(
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}'" +
             ";Extended Properties='text;HDR=Yes';",
        folderPath);
    string commandText = String.Format(
                       "SELECT * FROM [{0}]", csvFileName);

    using (OleDbConnection connection = 
                  new OleDbConnection(connectionText))
    {
        connection.Open();
        using (OleDbDataAdapter command = new 
                OleDbDataAdapter(commandText, connection))
        {
            command.Fill(table);
            command.Dispose();
        }

        connection.Close();
    }

    return table;
}

private const string _csvFilename = 
         "AdventureWorks.Person.Person.csv";

static void Main(string[] args)
{
    DataTable table;
    string executableFilename = Environment.GetCommandLineArgs()[0];
    string executablePath = 
              Path.GetDirectoryName(executableFilename);

    try
    {
        string fullyQUalifiedFilename = 
              Path.Combine(executablePath, _csvFilename);

        if (File.Exists(fullyQUalifiedFilename))
        {
            Console.WriteLine("File found on disk: " + 
                              fullyQUalifiedFilename);
        }

        table = GetDataTableFromCSV(executablePath, _csvFilename);
    }

    catch (Exception ex)
    {
        Console.Error.WriteLine("Error: " + ex.Message);
    }
}

The previous code tests using File.Exists that the file exists on disk:
    if (File.Exists(fullyQUalifiedFilename))
    {
        Console.WriteLine("File found on disk: " + 
                          fullyQUalifiedFilename);
    }

The File.Exists method in the previous code returns try as the file exists on disk.

The command passed to the OleDbDataAdapter object is as follows:
    string commandText = String.Format(
                       "SELECT * FROM [{0}]", csvFileName);

Notice the that the filename is surrounded by square brackets. The square brackets should allow the query to  handling filenames that contain special characters such as periods or spaces. The previous command text generates an exception at the line of code indicated below (command.Fill(table);):
    string commandText = String.Format(
                     "SELECT * FROM [{0}]", csvFileName);

    using (OleDbConnection connection = 
              new OleDbConnection(connectionText))
    {
        connection.Open();
        using (OleDbDataAdapter command = 
              new OleDbDataAdapter(commandText, connection))
            {
                command.Fill(table); // Exception generated here
                command.Dispose();
            }

The irony is that the follow T-SQL executes successfully when a table name contains period and a space:
    SELECT * FROM [NameWithPeriod.AndA Space]

To further test ADO.NET's accessing CSV files with the jet provider the following files will be tested:
  • FileWithA Space.csv: succeeds when surrounded by square brackets
  • FileWithA-Dash.csv: succeeds when surrounded by square brackets
  • FileWithLone.Period.csv: fails even when surrounded by square brackets
A second test will be run on the files but this time using the following command text where the CSV file name is not surrounded by square brackets:
    string commandText = String.Format(
                       "SELECT * FROM {0}", csvFileName);

When the application is running without the square brackets the following behavior is exhibited for the same CSV files:
  • FileWithA Space.csv: fails
  • FileWithA-Dash.csv: fails
  • FileWithLone.Period.csv: fails

The error generated by attempting to access the previous CSV files without using square brackets is as follows:
    Syntax error in FROM clause.

Conclusion

Clearly using square brackets around the CSV file name in the command text should be part of any project using ADO.NET to parse CSV files. The square brackets allow spaces in the filename and dashes in the filename.

Recall in an early entry the 2004 MSDN article was referenced as introducing accessing CSV files using ADO.NET and the Jet OLEDB Provider: Much ADO About Text Files. Be aware that previously referenced article does not use square brackets. As was mentioned in a previous post, the article does not surround the folder path in the connection string with single quotes. The single quotes around the folder path in the connection string and the square brackets around the CSV filename in the command text should be standard operation procedure when coding using this technology.

Accessing Source Code

The source code for this project can be found at GitHub at https://github.com/softwarepronto/Blog. The solution containing the code is ADODotNetReadingCSVFiles. The projects of note are:
  • CSVToDataTableUsingADODotNetBadFilename
  • CSVToDataTableUsingADODotNetAFewBadFilenames

No comments :

Post a Comment