Tuesday, August 23, 2016

ADO.NET: Reading CSV Files with ADO.NET fails when a Filename contains Special Characters

Most of my blog postings are about success meaning I demonstrate how do perform a certain task. This is a blog posting about failure, namely it is impossible to use ADO.NET to read CSV file into a DataTable if the file name has special characters. This posting is to demonstrate one of pitfalls associated with a previously discussed top ADO.NET: Reading CSV Files.

To understand the limitation of ADO.NET parting a CSV file using the Microsfot Jet OLEDB provider  (Microsoft.Jet.OLEDB.4.0) consider the following CSV file:
    AdventureWorks.Person.Person.csv

Do not mistake the file for a table in AdventureWorks using a name qualified by database, schema and database object name  (Database.Schema.DBObject). AdventureWorks.Person.Person.csv is a file residing on disk.

The code that should be able to generate a DataTable from the AdventureWorks.Person.Person.csv CSV file is as follows:

public static DataTable GetDataTableFromCSV(
                          string folderPath, string csvFileName)
{
    DataTable table = new DataTable();
    string connectionText = String.Format(
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}'" +
             ";Extended Properties='text;HDR=Yes';",
        folderPath);
    string commandText = String.Format(
                       "SELECT * FROM [{0}]", csvFileName);

    using (OleDbConnection connection = 
                  new OleDbConnection(connectionText))
    {
        connection.Open();
        using (OleDbDataAdapter command = new 
                OleDbDataAdapter(commandText, connection))
        {
            command.Fill(table);
            command.Dispose();
        }

        connection.Close();
    }

    return table;
}

private const string _csvFilename = 
         "AdventureWorks.Person.Person.csv";

static void Main(string[] args)
{
    DataTable table;
    string executableFilename = Environment.GetCommandLineArgs()[0];
    string executablePath = 
              Path.GetDirectoryName(executableFilename);

    try
    {
        string fullyQUalifiedFilename = 
              Path.Combine(executablePath, _csvFilename);

        if (File.Exists(fullyQUalifiedFilename))
        {
            Console.WriteLine("File found on disk: " + 
                              fullyQUalifiedFilename);
        }

        table = GetDataTableFromCSV(executablePath, _csvFilename);
    }

    catch (Exception ex)
    {
        Console.Error.WriteLine("Error: " + ex.Message);
    }
}

The previous code tests using File.Exists that the file exists on disk:
    if (File.Exists(fullyQUalifiedFilename))
    {
        Console.WriteLine("File found on disk: " + 
                          fullyQUalifiedFilename);
    }

The File.Exists method in the previous code returns try as the file exists on disk.

The command passed to the OleDbDataAdapter object is as follows:
    string commandText = String.Format(
                       "SELECT * FROM [{0}]", csvFileName);

Notice the that the filename is surrounded by square brackets. The square brackets should allow the query to  handling filenames that contain special characters such as periods or spaces. The previous command text generates an exception at the line of code indicated below (command.Fill(table);):
    string commandText = String.Format(
                     "SELECT * FROM [{0}]", csvFileName);

    using (OleDbConnection connection = 
              new OleDbConnection(connectionText))
    {
        connection.Open();
        using (OleDbDataAdapter command = 
              new OleDbDataAdapter(commandText, connection))
            {
                command.Fill(table); // Exception generated here
                command.Dispose();
            }

The irony is that the follow T-SQL executes successfully when a table name contains period and a space:
    SELECT * FROM [NameWithPeriod.AndA Space]

To further test ADO.NET's accessing CSV files with the jet provider the following files will be tested:
  • FileWithA Space.csv: succeeds when surrounded by square brackets
  • FileWithA-Dash.csv: succeeds when surrounded by square brackets
  • FileWithLone.Period.csv: fails even when surrounded by square brackets
A second test will be run on the files but this time using the following command text where the CSV file name is not surrounded by square brackets:
    string commandText = String.Format(
                       "SELECT * FROM {0}", csvFileName);

When the application is running without the square brackets the following behavior is exhibited for the same CSV files:
  • FileWithA Space.csv: fails
  • FileWithA-Dash.csv: fails
  • FileWithLone.Period.csv: fails

The error generated by attempting to access the previous CSV files without using square brackets is as follows:
    Syntax error in FROM clause.

Conclusion

Clearly using square brackets around the CSV file name in the command text should be part of any project using ADO.NET to parse CSV files. The square brackets allow spaces in the filename and dashes in the filename.

Recall in an early entry the 2004 MSDN article was referenced as introducing accessing CSV files using ADO.NET and the Jet OLEDB Provider: Much ADO About Text Files. Be aware that previously referenced article does not use square brackets. As was mentioned in a previous post, the article does not surround the folder path in the connection string with single quotes. The single quotes around the folder path in the connection string and the square brackets around the CSV filename in the command text should be standard operation procedure when coding using this technology.

Accessing Source Code

The source code for this project can be found at GitHub at https://github.com/softwarepronto/Blog. The solution containing the code is ADODotNetReadingCSVFiles. The projects of note are:
  • CSVToDataTableUsingADODotNetBadFilename
  • CSVToDataTableUsingADODotNetAFewBadFilenames

Monday, August 22, 2016

Web API 2: C# Upload File using HTML, CURL and .NET Clients

This posting presents the code used on the server-side to receive a file from a web client (Web API 2, C#). Also shown will be  three clients: HTML, C#/.NET and CURL. As I've said previous, CURL is my preferred HTTP client as it is platform agnostic. An IOS developer should be able to reverse engineer any web service example written in CURL.

HTML Client

Before presenting the server code used to upload a file, consider the following web page, detault.html, which allows a file to be uploaded:



The HTML code for the page is as follows:

<!DOCTYPE html>
<html>
<head>
    <title>Test file upload</title>
    <meta charset="utf-8" />
</head>
<body>
    <form action="http://localhost:4323/API/Files"
          enctype="multipart/form-data" method="post">
        <p>
            Select a file to upload:<br>
            <input type="file" name="datafile" size="50">
        </p>
        <div>
            <input type="Submit" value="Send">
        </div>
    </form>
</body>
</html>

The page contains a form which will perform an HTTP POST to the development URL:
    http://localhost:4323/API/Files

The page contains an input control of type file (named "datafile") which is clearly visible in the screenshot. The server side code needs to receive an HTTP POST and process the file accordingly.

Web Service

A controller named Files was was implemented in the web service which means the URL used by the HTML client makes sense as the web service URL contains API\Files:
    http://localhost:4323/API/Files

The code is as follows where the method implemented is Post() corresponding POST protocol used by the HTML client:

public class FilesController : ApiController
{
    private const string _fileUploadName = "datafile";

    public int Post()
    {
        string serverFolder = HostingEnvironment.MapPath(
                                      "~/Uploads/");
        HttpFileCollection files =       
             HttpContext.Current.Request.Files;
        HttpPostedFile postedFile = files[_fileUploadName];

        if (postedFile == null)
        {
            return 0;
        }

        string destinationFilename = Path.Combine(
                    serverFolder, 
                    Path.GetFileName(postedFile.FileName));

        if (File.Exists(destinationFilename))
        {
            return 0;
        }

        postedFile.SaveAs(destinationFilename);

        return 1;
    }
}


The following code creates path to the Uploads folder such that is resides under the web services folder:
    string folder = HostingEnvironment.MapPath("~/Uploads/");

The request, System.Web.HttpContext.Current.Request, HttpRequest contains a Files collection corresponding to the file or files uploaded:
    HttpFileCollection files = 
        System.Web.HttpContext.Current.Request.Files;

The file is stored in the Files collection with key "datafile":
    private const string _fileUploadName = "datafile";

    HttpPostedFile postedFile = files[_fileUploadName];

The HttpPostedFile class contains a SaveAs method that allows the files to be saved to physical media on the server:
    postedFile.SaveAs(destinationFilename);

Curl Client

The CURL code to upload a file is as follows:
curl -X POST -Fdatafile=@FileToUpload.txt http://localhost:4323/API/Files

The -X POST command-line parameters means CURL will execute a POST message. The -F is the form data where the input is named "datafile" and is associated with a file, @FileToUpload.Txt. The @ prefix indicates a file. The file resides in the same folders as the CURL script. The FileToUpload.txt is ultimately what gets uploaded.

C# Client

The C# client that uploads a file is follows:

private const string _fileToUpload = "FileToUpload.txt";

private const string _fileUploadName = "datafile";

private const string _uploadEndPoint = @"http://localhost:4323/API/Files";

static void Main(string[] args)
{
    using (HttpClient httpClient = new HttpClient())
    {
        using (MultipartFormDataContent content = 
                         new MultipartFormDataContent())
        {
            using (FileStream stream = File.Open(
                  _fileToUpload, FileMode.Open, FileAccess.Read))
            {
                using (StreamContent streamConent = 
                                    new StreamContent(stream))
                {
                    Task taskUpload;

                    content.Add(
                      streamConent, _fileUploadName, _fileToUpload);
                    taskUpload = httpClient.PostAsync(
                          _uploadEndPoint, content);
                    taskUpload.Wait();
                    if (taskUpload.Status == 
                                TaskStatus.RanToCompletion)
                    {
                        Console.WriteLine("File uploaded");
                    }

                    else
                    {
                        Console.WriteLine("File uploaded");
                    }
                }
            }
        }
    }
}

The HTTP client class (the class that performs the actualy post) is called (appropriately) HttpClient:
    HttpClient httpClient

The variable corresponding to the web forms's controls is MultipartFormDataContent (the form content) which is declared as follows:
    using (MultipartFormDataContent content = 
                         new MultipartFormDataContent())

The physical file is opened as a FileStream and that FileStream is associated with a StreamContent instance used to associated the file as part of the Form content submitted:

    using (FileStream stream = File.Open(
                  _fileToUpload, FileMode.Open, FileAccess.Read))
    {
        using (StreamContent streamConent = 
                                    new StreamContent(stream))

When the stream content is associated with the form the Add method is invoked. The second parameter of the Add method is passed the value _fileUploadName. The data associatd with _fileUploadName is filedata which was the name of the file input control in the original HTML client demonstrated:
    content.Add(streamConent, _fileUploadName, _fileToUpload);

The HTTP POST supported by an asynchronous method within HttpClient:
    taskUpload = httpClient.PostAsync(_uploadEndPoint, content);
    taskUpload.Wait();

Source Code

The source code for this project can be found in GitHub at https://github.com/softwarepronto/Blog:
  • C# Web Service: DemoWebService01
  • HTML Upload Gage: DemoWebClient01\default.html
  • C# file upload client: Client project
  • Curl file upload script: Tests\Local\JSON\PostAFile.bat

Friday, August 19, 2016

ADO.NET: Reading CSV Files with ADO.NET fails for .NET 4.0 using Solution Platforms x64 or Any CPU

My development machine runs 64-bit Windows 10. The current application I was adding features to is C#, .NET 4.0 configured to build with Solution Platform, Any CPU. This means that on a 64-bit machine (like my development machine) the code will run as a 64-bit executable. On a 32-bit machine the code will run as a 32-bit executable. The feature I added was the ability to read a CSV file from disk and manipulate the data using a DataTable. This is the same functionality presented in "ADO.NET: Reading CSV Files."

The code (show below) generates an exception at the line connection.Open(). The message associated with the exception is:
The 'Microsoft.Jet.OLEDB.4.0' provider is not registered on the local machine.


The code which demonstrates this error but of more importance is the version of .NET the application is built with (.NET 4.0) and the Solution Platform (Any CPU). It should be noted that a Solution Platform of x64 would also generate the same exception:


There is no real solution to this issue is there is no 64-bit version of the driver for .NET 4.0 Obviously changing the platform solution to x86 will permit the code to run. That is not a solution. It is a work around. Also upgrading the code to run with 4.5 or later would fix the issue but the only reason this is an issue is that there a requirement to develop a 64-bit application using .NET 4.0.

Changing the version of .NET for the project  can be addressed using Project Properties, the Applications tab:



Changing the platform target for the project to x86 can be addressed using Project Properties, the Build tab:


The code for this application (which obviously generates an exception) can be found on Github at https://github.com/softwarepronto/Blog, The solution is named ADODotNetReadingCSVFiles and the project is named CVSToDataTableUsingADODotNet4.0.




Thursday, August 18, 2016

ADO.NET: Reading CSV Files

It is 2016 and generating CSV files is a mature technology (can it really be called a technology?). For example ADO.NET has been able to consume a CSV file and create a corresponding DataTable for at least a dozen years (see the MSDN article from March 5, 2004, Much ADO About Text Files).  This post will cover the correct way to use ADO.NET to access CSV files and transform said files into an ADO.NET DataTable.

How not to do it

The impetus for this blog posting is a project I was assigned which accessed CSV files all the wrong ways. I was taken aback recently with a 2016  era project when I saw it include code to read each line in a text file (System.IO.File.ReadAllLines) and use Array.Split to process the CSV and at the same time to do it in a flawed manner. An example of this travesty against Alan Turing and all things good is as follows:

private static DataTable ReadCSVFile()
{
    string[] rows = File.ReadAllLines(GetCSVFilename());
    string[] columnNames = null;
    DataTable table = new DataTable();

    if (rows.Length == 0)
    {
        return table; // error empty CSV file
    }

    columnNames = rows[0].Split(',');
    foreach (string columnName in columnNames)
    {
        table.Columns.Add(columnName);
    }

    object[] rowValues = new object[columnNames.Length];

    // start from 1 b/c rowindex=0 is the header row of column names
    for (int rowIndex = 1; rowIndex < rows.Length; rowIndex++)
    {
        string rowLine = rows[rowIndex];
        // flaw in code -- any string containing a column will 
        // generate an extra column
        string[] columnValues = rowLine.Split(',');

        if (columnNames.Length != columnValues.Length)
        {
          throw new Exception(
            "Split() failed because there is a comma in the data.");

        }

        for (int columnIndex = 0; 
             columnIndex < columnValues.Length; 
             columnIndex++)
        {
            rowValues[columnIndex] = columnValues[columnIndex];
        }

        table.Rows.Add(rowValues);
    }

    return table; 
}

Pay close attention to the code marked in boldface which is a core flaw in how the CSV values are parsed. This bug is the reason it is better to use established technologies. The  bug show up at line 291 due this text data that contains a column:
  For urgent issues, do not send e-mail

How to access a CSV file using ADO.NET's OLE DB Provider

This article will demonstrated a cleaner way to convert a CSV file into a DataTable. The previously referenced MSDN article that demonstrates CSV file to DataTable code is actually flawed with respect to how connections to CSV files are handled and how the command text related to accessing values is written.  The example presented here will address these issues and said example is as follows:

public static DataTable GetDataTableFromCSV(string folderPath, 
                                            string csvFileName)
{
    DataTable table = new DataTable();
    string connectionText = String.Format(
        "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';" 
          "Extended Properties='text;HDR=Yes';",
        folderPath);
    string commandText = String.Format(
        "SELECT * FROM [{0}]", csvFileName);

    using (OleDbConnection connection = 
                  new OleDbConnection(connectionText))
    {
        connection.Open();
        using (OleDbDataAdapter command = 
             new OleDbDataAdapter(commandText, connection))
        {
            command.Fill(table);
            command.Dispose();
        }

        connection.Close();
    }
}

The connection text in the previous code includes the folder in which the CSV file resides. The folder path is assigned to the Data Source property of the connection string as follows:
  "Provider=Microsoft.Jet.OLEDB.4.0;Data Source='{0}';" 
    "Extended Properties='text;HDR=Yes';",

Notice that the {0} String.Format parameters is bracketed by single quotes: '{0}'. This single quotes protect the code from any special characters in the folder path that would cause an error in the connection string.

The command text is a SQL SELECT where the CSV filename is the value specified for the FROM clause:
  "SELECT * FROM [{0}]"

Notice that the {0} String.Format parameters is bracketed by square brackets. In a perfect would this would protect the command execution from special characters in the filename.

Accessing the Source Code

Each project used in this application can be found in Github for user SoftwarePronto (https://github.com/softwarepronto/Blog). The solution associated with blog is:

  • ADODotNetReadingCSVFiles

The pertinent projects are:

  • BarbaricCSVAccess
  • CSVToDataTableUsingADODotNet 


Each project includes a sample CSV file that is part of the project and is copied to the build folder. The settings associated with supporting thsi were discussed in a previous posting, "Visual Studio Installer Projects: Installing PDFs and other Content with an Install Project".

Thursday, August 11, 2016

Windows Forms: Determining the size required to display a string (Graphics MeasureString)

Before the age of computers, typewriters could write pica (10 letters per-inch) and elite (12 letters per-inch). With fixed width fonts it was clear how much text took how much space.  That was decades ago so consider the case of a Windows Forms a TextBox where the fonts are not fixed width and there is support for different font sizes. There needs to be a way to insurance the size of the TextBox is large enough (the Size property) to accommodate the length of the text assigned to the TextBox's Text property.

The Windows forms application below is not very creative. A numeric up/down control specifies the number of characters to display in a TextBox. Below 1 character (a zero) is displayed in the TextBox:


Below 2 characters (a zero and a one) are displayed in the TextBox:




Below 6 characters (a zero through five) are assigned to the TextBox but the number five is not displayed as the TextBox is not off sufficient width::


The code associated with the flawed application is follows:

public partial class ShowWhatIsNotSeen : Form
{
  public ShowWhatIsNotSeen()
  {
    InitializeComponent();
  }

  private void HandleTextToShow()
  {
    textBoxDisplayData.Text = String.Empty;
    for (int i = 0; i < numericUpDownDataToShow.Value; i++)
    {
      textBoxDisplayData.Text += i.ToString();
    }
  }

  private void numericUpDownRowsToShow_ValueChanged(object sender, 
                                                    EventArgs e)
  {
    HandleTextToShow();
  }

  private void ShowWhatIsNotSeen_Load(object sender, EventArgs e)
  {
    HandleTextToShow();
  }
}

The following code demonostrates the Graphics class's MeasureString is used to insure the TextBox is large enough to hold the string assigned to its Text property:

    private void HandleTextToShow()
    {
      textBoxDisplayData.Text = String.Empty;
      for (int i = 0; i < numericUpDownDataToShow.Value; i++)
      {
        textBoxDisplayData.Text += i.ToString();
      }

      using (Graphics graphics = CreateGraphics())
      {
        SizeF size = 
           graphics.MeasureString(textBoxDisplayData.Text, 
                                  textBoxDisplayData.Font);
        int width = (int)Math.Ceiling(size.Width) + 
                    textBoxDisplayData.Margin.Left + 
                    textBoxDisplayData.Margin.Right;
        int height = (int)Math.Ceiling(size.Height) + 
                     textBoxDisplayData.Margin.Top + 
                     textBoxDisplayData.Margin.Bottom;

        textBoxDisplayData.Size = new Size(width, height);
      }
    }

Wednesday, August 10, 2016

Windows Forms:TableLayoutPanel's Hidden Methods GetColumnWidths/GetRowHeights

The TableLayoutPanel Windows Forms control allows a grid of controls to be displayed. This is a handy control for organizing uniform layouts but when using TableLayoutPanel it is not clear how to determine the widths of each column and the height of each row. The TableLayoutPanel  control contains the GetColumnWidths and GetRowHeights that provide, respectively array of column widths and row heights for the table. These methods are as follows:
  [Browsable(false)]
  [EditorBrowsable(EditorBrowsableState.Never)]
  public int[] GetRowHeights();

  [Browsable(false)]
  [EditorBrowsable(EditorBrowsableState.Never)]
  public int[] GetColumnWidths();

As was demonstrated in "C#: EditorBrowsabilityAttribute and hiding Methods, Properties and Field from IntelliSense," the EditorBrowsability attribute hides a method from IntelliSense when the attribute is constructed by passing in a value of EditorBrowsableState.Never. This means that GetColumnWidths and GetRowHeights are hidden from developers which is odd because the documentation does not say these methods are deprecated (what reason does Microsoft have for hiding these rather useful methods?). Here we see that the GetRowHeights method is excluded from the methods displayed by IntelliSense:


 Even though the GetRowHeights method is not visible it completely valid to use the method in the code. The previous code snippet is part of the code behind of the following Windows Form:



The previous screenshot shows a Windows Forms TableLayoutPanel control containing five rows and three columns: The left column contains a Label in each row of the column and the middle column contains a TextBox in each row of the column. The right column contains no controls.

By clicking on the NumericUpDown control labeled, "Number of Rows to Show", the number of visible rows associated with the TableLayoutPanel control can be changed by summing the heights of the visible rows and assigning the height of the TableLayoutPanel to be equal to the height of all visible rows. The code for handling this as follows, specifically the SetVisibleRows method:


The previous code only works because each row within the TableLayoutPanel  control is set to an absolute pixel width. If the rows were set to be 20% each then changing the table height would not hide rows within the table.

An example of the application displaying two visible rows is as follows:


GetRowHeights was shown to be extremely handy but for some unknown reason Microsoft has choosen to hide this method GetRowHeights and its counterpart, GetColumnsWidths, from developers.