Sunday, October 12, 2014

C#'s yield Keyword

Introduction

The yield keyword in C# represents the only C# interview question I failed to answer -- "What does the yield keyword do in C#?" All I remembered is it had something to do with enumerators.

This post demonstrates the use of the C# keyword, yield. This also covers how to implement an IEnumerable and IEnumerator interface in support of using C#'s foreach keyword. Demonstrating IEnumerable will show the true power of the yield keyword. It takes significantly less code to implement an enumerator with yield than it does with IEnumerable/IEnumerator.

Example

The following code is just a loop that displays all the factorial values for the number 5 (where 5 is the value of the parameter passed to the Factorial method):

foreach (var factor in FactorialManager.Factorial(5))
{
  Console.WriteLine(factor);
}

The output is as follows:


Under the covers the Factorial method implements the IEnumerable or IEnumerable<int> interface and the compiler as part of the foreach keyword invokes IEnumerable.GetEnumerator which returns an IEnumerator or IEnumerator<int>  instance. The IEnumerable and IEnumerator interfaces can be implemented manually or can be implemented by taking advantage of the yield keyword.

Implementing IEnumerable/IEnumerator

The foreach keyword expects the FactorialManager.Factorial method to return a type that implements IEnumerable<int>. The implementation of the FactorialManager class is as follows:

class FactorialManager : IEnumerable<int>
{
  private int _factorialToCompute;

  public FactorialManager(int factorialToCompute)
  {
    _factorialToCompute = factorialToCompute;
  }

  public static IEnumerable<int> Factorial(int factorialToCompute)
  {
    return new FactorialManager(factorialToCompute);
  }

  public IEnumerator<int> GetEnumerator()
  {
    return new FactorialSupport(_factorialToCompute);
  }

  System.Collections.IEnumerator 
    System.Collections.IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}

The previous code is not complex but it is verbose. The  Factorial method (a class factory) returns an instance of FactorialManager which implements IEnumerable<int>:

  public static IEnumerable<int> Factorial(int factorialToCompute)
  {
    return new FactorialManager(factorialToCompute);
  }


The FactorialManager class instance implemented IEnumerable's GetEnumerator method:

  public IEnumerator<int> GetEnumerator()
  {
    return new FactorialSupport(_factorialToCompute);
  }

GetEnumerator method returns an instance of FactorialSupport. The FactorialSupport class implements the IEnumerator<T> interface therefore it is an appropriate return value for the GetEnumerator method. The IEnumerator<T> interface is defined follows:

public interface IEnumerator<out T> : IDisposable, IEnumerator


The IDisposable requires IEnumerator  to implement:

void Dispose()

The Dispose is used to free unmanaged resources. A factorial implementation has no unmanaged resources so Dispose should just return and do nothing.

The IEnumerator requires IEnumerator<T>  to implement the following property:

T Current { get; }

The Current property will be values such as 1, 2, 6, 24 and 120 depending on the current value returned by the factorial computation. 

The IEnumerator requires IEnumerator<T>  to implement the following methods:

void Reset()
bool MoveNext()

The Reset method initializes the computation. The MoveNext moves to the next iteration. So if the value of Current was 6 (1 * 2 * 3) then MoveNext will set Current to 24 (1 * 2 * 3 * 4).

The FactorialSupport class is implemented as follows:

public class FactorialSupport : IEnumerator<int>
{
  private int _index;

  private int _factorialToCompute;

  private int _factorial;

  public FactorialSupport(int factorialToCompute)
  {
    if (factorialToCompute < 0)
    {
      throw new ArgumentException("Cannot compute factorial for " + 
                                  factorialToCompute.ToString());
    }

    _factorialToCompute = factorialToCompute;
    Reset();
  }

  public int Current
  {
    get { return _factorial; }
  }

  public void Dispose()
  {
    return; // nothing to do here. 
  }

  object System.Collections.IEnumerator.Current
  {
    get { return this.Current; }
  }

  public bool MoveNext()
  {
    if (_index > _factorialToCompute)
    {
      return false;
    }

    _factorial *= _index;
    _index++;

    return true;
  }

  public void Reset()
  {
    _index = 1;
    _factorial = 1;
  }
}

The previous code just implements an IEnumerator that returns factorial values using Current, MoveNext and Reset.

Implementing support for foreach using FactorialManager and FactorialSupport requires approximately seventy lines of code. This is a lot of code for something as trivial as a factorial computation. 

Implementing an Enumerator with yield

A method, property or operator that returns IEnumerable<int> or IEnumerable may contain the yield keyword. The yield keyword is placed before the break or return keywords and is used as follows.

yield return <expression>;
yield break;

An example where the yield keyword is marked in boldface is as follows:

public static IEnumerable<int> Factorial(int n)
{
  if (n == 0)
  {
    n = 1;
  }

  int value = 1;

  for (int i = 1; i <= n; i++)
  {
    value *= i;
    yield return value;
  }
}

static void Main(string[] args)
{
  foreach (var factor in Factorial(5))
  {
    Console.WriteLine(factor);
  }
}

The Factorial method above (approximately ten lines of code) is all this is required to implement the IEnumerable and IEnumerator interfaces required by foreach. The beauty of the yield keyword is that it instructs the compiler to implement:
  • IEnumerable.GetEnumerator
  • IEnumerator.Dispose
  • IEnumerator.Current
  • IEnumerator.MoveNext
  • IEnumerator.Reset

Conclusion

The yield keyword is not a .NET feature. It is a feature built into the C# compiler designed to making support foreach simpler for developers.

Saturday, October 11, 2014

SQL Server: Correlated Subqueries

The Question

I had an interview via web-meeting recently. I was given access to SQL Manager and I had to solve a set of TSQL programming problems. The first programming task they throw at me is a question, "What is a correlated subquery?" I said, "I do not know" and moved on. It turns out I knew what a correlates subquery was but I just did not know such queries were called that.

This post presents an overview of correlates subqueries in TSQL.

Subqueries

A subquery is just a portion of a query that is evaluated before main part of a query. To understand this, consider the following (from the same job interview):

SELECT *
FROM [TableA] AS A
  INNER JOIN [TableB] AS B
    ON A.[AID] = B.[AID]
  INNER JOIN [TableC] AS C
    ON A.[AID] = C.[AID]
  INNER JOIN [TableD] AS D
    ON A.[AID] = D.[AID]
  INNER JOIN [TableE] AS E
    ON A.[AID] = E.[AID]
WHERE [AName] = 'Smith'

The above query has no sub-query. The interviewer explained, "This query is suffering from poor performance. What would you do to optimize it?" The answer is obvious -- use a subquery to get only the required values from TableA before joining with TableB, TableC and TableD. The subquery that should get evaluated first is:

SELECT *
FROM [TableA] AS A
WHERE [AName] = 'Smith'

The optimized query, including the subquery in its entirety, is below where the subquery is shown in boldface:

SELECT *
FROM (
  SELECT *
  FROM [TableA] AS A
  WHERE [AName] = 'Smith') AS A
  INNER JOIN [TableB] AS B
    ON A.[AID] = B.[AID]
  INNER JOIN [TableC] AS C
    ON A.[AID] = C.[AID]
  INNER JOIN [TableE] AS E
    ON A.[AID] = E.[AID]
  INNER JOIN [TableE] AS E
    ON A.[AID] = E.[AID]

The subquery is executed by SQL Server's optimizer first. This will reduce the number of data values and will potentially improve performance when the outer query is executed. The previous query is a simple subquery or a non-correlated subquery.

Correlated Subqueries

A correlated subquery is a subquery that depends on values from the outer query. Technet covers correlated subqueries here Correlated Subqueries. To demonstrate correlated queries, consider a database with tables:
  • Player: this table holds a list of soccer players (football players for you non-Americans)
  • Goal: this table holds a list of goals scored by players
To demonstrate correlated subqueries, a query will be developed that shows all players who have not scored any goals:

SELECT P.[PlayerID], P.[PlayerName]
FROM [Player] AS P
WHERE NOT EXISTS (
  SELECT * 
  FROM [Goal] AS G 
  WHERE P.[PlayerID] = G.[PlayerID])

The outer query contains table Player aliased as P. The subquery contains table Goal aliased as G. The subquery references the table Player from the outer query so the subquery is a correlated subquery. The subquery in this case did not aid performance by reducing the results returned to the outer query. Performance may suffer when using a correlated subquery because the query has to be evaluated for ever row of the outer query.

So yes, I knew how to write a correlated subquery. I just did not know the definition. 


Implementing the query without using a Correlated Subquery

A LEFT JOIN in a subquery  can be used as follows to implement the same query -- all players who have not scored a single goal:

SELECT [PlayerID], [PlayerName]
FROM (
  SELECT P.[PlayerID], P.[PlayerName], G.[GoalID]
  FROM [Player] AS P
    LEFT JOIN [Goal] AS G 
      ON P.[PlayerID] = G.[PlayerID]) AS P
WHERE P.[GoalID] IS NULL

Conclusion

Yes, I moved onto the second interview.

Friday, October 3, 2014

Visual Studio Online (cloud hosted TFS): Accessing Visual Studio Online Bugs/Tasks from Excel

This post is a "how to" covering the steps required to access Visual Studio Online tasks using Excel.

Introduction

Visual Studio Online is a fairly young cloud-based application (released in 2013). With regards to tracking tasks and bugs it does support a variety of configurable reports. Excel by contrast is a more mature reporting generating and data formatting environment (release in 1987). So to take advantage of the Excel, an extremely mature product, Visual Studio Online supports Excel integration. Excel can be used to view tasks/bugs and can even handle the modification of tasks and the creation of new tasks. Excel can also handle bulk import so if you move from Jira (for example) export the bugs to CSV format and then import into Visual Studio Online using Excel.

Excel 2007 and later versions can interface with Team Foundation Service (TFS) and since Visual Studio Online is a cloud-hosted variant of TFS, your minimum Excel version is 2007.

When Excel connects to Visual Studio Online or TFS it uses queries in order to access work items. The general workflow for the interaction between Visual Studio Online and Excel is as follows:

1) Create a Query within the Visual Studio Online instance (web site) or using Team Explorer or using Visual Studio
2) Connect to Visual Studio Online (or TFS) via Excel
3) Retrieve the query results using Excel
4) Add/Modify/Delete work items using Excel
5) Publish changes from Excel to Visual Studio Online (or TFS)

The following shows the steps required in integrating Excel with Visual Studio Online:


The previous diagram is taken MSDN's walk through of how to integrate Excel and Visual Studio Online (Work in Microsoft Excel and Microsoft Project connected to Team Foundation Server). Microsoft's write up is good but it is not as detailed as this posting. Even a person who does not use Visual Studio Online daily (a manager, a product manager, ad hoc QA, et al.) will find every step required to access Visual Studio Online from Excel by following what is documented here.

Signing up for a Visual Studio Online instance was covered by a previous post: Signing up for Visual Studio Online (cloud-based Team Foundation Server).

Prerequisite: Create an Query in Visual Studio Online

As stated previously Excel accesses Visual Studio Online work items through Visual Studio Online queries. Queries are associated with a Visual Studio Online project instance. To view all the queries associated with a Visual Studio Online project perform the following steps:

1) In Internet Explorer navigate to the instance of Visual Studio Online using the instance's URL (the instance in this case is https://AdmiralGraceHopper.visualstudio.com/):


2) If the credentials for the appropriate Microsoft account associated with the Visual Studio Online instance have not been entered, a user will be prompted to enter them:


3) Once the Microsoft Account credentials have been entered for the Visual Studio Online instance, the browser will be redirected to the main screen for the instance:


4) Under "Recent projects & teams" a list of projects is displayed. For this example the project MarkI is selected. If the desired project is not visible under the most recent list, click on the Browse button to select the appropriate Visual Studio Online project:


6) The MarkI project is shown above. To view the queries associated with the instance click on the WORK menu (not the Work "How to" help entry):


7) When the WORK menu item is selected the Backlogs and Queries tabs are visible. By default Queries are visible as follows:


The previous screenshot shows the default queries of the Mark1 project. These are the queries will ultimately be visible to an Excel workbook when accessing the Visual Studio Online instance https://AdmiralGraceHopper.visualstudio.com and the Mark1 project. 

Additional queries can be created for use in Visual Studio Online or to be accessed via Excel by clicking on the New dropdown and selecting New query:


For this example (using Excel to access Visual Studio Online) the default queries for the MarkI project will be used.

Connecting Excel to Visual Studio Online

The following steps are required to support Visual Studio Online's integration with Excel:

1) Install Team Explorer (see the previous post Visual Studio Online (cloud-hosted TFS) for non-Visual Studio Developers). If Visual Studio is installed this step can be skipped as Visual Studio includes the TFS Excel add-in.

2) Launch Excel 2007 or any later version and create a new "Blank worksheet" which appears as follows in Excel 2013:


3) Select the TEAM ribbon which is as follows:


4) If the Team ribbon is not visible, see the section at the end of this posted "Re-enabling the TEAM Ribbon"

5) To connect to Visual Studio Online or TFS click on the TEAM | New List button which causes the Connect to Team Foundation Server dialog to be displayed:


6) From the Connect to Team Foundation Server dialog click on the Servers button thus displaying the Add/Remove Team Foundation Server dialog:


7) From the Add/Remove Team Foundation Server dialog click on the Add button thus displaying the Add Team Foundation Server dialog:


8) Under the textbox labeled, "Name or URL of Team Foundation Server:" enter the URL associated with the desired Visual Studio Online instance (for this example the Visual Studio Online URL is https://AdmiralGraceHopper.visualstudio.com/):



9) Once the Visual Studio Online instance URL is specified, click on OK thus displaying a "Sign in" dialog to sign into said Visual Studio Online instance where the appropriate Microsoft account ID and password can be entered:


10) Enter the Microsoft account name and password:



11) Click on the Sign In button:



12) When a valid Sign In credentials are entered then the Add/Remove Team Foundation Server is displayed again with the freshly entered Visual Studio Online URL highlighted. Simply click on Close:



13) From the Connect to Team Foundation Server dialog, select the appropriate Team Project (e.g. MarkI) and click on Connect.

At this stage Excel is connected to a Visual Studio Online project but has yet to retrieve work items using a Visual Studio Online query.

Select a Query to Import into Excel

Once a connection is established using the Connect to Team Foundation Server dialog (initially displayed by clicking on the New List button on the TEAM ribbon), the New List dialog is displayed:



Clicking on the ... button next to Query List drop down reveals the Select Query dialog:



Expanding the folders in the Select Query dialog reveals all the queries from the MarkI project:


Select the Product Backlog query and click the Select Query dialog's OK button. With the Select Query dialog closed the New List dialog is active again:


Clicking on the OK button of the New List dialog will cause Sheet 1 of the workbook to retrieve the work items associated with the Mark1/Shared Queries/Product Backlog query:


Re-enabling the TEAM Ribbon

1) In Excel click on the File menu (it is not a ribbon as no ribbon is displayed when File is clicked on):


2) Click on Options thus displaying the Excel Options dialog:



3) From the categories along the left side of the dialog, select Add-Ins:


4) Notice under "Active Application Add-Ins" that there is no add-in named "Team Foundation Add-in". To enable the Team Foundation Add-in select "Com Add-Ins" from the Manage drop down and click on Go:



5) This displays the COM Add-Ins dialog:



6) Click on the checkbox next to the Team Foundation Add-in entry and click on OK:


7) Clicking on OK returns to the workbook view of Excel where the TEAM menu entry should be visible (see above). This means the TEAM capabilities are enabled since Team Foundation Add-In is enabled.

Subsequently selecting File | Options | Add-In displays the Excel Options | Add-Ins where the Team Foundation Add-In is listed under Active Application Add-ins: