Better the devil you know - Part III.

Roger Torres - December 18th, 2008
7 Comments

In Part II of this series I described the architecture and main components of the data access layer (DAL). While most of the design is based on ADO.NET, there are a couple of important differences:

  1. All the SQL code is defined inside a Visual Studio project as “code-behind” associated to the command classes. This is important to keep the SQL code close to the .NET data adapters (not only in the VS project but in the same source control system).
  2. Developers don’t need to deal with stored procedures directly since the framework will take care of all the deployment issues automatically. This is the desired behavior during the development phase when most security restrictions are normally removed. Developers are not forced to deal with the proliferation of stored procedures, naming guidelines, source code synchronization, and other headaches.

By default, all commands are executed in “Text” mode, which means the internal ADO.NET commands are created with the CommandType property set to CommandType.Text. Once the system is deployed to a production environment, and stored procedures are established as the only way to access the data, the execution mode of the connection object should be changed as follows:

cn.ExecutionMode = System.Data.CommandType.StoredProcedure;

At this juncture, all commands should be deployed as stored procedures to the underlying database. The connection class implements a deployment mechanism to assemble the stored procedures, taking care of the standard naming guidelines defined by the connection provider. Developers can choose to deploy (a) one command at a time, or (b) all commands defined inside an assembly dedicated to the DAL. Let’s see how to deploy the AdventureWorks data access layer using the second option:

cn.Deploy(typeof(AdventureWorks.DAL.AdventureWorksDbProvider).Assembly);


In general, the entire DAL should be deployed to a production or testing environment once the application is ready. Note that the current principal must have enough rights to create stored procedures.

EXECUTION METHODS

When developing a DAL using plain ADO.NET, developers must deal with repetitive patterns that are tedious and error prone. A regular pattern when executing a command would perform the following steps:

STEPS

  1. Declare input and/or output parameters.
  2. Set input parameter values.
  3. Execute command (using one of several execution methods implemented by ADO.NET - NonQuery, Scalar, DataReader, etc)
  4. Process results if the command returns a resultset.
  5. Get output parameter values.

Developers must also take care of other issues like handling exceptions, null values, transactions, and multiple commands executed in a single batch.

The main goal of the DAL execution methods is to encapsulate all this boilerplate code, but provide a flexible interface to cover the most common data operations. In general, all commands are executed by a connection instance following the same STEPS enumerated above. We can group the execution methods in four functional categories:

  1. Basic ADO.NET execution methods: These are the most basic command execution methods implemented by the framework. The only difference with ADO.NET is the generic interface to return strong typed results.
  2. Type readers: Methods used to read and materialize data into .NET objects or collections.
  3. Command binders: Methods used to bind .NET objects to commands before and after execution.
  4. Batch execution methods: Methods used to invoke multiple commands in a single database call, taking care of input/output parameter passing automatically.

The following table summarizes all the command execution methods, and how the framework invokes ADO.NET internally:

Category Execution Method ADO.NET Steps
Basic Execute ExecuteNonQuery 1, 2, 3, 5
Basic ExecuteScalar<T> ExecuteScalar 1, 2, 3, 5
The framework uses generics to enforce a strong typed result.
Type Reader ExecuteScalar<T>(ReaderCallback) ExecuteReader 1, 2, 3, 4
Uses callback delegate to materialize the first returned record into an instance of T.
Type Reader ExecuteReader<T>(ReaderCallback) ExecuteReader 1, 2, 3, 4
Uses callback delegate to materialize all returned records into a list of instances of T, or a second option to materialize all returned records inside a target container object.
Type Reader ExecuteReader<K, T> ExecuteReader 1, 2, 3, 4
Uses callback delegate to materialize all returned records into a dictionary of instances of T.
Command Binder ExecuteBinder<T> ExecuteNonQuery 1, 2, 3, 5
Uses binder to dematerialize a single instance of T into the command’s properties before step 1, and update the instance back with the values of output parameters after step 5.
Batch Execute(IEnumerable<Command>) ExecuteReader 1, 2, 3, 5 for each command in script
The framework internally creates a SQL batch script that executes each command in the list as a stored procedure, taking care of input and output parameters automatically.
Batch and Command Binder ExecuteBinder<T>(IEnumerable<T>) ExecuteReader Same as above but using binders to bind the list of objects to a single command.

In Part II we described how to execute a basic command to insert employee records. Most of the data operations can be implemented by the basic methods, but we can exploit some common patterns and find a few shortcuts that can improve our coding experience while enforcing consistency and a more robust product.


In the next post I will explore the generic reader pattern.

Better the devil you know - Part II.

Roger Torres - December 12th, 2008
7 Comments

In Part I of this series I presented the case for building simple in-house solutions when the development team didn’t have the time or desire to try more complex alternatives, listing a desired set of common requirements for a .NET data access layer.

In this post I’m going to describe the data access layer that was developed to meet those requirements, but:

  1. Did NOT impose a big learning curve.
  2. Simplified the life of developers and architects by “enforcing” a system of coding best practices.

THE DESIGN

As expected, the solution is based on the well known ADO.NET ConnectionCommandParameter abstractions, although most of the regular ADO.NET boilerplate code is transparent to the developers. Figure 1 depicts a layered view of the framework and its place within the architecture.

Figure 1

Figure 1


The entire framework is packaged in a single .NET assembly that can be downloaded here – [download#2].

I’m going to start with a very simple scenario to show the basic components of the framework and their role in the system. In future posts I will introduce more complex scenarios, covering all the supported execution modes and other advanced features.

THE CONNECTION CLASS

Just like in ADO.NET, the connection class is the entry point to the framework. It represents a connection to a single database instance, encapsulating all the execution methods and modes as specified by the requirements. Connections delegate most of the work to a specific database provider implementing a database agnostic interface.

The framework includes a fully functional SQL Server provider, although module designers can customize the routines in charge of generating stored procedures and applying naming conventions to meet their coding standards.

Listing 1 shows how to get a connection string from a configuration file, and instantiate a connection object with the default SQL Server provider:

1
2
3
4
string cStr = ConfigurationManager.ConnectionStrings["cs"].ConnectionString;
CoNatural.Data.IDbProvider dbProvider = new CoNatural.Data.SqlClient.SqlDbProvider();
CoNatural.Data.Connection cn = new Connection(cStr, dbProvider);
cn.TimeOut = 10000;
Listing 1

THE COMMAND CLASS

Commands are “very basic” .NET wrappers encapsulating SQL scripts. Their main role is to provide a strong typed interface to the script’s input/output parameters through its properties.

The association between commands and their corresponding script is similar to the code-behind pattern found in Windows/Web Forms; each command class implements the code-behind of a SQL script. The framework automatically performs the association by matching the “full” type name of the command class with the “full” file name of the script.

Figure 2 shows a Visual Studio project where multiple commands are defined. It’s required that both files share the same .NET assembly and namespace. Developers can add new commands to any type of project, although it’s a good practice to dedicate specific assemblies to the data access layer.

Figure 2

Figure 2

I’m currently working on a Visual Studio Wizard to streamline the creation of new commands following a predefined template as shown in Figure 3. Solution designers can modify the default template or create others according to their needs. Note that the wizard takes care of configuration details like setting the SQL script “Build Action” to “Embedded Resource” and creating a file dependency in the project to nest the code-behind file within the SQL script file.

Figure 3

Figure 3

The following code snippets show the files generated by the default template when adding a new command named “Command1.

Command1.sql

1
2
3
4
5
6
7
8
9
-------------------------------------------------------
-- Command1
--
-- Created by Roger
-- Created on 11/13/2008 7:51:43 PM
-------------------------------------------------------
 
-- TODO Write your command's sql statement here
-- e.g.  SELECT a, b, c FROM tableName WHERE id = @id

Command1.cs

1
2
3
4
5
6
7
8
9
10
11
12
13
//-------------------------------------------------------
//-- Command1
//--
//-- Created by Roger
//-- Created on 11/13/2008 7:51:43 PM
//-------------------------------------------------------
using System;
 
namespace AdventureWorks.DAL.HumanResources {
  public class Command1 : CoNatural.Data.Command {
    // TODO Write your parameters and other helper methods here
  }
}


Using the Visual Studio Wizard is not required. Developers can create new commands by adding both files manually, making sure they are under the same namespace, the file extensions are correct, and the build action of the file holding the SQL script is set to “Embedded Resource”.

THE PARAMETER CLASS

Parameters are the vehicles responsible for holding and applying the value mapping rules between command properties and SQL input/output parameters. Developers must code commands carefully, making sure that each property name and type coincides with the corresponding parameter in the SQL script.

All properties are mapped as input parameters by default, but the user can override this behavior by annotating properties with the special ParameterAttribute to explicitly set the direction, size, precision, and scale of the underlying ADO.NET parameters. The framework will use this information to generate the stored procedures at deployment time.

The runtime is guided by the following rules when preparing the collection of parameter mappings associated with a command:

  1. Creates a Parameter object for each property defined by the specialized command class, overriding defaults when the property is annotated with ParameterAttribute.
  2. If the property is Nullable, the parameter will allow and handle null values.
  3. The type of the underlying ADO.NET parameter is determined as follows:
  4. Int16 → DbType.Int16
    Int32 → DbType.Int32
    Int64 → DbType.Int64
    Boolean → DbType.Boolean
    Byte → DbType.Byte
    DateTime → DbType.DateTime
    Decimal → DbType.Decimal  (Uses Precision and Scale attributes)
    Double → DbType.Double
    Guid → DbType.Guid
    Sbyte → DbType.SByte
    Single → DbType.Single
    String → DbType.String  (Requires Size attribute between 1 - 8000)
    Uint16 → DbType.UInt16
    Uint32 → DbType.UInt32
    Uint64 → DbType.UInt64

Before commands are invoked, the runtime automatically assigns property values to the corresponding input parameters. Likewise, the command’s properties are updated with the corresponding output parameter values after the results are back.

To illustrate this concept, let’s explore a simple scenario to insert a new “Employee” record into a fictitious “Employees” table:

  1. Create the Employee table
  2. 1
    2
    3
    4
    5
    6
    7
    8
    
    CREATE TABLE [dbo].[Employee](
       [EmployeeID] [INT] IDENTITY(1,1) NOT NULL,
       [FirstName] [NVARCHAR](50) NOT NULL,
       [LastName] [NVARCHAR](50) NOT NULL,
       [HireDate] [DATETIME] NOT NULL,
       CONSTRAINT [PK_Employee_EmployeeID] PRIMARY KEY CLUSTERED
       ([EmployeeID] ASC)
    ) ON [PRIMARY]
  3. Create a new file named InsertEmployee.sql with the following SQL script, and set the build action to “Embedded Resource”:
  4. 1
    2
    3
    
    INSERT INTO Employee(FirstName, LastName, HireDate)
    VALUES(@FirstName, @LastName, @HireDate)
    SET @Id = SCOPE_IDENTITY()
  5. Create a .NET class named InsertEmployee.cs under the same folder with the following C# code:
  6. 1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    
    public class InsertEmployee : CoNatural.Data.Command {
       [CoNatural.Data.Parameter(System.Data. ParameterDirection.Output)]
       public int Id{ get; set; }
     
       [CoNatural.Data.Parameter(50)]
       public string FirstName { get; set; }
     
       [CoNatural.Data.Parameter(50)]
       public string LastName { get; set; }
     
       public DateTime HireDate { get; set; }
    }
  7. Invoke the command somewhere in the application. Let’s insert a new employee named “John Doe” hired on January 1, 2008.
  8. 1
    2
    3
    4
    5
    6
    7
    8
    
    InsertEmployee cmd = new InsertEmployee();
    cmd.FirstName = "John";
    cmd.LastName = "Doe";
    cmd.HireDate = new DateTime(2008, 1, 1);
    cn.Execute(cmd);
     
    // get new employee id
    int newId = cmd.Id;

Note how all the properties are automatically mapped “by name” to the SQL parameters defined in the script. The “Id” property is annotated to identify the output parameter, and both “Name” properties are annotated to enforce the size of the nvarchar columns in the underlying employee table.

This example shows how easy is to create and execute commands with this framework, but what’s most important, how the data layer is constrained behind a strong typed interface, systematically keeping all the parts together inside a portable and source-controlled location while moving most of the deployment issues to the end of the development process.

In the next couple of posts I will describe the different “command execution modes” supported by this framework, and how they cover the most common CRUD requirements with the help of Type Readers and Command Binders. Finally, I’m going to introduce the command batch execution methods with linked parameters, and how to automatically compile and deploy commands as stored procedures.


Stay tuned for part III.

[download#2]

Better the devil you know - Part I.

Roger Torres - December 1st, 2008
5 Comments

INTRODUCTION

A common situation found in software development projects is “technical” managers trying to introduce, and sometimes enforce, “the latest” technologies without considering if developers have the time or willingness to learn new stuff… and yet expect they will quickly adapt to finish the job within schedule and budget.

I have seen entire projects go down the drain because the proposed architecture imposed a steep learning curve; furthermore it was almost impossible to find new developers with enough experience to fill the gaps. Today, when presented with such scenarios, I basically know what to expect: moving deadlines, frustration, and poorly written code that’s hard to follow and maintain.

Don’t get me wrong, I believe it’s very important to learn and apply new and “proven” technologies in order to succeed and sometimes survive in our careers; we just need to be very careful with the timing, particularly when others are involved and the completion of a project is on the line.

Software architects who understand their team’s abilities and motivations will eventually recognize that in this business sometimes it’s “better the devil you know than the devil you don’t”, and while pushing for the adoption of best practices and proven design principles, we should also strive for simple and pragmatic solutions.

THE CASE FOR IN-HOUSE SOLUTIONS

Mastering a sophisticated technology like LINQ for Entities or NHibernate requires the time and dedication that when extrapolated to an entire development team represents an investment that many companies are not ready to accept, particularly when there are simpler tools like ADO.NET that most developers understand and “can do the job”.

Some would argue that “the more sophisticated the technology you use, the more productive you are”… and as convincing as it sounds in theory, this shouldn’t be the only reason to embark ourselves into the tricky waters of third party solutions.

Let’s consider the following questions for a moment:

  1. Are we really going to use all the features the solution is offering?
  2. Does the solution support all the features we need?
  3. How much time is it going to take for the “entire” development team to learn and adopt this technology without compromising the project?
  4. How easy is to find developers with enough experience using this solution?

There will be other questions to ponder for sure, but when we don’t know or like the answers, it’s time to explore other options.

Many designers will love the idea of developing in-house, and while this is fine when we know exactly what’s needed, the most important precondition is having a team with enough “commitment and proficiency” to make it happen. Ultimately, the right decision will be influenced by each specific situation, and the one who makes it will definitely find a fair share of supporters and detractors.

GATHERING REQUIREMENTS

I’m going to present a .NET data access layer that was built with a few conflicting goals in mind:

  1. Serve the most common data scenarios found in modern .NET applications following software best practices.
  2. Simple enough so that it could be adopted with just a few hours of training.
  3. Powerful enough so that users would feel compelled to adopt it.

The motivation to assemble something like this was not the shortage of solutions, because there are many excellent frameworks out there that can perform miracles. What I’ve found in my consulting career is that the majority of .NET developers are still using plain ADO.NET, and they are not really interested in learning other ways to handle their data. It’s not hard to understand their reasons, especially when you see the managers breathing behind their necks all the time, but I’m not going to get there. The problem is that when ADO.NET is used at will, it can lead to messy and sometimes unmaintainable code that will eventually call for a complete rewrite. I though that the best way to get them back on track was to develop a “simple layer” on top of ADO.NET that would basically “enforce” a systematic methodology without adding “unwanted noise” to the process.

I started by presenting the following list of functional and non-functional requirements to satisfy:

  1. Use simple abstractions and technologies that are familiar to most .NET developers. Because ADO.NET was already in use, the design should follow ADO.NET as much as possible.
  2. Follow best practices. Encapsulate using the open close principle, design in layers, and build a testable system.
  3. Use a provider pattern to support multiple database management systems. Leverage the providers found in ADO.NET.
  4. All the source code (including SQL scripts) should be under the same source control system.
  5. SQL scripts should be organized in namespaces.
  6. Enforce strong typing and systematic handling of NULL values.
  7. Avoid ADO.NET DataSets. Encourage the use of “plain” object models.
  8. Limit the proliferation of stored procedures during the development phase.
  9. Provide a mechanism to automatically deploy and invoke stored procedures in a production environment where direct access to tables and views is disabled. This is a commonly used security pattern. Any syntax or dependency errors in the SQL scripts should be found at deployment time.
  10. The system (not developers) should enforce a consistent naming convention for stored procedures.
  11. Module designers should have the ability to wrap the SQL scripts with custom code to streamline security and business rule validations.
  12. Keep the business objects unaware of the framework. All CRUD operations should be invoked by a higher layer in the system (Controller Layer).
  13. Rationalize the coding required to materialize data records into business objects. Likewise, develop a simple mechanism to bind object property values to SQL scripts to support CRUD operations.
  14. Provide the ability to execute multiple commands in batch (with a single call to the database).
  15. Favor the use of transaction scopes. Share ADO.NET connections when possible to avoid the promotion of lightweight transactions to distributed transactions.
  16. Develop with Visual Studio 2008, SQL Server 2005, and the .NET framework 3.5.
  17. Deliver a prototype based on the AdventureWorks database shipped with SQL Server 2005 within 2 months.


Stay tuned for part II of this article where I’m going to start describing the solution and how it’s applied to real world scenarios.

The Consulting Alternative

Roger Torres - September 1st, 2008
Comments

We live in a technologically challenging world, where individual and corporate fortunes are directly affected by the technical abilities of the people around them. Companies are having a hard time coping with the short supply of highly trained software engineers, and in general they settle by hiring “whatever” is available to cover their needs … hoping for the best.

They must face a tough reality:

  1. “Good” software engineers require many years of preparation, tons of dedication and hard work.
  2. There is no near term solution to the supply and demand issue.

Let’s explore a few options for a moment:

  1. Increasing the number of H-1B visas: This doesn’t seem to work for many reasons. A few big corporations with access to the best foreign universities will benefit the most, but the rest will be exposed to the bigger problem of importing less skilled professionals and company hoppers. The public opinion is also against importing labor and exporting jobs. You can find more information about the H-1B program here.
  2. Offering more incentives: This is a great solution, but again, not all companies can afford it. In fact, the best software engineers in this country are working for companies offering stock options, bonuses, long paid vacations, time for personal projects, a full stack of insurance plans, and most importantly a very fat salary. If you know a good engineer that’s not in this group, he is probably an entrepreneur, he is already rich, or he’s out of luck.
  3. Hiring professional software consultants: This is probably the only viable short term alternative for the majority of companies. A good consultant will not only find the shortest path to solve problems, but he will provide a consistent approach to produce and maintain systems, serving as a mentor to the staff and transferring many tricks he must have learned over the years of exposure to the fire. The good news is that consulting services are temporary, and don’t require the extra compensation packages mentioned before.

Technology giants like IBM and HP have realized that “consulting services” make a lot of sense for everyone, that’s why they have been playing the game for many years. We offer an alternative to small and medium size companies who cannot afford these 800 pound gorillas, focusing our practice on personalized and affordable services, experience, and dedication. If you have software needs in our area, please let us know how we can help.

Roger Torres
President and Consultant