This morning on Coding Horror, Jeff Atwood talked about a problem he had with a JavaScript sort that ran too slowly, causing pain for the end user. He traced the problem not to the sort itself, but to the testing carried out by the developers, QA, and even the customer the application was built for, all before anyone signed off on the project. They did what all developers do and tested the application with too small a set of data. Before I even finished the second paragraph I was reminded of Data Generation Plans.
The days of testing applications with “test123” and “Foo” are over! Well, not really, but they could be if you wanted them to be. Database projects in Visual Studio 2008 (and VS2005 for Database Professionals) have what are called Data Generation Plans. Data Generation Plans give you the ability to generate meaningful data for testing. You can generate random data or generate data from existing data sources, and you can control many aspects of the data generation.
Let’s take a basic example of a Patients table for a dentist’s office. You might have a PatientID, Name, Age, Address, City, etc. The Data Generation Plan will read your table schema and identify the types of data it should generate for your columns. So you’ll notice that Name receives a generated string of Unicode characters (because the data type was nvarchar) and Age an integer. Here is the Patients table column properties window.

Data Generation Plans will use random values by default, which isn’t exactly what I need to test things properly. For instance, the State column will get 1 to 2 random, non-unique characters. What I really need is 50 unique state abbreviations, each exactly 2 characters long (a min of 2 and a max of 2).

This was accomplished by setting the State column properties to require that the generated data be a minimum of 2 characters and a maximum of 2 characters. (The max was already established by the size allowed on that particular column.) I also set it to require unique values, meaning that it will generate AA, AB, AC, etc. for each row, never using the same value twice, as you see in the image above.
For the Age column, the Data Generation Plan used an Integer generator to produce random numbers. Great, except that I don’t need to test 98638565672 as an acceptable age. I can fix this by setting a max of 85 and a min of 9. This column does not require unique values, so I should now get a good mix of ages. See image below.
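To make the State settings concrete, here is a rough Python sketch of what "min 2, max 2, unique" amounts to. This is not how Visual Studio implements its generators; the function name unique_codes and its parameters are made up for illustration, and the only thing taken from the post is the AA, AB, AC ordering and the count of 50.

```python
import itertools
import string


def unique_codes(count, length=2, alphabet=string.ascii_uppercase):
    """Return `count` distinct codes of exactly `length` characters.

    Hypothetical helper: mimics a string column with min length == max
    length and the "unique" option turned on. Codes come out in
    dictionary order (AA, AB, AC, ...), so no value is ever repeated.
    """
    max_codes = len(alphabet) ** length
    if count > max_codes:
        raise ValueError(
            f"only {max_codes} unique codes of length {length} exist"
        )
    combos = itertools.product(alphabet, repeat=length)
    return ["".join(chars) for chars in itertools.islice(combos, count)]


# 50 rows, one fake "state" per row
states = unique_codes(50)
print(states[:3], states[-1])
```

With two uppercase letters there are 676 possible codes, so 50 unique rows is no problem. Ask for more rows than that and the sketch raises an error instead of quietly repeating values.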

You may also notice that you can set a percentage of the data to be null. If you have a field that allows null values, this can be very handy for testing how your application behaves when the data is sometimes null, but not always.
While this wasn’t an in-depth discussion of Data Generation Plans, I hope you get the idea. If you would like to learn more about data generation, try some of the links below. Happy Testing!
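If it helps to see the min, max, and null percentage settings outside of the properties window, here is a small Python sketch of the same idea. It is only an illustration, not the Visual Studio generator: generate_ages, null_percent, and seed are names I made up, and I am assuming for the sake of the example that the Age column allows nulls.

```python
import random


def generate_ages(count, minimum=9, maximum=85, null_percent=0, seed=None):
    """Return `count` ages between minimum and maximum (inclusive).

    Hypothetical helper: roughly `null_percent` percent of the values
    are None, standing in for NULL. Values are not required to be unique.
    """
    rng = random.Random(seed)  # a seed makes the test data repeatable
    ages = []
    for _ in range(count):
        if rng.random() * 100 < null_percent:
            ages.append(None)
        else:
            ages.append(rng.randint(minimum, maximum))
    return ages


# 1000 rows, about 10% of them null
ages = generate_ages(1000, null_percent=10, seed=42)
null_count = sum(1 for age in ages if age is None)
print(f"{null_count} of {len(ages)} ages are null")
```

Passing a seed is a design choice worth copying in real test data: when a test fails on row 734, you want to be able to generate the exact same row 734 again.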
Data Generation Plans
Generating Data with Data Generators
Creating Custom Generators