Mayank Nagar's Blog: 2008

Sunday, August 10, 2008

New DataSet Features in ADO.NET 2.0

Summary: Learn about the new ADO.NET 2.0 features in the DataSet .NET Framework class and the classes that are closely related to it. These changes include both functional and performance enhancements to the DataSet, DataTable, and DataView classes.

Download the DataSetSamples.exe sample code associated with the article.

Introduction
Raw Performance
The DataTable – More Independent Than Before
Stream to Cache, Cache to Stream
Conclusion

Introduction

In the upcoming release of ADO.NET, ADO.NET 2.0, there are many new and improved features that affect many different .NET Framework classes and application development scenarios. This article discusses the changes and enhancements to the core disconnected-mode ADO.NET Framework classes: the DataSet and associated classes such as DataTable and DataView.

This article is actually the first of two articles on the DataSet and associated classes in ADO.NET 2.0. Here we will focus on the classes in the .NET Framework. In the subsequent article, we will focus on developing with these and related classes from within the Visual Studio 2005 development environment. Visual Studio 2005 offers several designers and tools that offer tremendous flexibility and productivity for developing the data-centric aspects of your application. As a result, each article will have a different "feel". This article is mainly an overview of new functionality, accompanied by explanations and code samples. In the next article, the focus is more on the development process, as we see how to develop a working application.

As I mentioned above, this article covers only a small slice of the new features of ADO.NET 2.0. An overview of some of the other features can be found in ADO.NET 2.0 Feature Matrix. More in-depth information on some of the topics mentioned there can be found in these articles:

Unless noted otherwise, the contents of this article are based on the Beta 1 release of Visual Studio 2005. The code samples use the Northwind database that comes as a sample database with SQL Server 2000.

Raw Performance

Software developers are always concerned with performance. Sometimes they get over-concerned and make their code jump through hoops to just trim a little execution time, in places where it ultimately isn't significant—but that is a subject for another article. When it comes to ADO.NET 1.x DataSets, particularly those containing a large amount of data, the performance concerns expressed by developers are indeed justified. Large DataSets are slow—in two different contexts. The first time the sluggish performance is felt is when loading a DataSet (actually, a DataTable) with a large number of rows. As the number of rows in a DataTable increases, the time to load a new row increases almost proportionally to the number of rows in the DataTable. The other time the performance hit is felt is when serializing and remoting a large DataSet. A key feature of the DataSet is the fact that it automatically knows how to serialize itself, especially when we want to pass it between application tiers. However, a close look reveals that this serialization is quite verbose, consuming much memory and network bandwidth. Both of these performance bottlenecks are addressed in ADO.NET 2.0.

New Indexing Engine

The indexing engine for the DataTable has been completely rewritten in ADO.NET 2.0 and scales much better for large datasets. This results in faster basic inserts, updates, and deletes, and therefore faster Fill and Merge operations. While benchmarking and quantifying performance gains is always an application-specific and often risky affair, these improvements clearly provide more than an order of magnitude improvement in loading a DataTable with a million rows. But don't take my word for it; check it out yourself with the following simple example. Add the following code as the click event handler for a button on a Windows form:

    Private Sub LoadButton_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles LoadButton.Click
        Dim ds As New DataSet
        Dim time1 As New Date
        Dim i As Integer
        Dim dr As DataRow

        ds.Tables.Add("BigTable")
        ds.Tables(0).Columns.Add("ID", Type.GetType("System.Int32"))
        ds.Tables(0).Columns("ID").Unique = True
        ds.Tables(0).Columns.Add("Value", Type.GetType("System.Int32"))

        ' Show status label
        WaitLabel.Visible = True
        Me.Cursor = Cursors.WaitCursor
        Me.Refresh()

        ' catch start time
        time1 = DateTime.Now()

        ' Yes, we are loading a million rows to a DataTable!
        ' If you compile/run this with ADO.NET 1.1, you have time
        ' to make and enjoy a fresh pot of coffee...
        Dim rand As New Random
        Dim value As Integer

        For i = 1 To 1000000
            Try
                value = rand.Next
                dr = ds.Tables(0).NewRow()
                dr("ID") = value
                dr("Value") = value
                ds.Tables(0).Rows.Add(dr)
            Catch ex As Exception
                ' if there are any duplicate values, an exception
                ' will be thrown since the ID column was specified
                ' to be unique
            End Try
        Next

        ' reset cursor and label
        WaitLabel.Visible = False
        Me.Cursor = Me.DefaultCursor

        ' Show elapsed time, in seconds
        MessageBox.Show("Elapsed Time: " & _
            DateDiff(DateInterval.Second, time1, DateTime.Now))

        ' verify number of rows in the table
        ' This number will probably be less than the number
        ' of loop iterations, since if the same random number
        ' comes up, it will/can not be added to the table
        MessageBox.Show("count = " & ds.Tables(0).Rows.Count)
    End Sub

When I ran this code in my environment with ADO.NET 1.1 and Visual Studio 2003, the execution time was about 30 minutes. With ADO.NET 2.0 and Visual Studio 2005, I had an execution time of approximately 40-50 seconds! When I lowered the number of rows to only half a million, the 1.1 version took about 45 seconds and the 2.0 version took about 20 seconds. Your numbers will vary, but I think the point is clear.

In fact, this example is a very simple one, since it contains only one index, for the unique column. However, as the number of indices on the specified DataTable increases, such as by adding additional DataViews, UniqueKeys and ForeignKeys, the performance difference will be that much greater.

Note The ID value in the sample code is generated by a random number generator, rather than taken from the loop counter, in order to better represent the real-world scenario. In real applications, accessing the elements of a DataTable for Inserts, Updates, and Deletes is rarely done sequentially. For each operation, the row specified by the unique key must first be located. When inserting and deleting rows, the table's indices must be updated. If we were to just load a million rows with sequential key values into an empty table, the results would be extremely fast, but misleading.

Binary Serialization Option

The major performance improvement in loading a DataTable with a lot of data did not require us to make any change at all to our existing ADO.NET 1.x code. In order to benefit from improved performance when serializing the DataSet, we need to work a bit harder—we need to add a single line of code to set the new RemotingFormat property.

In ADO.NET 1.x, the DataSet serializes as XML, even when using the binary formatter. In ADO.NET 2.0, in addition to this behavior, we can also specify true binary serialization, by setting the RemotingFormat property to SerializationFormat.Binary rather than (the default) SerializationFormat.XML. Let us take a look at the different outputs resulting from these two different options.

In order to maintain backwards compatibility (about which the ADO.NET team was always concerned), the default value of XML serialization will give us the same behavior as in ADO.NET 1.x. The results of this serialization can be seen by running this code:

    Private Sub XMLButton_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles XMLButton.Click
        Dim ds As New DataSet
        Dim da As New SqlDataAdapter("select * from [order details]", _
            GetConnectionString())
        da.Fill(ds)

        Dim bf As New BinaryFormatter
        Dim fs As New FileStream("..\xml.txt", FileMode.CreateNew)
        bf.Serialize(fs, ds)
        ' Close the stream so the file is fully written to disk
        fs.Close()
    End Sub

    Private Function GetConnectionString() As String
        ' To avoid hard-coding the connection string in your code,
        ' use the application settings
        Return MySettings.Value.NorthwindConnection
    End Function

Note that this code is explicitly using the BinaryFormatter class, yet the output in file xml.txt, shown in Figure 1, is clearly XML. Also, in this case, the size of the file is 388 KB.

Let us now change the serialization format to binary by adding the line

ds.RemotingFormat = SerializationFormat.Binary

and save the data to a different file by modifying the filename in the FileStream constructor so that the code now looks like this:

    Private Sub BinaryButton_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles BinaryButton.Click
        Dim ds As New DataSet
        Dim da As New SqlDataAdapter("select * from [order details]", _
            GetConnectionString())
        da.Fill(ds)

        Dim bf As New BinaryFormatter
        Dim fs As New FileStream("..\binary.txt", FileMode.CreateNew)
        ds.RemotingFormat = SerializationFormat.Binary
        bf.Serialize(fs, ds)
        ' Close the stream so the file is fully written to disk
        fs.Close()
    End Sub

The output in file binary.txt is shown in Figure 2. Here we see that it is now in fact binary data, pretty unintelligible to the human reader. Moreover, the size of this file is only 59 KB, compared with 388 KB for the XML version: more than a sixfold reduction in the amount of data that needs to be transferred, with corresponding savings in the CPU, memory, and bandwidth resources required to process it. It should be pointed out that this improvement is relevant when using remoting and not when using Web Services, since Web Services by definition must be passing XML. This means that you will only be able to take advantage of this enhancement when both sides of the communication are .NET-based and not when communicating with non-.NET platforms.

More in-depth details about the DataSet serialization process can be found in Binary Serialization of DataSets.
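The size comparison can also be tried without a database or files. The following is a minimal sketch, assuming the .NET Framework 2.0 or later (where BinaryFormatter is available); the module name, the SerializedSize helper, and the in-memory sample data are hypothetical stand-ins for the Northwind query used above, so the byte counts will not match the 388 KB and 59 KB figures.

```vbnet
Imports System
Imports System.Data
Imports System.IO
Imports System.Runtime.Serialization.Formatters.Binary

Module RemotingFormatSketch
    ' Hypothetical helper: serializes the DataSet with BinaryFormatter
    ' using the given RemotingFormat and returns the payload size in bytes.
    Function SerializedSize(ByVal ds As DataSet, _
            ByVal format As SerializationFormat) As Long
        ds.RemotingFormat = format
        Dim bf As New BinaryFormatter()
        Using ms As New MemoryStream()
            bf.Serialize(ms, ds)
            Return ms.Length
        End Using
    End Function

    Sub Main()
        ' Made-up in-memory data standing in for [order details].
        Dim ds As New DataSet("Demo")
        Dim dt As DataTable = ds.Tables.Add("Details")
        dt.Columns.Add("OrderID", GetType(Integer))
        dt.Columns.Add("Quantity", GetType(Integer))
        For i As Integer = 1 To 2000
            dt.Rows.Add(i, i Mod 50)
        Next

        Console.WriteLine("Xml:    {0} bytes", _
            SerializedSize(ds, SerializationFormat.Xml))
        Console.WriteLine("Binary: {0} bytes", _
            SerializedSize(ds, SerializationFormat.Binary))
    End Sub
End Module
```

Writing to a MemoryStream keeps the measurement about the serialized payload only; how large the gap is will depend on the shape of your own data.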

The DataTable – More Independent Than Before

When discussing ADO.NET 1.x and its object model for disconnected data access, the central object was the DataSet. Sure, it contained other objects, such as the DataTable, DataRelation, DataRow, etc., but the attention generally started and revolved around the DataSet. It is true that most .NET developers were aware of and leveraged the fact that the DataTable was quite useful on its own, without being encapsulated inside a DataSet. However, there were some scenarios where we couldn't do what we wanted to do with a DataTable unless we first took it and forced it into a DataSet. The most glaring and often painful example of this is reading and writing (loading and saving) XML data into and out of the DataTable. In ADO.NET 1.x, we must first add the DataTable to a DataSet, just so we can read or write XML, since the methods to do so are only available on the DataSet!

One of the objectives of ADO.NET 2.0 was to make the standalone DataTable class far more functional and useful than it is in ADO.NET 1.x. The DataTable now supports the basic methods for XML, just as the DataSet does. This includes the following methods:

ReadXML
ReadXMLSchema
WriteXML
WriteXMLSchema
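As a rough illustration of these methods on a stand-alone DataTable, here is a minimal sketch that round-trips a table through XML held in a string, with no DataSet involved. The module name and the single sample row are assumptions made for the example; the table is given a name because a DataTable cannot be written as XML without one, and the schema is written inline so that an empty table can read it back.

```vbnet
Imports System
Imports System.Data
Imports System.IO

Module DataTableXmlSketch
    Sub Main()
        ' The table needs a name before it can be written as XML.
        Dim dt As New DataTable("Customers")
        dt.Columns.Add("CustomerID", GetType(String))
        dt.Columns.Add("ContactName", GetType(String))
        dt.Rows.Add("ALFKI", "Maria Anders")   ' sample row for illustration

        ' Write the data plus an inline schema to a string.
        Dim writer As New StringWriter()
        dt.WriteXml(writer, XmlWriteMode.WriteSchema)

        ' Read it back into a brand-new table that has no schema yet.
        Dim copy As New DataTable()
        copy.ReadXml(New StringReader(writer.ToString()))

        Console.WriteLine("{0} row(s) read into table '{1}'", _
            copy.Rows.Count, copy.TableName)
    End Sub
End Module
```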

The DataTable is independently serializable and can be used in both web service and remoting scenarios. In addition to now supporting the Merge method, the stand-alone DataTable also supports new ADO.NET 2.0 features added to the DataSet:

RemotingFormat property (discussed previously)
Load method (discussed later in this article)
GetDataReader method (discussed later in this article)

Note On the topic of XML, it is worth noting that in ADO.NET 2.0 there is much enhanced XML support—what Microsoft likes to call greater "XML Fidelity". This takes the form of support for the SQL Server 2005 XML data type, extended XSD schema support, an improved XSD schema inference engine, and the elimination of two often troublesome limitations: (i) The DataSet and DataTable classes can now handle multiple in-line schemas and (ii) The DataSet now fully supports namespaces, so that a DataSet can contain multiple DataTables with the same name, but from different namespaces, i.e., tables with the same unqualified names, but with different qualified names. Also, a child table with the same name and namespace that is included in multiple relations can be nested in multiple parent tables.

Stream to Cache, Cache to Stream

Another one of the main enhancements for the DataSet and DataTable classes in ADO.NET 2.0 is the availability of mechanisms to consume a DataReader (loading data into DataTables) and to expose a DataReader over the contents of DataTables.

Sometimes we have/receive our data in the form of a DataReader, but really want to have it in the form of a cached DataTable. The new Load method allows us to take an existing DataReader and use it to fill a DataTable with its contents.

Sometimes we have/receive our data in a cached form (DataTable) and need to access it via a DataReader type interface. The new GetDataReader method allows us to take an existing DataTable and access it with a DataReader interface and semantics.

In the following sections, we'll take a look at these new methods.

The Load Method – Basic Use

The Load method is a new method that has been added to the DataSet and the DataTable in ADO.NET 2.0. It loads a DataTable with the contents of a DataReader object. It can actually load multiple tables at one time, if the DataReader contains multiple resultsets.

The basic use of the Load method is quite straightforward:

MyDataTable.Load(MyDataReader)

A more complete illustration of its use is shown in this sample code:

    Private Sub LoadButton_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles LoadButton.Click
        Try
            Using connection As New SqlConnection(GetConnectionString())
                Using command As New SqlCommand("SELECT * from customers", connection)
                    connection.Open()
                    Using dr As SqlDataReader = command.ExecuteReader()
                        'Fill table with data from DataReader
                        Dim dt As New DataTable
                        dt.Load(dr, LoadOption.OverwriteRow)

                        ' Display the data
                        DataGridView1.DataSource = dt
                    End Using
                End Using
            End Using
        Catch ex As SqlException
            MessageBox.Show(ex.Message)
        Catch ex As InvalidOperationException
            MessageBox.Show(ex.Message)
        Catch ex As Exception
            ' You might want to pass these errors
            ' back out to the caller.
            MessageBox.Show(ex.Message)
        End Try
    End Sub

The code above initializes connection and command objects and then executes the ExecuteReader method to fetch the data from the database. The results of the query are provided as a DataReader, which is then passed to the Load method of the DataTable to fill it with the returned data. Once the DataTable is filled with the data, it can be bound and displayed in the DataGridView. The significance of the OverwriteRow load option for the (optional) LoadOption parameter will be explained in the next section.

The Load Method – Why am I loading this data?

If all you are doing with your DataSet/DataTable and DataAdapter is filling the DataSet with data from the data source, modifying that data, and then at some later point pushing it back into the data source, then things generally move pretty smoothly. A first complication occurs if you are utilizing optimistic concurrency and a concurrency violation is detected (someone else already changed one of the rows you are trying to change). In this case what you normally need to do to resolve the conflict is to resynchronize the DataSet with the data source, so that the original values for the rows match the current database values. This can be accomplished by merging a DataTable with the new values into the original table (in ADO.NET 1.x, the Merge method is only available on the DataSet):

OriginalTable.Merge(NewTable, True)

By matching rows with the same primary key, records in the new table are merged with the records in the original table. Of key significance here is the second parameter, PreserveChanges. This specifies that the merge operation should only update the original values for each row, and not affect the current values for the row. This allows the developer to subsequently execute a DataAdapter.Update that will now succeed in updating the data source with the changes (current values), since the original values now match the current data source values. If PreserveChanges is left at its default value of false, the merge would override both the original and current values of the rows in the original DataTable and all of the changes that were made would be lost.
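To make the effect of PreserveChanges concrete, here is a minimal sketch using the DataTable.Merge overload available in ADO.NET 2.0. The module name, the NewTable helper, and the values are made up for the illustration: one row is edited locally, and a second table plays the role of the freshly re-read database values.

```vbnet
Imports System
Imports System.Data

Module MergeSketch
    ' Hypothetical helper: a one-row table keyed on ID, in the Unchanged state.
    Function NewTable(ByVal value As Integer) As DataTable
        Dim dt As New DataTable("Items")
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Value", GetType(Integer))
        dt.PrimaryKey = New DataColumn() {dt.Columns("ID")}
        dt.Rows.Add(1, value)
        dt.AcceptChanges()
        Return dt
    End Function

    Sub Main()
        ' Local copy: loaded as 4, then edited to 2 (row becomes Modified).
        Dim original As DataTable = NewTable(4)
        original.Rows(0)("Value") = 2

        ' What the data source holds now, after someone else's change.
        Dim refreshed As DataTable = NewTable(3)

        ' Second argument is PreserveChanges.
        original.Merge(refreshed, True)

        Dim row As DataRow = original.Rows(0)
        Console.WriteLine("Current={0}, Original={1}, State={2}", _
            row("Value", DataRowVersion.Current), _
            row("Value", DataRowVersion.Original), _
            row.RowState)
    End Sub
End Module
```

Running the same sketch with False as the second argument is a quick way to see the local edit being discarded, as described above.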

However, sometimes we want to update data in the data source, where the new values don't come from programmatically modifying the values. Perhaps we obtain updated values from another database or from an XML source. In this scenario, we want to update the current values of the rows in the DataTable, but not affect the original values for those rows. There is no easy way to do this in ADO.NET 1.x. It is for this reason that the ADO.NET 2.0 Load method accepts a parameter LoadOption that indicates how to combine the new incoming rows with the same (primary key) rows already in the DataTable.

The LoadOption allows us to explicitly specify what our intention is when loading the data (synchronization or aggregation) and how we therefore want to merge the new and existing rows. Figure 3 outlines the various scenarios:

Where:

Primary Data Source—DataTable/DataSet synchronizes/updates with only one Primary Data Source. It will track changes to allow for synchronization with primary data sources.
Secondary Data Source—DataTable/DataSet accepts incremental data feeds from one or more Secondary Data Sources. It is not responsible for tracking changes for the purpose of synchronization with secondary data sources.

The three cases shown in Figure 3 can be summarized as follows:

Case 1—Initialize DataTable(s) from Primary Data Source. The user wants to initialize an empty DataTable (original values and current values) with values from primary data source and then later, after changes have been made to this data, propagate the changes back to the primary data source.
Case 2—Preserve Changes and Re-Sync from Primary Data Source. The user wants to take the modified DataTable and re-synchronize its contents (original values only) with the primary data source while maintaining the changes made (current values).
Case 3—Aggregate incremental data feeds from one or more Secondary Data Sources. The user wants to accept changes (current values) from one or more secondary data sources and then propagate these changes back to the primary data source.

The LoadOption enumeration has three values that respectively represent these three scenarios:

OverwriteRow—Update the current and original versions of the row with the value of the incoming row.
PreserveCurrentValues (default)—Update original version of the row with the value of the incoming row.
UpdateCurrentValues—Update the current version of the row with the value of the incoming row.

Note These names will probably change post-Beta 1.
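The names did change. In the released .NET Framework 2.0, to the best of my knowledge, the LoadOption values are OverwriteChanges, PreserveChanges (the default), and Upsert, corresponding to OverwriteRow, PreserveCurrentValues, and UpdateCurrentValues respectively, and the reader over a DataTable is obtained with CreateDataReader. The sketch below uses those released names; the module and helper names and the values (original 4, current 2, incoming 3) are chosen for illustration. It loads the same incoming row into a Modified row under each option and prints the resulting versions and state.

```vbnet
Imports System
Imports System.Data

Module LoadOptionSketch
    ' Hypothetical helper: one row keyed on ID that is Modified
    ' (original Value = 4, current Value = 2).
    Function BuildExisting() As DataTable
        Dim dt As New DataTable("Existing")
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Value", GetType(Integer))
        dt.PrimaryKey = New DataColumn() {dt.Columns("ID")}
        dt.Rows.Add(1, 4)
        dt.AcceptChanges()
        dt.Rows(0)("Value") = 2
        Return dt
    End Function

    ' Hypothetical helper: the incoming row with the same key and Value = 3.
    Function BuildIncoming() As DataTable
        Dim dt As New DataTable("Incoming")
        dt.Columns.Add("ID", GetType(Integer))
        dt.Columns.Add("Value", GetType(Integer))
        dt.Rows.Add(1, 3)
        dt.AcceptChanges()
        Return dt
    End Function

    Sub Main()
        Dim options As LoadOption() = New LoadOption() { _
            LoadOption.Upsert, _
            LoadOption.OverwriteChanges, _
            LoadOption.PreserveChanges}

        For Each opt As LoadOption In options
            Dim existing As DataTable = BuildExisting()
            Using reader As DataTableReader = BuildIncoming().CreateDataReader()
                existing.Load(reader, opt)
            End Using

            Dim row As DataRow = existing.Rows(0)
            Console.WriteLine("{0}: Current={1}, Original={2}, State={3}", _
                opt, _
                row("Value", DataRowVersion.Current), _
                row("Value", DataRowVersion.Original), _
                row.RowState)
        Next
    End Sub
End Module
```

Changing BuildExisting to leave the row Unchanged, or to omit it entirely, lets you explore the other starting states in the same way.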

Table 1 below summarizes the load semantics. If the incoming row and an existing row agree on primary key values, the row is processed according to the existing row's DataRowState; otherwise, the 'Not Present' section (the last row in the table) applies.

Table 1. Summary of Load Semantics

Existing DataRow State	UpdateCurrentValues	OverwriteRow	PreserveCurrentValues (Default)
Added	Current =; Original = ---; State =	Current =; Original =; State =	Current =; Original =; State =
Modified	Current =; Original =; State =	Current =; Original =; State =	Current =; Original =; State =
Deleted	(Undo Delete) and Current =; Original =; State = <>	(Undo Delete) and Current =; Original =; State =	Current =; Original =; State =
Unchanged	Current =; Original =; If new value same as existing value then State =, Else State =	Current =; Original =; State =	Current =; Original =; State =
Not Present	Current =; Original = ---; State = <>	Current =; Original =; State =	Current =; Original =; State =

Example

In order to illustrate the behavior specified in Table 1, I offer a simple example.

Assume that both the existing DataRow and incoming row have 2 columns with matching names. The first column is the primary key and the second column contains a numeric value. The tables below show the contents of the second column in the data rows.

Table 2 represents the contents of a row in all 4 states before invoking Load. The incoming row's second column value is 3. Table 3 shows its contents after load.

Table 2. Row State Before Load

Existing Row State	Version	Added	Modified	Deleted	Unchanged
	Current	2	2	-	4
	Original	-	4	4	4

Incoming Row: second column value = 3

Table 3. Row State After Load

Existing Row State	UpdateCurrentValues	OverwriteRow	PreserveCurrentValues
Added	Current = <3>; Original = ---; State =	Current = <3>; Original = <3>; State =	Current = <2>; Original = <3>; State =
Modified	Current = <3>; Original = <4>; State =	Current = <3>; Original = <3>; State =	Current = <2>; Original = <3>; State =
Deleted	Current = <3>; Original = <4>; State =	Current = <3>; Original = <3>; State =	Current = <2>; Original = <3>; State =
Unchanged	Current = <3>; Original = <4>; State =	Current = <3>; Original = <3>; State =	Current = <3>; Original = <3>; State =
Not Present	Current = <3>; Original = ---; State =	Current = <3>; Original = <3>; State =	Current = <3>; Original = <3>; State =

Note You can see the beginnings of this concept already in ADO.NET 1.x. The default behavior of the DataAdapter's Fill method when loading data into a DataTable is to mark all the rows as Unchanged (this can be overridden by setting the AcceptChangesDuringFill property to False). However, when using ReadXML to load data into a DataSet, the rows are marked as Added. The rationale for this (which was implemented based on customer feedback) is that this would allow loading new data from an XML source into a DataSet and then using the associated DataAdapter to update the primary data source. If the rows were marked as Unchanged when loaded from ReadXML, the DataAdapter.Update would not detect any changes and would not execute any commands against the data source.

In order to provide similar functionality, the FillLoadOptions property has been added to the DataAdapter in order to offer the same semantics and behavior as the Load method described here, while still preserving the same (by default) existing behavior of the Fill method.

Another feature that developers always ask about, and which doesn't exist in ADO.NET 1.x, is the ability to manually modify the state of a DataRow. While the options offered by the Load method may address most scenarios, you may still want finer-grained control over the row state; you may need to modify the state of individual rows. To that end, ADO.NET 2.0 introduces two new methods on the DataRow class: SetAdded and SetModified. Before you ask about setting the state to Deleted or Unchanged, let me remind you that with version 1.x we already have the Delete and AcceptChanges/RejectChanges methods to accomplish this.
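A minimal sketch of the two methods follows; the module name, table, and values are made up for the example. One point worth knowing, which the sketch relies on, is that SetAdded and SetModified are intended to be called on rows that are currently Unchanged, so the rows are accepted first.

```vbnet
Imports System
Imports System.Data

Module RowStateSketch
    Sub Main()
        Dim dt As New DataTable("Items")
        dt.Columns.Add("ID", GetType(Integer))
        dt.Rows.Add(1)
        dt.Rows.Add(2)
        dt.AcceptChanges()          ' both rows are now Unchanged

        ' Explicitly re-label the rows without touching their values.
        dt.Rows(0).SetAdded()
        dt.Rows(1).SetModified()

        For Each row As DataRow In dt.Rows
            Console.WriteLine("ID {0}: {1}", row("ID"), row.RowState)
        Next
    End Sub
End Module
```

A DataAdapter.Update call over such a table would then treat the first row as an insert and the second as an update, which is the kind of control the paragraph above describes.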

The GetDataReader Method

The GetDataReader method is a new method that has been added to the DataSet and the DataTable in ADO.NET 2.0. It returns the contents of a DataTable as a DataTableReader (derived from DbDataReader) object. If it is invoked on a DataSet that contains multiple tables, the DataReader will contain multiple resultsets.

The use of the GetDataReader method is quite straightforward:

Dim dtr As DataTableReader = ds.Tables(0).GetDataReader

The DataTableReader works pretty much like the other data readers you have worked with, such as the SqlDataReader or OleDbDataReader. The difference is, however, that rather than streaming data from a live database connection, the DataTableReader provides iteration over the rows of a disconnected DataTable.

The DataTableReader provides a smart, stable iterator. The cached data may be modified while the DataTableReader is active and the reader will automatically maintain its position appropriately even if one or more rows are deleted or inserted while iterating.

A DataTableReader that is created by calling GetDataReader on a DataTable contains one result set with the same data as the DataTable from which it was created. The result set contains only the current column values for each DataRow and rows that are marked for deletion are skipped. A DataTableReader that is created by calling GetDataReader on a DataSet that contains more than one table will contain multiple result sets. The result sets will be in the same sequence as the DataTable objects in the DataSet object's DataTableCollection.
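The behavior described above can be sketched as follows. Note that this sketch uses CreateDataReader, which, as far as I know, is the name this method carries in the released .NET Framework 2.0 (the Beta 1 name used in this article is GetDataReader). The module name, tables, and values are hypothetical; one row is marked for deletion to show that the reader passes over it.

```vbnet
Imports System
Imports System.Data

Module TableReaderSketch
    Sub Main()
        Dim ds As New DataSet("Demo")

        Dim customers As DataTable = ds.Tables.Add("Customers")
        customers.Columns.Add("Name", GetType(String))
        customers.Rows.Add("Alice")
        customers.Rows.Add("Bob")

        Dim orders As DataTable = ds.Tables.Add("Orders")
        orders.Columns.Add("OrderID", GetType(Integer))
        orders.Rows.Add(1)

        ds.AcceptChanges()
        customers.Rows(1).Delete()   ' marked Deleted, not yet removed

        ' One result set per table, in DataTableCollection order.
        Using reader As DataTableReader = ds.CreateDataReader()
            Do
                Console.WriteLine("Result set, first column '{0}':", _
                    reader.GetName(0))
                While reader.Read()
                    Console.WriteLine("  {0}", reader(0))
                End While
            Loop While reader.NextResult()
        End Using
    End Sub
End Module
```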

In addition to the features outlined above, another great use of the GetDataReader method is to quickly copy data from one DataTable to another:

Dim dt2 As New DataTable
dt2.Load(ds.Tables(0).GetDataReader)

The DataView.ToTable Method

Another new method that is somewhat related to the previous ones (in that it provides a new DataTable cache of existing data) and is worth mentioning is the ToTable method of the DataView class. As a reminder, the DataView class provides a logical view of the rows in a DataTable. This view may be filtered by row and by row state, and may be sorted. However, in ADO.NET 1.1, there is no easy way to save or pass on the rows of the view, since the DataView does not have its own copy of the rows; it simply accesses the rows of the underlying DataTable as prescribed by the filter and sort parameters. The DataView's ToTable method returns an actual DataTable object that is populated with the rows exposed by the current view.

Overloaded versions of the ToTable method offer the option of specifying the list of columns to be included in the created table. The generated table will contain the listed columns in the specified sequence, which may differ from the original table/view. This ability to limit the number of columns in a view is a feature that is missing in ADO.NET 1.x and has frustrated many a .NET programmer. You can also specify the name of the created table and whether it should contain all or only distinct rows.

Here is some sample code that shows how to use the ToTable method:

    Private Sub ToTableButton_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles ToTableButton.Click
        ' Show only 2 columns in second grid
        Dim columns As String() = {"CustomerID", "ContactName"}
        Dim dt As DataTable = _
            ds.Tables("customers").DefaultView.ToTable( _
                "SmallCustomers", False, columns)
        DataGridView2.DataSource = dt
    End Sub

Assuming that the contents of the "customers" table in the DataSet ds are displayed in a first grid, this routine displays the newly created DataTable that contains only those rows exposed by the DefaultView (as specified by its filter parameters). The rows in the new table contain only two of the columns of the original DataTable and DataView. An example of this can be seen in Figure 4.
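The distinct-rows option mentioned earlier can be sketched in the same spirit, without a grid or a database. The module name, customer IDs, and countries below are made up for the illustration; the view filters and sorts the rows, and ToTable is asked for a single column with distinct set to True.

```vbnet
Imports System
Imports System.Data

Module ToTableSketch
    Sub Main()
        Dim customers As New DataTable("customers")
        customers.Columns.Add("CustomerID", GetType(String))
        customers.Columns.Add("Country", GetType(String))
        customers.Rows.Add("A1", "Germany")
        customers.Rows.Add("A2", "Germany")
        customers.Rows.Add("A3", "Mexico")
        customers.Rows.Add("A4", "Sweden")

        ' The view decides which rows are exposed and in what order.
        Dim view As New DataView(customers)
        view.RowFilter = "Country <> 'Sweden'"
        view.Sort = "Country"

        ' distinct = True collapses rows that are equal
        ' across the listed columns.
        Dim countries As DataTable = view.ToTable("Countries", True, "Country")

        For Each row As DataRow In countries.Rows
            Console.WriteLine(row("Country"))
        Next
    End Sub
End Module
```

With this data the new table should hold one row each for Germany and Mexico, independent of the original customers table.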

Conclusion

The ADO.NET 2.0 version of the DataSet (and DataTable) introduces numerous new features and enhancements to existing features. The main features, discussed in the article, include significantly improved performance due to a new index engine and the binary serialization format option, extensive capabilities available to a stand-alone DataTable, and mechanisms for exposing cached data as a stream (DataReader) and loading stream data into a DataTable cache. ADO.NET 2.0 also offers greater control over the state of rows in a DataTable, in order to better address more real-world scenarios.

Thanks to Mayank Nagar, Kawarjit S. Bedi, Pablo Castro, Alan Griver, Steve Lasker, and Paul Yuknewicz of Microsoft for their help in preparing this article.

Friday, August 8, 2008

CODE ACCESS SECURITY

Introduction

Over the past years, I've learned many things from CodeProject ... and now I'm giving back to the CodeProject. Since I didn't find any articles on Code Access Security, here's my one. Enjoy!

I'm not going to bore you with theory, but before we wet our feet, there are some concepts, keywords that you should learn. .NET has two kinds of security:

Role Based Security (not being discussed in this article)
Code Access Security

The Common Language Runtime (CLR) allows code to perform only those operations that the code has permission to perform. So CAS is the CLR's security system that enforces security policies by preventing unauthorized access to protected resources and operations. Using the Code Access Security, you can do the following:

Restrict what your code can do
Restrict which code can call your code
Identify code

We'll discuss these things throughout this article. Before that, you should get familiar with the jargon.

Jargon

Code access security consists of the following elements:

permissions
permission sets
code groups
evidence
policy

Permissions

Permissions represent access to a protected resource or the ability to perform a protected operation. The .NET Framework provides several permission classes, like FileIOPermission (when working with files), UIPermission (permission to use a user interface), SecurityPermission (this is needed to execute the code and can be even used to bypass security) etc. I won't list all the permission classes here, they are listed below.

Permission sets

A permission set is a collection of permissions. You can put FileIOPermission and UIPermission into your own permission set and call it "My_PermissionSet". A permission set can include any number of permissions. FullTrust, LocalIntranet, Internet, Execution and Nothing are some of the built in permission sets in .NET Framework. FullTrust has all the permissions in the world, while Nothing has no permissions at all, not even the right to execute.

Code groups

A code group is a logical grouping of code that has a specified condition for membership. Code from http://www.somewebsite.com/ can belong to one code group, code containing a specific strong name can belong to another code group and code from a specific assembly can belong to another code group. There are built-in code groups like My_Computer_Zone, LocalIntranet_Zone, Internet_Zone etc. Like permission sets, we can create code groups to meet our requirements based on the evidence provided by .NET Framework. Site, Strong Name, Zone, URL are some of the types of evidence.

Policy

Security policy is the configurable set of rules that the CLR follows when determining the permissions to grant to code. There are four policy levels - Enterprise, Machine, User and Application Domain, each operating independently from each other. Each level has its own code groups and permission sets. They have the hierarchy given below.

Figure 1

Okay, enough with the theory, it's time to put the theory into practice.

Quick Example

Let's create a new Windows application. Add two buttons to the existing form. We are going to work with the file system, so add the System.IO namespace.

using System.IO;

Figure 2

Write the following code:

private void btnWrite_click(object sender, System.EventArgs e)
{
    // Creates (or overwrites) the file and writes a single line.
    StreamWriter myFile = new StreamWriter("c:\\Security.txt");
    myFile.WriteLine("Trust No One");
    myFile.Close();
}

private void btnRead_click(object sender, System.EventArgs e)
{
    // Reads the first line back and shows it.
    StreamReader myFile = new StreamReader("c:\\Security.txt");
    MessageBox.Show(myFile.ReadLine());
    myFile.Close();
}

The version number must stay fixed for our example to work. Set it to a fixed value; otherwise it will get incremented every time you compile the code. We're going to sign this assembly with a strong name, which is used as evidence to identify our code, and the version is part of that identity.

[assembly: AssemblyVersion("1.0.0.0")]

That's it ... nothing fancy. This will write to a file named Security.txt on the C: drive. Now run the code; it should create the file and write the line, and everything should be fine ... unless of course you don't have a C: drive. What we are going to do next is put our assembly into a code group and set some permissions. Don't delete the Security.txt file yet, we are going to need it later. Here we go.

.NET Configuration Tool

We can do this in two ways, from the .NET Configuration Tool or from the command prompt using caspol.exe. First we'll do this using the .NET Configuration Tool. Go to Control Panel --> Administrative Tools --> Microsoft .NET Framework Configuration. You can also type "mscorcfg.msc" at the .NET command prompt. You can do cool things with this tool ... but right now we are only interested in setting code access security.

Figure 3

Creating a new permission set

Expand the Runtime Security Policy node. You can see the security policy levels - Enterprise, Machine and User. We are going to change the security settings in Machine policy. First we are going to create our own custom permission set. Right click the Permission Sets node and choose New. Since I couldn't think of a catchy name, I'm going to name it MyPermissionSet.

Figure 4

In the next screen, we can add permissions to our permission set. In the left panel, we can see all the permissions supported by the .NET Framework. Now get the properties of File IO permission. Set the File Path to C:\ and check Read only, don't check others. So we didn't give write permission, we only gave read permission. Please note that there is another option saying "Grant assemblies unrestricted access to the file system." If this is selected, anything can be done without any restrictions for that particular resource, in this case the file system.

Figure 5

Now we have to add two more permissions - Security and User Interface. Just add them and remember to set the "Grant assemblies unrestricted access". I'll explain these properties soon. Without the Security permission, we don't have the right to execute our code, and without the User Interface permission, we won't be able to show a UI. If you're done adding these three permissions, you can see there is a new permission set created, named MyPermissionSet.

Creating a new code group

Now we will create a code group and set some conditions, so our assembly will be a member of that code group. Notice that in the code groups node, All_Code is the parent node. Right Click the All_Code node and choose New. You'll be presented with the Create Code Group wizard. I'm going to name it MyCodeGroup.

Figure 6

In the next screen, you have to provide a condition type for the code group. Now these are the evidence that I mentioned earlier. For this example, we are going to use the Strong Name condition type. First, sign your assembly with a strong name and build the project. Now press the Import button and select your assembly. Public Key, Name and Version will be extracted from the assembly, so we don't have to worry about them. Now move on to the next screen. We have to specify a permission set for our code group. Since we have already created one - MyPermissionSet, select it from the list box.

Figure 7

Exclusive and LevelFinal

If you haven't messed around with the default .NET configuration security settings, your assembly already belongs to another built-in code group - My_Computer_Zone. When permissions are calculated, if a particular assembly falls into more than one code group within the same policy level, the final permissions for that assembly will be the union of all the permissions in those code groups. I'll explain how to calculate permissions later; for the time being, we want our assembly to run with our permission set only, that is, MyPermissionSet associated with MyCodeGroup. So we have to set another property to do just that. Right click the newly created MyCodeGroup node and select Properties. Check the check box saying "This policy level will only have the permissions from the permission set associated with this code group." This is called the Exclusive attribute. If this is checked, the runtime will never grant more permissions than those associated with this code group. The other option is called LevelFinal. Both properties come into play when permissions are calculated, and they are explained in detail below.

Figure 8

I know we have set lots of properties, but it'll all make sense at the end (hopefully).

Okay .. it's time to run the code. What we have done so far is, we have put our code into a code group and given permissions only to read from C: drive. Run the code and try both buttons. Read should work fine, but when you press Write, an exception will be thrown because we didn't set permission to write to C: drive. Below is the error message that you get.

Figure 9

So thanks to Code Access Security, this kind of restriction to a resource is possible. There's a whole lot more that you can do with Code Access Security, which we're going to discuss in the rest of this article.
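If you would rather report the failure than let the exception surface, the write can be wrapped as in the sketch below. This is only an illustration under assumptions: WriteAttempt and TryWrite are made-up names, and a console program stands in for the form so the snippet is self-contained. It also separates a refusal by code access security (SecurityException) from a refusal by the operating system (UnauthorizedAccessException).

```csharp
using System;
using System.IO;
using System.Security;

public static class WriteAttempt
{
    // Hypothetical helper: tries the same write as btnWrite_click
    // and returns a description of the outcome.
    public static string TryWrite(string path)
    {
        try
        {
            using (StreamWriter writer = new StreamWriter(path))
            {
                writer.WriteLine("Trust No One");
            }
            return "written";
        }
        catch (SecurityException ex)
        {
            // Raised by the runtime when policy did not grant write access.
            return "denied by code access security: " + ex.Message;
        }
        catch (UnauthorizedAccessException ex)
        {
            // Raised when the operating system itself refuses the write.
            return "denied by the operating system: " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryWrite("c:\\Security.txt"));
    }
}
```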

Functions of Code Access Security

According to the documentation, Code Access Security performs the following functions (quoted directly):

Defines permissions and permission sets that represent the right to access various system resources.
Enables administrators to configure security policy by associating sets of permissions with groups of code (code groups).
Enables code to request the permissions it requires in order to run, as well as the permissions that would be useful to have, and specifies which permissions the code must never have.
Grants permissions to each assembly that is loaded, based on the permissions requested by the code and on the operations permitted by security policy.
Enables code to demand that its callers have specific permissions.
Enables code to demand that its callers possess a digital signature, thus allowing only callers from a particular organization or site to call the protected code.
Enforces restrictions on code at run time by comparing the granted permissions of every caller on the call stack to the permissions that callers must have.

We have already done the top two, and that is the administrative part. There's a separate namespace that we haven't looked at yet - System.Security, which is dedicated to implementing security.

Security Namespace

These are the main classes in System.Security namespace:

Classes	Description
`CodeAccessPermission`	Defines the underlying structure of all code access permissions.
`PermissionSet`	Represents a collection that can contain many different types of permissions.
`SecurityException`	The exception that is thrown when a security error is detected.

These are the main classes in System.Security.Permissions namespace:

Classes	Description
`EnvironmentPermission`	Controls access to system and user environment variables.
`FileDialogPermission`	Controls the ability to access files or folders through a file dialog.
`FileIOPermission`	Controls the ability to access files and folders.
`IsolatedStorageFilePermission`	Specifies the allowed usage of a private virtual file system.
`IsolatedStoragePermission`	Represents access to generic isolated storage capabilities.
`ReflectionPermission`	Controls access to metadata through the `System.Reflection` APIs.
`RegistryPermission`	Controls the ability to access registry variables.
`SecurityPermission`	Describes a set of security permissions applied to code.
`UIPermission`	Controls the permissions related to user interfaces and the clipboard.

You can find more permission classes in other namespaces. For example, SocketPermission and WebPermission in System.Net namespace, SqlClientPermission in System.Data.SqlClient namespace, PerformanceCounterPermission in System.Diagnostics namespace etc. All these classes represent a protected resource.

Next, we'll see how we can use these classes.

Declarative vs. Imperative

You can use two different kinds of syntax when coding, declarative and imperative.

Declarative syntax

Declarative syntax uses attributes to mark the method, class or the assembly with the necessary security information. So when compiled, these are placed in the metadata section of the assembly.

[FileIOPermission(SecurityAction.Demand, Unrestricted=true)]
public class MyClass
{
    // The class-level attribute applies to every member below:
    // each one demands unrestricted access to the file system.
    public MyClass() {...}
    public void MyMethod_A() {...}
    public void MyMethod_B() {...}
}

Imperative syntax

Imperative syntax uses runtime method calls to create new instances of security classes.

public class MyClass
{
    public MyClass() { }

    public void Method_A()
    {
        // Do Something

        FileIOPermission myPerm =
          new FileIOPermission(PermissionState.Unrestricted);
        myPerm.Demand();
        // rest of the code won't get executed if this failed

        // Do Something
    }

    // No demands
    public void Method_B()
    {
        // Do Something
    }
}

The main difference between these two is, declarative calls are evaluated at compile time while imperative calls are evaluated at runtime. Please note that compile time means during JIT compilation (IL to native).

There are several actions that can be taken against permissions.

First, let's see how we can use the declarative syntax. Take the UIPermission class. Declarative syntax means using attributes. So we are actually using the UIPermissionAttribute class. When you refer to the MSDN documentation, you can see these public properties:

Action - one of the values in SecurityAction enum (common)
Unrestricted - unrestricted access to the resource (common)
Clipboard - type of access to the clipboard, one of the values in UIPermissionClipboard enum (UIPermission specific)
Window - type of access to the window, one of the values in UIPermissionWindow enum (UIPermission specific).

Action and Unrestricted properties are common to all permission classes. Clipboard and Window properties are specific to UIPermission class. You have to provide the action that you are taking and the other properties that are specific to the permission class you are using. So in this case, you can write like the following:

[UIPermission(SecurityAction.Demand,
      Clipboard=UIPermissionClipboard.AllClipboard)]

or with both Clipboard and Window properties:

[UIPermission(SecurityAction.Demand,
      Clipboard=UIPermissionClipboard.AllClipboard,
      Window=UIPermissionWindow.AllWindows)]

If you want to declare a permission with unrestricted access, you can do it as the following:

[UIPermission(SecurityAction.Demand, Unrestricted=true)]

When using imperative syntax, you can use the constructor to pass the values and later call the appropriate action. We'll take the RegistryPermission class.

RegistryPermission myRegPerm =
   new RegistryPermission(RegistryPermissionAccess.AllAccess,
   "HKEY_LOCAL_MACHINE\\Software");
myRegPerm.Demand();

If you want unrestricted access to the resource, you can use PermissionState enum in the following way:

RegistryPermission myRegPerm = new
  RegistryPermission(PermissionState.Unrestricted);
myRegPerm.Demand();

This is all you need to know to use any permission class in the .NET Framework. Now, we'll discuss the actions in detail.

Security Demands

Demands are used to ensure that every caller who calls your code (directly or indirectly) has been granted the demanded permission. This is accomplished by performing a stack walk. What .. a cat walk? No, that's what your girlfriend does. I mean a stack walk. When a permission is demanded, the runtime's security system walks the call stack, comparing the granted permissions of each caller to the permission being demanded. If any caller in the call stack is found without the demanded permission, a SecurityException is thrown. Please look at the following figure, which is taken from the MSDN documentation.

Figure 10

Different assemblies as well as different methods in the same assembly are checked by the stack walk.

Now back to demands. These are the three types of demands.

Demand
Link Demand
Inheritance Demand

Demand

Try this sample code. We didn't use the security namespaces before, but we are going to use them now.

using System.Security;

using System.Security.Permissions;

Add another button to the existing form.

private void btnFileRead_Click(object sender, System.EventArgs e)
{
    try
    {
        InitUI(1);
    }
    catch (SecurityException err)
    {
        MessageBox.Show(err.Message,"Security Error");
    }
    catch (Exception err)
    {
        MessageBox.Show(err.Message,"Error");
    }
}

InitUI just calls the ShowUI function. Note that it has been denied permission to read the C: drive.

// Access is denied for this function to read from C: drive
// Note: Using declarative syntax
[FileIOPermission(SecurityAction.Deny,Read="C:\\")]
private void InitUI(int uino)
{
    // Do some initializations
    ShowUI(uino);    // call ShowUI
}

The ShowUI function takes uino and shows the appropriate UI.

private void ShowUI(int uino)
{
    switch (uino)
    {
        case 1: // That's our FileRead UI
            ShowFileReadUI();
            break;
        case 2:
            // Show someother UI
            break;
    }
}

ShowFileReadUI shows the UI related to reading files.

private void ShowFileReadUI()
{
    MessageBox.Show("Before calling demand");

    // All callers must have read permission to C: drive
    // Note: Using imperative syntax
    FileIOPermission myPerm = new
      FileIOPermission(FileIOPermissionAccess.Read, "C:\\");
    myPerm.Demand();

    // code to show UI
    // This is executed only if the Demand is successful.
    MessageBox.Show("Showing FileRead UI");
}

I know that this is a silly example, but it's enough to do the job.

Now run the code. You should get the "Before calling demand" message, and right after that the custom error message - "Security Error". What went wrong? Look at the following figure:

Figure 11

We have denied read permission for the InitUI method. So when ShowFileReadUI demands read permission to C: drive, it causes a stack walk and finds out that not every caller is granted the demanded permission and throws an exception. Just comment out the Deny statement in InitUI method, then this should be working fine because all the callers have the demanded permission.

Note that according to the documentation, most classes in .NET Framework already have demands associated with them. For example, take the StreamReader class. StreamReader automatically demands FileIOPermission. So placing another demand just before it causes an unnecessary stack walk.

Link Demand

A link demand only checks the immediate caller (direct caller) of your code. That means it doesn't perform a stack walk. Linking occurs when your code is bound to a type reference, including function pointer references and method calls. A link demand can only be applied declaratively.

[FileIOPermission(SecurityAction.LinkDemand,Read="C:\\")]
private void MyMethod()
{
    // Do Something
}

Inheritance Demand

Inheritance demands can be applied to classes or methods. If it is applied to a class, then all the classes that derive from this class must have the specified permission.

[SecurityPermission(SecurityAction.InheritanceDemand)]
public class MyClass
{
    // what ever
}

If it is applied to a method, then all the classes that derive from this class must have the specified permission to override that method.

public class MyClass
{
    public MyClass() {}

    [SecurityPermission(SecurityAction.InheritanceDemand)]
    public virtual void MyMethod()
    {
        // Do something
    }
}

Like link demands, inheritance demands are also applied using declarative syntax only.

Requesting Permissions

Imagine a situation like this. You have given a nice form to the user with 20+ fields to enter and at the end, all the information would be saved to a text file. The user fills in all the necessary fields and when he tries to save, he'll get this nice message saying the application doesn't have the necessary permission to create a text file! Of course you can try to calm him down explaining all this happened because of a thing called stack walk .. caused by a demand .. and if you are really lucky you can even get away by blaming Microsoft (believe me ... sometimes it works!).

Wouldn't it be easier if you could request the permissions prior to loading the assembly? Yes, you can. There are three ways to do that in Code Access Security.

RequestMinimum
RequestOptional
RequestRefuse

Note that these can only be applied using declarative syntax at the assembly level, not to methods or classes. The best thing about requesting permissions is that the administrator can view the requested permissions after the assembly has been deployed, using permview.exe (the Permission View Tool), so whatever permissions are needed can be granted.

RequestMinimum

You can use RequestMinimum to specify the permissions your code must have in order to run. The code will only be allowed to run if all the required permissions are granted by the security policy. In the following code fragment, a request has been made for permission to write to a key in the registry. If this is not granted by the security policy, the assembly won't even get loaded. As mentioned above, this kind of request can only be made at the assembly level, declaratively.

using System;
using System.Windows.Forms;
using System.IO;
using System.Security;
using System.Security.Permissions;

// placed at assembly level
// using declarative syntax
[assembly:RegistryPermission(SecurityAction.RequestMinimum,
         Write="HKEY_LOCAL_MACHINE\\Software")]

namespace SecurityApp
{
    // Rest of the implementation
}

RequestOptional

Using RequestOptional, you can specify permissions that your code can use but does not require in order to run. If your code has not been granted the optional permissions, you must handle any exceptions that are thrown when the code segments that need them are executed. There are certain things to keep in mind when working with RequestOptional.

If you use RequestOptional with RequestMinimum, no other permissions will be granted except these two, if allowed by the security policy. Even if the security policy allows additional permissions to your assembly, they won't be granted. Look at this code segment:

[assembly:FileIOPermission(SecurityAction.RequestMinimum, Read="C:\\")]

[assembly:FileIOPermission(SecurityAction.RequestOptional, Write="C:\\")]

The only permissions that this assembly will have are read and write permissions to the file system. What if it needs to show a UI? Then the assembly still gets loaded but an exception will be thrown when the line that shows the UI is executing, because even though the security policy allows UIPermission, it is not granted to this assembly.

Note that, unlike RequestMinimum, RequestOptional doesn't prevent the assembly from being loaded; instead, an exception is thrown at run time if the optional permission has not been granted.
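Besides catching the exception, one way to deal with a missing optional permission is to ask up front whether it was granted. The sketch below assumes a .NET Framework 2.0-era runtime and uses SecurityManager.IsGranted, which looks at the permissions granted to the calling code rather than demanding against the whole call stack. OptionalPermissionCheck and CanWriteTo are hypothetical names.

```csharp
using System;
using System.Security;
using System.Security.Permissions;

public static class OptionalPermissionCheck
{
    // Hypothetical helper: reports whether write access to the given
    // absolute path was granted to this code.
    public static bool CanWriteTo(string absolutePath)
    {
        FileIOPermission write =
            new FileIOPermission(FileIOPermissionAccess.Write, absolutePath);
        return SecurityManager.IsGranted(write);
    }

    public static void Main()
    {
        if (CanWriteTo("C:\\"))
        {
            Console.WriteLine("Saving is available.");
        }
        else
        {
            Console.WriteLine("Saving is disabled: write permission was not granted.");
        }
    }
}
```

A form could use a check like this to disable its Save button before the user fills in 20+ fields.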

RequestRefuse

You can use RequestRefuse to specify the permissions that you want to ensure will never be granted to your code, even if they are granted by the security policy. If your code only wants to read files, then refusing write permission would ensure that your code cannot be misused by a malicious attack or a bug to alter files.

[assembly:FileIOPermission(SecurityAction.RequestRefuse, Write="C:\\")]

Overriding Security

Sometimes you need to override certain security checks. You can do this by altering the behavior of a permission stack walk using these three methods. They are referred to as stack walk modifiers.

Assert
Deny
PermitOnly

Assert

You can call the Assert method to stop the stack walk from going beyond the current stack frame. So the callers above the method that has used Assert are not checked. If you can trust the upstream callers, then using Assert would do no harm. You can use the previous example to test this. Modify the code in the ShowUI method by adding the new lines shown below (the FileIOPermission instance, the Assert call and the RevertAssert call):

private void ShowUI(int uino)
{
    // using imperative syntax to create an instance of FileIOPermission
    FileIOPermission myPerm = new
      FileIOPermission(FileIOPermissionAccess.Read, "C:\\");
    myPerm.Assert();    // don't check above stack frames.

    switch (uino)
    {
        case 1: // That's our FileRead UI
            ShowFileReadUI();
            break;
        case 2:
            // Show someother UI
            break;
    }

    CodeAccessPermission.RevertAssert();    // cancel assert
}

Make sure that the Deny statement is still there in InitUI method. Now run the code. It should be working fine without giving any exceptions. Look at the following figure:

Figure 12

Even though InitUI doesn't have the demanded permission, it is never checked because the stack walk stops at ShowUI. Look at the last line. RevertAssert is a static method of CodeAccessPermission. It is used after an Assert to cancel the Assert statement. So if the code below RevertAssert accesses some protected resources, a normal stack walk is performed and all callers are checked. If there's no Assert for the current stack frame, RevertAssert has no effect. It is good practice to place the RevertAssert in a finally block, so it will always get called.
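The finally-block advice can be sketched as follows. This is a minimal illustration, assuming a .NET Framework runtime and an absolute file path; AssertSketch and ReadFirstLine are made-up names and are not part of the sample application.

```csharp
using System;
using System.IO;
using System.Security;
using System.Security.Permissions;

public static class AssertSketch
{
    // Hypothetical helper: asserts read access, reads one line, and
    // always reverts the assert, even if the read throws.
    public static string ReadFirstLine(string absolutePath)
    {
        FileIOPermission readPerm =
            new FileIOPermission(FileIOPermissionAccess.Read, absolutePath);
        readPerm.Assert();   // callers above this frame are not checked for readPerm
        try
        {
            using (StreamReader reader = new StreamReader(absolutePath))
            {
                return reader.ReadLine();
            }
        }
        finally
        {
            // Runs on success and on failure alike.
            CodeAccessPermission.RevertAssert();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadFirstLine("c:\\Security.txt"));
    }
}
```

Keeping the Assert, the protected work and the RevertAssert in one small method also limits how much code runs while the assertion is active.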

Note that to use Assert, the Assertion flag of the SecurityPermission should be set.

Warning from Microsoft!: If asserts are not handled carefully, they may lead to luring attacks, where malicious code calls our code through trusted code.

Deny

We have used this method already in the previous example. The following code sample shows how to deny permission to connect to a restricted website using imperative syntax:

WebPermission myWebPermission =
        new WebPermission(NetworkAccess.Connect,
        "http://www.somewebsite.com");
myWebPermission.Deny();

// Do some work

CodeAccessPermission.RevertDeny(); // cancel Deny

RevertDeny is used to remove a previous Deny statement from the current stack frame.

PermitOnly

You can use PermitOnly in some situations when needed to restrict permissions granted by security policy. The following code fragment shows how to use it imperatively. When PermitOnly is used, it means only the resources you specify can be accessed.

WebPermission myWebPermission =
  new WebPermission(NetworkAccess.Connect,
  "http://www.somewebsite.com");
myWebPermission.PermitOnly();

// Do some work

CodeAccessPermission.RevertPermitOnly(); // cancel PermitOnly

You can use PermitOnly instead of Deny when it is more convenient to describe resources that can be accessed instead of resources that cannot be accessed.

Calculating Permissions

In the first example, we configured the machine policy level to set permissions for our code. Now we'll see how those permissions are calculated and granted by the runtime when your code belongs to more than one code group in the same policy level or in different policy levels.

The CLR computes the allowed permission set for an assembly in the following way:

Starting from the All_Code code group, all the child groups are searched to determine which groups the code belongs to, using identity information provided by the evidence. (If the parent group doesn't match, then that group's child groups are not checked.)
When all matches are identified for a particular policy level, the permissions associated with those groups are combined in an additive manner (union).
This is repeated for each policy level and permissions associated with each policy level are intersected with each other.

So all the permissions associated with matching code groups in one policy level are added together (union) and the result for each policy level is intersected with one another. An intersection is used to ensure that policy lower down in the hierarchy cannot add permissions that were not granted by a higher level.
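The union-then-intersection rule can be tried out directly on PermissionSet objects. The following is only a sketch with made-up policy contents, assuming a .NET Framework runtime: it models two matching code groups at the machine level, an unrestricted enterprise level and a read-only user level. PolicyMath and its helpers are hypothetical names, and the real policy evaluation is done by the runtime, not by code like this.

```csharp
using System;
using System.Security;
using System.Security.Permissions;

public static class PolicyMath
{
    // PermissionSet.Intersect returns null for an empty intersection,
    // so normalise that case to an empty set.
    static PermissionSet IntersectOrEmpty(PermissionSet a, PermissionSet b)
    {
        PermissionSet result = a.Intersect(b);
        return result ?? new PermissionSet(PermissionState.None);
    }

    // A set holding a single FileIOPermission for C:\ with the given access.
    static PermissionSet FileSet(FileIOPermissionAccess access)
    {
        PermissionSet set = new PermissionSet(PermissionState.None);
        set.AddPermission(new FileIOPermission(access, "C:\\"));
        return set;
    }

    public static PermissionSet Compute()
    {
        // Machine level: two matching code groups, combined by union.
        PermissionSet groupA = FileSet(FileIOPermissionAccess.Read);
        PermissionSet groupB = FileSet(FileIOPermissionAccess.Write);
        groupB.AddPermission(new UIPermission(PermissionState.Unrestricted));
        PermissionSet machine = groupA.Union(groupB);

        // Enterprise level: unrestricted. User level: read access only.
        PermissionSet enterprise = new PermissionSet(PermissionState.Unrestricted);
        PermissionSet user = FileSet(FileIOPermissionAccess.Read);

        // Across levels: intersection.
        return IntersectOrEmpty(IntersectOrEmpty(enterprise, machine), user);
    }

    public static void Main()
    {
        // Prints the XML form of the resulting grant.
        Console.WriteLine(Compute().ToString());
    }
}
```

Although the machine level allows writing and showing a UI, neither survives the intersection because the user level grants only read access to C:\.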

Look at the following figure taken from a MSDN article, to get a better understanding:

Figure 13

Have a quick look at the All_Code code group's associated permission set in Machine policy level. Hope it makes sense by now.

Figure 14

The runtime computes the allowed permission set differently if the Exclusive or LevelFinal attribute is applied to the code group. If you are not suffering from short term memory loss, you should remember that we set the Exclusive attribute for our code group - MyCodeGroup in the earlier example.

Here's what happens if these attributes are set.

Exclusive - The permissions with the code group marked as Exclusive are taken as the only permissions for that policy level. So permissions associated with other code groups are not considered when computing permissions.
LevelFinal - Policy levels (except the application domain level) below the one containing this code group are not considered when checking code group membership and granting permissions.
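In the policy object model, these two check boxes correspond to flags on the PolicyStatement attached to a code group. Here is a brief sketch, assuming a .NET Framework 2.0-era runtime; ExclusiveSketch and BuildStatement are hypothetical names, and the group is built in memory only, without touching the machine's policy.

```csharp
using System;
using System.Security;
using System.Security.Permissions;
using System.Security.Policy;

public static class ExclusiveSketch
{
    // Hypothetical helper: builds a policy statement that grants
    // Execution only, with the requested attributes.
    public static PolicyStatement BuildStatement(bool exclusive, bool levelFinal)
    {
        PermissionSet set = new PermissionSet(PermissionState.None);
        set.AddPermission(new SecurityPermission(SecurityPermissionFlag.Execution));

        PolicyStatementAttribute flags = PolicyStatementAttribute.Nothing;
        if (exclusive) flags |= PolicyStatementAttribute.Exclusive;
        if (levelFinal) flags |= PolicyStatementAttribute.LevelFinal;

        return new PolicyStatement(set, flags);
    }

    public static void Main()
    {
        PolicyStatement statement = BuildStatement(true, false);
        UnionCodeGroup group =
            new UnionCodeGroup(new AllMembershipCondition(), statement);
        group.Name = "MyCodeGroup";
        Console.WriteLine(group.Name + ": " + statement.AttributeString);
    }
}
```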

Now you should have a clear understanding why we set the Exclusive attribute earlier.

Nice Features in .NET Configuration Tool

There are some nice features in .NET Configuration Tool. Just right click the Runtime Security Policy node and you'll see what I'm talking about.

Figure 15

Among other options there are two important ones.

Evaluate Assembly - This can be used to find out which code group(s) a particular assembly belongs to, or which permissions it has.
Create Deployment Package - This wizard will create a policy deployment package. Just choose the policy level and the wizard will wrap it into a Windows Installer Package (.msi file), so whatever code groups and permissions exist on your development PC can be quickly transferred to any other machine without any headache.

Tools

Permissions View Tool - permview.exe

The Permissions View tool is used to view the minimal, optional, and refused permission sets requested by an assembly. Optionally, you can use permview.exe to view all declarative security used by an assembly. Please refer to the MSDN documentation for additional information.

Examples:

permview SecurityApp.exe - Displays the permissions requested by the assembly SecurityApp.exe.

Code Access Security Policy Tool - caspol.exe

The Code Access Security Policy tool enables users and administrators to modify security policy for the machine policy level, the user policy level and the enterprise policy level. Please refer to the MSDN documentation for additional information.

Examples:

Here's the output when you run "caspol -listgroups". This lists the code groups that belong to the default policy level - Machine level.

Figure 16

Note that label "1." is for the All_Code node because it is the parent node. Its child nodes are labeled as "1.x", and their child nodes are labeled as "1.x.x". Get the picture?

caspol -listgroups - Displays the code groups.
caspol -machine -addgroup 1. -zone Internet Execution - Adds a child code group to the root of the machine policy code group hierarchy. The new code group is a member of the Internet zone and is associated with the Execution permission set.
caspol -user -chggroup 1.2. Execution - Changes the permission set in the user policy of the code group labeled 1.2. to the Execution permission set.
caspol -security on - Turns code access security on.
caspol -security off - Turns code access security off.


Friday, August 8, 2008

CODE ACCESS SECURITY
