Wednesday, February 18, 2009

.NET Framework FAQs - I

What is .NET?
.NET is a "revolutionary new platform, built on open Internet protocols and standards, with tools and services that meld computing and communications in new ways".

A more practical definition would be that .NET is a new environment for developing and running software applications, featuring ease of development of web-based services, rich standard run-time services available to components written in a variety of programming languages, and inter-language and inter-machine interoperability.

What platforms does the .NET Framework run on?
The runtime supports Windows XP, Windows 2000, NT4 SP6a and Windows ME/98. Windows 95 is not supported. Some parts of the framework do not work on all platforms - for example, ASP.NET is only supported on Windows XP and Windows 2000. Windows 98/ME cannot be used for development.

IIS is not supported on Windows XP Home Edition, and so cannot be used to host ASP.NET. However, the ASP.NET Web Matrix web server does run on XP Home.
The Mono project is attempting to implement the .NET framework on Linux.

What languages does the .NET Framework support?
MS provides compilers for C#, C++, VB and JScript. Other vendors have announced that they intend to develop .NET compilers for languages such as COBOL, Eiffel, Perl, Smalltalk and Python.

Will the .NET Framework go through a standardisation process?
On December 13, 2001, the ECMA General Assembly ratified the C# and common language infrastructure (CLI) specifications into international standards. The ECMA standards will be known as ECMA-334 (C#) and ECMA-335 (the CLI).

What is the CLR?
CLR = Common Language Runtime. The CLR is a set of standard resources that (in theory) any .NET program can take advantage of, regardless of programming language. These resources include:
* Object-oriented programming model (inheritance, polymorphism, exception handling, garbage collection)
* Security model
* Type system
* All .NET base classes
* Many .NET framework classes
* Development, debugging, and profiling tools
* Execution and code management
* IL-to-native translators and optimizers

What this means is that in the .NET world, different programming languages will be more equal in capability than they have ever been before, although clearly not all languages will support all CLR services.

What is the CTS?
CTS = Common Type System. This is the range of types that the .NET runtime understands, and therefore that .NET applications can use. However note that not all .NET languages will support all the types in the CTS. The CTS is a superset of the CLS.

What is the CLS?
CLS = Common Language Specification. This is a subset of the CTS which all .NET languages are expected to support. The idea is that any program which uses CLS-compliant types can interoperate with any .NET program written in any language.

In theory this allows very tight interop between different .NET languages - for example allowing a C# class to inherit from a VB class.

What is IL?
IL = Intermediate Language. Also known as MSIL (Microsoft Intermediate Language) or CIL (Common Intermediate Language). All .NET source code (of any language) is compiled to IL. The IL is then converted to machine code at the point where the software is installed, or at run-time by a Just-In-Time (JIT) compiler.

What does 'managed' mean in the .NET context?
The term 'managed' is the cause of much confusion. It is used in various places within .NET, meaning slightly different things.

Managed code:
The .NET framework provides several core run-time services to the programs that run within it - for example exception handling and security. For these services to work, the code must provide a minimum level of information to the runtime.

Such code is called managed code. All C# and Visual Basic.NET code is managed by default. VS7 C++ code is not managed by default, but the compiler can produce managed code if you specify a command-line switch (/clr).

Managed data:
This is data that is allocated and de-allocated by the .NET runtime's garbage collector. C# and VB.NET data is always managed. VS7 C++ data is unmanaged by default, even when using the /clr switch, but it can be marked as managed using the __gc keyword.

Managed classes:
This is usually referred to in the context of Managed Extensions (ME) for C++. When using ME C++, a class can be marked with the __gc keyword. As the name suggests, this means that the memory for instances of the class is managed by the garbage collector, but it also means more than that. The class becomes a fully paid-up member of the .NET community with the benefits and restrictions that brings. An example of a benefit is proper interop with classes written in other languages - for example, a managed C++ class can inherit from a VB class. An example of a restriction is that a managed class can only inherit from one base class.

What is reflection?
All .NET compilers produce metadata about the types defined in the modules they produce. This metadata is packaged along with the module (modules in turn are packaged together in assemblies), and can be accessed by a mechanism called reflection. The System.Reflection namespace contains classes that can be used to interrogate the types for a module/assembly.

Using reflection to access .NET metadata is very similar to using ITypeLib/ITypeInfo to access type library data in COM, and it is used for similar purposes - e.g. determining data type sizes for marshaling data across context/process/machine boundaries.

Reflection can also be used to dynamically invoke methods (see System.Type.InvokeMember), or even create types dynamically at run-time (see System.Reflection.Emit.TypeBuilder).
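
As a rough illustration of the points above, the following sketch lists the public methods of a type and then calls one of them by name through System.Type.InvokeMember. The Greeter and ReflectionDemo classes are hypothetical names invented for this example; they are not part of the framework.

```csharp
using System;
using System.Reflection;

// Hypothetical class used only as a reflection target.
public class Greeter
{
    public string Greet(string name)
    {
        return "Hello, " + name;
    }
}

public class ReflectionDemo
{
    // Calls Greeter.Greet by name, with no compile-time call to the method.
    public static string InvokeGreet(string name)
    {
        Type t = typeof(Greeter);
        object instance = Activator.CreateInstance(t);
        return (string)t.InvokeMember(
            "Greet",
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null,                    // use the default binder
            instance,                // object to invoke the method on
            new object[] { name });  // method arguments
    }

    public static void Main()
    {
        // Interrogate the metadata: list methods declared directly on Greeter.
        foreach (MethodInfo m in typeof(Greeter).GetMethods(
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
        {
            Console.WriteLine("Method: " + m.Name);
        }

        Console.WriteLine(InvokeGreet("reflection"));
    }
}
```

Late-bound calls like this are considerably slower than direct calls, so they tend to be reserved for cases where the type or method name is only known at run-time.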

What is an assembly?
An assembly is sometimes described as a logical .EXE or .DLL, and can be an application (with a main entry point) or a library. An assembly consists of one or more files (dlls, exes, html files etc), and represents a group of resources, type definitions, and implementations of those types. An assembly may also contain references to other assemblies. These resources, types and references are described in a block of data called a manifest. The manifest is part of the assembly, thus making the assembly self-describing.

An important aspect of assemblies is that they are part of the identity of a type. The identity of a type is the assembly that houses it combined with the type name. This means, for example, that if assembly A exports a type called T, and assembly B exports a type called T, the .NET runtime sees these as two completely different types. Furthermore, don't confuse assemblies with namespaces - namespaces are merely a hierarchical way of organising type names. To the runtime, type names are type names, regardless of whether namespaces are used to organise the names. It's the assembly plus the type name (regardless of whether the type name belongs to a namespace) that uniquely identifies a type to the runtime.
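
One way to see this is to print a type's assembly-qualified name, which combines the namespace-qualified type name with the identity of the assembly that houses it. This is only a small sketch; the Shapes namespace and the Circle and TypeIdentityDemo classes are made up for the example.

```csharp
using System;

namespace Shapes
{
    // Hypothetical type used only to inspect its identity.
    public class Circle { }
}

public class TypeIdentityDemo
{
    public static void Main()
    {
        Type t = typeof(Shapes.Circle);

        // The namespace is simply part of the type name.
        Console.WriteLine("FullName:              " + t.FullName);

        // The assembly that houses the type.
        Console.WriteLine("Assembly:              " + t.Assembly.GetName().Name);

        // Type name plus assembly name, version, culture and public key token.
        Console.WriteLine("AssemblyQualifiedName: " + t.AssemblyQualifiedName);
    }
}
```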

Assemblies are also important in .NET with respect to security - many of the security restrictions are enforced at the assembly boundary.

Finally, assemblies are the unit of versioning in .NET.

How can I produce an assembly?
The simplest way to produce an assembly is directly from a .NET compiler. For example, the following C# program:

public class CAssembly
{
    public CAssembly()
    {
        System.Console.WriteLine( "Hello from CAssembly" );
    }
}

can be compiled into a library assembly (dll) like this:

csc /t:library CAssembly.cs

You can then view the contents of the assembly by running the "IL Disassembler" tool that comes with the .NET SDK.

Alternatively you can compile your source into modules, and then combine the modules into an assembly using the assembly linker (al.exe). For the C# compiler, the /target:module switch is used to generate a module instead of an assembly.

What is the difference between a private assembly and a shared assembly?
* Location and visibility: A private assembly is normally used by a single application, and is stored in the application's directory, or a sub-directory beneath. A shared assembly is normally stored in the global assembly cache, which is a repository of assemblies maintained by the .NET runtime. Shared assemblies are usually libraries of code which many applications will find useful, e.g. the .NET framework classes.
* Versioning: The runtime enforces versioning constraints only on shared assemblies, not on private assemblies.

How do assemblies find each other?
By searching directory paths. There are several factors which can affect the path (such as the AppDomain host, and application configuration files), but for private assemblies the search path is normally the application's directory and its sub-directories. For shared assemblies, the search path is normally the same as the private assembly path plus the shared assembly cache.

How does assembly versioning work?
Each assembly has a version number called the compatibility version. Also each reference to an assembly (from another assembly) includes both the name and version of the referenced assembly.

The version number has four numeric parts (e.g. 5.5.2.33). Assemblies with either of the first two parts different are normally viewed as incompatible. If the first two parts are the same, but the third is different, the assemblies are deemed as 'maybe compatible'. If only the fourth part is different, the assemblies are deemed compatible. However, this is just the default guideline - it is the version policy that decides to what extent these rules are enforced. The version policy can be specified via the application configuration file.

Remember: versioning is only applied to shared assemblies, not private assemblies.
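
The default guideline described above can be expressed as a small helper built on System.Version, whose Major, Minor, Build and Revision properties correspond to the four parts of the version number. This is only a sketch of the guideline as stated in this FAQ, not the runtime's actual binding logic; VersionDemo and Classify are hypothetical names.

```csharp
using System;

public class VersionDemo
{
    // Applies the default compatibility guideline to two assembly versions.
    public static string Classify(Version a, Version b)
    {
        // Either of the first two parts differs: incompatible.
        if (a.Major != b.Major || a.Minor != b.Minor)
            return "incompatible";

        // First two parts match but the third differs: maybe compatible.
        if (a.Build != b.Build)
            return "maybe compatible";

        // At most the fourth part differs: compatible.
        return "compatible";
    }

    public static void Main()
    {
        Version reference = new Version("5.5.2.33");
        string[] candidates = { "5.5.2.40", "5.5.3.0", "5.6.2.33", "6.5.2.33" };

        foreach (string s in candidates)
        {
            Console.WriteLine(reference + " vs " + s + ": " + Classify(reference, new Version(s)));
        }
    }
}
```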

What is an Application Domain?
An AppDomain can be thought of as a lightweight process. Multiple AppDomains can exist inside a Win32 process. The primary purpose of the AppDomain is to isolate an application from other applications.

Win32 processes provide isolation by having distinct memory address spaces. This is effective, but it is expensive and doesn't scale well. The .NET runtime enforces AppDomain isolation by keeping control over the use of memory - all memory in the AppDomain is managed by the .NET runtime, so the runtime can ensure that AppDomains do not access each other's memory.

How does an AppDomain get created?
AppDomains are usually created by hosts. Examples of hosts are the Windows Shell, ASP.NET and IE. When you run a .NET application from the command-line, the host is the Shell. The Shell creates a new AppDomain for every application.

AppDomains can also be explicitly created by .NET applications. Here is a C# sample which creates an AppDomain, creates an instance of an object inside it, and then executes one of the object's methods. Note that you must name the executable 'appdomaintest.exe' for this code to work as-is.

using System;
using System.Runtime.Remoting;

public class CAppDomainInfo : MarshalByRefObject
{
    public string GetAppDomainInfo()
    {
        return "AppDomain = " + AppDomain.CurrentDomain.FriendlyName;
    }
}

public class App
{
    public static int Main()
    {
        AppDomain ad = AppDomain.CreateDomain( "Andy's new domain", null, null );
        ObjectHandle oh = ad.CreateInstance( "appdomaintest", "CAppDomainInfo" );
        CAppDomainInfo adInfo = (CAppDomainInfo)(oh.Unwrap());
        string info = adInfo.GetAppDomainInfo();
        Console.WriteLine( "AppDomain info: " + info );
        return 0;
    }
}

What is garbage collection?
Garbage collection is a system whereby a run-time component takes responsibility for managing the lifetime of objects and the heap memory that they occupy. This concept is not new to .NET - Java and many other languages/runtimes have used garbage collection for some time.

Is it true that objects don't always get destroyed immediately when the last reference goes away?
Yes. The garbage collector offers no guarantees about the time when an object will be destroyed and its memory reclaimed.

Why doesn't the .NET runtime offer deterministic destruction?
Because of the garbage collection algorithm. The .NET garbage collector works by periodically running through a list of all the objects that are currently being referenced by an application. All the objects that it doesn't find during this search are ready to be destroyed and the memory reclaimed. The implication of this algorithm is that the runtime doesn't get notified immediately when the final reference on an object goes away - it only finds out during the next sweep of the heap.

Furthermore, this type of algorithm works best by performing the garbage collection sweep as rarely as possible. Normally heap exhaustion is the trigger for a collection sweep.

Is the lack of deterministic destruction in .NET a problem?
It's certainly an issue that affects component design. If you have objects that maintain expensive or scarce resources (e.g. database locks), you need to provide some way for the client to tell the object to release the resource when it is done. Microsoft recommend that you provide a method called Dispose() for this purpose. However, this causes problems for distributed objects - in a distributed system who calls the Dispose() method? Some form of reference-counting or ownership-management mechanism is needed to handle distributed objects - unfortunately the runtime offers no help with this.

Does non-deterministic destruction affect the usage of COM objects from managed code?

Yes. When using a COM object from managed code, you are effectively relying on the garbage collector to call the final release on your object. If your COM object holds onto an expensive resource which is only cleaned-up after the final release, you may need to provide a new interface on your object which supports an explicit Dispose() method.

I've heard that Finalize methods should be avoided. Should I implement Finalize on my class?

An object with a Finalize method is more work for the garbage collector than an object without one. Also there are no guarantees about the order in which objects are Finalized, so there are issues surrounding access to other objects from the Finalize method. Finally, there is no guarantee that a Finalize method will get called on an object, so it should never be relied upon to do clean-up of an object's resources.

Microsoft recommend the following pattern:

using System;

public class CTest : IDisposable
{
    public void Dispose()
    {
        // ... cleanup activities go here ...
        GC.SuppressFinalize(this);
    }

    ~CTest() // C# syntax hiding the Finalize() method
    {
        Dispose();
    }
}

In the normal case the client calls Dispose(), the object's resources are freed, and the garbage collector is relieved of its Finalizing duties by the call to SuppressFinalize(). In the worst case, i.e. the client forgets to call Dispose(), there is a reasonable chance that the object's resources will eventually get freed by the garbage collector calling Finalize(). Given the limitations of the garbage collection algorithm this seems like a pretty reasonable approach.
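
On the client side, C# offers the using statement, which calls Dispose() when the block exits, even if an exception is thrown. The sketch below shows the idea; Resource, its Disposed flag and UsingDemo are hypothetical names added so that the effect can be observed.

```csharp
using System;

// Hypothetical disposable class; the Disposed flag exists only for the demo.
public class Resource : IDisposable
{
    public bool Disposed;

    public void Dispose()
    {
        Disposed = true;            // stand-in for real cleanup activities
        GC.SuppressFinalize(this);
    }
}

public class UsingDemo
{
    // Returns true if Dispose() had been called by the time the using block exited.
    public static bool RunAndCheck()
    {
        Resource r = new Resource();
        using (r)
        {
            // Work with the resource here.
        }
        return r.Disposed;
    }

    public static void Main()
    {
        Console.WriteLine("Disposed after using block: " + RunAndCheck());
    }
}
```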

Do I have any control over the garbage collection algorithm?
A little. For example, the System.GC class exposes a Collect method - this forces the garbage collector to collect all unreferenced objects immediately.
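
A minimal sketch of forcing a collection is shown below. It assumes a runtime that provides GC.CollectionCount (available from .NET 2.0 onwards); GcDemo and ForceCollection are hypothetical names. The heap sizes printed will vary from run to run, so no particular output should be expected.

```csharp
using System;

public class GcDemo
{
    // Forces a full collection and returns how many generation 0
    // collections happened as a result.
    public static int ForceCollection()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();                   // collect all generations now
        GC.WaitForPendingFinalizers();  // let any pending finalizers run
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        // Create some short-lived garbage.
        for (int i = 0; i < 1000; i++)
        {
            byte[] junk = new byte[10000];
        }

        Console.WriteLine("Heap before: " + GC.GetTotalMemory(false) + " bytes");
        int collections = ForceCollection();
        Console.WriteLine("Gen 0 collections triggered: " + collections);
        Console.WriteLine("Heap after:  " + GC.GetTotalMemory(false) + " bytes");
    }
}
```

Forcing collections in production code is rarely a good idea, since the collector's own heuristics usually do a better job of choosing when to run.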

How can I find out what the garbage collector is doing?
Lots of interesting statistics are exported from the .NET runtime via the '.NET CLR xxx' performance counters. Use Performance Monitor to view them.

What is serialization?
Serialization is the process of converting an object into a stream of bytes. Deserialization is the opposite process of creating an object from a stream of bytes. Serialization/Deserialization is mostly used to transport objects (e.g. during remoting), or to persist objects (e.g. to a file or database).

Does the .NET Framework have in-built support for serialization?
There are two separate mechanisms provided by the .NET class library - XmlSerializer and SoapFormatter/BinaryFormatter. Microsoft uses XmlSerializer for Web Services, and uses SoapFormatter/BinaryFormatter for remoting. Both are available for use in your own code.

I want to serialize instances of my class. Should I use XmlSerializer, SoapFormatter or BinaryFormatter?
It depends. XmlSerializer has severe limitations such as the requirement that the target class has a parameterless constructor, and only public read/write properties and fields can be serialized. However, on the plus side, XmlSerializer has good support for customising the XML document that is produced or consumed. XmlSerializer's features mean that it is most suitable for cross-platform work, or for constructing objects from existing XML documents.

SoapFormatter and BinaryFormatter have fewer limitations than XmlSerializer. They can serialize private fields, for example. However they both require that the target class be marked with the [Serializable] attribute, so like XmlSerializer the class needs to be written with serialization in mind. Also there are some quirks to watch out for - for example on deserialization the constructor of the new object is not invoked.

The choice between SoapFormatter and BinaryFormatter depends on the application. BinaryFormatter makes sense where both serialization and deserialization will be performed on the .NET platform and where performance is important. SoapFormatter generally makes more sense in all other cases, for ease of debugging if nothing else.

Can I customise the serialization process?
Yes. XmlSerializer supports a range of attributes that can be used to configure serialization for a particular class. For example, a field or property can be marked with the [XmlIgnore] attribute to exclude it from serialization. Another example is the [XmlElement] attribute, which can be used to specify the XML element name to be used for a particular property or field.
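
The two attributes mentioned above might be used as in the following sketch, which round-trips an object through XML held in a string. The Customer and XmlDemo classes and their members are hypothetical, chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical class to be serialized.
public class Customer
{
    [XmlElement("FullName")]   // written as <FullName> instead of <Name>
    public string Name;

    [XmlIgnore]                // never written to the XML document
    public string Password;

    public Customer() { }      // XmlSerializer needs a parameterless constructor
}

public class XmlDemo
{
    public static string ToXml(Customer c)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        StringWriter writer = new StringWriter();
        serializer.Serialize(writer, c);
        return writer.ToString();
    }

    public static Customer FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        return (Customer)serializer.Deserialize(new StringReader(xml));
    }

    public static void Main()
    {
        Customer c = new Customer();
        c.Name = "Ada";
        c.Password = "secret";

        string xml = ToXml(c);
        Console.WriteLine(xml);

        // The ignored field is not restored on deserialization.
        Customer copy = FromXml(xml);
        Console.WriteLine("Name = " + copy.Name);
        Console.WriteLine("Password restored: " + (copy.Password != null));
    }
}
```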

Serialization via SoapFormatter/BinaryFormatter can also be controlled to some extent by attributes. For example, the [NonSerialized] attribute is the equivalent of XmlSerializer's [XmlIgnore] attribute. Ultimate control of the serialization process can be achieved by implementing the ISerializable interface on the class whose instances are to be serialized.

Why is XmlSerializer so slow?

There is a once-per-process-per-type overhead with XmlSerializer. So the first time you serialize or deserialize an object of a given type in an application, there is a significant delay. This normally doesn't matter, but it may mean, for example, that XmlSerializer is a poor choice for loading configuration settings during startup of a GUI application.

Why do I get errors when I try to serialize a Hashtable?
XmlSerializer will refuse to serialize instances of any class that implements IDictionary, e.g. Hashtable. SoapFormatter and BinaryFormatter do not have this restriction.

XmlSerializer is throwing a generic "There was an error reflecting MyClass" error. How do I find out what the problem is?
Look at the InnerException property of the exception that is thrown to get a more specific error message.
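
As a sketch of that technique, the code below deliberately provokes the error by asking XmlSerializer to handle a class with a public Hashtable field, then walks the InnerException chain and prints each level. Settings, InnerExceptionDemo and DumpExceptionChain are hypothetical names, and the exact exception types and messages depend on the framework version.

```csharp
using System;
using System.Collections;
using System.Xml.Serialization;

// Hypothetical class that XmlSerializer cannot handle,
// because Hashtable implements IDictionary.
public class Settings
{
    public Hashtable Values = new Hashtable();
}

public class InnerExceptionDemo
{
    // Prints every exception in the chain and returns the number of levels
    // (0 if no exception was thrown).
    public static int DumpExceptionChain()
    {
        try
        {
            new XmlSerializer(typeof(Settings));
            return 0;
        }
        catch (Exception e)
        {
            int depth = 0;
            for (Exception cur = e; cur != null; cur = cur.InnerException)
            {
                // Indent each nested exception a little further.
                Console.WriteLine(new string(' ', depth * 2)
                    + cur.GetType().Name + ": " + cur.Message);
                depth++;
            }
            return depth;
        }
    }

    public static void Main()
    {
        DumpExceptionChain();
    }
}
```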

What are attributes?
There are at least two types of .NET attribute. The first type I will refer to as a metadata attribute - it allows some data to be attached to a class or method. This data becomes part of the metadata for the class, and (like other class metadata) can be accessed via reflection. An example of a metadata attribute is [Serializable], which can be attached to a class and means that instances of the class can be serialized.

[Serializable] public class CTest {}

The other type of attribute is a context attribute. Context attributes use a similar syntax to metadata attributes but they are fundamentally different. Context attributes provide an interception mechanism whereby instance activation and method calls can be pre- and/or post-processed.

Can I create my own metadata attributes?
Yes. Simply derive a class from System.Attribute and mark it with the AttributeUsage attribute. For example:

using System;

[AttributeUsage(AttributeTargets.Class)]
public class InspiredByAttribute : System.Attribute
{
    public string InspiredBy;

    public InspiredByAttribute( string inspiredBy )
    {
        InspiredBy = inspiredBy;
    }
}

[InspiredBy("Andy Mc's brilliant .NET FAQ")]
class CTest
{
}

class CApp
{
    public static void Main()
    {
        // Read the custom attributes attached to CTest via reflection.
        object[] atts = typeof(CTest).GetCustomAttributes(true);
        foreach( object att in atts )
            if( att is InspiredByAttribute )
                Console.WriteLine( "Class CTest was inspired by {0}", ((InspiredByAttribute)att).InspiredBy );
    }
}

What is Code Access Security (CAS)?
CAS is the part of the .NET security model that determines whether or not a piece of code is allowed to run, and what resources it can use when it is running. For example, it is CAS that will prevent a .NET web applet from formatting your hard disk.

How does CAS work?
The CAS security policy revolves around two key concepts - code groups and permissions. Each .NET assembly is a member of a particular code group, and each code group is granted the permissions specified in a named permission set.

For example, using the default security policy, a control downloaded from a web site belongs to the 'Zone - Internet' code group, which adheres to the permissions defined by the 'Internet' named permission set. (Naturally the 'Internet' named permission set represents a very restrictive range of permissions.)

Who defines the CAS code groups?
Microsoft defines some default ones, but you can modify these and even create your own. To see the code groups defined on your system, run 'caspol -lg' from the command-line. On my system it looks like this:

Level = Machine

Code Groups:
1. All code: Nothing
   1.1. Zone - MyComputer: FullTrust
      1.1.1. Honor SkipVerification requests: SkipVerification
   1.2. Zone - Intranet: LocalIntranet
   1.3. Zone - Internet: Internet
   1.4. Zone - Untrusted: Nothing
   1.5. Zone - Trusted: Internet
   1.6. StrongName - 0024000004800000940000000602000000240000525341310004000003
000000CFCB3291AA715FE99D40D49040336F9056D7886FED46775BC7BB5430BA4444FEF8348EBD06
F962F39776AE4DC3B7B04A7FE6F49F25F740423EBF2C0B89698D8D08AC48D69CED0FC8F83B465E08
07AC11EC1DCC7D054E807A43336DDE408A5393A48556123272CEEEE72F1660B71927D38561AABF5C
AC1DF1734633C602F8F2D5: Everything

Note the hierarchy of code groups - the top of the hierarchy is the most general ('All code'), which is then sub-divided into several groups, each of which in turn can be sub-divided. Also note that (somewhat counter-intuitively) a sub-group can be associated with a more permissive permission set than its parent.

How do I define my own code group?
Use caspol. For example, suppose you trust code from www.mydomain.com and you want it to have full access to your system, but you want to keep the default restrictions for all other internet sites. To achieve this, you would add a new code group as a sub-group of the 'Zone - Internet' group, like this:

caspol -ag 1.3 -site www.mydomain.com FullTrust

Now if you run caspol -lg you will see that the new group has been added as group 1.3.1:
1.3. Zone - Internet: Internet
1.3.1. Site - www.mydomain.com: FullTrust

Note that the numeric label (1.3.1) is just a caspol invention to make the code groups easy to manipulate from the command-line. The underlying runtime never sees it.

How do I change the permission set for a code group?
Use caspol. If you are the machine administrator, you can operate at the 'machine' level - which means not only that the changes you make become the default for the machine, but also that users cannot change the permissions to be more permissive. If you are a normal (non-admin) user you can still modify the permissions, but only to make them more restrictive. For example, to allow intranet code to do what it likes you might do this:

caspol -cg 1.2 FullTrust

Note that because this is more permissive than the default policy (on a standard system), you should only do this at the machine level - doing it at the user level will have no effect.

Can I create my own permission set?
Yes. Use caspol -ap, specifying an XML file containing the permissions in the permission set. For example, once you have created a file called samplepermset.xml, add it to the range of available permission sets like this:

caspol -ap samplepermset.xml

Then, to apply the permission set to a code group, do something like this:

caspol -cg 1.3 SamplePermSet

(By default, 1.3 is the 'Internet' code group)

I'm having some trouble with CAS. How can I diagnose my problem?
Caspol has a couple of options that might help. First, you can ask caspol to tell you what code group an assembly belongs to, using caspol -rsg. Similarly, you can ask what permissions are being applied to a particular assembly using caspol -rsp.

I can't be bothered with all this CAS stuff. Can I turn it off?
Yes, as long as you are an administrator. Just run:

caspol -s off

Can I look at the IL for an assembly?
Yes. MS supply a tool called Ildasm which can be used to view the metadata and IL for an assembly.

Can source code be reverse-engineered from IL?
Yes, it is often relatively straightforward to regenerate high-level source (e.g. C#) from IL.

How can I stop my code being reverse-engineered from IL?
There is currently no simple way to stop code being reverse-engineered from IL. In future it is likely that IL obfuscation tools will become available, either from MS or from third parties. These tools work by 'optimising' the IL in such a way that reverse-engineering becomes much more difficult.

Of course if you are writing web services then reverse-engineering is not a problem as clients do not have access to your IL.

Can I write IL programs directly?
Yes. Here is a simple example:

.assembly extern mscorlib {}
.assembly MyAssembly {}
.class MyApp {
    .method static void Main() {
        .entrypoint
        ldstr "Hello, IL!"
        call void [mscorlib]System.Console::WriteLine(object)
        ret
    }
}

Just put this into a file called hello.il, and then run ilasm hello.il. An exe assembly will be generated.

Can I do things in IL that I can't do in C#?
Yes. A couple of simple examples are that you can throw exceptions that are not derived from System.Exception, and you can have non-zero-based arrays.

