C# 5.0 in a Nutshell (2012)
Chapter 18. Assemblies
An assembly is the basic unit of deployment in .NET and is also the container for all types. An assembly contains compiled types with their IL code, runtime resources, and information to assist with versioning, security, and referencing other assemblies. An assembly also defines a boundary for type resolution and security permissioning. In general, an assembly comprises a single Windows Portable Executable (PE) file—with an .exe extension in the case of an application, or a .dll extension in the case of a reusable library. A WinRT library has a .winmd extension and is similar to a .dll, except that it contains only metadata and no IL code.
Most of the types in this chapter come from the following namespaces:
System.Reflection
System.Resources
System.Globalization
What’s in an Assembly
An assembly contains four kinds of things:
An assembly manifest
Provides information to the .NET runtime, such as the assembly’s name, version, requested permissions, and other assemblies that it references
An application manifest
Provides information to the operating system, such as how the assembly should be deployed and whether administrative elevation is required
Compiled types
The compiled IL code and metadata of the types defined within the assembly
Resources
Other data embedded within the assembly, such as images and localizable text
Of these, only the assembly manifest is mandatory, although an assembly nearly always contains compiled types (unless it’s a WinRT reference assembly).
Assemblies are structured similarly whether they’re executables or libraries. The main difference with an executable is that it defines an entry point.
The Assembly Manifest
The assembly manifest serves two purposes:
§ It describes the assembly to the managed hosting environment.
§ It acts as a directory to the modules, types, and resources in the assembly.
Assemblies are hence self-describing. A consumer can discover all of an assembly’s data, types, and functions—without needing additional files.
NOTE
An assembly manifest is not something you add explicitly to an assembly—it’s automatically embedded into an assembly as part of compilation.
Here’s a summary of the functionally significant data stored in the manifest:
§ The simple name of the assembly
§ A version number (AssemblyVersion)
§ A public key and signed hash of the assembly, if strongly named
§ A list of referenced assemblies, including their version and public key
§ A list of modules that comprise the assembly
§ A list of types defined in the assembly and the module containing each type
§ An optional set of security permissions requested or refused by the assembly (SecurityPermission)
§ The culture it targets, if a satellite assembly (AssemblyCulture)
The manifest can also store the following informational data:
§ A full title and description (AssemblyTitle and AssemblyDescription)
§ Company and copyright information (AssemblyCompany and AssemblyCopyright)
§ A display version (AssemblyInformationalVersion)
§ Additional attributes for custom data
Some of this data is derived from arguments given to the compiler, such as the list of referenced assemblies or the public key with which to sign the assembly. The rest comes from assembly attributes, indicated in parentheses.
NOTE
You can view the contents of an assembly’s manifest with the .NET tool ildasm.exe. In Chapter 19, we describe how to use reflection to do the same programmatically.
Specifying assembly attributes
You can control much of the manifest’s content with assembly attributes. For example:
[assembly: AssemblyCopyright ("\x00a9 Corp Ltd. All rights reserved.")]
[assembly: AssemblyVersion ("2.3.2.1")]
These declarations are usually all defined in one file in your project. Visual Studio automatically creates a file called AssemblyInfo.cs in the Properties folder with every new C# project for this purpose, prepopulated with a default set of assembly attributes that provide a starting point for further customization.
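Attributes declared this way end up in the assembly's metadata, so they can be read back at run time. The following is a minimal sketch, not a definitive implementation: it lists the assembly-level attributes of whatever assembly it is compiled into. The class name AttributeDump and the method name Describe are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class AttributeDump   // hypothetical name, for illustration only
{
    // Returns one line per assembly-level attribute: the attribute's type
    // name followed by the constructor arguments recorded in the metadata.
    public static List<string> Describe (Assembly a)
    {
        var lines = new List<string>();
        foreach (CustomAttributeData data in a.GetCustomAttributesData())
        {
            string args = string.Join (", ",
                data.ConstructorArguments.Select (arg => arg.ToString()));
            lines.Add (data.AttributeType.Name + " (" + args + ")");
        }
        return lines;
    }

    public static void Main()
    {
        // Lists the attributes of the assembly that contains this program.
        foreach (string line in Describe (typeof (AttributeDump).Assembly))
            Console.WriteLine (line);
    }
}
```

The exact list depends on the compiler and project settings, so no particular output is shown here.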
The Application Manifest
An application manifest is an XML file that communicates information about the assembly to the operating system. An application manifest, if present, is read and processed before the .NET-managed hosting environment loads the assembly—and can influence how the operating system launches an application’s process.
A .NET application manifest has a root element called assembly in the XML namespace urn:schemas-microsoft-com:asm.v1:
<?xml version="1.0" encoding="utf-8"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
<!-- contents of manifest -->
</assembly>
The following manifest instructs the OS to request administrative elevation:
<?xml version="1.0" encoding="utf-8"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
<trustInfo xmlns="urn:schemas-microsoft-com:asm.v2">
<security>
<requestedPrivileges>
<requestedExecutionLevel level="requireAdministrator" />
</requestedPrivileges>
</security>
</trustInfo>
</assembly>
We describe the consequences of requesting administrative elevation in Chapter 21.
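Because an application manifest is ordinary XML, you can inspect one with LINQ to XML. The sketch below reads the requested execution level from a manifest like the one just shown. The class name ManifestReader and the method name GetRequestedLevel are hypothetical, and matching the element by local name (rather than by a specific namespace) is a simplification chosen for this example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ManifestReader   // hypothetical name, for illustration only
{
    // Returns the value of the level attribute on the first
    // requestedExecutionLevel element, or null if there is none.
    public static string GetRequestedLevel (string manifestXml)
    {
        XDocument doc = XDocument.Parse (manifestXml);
        XElement level = doc.Descendants()
            .FirstOrDefault (e => e.Name.LocalName == "requestedExecutionLevel");
        return level == null ? null : (string) level.Attribute ("level");
    }

    public static void Main()
    {
        string xml = @"<?xml version=""1.0"" encoding=""utf-8""?>
<assembly manifestVersion=""1.0"" xmlns=""urn:schemas-microsoft-com:asm.v1"">
  <trustInfo xmlns=""urn:schemas-microsoft-com:asm.v2"">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level=""requireAdministrator"" />
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>";

        Console.WriteLine (GetRequestedLevel (xml));   // requireAdministrator
    }
}
```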
Metro applications have a far more elaborate manifest, described in the Package.appxmanifest file. This includes a declaration of the program’s capabilities, which determine permissions granted by the operating system. The easiest way to edit this file is with Visual Studio, which presents a UI when you double-click the manifest file.
Deploying a .NET application manifest
You can deploy a .NET application manifest in two ways:
§ As a specially named file located in the same folder as the assembly
§ Embedded within the assembly itself
As a separate file, the manifest’s name must match the assembly’s file name, plus .manifest. So, if an assembly is named MyApp.exe, its manifest is named MyApp.exe.manifest.
To embed an application manifest file into an assembly, first build the assembly and then call the .NET mt tool as follows:
mt -manifest MyApp.exe.manifest -outputresource:MyApp.exe;#1
NOTE
The .NET tool ildasm.exe is blind to the presence of an embedded application manifest. Visual Studio, however, indicates whether an embedded application manifest is present if you double-click the assembly in Solution Explorer.
Modules
The contents of an assembly are actually packaged within one or more intermediate containers, called modules. A module corresponds to a file containing the contents of an assembly. The reason for this extra layer of containership is to allow an assembly to span multiple files—a feature that’s useful when building an assembly containing code compiled in a mixture of programming languages.
Figure 18-1 shows the normal case of an assembly with a single module. Figure 18-2 shows a multifile assembly. In a multifile assembly, the “main” module always contains the manifest; additional modules can contain IL and/or resources. The manifest describes the relative location of all the other modules that make up the assembly.
Figure 18-1. Single-file assembly
Multifile assemblies have to be compiled from the command line: there’s no support in Visual Studio. To do this, you invoke the csc compiler with the /t:module switch to create each module, and then link them with the assembly linker tool, al.exe.
Although the need for multifile assemblies is rare, at times you need to be aware of the extra level of containership that modules impose—even when dealing just with single-module assemblies. The main scenario is with reflection (see Reflecting Assemblies and Emitting Assemblies and Types in Chapter 19).
The Assembly Class
The Assembly class in System.Reflection is a gateway to accessing assembly metadata at runtime. There are a number of ways to obtain an assembly object: the simplest is via a Type’s Assembly property:
Assembly a = typeof (Program).Assembly;
or, in Metro applications:
Assembly a = typeof (Program).GetTypeInfo().Assembly;
Figure 18-2. Multifile assembly
In non-Metro apps, you can also obtain an Assembly object by calling one of Assembly’s static methods:
GetExecutingAssembly
Returns the assembly of the type that defines the currently executing function
GetCallingAssembly
Does the same as GetExecutingAssembly, but for the function that called the currently executing function
GetEntryAssembly
Returns the assembly defining the application’s original entry method
Once you have an Assembly object, you can use its properties and methods to query the assembly’s metadata and reflect upon its types. Table 18-1 shows a summary of these functions.
Table 18-1. Assembly members
Functions | Purpose | See the section...
FullName, GetName | Returns the fully qualified name or an AssemblyName object | Assembly Names
CodeBase, Location | Location of the assembly file | Resolving and Loading Assemblies
Load, LoadFrom, LoadFile | Manually loads an assembly into the current application domain | Resolving and Loading Assemblies
GlobalAssemblyCache | Indicates whether the assembly is in the GAC | The Global Assembly Cache
GetSatelliteAssembly | Locates the satellite assembly of a given culture | Resources and Satellite Assemblies
GetType, GetTypes | Returns a type, or all types, defined in the assembly | Reflecting and Activating Types in Chapter 19
EntryPoint | Returns the application’s entry method, as a MethodInfo | Reflecting and Invoking Members in Chapter 19
GetModules, ManifestModule | Returns all modules, or the main module, of an assembly | Reflecting Assemblies in Chapter 19
GetCustomAttributes | Returns the assembly’s attributes | Working with Attributes in Chapter 19
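To make the table concrete, here is a short sketch that obtains Assembly objects in the ways described above and then queries a few of the listed members. It assumes a non-Metro application, and the class name AssemblyTour and the helper TypeNames are hypothetical. Note that GetEntryAssembly can return null in some hosting scenarios, so the sketch checks for that.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class AssemblyTour   // hypothetical name, for illustration only
{
    // Simple names of all types defined in the given assembly.
    public static string[] TypeNames (Assembly a)
    {
        return a.GetTypes().Select (t => t.Name).ToArray();
    }

    public static void Main()
    {
        Assembly viaType   = typeof (AssemblyTour).Assembly;
        Assembly executing = Assembly.GetExecutingAssembly();
        Assembly entry     = Assembly.GetEntryAssembly();   // may be null in some hosts

        // Both routes should lead to the same assembly here.
        Console.WriteLine (viaType.Equals (executing));

        Console.WriteLine (executing.FullName);              // fully qualified name
        Console.WriteLine (executing.Location);              // path of the loaded file
        Console.WriteLine (executing.ManifestModule.Name);   // the "main" module
        Console.WriteLine (executing.GetModules().Length);   // number of modules

        foreach (string name in TypeNames (executing))
            Console.WriteLine (name);

        if (entry != null && entry.EntryPoint != null)
            Console.WriteLine (entry.EntryPoint.Name);       // the entry method's name
    }
}
```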
Strong Names and Assembly Signing
A strongly named assembly has a unique and untamperable identity. It works by adding two bits of metadata to the manifest:
§ A unique number that belongs to the authors of the assembly
§ A signed hash of the assembly, proving that the unique number holder produced the assembly
This requires a public/private key pair. The public key provides the unique identifying number, and the private key facilitates signing.
NOTE
Strong-name-signing is not the same as Authenticode-signing. We cover Authenticode later in this chapter.
The public key is valuable in guaranteeing the uniqueness of assembly references: a strongly named assembly incorporates the public key into its identity. The signature is valuable for security—it prevents a malicious party from tampering with your assembly. Without your private key, no one can release a modified version of the assembly without the signature breaking (causing an error when loaded). Of course, someone could re-sign the assembly with a different key pair—but this would give the assembly a different identity. Any application referencing the original assembly would shun the imposter because public key tokens are written into references.
WARNING
Adding a strong name to a previously “weak” named assembly changes its identity. For this reason, it pays to give production assemblies strong names from the outset.
A strongly named assembly can also be registered in the GAC.
How to Strongly Name an Assembly
To give an assembly a strong name, first generate a public/private key pair with the sn.exe utility:
sn.exe -k MyKeyPair.snk
This manufactures a new key pair and stores it to a file called MyKeyPair.snk. If you subsequently lose this file, you will permanently lose the ability to recompile your assembly with the same identity.
You then compile with the /keyfile switch:
csc.exe /keyfile:MyKeyPair.snk Program.cs
Visual Studio assists you with both steps in the Project Properties window.
WARNING
A strongly named assembly cannot reference a weakly named assembly. This is another compelling reason to strongly name all your production assemblies.
The same key pair can sign multiple assemblies—they’ll still have distinct identities if their simple names differ. The choice as to how many key pair files to use within an organization depends on a number of factors. Having a separate key pair for every assembly is advantageous should you later transfer ownership of a particular application (along with its referenced assemblies), in terms of minimum disclosure. But it makes it harder for you to create a security policy that recognizes all of your assemblies. It also makes it harder to validate dynamically loaded assemblies.
NOTE
Prior to C# 2.0, the compiler did not support the /keyfile switch and you would specify a key file with the AssemblyKeyFile attribute instead. This presented a security risk, because the path to the key file would remain embedded in the assembly’s metadata. For instance, with ildasm, you can see quite easily that the path to the key file used to sign mscorlib in CLR 1.1 was as follows:
F:\qfe\Tools\devdiv\EcmaPublicKey.snk
Obviously, you need access to that folder on Microsoft’s .NET Framework build machine to take advantage of that information!
Delay Signing
In an organization with hundreds of developers, you might want to restrict access to the key pairs used for signing assemblies, for a couple of reasons:
§ If a key pair gets leaked, your assemblies are no longer untamperable.
§ A test assembly, if signed and leaked, could be maliciously propagated as the real assembly.
Withholding key pairs from developers, though, means they cannot compile and test assemblies with their correct identity. Delay signing is a system for working around this problem.
A delay-signed assembly is flagged with the correct public key, but not signed with the private key. A delay-signed assembly is equivalent to a tampered assembly and would normally be rejected by the CLR. The developer, however, instructs the CLR to bypass validation for the delay-signed assemblies on that computer, allowing the unsigned assemblies to run. When it comes time for final deployment, the private key holder re-signs the assembly with the real key pair.
To delay-sign, you need a file containing just the public key. You can extract this from a key pair by calling sn with the -p switch:
sn -k KeyPair.snk
sn -p KeyPair.snk PublicKeyOnly.pk
KeyPair.snk is kept secure and PublicKeyOnly.pk is freely distributed.
NOTE
You can also obtain PublicKeyOnly.pk from an existing signed assembly with the -e switch:
sn -e YourLibrary.dll PublicKeyOnly.pk
You then delay-sign with PublicKeyOnly.pk by calling csc with the /delaysign+ switch:
csc /delaysign+ /keyfile:PublicKeyOnly.pk /target:library YourLibrary.cs
Visual Studio does the same if you tick the “Delay sign” checkbox.
The next step is to instruct the .NET runtime to skip assembly identity verification on the development computers running the delay-signed assemblies. This can be done on either a per-assembly or a per-public key basis, by calling the sn tool with the -Vr switch:
sn -Vr YourLibrary.dll
WARNING
Visual Studio does not perform this step automatically. You must disable assembly verification manually from the command line. Otherwise, your assembly will not execute.
The final step is to fully sign the assembly prior to deployment. This is when you replace the null signature with a real signature that can be generated only with access to the private key. To do this, you call sn with the -R switch:
sn -R YourLibrary.dll KeyPair.snk
You can then reinstate assembly verification on development machines as follows:
sn -Vu YourLibrary.dll
You won’t need to recompile any applications that reference the delay-signed assembly, because you’ve changed only the assembly’s signature, not its identity.
Assembly Names
An assembly’s “identity” comprises four pieces of metadata from its manifest:
§ Its simple name
§ Its version (“0.0.0.0” if not present)
§ Its culture (“neutral” if not a satellite)
§ Its public key token (“null” if not strongly named)
The simple name comes not from any attribute, but from the name of the file to which it was originally compiled (less any extension). So, the simple name of the System.Xml.dll assembly is “System.Xml”. Renaming a file doesn’t change the assembly’s simple name.
The version number comes from the AssemblyVersion attribute. It’s a string divided into four parts as follows:
major.minor.build.revision
You can specify a version number as follows:
[assembly: AssemblyVersion ("2.5.6.7")]
The culture comes from the AssemblyCulture attribute and applies to satellite assemblies, described later in the section Resources and Satellite Assemblies.
The public key token comes from a key pair supplied at compile time via the /keyfile switch, as we saw earlier, in the section How to Strongly Name an Assembly.
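The four identity components can be read from any loaded assembly through its AssemblyName. The sketch below prints them for the assembly that defines System.String, applying the same defaults described above ("neutral" for culture and "null" for a missing public key token). The names IdentityDemo and DescribeIdentity are hypothetical.

```csharp
using System;
using System.Reflection;

public static class IdentityDemo   // hypothetical name, for illustration only
{
    // Formats the four identity components as
    // "simple name | version | culture | public key token".
    public static string DescribeIdentity (Assembly a)
    {
        AssemblyName n = a.GetName();

        string culture = string.IsNullOrEmpty (n.CultureName)
            ? "neutral" : n.CultureName;

        byte[] token = n.GetPublicKeyToken();
        string tokenText = (token == null || token.Length == 0)
            ? "null"
            : BitConverter.ToString (token).Replace ("-", "").ToLowerInvariant();

        return n.Name + " | " + n.Version + " | " + culture + " | " + tokenText;
    }

    public static void Main()
    {
        // The simple name and version shown depend on the runtime in use.
        Console.WriteLine (DescribeIdentity (typeof (string).Assembly));
    }
}
```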
Fully Qualified Names
A fully qualified assembly name is a string that includes all four identifying components, in this format:
simple-name, Version=version, Culture=culture, PublicKeyToken=public-key-token
For example, the fully qualified name of System.Xml.dll is:
"System.Xml, Version=2.0.0.0, Culture=neutral,
PublicKeyToken=b77a5c561934e089"
If the assembly has no AssemblyVersion attribute, the version appears as “0.0.0.0”. If it is unsigned, its public key token appears as “null”.
An Assembly object’s FullName property returns its fully qualified name. The compiler always uses fully qualified names when recording assembly references in the manifest.
NOTE
A fully qualified assembly name does not include a directory path to assist in locating it on disk. Locating an assembly residing in another directory is an entirely separate matter that we pick up in Resolving and Loading Assemblies.
The AssemblyName Class
AssemblyName is a class with a typed property for each of the four components of a fully qualified assembly name. AssemblyName has two purposes:
§ It parses or builds a fully qualified assembly name.
§ It stores some extra data to assist in resolving (finding) the assembly.
You can obtain an AssemblyName object in any of the following ways:
§ Instantiate an AssemblyName, providing a fully qualified name.
§ Call GetName on an existing Assembly.
§ Call AssemblyName.GetAssemblyName, providing the path to an assembly file on disk (non-Metro apps only).
You can also instantiate an AssemblyName object without any arguments, and then set each of its properties to build a fully qualified name. An AssemblyName is mutable when constructed in this manner.
Here are its essential properties and methods:
string FullName { get; } // Fully qualified name
string Name { get; set; } // Simple name
Version Version { get; set; } // Assembly version
CultureInfo CultureInfo { get; set; } // For satellite assemblies
string CodeBase { get; set; } // Location
byte[] GetPublicKey(); // 160 bytes
void SetPublicKey (byte[] key);
byte[] GetPublicKeyToken(); // 8-byte version
void SetPublicKeyToken (byte[] publicKeyToken);
Version is itself a strongly typed representation, with properties for Major, Minor, Build, and Revision numbers. GetPublicKey returns the full cryptographic public key; GetPublicKeyToken returns the eight-byte token, a shortened form derived from a hash of the public key, that is used in establishing identity.
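The relationship between the full public key and its token can be checked in code. The public key token is commonly described as the last eight bytes of the SHA-1 hash of the public key, in reverse order; the following sketch computes a token on that assumption and compares it with what GetPublicKeyToken reports. The names TokenDemo and ComputeToken are hypothetical.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Security.Cryptography;

public static class TokenDemo   // hypothetical name, for illustration only
{
    // Assumption: token = last 8 bytes of SHA-1(publicKey), reversed.
    public static byte[] ComputeToken (byte[] publicKey)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash (publicKey);
            return hash.Skip (hash.Length - 8).Reverse().ToArray();
        }
    }

    public static void Main()
    {
        AssemblyName name = typeof (string).Assembly.GetName();

        byte[] computed = ComputeToken (name.GetPublicKey());
        byte[] reported = name.GetPublicKeyToken();

        Console.WriteLine (BitConverter.ToString (computed));
        Console.WriteLine (BitConverter.ToString (reported));
        Console.WriteLine (computed.SequenceEqual (reported));
    }
}
```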
To use AssemblyName to obtain the simple name of an assembly:
Console.WriteLine (typeof (string).Assembly.GetName().Name); // mscorlib
To get an assembly version:
string v = myAssembly.GetName().Version.ToString();
We’ll examine the CodeBase property in the later section Resolving and Loading Assemblies.
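The two purposes of AssemblyName, parsing and building, can be sketched as follows. The first half parses the fully qualified name of System.Xml quoted earlier; the second half builds a name from individual properties. The names NameDemo, TokenHex, and Build, and the "utils" assembly, are hypothetical and used only for illustration.

```csharp
using System;
using System.Globalization;
using System.Reflection;

public static class NameDemo   // hypothetical name, for illustration only
{
    // Public key token as lowercase hex, or "" if the name has none.
    public static string TokenHex (AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        return token == null
            ? ""
            : BitConverter.ToString (token).Replace ("-", "").ToLowerInvariant();
    }

    // Builds a culture-neutral AssemblyName from a simple name and a version.
    public static AssemblyName Build (string simpleName, Version version)
    {
        return new AssemblyName
        {
            Name = simpleName,
            Version = version,
            CultureInfo = CultureInfo.InvariantCulture
        };
    }

    public static void Main()
    {
        // Parsing: each component becomes a typed property.
        var parsed = new AssemblyName (
            "System.Xml, Version=2.0.0.0, Culture=neutral, " +
            "PublicKeyToken=b77a5c561934e089");

        Console.WriteLine (parsed.Name);         // System.Xml
        Console.WriteLine (parsed.Version);      // 2.0.0.0
        Console.WriteLine (TokenHex (parsed));   // b77a5c561934e089

        // Building: set the properties, then read FullName.
        AssemblyName built = Build ("utils", new Version (1, 0, 0, 1));
        Console.WriteLine (built.FullName);
    }
}
```

Parsing a name this way does not load or locate the assembly; it only splits the string into its components.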
Assembly Informational and File Versions
Because an integral part of an assembly name is its version, changing the AssemblyVersion attribute changes the assembly’s identity. This affects compatibility with referencing assemblies, which can be undesirable when making non-breaking updates. To address this, there are two other independent assembly-level attributes for expressing version-related information, both of which are ignored by the CLR:
AssemblyInformationalVersion
The version as displayed to the end user. This is visible in the Windows File Properties dialog box as “Product Version”. Any string can go here, such as “5.1 Beta 2”. Typically, all the assemblies in an application would be assigned the same informational version number.
AssemblyFileVersion
This is intended to refer to the build number for that assembly. This is visible in the Windows File Properties dialog box as “File Version”. As with AssemblyVersion, it must contain a string consisting of up to four numbers separated by periods.
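The three kinds of version can be read side by side. In this sketch, the assembly version comes from the AssemblyName, while the file and informational versions are read from their attributes (either of which may be absent). The names VersionDemo and Versions are hypothetical.

```csharp
using System;
using System.Reflection;

public static class VersionDemo   // hypothetical name, for illustration only
{
    // Returns { assembly version, file version, informational version }.
    // Missing attributes are reported as "(none)".
    public static string[] Versions (Assembly a)
    {
        var file = a.GetCustomAttribute<AssemblyFileVersionAttribute>();
        var info = a.GetCustomAttribute<AssemblyInformationalVersionAttribute>();

        return new[]
        {
            a.GetName().Version.ToString(),
            file == null ? "(none)" : file.Version,
            info == null ? "(none)" : info.InformationalVersion
        };
    }

    public static void Main()
    {
        string[] v = Versions (typeof (string).Assembly);
        Console.WriteLine ("AssemblyVersion:              " + v[0]);
        Console.WriteLine ("AssemblyFileVersion:          " + v[1]);
        Console.WriteLine ("AssemblyInformationalVersion: " + v[2]);
    }
}
```

Only the first of these values is part of the assembly's identity; the other two can change freely without affecting references.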
Authenticode Signing
Authenticode is a code-signing system whose purpose is to prove the identity of the publisher. Authenticode and strong-name signing are independent: you can sign an assembly with either or both systems.
While strong-name signing can prove that assemblies A, B, and C came from the same party (assuming the private key hasn’t been leaked), it can’t tell you who that party was. In order to know that the party was Joe Albahari—or Microsoft Corporation—you need Authenticode.
Authenticode is useful when downloading programs from the Internet, because it provides assurance that a program came from whoever was named by the Certificate Authority and was not modified in transit. It also prevents the “Unknown Publisher” warning shown in Figure 18-3, when running a downloaded application for the first time. Authenticode signing is also a requirement for Metro apps when submitting to the Windows Store, and for assemblies in general as part of the Windows Logo program.
Figure 18-3. Unsigned file warning
Authenticode works with not only .NET assemblies, but also unmanaged executables and binaries such as ActiveX controls or .msi deployment files. Of course, Authenticode doesn’t guarantee that a program is free from malware—although it does make it less likely. A person or entity has been willing to put its name (backed by a passport or company document) behind the executable or library.
NOTE
The CLR does not treat an Authenticode signature as part of an assembly’s identity. However, it can read and validate Authenticode signatures on demand, as we’ll see soon.
Signing with Authenticode requires that you contact a Certificate Authority (CA) with evidence of your personal identity or company’s identity (articles of incorporation, etc.). Once the CA has checked your documents, it will issue an X.509 code-signing certificate that is typically valid for one to five years. This enables you to sign assemblies with the signtool utility. You can also make a certificate yourself with the makecert utility; however, it will be recognized only on computers on which the certificate is explicitly installed.
The fact that (non-self-signed) certificates can work on any computer relies on public key infrastructure. Essentially, your certificate is signed with another certificate belonging to a CA. The CA is trusted because all CAs are loaded into the operating system (to see them, go to the Windows Control Panel and choose Internet Options→Content tab→Certificates button→Trusted Root Certification Authorities tab). A CA can revoke a publisher’s certificate if leaked, so verifying an Authenticode signature requires periodically asking the CA for an up-to-date list of certificate revocations.
Because Authenticode uses cryptographic signing, an Authenticode signature is invalid if someone subsequently tampers with the file. We discuss cryptography, hashing, and signing in Chapter 21.
How to Sign with Authenticode
Obtaining and installing a certificate
The first step is to obtain a code-signing certificate from a CA (see sidebar). You can then either work with the certificate as a password-protected file, or load the certificate into the computer’s certificate store. The benefit of doing the latter is that you can sign without needing to specify a password. This is advantageous because it avoids having a password visible in automated build scripts or batch files.
WHERE TO GET A CODE-SIGNING CERTIFICATE
Just a handful of code-signing CAs are preloaded into Windows as root certification authorities. These include (with prices for one-year code-signing certificates at the time of publication): Comodo ($180), Go Daddy ($200), GlobalSign ($229), thawte ($299), and VeriSign ($499).
There is also a reseller called Ksoftware (http://www.ksoftware.net), which currently offers Comodo code-signing certificates for $95 per year.
The Authenticode certificates issued by Ksoftware, Comodo, Go Daddy, and GlobalSign are advertised as less restrictive in that they will also sign non-Microsoft programs. Aside from this, the products from all vendors are functionally equivalent.
Note that a certificate for SSL cannot generally be used for Authenticode signing (despite using the same X.509 infrastructure). This is, in part, because a certificate for SSL is about proving ownership of a domain; Authenticode is about proving who you are.
To load a certificate into the computer’s certificate store, go to the Windows Control Panel and select Internet Options→Content tab→Certificates button→Import. Once the import is complete, click the View button on the certificate, go to the Details tab, and copy the certificate’s thumbprint. This is the SHA-1 hash that you’ll subsequently need to identify the certificate when signing.
NOTE
If you also want to strong-name-sign your assembly (which is highly recommended), you must do so before Authenticode signing. This is because the CLR knows about Authenticode signing, but not vice versa. So if you strong-name-sign an assembly after Authenticode-signing it, the latter will see the addition of the CLR’s strong name as an unauthorized modification, and consider the assembly tampered.
Signing with signtool.exe
You can Authenticode-sign your programs with the signtool utility that comes with Visual Studio. It displays a UI if you call it with the signwizard flag; otherwise, you can use it in command-line style as follows:
signtool sign /sha1 (thumbprint) filename
The thumbprint is that of the certificate as shown in the computer’s certificate store. (If the certificate is in a file instead, specify the filename with /f, and the password with /p.)
For example:
signtool sign /sha1 ff813c473dc93aaca4bac681df472b037fa220b3 LINQPad.exe
You can also specify a description and product URL with /d and /du:
... /d LINQPad /du http://www.linqpad.net
In most cases, you will also want to specify a time-stamping server.
Time stamping
After your certificate expires, you’ll no longer be able to sign programs. However, programs that you signed before its expiry will still be valid—if you specified a time-stamping server with the /t switch when signing. The CA will provide you with a URI for this purpose: the following is for Comodo (or Ksoftware):
... /t http://timestamp.comodoca.com/authenticode
Verifying that a program has been signed
The easiest way to view an Authenticode signature on a file is to view the file’s properties in Windows Explorer (look in the Digital Signatures tab). The signtool utility also provides an option for this.
Authenticode Validation
Both the operating system and the CLR may validate Authenticode signatures.
Windows validates Authenticode signatures before running programs marked as “blocked”—in practice, this means programs run for the first time after having been downloaded from the Internet. The status—or absence—of Authenticode information is then shown in the dialog box we saw in Figure 18-3.
The CLR reads and validates Authenticode signatures when you ask for assembly evidence. Here’s how to do that:
Publisher p = someAssembly.Evidence.GetHostEvidence<Publisher>();
The Publisher class (in System.Security.Policy) exposes a Certificate property. If this returns a non-null value, it has been Authenticode-signed. You can then query this object for the details of the certificate.
WARNING
Prior to Framework 4.0, the CLR would read and validate Authenticode signatures when an assembly was loaded—rather than waiting until you called GetHostEvidence. This had potentially disastrous performance consequences, because Authenticode validation may round-trip to the CA to update the certificate revocation list—which can take up to 30 seconds (to fail) if there are Internet connectivity problems. For this reason, it’s best to avoid Authenticode-signing .NET 3.5 or earlier assemblies if possible. (Signing .msi setup files, though, is fine.)
Regardless of the Framework version, if a program has a bad or unverifiable Authenticode signature, the CLR will merely make that information available via GetHostEvidence: it will never display a warning to the user or prevent the assembly from running.
As we said previously, an Authenticode signature has no effect on an assembly’s identity or name.
The Global Assembly Cache
As part of the .NET Framework installation, a central repository is created on the computer for storing .NET assemblies, called the Global Assembly Cache, or GAC. The GAC contains a centralized copy of the .NET Framework itself, and it can also be used to centralize your own assemblies.
The main factor in choosing whether to load your assemblies into the GAC relates to versioning. For assemblies in the GAC, versioning is centralized at the machine level and controlled by the computer’s administrator. For assemblies outside the GAC, versioning is handled on an application basis, so each application looks after its own dependency and update issues (typically by maintaining its own copy of each assembly that it references).
The GAC is useful in the minority of cases where machine-centralized versioning is genuinely advantageous. For example, consider a suite of interdependent plug-ins, each referencing some shared assemblies. We’ll assume each plug-in is in its own directory, and for this reason, there’s a possibility of there being multiple copies of a shared assembly (maybe some later than others). Further, we’ll assume the hosting application will want to load each shared assembly just once for the sake of efficiency and type compatibility. The task of assembly resolution is now difficult for the hosting application, requiring careful planning and an understanding of the subtleties of assembly loading contexts. The simple solution here is to put the shared assemblies into the GAC. This ensures that the CLR always makes straightforward and consistent assembly resolution choices.
In more typical scenarios, however, the GAC is best avoided because it adds the following complications:
§ XCOPY or ClickOnce deployment is no longer possible; an administrative setup is required to install your application.
§ Updating assemblies in the GAC also requires administrative privileges.
§ Use of the GAC can complicate development and testing, because fusion, the CLR’s assembly resolution mechanism, always favors GAC assemblies over local copies.
§ Versioning and side-by-side execution require some planning, and a mistake may break other applications.
On the positive side, the GAC can improve startup time for very large assemblies, because the CLR verifies the signatures of assemblies in the GAC only once upon installation, rather than every time the assembly loads. In percentage terms, this is relevant if you’ve generated native images for your assemblies with the ngen.exe tool, choosing nonoverlapping base addresses. A good article describing these issues is available online at the MSDN site, titled “The Performance Benefits of NGen.”
NOTE
Assemblies in the GAC are always fully trusted—even when called from an assembly running in a limited-permissions sandbox. We discuss this further in Chapter 21.
How to Install Assemblies to the GAC
To install assemblies to the GAC, the first step is to give your assembly a strong name. Then you can install it using the .NET command-line tool, gacutil:
gacutil /i MyAssembly.dll
If the assembly already exists in the GAC with the same public key and version, it’s updated. You don’t have to uninstall the old one first.
To uninstall an assembly (note the lack of a file extension):
gacutil /u MyAssembly
You can also specify that assemblies be installed to the GAC as part of a setup project in Visual Studio.
Calling gacutil with the /l switch lists all assemblies in the GAC. You can do the same with the mscorcfg MMC snap-in (from Windows→Administrative Tools→Framework Configuration).
Once an assembly is loaded into the GAC, applications can reference it without needing a local copy of that assembly.
WARNING
If a local copy is present, it’s ignored in favor of the GAC image. This means there’s no way to reference or test a recompiled version of your library—until you update the GAC. This holds true as long as you preserve the assembly’s version and identity.
GAC and Versioning
Changing an assembly’s AssemblyVersion gives it a brand-new identity. To illustrate, let’s say you write a utils assembly, version it “1.0.0.0”, strongly name it, and then install it in the GAC. Then suppose later you add some new features, change the version to “1.0.0.1”, recompile it, and reinstall it into the GAC. Instead of overwriting the original assembly, the GAC now holds both versions. This means:
§ You can choose which version to reference when compiling another application that uses utils.
§ Any application previously compiled to reference utils 1.0.0.0 will continue to do so.
This is called side-by-side execution. Side-by-side execution prevents the “DLL hell” that can otherwise occur when a shared assembly is unilaterally updated: applications designed for the older version might unexpectedly break.
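To make the identity point concrete, the following minimal sketch parses two assembly display names with the AssemblyName class and compares their name and version. The helper SameNameAndVersion is hypothetical, and a real strong-name identity also includes the culture and public key token, which are omitted here for brevity:

```csharp
using System;
using System.Reflection;

static class VersionIdentityDemo
{
    // Hypothetical helper: true when both display names carry the same
    // simple name and the same version number.
    public static bool SameNameAndVersion (string displayName1, string displayName2)
    {
        AssemblyName a = new AssemblyName (displayName1);
        AssemblyName b = new AssemblyName (displayName2);
        return a.Name == b.Name && object.Equals (a.Version, b.Version);
    }

    static void Main()
    {
        // Same simple name, different version: a different identity.
        Console.WriteLine (SameNameAndVersion (
            "utils, Version=1.0.0.0", "utils, Version=1.0.0.1"));   // False

        Console.WriteLine (SameNameAndVersion (
            "utils, Version=1.0.0.0", "utils, Version=1.0.0.0"));   // True
    }
}
```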
A complication arises, though, when you want to apply bug fixes or minor updates to existing assemblies. You have two options:
§ Reinstall the fixed assembly to the GAC with the same version number.
§ Compile the fixed assembly with a new version number and install that to the GAC.
The difficulty with the first option is that there’s no way to apply the update selectively to certain applications. It’s all or nothing. The difficulty with the second option is that applications will not normally use the newer assembly version without being recompiled. There is a workaround—you can create a publisher policy allowing assembly version redirection—at the cost of increasing deployment complexity.
Side-by-side execution is good for mitigating some of the problems of shared assemblies. If you avoid the GAC altogether—instead allowing each application to maintain its own private copy of utils—you eliminate all of the problems of shared assemblies!
Resources and Satellite Assemblies
An application typically contains not only executable code, but also content such as text, images, or XML files. Such content can be represented in an assembly through a resource. There are two overlapping use cases for resources:
§ Incorporating data that cannot go into source code, such as images
§ Storing data that might need translation in a multilingual application
An assembly resource is ultimately a byte stream with a name. You can think of an assembly as containing a dictionary of byte arrays keyed by string. This can be seen in ildasm if we disassemble an assembly that contains a resource called banner.jpg and a resource called data.xml:
.mresource public banner.jpg
{
// Offset: 0x00000F58 Length: 0x000004F6
}
.mresource public data.xml
{
// Offset: 0x00001458 Length: 0x0000027E
}
In this case, banner.jpg and data.xml were included directly in the assembly—each as its own embedded resource. This is the simplest way to work.
The Framework also lets you add content through intermediate .resources containers. They are designed for holding content that may require translation into different languages. Localized .resources can be packaged as individual satellite assemblies that are automatically picked up at runtime, based on the user’s operating system language.
Figure 18-4 illustrates an assembly that contains two directly embedded resources, plus a .resources container called welcome.resources, for which we’ve created two localized satellites.
Figure 18-4. Resources
Directly Embedding Resources
NOTE
Embedding resources into assemblies is not supported in Metro apps. Instead, add any extra files to your deployment package, and access them by reading from your application StorageFolder (Package.Current.InstalledLocation).
To directly embed a resource at the command line, use the /resource switch when compiling:
csc /resource:banner.jpg /resource:data.xml MyApp.cs
You can optionally specify that the resource be given a different name in the assembly as follows:
csc /resource:<file-name>,<resource-name>
To directly embed a resource using Visual Studio:
§ Add the file to your project.
§ Set its build action to “Embedded Resource.”
Visual Studio always prefixes resource names with the project’s default namespace, plus the names of any subfolders in which the file is contained. So, if your project’s default namespace was Westwind.Reports and your file was called banner.jpg in the folder pictures, the resource name would be Westwind.Reports.pictures.banner.jpg.
WARNING
Resource names are case-sensitive. This makes project subfolder names in Visual Studio that contain resources effectively case-sensitive.
To retrieve a resource, you call GetManifestResourceStream on the assembly containing the resource. This returns a stream, which you can then read as any other:
Assembly a = Assembly.GetEntryAssembly();
using (Stream s = a.GetManifestResourceStream ("TestProject.data.xml"))
using (XmlReader r = XmlReader.Create (s))
...
System.Drawing.Image image;
using (Stream s = a.GetManifestResourceStream ("TestProject.banner.jpg"))
image = System.Drawing.Image.FromStream (s);
The stream returned is seekable, so you can also do this:
byte[] data;
using (Stream s = a.GetManifestResourceStream ("TestProject.banner.jpg"))
data = new BinaryReader (s).ReadBytes ((int) s.Length);
If you’ve used Visual Studio to embed the resource, you must remember to include the namespace-based prefix. To help avoid error, you can specify the prefix in a separate argument, using a type. The type’s namespace is used as the prefix:
using (Stream s = a.GetManifestResourceStream (typeof (X), "XmlData.xml"))
X can be any type with the desired namespace of your resource (typically, a type in the same project folder).
WARNING
Setting a project item’s build action in Visual Studio to “Resource” within a WPF application is not the same as setting its build action to “Embedded Resource”. The former actually adds the item to a .resources file called <AssemblyName>.g.resources, whose content you access through WPF’s Application class, using a URI as a key.
To add to the confusion, WPF further overloads the term “resource.” Static resources and dynamic resources are both unrelated to assembly resources!
GetManifestResourceNames returns the names of all resources in the assembly.
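Listing the names is a quick way to check which prefix Visual Studio applied to a resource. A minimal sketch (the class and method names are illustrative only):

```csharp
using System;
using System.Reflection;

static class ResourceLister
{
    // Returns the names of all manifest resources in the given assembly.
    public static string[] GetNames (Assembly a)
    {
        return a.GetManifestResourceNames();
    }

    static void Main()
    {
        // Prints one line per embedded resource; prints nothing if the
        // executing assembly has no embedded resources.
        foreach (string name in GetNames (Assembly.GetExecutingAssembly()))
            Console.WriteLine (name);
    }
}
```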
.resources Files
.resources files are containers for potentially localizable content. A .resources file ends up as an embedded resource within an assembly—just like any other kind of file. The difference is that you must:
§ Package your content into the .resources file to begin with.
§ Access its content through a ResourceManager or pack URI, rather than a GetManifestResourceStream.
.resources files are structured in binary and so are not human-editable; therefore, you must rely on tools provided by the Framework and Visual Studio to work with them. The standard approach with strings or simple data types is to use the .resx format, which can be converted to a .resources file either by Visual Studio or the resgen tool. The .resx format is also suitable for images intended for a Windows Forms or ASP.NET application.
In a WPF application, you must use Visual Studio’s “Resource” build action for images or similar content needing to be referenced by URI. This applies whether localization is needed or not.
We describe how to do each of these in the following sections.
.resx Files
A .resx file is a design-time format for producing .resources files. A .resx file uses XML and is structured with name/value pairs as follows:
<root>
<data name="Greeting">
<value>hello</value>
</data>
<data name="DefaultFontSize" type="System.Int32, mscorlib">
<value>10</value>
</data>
</root>
To create a .resx file in Visual Studio, add a project item of type “Resources File”. The rest of the work is done automatically:
§ The correct header is created.
§ A designer is provided for adding strings, images, files, and other kinds of data.
§ The .resx file is automatically converted to the .resources format and embedded into the assembly upon compilation.
§ A class is written to help you access the data later on.
NOTE
The resource designer adds images as typed Image objects (System.Drawing.dll), rather than as byte arrays, making them unsuitable for WPF applications.
Creating a .resx file at the command line
If you’re working at the command line, you must start with a .resx file that has a valid header. The easiest way to accomplish this is to create a simple .resx file programmatically. The System.Resources.ResXResourceWriter class (which, peculiarly, resides in the System.Windows.Forms.dll assembly) does exactly this job:
using (ResXResourceWriter w = new ResXResourceWriter ("welcome.resx")) { }
From here, you can either continue to use the ResXResourceWriter to add resources (by calling AddResource) or manually edit the .resx file that it wrote.
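As a rough sketch of the AddResource route, the following writes the two entries from the earlier .resx listing. It assumes a reference to System.Windows.Forms.dll; the class name and the output filename passed to the helper are illustrative:

```csharp
using System;
using System.Resources;   // ResXResourceWriter lives in System.Windows.Forms.dll

static class ResxWriterDemo
{
    // Writes a small .resx file containing a string and an integer entry.
    public static void WriteSample (string path)
    {
        using (ResXResourceWriter w = new ResXResourceWriter (path))
        {
            w.AddResource ("Greeting", "hello");
            w.AddResource ("DefaultFontSize", 10);   // Stored as a typed value
        }   // Disposing the writer flushes the XML to disk
    }

    static void Main()
    {
        WriteSample ("welcome.resx");
        Console.WriteLine ("Wrote welcome.resx");
    }
}
```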
The easiest way to deal with images is to treat the files as binary data and convert them to an image upon retrieval. This is also more versatile than encoding them as a typed Image object. You can include binary data within a .resx file in base 64 format as follows:
<data name="flag.png" type="System.Byte[], mscorlib">
<value>Qk32BAAAAAAAAHYAAAAoAAAAMAMDAwACAgIAAAAD/AA....</value>
</data>
or as a reference to another file that is then read by resgen:
<data name="flag.png"
type="System.Resources.ResXFileRef, System.Windows.Forms">
<value>flag.png;System.Byte[], mscorlib</value>
</data>
When you’re done, you must convert the .resx file by calling resgen. The following converts welcome.resx into welcome.resources:
resgen welcome.resx
The final step is to include the .resources file when compiling, as follows:
csc /resource:welcome.resources MyApp.cs
Reading .resources files
NOTE
If you create a .resx file in Visual Studio, a class of the same name is generated automatically with properties to retrieve each of its items.
The ResourceManager class reads .resources files embedded within an assembly:
ResourceManager r = new ResourceManager ("welcome",
Assembly.GetExecutingAssembly());
(The first argument must be namespace-prefixed if the resource was compiled in Visual Studio.)
You can then access what’s inside by calling GetString or GetObject with a cast:
string greeting = r.GetString ("Greeting");
int fontSize = (int) r.GetObject ("DefaultFontSize");
Image image = (Image) r.GetObject ("flag.png"); // (Visual Studio)
byte[] imgData = (byte[]) r.GetObject ("flag.png"); // (Command line)
To enumerate the contents of a .resources file:
ResourceManager r = new ResourceManager (...);
ResourceSet set = r.GetResourceSet (CultureInfo.CurrentUICulture,
true, true);
foreach (System.Collections.DictionaryEntry entry in set)
Console.WriteLine (entry.Key);
Creating a pack URI resource in Visual Studio
In a WPF application, XAML files need to be able to access resources by URI. For instance:
<Button>
<Image Height="50" Source="flag.png"/>
</Button>
Or, if the resource is in another assembly:
<Button>
<Image Height="50" Source="UtilsAssembly;Component/flag.png"/>
</Button>
(Component is a literal keyword.)
To create resources that can be loaded in this manner, you cannot use .resx files. Instead, you must add the files to your project and set their build action to “Resource” (not “Embedded Resource”). Visual Studio then compiles them into a .resources file called <AssemblyName>.g.resources—also the home of compiled XAML (.baml) files.
To load a URI-keyed resource programmatically, call Application.GetResourceStream:
Uri u = new Uri ("flag.png", UriKind.Relative);
using (Stream s = Application.GetResourceStream (u).Stream)
Notice we used a relative URI. You can also use an absolute URI in exactly the following format (the three commas are not a typo):
Uri u = new Uri ("pack://application:,,,/flag.png");
If you’d rather specify an Assembly object, you can retrieve content instead with a ResourceManager:
Assembly a = Assembly.GetExecutingAssembly();
ResourceManager r = new ResourceManager (a.GetName().Name + ".g", a);
using (Stream s = r.GetStream ("flag.png"))
...
A ResourceManager also lets you enumerate the content of a .g.resources container within a given assembly.
Satellite Assemblies
Data embedded in .resources is localizable.
Resource localization is relevant when your application runs on a version of Windows built to display everything in a different language. For consistency, your application should use that same language too.
A typical setup is as follows:
§ The main assembly contains .resources for the default or fallback language.
§ Separate satellite assemblies contain localized .resources translated to different languages.
When your application runs, the Framework examines the language of the current operating system (from CultureInfo.CurrentUICulture). Whenever you request a resource using ResourceManager, the Framework looks for a localized satellite assembly. If one’s available—and it contains the resource key you requested—it’s used in place of the main assembly’s version.
This means you can enhance language support simply by adding new satellites—without changing the main assembly.
NOTE
A satellite assembly cannot contain executable code, only resources.
Satellite assemblies are deployed in subdirectories of the assembly’s folder as follows:
programBaseFolder\MyProgram.exe
\MyLibrary.dll
\XX\MyProgram.resources.dll
\XX\MyLibrary.resources.dll
XX refers to the two-letter language code (such as “de” for German) or a language and region code (such as “en-GB” for English in Great Britain). This naming system allows the CLR to find and load the correct satellite assembly automatically.
Building satellite assemblies
Recall our previous .resx example, which included the following:
<root>
...
<data name="Greeting">
<value>hello</value>
</data>
</root>
We then retrieved the greeting at runtime as follows:
ResourceManager r = new ResourceManager ("welcome",
Assembly.GetExecutingAssembly());
Console.Write (r.GetString ("Greeting"));
Suppose we want this to instead write “Hallo” if running on the German version of Windows. The first step is to add another .resx file named welcome.de.resx that substitutes hallo for hello:
<root>
<data name="Greeting">
<value>hallo</value>
</data>
</root>
In Visual Studio, this is all you need to do—when you rebuild, a satellite assembly called MyApp.resources.dll is automatically created in a subdirectory called de.
If you’re using the command line, you call resgen to turn the .resx file into a .resources file:
resgen welcome.de.resx
and then call al to build the satellite assembly:
al /culture:de /out:MyApp.resources.dll /embed:welcome.de.resources /t:lib
You can specify /template:MyApp.exe to import the main assembly’s strong name.
Testing satellite assemblies
To simulate running on an operating system with a different language, you must change the CurrentUICulture using the Thread class:
System.Threading.Thread.CurrentThread.CurrentUICulture
= new System.Globalization.CultureInfo ("de");
CultureInfo.CurrentUICulture is a read-only version of the same property.
NOTE
A useful testing strategy is to ℓѺ¢αℓïʐɘ into words that can still be read as English, but do not use the standard Roman Unicode characters.
Visual Studio designer support
The designers in Visual Studio provide extended support for localizing components and visual elements. The WPF designer has its own workflow for localization; other Component-based designers use a design-time-only property to make it appear that a component or Windows Forms control has a Language property. To customize for another language, simply change the Language property and then start modifying the component. All properties of controls that are attributed as Localizable will be persisted to a .resx file for that language. You can switch between languages at any time just by changing the Language property.
Cultures and Subcultures
Cultures are split into cultures and subcultures. A culture represents a particular language; a subculture represents a regional variation of that language. The Framework follows the RFC1766 standard, which represents cultures and subcultures with two-letter codes. Here are the codes for English and German cultures:
en
de
Here are the codes for the Australian English and Austrian German subcultures:
en-AU
de-AT
A culture is represented in .NET with the System.Globalization.CultureInfo class. You can examine the current culture of your application as follows:
Console.WriteLine (System.Threading.Thread.CurrentThread.CurrentCulture);
Console.WriteLine (System.Threading.Thread.CurrentThread.CurrentUICulture);
Running this on a computer localized for Australia illustrates the difference between the two:
en-AU
en-US
CurrentCulture reflects the regional settings of the Windows control panel, whereas CurrentUICulture reflects the language of the operating system.
Regional settings include such things as time zone and the formatting of currency and dates. CurrentCulture determines the default behavior of such functions as DateTime.Parse. Regional settings can be customized to the point where they no longer resemble any particular culture.
CurrentUICulture determines the language in which the computer communicates with the user. Australia doesn’t need a separate version of English for this purpose, so it just uses the U.S. one. If I spent a couple of months working in Austria, I would go to the control panel and change my CurrentCulture to Austrian-German. However, since I can’t speak German, my CurrentUICulture would remain U.S. English.
ResourceManager, by default, uses the current thread’s CurrentUICulture property to determine the correct satellite assembly to load. ResourceManager uses a fallback mechanism when loading resources. If a subculture assembly is defined, that one is used; otherwise, it falls back to the generic culture. If the generic culture is not present, it falls back to the default culture in the main assembly.
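The fallback order can be visualized by walking a culture’s Parent chain. The following is a minimal sketch that mirrors the order described above; it does not call ResourceManager itself, and the helper name FallbackChain and the final placeholder string are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

static class FallbackChainDemo
{
    // Lists the cultures that would be tried in order: subculture first,
    // then the generic culture, then the default resources.
    public static List<string> FallbackChain (string cultureName)
    {
        var chain = new List<string>();
        CultureInfo c = new CultureInfo (cultureName);
        while (!c.Equals (CultureInfo.InvariantCulture))
        {
            chain.Add (c.Name);
            c = c.Parent;          // de-AT -> de -> invariant
        }
        chain.Add ("(default resources in main assembly)");
        return chain;
    }

    static void Main()
    {
        foreach (string step in FallbackChain ("de-AT"))
            Console.WriteLine (step);
    }
}
```

For "de-AT", the chain lists de-AT, then de, then the default resources.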
Resolving and Loading Assemblies
A typical application comprises a main executable assembly plus a set of referenced library assemblies. For example:
AdventureGame.exe
Terrain.dll
UIEngine.dll
Assembly resolution refers to the process of locating referenced assemblies. Assembly resolution happens both at compile time and at runtime. The compile-time system is simple: the compiler knows where to find referenced assemblies because it’s told where to look. You (or Visual Studio) provide the full path to referenced assemblies that are not in the current directory.
Runtime resolution is more complicated. The compiler writes the strong names of referenced assemblies to the manifest—but not any hints as to where to find them. In the simple case where you put all referenced assemblies in the same folder as the main executable, there’s no issue because that’s (close to) the first place the CLR looks. The complexities arise:
§ When you deploy referenced assemblies in other places
§ When you dynamically load assemblies
WARNING
Metro apps are very limited in what you can do in the way of customizing assembly loading and resolution. In particular, loading an assembly from an arbitrary file location isn’t supported, and there’s no AssemblyResolve event.
Assembly and Type Resolution Rules
All types are scoped to an assembly. An assembly is like an address for a type. To give an analogy, we can refer to a person as “Joe” (type name without namespace), or “Joe Bloggs” (full type name), or “Joe Bloggs of 100 Barker Ave, WA” (assembly-qualified type name).
During compilation, we don’t need to go further than a full type name for uniqueness, because you can’t reference two assemblies that define the same full type name (at least not without special tricks). At runtime, though, it’s possible to have many identically named types in memory. This happens within the Visual Studio designer, for instance, whenever you rebuild the components you’re designing. The only way to distinguish such types is by their assembly; therefore, an assembly forms an essential part of a type’s runtime identity. An assembly is also a type’s handle to its code and metadata.
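The three levels of naming in the analogy correspond to three properties on System.Type. A short sketch using StringBuilder as an arbitrary example type (the exact assembly name and version printed on the last line depend on the runtime in use):

```csharp
using System;
using System.Text;

static class TypeNameDemo
{
    static void Main()
    {
        Type t = typeof (StringBuilder);

        Console.WriteLine (t.Name);       // StringBuilder
        Console.WriteLine (t.FullName);   // System.Text.StringBuilder

        // Full type name followed by the containing assembly's full name:
        Console.WriteLine (t.AssemblyQualifiedName);
    }
}
```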
The CLR loads assemblies at the point in execution when they’re first needed. This happens when you refer to one of the assembly’s types. For example, suppose that AdventureGame.exe instantiates a type called TerrainModel.Map. Assuming no additional configuration files, the CLR answers the following questions:
§ What’s the fully qualified name of the assembly that contained TerrainModel.Map when AdventureGame.exe was compiled?
§ Have I already loaded into memory an assembly with this fully qualified name, in the same (resolution) context?
If the answer to the second question is yes, it uses the existing copy in memory; otherwise, it goes looking for the assembly. The CLR first checks the GAC, then the probing paths (generally the application base directory), and as a final resort, fires the AppDomain.AssemblyResolve event. If none returns a match, the CLR throws an exception.
AssemblyResolve
The AssemblyResolve event allows you to intervene and manually load an assembly that the CLR can’t find. If you handle this event, you can scatter referenced assemblies in a variety of locations and still have them load.
Within the AssemblyResolve event handler, you locate the assembly and load it by calling one of three static methods in the Assembly class: Load, LoadFrom, or LoadFile. These methods return a reference to the newly loaded assembly, which you then return to the caller:
static void Main()
{
AppDomain.CurrentDomain.AssemblyResolve += FindAssembly;
...
}
static Assembly FindAssembly (object sender, ResolveEventArgs args)
{
string fullyQualifiedName = args.Name;
Assembly a = Assembly.LoadFrom (...);
return a;
}
The AssemblyResolve event is unusual in that its handler has a return type. If there are multiple handlers, the first one to return a nonnull Assembly wins.
Loading Assemblies
The Load methods in Assembly are useful both inside and outside an AssemblyResolve handler. Outside the event handler, they can load and execute assemblies not referenced at compilation. An example of when you might do this is to execute a plug-in.
WARNING
Think carefully before calling Load, LoadFrom, or LoadFile: these methods permanently load an assembly into the current application domain—even if you do nothing with the resultant Assembly object. Loading an assembly has side effects: it locks the assembly files as well as affecting subsequent type resolution.
The only way to unload an assembly is to unload the whole application domain. (There’s also a technique to avoid locking assemblies called shadow copying for assemblies in the probing path—go to http://albahari.com/shadowcopy for the MSDN article.)
If you just want to examine an assembly without executing any of its code, you can instead use the reflection-only context (see Chapter 19).
To load an assembly from a fully qualified name (without a location) call Assembly.Load. This instructs the CLR to find the assembly using its normal automatic resolution system. The CLR itself uses Load to find referenced assemblies.
To load an assembly from a filename, call LoadFrom or LoadFile.
To load an assembly from a URI, call LoadFrom.
To load an assembly from a byte array, call Load.
NOTE
You can see what assemblies are currently loaded in memory by calling AppDomain’s GetAssemblies method:
foreach (Assembly a in
AppDomain.CurrentDomain.GetAssemblies())
{
Console.WriteLine (a.Location); // File path
Console.WriteLine (a.CodeBase); // URI
Console.WriteLine (a.GetName().Name); // Simple name
}
Loading from a filename
LoadFrom and LoadFile can both load an assembly from a filename. They differ in two ways. First, if an assembly with the same identity has already been loaded into memory from another location, LoadFrom gives you the previous copy:
Assembly a1 = Assembly.LoadFrom (@"c:\temp1\lib.dll");
Assembly a2 = Assembly.LoadFrom (@"c:\temp2\lib.dll");
Console.WriteLine (a1 == a2); // true
LoadFile gives you a fresh copy:
Assembly a1 = Assembly.LoadFile (@"c:\temp1\lib.dll");
Assembly a2 = Assembly.LoadFile (@"c:\temp2\lib.dll");
Console.WriteLine (a1 == a2); // false
If you load twice from an identical location, however, both methods give you the previously cached copy. (In contrast, loading an assembly twice from an identical byte array gives you two distinct Assembly objects.)
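The byte-array case can be checked with a small experiment. This sketch assumes the program runs from an ordinary file on disk (so that Location is non-empty) and, purely for convenience, loads the image of the executing assembly itself; the helper name LoadsAreDistinct is hypothetical:

```csharp
using System;
using System.IO;
using System.Reflection;

static class ByteArrayLoadDemo
{
    // Loads the same binary image twice and reports whether the CLR
    // handed back two different Assembly objects.
    public static bool LoadsAreDistinct (string assemblyPath)
    {
        byte[] image = File.ReadAllBytes (assemblyPath);
        Assembly a1 = Assembly.Load (image);
        Assembly a2 = Assembly.Load (image);
        return a1 != a2;
    }

    static void Main()
    {
        string path = Assembly.GetExecutingAssembly().Location;

        // Per the rule above, this is expected to report two distinct objects.
        Console.WriteLine (LoadsAreDistinct (path));
    }
}
```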
WARNING
Types from two identical assemblies in memory are incompatible. This is the primary reason to avoid loading duplicate assemblies, and hence a reason to favor LoadFrom over LoadFile.
The second difference between LoadFrom and LoadFile is that LoadFrom hints the CLR as to the location of onward references, whereas LoadFile does not. To illustrate, suppose your application in \folder1 loads an assembly in \folder2 called TestLib.dll, which references \folder2\Another.dll:
\folder1\MyApplication.exe
\folder2\TestLib.dll
\folder2\Another.dll
If you load TestLib with LoadFrom, the CLR will find and load Another.dll.
If you load TestLib with LoadFile, the CLR will be unable to find Another.dll and will throw an exception—unless you also handle the AssemblyResolve event.
In following sections, we demonstrate these methods in the context of some practical applications.
Statically referenced types and LoadFrom/LoadFile
When you refer to a type directly in your code, you’re statically referencing that type. The compiler bakes a reference to that type into the assembly being compiled, as well as the name of the assembly containing the type in question (but not any information on where to find it at runtime).
For instance, suppose there’s a type called Foo in an assembly called foo.dll and your application bar.exe includes the following code:
var foo = new Foo();
The bar.exe application statically references the Foo type in the foo assembly. We could instead dynamically load foo as follows:
Type t = Assembly.LoadFrom (@"d:\temp\foo.dll").GetType ("Foo");
var foo = Activator.CreateInstance (t);
If you mix the two approaches, you will usually end up with two copies of the assembly in memory, because the CLR considers each to be a different “resolution context.”
We said previously that when resolving static references, the CLR looks first in the GAC, then in the probing path (normally the application base directory), and then fires the AssemblyResolve event as a last resort. Before any of this, though, it checks whether the assembly has already been loaded. However, it considers only assemblies that have either:
§ Been loaded from a path that it would otherwise have found on its own (probing path)
§ Been loaded in response to the AssemblyResolve event
Hence, if you’ve already loaded it from an unprobed path via LoadFrom or LoadFile, you’ll end up with two copies of the assembly in memory (with incompatible types). To avoid this, you must be careful, when calling LoadFrom/LoadFile, to first check whether the assembly exists in the application base directory (unless you want to load multiple versions of an assembly).
Loading in response to the AssemblyResolve event is immune to this problem (whether you use LoadFrom, LoadFile—or load from a byte array as we’ll see later), because the event fires only for assemblies outside the probing path.
NOTE
Whether you use LoadFrom or LoadFile, the CLR always looks first for the requested assembly in the GAC. You can bypass the GAC with ReflectionOnlyLoadFrom (which loads the assembly into a reflection-only context). Even loading from a byte array doesn’t bypass the GAC, although it gets around the problem of locking assembly files:
byte[] image = File.ReadAllBytes (assemblyPath);
Assembly a = Assembly.Load (image);
If you do this, you must handle the AppDomain’s AssemblyResolve event in order to resolve any assemblies that the loaded assembly itself references, and keep track of all loaded assemblies (see Packing a Single-File Executable).
Location versus CodeBase
An Assembly’s Location property usually returns its physical location in the file system (if it has one). The CodeBase property mirrors this in URI form except in special cases, such as if loaded from the Internet, where CodeBase is the Internet URI and Location is the temporary path to which it was downloaded. Another special case is with shadow copied assemblies, where Location is blank and CodeBase is its unshadowed location. ASP.NET and the popular NUnit testing framework employ shadow copying to allow assemblies to be updated while the website or unit tests are running (for the MSDN reference, go to http://albahari.com/shadowcopy). LINQPad does something similar when you reference custom assemblies.
Hence relying solely on Location is dangerous if you’re looking for an assembly’s location on disk. The better approach is to check both properties. The following method returns an assembly’s containing folder (or null if it cannot be determined):
public static string GetAssemblyFolder (Assembly a)
{
try
{
if (!string.IsNullOrEmpty (a.Location))
return Path.GetDirectoryName (a.Location);
if (string.IsNullOrEmpty (a.CodeBase)) return null;
var uri = new Uri (a.CodeBase);
if (!uri.IsFile) return null;
return Path.GetDirectoryName (uri.LocalPath);
}
catch (NotSupportedException)
{
return null; // Dynamic assembly generated with Reflection.Emit
}
}
Note that because CodeBase returns a URI, we use the Uri class to obtain its local file path.
Deploying Assemblies Outside the Base Folder
Sometimes you might choose to deploy assemblies to locations other than the application base directory, for instance:
..\MyProgram\Main.exe
..\MyProgram\Libs\V1.23\GameLogic.dll
..\MyProgram\Libs\V1.23\3DEngine.dll
..\MyProgram\Terrain\Map.dll
..\Common\TimingController.dll
To make this work, you must assist the CLR in finding the assemblies outside the base folder. The easiest solution is to handle the AssemblyResolve event.
In the following example, we assume all additional assemblies are located in c:\ExtraAssemblies:
using System;
using System.IO;
using System.Reflection;
class Loader
{
static void Main()
{
AppDomain.CurrentDomain.AssemblyResolve += FindAssembly;
// We must switch to another class before attempting to use
// any of the types in c:\ExtraAssemblies:
Program.Go();
}
static Assembly FindAssembly (object sender, ResolveEventArgs args)
{
string simpleName = new AssemblyName (args.Name).Name;
string path = @"c:\ExtraAssemblies\" + simpleName + ".dll";
if (!File.Exists (path)) return null; // Sanity check
return Assembly.LoadFrom (path); // Load it up!
}
}
class Program
{
internal static void Go()
{
// Now we can reference types defined in c:\ExtraAssemblies
}
}
WARNING
It’s vitally important in this example not to reference types in c:\ExtraAssemblies directly from the Loader class (e.g., as fields), because the CLR would then attempt to resolve the type before hitting Main().
In this example, we could use either LoadFrom or LoadFile. In either case, the CLR verifies that the assembly that we hand it has the exact identity it requested. This maintains the integrity of strongly named references.
In Chapter 24, we describe another approach that can be used when creating new application domains. This involves setting the application domain’s PrivateBinPath to include the directories containing the additional assemblies—extending the standard assembly probing locations. A limitation of this is that the additional directories must all be below the application base directory.
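As a brief preview of that approach, the following sketch builds an AppDomainSetup whose PrivateBinPath lists two subfolders of the application base directory, separated by semicolons. The folder names come from the layout shown earlier; the class name, helper name, and domain name are illustrative, and the code assumes the full .NET Framework, where AppDomain.CreateDomain is supported:

```csharp
using System;

static class PrivateBinPathDemo
{
    // Builds setup information that adds the given subfolders (relative to
    // the application base directory) to the probing path.
    public static AppDomainSetup BuildSetup (string baseDirectory,
                                             params string[] subfolders)
    {
        return new AppDomainSetup
        {
            ApplicationBase = baseDirectory,
            PrivateBinPath = string.Join (";", subfolders)
        };
    }

    static void Main()
    {
        AppDomainSetup setup = BuildSetup (
            AppDomain.CurrentDomain.BaseDirectory, @"Libs\V1.23", "Terrain");

        AppDomain domain = AppDomain.CreateDomain ("Extra probing", null, setup);
        Console.WriteLine (domain.SetupInformation.PrivateBinPath);
        AppDomain.Unload (domain);
    }
}
```

Note that ..\Common from the earlier layout cannot be added this way, because it lies outside the application base directory.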
Packing a Single-File Executable
Suppose you’ve written an application comprising 10 assemblies: 1 main executable file, plus 9 DLLs. Although such granularity can be great for design and debugging, it’s also good to be able to pack the whole thing into a single “click and run” executable—without demanding the user perform some setup or file extraction ritual. You can accomplish this by including the compiled assembly DLLs in the main executable project as embedded resources, and then writing an AssemblyResolve event handler to load their binary images on demand. Here’s how it’s done:
using System;
using System.IO;
using System.Reflection;
using System.Collections.Generic;
public class Loader
{
  static Dictionary <string, Assembly> _libs
    = new Dictionary <string, Assembly>();
  static void Main()
  {
    AppDomain.CurrentDomain.AssemblyResolve += FindAssembly;
    Program.Go();
  }
  static Assembly FindAssembly (object sender, ResolveEventArgs args)
  {
    string shortName = new AssemblyName (args.Name).Name;
    if (_libs.ContainsKey (shortName)) return _libs [shortName];
    using (Stream s = Assembly.GetExecutingAssembly().
           GetManifestResourceStream ("Libs." + shortName + ".dll"))
    {
      // No such resource: let the CLR carry on with its normal failure handling
      if (s == null) return null;
      byte[] data = new BinaryReader (s).ReadBytes ((int) s.Length);
      Assembly a = Assembly.Load (data);
      _libs [shortName] = a;
      return a;
    }
  }
}
public class Program
{
  public static void Go()
  {
    // Run main program...
  }
}
Because the Loader class is defined in the main executable, the call to Assembly.GetExecutingAssembly always returns the main executable assembly, where we’ve included the compiled DLLs as embedded resources. In this example, we prefix the name of each embedded resource assembly with "Libs.". If you build with the Visual Studio IDE, change "Libs." to the project’s default namespace (go to Project Properties→Application). You also need to set the “Build Action” IDE property to “Embedded Resource” for each DLL file included in the main project.
The reason for caching requested assemblies in a dictionary is to ensure that if the CLR requests the same assembly again, we return exactly the same object. Otherwise, an assembly’s types will be incompatible with those loaded previously (despite their binary images being identical).
A variation of this would be to compress the referenced assemblies at compilation, then decompress them in FindAssembly using a DeflateStream.
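As a rough sketch of that variation, the following helper pairs a build-time compression step with the matching run-time decompression. The LibPacker class and its method names are our own illustration rather than part of the original example. FindAssembly would pass the result of Decompress to Assembly.Load in place of the bytes it currently reads with BinaryReader:

```csharp
using System.IO;
using System.IO.Compression;

public static class LibPacker   // Hypothetical helper, for illustration only
{
  // Build-time step: compress an assembly's raw bytes before embedding them.
  public static byte[] Compress (byte[] raw)
  {
    using (var output = new MemoryStream())
    {
      using (var deflate = new DeflateStream (output, CompressionMode.Compress))
        deflate.Write (raw, 0, raw.Length);
      // Disposing the DeflateStream flushes it; ToArray still works afterward.
      return output.ToArray();
    }
  }

  // Run-time step: inflate an embedded resource stream back into the
  // binary image that Assembly.Load expects.
  public static byte[] Decompress (Stream compressed)
  {
    using (var inflate = new DeflateStream (compressed, CompressionMode.Decompress))
    using (var output = new MemoryStream())
    {
      inflate.CopyTo (output);
      return output.ToArray();
    }
  }
}
```

We copy into a MemoryStream because a DeflateStream cannot report its decompressed length up front, so the ReadBytes ((int) s.Length) idiom from the original handler no longer applies.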
Selective Patching
Suppose in this example that we want the executable to be able to update itself autonomously, perhaps from a network server or website. Directly patching the executable would be awkward and dangerous, and the required file I/O permissions may not be forthcoming (if the application is installed in Program Files, for instance). An excellent workaround is to download any updated libraries to isolated storage (each as a separate DLL) and then modify the FindAssembly method such that it first checks for the presence of a library in its isolated storage area before loading it from a resource in the executable. This leaves the original executable untouched and avoids leaving any unpleasant residue on the user’s computer. Security is not compromised if your assemblies are strongly named (assuming they were referenced in compilation), and if something goes wrong, the application can always revert to its original state simply by deleting all files in its isolated storage.
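One way the modified lookup might look is sketched here. The PatchLoader class, its method names, and the choice of a per-user, per-assembly store are assumptions made for illustration. The "Libs." prefix follows the preceding example:

```csharp
using System.IO;
using System.IO.IsolatedStorage;
using System.Reflection;

public static class PatchLoader   // Hypothetical helper, for illustration only
{
  // Returns the binary image for a library, preferring a patched copy in
  // isolated storage over the copy embedded in the executable.
  // Returns null if neither exists.
  public static byte[] GetLibraryImage (string shortName)
  {
    string fileName = shortName + ".dll";

    using (IsolatedStorageFile store = IsolatedStorageFile.GetUserStoreForAssembly())
    {
      if (store.FileExists (fileName))
        using (Stream patched = store.OpenFile (fileName, FileMode.Open, FileAccess.Read))
          return ReadAll (patched);
    }

    using (Stream embedded = Assembly.GetExecutingAssembly()
                                     .GetManifestResourceStream ("Libs." + fileName))
      return embedded == null ? null : ReadAll (embedded);
  }

  static byte[] ReadAll (Stream s)
  {
    using (var buffer = new MemoryStream())
    {
      s.CopyTo (buffer);
      return buffer.ToArray();
    }
  }
}
```

FindAssembly would then call Assembly.Load on the returned image (when it is not null) and cache the result in its dictionary exactly as before.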
Working with Unreferenced Assemblies
Sometimes it’s useful to explicitly load .NET assemblies that may not have been referenced in compilation.
If the assembly in question is an executable and you simply want to run it, calling ExecuteAssembly on the current application domain does the job. ExecuteAssembly loads the executable using LoadFrom semantics, and then calls its entry method with optional command-line arguments. For instance:
string dir = AppDomain.CurrentDomain.BaseDirectory;
AppDomain.CurrentDomain.ExecuteAssembly (Path.Combine (dir, "test.exe"));
ExecuteAssembly works synchronously, meaning the calling method is blocked until the called assembly exits. To work asynchronously, you must call ExecuteAssembly on another thread or task (see Chapter 14).
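A minimal sketch of the task-based option follows, assuming Task.Run from Framework 4.5. BackgroundRunner and RunAsync are names we made up for the illustration:

```csharp
using System;
using System.Threading.Tasks;

public static class BackgroundRunner   // Hypothetical helper, for illustration only
{
  // Runs the executable on a pooled thread so that the caller is not blocked.
  // The task's result is the value returned by the program's entry method.
  public static Task<int> RunAsync (string exePath, params string[] args)
  {
    return Task.Run (() => AppDomain.CurrentDomain.ExecuteAssembly (exePath, args));
  }
}
```

Any exception thrown while loading or running the executable (a missing file, for instance) surfaces when you wait on or await the returned task.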
In most cases, though, the assembly you’ll want to load is a library. The approach then is to call LoadFrom, and then use reflection to work with the assembly’s types. For example:
string ourDir = AppDomain.CurrentDomain.BaseDirectory;
string plugInDir = Path.Combine (ourDir, "plugins");
Assembly a = Assembly.LoadFrom (Path.Combine (plugInDir, "widget.dll"));
Type t = a.GetType ("Namespace.TypeName");
object widget = Activator.CreateInstance (t); // (See Chapter 19)
...
We used LoadFrom rather than LoadFile so that any private assemblies that widget.dll references in the same folder can also be found and loaded. We then retrieved a type from the assembly by name and instantiated it.
The next step could be to use reflection to dynamically call methods and properties on widget; we describe how to do this in the following chapter. An easier—and faster—approach is to cast the object to a type that both assemblies understand. This is often an interface defined in a common assembly:
public interface IPluggable
{
  void ShowAboutBox();
  ...
}
This allows us to do this:
Type t = a.GetType ("Namespace.TypeName");
IPluggable widget = (IPluggable) Activator.CreateInstance (t);
widget.ShowAboutBox();
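If the type name isn't known in advance, one option is to scan the assembly for types that implement the shared interface and instantiate only those. The following is a sketch under our own assumptions: PlugInFactory and SampleWidget are hypothetical, and IPluggable is a cut-down stand-in for the interface above:

```csharp
using System;
using System.Linq;
using System.Reflection;

public interface IPluggable { void ShowAboutBox(); }   // Cut-down stand-in

public class SampleWidget : IPluggable                 // Hypothetical plug-in type
{
  public static int TimesShown;
  public void ShowAboutBox() { TimesShown++; }
}

public static class PlugInFactory                      // Hypothetical helper
{
  // Instantiates every concrete class in the assembly that implements
  // IPluggable and has a public parameterless constructor.
  public static IPluggable[] CreateAll (Assembly a)
  {
    return a.GetTypes()
            .Where (t => typeof (IPluggable).IsAssignableFrom (t)
                      && t.IsClass && !t.IsAbstract
                      && t.GetConstructor (Type.EmptyTypes) != null)
            .Select (t => (IPluggable) Activator.CreateInstance (t))
            .ToArray();
  }
}
```

Checking IsAssignableFrom before instantiating means that the cast cannot fail, and types that have nothing to do with the plug-in contract are never constructed.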
You can use a similar system for dynamically publishing services in a WCF or Remoting server. The following assumes that the filenames of the libraries we want to expose end in “Server.dll”:
using System.IO;
using System.Reflection;
...
string dir = AppDomain.CurrentDomain.BaseDirectory;
foreach (string assFile in Directory.GetFiles (dir, "*Server.dll"))
{
  Assembly a = Assembly.LoadFrom (assFile);
  foreach (Type t in a.GetTypes())
    if (typeof (MyBaseServerType).IsAssignableFrom (t))
    {
      // Expose type t
    }
}
This does make it very easy, though, for someone to add rogue assemblies, maybe even accidentally! Assuming no compile-time references, the CLR has nothing against which to check an assembly’s identity. If everything that you load is signed with a known public key, the solution is to check that key explicitly. In the following example, we assume that all libraries are signed with the same key pair as the executing assembly:
byte[] ourPK = Assembly.GetExecutingAssembly().GetName().GetPublicKey();
foreach (string assFile in Directory.GetFiles (dir, "*Server.dll"))
{
  byte[] targetPK = AssemblyName.GetAssemblyName (assFile).GetPublicKey();
  if (Enumerable.SequenceEqual (ourPK, targetPK))
  {
    Assembly a = Assembly.LoadFrom (assFile);
    ...
Notice how AssemblyName allows you to check the public key before loading the assembly. To compare the byte arrays, we used LINQ’s SequenceEqual method (System.Linq).
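The comparison can be wrapped in a small helper that also rejects missing or empty keys. Without that guard, an unsigned executable could end up "matching" any unsigned library. This is a sketch only. KeyCheck and its method names are ours, and treating an unreadable file as a mismatch is an assumption about the desired policy:

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class KeyCheck   // Hypothetical helper, for illustration only
{
  // True only if both keys are present, non-empty, and byte-for-byte equal.
  public static bool SameKey (byte[] ourPK, byte[] targetPK)
  {
    if (ourPK == null || targetPK == null) return false;
    if (ourPK.Length == 0) return false;
    return ourPK.SequenceEqual (targetPK);
  }

  // Reads the public key from the file's manifest without loading the assembly.
  public static bool IsSignedWithKey (string assemblyFile, byte[] ourPK)
  {
    try
    {
      byte[] targetPK = AssemblyName.GetAssemblyName (assemblyFile).GetPublicKey();
      return SameKey (ourPK, targetPK);
    }
    catch (BadImageFormatException) { return false; }   // Not a valid assembly
  }
}
```

In the loop above, the test would then become if (KeyCheck.IsSignedWithKey (assFile, ourPK)).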