A first look at C++/CLI
A brief look at the new C++/CLI syntax and how it improves over the old MC++ syntax
Introduction
When Microsoft brought out the Managed Extensions to C++ with VS.NET 7, C++
programmers accepted it with mixed reactions. While most people were happy that
they could continue using C++, nearly everyone was unhappy with the ugly and
twisted syntax offered by Managed C++. Microsoft obviously took the feedback it
got very seriously and decided that the MC++ syntax wasn't going to be much
of a success.
On October 6th 2003, ECMA announced the creation of a new task group to oversee
development of a standard set of language extensions that bind
the ISO standard C++ programming language to the Common Language Infrastructure
(CLI). It was also announced that this new set of language extensions would be
known as the C++/CLI standard, and that the VC++ compiler would support it
starting with the Whidbey release (VS.NET 2005).
Problems with the old syntax
- Ugly and twisted syntax and grammar - All those double underscores weren't
exactly pleasing to the eye.
- Second class CLI support - Compared to C# and VB.NET, MC++ used contorted
workarounds to provide CLI support; for example, it didn't have a for-each construct
to enumerate .NET collections.
- Poor integration of C++ and .NET - You couldn’t use C++ features like
templates on CLI types and you couldn’t use CLI features like garbage collection
on C++ types.
- Confusing pointer usage - Both unmanaged C++ pointers and managed reference
pointers used the same *-based syntax, which was quite confusing
because __gc pointers were totally different in nature and behavior
from unmanaged pointers.
- The MC++ compiler could not produce verifiable code.
What does C++/CLI give us?
- Elegant syntax and grammar - This gives a natural feel to C++ developers
writing managed code and allows a smooth transition from unmanaged coding to
managed coding. All those ugly double underscores are gone now.
- First class CLI support - CLI features like properties, garbage collection
and generics are supported directly. And what's more, C++/CLI allows us to use
these features on native unmanaged classes too.
- First class C++ support - C++ features like templates and deterministic
destructors work on both managed and unmanaged classes. In fact C++/CLI is the
only .NET language where you can *seemingly* declare a .NET type on the stack or
on the native C++ heap.
- Bridges the gap between .NET and C++ - C++ programmers won't feel like
fish out of water when they attack the BCL.
- The executable generated by the C++/CLI compiler is now fully verifiable.
Hello World
using namespace System;
void _tmain()
{
Console::WriteLine("Hello World");
}
Well, that doesn't look a lot different from the old syntax, except that now you
don't need to add a reference to mscorlib.dll, because the Whidbey
compiler implicitly references it whenever you compile with /clr (which
now defaults to /clr:newSyntax).
Handles
One major source of confusion in the old syntax was that we used the * punctuator
for both unmanaged pointers and managed references. In C++/CLI, Microsoft introduces
the concept of handles.
void _tmain()
{
//The ^ punctuator represents a handle
String^ str = "Hello World";
Console::WriteLine(str);
}
The ^ punctuator (pronounced "cap") represents a handle to a managed object.
According to the CLI specification, a handle is a managed object reference.
Handles are the new-syntax equivalent of __gc pointers in the MC++
syntax. Handles are not to be confused with pointers; the two are totally
different in nature.
How do handles differ from pointers?
- Pointers are denoted using the * punctuator while handles are
denoted using the ^ punctuator.
- Handles are managed references to objects on the managed heap; pointers just
point to a memory address.
- Pointers are stable and GC cycles do not affect them; handles might end up
pointing to different memory locations after GC cycles and memory compactions.
- For pointers, the programmer must delete explicitly or else
suffer a leak. For handles, delete is optional.
- Handles are type-safe while pointers are most definitely not. You cannot
cast a handle to a void^.
- Just as new returns a pointer, gcnew returns a handle.
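The pointer half of this comparison can be tried out in plain ISO C++, with no CLR involved. The sketch below is only an illustration: the Tracked type and the liveCountAroundDelete function are made-up names, not part of any library, and the handle half is mentioned only in comments because it needs the C++/CLI compiler.

```cpp
#include <cassert>
#include <utility>

// Hypothetical native type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocates one Tracked with new, frees it with delete, and returns the
// live-instance count seen before and after the delete.
std::pair<int, int> liveCountAroundDelete() {
    Tracked* p = new Tracked();   // new returns a native pointer
    int before = Tracked::live;   // the object stays at this address;
                                  // no garbage collector will move it
    delete p;                     // explicit delete; omitting it would leak
    int after = Tracked::live;
    return {before, after};
    // With a handle (C++/CLI only, not compiled here) the allocation would
    // use gcnew and the delete would be optional.
}
```

Calling liveCountAroundDelete() should yield the pair (1, 0): one live object before the delete and none after it.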
Instantiating CLR objects
void _tmain()
{
String^ str = gcnew String("Hello World");
Object^ o1 = gcnew Object();
Console::WriteLine(str);
}
The gcnew keyword is used to instantiate CLR objects and it
returns a handle to the object on the CLR heap. The good thing about
gcnew is that it allows us to easily differentiate between managed
and unmanaged instantiations.
Basically, the gcnew keyword and the ^ punctuator
offer just about everything you need to access the BCL. But obviously you'd also need
to create and declare your own managed classes and interfaces.
Declaring types
Each CLR type declaration is prefixed with an adjective that describes what sort
of type it is. The following are examples of type declarations in C++/CLI:
- CLR types
  - Reference types
    ref class RefClass{...};
    ref struct RefClass{...};
  - Value types
    value class ValClass{...};
    value struct ValClass{...};
  - Interfaces
    interface class IType{...};
    interface struct IType{...};
  - Enumerations
    enum class Color{...};
    enum struct Color{...};
- Native types
  class Native{...};
  struct Native{...};
using namespace System;
interface class IDog
{
void Bark();
};
ref class Dog : IDog
{
public:
void Bark()
{
Console::WriteLine("Bow wow wow");
}
};
void _tmain()
{
Dog^ d = gcnew Dog();
d->Bark();
}
There, the syntax is now so much neater to look at than the old syntax,
where the above code would have been strewn with double-underscored keywords
like __gc and __interface.
Boxing/Unboxing
Boxing is implicit (yaay!) and type-safe. A bit-wise copy is performed and an
Object is created on the CLR heap. Unboxing is explicit - just use a
safe_cast back to the value type.
void _tmain()
{
int z = 44;
Object^ o = z; //implicit boxing
int y = safe_cast<int>(o); //unboxing, checked at run time
Console::WriteLine("{0} {1} {2}",o,z,y);
z = 66;
Console::WriteLine("{0} {1} {2}",o,z,y);
}
// Output
// 44 44 44
// 44 66 44
The Object o is a boxed copy and does not actually
refer to the original int variable, which is obvious from the output of the
second Console::WriteLine.
When you box a value type, the returned object remembers the original value
type.
void _tmain()
{
int z = 44;
float f = 33.567f;
Object^ o1 = z;
Object^ o2 = f;
Console::WriteLine(o1->GetType());
Console::WriteLine(o2->GetType());
}
// Output
// System.Int32
// System.Single
Thus you cannot unbox to a different type.
void _tmain()
{
int z = 44;
float f = 33.567f;
Object^ o1 = z;
Object^ o2 = f;
//safe_cast checks the boxed type at run time
int y = safe_cast<int>(o2);//System.InvalidCastException
float g = safe_cast<float>(o1);//System.InvalidCastException
}
If you do attempt to do so, you'll get a
System.InvalidCastException. Talk about perfect type-safety! If you
look at the IL generated, you'll see the MSIL box instruction in action. For
example:
void Box2()
{
float y=45;
Object^ o1 = y;
}
gets compiled to:
.maxstack 1
.locals (float32 V_0, object V_1)
ldnull
stloc.1
ldc.r4 45.
stloc.0
ldloc.0
box [mscorlib]System.Single
stloc.1
ret
According to the MSIL docs, "The box instruction converts the ‘raw’
valueType (an unboxed value type) into an instance of type Object (of type O).
This is accomplished by creating a new object and copying the data from
valueType into the newly allocated object."
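For readers coming from native C++, the three boxing behaviours described in this section (the box holds a copy, it remembers the original type, and unboxing to a different type is rejected) have a rough analogue in ISO C++17's std::any. This is only an analogy and not how the CLR implements boxing; the function names below are made up for the illustration.

```cpp
#include <any>
#include <cassert>
#include <typeinfo>

// Mirrors the first boxing example: wrap 44, change the original, then
// read the wrapped value back. The wrapper holds its own copy.
int unboxedAfterChange() {
    int z = 44;
    std::any o = z;                 // copies z into the wrapper
    z = 66;                         // does not affect the copy inside o
    return std::any_cast<int>(o);   // explicit, type-checked extraction
}

// The wrapper remembers the type it was created from.
bool remembersFloat() {
    std::any o = 33.567f;
    return o.type() == typeid(float);
}

// Extracting a different type is rejected with an exception, comparable
// in spirit to System.InvalidCastException.
bool mismatchedUnboxThrows() {
    std::any o = 33.567f;
    try {
        (void)std::any_cast<int>(o);
    } catch (const std::bad_any_cast&) {
        return true;
    }
    return false;
}
```

Here unboxedAfterChange() returns 44 even though z was changed to 66, matching the second line of output shown earlier, and the other two functions both return true.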
Conclusion
Alright, so why would anyone want to use C++/CLI when they can use C#, J# and
that VB thingie for writing .NET code? Here are the four reasons I gave during
my talk at DevCon 2003 in Trivandrum (Dec 2003).
- Compile existing C++ code to IL (/clr magic)
- Deterministic destruction
- Native interop support that outmatches anything other CLI languages can
offer
- All those underscores in MC++ are gone ;-)
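Of the reasons above, deterministic destruction is the easiest to show in plain ISO C++. The sketch below uses a hypothetical Resource class and a hypothetical runScope function, written only for this illustration, to record the order in which things happen when an object leaves scope. As noted earlier, C++/CLI is meant to make the same destructor behaviour available to managed classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical resource wrapper that records its lifetime events in a log.
class Resource {
public:
    explicit Resource(std::vector<std::string>& log) : log_(log) {
        log_.push_back("acquire");
    }
    ~Resource() { log_.push_back("release"); }
private:
    std::vector<std::string>& log_;
};

// Creates a Resource in an inner scope and returns the recorded events.
std::vector<std::string> runScope() {
    std::vector<std::string> log;
    {
        Resource r(log);
        log.push_back("use");
    }   // r's destructor runs exactly here, not at some later GC cycle
    log.push_back("after scope");
    return log;
}
```

The log returned by runScope() reads acquire, use, release, after scope: the release happens at the closing brace, before any code that follows the scope.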