I have seen many people who want to start writing COM code but don't know where to start.  It's not that COM is under-documented; the problem is exactly the opposite: there are so many resources available that it's easy to get lost along the way, still wondering how to create a simple COM object and do something with it.  In this article I try to explain how you can write client COM code, and I'll stay away from the internal details as much as possible.  If you need more details on a specific topic, you can either consult the MSDN documentation or send me an email.

First, I try to give you some general knowledge of COM, just enough details to get you going:

What is COM?

COM is the abbreviation for Component Object Model.  COM is a binary standard issued by Microsoft, and as long as an object sticks to the COM specification for its binary layout in memory, it can be called from any language which supports COM.

Suppose you have a piece of C++ code, and you want to use it in JavaScript.  Without COM, you only have one way to do this: rewrite the whole code.  Note that I'm not saying COM is merely the easiest way; there simply is no other way to do this.  The reason is that an object in C++ is not the same thing as an object in JavaScript, so JavaScript has no knowledge of how to create your C++ object and use it.  Sadly, even rewriting the code is usually not an option, because the limitations of the JavaScript language don't give you enough power to translate a C++ object into JavaScript.

Using COM, you write a C++ object that only has to follow some basic rules, and everything else is on the compiler's shoulders to create a binary module (DLL or EXE) compatible with the COM specification.  Then you can easily use that COM object from JavaScript.  But I'm not going to discuss how to write a COM object in this article, so I'll stop this topic now.

What is client COM?

Client COM is simply code which uses a pre-written (and possibly 3rd party) COM object.  Don't be misled by the “client” term here.  It has nothing to do with the usual notion you might have of client/server applications which communicate over some kind of network.  Any code which uses a COM object is called client COM code, and likewise any code which actually implements a COM object is called COM server code.  In this article, we'll only cover client COM, so you will be able to use a COM object after finishing this article, but not to write one.

What is a COM object?

A COM object is a piece of binary data which follows the COM specification.  Logically, but not technically, a COM object is just like a C++ object: it has methods and properties used to manipulate the object.  But note that a COM object is not the same thing as a C++ object; they're different beasts. The client of the COM object does not care at all about how the object is implemented.  In fact, they can't even see inside of an object.  All they know of the object is what the object itself claims to be, which is the COM interface of the object.

What is a COM interface?

A COM interface is the layout of a COM object, which shows the methods and properties the object supports.  I want to make it clear here that methods and properties in COM are different to what we already know of them in C++.  A method is a function related to a COM interface.  A property is a special kind of method which can be called with a simpler syntax from some languages (such as Visual Basic).  A property usually consists of a “get” and a “put” method (unless it's read-only, in which case it does not have the “put” method; or it's write-only, in which case it does not have the “get” method).  When you want to query a property's value, you call the “get” function, and when you want to change the value of the property, you call the “put” function.  More on this soon.

OK, back to the interfaces.  An interface shows what functions an object has.  Unlike a C++ class, a COM interface cannot have data members.  Data members are implemented using properties in COM, which I will explain later.  In C++, a COM interface is equivalent to an abstract class (a class with no data members, and with only pure virtual functions).  As you know, abstract classes show nothing about the implementation, and that is what I said above.  In COM, you don't care about how an object is implemented (or even which language it's written in); you just use it.  You don't know anything about a COM object except the set of interfaces which it implements.  Interfaces are the so-called “contracts” that a COM object accepts and based upon which it works with its clients.

COM interfaces can themselves be derived from other COM interfaces, but there is no multiple inheritance in COM interfaces.  A COM class can support multiple interfaces.  Don't worry if it all seems weird, you'll learn all these points soon.

The mother interface, IUnknown

All the COM interfaces have to be derived from a pre-defined interface, called IUnknown.  IUnknown is the root for all the interface hierarchies.  Many interfaces derive from IUnknown directly, and many others derive from other interfaces which in turn derive from IUnknown.  IUnknown has only three functions, and every COM interface has to support these functions.  Here is how IUnknown is defined in C++ (its actual definition is in unknwn.h, but the below is the meat of it):

#define interface struct  
  
interface IUnknown  
{  
    virtual HRESULT __stdcall QueryInterface( const IID & riid,  
      void ** ppv ) = 0;  
    virtual ULONG __stdcall AddRef( void ) = 0;  
    virtual ULONG __stdcall Release( void ) = 0;  
};

I will describe the QueryInterface( ) function here, and will discuss AddRef( ) and Release( ) later.  Suppose somehow (I'll show you how later) you obtain an IUnknown pointer to a COM object.  Suppose this object supports the IAnotherIFace interface, and you want to call a method on this object's IAnotherIFace interface.  Obviously you should first obtain a pointer to the IAnotherIFace interface.  You do this using QueryInterface( ).  QueryInterface( ) takes two parameters, an IID and a void **.  The IID parameter is the identity of the COM interface.  The void ** parameter is a pointer to the interface pointer variable you have already allocated.  If QueryInterface( ) succeeds, it returns the interface pointer in *ppv.  Now let's see what the identity is.

COM identifies everything using GUIDs.  A GUID (Globally Unique Identifier, sometimes referred to as a UUID, Universally Unique Identifier) is a 128-bit number which is, for all practical purposes, unique (i.e. no one else will be using the same GUID as you decide to use).  Its uniqueness rests on the extremely small chance of generating two identical GUIDs.  I won't discuss what factors are used to make duplicate GUIDs as rare as possible, but you can take my word that any GUID the system creates for you can be treated as unique (by the way, to create a GUID, you can call the CoCreateGuid function).  A GUID is defined like this in C++ (see the guiddef.h header file):

typedef struct _GUID {  
    unsigned long  Data1;  
    unsigned short Data2;  
    unsigned short Data3;  
    unsigned char  Data4[ 8 ];  
} GUID;  
typedef GUID UUID;
typedef GUID IID;
typedef GUID CLSID;

But you rarely have to work with GUIDs in that form.  Most of the time, they are shown in a human readable form following the pattern {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}, for example: {E464E05E-C102-499c-9B35-C1D569B065AB}.  An IID (Interface Identifier) and a CLSID (Class Identifier), which we'll cover later, are GUIDs as well.  COM type libraries are also identified by a GUID called a LIBID, but we won't discuss them in this article.  So, in short, each interface has an associated IID which uniquely identifies the interface, and to obtain an interface pointer from another interface pointer, you pass the IID of the desired interface to the QueryInterface( ) function.  Note that the usual naming convention is to name IIDs in the form IID_InterfaceName, so in our example, the IID of IAnotherIFace will be IID_IAnotherIFace.
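As a rough illustration of how the four fields map onto the readable pattern, here is a minimal sketch.  GuidSketch and GuidToString are hypothetical names invented for this example; the struct only mirrors the GUID layout with fixed-width fields so that the code compiles without the Windows headers.  In real code you would more likely call an existing API such as StringFromGUID2.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in mirroring the GUID layout shown above.
struct GuidSketch
{
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[ 8 ];
};

// Formats a GUID as {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} (upper-case hex).
// Data1..Data3 are printed as numbers; Data4 is printed byte by byte,
// with a dash after its first two bytes.
std::string GuidToString( const GuidSketch & g )
{
    char buf[ 39 ]; // 38 characters plus the terminating zero
    int n = std::snprintf( buf, sizeof( buf ),
        "{%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        unsigned( g.Data1 ), unsigned( g.Data2 ), unsigned( g.Data3 ),
        unsigned( g.Data4[ 0 ] ), unsigned( g.Data4[ 1 ] ),
        unsigned( g.Data4[ 2 ] ), unsigned( g.Data4[ 3 ] ),
        unsigned( g.Data4[ 4 ] ), unsigned( g.Data4[ 5 ] ),
        unsigned( g.Data4[ 6 ] ), unsigned( g.Data4[ 7 ] ) );
    assert( n == 38 );
    (void) n; // silence the unused-variable warning in release builds
    return std::string( buf );
}
```

Note that the last two groups of the textual form both come from the Data4 array, which is easy to miss when reading a GUID next to its struct initializer.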

As you see, QueryInterface( ) returns an HRESULT.  In fact, all the methods and properties in COM are required to return HRESULTs (that is, all the methods and properties except IUnknown::AddRef( ) and IUnknown::Release( )).  HRESULT is a typedef for LONG, and is a rich facility for transferring the error/success status from the COM object back to the COM client.  The Platform SDK documentation describes the elements of an HRESULT, so I won't discuss them here.  For now, just get familiar with the SUCCEEDED() and FAILED() macros.  Suppose you have a variable of type HRESULT named hr; then SUCCEEDED(hr) evaluates to TRUE if hr is a success status code, and FAILED(hr) evaluates to TRUE if hr is a failure status code.  You can use the Error Lookup tool which comes with Visual C++ to see what each HRESULT code means.  The most common HRESULT codes are S_OK (which indicates a general success, defined as 0x00000000) and E_FAIL (which indicates a general failure, defined as 0x80004005).
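To make the success/failure test concrete, here is a small portable sketch of what the two macros check.  HResultSketch, Succeeded, Failed, CodeOf and the k-prefixed constants are stand-ins invented for this example (the real definitions live in the Windows headers); the numeric values are the documented ones.  The point to notice is that the test is only a sign check, so S_OK is not the only success code.

```cpp
#include <cstdint>

// Stand-in for HRESULT: a signed 32-bit value.
typedef std::int32_t HResultSketch;

const HResultSketch kS_OK          = 0;
const HResultSketch kS_FALSE       = 1;   // a success code other than S_OK
const HResultSketch kE_NOINTERFACE = static_cast< HResultSketch >( 0x80004002u );
const HResultSketch kE_FAIL        = static_cast< HResultSketch >( 0x80004005u );

// The same tests SUCCEEDED() and FAILED() perform: the top bit of an
// HRESULT is the severity bit, so every failure code is negative.
inline bool Succeeded( HResultSketch hr ) { return hr >= 0; }
inline bool Failed( HResultSketch hr )    { return hr < 0; }

// The low 16 bits carry the status code itself.
inline unsigned CodeOf( HResultSketch hr )
{
    return static_cast< std::uint32_t >( hr ) & 0xFFFFu;
}
```

This is why client code should test results with SUCCEEDED(hr) or FAILED(hr) rather than comparing against S_OK: a method may legitimately return a different success code.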

Using what I described, we can now write our first piece of COM code, which is a call to QueryInterface( ):

// suppose we have somehow obtained pUnk  
void foo(IUnknown * pUnk)  
{  
    IAnotherIFace * pAnother = NULL;  
    HRESULT hr = pUnk->QueryInterface( IID_IAnotherIFace,  
        reinterpret_cast< void ** > (&pAnother) );  
    if (SUCCEEDED(hr))  
    {  
        // succeeded, pAnother is non-NULL  
    }  
    else  
    {  
        // failed, pAnother is NULL  
    }  
}

Now the above code should make sense to you: it takes an IUnknown pointer and obtains an IAnotherIFace interface from the pointer using the QueryInterface( ) function, then checks the result of the QueryInterface( ) call to see if the operation has succeeded or not (the operation would fail when the object does not support the IAnotherIFace interface, in which case it returns the E_NOINTERFACE error code).  The above code is a common piece of code when writing client COM code, so you should keep it in mind.

A small point, as you've noticed, is that all the IUnknown methods have the __stdcall calling convention.  This is true for all the interface methods in COM, so in the rest of this article I assume that __stdcall is the default calling convention, and do not state it explicitly.

COM object creation and destruction

In C++, you create an object using operator new( ), and destroy it using operator delete( ).  But things are different in COM.  In COM, you create an object using the CoCreateInstance function, but never destroy an object.  Before you say that would cause memory leaks, read on to see how things work in COM.

The CoCreateInstance( ) function is prototyped like this in objbase.h:

HRESULT __stdcall CoCreateInstance( const CLSID & rclsid,  
                                   IUnknown * pUnkOuter,  
                                   DWORD dwClsContext,  
                                   const IID & riid,  
                                   void ** ppv );

The first parameter is the CLSID of the object to be created.  A CLSID is a GUID, just like an IID.  Just as an IID identifies an interface, a CLSID uniquely identifies a COM object (referred to as a coclass, short for COM class, which in essence is a COM object).  Again like IIDs, CLSIDs usually follow a naming convention like this: CLSID_coclassName.  CoCreateInstance needs the CLSID to know which object to create.  The second parameter is used for aggregation, which is beyond the scope of this article.  You can always pass NULL for this parameter.  The third parameter identifies the context in which the object is to be created.  What this exactly means is beyond the scope of this article, but you should know that you usually pass CLSCTX_INPROC_SERVER for COM objects which are in DLLs, and CLSCTX_LOCAL_SERVER for COM objects that reside in EXE modules, or alternatively CLSCTX_ALL when you don't care where the COM object resides.  The fourth and fifth parameters are just like the parameters of the IUnknown::QueryInterface( ) function.  In fact, CoCreateInstance internally performs a QueryInterface( ) using these parameters.  When you know what interface the object supports, you can pass the related IID as well as a pointer to an interface pointer of that type for the void ** parameter.  When you are not sure which interface(s) an object supports, you can pass IID_IUnknown and a pointer to an IUnknown *, and later call QueryInterface( ) on the pointer to test the object for the interfaces you're interested in.

So, let's look at how we can create a coclass of type AnotherIFace using its CLSID (CLSID_AnotherIFace).

void bar(void)  
{  
    IUnknown * pUnk = NULL;  
    HRESULT hr = ::CoCreateInstance( CLSID_AnotherIFace,  
        NULL,  
        CLSCTX_INPROC_SERVER,  
        IID_IUnknown,  
        reinterpret_cast< void ** > (&pUnk) );  
    if (SUCCEEDED(hr))  
    {  
        // call our foo method to obtain a IAnotherIFace pointer  
        foo( pUnk );  
    }  
    else  
    {  
        // error  
    }  
}

This function creates an AnotherIFace coclass, gets its IUnknown pointer, and uses the foo( ) function shown earlier to obtain an IAnotherIFace pointer.  Note that you could request the IAnotherIFace pointer directly, so that calling foo( ) wouldn't be necessary:

void bar2(void)  
{  
    IAnotherIFace * pAnother = NULL;  
    HRESULT hr = ::CoCreateInstance( CLSID_AnotherIFace,  
        NULL,  
        CLSCTX_INPROC_SERVER,  
        IID_IAnotherIFace,  
        reinterpret_cast< void ** > (&pAnother) );  
    if (SUCCEEDED(hr))  
    {  
        // now we have an IAnotherIFace pointer  
    }  
    else  
    {  
        // error  
    }  
}

That's how you construct a COM object.  Now let's see how COM objects are destroyed.  As I said above, you can't destroy a COM object directly.  In other words, if you call operator delete( ) on an interface pointer, the result would be disastrous.  But COM objects should have some way to be destroyed, otherwise they'll stay in memory forever.  Fortunately they have such a way, which is implemented by a technique called reference counting.  When an object is being reference counted, it maintains an internal variable keeping track of how many clients are using it.  When a client obtains an interface pointer, the reference count is incremented by one.  When the client no longer needs the object, the reference count is decremented by one.  Because multiple clients can use an object simultaneously, this is a nice way to ensure that the object won't be destroyed too soon, and also won't stay in memory forever.  An object will only be destroyed after its reference count reaches zero.  There are two points to consider about reference counting.  The first one is that the COM object's code is responsible for performing this reference counting, since it doesn't happen automatically, but this won't worry us since we don't care about the object's implementation.  The second point is that the client is responsible for telling the object when it's using the object, and when it's done with the object.  This imposes a burden on us as the COM client programmers.  Let's see how we should respect the rules here.

As you remember, when I was talking about IUnknown, I didn't say anything about its AddRef( ) and Release( ) methods.  These two methods are responsible for performing the reference counting.  The AddRef( ) method increments the internal reference count by one, and the Release( ) method decrements the reference count by one.  If the reference count reaches zero after decrementing in Release( ), the object destroys itself.  The AddRef( ) function returns the new reference count, and the Release( ) function returns the reference count after decrementing (so if Release( ) returns 0, the object is most likely destroyed before Release( ) returns).  But you should never store the return values of AddRef( ) and Release( ) and should never rely on them, because these values might not be the true reference count of the object.
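Purely to make the mechanics concrete, here is a minimal sketch of how an object might keep such a count.  CountedSketch and RunLifetimeDemo are hypothetical names invented for this example; this is not how any particular COM object is implemented, and a real implementation would normally update the count with thread-safe operations, which this sketch does not.

```cpp
// Hypothetical stand-in for a reference-counted object (not a real COM class).
// The bool passed to the constructor lets the caller observe destruction.
class CountedSketch
{
public:
    explicit CountedSketch( bool * pDestroyed )
        : m_refs( 1 ),                  // the creator holds the first reference
          m_pDestroyed( pDestroyed )
    {
    }

    unsigned long AddRef( void )
    {
        return ++m_refs;
    }

    unsigned long Release( void )
    {
        unsigned long refs = --m_refs;
        if (refs == 0)
        {
            *m_pDestroyed = true;
            delete this;                // the object destroys itself
        }
        return refs;                    // copied first: members are gone by now
    }

private:
    ~CountedSketch( void ) {}           // private: clients cannot use delete

    unsigned long m_refs;
    bool *        m_pDestroyed;
};

// Walks through one life cycle.  Returns true if the object survived the
// first Release( ) and was destroyed by the second one.
bool RunLifetimeDemo( void )
{
    bool destroyed = false;
    CountedSketch * p = new CountedSketch( &destroyed );   // count is 1
    CountedSketch * p2 = p;
    p2->AddRef();                                          // count is 2
    p->Release();                                          // count is 1
    bool aliveAfterFirstRelease = !destroyed;
    p2->Release();                                         // count is 0
    return aliveAfterFirstRelease && destroyed;
}
```

The private destructor is a design choice of the sketch: it makes a stray delete on the pointer a compile-time error, which mirrors the point that clients only ever call Release( ).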

A very important thing that most beginners in COM client programming forget is the fact that QueryInterface( ) performs an AddRef( ) on the interface pointer before returning it (if the function succeeds, of course), and expects the caller to call Release( ) when it no longer needs the object.  This implies that the CoCreateInstance function also performs an AddRef( ) on the interface pointer before returning it, because like I said, CoCreateInstance internally calls QueryInterface( ).

As you see, you need to know the reference counting discipline thoroughly to avoid problems in your code.  Throughout the article, I list all the rules that you must follow to write safe COM client code.  Here are the first three:

  1. For each time you call AddRef( ) on an interface pointer, you must call Release( ) once, so that for each AddRef( ) you can easily find a corresponding Release( ).
  2. For each time you call QueryInterface( ) and obtain a new interface pointer, you must call Release( ) once on the resulting interface pointer.
  3. For each time you call the CoCreateInstance( ) function and obtain a new interface pointer, you must call Release( ) once on the resulting interface pointer.

Because of rule number 1, you can't assign interface pointers like normal variables.  See the following code, and try to determine the problem it causes:

void DoSomethingAndRelease(IAnotherIFace * pPtr)  
{  
    // Do something on pPtr  
    pPtr->Release();  
}  
  
void bar3(void)  
{  
    IAnotherIFace * pAnother = NULL;  
    HRESULT hr = ::CoCreateInstance( CLSID_AnotherIFace,  
        NULL,  
        CLSCTX_INPROC_SERVER,  
        IID_IAnotherIFace,  
        reinterpret_cast< void ** > (&pAnother) );  
    if (SUCCEEDED(hr))  
    {  
        IAnotherIFace * pSecondPtr = pAnother;  
        DoSomethingAndRelease( pAnother );  
        DoSomethingAndRelease( pSecondPtr );  
    }  
    else  
    {  
        // error  
    }  
}

Do you see the problem in bar3( )?  The pAnother variable is being assigned to pSecondPtr, but this doesn't increment the reference count on the AnotherIFace object, so the result of executing this code is that AddRef( ) is called only once (inside the CoCreateInstance function) and Release( ) is called twice (in the two DoSomethingAndRelease calls).  Fixing this code is really simple:

void DoSomethingAndRelease(IAnotherIFace * pPtr)  
{  
    // Do something on pPtr  
    pPtr->Release();  
}  
  
void bar3_fix(void)  
{  
    IAnotherIFace * pAnother = NULL;  
    HRESULT hr = ::CoCreateInstance( CLSID_AnotherIFace,  
        NULL,  
        CLSCTX_INPROC_SERVER,  
        IID_IAnotherIFace,  
        reinterpret_cast< void ** > (&pAnother) );  
    if (SUCCEEDED(hr))  
    {  
        IAnotherIFace * pSecondPtr = pAnother;  
        pSecondPtr->AddRef();  
        DoSomethingAndRelease( pAnother );  
        DoSomethingAndRelease( pSecondPtr );  
    }  
    else  
    {  
        // error  
    }  
}

So, a correct assignment in COM is done using a built-in C++ assignment as well as an AddRef( ) call.  This simplifies things a lot: you can call Release( ) on any pointer you no longer need.  This leads us to rules number 4 and 5:

  4. Never assign an interface pointer without calling AddRef( ) on the new pointer.
  5. Call Release( ) on the interface pointers when you no longer need them.

Also look at rule number 6, which is really important:

  6. Never use an interface pointer after you call Release( ) on it.

If you forget about this rule, you may use an object which has already been destroyed, and I'm pretty sure the results will be anything but what you expect.  In order to guarantee that you won't make this mistake, I suggest you set the interface pointer to NULL after releasing it, so that if you accidentally use it later, you get an access violation which reminds you something fishy is going on.  You can write a function like the one below to release an interface pointer:

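// a template, so that a pointer to any interface type can bind to the  
// reference parameter (an IAnotherIFace * cannot bind to IUnknown *&)  
template < typename T >  
inline void Release( T *& pItf )  
{  
    pItf->Release();  
    pItf = NULL;  
}  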
  
// always call the ::Release( ) function instead of  
// calling IUnknown::Release( ) directly:  
  
void bar4(void)  
{  
    IAnotherIFace * pAnother = NULL;  
    HRESULT hr = ::CoCreateInstance( CLSID_AnotherIFace,  
        NULL,  
        CLSCTX_INPROC_SERVER,  
        IID_IAnotherIFace,  
        reinterpret_cast< void ** > (&pAnother) );  
    if (SUCCEEDED(hr))  
    {  
        // do something with it  
        ::Release( pAnother ); // automatically sets pAnother to NULL  
    }  
    else  
    {  
        // error  
    }  
}

Using this code, you should only remember to call the global Release( ) function instead of the IUnknown::Release( ) function.

Actually, if you stick to the above three rules (rules 4, 5, and 6), you have all you need to know about writing safe COM client code with regard to reference counting.  If you follow rules 4, 5, and 6, then rules 1, 2, and 3 will implicitly be in effect as well.

Object/interface relationships

A very important aspect of COM is the relationship between a COM object and its interface(s).  In COM, objects are pieces of code which are abstracted by the use of interfaces.  What it means to you, as the COM client programmer is that you don't know anything about an object's internal behavior and how it works.  The only thing you know is which interfaces a particular object exposes.  You acquire interface pointers through calls to the QueryInterface( ) function of the IUnknown interface, and call the desired methods on the interfaces which results in some code of the object being executed.

Most objects expose multiple interfaces.  As you know, all objects have to support IUnknown, so in fact it doesn't make sense to have an object with only one interface being exposed, because that one interface would be IUnknown, and the object would provide no way for the client to ask it to do some work.  Each interface the object exposes provides a logically distinct set of properties and methods which together form the appearance of the object from a specific point of view.  To make myself clear, suppose you have an object called Person implementing a human being.  This object would expose several interfaces, such as IPerson, ILivingThing, ISocialBehavior, IPersonalBehavior, which can be defined like below:

interface IPerson : public IUnknown  
{  
    virtual HRESULT GetName( BSTR * pName ) = 0;  
    virtual HRESULT GetAge( long * pAge ) = 0;  
};  
  
interface ILivingThing : public IUnknown  
{  
    virtual HRESULT IsLiving( BOOL * pbLiving ) = 0;  
    virtual HRESULT Breathe( void ) = 0;  
};  
  
interface ISocialBehavior : public IUnknown  
{  
    virtual HRESULT TalkToSomeone( IPerson * pPerson ) = 0;  
};  
  
interface IPersonalBehavior : public IUnknown  
{  
    virtual HRESULT EatFood( void ) = 0;  
};

With such a structure, you can call CoCreateInstance to create a CLSID_Person object requesting the IID_IPerson interface.  You can get the name of the person using the IPerson::GetName( ) method (which is a good candidate to be converted into a property, as I'll show next).  Then to get the person to talk to somebody else (perhaps another Person object created by a CoCreateInstance call) you can QueryInterface( ) the IPerson interface pointer for the IID_ISocialBehavior interface.  Note that there is no limitation to the number of interfaces an object can expose.

One other point to keep in mind is that nothing prevents different objects from exposing the same interface.  In fact, this is natural, since all objects expose the IUnknown interface.  So, for example, if we have a COM object named Cat representing a cat, it can expose the ILivingThing interface (which tells the client whether the cat is alive or dead, and lets the client ask it to take a breath).  Likewise, if we have a COM object called Martian representing a Martian man, it can expose the ISocialBehavior interface (provided that Martians can have social behavior with us humans and talk to us!).

Parameter types

In COM, we know something about what each parameter is supposed to do.  Most significantly, we have three types of parameters: in, out, and retval.  An ‘in’ parameter, as its name suggests, is the input to a function.  For example, in the ISocialBehavior interface above, the pPerson parameter of the TalkToSomeone function is an ‘in’ parameter, because it's an input for the function so that it knows which person to talk to.  An ‘out’ parameter is the output of a function.  Unlike C++ functions, COM methods and properties have to return HRESULT, so any other output a function might have should be in the form of an ‘out’ parameter.  Note that ‘out’ parameters have to be pointers, so that the called function can set the value they point to.  As an example, in the IPerson interface above, the pAge parameter of GetAge is an ‘out’ parameter, because it allows the function to return the age of a person.  A ‘retval’ parameter specifies the return value of a function.  In some higher level languages, like Visual Basic, the programmer cannot see the HRESULT return values, so the function can specify a pseudo return value which looks just like a return value, but in essence is an ‘out’ parameter.  In C++ we don't have such a thing, so you should treat a ‘retval’ parameter just the same as an ‘out’ parameter.  You should only know that ‘retval’ usually comes with ‘out’, and is shown like ‘out, retval’.

A nice combination of parameter types is the ‘in, out’ parameter.  An ‘in, out’ parameter is both an input to a function and an output from it.  When a function takes an ‘in, out’ parameter, it internally frees the original value of the parameter, and then sets it to the new value.  But we as the COM client programmers don't care about that.  The only thing you must keep in mind is that, for an ‘in, out’ parameter, the value of the variable may have changed after the function returns.  Obviously the ‘in, out’ parameters should be pointers.

There is no way to show these types in C++.  The Interface Definition Language (IDL), which I won't cover in this article, has built-in support for these types, but C++ lacks such support.  To work around this, two techniques are common: preprocessor macros, and comments:

#define IN  
#define OUT  
#define RETVAL  
  
interface ISomeIFace : IUnknown  
{  
    virtual HRESULT Method1( IN IUnknown *, OUT ISomeOtherIFace ** ) = 0;  
    virtual HRESULT Method2( IN OUT long *, OUT RETVAL IUnknown ** ) = 0;  
};

or

interface ISomeIFace : IUnknown  
{  
    virtual HRESULT Method1( /* in */ IUnknown *, /* out */ ISomeOtherIFace ** ) = 0;  
    virtual HRESULT Method2( /* in, out */ long *, /* out, retval */ IUnknown ** ) = 0;  
};

I prefer the latter method.  If we want to specify the parameter types for the Person interfaces, we would have some code like this:

interface IPerson : public IUnknown  
{  
    virtual HRESULT GetName( /* out, retval */ BSTR * pName ) = 0;  
    virtual HRESULT GetAge( /* out, retval */ long * pAge ) = 0;  
};  
  
interface ILivingThing : public IUnknown  
{  
    virtual HRESULT IsLiving( /* out, retval */ BOOL * pbLiving ) = 0;  
    virtual HRESULT Breathe( void ) = 0;  
};  
  
interface ISocialBehavior : public IUnknown  
{  
    virtual HRESULT TalkToSomeone( /* in */ IPerson * pPerson ) = 0;  
};  
  
interface IPersonalBehavior : public IUnknown  
{  
    virtual HRESULT EatFood( void ) = 0;  
};

Note that the ‘out, retval’ parameters could be ‘out’ parameters without the C++ programmer seeing the difference, but if they're ‘out, retval’, the other fellow programmers would have an easier time.

This discussion leads us to the next rules of COM client programming:

  7. The interface pointers which are passed as ‘in’ and ‘in, out’ parameters should follow the same rules stated above, so Release( ) should be called on them after they are passed as parameters, once you no longer need the interface pointer.
  8. The interface pointers obtained via ‘out’ and ‘out, retval’ parameters should be Release( )-ed once they are no longer needed.  If the interface pointer is not NULL before being passed as one of those parameters, Release( ) should be called on it before it is passed.

Casting in COM

Here I'm going to discuss one of the biggest mistakes people make when writing client COM code: casting.  I have seen people who freely cast COM interfaces to their heart's content, and think it's safe because it would be safe in the normal C++ world.  But the moment you begin to cast a COM interface into another one is the moment you invite nasty bugs, memory corruption and access violations into your code.  There are only two situations where casting does no harm: when you cast a pointer to an interface pointer to void**, and when the compiler allows an implicit cast.  The first case is necessary in all calls to CoCreateInstance, IUnknown::QueryInterface and similar functions which accept a void** parameter, where you should cast a pointer to an interface pointer to void** using either reinterpret_cast (the preferred method) or a C-style cast (the deprecated and should-be-avoided method).  The second case happens when you use an interface pointer in place of another interface pointer higher in the derivation hierarchy.  For example, you can use the IPerson interface where an IUnknown interface is expected, and the compiler does the cast implicitly.  This cast is completely safe in COM.  Of course in these situations, you can still write the cast explicitly (using either static_cast or a C-style cast), but I suggest you allow the compiler to do the cast implicitly, so that the compiler catches you when you're doing an illegal cast.  So, the ninth rule of COM client programming is as follows:

  9. Never cast COM interface pointers using any of static_cast, dynamic_cast, const_cast, reinterpret_cast, or a C-style cast, except in the two situations pointed out above.

The prohibition of casting in COM does not mean that COM disables casting; it just shows that C/C++ style casts are not appropriate with COM.  COM provides its own method of casting, which is (as you might guess) the IUnknown::QueryInterface function.  You pass the correct IID, and the QueryInterface function returns the desired interface pointer as its second parameter, or alternatively returns E_NOINTERFACE in case of an illegal/unsupported cast.

  10. Stay with the IUnknown::QueryInterface function to do all the casting between interface pointers that you need.
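The reason behind rules 9 and 10 can be seen in a small portable sketch.  Everything in it is a stand-in invented for this example: HResultSketch, IidSketch (an enum used in place of real IIDs), the three ...Sketch interfaces and the PersonSketch class.  It assumes the object happens to be written in C++ with multiple inheritance, which is only one of many ways a COM object may be implemented.

```cpp
#include <cstddef>
#include <cstdint>

typedef std::int32_t HResultSketch;
const HResultSketch kS_OK          = 0;
const HResultSketch kE_NOINTERFACE = static_cast< HResultSketch >( 0x80004002u );

enum IidSketch { kIidUnknown, kIidPerson, kIidSocial, kIidCat };

struct IUnknownSketch
{
    virtual HResultSketch QueryInterface( IidSketch iid, void ** ppv ) = 0;
    virtual unsigned long AddRef( void ) = 0;
    virtual unsigned long Release( void ) = 0;
};

struct IPersonSketch : IUnknownSketch
{
    virtual HResultSketch get_Age( long * pAge ) = 0;
};

struct ISocialSketch : IUnknownSketch
{
    virtual HResultSketch TalkToSomeone( IPersonSketch * pPerson ) = 0;
};

// One object, two interfaces.  Each interface is a separate sub-object.
class PersonSketch : public IPersonSketch, public ISocialSketch
{
public:
    PersonSketch( void ) : m_refs( 1 ) {}

    HResultSketch QueryInterface( IidSketch iid, void ** ppv )
    {
        if (iid == kIidUnknown || iid == kIidPerson)
            *ppv = static_cast< IPersonSketch * >( this );
        else if (iid == kIidSocial)
            *ppv = static_cast< ISocialSketch * >( this );
        else
        {
            *ppv = NULL;
            return kE_NOINTERFACE;
        }
        AddRef();   // the caller owns one reference on success
        return kS_OK;
    }

    unsigned long AddRef( void ) { return ++m_refs; }

    unsigned long Release( void )
    {
        unsigned long refs = --m_refs;
        if (refs == 0)
            delete this;
        return refs;
    }

    HResultSketch get_Age( long * pAge ) { *pAge = 30; return kS_OK; }
    HResultSketch TalkToSomeone( IPersonSketch * ) { return kS_OK; }

private:
    virtual ~PersonSketch( void ) {}
    unsigned long m_refs;
};

// Returns true if QueryInterface behaves as described in the text.
bool RunCastDemo( void )
{
    IPersonSketch * pPerson = new PersonSketch;   // reference count is 1
    ISocialSketch * pSocial = NULL;
    HResultSketch hr = pPerson->QueryInterface( kIidSocial,
        reinterpret_cast< void ** >( &pSocial ) );
    bool ok = (hr == kS_OK) && (pSocial != NULL);
    if (ok)
    {
        // The social interface lives at a different address than the
        // person interface, so reinterpret_cast-ing pPerson would have
        // produced a pointer to the wrong sub-object.
        ok = static_cast< void * >( pSocial ) != static_cast< void * >( pPerson );
        pSocial->TalkToSomeone( pPerson );
        pSocial->Release();
    }
    void * pNone = NULL;
    ok = ok && pPerson->QueryInterface( kIidCat, &pNone ) == kE_NOINTERFACE
            && pNone == NULL;
    pPerson->Release();                           // last reference
    return ok;
}
```

Only the object knows where each of its interfaces lives, which is why asking it through QueryInterface is the one reliable way to get from one interface to another.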

Methods and properties

In COM, we have two distinct types of functions supported by an interface: methods and properties.  A method is pretty similar to a C++ member function.  A method usually does some work on the client's behalf, and returns the result in the form of an HRESULT status code (which can be either a standard code, like S_OK, or a custom code).  A method can take any number of ‘in’, ‘out’, or ‘retval’ parameters.  A property consists of one or two special methods, a ‘get’ method and/or a ‘put’ method.  Their names take the form get_Property and put_Property.  The ‘get’ method retrieves the property's value, and the ‘put’ method changes it.  A given property can have only the ‘get’ method (a read-only property) or only the ‘put’ method (a write-only property).  If a property is of type TYPE, the get/put functions should be declared like this:

    virtual HRESULT get_Property( /* out, retval */ TYPE * ) = 0;  
    virtual HRESULT put_Property( /* in */ TYPE ) = 0;

In some higher level languages, like Visual Basic, the syntax for getting or setting a property is simplified like a C++ object's data member, but in C++ properties are actually the get/put methods, and an interface cannot have any data members like normal C++ classes.

To provide an example, here I change the interfaces of the Person object to use properties:

interface IPerson : public IUnknown  
{  
    virtual HRESULT get_Name( /* out, retval */ BSTR * pName ) = 0;  
    virtual HRESULT put_Name( /* in */ BSTR name ) = 0;  
    virtual HRESULT get_Age( /* out, retval */ long * pAge ) = 0;  
    virtual HRESULT put_Age( /* in */ long age ) = 0;  
};  
  
interface ILivingThing : public IUnknown  
{  
    virtual HRESULT get_Living( /* out, retval */ BOOL * pbLiving ) = 0;  
    virtual HRESULT Breathe( void ) = 0;  
};  
  
interface ISocialBehavior : public IUnknown  
{  
    virtual HRESULT TalkToSomeone( /* in */ IPerson * pPerson ) = 0;  
};  
  
interface IPersonalBehavior : public IUnknown  
{  
    virtual HRESULT EatFood( void ) = 0;  
};

In the above example, the IPerson interface has two properties, Name and Age, which can be both read and changed, and the ILivingThing interface has one read-only property, Living.
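
From the client's point of view, using a property in C++ is just calling a method: the value travels through a parameter, and the return value only reports success or failure.  The following is a minimal portable sketch of that calling pattern, not real COM: Status, kOk, kInvalidArg and kPointer are hypothetical stand-ins for HRESULT, S_OK, E_INVALIDARG and E_POINTER, IAgeLike and SimplePerson are invented for the example, and the rule that an age cannot be negative is an assumption made only to have something to validate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for HRESULT and a few standard codes.
typedef long Status;
const Status kOk = 0;
const Status kInvalidArg = -1;   // mimics E_INVALIDARG
const Status kPointer = -2;      // mimics E_POINTER

// An interface with one read/write property called Age.
struct IAgeLike
{
    virtual Status get_Age( /* out, retval */ long * pAge ) = 0;
    virtual Status put_Age( /* in */ long age ) = 0;
    virtual ~IAgeLike() {}
};

// "Server" side: the data member lives in the object, never in the interface.
class SimplePerson : public IAgeLike
{
public:
    SimplePerson() : m_age( 0 ) {}

    Status get_Age( long * pAge )
    {
        if (pAge == NULL) return kPointer;
        *pAge = m_age;
        return kOk;
    }
    Status put_Age( long age )
    {
        if (age < 0) return kInvalidArg;   // assumed validation rule
        m_age = age;
        return kOk;
    }

private:
    long m_age;
};

// Client side: what Visual Basic would write as "person.Age = person.Age + 1".
Status HaveBirthday( IAgeLike & person )
{
    long age = 0;
    Status st = person.get_Age( &age );    // read through the out parameter
    if (st != kOk)
        return st;
    return person.put_Age( age + 1 );      // write through the in parameter
}
```

The status code of every get/put call is checked before the value is used, which is the habit worth carrying over to real HRESULT-returning properties.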

COM runtime initialization

Like many libraries available, the COM library must also be initialized before being used.  You should use the CoInitialize function to initialize the COM library, and the CoUninitialize function to allow the COM library to clean up.  CoInitialize takes only one parameter which should always be set to NULL, and CoUninitialize does not take any parameters.

Because this is such a common task, I have written a class called CComInit as follows:

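class CComInit  
{  
public:  
    CComInit(void)  
        : m_hr( ::CoInitialize( NULL ) )  
    {  
    }  
  
    ~CComInit(void)  
    {  
        // Only a successful CoInitialize call must be balanced.  
        if (SUCCEEDED( m_hr ))  
            ::CoUninitialize();  
    }  
  
private:  
    HRESULT m_hr;   // result of CoInitialize, kept for the destructor  
  
    // Not copyable: each object owns exactly one CoInitialize call.  
    CComInit( const CComInit & );  
    CComInit & operator=( const CComInit & );  
};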

To use this class, it's enough to create a global object of this class in one of the .cpp files of your project.  This way COM is initialized on your application's main thread before your own code starts using it.  Note that if your application is multi-threaded, every thread which uses the COM API or COM objects needs its own object of this class.  Just create the object on the stack of the thread's entry point function, so that it is created when the thread starts and destroyed when the thread terminates.  This ensures that the COM library is safely available to your code during the whole lifetime of the thread.
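
The per-thread usage described above boils down to a stack object whose lifetime brackets the thread's work.  The sketch below shows only that scoping pattern in portable C++, so that it can run without the COM runtime: FakeCoInitialize, FakeCoUninitialize, g_initCount and ScopedInit are hypothetical stand-ins for CoInitialize, CoUninitialize, the COM library's internal state and CComInit, and WorkerEntryPoint merely plays the role of a thread entry point function.  As an extra assumption of this sketch, the guard skips the cleanup when initialization fails.

```cpp
#include <cassert>

// Hypothetical: counts how many initializations are currently active.
static int g_initCount = 0;

long FakeCoInitialize()   { ++g_initCount; return 0; }   // 0 mimics S_OK
void FakeCoUninitialize() { --g_initCount; }

// Stand-in for CComInit: initialize in the constructor, clean up in the destructor.
class ScopedInit
{
public:
    ScopedInit() : m_status( FakeCoInitialize() ) {}
    ~ScopedInit()
    {
        if (m_status >= 0)          // balance only a successful initialization
            FakeCoUninitialize();
    }
    bool Ok() const { return m_status >= 0; }

private:
    long m_status;
    ScopedInit( const ScopedInit & );               // not copyable
    ScopedInit & operator=( const ScopedInit & );
};

// Plays the role of a thread entry point function.
int WorkerEntryPoint()
{
    ScopedInit init;                // created when the "thread" starts
    if (!init.Ok())
        return -1;
    return g_initCount;             // the library is available at this point
}                                   // init is destroyed when the function returns
```

Because the guard lives on the stack, every return path out of the entry point, including early returns, runs the cleanup exactly once.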

Strings and COM

Strings in COM are a source of much confusion for almost every beginner COM client programmer.  In C, a string is usually represented as a char*.  In C++, we have a more advanced string, which is the std::string class.  In COM [1], a string is something different: we represent it using a BSTR.  A BSTR is a typedef for a pointer to wchar_t, and wchar_t is (according to the C++ standard) a built-in type to represent wide characters (in compilers like VC++ 6 it is only a typedef for unsigned short, but fortunately this has been fixed in VC++ .NET).  In other words, a BSTR looks like a Unicode string.  If you have never worked with Unicode strings, check out my Unicode article.  A BSTR also has some additional properties which are beyond the scope of this topic; the most important one is that it stores its own length and can therefore contain embedded NULL characters, unlike C-style strings, where a NULL character means the end of the string.

You create a BSTR using one of the SysAllocString, SysAllocStringByteLen, or SysAllocStringLen functions.  Of the three, the first and the last are the most useful.  They both take a constant Unicode string as their first parameter, and return a BSTR which is a copy of the Unicode string passed to them.  SysAllocStringLen also has a second parameter which specifies the length of the string in characters.  SysAllocString assumes the Unicode string is NULL terminated, and calculates its length using the lstrlenW function.  To free the BSTR when you're done with it, you call SysFreeString.  You can change the size of an existing BSTR using one of the SysReAllocString or SysReAllocStringLen functions.  To get the length of the BSTR, you call the SysStringLen or SysStringByteLen functions.  Please see the Platform SDK Documentation for more information on these functions.

If you keep in mind that a BSTR can contain embedded NULL characters, it is safe to use the wcsxxx functions to manipulate BSTRs.  There are some classes like _bstr_t (declared in comutil.h), and CComBSTR (declared in atlbase.h) which can simplify working with BSTRs.  You can also use the ATL Conversion Macros (check out the Unicode article) to convert from BSTRs to C strings, and vice versa.  To work with strings in COM, keep in mind that you should follow rules 7 and 8 for BSTRs as well.  Of course you call SysFreeString on a BSTR instead of Release( ).
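
The embedded NULL caveat is easy to demonstrate without any COM API.  In the sketch below a std::wstring plays the role of a BSTR, because both remember their length separately from the characters; this is only an analogy, and CountedLength, TerminatedLength and MakeEmbedded are hypothetical helper names.  With a real BSTR, SysStringLen corresponds to the counted length and wcslen to the terminated one.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Length as stored alongside the data (what SysStringLen reports for a BSTR).
std::size_t CountedLength( const std::wstring & s )
{
    return s.size();
}

// Length as seen by the wcsxxx functions, which stop at the first NULL character.
std::size_t TerminatedLength( const std::wstring & s )
{
    return std::wcslen( s.c_str() );
}

// Builds a five-character string with a NULL character in the middle.
std::wstring MakeEmbedded()
{
    const wchar_t raw[] = { L'a', L'b', L'\0', L'c', L'd' };
    return std::wstring( raw, sizeof( raw ) / sizeof( raw[0] ) );
}
```

For the string built by MakeEmbedded the two lengths disagree (5 against 2), so any code that copies or compares such a string with the wcsxxx functions silently loses everything after the first NULL character.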

Enough theory, let's see it in action

OK, now you know almost everything you need to get started in writing COM client code in C++.  Of course, there are dozens of details which are left unspoken in this article, but covering all of them requires a book or two.  You know enough to start browsing MSDN and reading books to get a firm grasp on COM.  But no theory has the effect of a piece of sample code.

I wrote a simple Win32 SDK application to demonstrate the topics discussed in the article.  The sample application, which is called Wallpaper (you can download it at the bottom of the article), does a very simple task: it displays the location of the wallpaper picture selected by the user, and lets the user change it.  To perform this task, the sample utilizes the IActiveDesktop interface offered by the Active Desktop component.  Here are the relevant pieces of the source code:

bool Initialize()  
{  
    if (!pActiveDesktop)  
    {  
        HRESULT hr = ::CoCreateInstance( CLSID_ActiveDesktop, NULL,  
            CLSCTX_INPROC_SERVER, IID_IActiveDesktop,  
            reinterpret_cast< void ** > (&pActiveDesktop) );  
        return SUCCEEDED(hr);  
    }  
    else  
        return true;  
}  
  
void Destroy()  
{  
    if (pActiveDesktop)  
    {  
        pActiveDesktop->Release();  
        pActiveDesktop = NULL;  
    }  
}  
  
bool GetCurrentWallpaper(LPTSTR pszBuffer, DWORD cch)  
{  
    if (!pActiveDesktop && !Initialize())  
        return false;  
  
    LPWSTR pwsz = reinterpret_cast< LPWSTR > (::_alloca( cch * sizeof(WCHAR) ));  
    HRESULT hr = pActiveDesktop->GetWallpaper( pwsz, cch, 0 );  
    if (SUCCEEDED(hr))  
    {  
        USES_CONVERSION;  
        ::lstrcpyn( pszBuffer, W2CT( pwsz ), cch );  
        return true;  
    }  
    else  
        return false;  
}  
  
bool SetCurrentWallpaper(LPCTSTR pszBuffer)  
{  
    if (!pActiveDesktop && !Initialize())  
        return false;  
  
    USES_CONVERSION;  
    HRESULT hr = pActiveDesktop->SetWallpaper( T2CW( pszBuffer ), 0 );  
    // Apply the change only if it was accepted, and report the final status.  
    if (SUCCEEDED(hr))  
        hr = pActiveDesktop->ApplyChanges( AD_APPLY_ALL );  
    return SUCCEEDED(hr);  
}

As you see, the code seems extremely easy.  And in fact, when you respect the rules in COM client coding, the whole process is easy!  So, welcome to the world of COM!

Glossary

Here is a list of the terminology used in this article which might be new to you, together with a short definition for them.

COM
A binary standard which allows objects to communicate with each other at runtime without respect to the language they are developed in.

COM object, CoClass
An object which adheres to the rules of COM.  Not to be confused with a C++ object.  There is a subtle difference between COM objects and CoClasses which is like the difference between C++ objects and classes.  Objects live in memory, CoClasses are specifications of those objects (i.e. what interfaces they support).

Client COM
Code which utilizes a COM object.

Server COM
Code which implements a COM object.

COM interface
A set of contracts (expressed in terms of functions) which a COM server promises to fulfill and a COM client can depend upon.

IUnknown
The mother of COM interfaces: the interface which all other COM interfaces should derive from either directly or indirectly, and the only possible interface which does not derive from an interface.  It provides a basic set of functionality (reference counting and runtime polymorphism) which all COM objects must support.

GUID, UUID
A 128-bit unique identifier used to identify things in the COM world.

CLSID
A GUID for identifying a CoClass.

IID
A GUID for identifying an interface.

LIBID
A GUID for identifying a library.

HRESULT
The standard return value for all COM functions and methods stating a success/error status.

Acknowledgements

Max provided some nice comments on the first version of this article which led to some improvements in the structure and understandability of this work.  Thanks a lot, Max!

 Download source code for the article


Footnotes:

1 - Strictly speaking, COM itself has no definition of a string like BSTRs.  BSTR is the standard string type in Automation, one of the technologies based upon COM.  For reasons that fall outside the scope of this article, BSTRs are usually preferred for representing strings in COM as well, but nothing prevents the developer of a COM server from using C style strings in designing her interfaces.

This article originally appeared on BeginThread.com. It's been republished here, and may contain modifications from the original article.