Steps to Understanding Boxing and Unboxing

I could start by telling you that wrapping a value type in an Object is boxing. There, we’re done, right? I mean, you now know what that means, yeah?! So you could probably then deduce that unboxing is taking that value back out of the object?! SIMPLE! Cool… so then… why do we care? Good question. Let’s see if we can go ahead and answer that!

In order to understand why boxing and unboxing aren’t the greatest of ideas, we need to understand what’s going on behind the scenes. Just know that, like with anything else in life, the devil’s in the details! Let’s get into it then.

It all boils down to memory management. In .NET (and many other languages, including Java), there are multiple areas of memory available to your programs. The first we need to discuss is called the Stack.

The Stack

The stack is managed by the .NET runtime, and it is where memory is allocated and then immediately cleaned up as soon as the variables go out of scope. The important thing to know here is that local VALUE TYPE variables are the ones that typically live on the stack. (A value type that is a field of a class lives on the heap, inside that object.) What are value types? These are the basic types in .NET such as numeric types, booleans, structs and enums. A more complete list can be found here: http://msdn.microsoft.com/en-us/library/s1ax56ch.aspx.

When variables of these types are created in your methods, their values are stored in memory in a LIFO stack. .NET manages this memory for you, so you can just create your variables, assign them some values and forget about them. Once your method runs to completion, the CLR reclaims that memory and makes it available for the next method call. The great thing about this is that these variables are short-lived and very small: the memory used is just enough to store the actual data, with no extra block describing the type. Therefore, the stack is lean and efficient.

The Heap

So if the stack is where value types are stored, then the Heap is where REFERENCE types are stored. What’s a reference type?! Basically anything that’s not a value type! But, to be a touch clearer, pretty much anything that’s an Object – so any classes you instantiate, strings, interfaces, delegates, etc. For a full list of those you can check this out from MSDN: http://msdn.microsoft.com/en-us/library/490f96s2.aspx. If you’re doing good OO (Object Oriented) design, chances are your code will be full of reference types.

What’s so different between the Stack and the Heap?! Well, where the stack is lean and efficient, the Heap is not. For every reference type object you put on the Heap, there’s also a variable on the stack that points to it. There’s also the dirty little secret that the objects you placed on the heap don’t go away when you do. Let’s say you create 100 instances of a class within the scope of a method. One would think that once you leave the scope of the method (return out of it), the slate is wiped clean. Well, that’s not so with the heap. Items on the heap hang around until the garbage collector comes along and removes any stray chunks lying around. Your stack variables disappeared as soon as you returned out of your method, but your heap objects will still be there until the CLR decides it’s time to crawl the heap and do some housekeeping.
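To make the stack/heap difference a bit more concrete, here is a minimal sketch. The types PointValue and PointRef and the class StackVsHeapDemo are hypothetical names invented for this illustration. It shows that assigning a value type copies the data itself, while assigning a reference type only copies the pointer to the one object on the heap.

```csharp
using System;

// Hypothetical types, for illustration only.
public struct PointValue { public int X; }   // value type
public class PointRef { public int X; }      // reference type

public static class StackVsHeapDemo
{
    // Copies a struct, changes the copy, and returns the original's X.
    public static int MutateStructCopy()
    {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a;   // copies the data itself
        b.X = 99;           // only the copy changes
        return a.X;         // still 1
    }

    // Copies a reference, changes the object through it, and returns the original's X.
    public static int MutateClassAlias()
    {
        PointRef a = new PointRef { X = 1 };
        PointRef b = a;     // copies only the reference; both point at one heap object
        b.X = 99;           // changes the shared object
        return a.X;         // now 99
    }

    public static void Main()
    {
        Console.WriteLine(MutateStructCopy());   // prints 1
        Console.WriteLine(MutateClassAlias());   // prints 99
    }
}
```

Keep this distinction in mind, because boxing is exactly the moment a value stops behaving like the first case and starts living like the second.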

Well, I guess I need to figure out how to make everything a value type…

Calm down now! Let’s not get out of hand here! As far as .NET is concerned, if you didn’t have reference type objects piling up on the heap, there’d be no such thing as object oriented programming! Think about it: any class instantiation goes on the heap! We’re definitely not proposing writing insane code that would circumvent the entire reason OO even exists. Rather, we want to make you aware of little things that can help improve your app’s and your system’s overall performance. And that leads us into the topic of Boxing/Unboxing.

Boxing

Now that you have a general idea of what roles variables play in the stack and the heap, let’s get to the heart of the matter. Boxing is simply taking a value type and wrapping it in an object (typically a System.Object), which puts a copy of it on the heap. Here’s a simple example of a forced box:

var myValue = (object)5;


As you can see above, 5 was cast to an object, and thus it’s boxed. A copy of the value 5 (what was an integer) is now on the heap, and the variable myValue holds a reference to that location on the heap.
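A short sketch of what "a copy on the heap" means in practice. The class and method names here (BoxingCopyDemo and friends) are made up for this example. Changing the original variable after boxing does not touch the box, and boxing the same value twice produces two separate heap objects.

```csharp
using System;

public static class BoxingCopyDemo
{
    // Boxes an int, changes the original, and returns what the box still holds.
    public static int BoxedValueAfterOriginalChanges()
    {
        int original = 5;
        object boxed = original;   // implicit box: a copy of 5 goes on the heap
        original = 10;             // changes only the stack variable
        return (int)boxed;         // the box still holds 5
    }

    // Boxes the same int twice and reports whether both boxes are the same object.
    public static bool BoxedTwiceIsSameObject()
    {
        int value = 5;
        object first = value;      // first allocation
        object second = value;     // second, separate allocation
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterOriginalChanges());   // prints 5
        Console.WriteLine(BoxedTwiceIsSameObject());           // prints False
    }
}
```

Note that no cast is needed to box: simply assigning an int to an object variable does it, which is why it is so easy to miss.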

Why would you do this?! Well, the simple truth is that you probably don’t even know you’re doing it! Let’s take a simple example:

String.Format("{0} has {1} pets!", "Joe", 8);


That looks innocent enough, right?! Well, the thing that you wouldn’t even think about is the fact that String.Format() declares its arguments as Object (with an Object[] overload for longer argument lists)! What does that mean? “Joe” is a string, which is already a reference type, so it gets passed along as-is. The 8, however, is a value type, so it gets wrapped in an Object and thrown on the heap just to output your (what you thought to be) simple string!

What Insidiousness!

Indeed. Like I said, you probably didn’t even know you were doing that! If you want to know a nice little trick to get around this, check out the post by Joe on how to avoid boxing and unboxing in console writeline and string methods.
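The linked post isn’t reproduced here, so take this as a sketch of one commonly used approach rather than necessarily the trick it describes: call ToString() on the value type yourself before handing it to String.Format(). The class and method names (FormatDemo, WithBoxing, WithoutBoxing) are invented for this example.

```csharp
using System;

public static class FormatDemo
{
    // The int is converted to object at the call site, which boxes it.
    public static string WithBoxing(string name, int pets)
    {
        return String.Format("{0} has {1} pets!", name, pets);
    }

    // pets.ToString() yields a string, which is already a reference type,
    // so nothing has to be boxed to pass it as an object argument.
    public static string WithoutBoxing(string name, int pets)
    {
        return String.Format("{0} has {1} pets!", name, pets.ToString());
    }

    public static void Main()
    {
        Console.WriteLine(WithBoxing("Joe", 8));      // prints Joe has 8 pets!
        Console.WriteLine(WithoutBoxing("Joe", 8));   // prints Joe has 8 pets!
    }
}
```

Both versions produce the same text. The second still allocates a string for the number, but the formatter would have had to create that string anyway, so the box is the only thing you lose.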

Another place that is often overlooked is the use of Hashtable and ArrayList: ALL VALUE TYPES that go into these types of collections get boxed. This was the way of life prior to .NET 2.0. Obviously, that’s a few versions ago, so why in the world are we even talking about that?! Well, old habits die hard. People who started out in the early days of .NET may not know to use the List<T> or Dictionary<TKey, TValue> collections. In .NET 2.0, the introduction of Generics changed everything. Now you can actually create a collection that does NO BOXING! Woo hoo!
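Here is a small side-by-side sketch of the two styles. The class name CollectionDemo and its methods are hypothetical. ArrayList.Add takes an object, so every int is boxed on the way in and unboxed on the way out, while List<int> stores the ints directly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionDemo
{
    // Pre-generics style: each int is boxed by Add and unboxed by the cast.
    public static int SumArrayList()
    {
        ArrayList list = new ArrayList();
        for (int i = 1; i <= 3; i++)
        {
            list.Add(i);           // Add(object): boxes i
        }

        int sum = 0;
        foreach (object item in list)
        {
            sum += (int)item;      // explicit unbox
        }
        return sum;
    }

    // Generic style: the list holds ints directly, so nothing is boxed.
    public static int SumGenericList()
    {
        List<int> list = new List<int>();
        for (int i = 1; i <= 3; i++)
        {
            list.Add(i);           // Add(int): no box
        }

        int sum = 0;
        foreach (int item in list)
        {
            sum += item;           // no unbox needed
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumArrayList());     // prints 6
        Console.WriteLine(SumGenericList());   // prints 6
    }
}
```

The results are identical; the difference is three heap allocations in the first version and none for the elements in the second, which adds up quickly in large collections.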

Unboxing

If boxing is the task of taking a value type, wrapping it and throwing it on the heap, then unboxing is basically the inverse, although, as mentioned in Podcast Episode 2, it’s not quite so cut and dried. There appear to be differing definitions of what exactly unboxing is, but it really is just a question of how far you have to go to be considered unboxed. In one definition, just accessing the memory location of the value inside the box on the heap counts as unboxing; MSDN, however, describes unboxing as the physical copying of the value back to a value type variable. You can see that writeup in MSDN’s description of boxing/unboxing.

For me, it seems to make sense that unboxing would mean actually copying it back and casting it to a value type. Either way you look at it, it’s a task that comes at a cost. Rather than having direct access to the variable on the stack like you would with a true value type, now you have to follow a reference to the heap, have the runtime check that the box really holds the type you asked for, and potentially (if you’re doing a copy) allocate more space for the new variable on the stack. Obviously this takes more computing power and memory than just having a simple variable lying around.
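One detail that makes the type check visible: an unboxing cast has to name the exact type that was boxed. The sketch below uses invented names (UnboxingDemo and its methods) and shows that casting a boxed int straight to long throws an InvalidCastException, even though a plain int converts to long without complaint.

```csharp
using System;

public static class UnboxingDemo
{
    // Unboxes to the exact type that was boxed: copies the value out of the box.
    public static int Unbox(object boxed)
    {
        return (int)boxed;
    }

    // Tries to unbox a boxed int directly as a long and reports whether that throws.
    public static bool UnboxToWrongTypeThrows()
    {
        object boxed = 5;                  // a boxed int
        try
        {
            long wrong = (long)boxed;      // type check fails: the box holds an int
            Console.WriteLine(wrong);      // not reached
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to int first, then convert the plain value to long.
    public static long UnboxThenConvert(object boxed)
    {
        return (long)(int)boxed;
    }

    public static void Main()
    {
        Console.WriteLine(Unbox(5));                    // prints 5
        Console.WriteLine(UnboxToWrongTypeThrows());    // prints True
        Console.WriteLine(UnboxThenConvert(5));         // prints 5
    }
}
```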

Wrapping Up (Pun)

Hopefully I’ve put together a decent outline of what all this boxing/unboxing hubbub is about. It’s one of those things that, while it seems very minor, can actually have a large negative impact on your application’s and your system’s overall performance. Just knowing about boxing and unboxing, and some of the simple little steps you can take to avoid this (mostly) unnecessary operation, will give you a leg up on writing great performing applications. Now you can save the heap space for what it was truly intended for!
