Understanding and Using the Disassembly Window in Visual Studio

In this post I discuss how to use and read the disassembly window, and determine if i++ is more efficient than i = i + 1.

In this post I want to share one of my favorite features in Visual Studio, the disassembly window. For those of you who aren’t familiar with the disassembly window, or who maybe don’t even understand disassembly in general, allow me to break it down very quickly. In .NET, languages like C#, F#, etc. target the common language runtime (CLR). Their compilers don’t produce native code directly; they produce intermediate language (IL), a higher-level, assembly-like instruction set that any language can target. When the program runs, the CLR’s just-in-time (JIT) compiler turns that IL into native machine code for the processor it’s running on. The benefit of this is that you can create a language that compiles to IL, and your language will automatically be able to run on any chipset supported by the CLR. I’m glossing over a lot of the technical details, but basically: language compiler spits out intermediate language -> CLR JIT-compiles the intermediate language to a specific chipset’s machine code.

So what’s that got to do with anything? Well, the native instructions the CLR generates for your program can be translated back into readable assembly, and that listing is referred to as “disassembly”. With debug symbols available, Visual Studio can also line up each source statement with the exact instructions generated for it.

So why is this useful?

Great question, level 2 sub heading, allow me to elaborate. There are lots of different applications for this, some more practical than others. For example, one use is to verify that the source code you’re stepping through matches the actual assembly being run. Why does that matter? Well, rather than be a good writer and elaborate on this myself, I’m just going to link to this post. The reason I personally like it is that it’s an absolutely invaluable resource for studying how code works under the hood. Let’s discuss this more since I can better speak to it.

Using the disassembly window to better understand code

I’ve created a basic console program, and in it I want to test a theory. There’s debate over whether or not using the ++ operator is more efficient than simply adding 1 to a variable. Rather than debate, let’s just settle it once and for all (kinda…). First, I wrote the following code:

int i = 0;
i = i + 1;

Simple, right? Okay, so let’s compile this, set a breakpoint, and look at the assembly generated for this portion of code.

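Quick side note: When you run a debug build, most optimizations are skipped and you get lots of extra assembly instructions that make breakpoints and stepping possible, most notably no-ops, and sometimes different instructions as well. To see a truly fair comparison, you need to look at an optimized release build. The disassembly window does work with release builds, but by default the Visual Studio debugger suppresses JIT optimizations for the code it launches, so you have to change a couple of debugging options first (Enable Just My Code and Suppress JIT optimization on module load). I’m sticking with a debug build here, so keep that in mind when reading the results.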
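If you do try this against an optimized build, one thing to watch for is that a local variable nobody reads can be removed entirely, which leaves nothing to look at. Here is a minimal sketch of one way to keep the code around, assuming a plain console project; the class and method names are made up for illustration. It puts the addition in its own method, marks it with MethodImplOptions.NoInlining so it stays a real call, and prints the result so the value is actually used.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReleaseDemo
{
    // NoInlining asks the JIT not to fold this method into its caller,
    // so it keeps its own body that you can step into and disassemble.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int AddOne(int i)
    {
        i = i + 1;
        return i;
    }

    public static void Main()
    {
        // Printing the result keeps the computed value observable.
        Console.WriteLine(AddOne(0)); // prints 1
    }
}
```

Set the breakpoint inside AddOne and open the disassembly window from there. The exact instructions you see will depend on the runtime version and processor architecture.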

Here is the assembly that gets generated. Notice that the source code each set of instructions corresponds to is present as well; this is due to that source mapping I mentioned earlier.

            int i = 0;
02D00478  xor         edx,edx  
02D0047A  mov         dword ptr [ebp-40h],edx  
            i = i + 1;
02D0047D  inc         dword ptr [ebp-40h] 

Now there’s something you should know if you’re not super familiar with assembly (said the guy who’s not super familiar with assembly): just because there are more instructions executing doesn’t mean that bit of assembly is less efficient. Assembly instructions, just like methods or functions, take different amounts of time to execute. For example, if I write a method in a class that takes 20 seconds to execute, vs. another method that takes just a fraction of a millisecond, both of them appear to be only one line of code when I call them, but they behave very differently under the hood. So basically, more instructions doesn’t automatically mean better or worse.

Breaking down what’s happening

Okay, so we’ve got 2 instructions for the variable assignment, an xor and a mov. xor is a bitwise exclusive OR, and since any value xor’d with itself is zero, xor edx,edx leaves zero in edx no matter what was there before. Why assign zero this way? Well, it’s the standard idiom for zeroing a register, because it encodes in fewer bytes than mov edx,0. The next instruction is mov, and in this case it stores the contents of edx into the double word (32-bit) memory location at [ebp-40h], which is the stack slot holding i. When reading mov in this syntax, the destination is on the left and the source is on the right, so it reads as put the right side into the left side, the same direction as an assignment in source code. One more thing you may be wondering is what’s with the edx? Registers in assembly are named, and can be used fully or partially. In this case, we’re using the data register (dx), and more specifically the 32-bit extended data register (edx), which is exactly the size of a C# int. The next and final instruction is inc, which as you may have guessed means increment by 1, and here it operates directly on i in memory.
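If the xor trick feels like magic, you can check the underlying property from C# itself using the ^ operator. This is just a small sketch with made-up names (XorDemo, XorWithSelf), not anything the compiler generates: it xors a handful of sample values with themselves and shows that the result is zero every time.

```csharp
using System;

public static class XorDemo
{
    // XOR-ing a value with itself clears every bit, whatever the value was.
    public static int XorWithSelf(int value)
    {
        return value ^ value;
    }

    public static void Main()
    {
        foreach (int sample in new[] { 0, 1, -1, 12345, int.MaxValue })
        {
            // Each line ends in "= 0".
            Console.WriteLine($"{sample} ^ {sample} = {XorWithSelf(sample)}");
        }
    }
}
```

That is all xor edx,edx relies on: the old contents of edx don’t matter, because every bit is being xor’d with a copy of itself.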

Comparing against i++

Okay, so we’ve seen the assembly generated for a simple variable + 1 statement, let’s see what gets generated when the statement uses variable++ (note that what I actually wrote here is i = i++):

            int i = 0;
04B90478  xor         edx,edx  
04B9047A  mov         dword ptr [ebp-40h],edx  
            i = i++;
04B9047D  mov         eax,dword ptr [ebp-40h]  
04B90480  mov         dword ptr [ebp-44h],eax  
04B90483  inc         dword ptr [ebp-40h]  
04B90486  mov         eax,dword ptr [ebp-44h]  
04B90489  mov         dword ptr [ebp-40h],eax

Really interesting results here, honestly I was a little surprised to see this myself. Right off the bat it looks like using ++ is a lot more overhead: the initialization uses the exact same xor/mov, the inc is still there, but now there are 4 extra mov instructions. That isn’t a fair comparison though, because the statement I compiled is i = i++, not i++. The first two movs save the old value of i to a temporary slot at [ebp-44h], the inc bumps i, and the last two movs copy the saved old value back over i, so i actually ends up as 0 again. A standalone i++; doesn’t need any of that, and as pointed out in the comments below, it produces the same single inc as i = i + 1. So there you have it: at least for C# in this 32-bit x86 debug build, neither way of adding 1 is more efficient than the other.
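You don’t need the disassembly window to see that i = i++ does something different from the other forms; you can just run them. Here is a minimal sketch with hypothetical names (IncrementDemo and its methods) that returns the final value of i for each statement. The last method is my own hand-written reading of the five instructions in the listing above, using an ordinary local in place of the [ebp-44h] slot.

```csharp
using System;

public static class IncrementDemo
{
    public static int AddOne()
    {
        int i = 0;
        i = i + 1;
        return i;
    }

    public static int PostIncrement()
    {
        int i = 0;
        i++;
        return i;
    }

    public static int SelfAssignPostIncrement()
    {
        int i = 0;
        i = i++; // i++ evaluates to the old value, which is then assigned back
        return i;
    }

    // Hand-written equivalent of the listing for i = i++.
    public static int SelfAssignSpelledOut()
    {
        int i = 0;
        int saved = i; // mov eax,[ebp-40h] / mov [ebp-44h],eax
        i = i + 1;     // inc dword ptr [ebp-40h]
        i = saved;     // mov eax,[ebp-44h] / mov [ebp-40h],eax
        return i;
    }

    public static void Main()
    {
        Console.WriteLine(AddOne());                  // prints 1
        Console.WriteLine(PostIncrement());           // prints 1
        Console.WriteLine(SelfAssignPostIncrement()); // prints 0
        Console.WriteLine(SelfAssignSpelledOut());    // prints 0
    }
}
```

The two forms that print 1 are the ones worth comparing, and those are the ones that come out as a single inc.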

Some other stuff

Along with the disassembly window, you can also view memory and the current state of the registers, two things I don’t use often but are cool to have nonetheless. To use any of these features, while debugging, open the Debug menu and go to Windows, and you’ll get a whole list of cool features you can utilize. I won’t get into these windows in this post, I just wanted to point out that they exist.

Conclusion

So there you have it, how to use and read the disassembly window. If you liked this post, or the idea of using assembly to see whether one method of doing something is more efficient than another method, let me know and I may start a little series on it, maybe I’ll call it assembly duels or something corny like that. Anyway, thanks for reading, until next time.

  1. Both `i++` and `i = i + 1` compile to the same IL:

    L_0008: ldloc.0
    L_0009: ldc.i4.1
    L_000a: add
    L_000b: stloc.0

    Both of these compile to the same x86 instruction:

    inc dword ptr [ebp-40h]

    So they are identical. You are comparing `i++` with `i = i++`, which is obviously not the same thing as it involves storing the value twice. If you want to compare `i = i++` then the equivalent would be `j = i++`, `j = i = i + 1` or the nonsensical `i = i = i + 1`.

    1. Yeah…I’ve since realized this, I was sort of rushing posts for the week and that was the example that popped in my head so I just rolled with it, probably should have thought it through a little more. I will probably go through and fix this soon.
