In this post I want to share one of my favorite features in Visual Studio: the disassembly window. For those of you who aren’t familiar with the disassembly window, or who maybe don’t even understand disassembly in general, allow me to break it down very quickly. In .NET, languages like C#, F#, etc. target the Common Language Runtime (CLR). Rather than compiling straight to native machine code, each language’s compiler emits intermediate language (IL), a higher-level, assembly-like language that any .NET language can target. At run time, the CLR’s just-in-time (JIT) compiler turns that IL into native instructions for whatever processor it’s running on. The benefit of this is that if you create a language that compiles to IL, your language will automatically be able to run on any architecture supported by the CLR. I’ve simplified the exact technical details of this process, but basically: language compiler emits IL -> CLR JIT-compiles the IL -> native assembly for a specific processor.
So what’s that got to do with anything? Well, when you build with debug symbols, the debugger can map each line of source code to the exact native instructions the JIT generated for it. That view of the machine code, shown alongside your source, is referred to as “disassembly”.
So why is this useful?
Great question, level 2 subheading, allow me to elaborate. There are lots of different applications for this, and some are more practical than others. For example, one use is to verify that the source code you’re stepping through matches the actual assembly being run. Why does that matter? Well, rather than be a good writer and elaborate on this myself, I’m just going to link to this post. The reason I personally really like it is that it’s an absolutely invaluable resource for studying how code works under the hood. Let’s discuss this more, since I can better speak to it.
Using the disassembly window to better understand code
I’ve created a basic console program, and in it I want to test a theory. There’s debate over whether or not using the ++ operator is more efficient than simply adding 1 to a variable. Well, rather than debate, let’s just settle it once and for all (kinda…). First, I wrote the following code:
int i = 0;
i = i + 1;
Simple, right? Okay, so let’s compile this, set a breakpoint, and look at the assembly generated for this portion of code.
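Quick side note: when you run a debug build, most optimizations are skipped and lots of extra assembly instructions get emitted so that breakpoints and stepping work, most notably no-ops, and it will sometimes generate different instructions as well. So to see a truly fair comparison, you need to look at a release build. The catch is that, by default, Visual Studio suppresses JIT optimizations whenever it launches your program under the debugger, so getting optimized disassembly takes extra setup (unchecking “Suppress JIT optimization on module load” and “Enable Just My Code” in the debugging options). To keep things simple I’m sticking with the debug build here, so take the results with a grain of salt.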
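For anyone who wants to experiment with optimized code anyway, here is a minimal sketch of a test harness. The class and method names (ReleaseDisassemblyDemo, AddOne) are made up for illustration; the only real API involved is the MethodImpl attribute. It assumes a release build with the debugger configured to allow optimized code, and breakpoints in optimized code may not land exactly where you expect.

```csharp
using System;
using System.Runtime.CompilerServices;

class ReleaseDisassemblyDemo
{
    // NoInlining asks the JIT to keep this method as a separate call in
    // optimized builds, so there is a method body to step into and disassemble.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static int AddOne(int i)
    {
        i = i + 1;   // the statement we want to inspect
        return i;
    }

    static void Main()
    {
        // Using the result keeps the call from being treated as dead code.
        Console.WriteLine(AddOne(41)); // prints 42
    }
}
```

Passing the value in as a parameter, instead of starting from a constant 0, also makes it harder for the optimizer to fold the whole thing into a constant.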
Here is the assembly that gets generated. Notice that the source code each set of instructions corresponds to is present as well; this is due to the source mapping I mentioned earlier.
int i = 0;
02D00478  xor         edx,edx
02D0047A  mov         dword ptr [ebp-40h],edx
i = i + 1;
02D0047D  inc         dword ptr [ebp-40h]
Now there are a few things you should know if you’re not super familiar with assembly (said the guy who’s not super familiar with assembly). First off, just because there are more instructions executing doesn’t mean that bit of assembly is less efficient. Assembly instructions, just like methods or functions, are weighted differently based on how long they take to execute. For example, if I write one method that takes 20 seconds to execute and another that takes just a fraction of a millisecond, both of them appear to be only one line of code when I call them, but they behave very differently under the hood. So basically, more instructions doesn’t automatically mean better or worse.
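The same point can be sketched in C#. The names here (CostDemo, Cheap, Expensive) are hypothetical and the loop size is arbitrary; the idea is only that two calls that each take one line at the call site can have very different costs. Actual timings depend on your machine and build settings, so no numbers are claimed.

```csharp
using System;
using System.Diagnostics;

class CostDemo
{
    // Returns immediately.
    static long Cheap() => 1;

    // Does a lot of work behind a one-line call.
    static long Expensive()
    {
        long sum = 0;
        for (int n = 0; n < 100_000_000; n++)
        {
            sum += n;
        }
        return sum;
    }

    static void Main()
    {
        var sw = Stopwatch.StartNew();
        long a = Cheap();                 // one line at the call site
        sw.Stop();
        Console.WriteLine($"Cheap:     {sw.Elapsed.TotalMilliseconds} ms (result {a})");

        sw.Restart();
        long b = Expensive();             // also one line at the call site
        sw.Stop();
        Console.WriteLine($"Expensive: {sw.Elapsed.TotalMilliseconds} ms (result {b})");
    }
}
```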
Breaking down what’s happening
Okay, so we’ve got 2 instructions for the variable assignment, an xor and a mov. xor is a bitwise exclusive OR: each bit of the result is 1 only where the two input bits differ. XOR-ing a register with itself therefore always produces zero, so xor edx,edx is just a way of putting 0 into edx. Why assign zero this way? Well, it’s the idiomatic way to zero a register on x86: the instruction is shorter than mov edx,0, and processors recognize it as a zeroing idiom. The next instruction is mov, and in this case it stores the contents of edx into the double word (32 bits) of memory at [ebp-40h], which is the stack slot that holds i. In this syntax, mov reads as put the right side into the left side, which is the same direction as an assignment in source code: take the stuff on the right and assign it to the stuff on the left. One more thing you may be wondering is what’s with the edx? Registers in assembly are named, and can be used fully or partially, so in this case we’re using the data register (dx), and more specifically the extended, 32-bit data register (edx), which is exactly the size of a C# int. The next and final instruction is inc, which as you may have guessed means increment by 1; here it increments i directly in memory.
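If the xor trick feels like magic, the same operator exists in C# as ^, so it is easy to check. This is a small illustrative sketch (the XorDemo class name and the sample values are arbitrary), not something taken from the disassembly itself.

```csharp
using System;

class XorDemo
{
    static void Main()
    {
        int value = 0x1234;

        // Every bit matches itself, so XOR-ing a value with itself is always 0.
        // This is what xor edx,edx relies on.
        Console.WriteLine(value ^ value); // prints 0

        // With different inputs, only the differing bits survive:
        // 5 is 101 in binary, 3 is 011, and 101 ^ 011 is 110, which is 6.
        Console.WriteLine(5 ^ 3);         // prints 6
    }
}
```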
Comparing against i++
Okay, so we’ve seen the assembly generated for a simple variable + 1 statement. Let’s see what gets generated when the second line uses ++ instead (note that I wrote it as i = i++):
int i = 0;
04B90478  xor         edx,edx
04B9047A  mov         dword ptr [ebp-40h],edx
i = i++;
04B9047D  mov         eax,dword ptr [ebp-40h]
04B90480  mov         dword ptr [ebp-44h],eax
04B90483  inc         dword ptr [ebp-40h]
04B90486  mov         eax,dword ptr [ebp-44h]
04B90489  mov         dword ptr [ebp-40h],eax
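Really interesting results here; honestly, I was a little surprised to see this myself. The initialization is the exact same xor/mov, and there’s still an inc, but now there are 4 extra mov instructions around it. Before declaring a winner, though, look at what I actually wrote: i = i++, not just i++. Those 4 extra movs are the post-increment semantics at work: the old value of i is copied to a temporary at [ebp-44h], i is incremented, and then the saved old value is copied back into i, overwriting the increment. So this statement does more work and still leaves i at 0, which makes it an unfair fight (and a bug). A standalone i++; statement compiles to the same IL as i = i + 1;, so you should expect the same single inc from it. Also note that this is 32-bit x86 code (eax, edx and ebp are 32-bit registers), even if it’s running on a 64-bit machine. So there you have it: at least for C# in a debug build on x86, variable = variable + 1 beats variable = variable++, but a plain variable++ should come out as a tie.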
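One caveat worth checking for yourself: the listing above disassembles i = i++, which is not the same statement as i++. The sketch below (the IncrementDemo class and the variable names are just for illustration) runs the three variants side by side so you can see what each one leaves in the variable.

```csharp
using System;

class IncrementDemo
{
    static void Main()
    {
        int a = 0;
        a = a + 1;   // explicit add

        int b = 0;
        b++;         // standalone post-increment

        int c = 0;
        c = c++;     // saves the old value (0), increments c, then assigns the old value back

        Console.WriteLine($"{a} {b} {c}"); // prints 1 1 0
    }
}
```

The third variant is the one that matches the extra mov instructions: the temporary at [ebp-44h] holds the old value that gets written back.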
Some other stuff
Along with the disassembly window, you can also view memory and the current state of the registers, two things I don’t use often but are cool to have nonetheless. To use any of these features, start debugging, then go to Debug > Windows and you’ll get a whole list of cool windows you can open (several of them only show up while a debug session is active). I won’t get into these windows in this post; I just wanted to point out that they exist.
So there you have it, how to use and read the disassembly window. If you liked this post, or the idea of using assembly to see whether one method of doing something is more efficient than another method, let me know and I may start a little series on it, maybe I’ll call it assembly duels or something corny like that. Anyway, thanks for reading, until next time.