Thursday, June 14, 2012

Windows 8 Release Preview is really different ....

As always, a new version of the Windows operating system is enough to tickle our curiosity, and Windows 8 Release Preview is no different.
We want to know if our products function and perform as expected so we quickly install and test them.
This had not yielded any surprises for many years, until we did the same with Windows 8 Release Preview.

Now I'm not going to discuss the new Metro interface, nor the lack of a start button.
I'm going to talk about Windows Forms in .NET.

As soon as we opened our GUI we noticed it was pretty much messed up. The toolbar was in the middle, the output settings were not visible at all. We ran our code (on Windows 8) in Visual Studio and got the same incorrect results. When we ran the exact same code on Windows 7 everything looked fine.

So what has happened?

As it turned out, we had a programming error in a TableLayoutPanel: we had used the same cell position twice.
This had worked fine in all Windows operating system versions since Windows XP and all .NET versions, but apparently our friends in Redmond found time to modify the Windows Forms library as well.
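
A check along the following lines could catch this kind of mistake early. It is only a minimal sketch, not code from our product: the names TableLayoutChecks and FindDuplicateCells are made up for this post, and the helper only looks at explicitly assigned cell positions (it ignores column and row spans).

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

public static class TableLayoutChecks
{
    // Hypothetical helper: returns every cell position that was explicitly
    // assigned to more than one child control of the panel.
    public static List<TableLayoutPanelCellPosition> FindDuplicateCells(TableLayoutPanel panel)
    {
        var seen = new HashSet<TableLayoutPanelCellPosition>();
        var duplicates = new List<TableLayoutPanelCellPosition>();

        foreach (Control child in panel.Controls)
        {
            TableLayoutPanelCellPosition cell = panel.GetCellPosition(child);

            // A column or row of -1 means the control is placed automatically.
            if (cell.Column < 0 || cell.Row < 0)
                continue;

            // HashSet.Add returns false when the position was already taken.
            if (!seen.Add(cell) && !duplicates.Contains(cell))
                duplicates.Add(cell);
        }
        return duplicates;
    }
}
```

Running such a check in a unit test or a debug build reports the problem regardless of how a particular Windows Forms version happens to lay out the conflicting controls.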


Saturday, January 14, 2012

Why and how we moved to Git

Recently we've switched all our version control from Subversion to Git.
In this article we'll tell you why and how.

Let's start with a little bit of history. From a very early stage we've used Subversion as version control. It works very well and has matured a lot since then. We store our code in several repositories and share code between them with svn:externals.

Freezing the state of a release with Subversion used to be a two-stage process. First we create a branch for the release. We then build, test and finally tag the release.
Because of the svn:externals links we were forced to create branches in all linked repositories and modify the svn:externals links in the main repository to point to those branches. If we had not done that, the parts of the code that are not in the main repository would not have been frozen.

So if that works well, why did we switch?
We develop features and make fixes on an hourly basis and want to test those changes before we put them in the main development line. Since all our testing is done on a build server, we need some mechanism to get the change over to that server. With branching (and especially merging) being a bit slow and costly on Subversion, we found that we were using several alternative ways to get the change over to the build server. Copying binaries, applying patches and even checking in to the trunk: we've done it all and we're not really proud of it.

We had to find a way that our version control system could be used to easily and quickly share changes and merge them back into the main development line.

That's where Git came into the picture. We read about its merge capabilities and its distributed nature. After some serious consideration and experiments we worked out how we could use Git to improve our processes.

Initially we used git-svn to import our existing repositories into Git repositories. It takes a while, but the end results are solid. We then had to find a good solution for our svn:externals links, as they are not automatically imported into Git. We looked at various options, from git-tree and git-submodules to custom solutions.
We decided to stick with git-submodules.

All our svn:externals links have been converted into git-submodules. We use a custom url prefix (url insteadOf in git config) to avoid having to refer to fixed server names.
With git-submodules we no longer have to create branches in the linked repositories when we label a release or create a branch. Since each git-submodule has a reference to the commit object in the linked repository, we always know the exact state of the total code base.

Now did this switch help us with the initial problem?
Yes. We're now able to create and merge branches very easily. Trivial merges are a thing of the past. The best part is that we just pull a change directly from the developer's repository onto the build server for testing.
The developer can continue with the next task (on a new branch) and merge the branch back into the main development line when all tests are ok.

Despite our long-standing use of Subversion we're now very happy with Git.
Of course, since we're Windows-based developers, the integration into Windows Explorer provided by TortoiseGit is essential for us.

Sunday, September 11, 2011

Silverlight and person names

Recently we found some issues in Silverlight obfuscation, so we created some additional test cases.

One of these test cases involves a local resource with a very simple Person class. This Person class only had two properties: Name and PhoneNumber.
In XAML this looked something like this:

<UserControl.Resources>
<local:Person x:Key="person1" Name="Guss" PhoneNumber="123456789"/>
</UserControl.Resources>

So far so good.
After obfuscation we inspected the XAML and found this:

<UserControl.Resources>
<local:Person x:Key="person1" Name="a" PhoneNumber="123456789"/>
</UserControl.Resources>

Of course we did not want to change Guss's name.

So why did this happen?

After some careful inspection we found that the Name property has a special meaning (which is obvious for every normal control) so the compiler generated an underlying field called Guss. Since this field was not public our obfuscator renamed it to "a" with the above XAML as a result.

The lesson we learned was that in XAML you had better give a person a FirstName and/or LastName. Those work as expected.
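
For illustration, the reworked test class could look something like the sketch below. The property names FirstName and LastName follow the lesson above; everything else is an assumption and not the actual test code.

```csharp
// Sketch of a resource class that avoids the specially treated Name property.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string PhoneNumber { get; set; }
}
```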

Friday, August 26, 2011

Why we love ILSpy ...

Often we get questions and comments about tools like ILSpy (and before that Reflector). Are we afraid of them? Don't we think they break our protection? And so on.

The truth is that we love these tools and ILSpy in particular. Let me explain why.

In the early days of .NET all you had to look inside an assembly was ildasm. This very basic tool, which comes with each .NET SDK, shows all metadata of an assembly and the IL code found in method bodies. This was of course a step up from the native code we all used before .NET, but you cannot say it's a friendly way of inspecting an assembly.

Now let's first go into why you would want to inspect an assembly anyway. Besides the obvious "darker" use cases typically of interest to hackers, there are also very valid and very legal use cases. For example, many developers will find that as their code base grows, so grows the chance that they include a class or set of classes by mistake. It also happens that you had resources in your assembly in version 1 that you no longer use in version 2. Did you remember to remove them?
Inspecting an assembly is an easy and fast way to look for mistakes like these.
Of course, as builders of protection software we also like these tools because they show the result of our work. Many customers want to see the result of obfuscation. Combined with our Inspection friendly mode, they are also valuable tools for finding obfuscation issues, should they occur.

After ildasm, Reflector was the de facto standard for looking inside assemblies for a long time.
Its easy access to all types, members and resources, combined with a decompiler, also made it our preferred tool at that time.

For reasons most .NET developers know, Reflector has now been de-emphasized and other tools are taking its place. One of them only started early this year and has quickly become our favorite: ILSpy. We like this tool because it keeps the same easy access to assemblies that Reflector offered, but extends it with a great decompiler. This means that simpler tricks, like inserting (valid) instructions that broke Reflector, are no longer sufficient and people can clearly see the difference between good code obfuscation and easy code obfuscation.

In other words, having good assembly inspection tools with better and better decompilers helps all developers with their normal work and keeps us sharp to stay ahead of them with ever smarter code obfuscation.

Thursday, June 16, 2011

VSPackage and Setup project adventures

For the Visual Studio integration of our obfuscator, we use the Managed Package Framework (MPF) from Microsoft.
With this framework you can build custom project types in .NET. This is a large improvement, because the native interface for Visual Studio is a large set of COM interfaces.

One custom project type we developed is the DeepSea Obfuscation project type. It contains implementations for the nodes used in the solution tree, support for building the project using MSBuild and lots of other details.

One of the details has kept us busy for a while.

Many developers use the Setup and Deployment project that comes with Visual Studio. Although far from perfect, it's an easy way of creating an installer for your project.
We believe it is far from perfect because this project type is the only commonly used project type that is not MSBuild based. Therefore you cannot build it with normal MSBuild constructs, unless you start devenv.com to build it. This also means that the project file, although readable, is not easily editable.

Now when you combine our obfuscation project with a setup project, all is fine in VS2008. However, on VS2010 you get the error "Unrecoverable build error". We hope you agree with us that Microsoft could have chosen a more descriptive error message.

Fortunately we've managed to resolve this issue and we want to share how.
As part of a custom project type, you have to tell what the outputs are for your project. That is done by creating the proper OutputGroup objects which implement IVsOutputGroup2. The output group then has methods to enumerate the outputs found in that group. Each output must implement IVsOutput2.

One of the methods of IVsOutput2 is get_Property. According to the specification you're supposed to return E_NOTIMPL when the requested property is not known/found. The MPF implements this behavior correctly. After extensive testing we found that the Setup project asks for a property called "MERGEMOD". This property is not found, therefore E_NOTIMPL is returned. Besides returning this (correct) value, the output parameter is set to String.Empty. As it turns out, the Setup project will not accept this value.

We solved this problem by making sure that the output parameter is set to null instead of String.Empty when the property is not known/found.
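
The idea can be sketched as follows. This is a simplified, standalone stand-in and not the MPF source: the class OutputPropertyBag and its Set method are hypothetical, and GetProperty only mirrors the shape we assume the managed get_Property method has (a property name in, an object out, an HRESULT as return value).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the property lookup behind IVsOutput2.get_Property.
public sealed class OutputPropertyBag
{
    // COM HRESULT values.
    public const int S_OK = 0;
    public const int E_NOTIMPL = unchecked((int)0x80004001);

    private readonly Dictionary<string, object> known =
        new Dictionary<string, object>(StringComparer.OrdinalIgnoreCase);

    public void Set(string name, object value)
    {
        known[name] = value;
    }

    public int GetProperty(string name, out object value)
    {
        if (name != null && known.TryGetValue(name, out value))
            return S_OK;

        // Unknown property: hand back null, NOT String.Empty,
        // together with E_NOTIMPL.
        value = null;
        return E_NOTIMPL;
    }
}
```

With this shape, a request for "MERGEMOD" yields E_NOTIMPL and a null output value, which is the combination the Setup project turned out to accept.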

So, dear Microsoft developers: when we return E_NOTIMPL we do mean it is "not implemented". Please stick to your own specification.

Tuesday, June 14, 2011

Benefits of a strong name

Every .NET assembly can be signed with a strong name. What are the benefits, and why is it good for obfuscation?

When you sign your assembly with a strong name, a hash is calculated over the entire assembly and the result is signed using a strong name key and stored in the assembly. Because the strong name key is asymmetric, the public key is stored in the assembly, while the private key remains safely in your development system.
The public key itself is also hashed to calculate a public key token. This is an 8-byte value that you see in assembly references.
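
To make the relation between public key and token concrete, here is a small sketch. The token is the last 8 bytes of the SHA-1 hash of the public key, in reverse order. The class and method names (StrongNameInfo, ComputePublicKeyToken, ToHex) are made up for this post.

```csharp
using System;
using System.Security.Cryptography;

public static class StrongNameInfo
{
    // Computes the 8-byte public key token from a full strong name public key.
    public static byte[] ComputePublicKeyToken(byte[] publicKey)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(publicKey);

            // Take the last 8 bytes of the hash and reverse their order.
            byte[] token = new byte[8];
            for (int i = 0; i < token.Length; i++)
                token[i] = hash[hash.Length - 1 - i];
            return token;
        }
    }

    // Formats the token the way it appears in assembly references.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
}
```

You can compare the result with what the framework reports for any signed assembly, for example by passing AssemblyName.GetPublicKey() in and checking the outcome against AssemblyName.GetPublicKeyToken().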

At runtime, the CLR can verify the strong named assembly by recalculating the hash and checking the result against the signature stored in the assembly (which it has to decrypt using the stored public key). The CLR now knows if the assembly has been modified after it was signed with the strong name.
The CLR does NOT know who signed it, when it was signed etc. All it knows is if modifications were made.

So why does this help me?

First of all, if you want to install an assembly into the GAC (global assembly cache), you must sign it with a strong name. If not, the GAC will not store the assembly.
Furthermore, if you're developing an application that has a strong name, all libraries it uses must also have a strong name. You cannot refer to an assembly without a strong name from an assembly with a strong name. This is why it is so important for component vendors to produce strong named assemblies.
In the end it's all about knowing that the assembly you're using is the assembly you intended to use, not a hacked replacement.

Can a strong name signature be hacked?

Yes, in fact that is fairly easy. All a hacker has to do is re-sign the assembly (after making some modifications) with a strong name. As long as you keep your strong name key in a safe place, he has to use a different strong name key than the one you used originally. But at verification time, the assembly is still a valid assembly. The signature is correct.

This is where obfuscation can help. Several obfuscation features use some form of encryption. For example string encryption. It converts literal strings found in the code into encrypted versions, which are decrypted at runtime. Our obfuscator uses the strong name key of an assembly in this encryption/decryption process. The result is that if a hacker has modified the strong name of the assembly, all literal strings used in the code will be altered. This will not break a "hello world" style application, but all larger products will eventually fail when their strings are incorrect.
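
The principle can be shown with a deliberately naive sketch. This is NOT the algorithm our obfuscator uses and it is not real encryption: it simply XORs each character with a byte of the public key token, and the class name KeyedStrings is made up for this post. It only demonstrates that a string transformed with one key comes out wrong when a different key is present at runtime.

```csharp
using System.Text;

public static class KeyedStrings
{
    // Toy transformation: XOR every UTF-16 code unit with a byte of the token.
    // Applying it twice with the same token restores the original text.
    public static string Transform(string text, byte[] keyToken)
    {
        var result = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            byte keyByte = keyToken[i % keyToken.Length];
            result.Append((char)(text[i] ^ keyByte));
        }
        return result.ToString();
    }
}
```

Transforming a literal at build time with the original token and transforming it again at runtime with that same token gives back the literal. After re-signing with another key the token differs, so the second transformation produces a different string.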

Monday, May 30, 2011

Control flow obfuscation and Assembly verifiability

Assembly verifiability used to be an issue for assemblies running in a medium-trust ASP.NET environment. Many commercial web hosters did not allow ASP.NET applications to run unless they were 100% verifiable.
Nowadays verifiability is becoming ever more important with Silverlight and Windows Phone 7.

But what does this verifiability mean?

It means that the CLR can verify, before your code is ever run, that the assembly will not do anything outside the control of the CLR. For example, running unmanaged code is (of course) a big no-no when you want your assembly to be verifiable. The CLR specification lists many rules for verifiability. The specification also lists, for each IL instruction, the requirements with respect to verifiability. These rules are implemented by a tool like peverify.

So how does obfuscation influence verifiability?

Most transformations done by an obfuscator do not change the verifiability of your assembly. However, some do; the most important of them is control flow obfuscation.

Control flow obfuscation is a transformation that takes the original code of a method body (a sequence of IL instructions) and alters this sequence in such a way that decompilers no longer recognize the original control flow although the behavior remains the same.
Here's an example:

for (int i = 0; i < 5; i++) {
  Console.WriteLine("i=" + i);
}

This C# body is compiled to this IL sequence:

IL_0000: nop
IL_0001: ldc.i4.0
IL_0002: stloc.0
IL_0003: br.s IL_0021
// loop start (head: IL_0021)
    IL_0005: nop
    IL_0006: ldstr "i="
    IL_000b: ldloc.0
    IL_000c: box [mscorlib]System.Int32
    IL_0011: call string [mscorlib]System.String::Concat(object, object)
    IL_0016: call void [mscorlib]System.Console::WriteLine(string)
    IL_001b: nop
    IL_001c: nop
    IL_001d: ldloc.0
    IL_001e: ldc.i4.1
    IL_001f: add
    IL_0020: stloc.0

    IL_0021: ldloc.0
    IL_0022: ldc.i4.5
    IL_0023: clt
    IL_0025: stloc.1
    IL_0026: ldloc.1
    IL_0027: brtrue.s IL_0005
// end loop

A decompiler will clearly recognize this sequence as a for loop.

To alter this sequence into a non-recognizable sequence, there are various approaches.
One common approach is to insert "bogus" jumps to other instructions. These jumps are typically guarded by opaque predicates to prevent the jump from ever actually being taken. An opaque predicate is an expression for which you know at build time what the outcome will be. You can use them like this:

if (opaque predicate is true) {
  execute normal code
} else {
  bogus jump to other instruction
}

If you know up front that the opaque predicate will always evaluate to true, this code will never alter the overall behavior.
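
As a concrete illustration, here is one classic opaque predicate written in C#. It is just an example chosen for this post (the names OpaqueDemo, AlwaysTrue and Run are made up), not necessarily what any particular obfuscator emits. The product of two consecutive integers is always even, and integer overflow does not change the lowest bit, so the test is true for every int.

```csharp
public static class OpaqueDemo
{
    // Always true: x * (x + 1) is even for every int, even when it overflows.
    public static bool AlwaysTrue(int x)
    {
        return unchecked(x * (x + 1)) % 2 == 0;
    }

    public static string Run(int x)
    {
        if (AlwaysTrue(x))
        {
            return "normal code";
        }
        else
        {
            // Stands in for the bogus jump; this branch is never executed.
            return "bogus branch";
        }
    }
}
```

A decompiler that cannot prove the predicate is constant has to treat both branches as reachable, which is exactly what makes the recovered control flow harder to read.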

Now back to code verification.

Inserting a bogus jump is a simple solution that will break most decompilers, but unless you use it very carefully it will also break the verifiability of your code. The reason is that one of the verification rules states that the stack height (and signature) at each instruction must be determinable in a single forward pass.
Let's demonstrate this with a few instructions:

IL_001: ldc.i4.4
IL_002: ldloc.0
IL_003: add
....
IL_025: ldstr "hello world"
IL_026: br IL_002

The first instruction pushes an int (value 4) on the stack.
The stack height at the beginning of IL_002 is now 1 and the stack contains a single int typed value.

The second group of instructions is our bogus jump.
At IL_025 a string is pushed on the stack. At IL_026 a jump is made to IL_002.
The stack height at the beginning of IL_026 may be 1, but the stack contains a string typed value. The jump to IL_002 will break the verifiability rules because the stack has a different signature than before.
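
The comparison that fails here can be sketched in a few lines of C#. This is a strong simplification of what a verifier such as peverify does: it compares type names literally, whereas the real rules also allow certain compatible types to be merged. The names StackMergeCheck and CheckBranch are made up for this post.

```csharp
public static class StackMergeCheck
{
    // Compares the stack recorded at a branch target with the stack that
    // arrives via a jump. Returns null when they match, otherwise a short
    // description of the first difference.
    public static string CheckBranch(string[] recordedAtTarget, string[] incoming)
    {
        if (recordedAtTarget.Length != incoming.Length)
        {
            return "stack height differs: " + recordedAtTarget.Length +
                   " vs " + incoming.Length;
        }

        for (int i = 0; i < incoming.Length; i++)
        {
            if (recordedAtTarget[i] != incoming[i])
            {
                return "stack slot " + i + " differs: " +
                       recordedAtTarget[i] + " vs " + incoming[i];
            }
        }
        return null;
    }
}
```

For the example above, the stack recorded at IL_002 is { "int32" } while the jump from IL_026 arrives with { "string" }, so the check reports a difference in slot 0 even though the stack height is the same.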

In future blogs we'll describe control flow obfuscation techniques that do not break the verifiability of your code, but do turn the code into unreadable "spaghetti".

So the next time you see "This method is obfuscated" in your decompiler, do make sure to verify your assembly with peverify.