|
|
|
|
 Tuesday, March 02, 2004
|
Update 2004-03-07: Added screenshots.
Read the intro to find out why I'm writing this.
Alright, before we get into attacking .NET, let's see how it's done against common Win32 programs in x86. First, you'll need a good disassembler/debugger. I recommend OllyDbg. It's very easy to use, and does a good analysis of the code, which helps us out quite a bit. SoftICE is another alternative, but it's low-level, harder to use, and it costs $1000. People tend to use this when they want to debug something like a device driver, or make a patch for Windows.
Here's the executable I wrote for this sample: SimpleCode.exe (44 KB) and if you feel like cheating, the source code: SimpleCode.cpp.txt (1.28 KB). It's very simple. In fact, the whole purpose is to validate the user code -- there's no real content that's protected. However, it will be enough to learn from. Also note that it only runs on Windows 2000 and above. If you aren't using that OS, upgrade :), or get the code and fix it, or email me for a version you can use.
So, let's open OllyDbg and make sure the analysis options are on (Alt-O, check all of them out). Now, load SimpleCode.exe. OllyDbg loads and disassembles the code. You now have a console window open, and a bunch of x86 on your screen. Let's run through the program (F9). Enter 4 chars for your serial, and 4 for your activation code (no checking is done, so you'll screw up the program if you enter more data). A message box appears telling us the code is invalid:

That's our way in, for this example. We know that somewhere before the message box was shown, our activation code was tested. So, let's go breakpoint at the message box. Restart SimpleCode (Ctrl-F2). Right click in the main window and select Search for -> All intermodule calls. In the new window, type MessageBox. You'll see two calls to MessageBoxA. A real program would have many more. Right click one of the calls and select “Set breakpoint on every call to MessageBoxA”.

Run the program and enter fake serial/activation again. The program breaks at “00401163 |. FF15 DC804000 CALL DWORD PTR DS:[<&USER32.MessageBoxA>>; \MessageBoxA”. If we look up a bit, we can see that the arguments loaded are for the invalid serial. This is the message box we want. Go into breakpoints (Alt-B) and disable both breakpoints. Now, the opcode right after the MessageBoxA call is C3, RETN, the end of the function. Considering the code for this function is very short (21 lines), it should contain only the “bad” code -- code we don't want executing. Press F8 to step over that call. Dismiss the message box. Notice you can press “;” to add comments to lines. It'd be good to mark this line with something like “Return from displaying bad message box.”, just in case we get lost later on. In many programs, there will be many interesting points, so good commenting is key.

If you're going to be doing real attacking, you need to learn some x86. The important things are CALL, RETN, the various jumps, and comparisons, because most likely, somewhere inside your target program, a check is performed and then a corresponding action is taken. If we can reverse the logic, then we can make the program think correct data was entered when it wasn't (and the opposite: correct data will be considered incorrect).
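To make the idea of reversing the logic concrete, here is a minimal sketch in Python rather than x86. The check itself is made up (the real algorithm in SimpleCode.cpp isn't reproduced here), and the function names and the failure message are assumptions for illustration only.

```python
def is_valid(serial: str, code: str) -> bool:
    # Hypothetical rule: the activation code is the serial reversed.
    return code == serial[::-1]


def original_program(serial: str, code: str) -> str:
    # A check is performed, then the corresponding action is taken.
    if is_valid(serial, code):
        return "Thanks for activating"
    return "Invalid code"


def patched_program(serial: str, code: str) -> str:
    # Same check, but the branch is inverted, which is what a patch does.
    if not is_valid(serial, code):
        return "Thanks for activating"
    return "Invalid code"
```

Note the side effect: in the patched version a wrong code is accepted, and the one correct code is now rejected.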
Now we're about to return to the point that called this function. Press F7 to see where that takes us. Now we're on “00401274 |. 8B4C24 3C MOV ECX,DWORD PTR SS:[ESP+3C]”. The line above that is the callsite of the “bad display function”. Comment it as such. Look around. OllyDbg should display some arrows indicating jumps and targets. If it doesn't, go into debugging options and check your settings.
Notice that the callsite of the bad function is a jump target from “00401259 |. /74 14 JE SHORT SimpleCo.0040126F”. If we take the jump, we end up calling the bad function. If we don't, we RETN (look at the line right above the bad callsite). Sounds interesting. Set a breakpoint on that JE instruction, restart, run and enter the data.

JE means “jump if equal”. Its opposite is JNE (jump if not equal). Our program is stopped right now at a JE, and OllyDbg says the jump will be taken. Since the jump goes someplace bad, we don't want it to happen. Press space. This opens the reassembler. Change the JE to JNE and press assemble. OllyDbg patches the in-memory executable.
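At the byte level, the change is tiny. The listing shows the instruction as the two bytes 74 14: 0x74 is the opcode for a short JE, and 0x14 is the displacement (0040125B + 14 = 0040126F, the jump target). A short JNE uses opcode 0x75 with the same displacement. The Python sketch below applies that one-byte flip to a byte buffer; the function name is made up, and the offset into a real file would have to be found separately (it is not the same as the virtual address OllyDbg shows).

```python
JE_SHORT = 0x74   # opcode for JE/JZ with an 8-bit displacement
JNE_SHORT = 0x75  # opcode for JNE/JNZ with an 8-bit displacement


def flip_je_to_jne(code: bytes, offset: int) -> bytes:
    """Return a copy of code with the short JE at offset turned into a JNE."""
    if code[offset] != JE_SHORT:
        raise ValueError("no short JE at this offset; refusing to patch")
    patched = bytearray(code)
    patched[offset] = JNE_SHORT  # the displacement byte is left untouched
    return bytes(patched)


# The two instruction bytes from the listing: JE SHORT with displacement 0x14.
original = bytes([0x74, 0x14])
patched = flip_je_to_jne(original, 0)
```

Checking the existing byte before writing is a cheap guard against patching the wrong spot.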

Let's see what happens. If we're lucky, this will call the “good” code. If not, we just patched something else and the program at best is going to do something strange, but most likely will crash and burn. Press F9.

What's that? Thanks for activating? Why, you're quite welcome! That jump did it. Wasn't that easy? And we didn't have to learn much x86 at all. To save your changes, we'll need to restart (OllyDbg will complain since the breakpoint code was patched and changed), go to the breakpoint, and re-patch. This time, right click and select “Copy to executable -> All modifications”. Now we've got a patched program.
This was extremely easy (it was a very simple program!), and just demonstrates one way that someone could attack your code. It's also an inflexible attack (a binary patch, versus finding the algorithm), so if a new version is released, we need to debug and patch it again. Hope you learned something!
Update 2004-3-8: Part 2 now available.
|
|
Code | Security
|
Tuesday, March 02, 2004 7:20:18 PM UTC
|
|
To defend, you must have some idea of what you're defending, and who and what you're defending against, specifically, which attacks. Failure to understand and know these things means that your defense will most likely not be effective, and could in fact decrease your security. Here's an example:
Near where I live, thieves were stealing cars that people parked in the street. The neighbourhood committee decided that they'd stop this. The solution they implemented was to put gates at all entrances and exits of their area, and have guards that only allow cars with a particular sticker get through. This makes people FEEL more secure. However, for the cost (guardhouses and gates construction, guard salaries), it's not as effective as it could be. A thief can still walk in just as easily (gates only block roads), and when driving a stolen car out, the guards will see the car and sticker, recognize it, and let them leave. If they had thought about how thieves operated, then they would have realised this and done something more effective, perhaps hiring the same number of guards, but setting them on a patrol, instead of just sitting at their posts. With unlimited resources, they could do both things, and give each member a special remote key-code to unlock the gate when they are driving. However, the tradeoff in cost and convenience is too high for them.
This is how security is, in the physical and electronic worlds. We have many possibilities, each with their tradeoffs. Deciding which measures to implement requires us to understand how our opponent is going to operate, as well as the details of how exactly our defenses work.
In this series, I'm going to show you how to crack simple code. I'm going to make a series of samples to try this out on (to avoid DMCA problems with real code), so as to get a feel of what crackers do to code. It is not going to be in-depth or show how to become a master cracker. Just enough so that we could attack a simple Windows/.NET program's licensing key system, which is a common theme in software protection.
Continue to Part 1, where we'll crack some simple code...
|
|
Code | Security
|
Tuesday, March 02, 2004 5:26:40 PM UTC
|
 Monday, March 01, 2004
|
Now that I've decided on which library to use, I'll describe the actual code.
We already know that HTML, esp. in Internet Explorer, provides many attack vectors. And new versions of the browser could add another tag or attribute that can execute code. So we need to use a whitelist, not a blacklist.
Next, there are many more legit users than attackers. So when dangerous content is detected, it needs to be removed -- we can't just blow up and tell the user not to hack us. The number of false positives could actually be rather high, since some people are going to use Word and end up with a lot of tags and who knows what else. And finally, users could accidentally paste something that's potentially dangerous. Yelling at them, or even telling them to fix their code, isn't going to work, since they may not even be aware that HTML exists.
So, here's the code: SafeHtml.cs.txt (3.28 KB). It's very short and easy, thanks to the HtmlAgilityPack. The processing of style tags is pretty weak (simple replacements), but should do the trick. Enjoy!
Update 2004-Mar-04: Forgot to handle <A href=”scriptType:code...”>. Be sure to add that if you use this code in production.
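For the href case, the whitelist idea applies to URL schemes as well. Here is a rough sketch in Python (not the C# from SafeHtml.cs, which isn't shown here); the function name and the set of allowed schemes are my own assumptions. It strips whitespace and control characters first, since browsers tend to ignore those inside a scheme.

```python
from urllib.parse import urlsplit

# Assumed whitelist; adjust to whatever your application really needs.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def is_safe_href(href: str) -> bool:
    """Accept relative URLs and whitelisted schemes; reject everything else."""
    # Drop spaces, tabs, newlines and other control characters up front.
    cleaned = "".join(ch for ch in href if ch > " ")
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        return False  # unparseable URL: treat as unsafe
    return scheme == "" or scheme in ALLOWED_SCHEMES
```

An empty scheme means a relative link such as /account, which stays on the same site and is allowed here.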
|
|
Code | Security
|
Monday, March 01, 2004 7:14:24 PM UTC
|
|
Following up from part 1, I reviewed three different libraries: HtmlAgilityPack, SgmlReader, and HTMLDocument.
HTMLDocument is a commercial component ($249 per dev, inc. source code). The other two are libraries written by some cool people at Microsoft and include source code.
SgmlReader is basically an XmlReader that can handle HTML. To write, we need to use an XmlWriter, and that can mess up the HTML, and we don't want that. SgmlReader seems like it'd be ok if all we wanted to do is determine if there's unsafe content and then return false, but that's not what we need.
However, both HtmlAgilityPack and HTMLDocument read HTML and create a DOM out of it, allowing you to modify it and write the HTML back out. This is what we need. I briefly looked over both libraries to see which one I want to program against. I gave them both an equal rating to start off with, but the scales rapidly tipped in favour of one library.
HTMLDocument definitely loses as far as API niceness and robustness. Some problems:
- Inconsistency when loading data into the HtmlDocument. If you have a string, it needs to go in the constructor, otherwise, use an instance method.
- Enums (both of them) are prefixed with “e”. Why?
- Lack of types. There are four types total. That's all. No HtmlAttribute. No HtmlElementCollection. Nothing like that.
- Weak-typed collections. ArrayLists and HashTables are used as the collections, instead of strongly-typed collections. So you must cast, and if you insert an unsupported object, then it will throw an exception when writing the HTML. Not very robust.
- And the silliest thing of all: No encoding support. Worse than that, FORCED ASCII. If you open a file, their code opens a stream, manually passing ASCII encoding. No BOM detection, no system default, just ASCII. Ouch.
These things made me seriously doubt how professional a library HTMLDocument is. Most of these things are ultra-simple to fix. If I were forced to use this, I'd have to buy the source code just to make it right. It seems like its purpose is to demonstrate how not to construct a class library.
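To show how little it takes to do better than forced ASCII, here is a small BOM-detection sketch in Python (not .NET, and not taken from either library). The function name and the UTF-8 fallback are assumptions; a real loader might fall back to the system default instead.

```python
import codecs

# UTF-32 LE must be tested before UTF-16 LE: its BOM starts with the same bytes.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode raw bytes using the BOM if present, else the fallback encoding."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(name)
    return raw.decode(fallback)


# What forced ASCII does to non-ASCII input: every such byte is lost.
mangled = "año".encode("utf-8").decode("ascii", errors="replace")
```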
What's more is that HtmlAgilityPack doesn't have any of these flaws. In fact, it seems like it's actually a missing piece of the base class libraries. Superbly done. Writing code against it was so easy and natural. I'm extremely impressed. Even the documentation is much more complete (it comes with a 180KB HTML Help file, compared to HTMLDocument's 36KB HTML Help file).
Hands-down-winner: HtmlAgilityPack.
|
|
Code | Security
|
Monday, March 01, 2004 6:54:09 PM UTC
|
|
Since I'm about to leave Guatemala after living here for over six years, I thought I'd jot down some experiences so as to remember them. I'm not making this up.
At a store, my father asked for some soap. He was told that they currently did not have any, and that they wouldn't for two weeks. My father suggested that if they were selling so much soap, perhaps they'd order more. The storekeep smiled and said “Well actually, we never sell soap for the second half of the month. Our records show that we only sell soap for the first two weeks and then none for the rest of the month. So, we only buy for the first two weeks.”
I went to buy a microwave at the biggest Sony distributor in the country. In the front of the store they had a very interesting microwave with some really advanced features. I asked how much it was, and was told that I couldn't buy it, since they didn't have any. When I pointed out to the salesperson that they did, in fact, have one, and it was right there, he said “That's our display unit. If we sold that, we wouldn't be able to show it to other customers.”
|
|
Humour
|
Monday, March 01, 2004 2:29:43 PM UTC
|
 Sunday, February 29, 2004
|
I was going to write a series about learning MSIL (Microsoft Intermediate Language, or simply “IL”), and then get into more advanced topics. However, I found a good tutorial (and no doubt there are more if I use Google for a minute) at CodeGuru, called MSIL Tutorial. It should be enough to get people up to some speed.
I'll be writing some articles about how people actually attack programs, starting with nice x86 assembler, and then showing how attacks against .NET programs can use many of the same vectors. I'll show how, even with some weak obfuscation (and by weak I mean pretty much every product currently available), crackers still have an easier time on .NET than on native x86/Win32. Then I'll talk about some mitigation techniques that can be used to make things somewhat harder.
|
|
Code | Security | IL
|
Sunday, February 29, 2004 5:43:53 PM UTC
|
|
One thing about living in Guatemala is that McDonalds has a delivery service. I don't think they do in Canada or the states. I wouldn't usually write this, but they had some awesome service today. My nephew stayed at my house last night, and this morning we called for a Happy Meal. He is collecting the current toy line, so we asked for a specific part.
Well, when the delivery guy showed up, they had the wrong one. I figured we'd go later and change it. A few minutes later, McDonalds calls to apologize for the mistake and invites us to come by to change the toy. An hour later, they show up at my house, just to deliver the new toy and apologize again. WOW! That'd be impressive in most countries, but it's doubly so in Guatemala, where the concept of customer service is pretty much non-existent.
|
|
Personal
|
Sunday, February 29, 2004 5:32:28 PM UTC
|
|
The other day I got an interesting email. A client for whom we had written a payment processing system was having trouble with MQSeries (shudder), and was pinning the blame on our system. The issue was that when MQSeries dispatched a message that eventually timed out (30 seconds), the payment server blocked until the timeout was returned.
At first, we thought that MQSeries was to blame. After all, it's a most annoying piece of software (it starts up around 30 processes for some reason). There's a reason that IBM's consulting division makes so much money :). But we thought that serializing all connections was a bit bad, even for IBM.
Remoting seemed unlikely. After all, how could anything be scalable if remoting used only one thread? After some tests, we found out the cause.
Apparently there is a bug in remoting. Or perhaps it's by design. The result is that it appears as if remoting tries to keep one and only one thread per CPU active. This could be a performance benefit, assuming your thread is using the CPU. However, when your thread blocks, for instance, calling Thread.Sleep for more than a few hundred milliseconds, or calling WaitOne with an indefinite timeout, remoting releases a new thread. This is actually a decent scenario for most things, since it assures your CPUs are operating at the highest efficiency.
The problem was that MQSeries was being called via an MC++ interop library. (IBM didn't have a .NET library when we wrote this, and apparently their new .NET library for MQSeries is pretty bad.) Since it's unknown what happens inside of a P/Invoke request, no thread is released.
The ideal workaround would be to let remoting know that a new thread should be released. However, I'm unsure of how to do this (or if it's even possible), and thus, the workaround is to manually multithread your server-side code where needed.
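The shape of that workaround, sketched in Python rather than .NET: hand each blocking call to a worker thread so that one slow call doesn't hold up the others. Everything here is a stand-in. `blocking_queue_call` just sleeps to imitate the interop call into MQSeries, and the names and pool size are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_queue_call(message: str) -> str:
    # Stand-in for the blocking interop call; the real one could take up to
    # the 30 second timeout.
    time.sleep(0.2)
    return f"reply to {message}"


def handle_requests(messages, workers: int = 8):
    """Run the blocking calls on our own threads instead of one at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the replies in the same order as the requests.
        return list(pool.map(blocking_queue_call, messages))
```

With four messages, the calls overlap, so the total time is close to one call's duration rather than four.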
|
|
Code
|
Sunday, February 29, 2004 3:07:31 PM UTC
|
|
In an application I'm currently writing, we allow users to write messages with HTML markup in them, to deliver a rich experience. The obvious problem is making this secure. We don't want UserA to write a malicious script and steal some of UserB's data. IE provides some cross-site scripting defense, but defense-in-depth (well, not even that deep in this case) means we should ensure that the HTML doesn't contain anything executable. I've seen some samples that claim to clean the HTML with not much code at all. They check a few tags, and they think they're done. Of course, they aren't.
The problem is that IE is extremely powerful. While this is great when developing an intranet application, it makes finding all the attack vectors nearly impossible. For instance, we might think that a style attribute is ok, right? Wrong. There are two problems that I can think of (without thinking too hard). First, someone could use styles to “overwrite” links on the page by using absolute positioning. They could then change the “My Account” link into a link that goes to their own server, and steal the user's information. Second, the style attribute can be used to load an HTML Component (.HTC). This can contain lots of script. That's bad. And this is just in one little attribute!
Needless to say, there are many, many more attack vectors. Even if we could find them all, that doesn't help users when they get a new browser with upgraded and different capabilities. So, we're going to have to resort to a “safe” HTML subset. We'll go through the MSDN reference and pick out the tags and attributes that we consider safe, and anything else will simply get deleted.
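As a rough illustration of the safe-subset idea, here is a sketch using Python's standard html.parser module. It is not the library discussed in this series, and the tag and attribute lists are placeholder assumptions, not a vetted whitelist. Anything not listed is dropped, and the contents of script and style elements are dropped along with the tags.

```python
from html import escape
from html.parser import HTMLParser

# Placeholder whitelist; a real one would come from reviewing the reference.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "br", "ul", "ol", "li"}
ALLOWED_ATTRS = {"p": {"align"}}   # per-tag attribute whitelist
VOID_TAGS = {"br"}                 # tags that never get a closing tag
DROP_CONTENT = {"script", "style"}  # remove these together with their text


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: delete the tag, keep its text
        allowed = ALLOWED_ATTRS.get(tag, ())
        text = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs if name in allowed
        )
        self.out.append(f"<{tag}{text}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html: str) -> str:
    parser = Sanitizer()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Because the lists only say what is allowed, a new tag or attribute in a future browser is removed by default.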
Sounds easy enough, except we've got to parse the HTML. Not fun. Fortunately, I've found two libraries that do this. The HtmlAgilityPack, written in C# by Simon Mourier from Microsoft (source included), and DevComponents.com's HTMLDocument, a commercial but inexpensive library. If anyone knows of other HTML parsing libraries, please leave a comment. In part 2, I'm going to review the APIs of the different libraries.
|
|
Code | Security
|
Sunday, February 29, 2004 4:45:27 AM UTC
|
|
Well, for the last two weeks I've been using some shade of gray (205,205,205) as my Window colour (all backgrounds). And for the most part, applications have worked just fine, not like my last experience. Perhaps I need a darker shade, but I'm worried that the reduced contrast will start straining my eyes and negate the benefit of non-white background to begin with.
Of course, I could change the text colours to white, but I really doubt anything would look good then... Anyways, give it a spin! Turn down the amount of energy that your display is emitting and see how it feels.
|
|
Misc. Technology | Personal
|
Sunday, February 29, 2004 4:28:38 AM UTC
|
|
In the comments for my post, “Some colour tips for Visual Studio .NET”, Michael Carter writes:
“I'm also using Lucida Sans Typewriter as my default font. I think it's much easier to read than Courier.”
Easier than Courier [New]? I had to try. Well, after playing with Lucida Sans Typewriter for about 5 minutes, I found that going back to Courier New was impossible. Thanks for the tip Michael!
|
|
Misc. Technology
|
Sunday, February 29, 2004 4:25:05 AM UTC
|
 Thursday, February 12, 2004
|
My earlier post about thinking abstractly about language versus text works because when users are put through something, and must deal with it for a bit, hopefully they will be more sensitive to others who might deal with that circumstance all the time.
Case-in-point: Colours. Why is it that so many developers just ASSUME I'm going to use the standard Windows colour scheme, and then decide that using system colours or transparent colour is too much work, and that they'll just set it to White, since it works?
While talking about background colours in VS.NET, I remembered that this of course applies to many applications. In fact, a while back, I tried to switch the background text color to a nice gray. I found out that my system looked like crap, since over half of the apps I use don't play nice. Some are unreadable, others hurt a LOT to read. I think it was a version of some CD burning software that decided to use Red (FF0000) for some text. Red next to the gray I used turned out to be an optical illusion of pain.
Websites have the same problem too, although most of them have the inverse problem. The designers want the background to be white, and rely on the default. This is NOT necessarily a bad thing. If I set my colour scheme for a dark background, I'd enjoy reading/writing text on a site with a dark background (all the white on my own site is starting to annoy me...).
Eventually, I ended up going back to a white background, painful as it is. But, it's been a while, and perhaps devs are smarter now? I'm going to go switch now and see how things work out.
|
|
Code | Misc. Technology
|
Thursday, February 12, 2004 3:21:20 PM UTC
|
|
One thing I don't understand is why VS.NET ships with no color coding for strings. It's right there in the options. But, it's left as automatic. Considering how much strings are used in .NET coding, I'd think they'd warrant a bit more attention. I set my string color to Maroon. It's dark so it doesn't stick out too much, but just enough to let me know where character and string data are.
When writing code (esp. when mixing string literals with code, as I am now for outputting dynamic JScript to web pages), this helps me catch a lot of errors that I'd normally find at syntax checking or compile time. When scanning through to make a change somewhere, the string data sticks out enough that I can easily find a section. I also know explicitly where I'm passing strings around (and thus can find places that might have a refactor possibility).
For those of you who haven't, go into VS.NET Tools -> Options -> Environment -> Fonts and Colors. Go change your string colour to maroon and see if you like it.
My second tip is against eye strain. By default, you have a white background. That's fine if you deal with paper all the time, and thus most text is dark on light. However, if you're like many programmers, time spent on paper during the day (reading programming books in bed doesn't count) is significantly less than time on-screen. Thus, you can benefit by changing text to be light on dark, or in my case, dark on not-as-light.
What I've done is change my text background to gray (specifically 205, 205, 205). It's light enough that the standard text colours work, but it's dark enough that there is a significant reduction in light output from my monitor. At first it's a bit odd, but quickly you start to feel more comfortable. Naturally, there's less strain on your eyes, since there is less energy going in. This may be one of those things that takes a few years (like ergonomic keyboards) before you realise the benefit. Since eyes are harder to fix than wrists, I'd play it safe and try to reduce strain as much as possible instead of having problems later on.
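To put rough numbers on this, the sketch below uses the WCAG relative-luminance and contrast-ratio formulas. Those formulas are not something I used when picking the colour; they're brought in here only as one common yardstick, and the function names are made up.

```python
def relative_luminance(rgb) -> float:
    """WCAG relative luminance of an sRGB colour given as 0-255 components."""
    def linear(component: int) -> float:
        c = component / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(colour_a, colour_b) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(colour_a), relative_luminance(colour_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


GRAY = (205, 205, 205)
gray_vs_white = relative_luminance(GRAY) / relative_luminance((255, 255, 255))
black_on_gray = contrast_ratio((0, 0, 0), GRAY)
red_on_gray = contrast_ratio((255, 0, 0), GRAY)
```

By this measure the gray background has roughly 60% of the luminance of white, black text on it still has plenty of contrast, and pure red text on it has very little.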
Oddly enough, this is one area where most systems have gone backwards. When I used various versions of BASIC over 14 years ago, white text was the norm. Heck, even in Turbo C++ I remember not having a white background. One company that does realise this is discreet*. All their products have a “charcoal” interface, where everything is dark. They have a more urgent reason for this, since their products work with video and graphics: your colour perception gets distorted by extra light, thus by keeping the UI dark and as invisible as possible, you don't mix the UI into your colour corrections.
|
|
Code | Personal
|
Thursday, February 12, 2004 2:57:56 PM UTC
|
|
Something that many programmers have to do, consciously and subconsciously, is think abstractly. Some have defined intelligence as the ability to think or reason abstractly. Abstraction occurs from specification design, all the way to the actual code construction.
I bet many of us have run into some kind of problem in a program where we realise that perhaps one set of data was incorrectly or unnecessarily related to another. Sometimes the reasons for this are related to a lack of understanding of the data that's being dealt with, sometimes it's just oversight.
Something I see happening all the time is the first problem: lack of understanding. This presents itself very often as text encoding problems: “I just want the standard 8-bit ASCII!” is heard often. The easy solution is giving someone a quick primer in Unicode and different encodings.
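A tiny Python example of what such a primer might start with. The same three-character string becomes different bytes depending on the encoding, and ASCII (which is 7-bit, not 8-bit) can't represent it at all. The sample word is arbitrary.

```python
text = "año"  # three characters, one of them outside ASCII

utf8_bytes = text.encode("utf-8")      # ñ takes two bytes here
latin1_bytes = text.encode("latin-1")  # ñ takes one byte here

try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    # ASCII only defines code points 0 to 127, and ñ is not one of them.
    ascii_ok = False
```

So “8-bit ASCII” usually turns out to mean some specific 8-bit code page, and knowing which one is the whole problem.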
However, if someone grew up in English, and only uses English, their thoughts regarding the abstraction of language versus text can be quite limited. Perhaps they took a year or two of Spanish or other similar language, so they know that grammar structures can change around. But even with Western European languages, the relation of written versus spoken language is somewhat similar -- at least there is a letter-based alphabet.
I think it should be mandatory for students to learn another alphabet. It's not needed that they understand a language behind it. Simply writing English in a foreign script can be a great mental exercise. Abstracting written language from alphabets is a good thing to know of.
Also, I believe that anyone learning another script or language should do so not only on paper, but use a computer with different inputs configured. Being able to read and write isn't too useful when you're stuck on a computer and you don't know how to use the IME. I can't remember when I used a pen last (except my digitizer). And who is going to have paper pen-pals? Nowadays, it's easier and more fun to get online IM-pals or email-pals.
A simple example is my Chinese Hangman program. In Hangman, I'd be tempted to take the incoming keystroke and add that to the guess -- one letter at a time, just like the paper game. In concept, that works fine for Chinese -- a one character guess. In practice, the problem is that to get that character, many keystrokes or perhaps even characters could be written. For me, I use the Korean word 과일 (Gwa-il) and then convert to Hanja (Chinese characters). My keystrokes are: [Right Alt][r][h][k][Right Ctrl][2]. The right alt switches to Hangeul, rhk are: ㄱ ㅗ ㅏ, which combine to form 과. Right control tells the IME to list Chinese characters for words with the current syllable, and 2 is the number from the list that corresponds to fruit. The end result: 果. Note to everyone who is trying to grab control keys and stop their normal usage for some funky functionality in their own app: You're screwing with someone's input in a very annoying way.
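The lesson for the code is to work with the text the IME finally commits, not with keystrokes. A small Python sketch of that follows; it is not the actual Hangman program, and the function name and the secret word are invented for the example.

```python
import unicodedata


def apply_guess(secret: str, revealed: list, committed_text: str) -> bool:
    """Apply one guess, given the text the IME committed (not raw keystrokes)."""
    # Normalise so that a syllable sent as separate jamo still counts as
    # a single character.
    guess = unicodedata.normalize("NFC", committed_text)
    if len(guess) != 1:
        raise ValueError("a guess must be exactly one character")
    hit = False
    for index, character in enumerate(secret):
        if character == guess:
            revealed[index] = True
            hit = True
    return hit


# Six keystrokes went in, one character came out.
keystrokes = ["Right Alt", "r", "h", "k", "Right Ctrl", "2"]
secret = "水果"
revealed = [False, False]
was_hit = apply_guess(secret, revealed, "果")
```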
In less than two weeks, someone can learn a simple phonetic alphabet and how to use an IME. At least well enough to type a few simple things in, and get a feel for how input might be entered. However, the lessons learned are going to be there adding another automatic “what if...” case while coding or designing, and hopefully avoid some flaw.
|
|
Code | Personal
|
Thursday, February 12, 2004 2:38:37 PM UTC
|
|
|