Tuesday, February 15, 2005

The IE7 announcement

Anyone who has found their way here knows that IE7 has been announced. This post is to discuss the timing of the news and the announcement itself.

Today is also the day Firefox 1.0 passed 25 million downloads.

MSFT people on IEBlog have consistently said that their goal is IE for Longhorn, although Bruce Morgan certainly suggested a decoupled release in this comment a while back.

I wonder how long the IE team have really been planning this!

Thursday, February 10, 2005

Automated Testing in Browsers

DISCLAIMER: I know next to nothing about software development! This whole post is written by a complete newbie. The primary purpose of writing this was to try to learn! Please help me out with your comments.

This article is inspired by a recent comment from IEBlog with suggested post titles. There is also a relevant previous article on IEBlog, which talks more about automated debugging than automated testing per se.

Here I will talk about existing automated testing efforts that I am aware of (for browsers) and ways I feel they can be improved.

Layout Testing

Mozilla and Opera (and presumably the IE team also) use automated testing on their nightlies to catch layout regressions (i.e. when a previously fixed layout bug has been reintroduced). They take 'known good' snapshots of the rendering of thousands of weird HTML/CSS test cases, and see if they change from day to day. There is some excellent discussion of the difficulties encountered here.

These difficulties (which have largely been overcome) include:

  • A way to detect, log and move on from crashes and hangs.
  • Ensuring all layout is done before the screenshot is taken.
  • Strange results encountered using antialiasing/transparencies etc.
  • Testing in different window sizes.

Automatically dealing with animated testcases is understandably hard. Frame-by-frame would be ideal if possible but then I guess you run into issues by artificially flushing output streams...
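
To make the snapshot idea concrete, here is a minimal sketch in Python. The browser_shot command is hypothetical (a real harness would drive the actual browser build to load a page and dump a screenshot at a fixed window size); the point is the compare-against-known-good loop, plus a timeout so crashes and hangs get logged and skipped rather than stalling the whole run.

    import filecmp, os, subprocess

    TESTCASES = "testcases"     # directory of weird HTML/CSS test cases
    BASELINE  = "known_good"    # 'known good' screenshots from a blessed earlier run
    CURRENT   = "current_run"   # screenshots from tonight's build
    TIMEOUT   = 30              # seconds before we declare a hang

    def render_to_png(html_path, png_path):
        # Hypothetical helper: drive the browser build under test to load the
        # page and dump a screenshot at a fixed window size. 'browser_shot' is
        # a made-up command standing in for whatever the real harness uses.
        subprocess.run(["browser_shot", "--width=800", "--height=600",
                        html_path, png_path], timeout=TIMEOUT, check=True)

    os.makedirs(CURRENT, exist_ok=True)
    regressions, crashes = [], []
    for name in sorted(os.listdir(TESTCASES)):
        png = name + ".png"
        try:
            render_to_png(os.path.join(TESTCASES, name), os.path.join(CURRENT, png))
        except subprocess.TimeoutExpired:
            crashes.append((name, "hang"))      # log it and move on
            continue
        except subprocess.CalledProcessError:
            crashes.append((name, "crash"))
            continue
        # Byte-for-byte comparison; a real harness would tolerate small
        # antialiasing/transparency differences.
        if not filecmp.cmp(os.path.join(BASELINE, png),
                           os.path.join(CURRENT, png), shallow=False):
            regressions.append(name)

    print(len(regressions), "layout regressions;", len(crashes), "crashes/hangs")

The W3C suites mentioned below could be dropped straight into the TESTCASES directory, once an initial 'known good' run has been checked by hand.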

The W3C provide excellent test suites for HTML/XHTML, CSS, SVG and all sorts of other standards.

These could easily be incorporated into such automated layout tests (assuming, of course, that they are supported to begin with!). After all, the whole point of automated layout and rendering tests is to find regressions systematically instead of piecemeal.

I think there is a real possibility for an exponential growth in test cases, particularly when you consider these tests are easily separable and distributable.

Automated layout testing is just one possible part of functional testing, but I thought it was cool because of the graphical element. There are obviously other functional tests for client-side scripting languages and so on.

Fuzz Testing

This came to prominence when Michal Zalewski created his 'mangleme' tool, which spits out automated 'shards' of malformed HTML, against which the rendering engines are tested. He describes the results in this BugTraq posting. In his original tests he discovered multiple hangs and crashes in Gecko, Presto, Lynx and others. Trident was far more resilient, as we later learned, in part because fuzz testing is part of Microsoft's unit tests. A subsequent Python port of the code did reveal an IFRAME vulnerability in Trident.

Fuzz testing should certainly be applied to all possible input points. The next obvious question is 'how?'.

Larry Osterman's approach seems to be to take lots of cases of valid input and then deform them at random - inserting invalid characters, unusually large inputs, unusually deep nesting, etc. - increasing the pathology of the input all the time. Generally, an excellent idea.
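
Something like the following rough Python sketch, perhaps. The specific deformations (random bytes, huge attribute values, absurd nesting) are my own guesses at what 'deforming' might mean, not Osterman's actual code; the knob to notice is the pathology level.

    import random

    VALID = "<html><body><p title='hi'>hello</p></body></html>"

    def insert_garbage(s):
        """Drop a random (possibly invalid) byte into a random position."""
        i = random.randrange(len(s))
        return s[:i] + chr(random.randrange(256)) + s[i:]

    def enormous_attribute(s):
        """Blow one attribute value up to an unusually large size."""
        return s.replace("'hi'", "'" + "A" * 100000 + "'", 1)

    def deep_nesting(s):
        """Wrap the content in an unusually deep pile of open tags."""
        return s.replace("<p", "<div>" * 5000 + "<p", 1)

    MUTATIONS = [insert_garbage, enormous_attribute, deep_nesting]

    def fuzz_case(pathology):
        """Apply `pathology` randomly chosen deformations to the valid input."""
        s = VALID
        for _ in range(pathology):
            s = random.choice(MUTATIONS)(s)
        return s

    # Increase the pathology of the input all the time.
    for level in range(1, 20):
        case = fuzz_case(level)
        # feed `case` to the browser under test, watch for crashes and hangs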

The first question I have is, why not do these things more systematically than 'randomly'? If you can spit out test cases and order them by approximate degree of pathology, then they are easily split up into separate classes, and easy candidates for testing using distributed methods. Fuzz testing as a methodology is in the open, and I can certainly imagine blackhats harnessing their zombied botnets to find new vulnerabilities. Microsoft has enough money to buy clusters of clusters, and I'm sure Mozilla and Opera could leverage (ugh!) their incredible goodwill to start huge distributed.net autotesting projects. This would increase the 'depth' of any fuzz test.
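
To illustrate the 'systematically rather than randomly' point: if every test case is derived deterministically from a (pathology level, seed) pair, the whole space can be enumerated, split into classes, and handed out in slices to different machines, and any failure can be reproduced anywhere from its pair. Everything below is invented purely for illustration.

    import random

    def case_for(level, seed):
        """Deterministically regenerate test case (level, seed): the same
        pair produces the same bytes on any machine."""
        rng = random.Random("%d:%d" % (level, seed))
        junk = "".join(chr(rng.randrange(32, 256)) for _ in range(level * 10))
        return "<html><body><p title='" + junk + "'>x</p></body></html>"

    def shard(level, n_cases, worker, n_workers):
        """Worker `worker` of `n_workers` gets every n_workers-th case."""
        for seed in range(worker, n_cases, n_workers):
            yield seed, case_for(level, seed)

    # e.g. machine 3 of 100 working through pathology level 7:
    for seed, case in shard(level=7, n_cases=1000000, worker=3, n_workers=100):
        pass  # feed `case` to the browser build; record failures by (level, seed)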

But how would you increase the 'breadth' of a fuzz test? For instance, take the Python version of mangleme above. It certainly mangles the values, but not the attributes or tags themselves. Now, you may say that these are a tightly controlled, already tested, small set of cases... but the whole point of fuzz testing is to increase your ability to test all input! THIS, if anything, is where I would use 'randomness' - perhaps a genetic algorithm could mutate characters at random?
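
As a sketch of what extra 'breadth' might look like: mutate the tag and attribute names themselves, not just their values, with random character flips in that genetic-algorithm spirit. Again, purely illustrative.

    import random

    TAGS  = ["div", "span", "table", "iframe", "object", "marquee"]
    ATTRS = ["id", "style", "src", "onload", "width"]

    def mutate_name(name, rate=0.3):
        """Flip characters in the tag or attribute name itself, at random."""
        return "".join(chr(random.randrange(1, 256)) if random.random() < rate else ch
                       for ch in name)

    def broad_case():
        tag, attr = mutate_name(random.choice(TAGS)), mutate_name(random.choice(ATTRS))
        return "<{0} {1}='x'>text</{0}>".format(tag, attr)

    print(broad_case())   # e.g. <d?v sty?e='x'>text</d?v>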

Real-world online error reporting is really just glorified low-level fuzz testing. On large projects it usually requires statistical analysis before it can be used to prioritize bugfixes.

I think there could be really interesting challenges when fuzz testing across applications!

Automated Pinpointing

I couldn't find a better name for this, but this is where you have a bug caused by a very complicated scenario, and you want to narrow it down to the 'minimal' test case. Here, you tell your test program what the 'symptoms' of the bug are, and it will systematically reduce input in different (possibly random) ways, stopping when it is reasonably confident it has shrunk the problem down.
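
This is more or less what the delta-debugging literature calls test case minimization. A toy Python sketch, assuming you can write a still_fails(case) predicate that checks for the bug's 'symptoms' (crash, wrong rendering, whatever); the sample input and predicate at the bottom are made up purely for illustration.

    def minimize(lines, still_fails):
        """Greedy reducer: repeatedly try dropping chunks of the input,
        keeping any removal that still reproduces the bug's symptoms."""
        chunk = max(1, len(lines) // 2)
        while chunk >= 1:
            i, shrunk = 0, False
            while i < len(lines):
                candidate = lines[:i] + lines[i + chunk:]
                if candidate and still_fails(candidate):
                    lines, shrunk = candidate, True   # keep the smaller case
                else:
                    i += chunk                        # this chunk was needed
            if not shrunk:
                chunk //= 2                           # try finer-grained removals
        return lines

    # Example: pretend the bug only needs the <object> line to trigger.
    bug_input = ["<html>", "<object data='x'>", "<table>", "</table>", "</html>"]
    print(minimize(bug_input, lambda case: any("object" in line for line in case)))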

Automatic Debugging aka Code Checking

There is a plethora of tools that can be used on the codebase itself, actively finding problems before or during compilation rather than waiting for the binary to react badly to some input. One place where I found lots of research and documentation (designed for open source projects, but definitely more widely applicable) was the Berkeley Open Source Quality Group. Microsoft has released a tool called PreFast [PowerPoint file], which goes through every possible execution path in every function to find possible errors.

Simple 'local' checks on data types and bounds have been around for a while; the real kicker is analysis that follows code paths. At this point, more than any other, I would like to stress that I am not a coder of any sort!

This is hard, so a really good first step is to simplify the problem as much as possible. One way is to separate out, as far as possible, sections of code which provably (or as close to provably as is practical) don't touch each other at all, and then work on each section individually, ignoring all 'known good' code. In practice, however, this means separating out only those bits which don't rely on global information. This is why PreFast uses function-by-function analysis, at the cost of not being able to catch certain dynamic problems.
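
For a flavour of what 'local', function-by-function checking looks like, here is a toy checker using Python's ast module (chosen only because it fits in a few lines; PreFast itself targets C/C++ and is far more sophisticated). It looks at each function in isolation and flags names that are read but never assigned within that function. Note the deliberate limitation: LIMIT is defined globally, so the purely local analysis flags it anyway, which is exactly the trade-off described above and a taste of the false-positive problem below.

    import ast, builtins, textwrap

    SOURCE = """
    LIMIT = 10

    def fine(n):
        total = 0
        for i in range(n):
            total += i
        return total

    def suspicious(n):
        return n + undefined_thing   # never assigned anywhere in this function

    def false_positive(n):
        return n + LIMIT             # fine globally, but local-only analysis flags it
    """

    tree = ast.parse(textwrap.dedent(SOURCE))
    for func in (node for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)):
        # Names this function assigns: parameters, assignments, loop targets.
        assigned = {a.arg for a in func.args.args}
        assigned |= {node.id for node in ast.walk(func)
                     if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)}
        # Names it reads that it never assigned and that aren't builtins.
        for node in ast.walk(func):
            if (isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
                    and node.id not in assigned and not hasattr(builtins, node.id)):
                print("%s: '%s' is read but never assigned locally" % (func.name, node.id))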

Another big issue with such autodebugs is that they can flag a lot of false positives. This is why one area being researched is refining each possible failure to a 'minimal counterexample', and then trying to show by brute force that such a minimal counterexample does not exist. I suggest Bayesian filtering ;-)

Conclusion?

I'm getting far, far over my head with this, so I'll leave it here. Anyway, the point is obvious: automated testing is already very important and only becoming more so.

This is a document in progress, please leave messages with suggestions!

Welcome to IEBlog Watch

The purpose of IEBlog Watch is to discuss matters arising from articles and comments at the IEBlog. It will also serve as an open forum to talk about Internet Explorer, other web browsers, and web browsing/development issues more generally.

Why did I set this blog up?

I asked the question

How many security vulnerabilities in Internet Explorer have been due to an ActiveX component?
on the IEBlog, as a comment to this post. In order to 'nail down' the question, I added that I was willing to count patches rather than vulnerabilities, to consider known vulnerabilities only, and to count the same vulnerability in different versions of IE just once. I also said that I wasn't expecting an answer, but I hoped to be proved wrong.

Perhaps ill-advisedly, I chose the screenname "IE==pwned!", which I thought was amusing given that the post was discussing recent remote security vulnerabilities in IE. In any case, regardless of the name, I thought the question itself was perfectly reasonable.

However, not only was my question left unanswered, it was deleted. I accept all criticism of the username, but I will say that it was chosen in a spirit of fun. Bruce Morgan [MSFT], who deleted my post, went on to say:
I found your screen name and your list of leading questions offensive in content and style, not in words.

You jumped in here with no previous commenting history with an offensive name and a list of "do you still beat your wife" style questions.
I don't feel my question was leading, and I gladly (and publicly) offered to let IEBlog change my screenname.

Bruce also commented on anonymity:
BTW, anonymity gets you little respect.

I accord a lot more value to comments from someone who has the courage to link back to a real blog or real email. These comments can be as anti-IE as the person wishes; as long as they are not profane or personally attacking people on the team I won't delete them.

Anonymous comment with no content other than "IE is bad!" (or three paragraphs of "you guys are morons, IE sucks, blah blah blah!") are a dime a dozen. Do that on your own blog all you want. Track back here, I won't delete it.

Otherwise, I treat it like graffiti on the wall and I get rid of it.

I fail to see what the anonymity (or source) of comments has to do with their quality or content. Only accepting criticism from 'known' people seems wrong. It is also pointless, especially as anyone can easily get a blog, let alone an email address. I accept that comments that say "IE is bad!" and nothing else should be deleted. I don't think my question falls into that category.

Anyway, I wanted to create a forum, with a strict comments policy to maintain quality, but less censorship than the IEBlog. I will also use my 'status' as a blog owner to post 'non-anonymously' on the IEBlog, hopefully avoiding undue censorship of future comments. Hopefully, along the way, I will build up enough 'karma' that someone can answer my original question:
How many security vulnerabilities in Internet Explorer have been due to an ActiveX component?
as well as lots of other interesting questions, of course!

Comments Policy
  • No profanity of any kind;
  • No personally insulting comments of any kind;
  • Feel free to praise/insult any product, development technique, management strategy, company or the like, but only if you back it up;
  • Keep it on-topic;
  • Make your point once, and once only, unless it is obviously intended to clarify.
I will delete any comments that violate this policy, as well as comments from obvious spammers, trolls and flamebaiters, and I will ban offenders from future posting if tools to do so are available. I reserve the right to delete other comments too. I will try to explain all deletions, though I can't guarantee this. All comments will be considered to be released under the Creative Commons Attribution Non-Commercial License 2.0 unless you specify otherwise, which you are free to do. Of course, all this is in addition to any Blogger comments policy in operation, which will always take precedence over what I have said here.

How can you help?

I recognize that the value of a blog is more often in its comments than its articles. I therefore encourage you to make lots of incisive, insightful comments! This should also be a place to learn about browser functionality, so ask questions (Google first, please!) and hopefully someone will answer. In particular, [MSFT] posters and others with 'vested interests' are extremely welcome to comment - disclosure would be nice but it is by no means mandatory!

You can also post articles here by emailing me at [name of this blog without spaces] at gmail dot com.

Hopefully we can make this a genuinely honest, genuinely open and genuinely useful forum. Let me know what you think.