I recently stumbled upon a small bug in a piece of C# code that cleans up an HTML string coming from a database. The string is written to a web page, so it needs to be tidy, W3C-valid markup!

I have always used Tidy.Net for this and really liked it, so while doing some code maintenance I decided to check for a new version of the library. Its latest release dates from June 2005. That is over six years old!

So I decided to go and look for a better solution. I found the TidyManaged project from June 2010. I wasn’t immediately convinced that I should migrate to this library, so my next step was a showdown between the two.

I fired up Visual Studio 2010 and started a new console application, because ‘the numbers tell the tale’. For the timing I used the Stopwatch class, which is awesome! I downloaded the HTML source of a website and passed it to both libraries.


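// Requires: using System; using System.Diagnostics; using System.Net;
static void Main(string[] args)
{
    // Download the HTML that both libraries will clean up.
    string testInput;
    using (WebClient wc = new WebClient())
    {
        testInput = wc.DownloadString("http://www.jphellemons.nl");
    }

    // Time the Tidy.Net run.
    Stopwatch sw = Stopwatch.StartNew();
    ParseWithOldLib(testInput);
    sw.Stop();
    Console.WriteLine("Tidy.Net lib from 2005 took: " + sw.ElapsedTicks);

    // Restart() resets the elapsed time to zero and starts timing again.
    sw.Restart();
    ParseWithNewLib(testInput);
    sw.Stop();
    Console.WriteLine("TidyManaged lib from 2010 took: " + sw.ElapsedTicks);

    Console.ReadKey(); // to keep console open
}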
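
One caveat with a single timed call per library: it also measures one-off costs such as JIT compilation and loading the DLL, and ElapsedTicks is expressed in units of Stopwatch.Frequency, so the raw numbers differ per machine. Below is a minimal sketch of a slightly more careful measurement. The MeasureAverageMs helper and the TidyBenchmark class are hypothetical names of mine, not part of the sample project or of either library. It does one untimed warm-up call and then averages several timed calls.

```csharp
using System;
using System.Diagnostics;

static class TidyBenchmark
{
    // Hypothetical helper: runs `action` once untimed (warm-up), then
    // `iterations` times under a Stopwatch, and returns the mean time
    // per call in milliseconds.
    public static double MeasureAverageMs(Action action, int iterations)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");

        action(); // warm-up: JIT compilation and library loading happen here

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        // Elapsed is a TimeSpan, so this value does not depend on Stopwatch.Frequency.
        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}

// Usage, assuming the two Parse... methods called in Main:
//   double oldMs = TidyBenchmark.MeasureAverageMs(() => ParseWithOldLib(testInput), 20);
//   double newMs = TidyBenchmark.MeasureAverageMs(() => ParseWithNewLib(testInput), 20);
```

The iteration count of 20 is an arbitrary choice. An average over several runs should mainly make the comparison less sensitive to whichever library happens to run first.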

The results (Stopwatch ticks, lower is faster):

             Tidy.Net    TidyManaged
website 1     1080148         359861
website 2      644471         140835
website 3      467094          84472
website 4      495851         191426

So TidyManaged, a managed wrapper around the unmanaged Tidy project’s DLL, was a lot faster in every run: roughly 2.6 to 5.5 times in these four tests. I have the Tidy DLL placed in “C:\Windows\system”. That DLL is 323 KB and the managed wrapper DLL is 25 KB (together 348 KB). Tidy.Net, which is an older .NET port of the 323 KB DLL, is 188 KB. So it is smaller, but it is also an older library.

If you look at the output of both libraries, you will see that Tidy.Net produces smaller HTML files than TidyManaged. TidyManaged, on the other hand, takes inline CSS styles and combines them in the head of your document.

I will attach my sample project, so that you can test the difference yourself.

Good luck!

Comments

Comment by Quang

I've just downloaded your test project. It's wonderful. I am looking for an example about Tidy too. Thank you for sharing.

Quang