Microsoft Like Moria, Not Mordor

2 June 2009, 13:00 — Reflections, Software Development


Contrary to public opinion, Microsoft is not Mordor.

Microsoft is like Moria.

Being a developer and faced with the task of developing some particularly interesting feature for, say, the Internet Information Server (IIS), is much like Gandalf and the companions looking at crossing the mountains through the treacherous kingdom of Moria. Once a kingdom of dwarves, treasures and royal hallways, the rumors that now spread from these caverns are dark and brooding. And entering the domains of Moria will always be an adventure, whichever way you look upon it.

You may get through, of course, if you’re quiet, don’t do anything surprising, and stick to the APIs. Keep it simple, and don’t touch anything you shouldn’t touch. It will be a long, dark passage, but you may eventually get through to the other side.

On the other hand, you may get lost, stray into tunnels that you shouldn’t have gone into, and disturb the deep darkness below. Too many have been lost in these caverns, and you are now about to be the next victim. You pass through chambers with ominous writing inside (“here lies Balin, king of Moria”) and, stumbling, falling, irretrievably lost, proceed further and further into the darkness…

You can abort. Cut your losses, and get out. You never should have come here. We’ll make for the Gap of Rohan instead! Quickly, now!

But some part of you pushes on… There must be an exit ahead… Who knows, behind the next corner you may feel a gust of fresh wind, and a dim light far ahead, the opening to the outside and you’ll be through. But instead, you cross a barrier and suddenly you’re face to face with the Balrog, the most terrifying of all enemies. You’re at the brink of insanity and plunging forward. There is no escape.


Trying to script IIS is an excellent example. The API is dark and mysterious, and faint murmurs of a dark evil reach you as you stand by the door. And a short way inside, you reach a decision point: Should you use the XML metabase? WMI? ADSI? All of these have possibilities, and yet all have dangerous drawbacks.

You spend some time trying to figure out what wmic is and how it works. It turns out to be utterly incomprehensible, and you shy away from it. Next up, you use CIM Explorer to explore WMI and the MicrosoftIISv2 namespace. It is equally incomprehensible, and every attempt you make to understand it and get hold of its promises is futile. Like a slithery Gollum, it writhes out of your grip and disappears into the caves. “My preciousssss!” you hear a scream in the distance.

Next up: ADsUtil. A wonderful little VBScript gratuitously provided by Microsoft. But it quickly breaks in your hands: it returns strange and complicated errors when you apply it to the IPSecurity object. Still, even though ADsUtil crumbles to dust (for magical reasons, you assume), the ADSI approach seems worthwhile. The air in this tunnel seems fresher. You stumble on.

A while later, while poring over .NET and DirectoryEntry, and hitting your head repeatedly on the sharp, nasty stalactites hanging from the ceiling, you force yourself into twisting little dark tunnels, which now consist of strange, magical invocations of objects, GetTypes and unexplored COM objects, and you have the vicious feeling that nobody has been here before you in a very long time.

And right there, just when you hit entry.CommitChanges() and think that you’re home safe, an exception is suddenly thrown out of the blue. A wall collapses in front of you. And there it is… the Balrog, full of flame and fire and smoke and radiating with a pure evil from aeons past.

I should have made for the Gap of Rohan. Now, it’s over.

*sigh*

Speeding Up Delphi’s TStringList.IndexOf

1 June 2009, 15:33 — Music, Software Development

Delphi’s TStringList is one of the objects I love the most. If it’s sorted (StringList.Sorted := true) then you can use it to parse huge chunks of data quickly.

For instance, looping through an enormous number of IP addresses and keeping count of how many times each one appears is easily done using the following code (not compiled or checked for syntax errors):

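ls := TStringList.Create;
ls.Sorted := true;  // keep the list ordered so lookups can binary-search
for ip in ipAddresses do begin
  n := ls.IndexOf(ip);
  if n = -1 then
    ls.AddObject(ip, TObject(1))  // first sighting: store a count of 1 in the object slot
  else
    ls.Objects[n] := TObject(Integer(ls.Objects[n]) + 1);  // bump the stored count
end;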

It’s very efficient. Because the list is sorted, TStringList.IndexOf does a binary search and runs in O(log n) time (on an unsorted list it falls back to a linear scan), and using Objects as integers lets us keep the counts without touching the string data.
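For reference, here is how the counting loop might look as a complete console program, including reading the counts back out. This is only a sketch: the SampleIps array is made-up input, and the NativeInt casts are my assumption for keeping the pointer-sized trick portable (the snippet above uses Integer, which is fine on 32-bit).

```pascal
program CountIps;
{$IFDEF FPC}{$mode delphi}{$ENDIF}  // only matters when building with Free Pascal
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  // Hypothetical sample input; real data would come from a log file.
  SampleIps: array[0..4] of string =
    ('10.0.0.2', '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.2');

var
  ls: TStringList;
  i, n: Integer;
begin
  ls := TStringList.Create;
  try
    ls.Sorted := True;  // must be set before adding anything
    for i := Low(SampleIps) to High(SampleIps) do
    begin
      n := ls.IndexOf(SampleIps[i]);
      if n = -1 then
        // First sighting: store the count 1 in the object slot.
        ls.AddObject(SampleIps[i], TObject(NativeInt(1)))
      else
        // Seen before: read the stored count, add one, store it back.
        ls.Objects[n] := TObject(NativeInt(ls.Objects[n]) + 1);
    end;
    // Walk the sorted list and print each address with its count.
    for i := 0 to ls.Count - 1 do
      WriteLn(ls[i], ': ', NativeInt(ls.Objects[i]));
  finally
    ls.Free;  // the list does not own the fake "objects", so this is safe
  end;
end.
```

With the sample input this should print 10.0.0.1 with a count of 1, 10.0.0.2 with 3, and 10.0.0.3 with 1, in that order.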

But there are things we can do to speed it up. For instance, on a sorted list TStringList.IndexOf relies on TStringList.Find, which compares strings with AnsiCompareStr (or AnsiCompareText when the list is case-insensitive), a slow Windows call that takes locale and its mother into consideration. Overriding Find with our own method should be worthwhile. (The code below is adapted straight from the Classes unit.)

type
  TStringListEx = class(TStringList)
  public
    function Find(const S: string; var Index: Integer): Boolean; override;
  end;

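function TStringListEx.Find(const S: string; var Index: Integer): Boolean;
var
  L, H, I, C: Integer;
begin
  Result := False;
  L := 0;
  H := Count - 1;
  // Binary search over the range L..H
  while L <= H do
  begin
    I := (L + H) shr 1;  // midpoint
    C := CompareStr(Get(I), S);  // byte-wise, case-sensitive comparison
    if C < 0 then L := I + 1 else
    begin
      H := I - 1;
      if C = 0 then
      begin
        Result := True;
        // Without duplicates there is only one match, so stop at it
        if Duplicates <> dupAccept then L := I;
      end;
    end;
  end;
  Index := L;  // match position, or where S would be inserted
end;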

We’ve replaced AnsiCompareStr with Delphi’s own CompareStr, which is a highly optimized FastCode routine. There are some drawbacks: things will always be sorted in byte order, and the comparison is always case-sensitive. But we don’t care about this, since locale-aware ordering can always be applied afterwards; right now, speed is what matters.

And it turns out that using the above code can, in isolated tests, slash execution time by up to about 80%. Dramatic savings, indeed. In my own case, where I analyze FTP log data, I managed to cut execution time on 122 MB of data from 7 seconds down to 3.1 seconds.

Best of all, since TStringList.Find is declared virtual, we don’t need to change any types anywhere; just do a TStringListEx.Create instead of a TStringList.Create and you’re good to go.
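To make the swap concrete, here is a small, hedged sketch of a Delphi console program that declares the variable as a plain TStringList and only changes the constructor call. The sample strings are made up, and the class is repeated so the program stands on its own. It also shows the byte-order and case-sensitivity drawbacks in action.

```pascal
program FindDemo;
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Same override as in the post: binary search using CompareStr.
  TStringListEx = class(TStringList)
  public
    function Find(const S: string; var Index: Integer): Boolean; override;
  end;

function TStringListEx.Find(const S: string; var Index: Integer): Boolean;
var
  L, H, I, C: Integer;
begin
  Result := False;
  L := 0;
  H := Count - 1;
  while L <= H do
  begin
    I := (L + H) shr 1;
    C := CompareStr(Get(I), S);
    if C < 0 then
      L := I + 1
    else
    begin
      H := I - 1;
      if C = 0 then
      begin
        Result := True;
        if Duplicates <> dupAccept then L := I;
      end;
    end;
  end;
  Index := L;
end;

var
  ls: TStringList;  // still the base type; only the constructor changes
begin
  ls := TStringListEx.Create;
  try
    ls.Sorted := True;  // set before adding, so every insert goes through Find
    ls.Add('b');
    ls.Add('a');
    ls.Add('B');
    // Byte order: 'B' (66) sorts before 'a' (97) and 'b' (98).
    Assert(ls[0] = 'B');
    Assert(ls[1] = 'a');
    Assert(ls[2] = 'b');
    Assert(ls.IndexOf('a') = 1);
    Assert(ls.IndexOf('A') = -1);  // lookups are now case-sensitive
    WriteLn('ok');
  finally
    ls.Free;
  end;
end.
```

One caveat, as far as I can tell from the Classes unit: only Find is overridden, so anything that goes through Sort (for example, setting Sorted := True after the strings have already been added) still orders the list with the locale-aware comparison, and the byte-order binary search may then miss entries. Setting Sorted before adding anything, as above, avoids that.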