Archive for May, 2008

A Problem with BizTalk Server 2006 Performance – 100% CPU Utilization

A few weeks ago, a new application was deployed at work. One cycle of a particular orchestration, set up as a singleton, took about 45-60 seconds. As time went by, the process took longer – instead of 60 seconds, it was taking about 3 minutes. This bothered me a bit, but I wasn’t sure just what to do. Around this time, I noticed that the host instance process, which used to need very little CPU time on the server, had begun taking 25% of the 4 CPU cores (100% of one core). Although I know BizTalk uses more than one thread in its processing, I couldn’t help but be suspicious – it sure seemed like a process was out of control.

Nonetheless, the process was finishing, and I had lots of other pressing matters to worry about, so after trying a couple of simple things to speed things up (which all failed), I left the application alone. Yesterday, I noticed that many of the processes now running came from messages received over 24 hours earlier. This caused alarm; a meeting was held to figure out what to do. At the end of the day the problem still wasn’t solved – which, translated, means that Victor was going to work on Saturday until he figured things out. I searched the internet and found many articles on performance. Most of them I had already read, or covered points I had already taken into account when setting up the BizTalk Server, but a few were new to me. Each “new” issue I found seemed promising – I crossed my fingers as I implemented each fix, but to no avail. Nothing seemed to help.

So I called Microsoft to get help. I spoke with Sajid, one of their good support people, on the phone and described the issue. He ran through a handful of possible causes (similar to what I had seen in other articles) to see if he could quickly isolate the problem. No luck. I might add that it was comforting that he too suspected an issue with the SQL Server database – but we found nothing. He didn’t have internet access with him at the time (after all, it was a weekend), so he called back later. He looked at the issue and asked how many messages were in the queue for the process. Unfortunately the admin console doesn’t have a count feature, but after I limited the query to show 5000 results and the tool reported that there were still more that weren’t being shown, Sajid became suspicious. We terminated the orchestration that was running (the singleton), since I could recreate the input messages, and voilà! Problem solved.

So, you may be curious as to what the problem was in the first place. That is, why were there so many messages? Well, as you may know, messages that are consumed by an orchestration still show up in the orchestration’s list of messages until the orchestration finishes. A singleton that never terminates will eventually build up such a large queue of messages (even if they are marked “consumed”) that, apparently, the host instance runs wild simply trying to pick out the messages it needs for its own processing. A more sophisticated singleton design would cause the orchestration to complete if no new messages had been received after a given time, say 15 minutes.
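The idle-timeout idea can be sketched outside of BizTalk. The snippet below is only an illustration in Python, not an orchestration: `run_singleton`, `inbox`, `handle` and `idle_timeout_s` are hypothetical names made up for this example. In an actual orchestration, my understanding is that the equivalent would be a loop around a Listen shape with one Receive branch and one Delay branch, but treat that as an assumption rather than a tested design.

```python
import queue


def run_singleton(inbox, handle, idle_timeout_s=15 * 60):
    """Process messages one at a time until the inbox stays empty
    for idle_timeout_s seconds, then finish.

    inbox, handle and idle_timeout_s are hypothetical names for this sketch.
    Returns the number of messages processed by this instance.
    """
    processed = 0
    while True:
        try:
            # Block until a message arrives or the idle window expires.
            message = inbox.get(timeout=idle_timeout_s)
        except queue.Empty:
            # Nothing arrived in time: let this instance complete so that
            # the messages it consumed no longer pile up against it.
            return processed
        handle(message)
        processed += 1
```

The next message to arrive after the instance completes would simply start a fresh instance, so the backlog of consumed messages is reset each time traffic goes quiet.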

I hope that, as a follow-up to this post, I’ll have enough time to show some better singleton designs. I know posts without pictures can be boring. =)

Categories: BizTalk Server

BizTalk Siebel Adapter Error Resolution

At work we’ve been seeing the following error with the BizTalk Siebel Adapter (BizTalk Server 2006) quite frequently:

No connection could be made because the target machine actively refused it

The first time it happened, I immediately presumed that the error was accurate, and notified the Siebel system administrator that something seemed to be wrong with the Siebel server. After checking things out he told me I was mistaken, and that everything was fine.

We have a host instance dedicated to the Siebel adapter, so I simply restarted it and found that things returned to normal again… at least for a few more hours. The error would then reoccur, and so forth. This particular error worried me quite a bit because it meant that a crucial business process was halted as a result. In consequence, a great deal of work was required by the business users to manually redo the process that had failed. Ouch.

Over the course of two weeks I tried all sorts of things. I couldn’t find any articles on the internet that described my problem, and I ended up setting up a scheduled task to restart the SiebelHost every 30 minutes. This worked for the most part, but of course it was a very sloppy fix. While this was happening I put in a ticket to Microsoft for help.

After 2-3 days I got a call from Microsoft’s support team (Sajid is really helpful, by the way) with an idea to try. Here’s the substance of the email:

To fix this issue, create a DWord registry value in the registry for the key HKLM\software\Microsoft\BizTalkAdapters\
New Reg Value: StartAgentSleep
Type: DWord
Value: 1000 (decimal), measured in milliseconds

The value is configurable and can change according to machines and to different users. 1000 is the value which Microsoft had tried with another customer and seemed to work. 1000 ms of time = 1 sec.

Setting this causes the adapter to wait longer than the default timeout for the browsingagent.exe or runtimeagent.exe process to prepare itself to talk to the adapter.
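For anyone who would rather import the setting than click through regedit, here is a small sketch that renders a .reg file for it. The key path, value name and the 1000 ms figure are taken from the email above; the helper name `render_reg_file` is made up for this example, and I am assuming the key sits exactly where the email says on your machine, so check it (and back up the registry) before importing anything.

```python
def render_reg_file(key_path, value_name, dword_value):
    """Return the text of a .reg file that sets one DWORD value.

    render_reg_file is a hypothetical helper for this sketch.
    """
    if not 0 <= dword_value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}]",
        # .reg files store DWORDs as 8 hexadecimal digits.
        f'"{value_name}"=dword:{dword_value:08x}',
        "",
    ]
    return "\r\n".join(lines)


# Key and value name as quoted in the support email; 1000 ms = 1 second.
REG_TEXT = render_reg_file(
    r"HKEY_LOCAL_MACHINE\software\Microsoft\BizTalkAdapters",
    "StartAgentSleep",
    1000,
)
```

Writing `REG_TEXT` to a file such as `StartAgentSleep.reg` and importing it should have the same effect as creating the value by hand. The host instance probably needs a restart before the adapter picks it up, though the email did not say so.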

So I gave it a whirl and guess what? Problem solved. We’ve now been up for about five consecutive days without seeing the problem reoccur.

Categories: BizTalk Server