Wednesday, October 12, 2011

Why is my server dragging its feet?

Last week I had major issues with two business-critical applications. Both suffered serious time-outs and sometimes did not load at all for our users.

Monitoring the server, I could not see any obvious problems. CPU stayed below 40% apart from occasional spikes, and memory was not running out. A large share of memory was allocated to the SQL Server instance, but with two large databases running that was expected.

So the investigation continued with a check of the local disks for fragmentation. Fragmentation was as high as 70%, which seemed to be the issue.

The next check was SQL fragmentation. Most of the indexes were up to 95% fragmented, which could also have contributed to the issue.
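As a rough sketch of how those fragmentation figures can be turned into a maintenance plan, the snippet below picks REORGANIZE or REBUILD per index and prints the matching ALTER INDEX statement. The 5% and 30% cut-offs are the commonly cited rule of thumb, not something measured on this server, and the table and index names are made up for illustration.

```python
# Hypothetical helper: choose a maintenance action per index from its
# fragmentation percentage. The 5% and 30% cut-offs are an assumed rule
# of thumb, not values taken from this server.

def maintenance_action(fragmentation_pct, reorganize_at=5.0, rebuild_at=30.0):
    """Return 'REBUILD', 'REORGANIZE', or None for one index."""
    if fragmentation_pct > rebuild_at:
        return "REBUILD"
    if fragmentation_pct > reorganize_at:
        return "REORGANIZE"
    return None


def maintenance_statements(indexes):
    """Build ALTER INDEX statements for (table, index, fragmentation_pct) rows."""
    statements = []
    for table, index, pct in indexes:
        action = maintenance_action(pct)
        if action is not None:
            statements.append(f"ALTER INDEX [{index}] ON [{table}] {action};")
    return statements


# Made-up rows, shaped like the avg_fragmentation_in_percent column of
# sys.dm_db_index_physical_stats.
sample = [
    ("Orders", "IX_Orders_Date", 95.0),
    ("Customers", "IX_Customers_Name", 12.5),
    ("Products", "PK_Products", 1.2),
]

for statement in maintenance_statements(sample):
    print(statement)
```

Generating the statements rather than running them directly leaves room to review the list first, which is worth doing on a server that is already struggling.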

After a series of defragmentation passes on the local disks and the SQL indexes, the server was still very slow, although it was finally at least usable.

The next check, which I had completely forgotten about, was perfmon for memory, CPU and disk queue length. Disk queue length was running at 100% constantly, which obviously was not right. From there I checked the SQL log files and discovered that they were being written to the same disk, so I decided to move the SQL logs to a separate disk.
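To catch this sort of thing earlier, the disk queue counter can be logged to CSV and checked with a few lines of script. The sketch below assumes a two-column export (timestamp, value) of the kind perfmon and typeperf produce for a single counter. The server name, timestamps and sample values are invented, and the threshold of 2 is an assumed rule of thumb that would need adjusting for the disk setup in question.

```python
import csv
import io


def queue_length_samples(csv_text):
    """Return the numeric samples from a two-column counter CSV export."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows, None)  # skip the header row
    samples = []
    for row in rows:
        if len(row) < 2:
            continue  # blank or incomplete line
        try:
            samples.append(float(row[1]))
        except ValueError:
            continue  # empty or non-numeric sample
    return samples


def is_sustained(samples, threshold=2.0, fraction=0.9):
    """True when at least `fraction` of the samples exceed `threshold`."""
    if not samples:
        return False
    over = sum(1 for value in samples if value > threshold)
    return over / len(samples) >= fraction


# Made-up export for illustration only; not data from the real server.
export = r'''"(PDH-CSV 4.0)","\\SERVER01\PhysicalDisk(_Total)\Avg. Disk Queue Length"
"10/05/2011 09:00:01.000","14.2"
"10/05/2011 09:00:02.000","11.8"
"10/05/2011 09:00:03.000","9.6"
"10/05/2011 09:00:04.000","16.0"
"10/05/2011 09:00:05.000","12.5"
'''

samples = queue_length_samples(export)
print("average queue length:", sum(samples) / len(samples))
print("sustained queueing:", is_sustained(samples))
```

Looking at the fraction of samples over the threshold, rather than a single peak, separates a disk that is constantly backed up from one that is just seeing the occasional burst.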

Moving the logs seemed to resolve the issue, with the disk queue running at a much lower level.

But then I restarted an integration service between two third-party applications, and the disk queue length immediately jumped up dramatically. So even though all of the points above matter for server maintenance and best practice, it was the third-party integration service that seemed to be causing the issues.
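One way to pin down a culprit like this without relying on a lucky restart is to compare per-process I/O totals before and after a short interval and rank the difference. The sketch below assumes the two snapshots have already been collected somehow as cumulative (read bytes, write bytes) per process. The process names and numbers are hypothetical, including the name used for the integration service.

```python
def io_deltas(before, after):
    """Bytes read plus written per process between two snapshots.

    Each snapshot maps a process name to (read_bytes, write_bytes).
    Processes missing from `before` are treated as starting from zero.
    """
    deltas = {}
    for name, (read_after, write_after) in after.items():
        read_before, write_before = before.get(name, (0, 0))
        deltas[name] = (read_after - read_before) + (write_after - write_before)
    return deltas


def top_io_processes(before, after, limit=3):
    """Return the `limit` busiest processes as (name, bytes) pairs."""
    deltas = io_deltas(before, after)
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical snapshots taken a few seconds apart.
before = {
    "sqlservr.exe": (1000, 2000),
    "IntegrationSvc.exe": (500, 500),
    "w3wp.exe": (300, 100),
}
after = {
    "sqlservr.exe": (1500, 2600),
    "IntegrationSvc.exe": (90500, 120500),
    "w3wp.exe": (350, 120),
}

for name, moved in top_io_processes(before, after, limit=2):
    print(name, moved)
```

Ranking by the change between snapshots, not the running totals, matters here because a long-lived process such as the database engine will always have large totals even when it is not the one hammering the disk right now.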
