A two-node active/passive cluster running SQL Server 2000 on Windows Server 2003 Enterprise has a file share that will not stay online since one node was patched.
The other node has not been patched yet because of the file share problem.
Events of interest are:
Event Type: Error
Event Source: Srv
Event Category: None
Event ID: 2011
Date: 11/02/2007
Time: 21:14:53
User: N/A
Computer: SERVER1
Description:
The server's configuration parameter "irpstacksize" is too small for the server to use a local device. Please increase the value of this parameter.
and,
Event Type: Error
Event Source: ClusSvc
Event Category: File Share Resource
Event ID: 1055
Date: 11/02/2007
Time: 21:14:53
User: N/A
Computer: SERVER1
Description:
Cluster File Share resource 'Share1' has failed a status check. The error code is 1130.
Applied the change described in the Microsoft article
http://support.microsoft.com/kb/285089
(increasing IRPStackSize); this did not help.
The share does not come online on Node 1 and fails with error 1130. ClusSvc tries to restart it three times; when it still fails to come online, ClusSvc initiates the failback operation.
The SQL group fails back to Node 2 because the share resource is configured with "Affect the group".
==================================================================================================================================
000006ec.000009a8::{2007/03/18 20:15:55.579} WARN [FM] Group failure for group . Create thread to take offline and move.
000006ec.000009a8::{2007/03/18 20:15:55.579} INFO [FM] FmpHandleGroupFailure, Exit: Group failure for {SQL Group, type = Group}...
=================================================================================================================
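The cluster log lines above follow a regular layout: hexadecimal process and thread IDs, a timestamp in braces, a severity level, a component tag in square brackets, and the message. A minimal Python sketch for filtering a long log for failure entries is shown here. It assumes every line uses exactly the layout of the two lines quoted above; the function name parse_cluster_log_line and the dictionary keys are hypothetical and not part of any Microsoft tool.

```python
import re
from datetime import datetime

# Layout assumed from the two excerpt lines above:
# <pid hex>.<tid hex>::{YYYY/MM/DD HH:MM:SS.mmm} LEVEL [COMPONENT] message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-fA-F]{8})\.(?P<tid>[0-9a-fA-F]{8})::"
    r"\{(?P<ts>[^}]+)\}\s+(?P<level>\w+)\s+\[(?P<component>\w+)\]\s+"
    r"(?P<message>.*)$"
)


def parse_cluster_log_line(line):
    """Split one cluster log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "process_id": int(match.group("pid"), 16),
        "thread_id": int(match.group("tid"), 16),
        "timestamp": datetime.strptime(match.group("ts"), "%Y/%m/%d %H:%M:%S.%f"),
        "level": match.group("level"),
        "component": match.group("component"),
        "message": match.group("message"),
    }


if __name__ == "__main__":
    sample = (
        "000006ec.000009a8::{2007/03/18 20:15:55.579} WARN [FM] "
        "Group failure for group . Create thread to take offline and move."
    )
    entry = parse_cluster_log_line(sample)
    # Print only warnings raised by the Failover Manager component.
    if entry and entry["level"] == "WARN" and entry["component"] == "FM":
        print(entry["timestamp"], entry["message"])
```

Lines that do not match the assumed layout (such as the separator rows) return None, so they can be skipped when scanning a whole file.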
Suggestions:
+++ Shut down both nodes, power on Node 1 only, and check whether all resources, including "Thesis File Share", come online successfully.
+++ If they come online successfully, make the following registry changes on both nodes:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters\MaxFreeConnections (REG_DWORD Decimal 2000)
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters\MinFreeConnections (REG_DWORD Decimal 200)
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters\MaxWorkItems (REG_DWORD Decimal 12000)
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters\IRPStackSize (REG_DWORD Decimal 30)
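Since the same four values have to be set on both nodes, one way to avoid typing mistakes is to generate a .reg file once and import it on each node. The sketch below is only an illustration: the function name build_reg_file and the output file name are hypothetical, and it assumes the standard "Windows Registry Editor Version 5.00" text format, in which DWORD values are written as eight hexadecimal digits.

```python
# Key and decimal values taken from the list above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters"
VALUES = {
    "MaxFreeConnections": 2000,
    "MinFreeConnections": 200,
    "MaxWorkItems": 12000,
    "IRPStackSize": 30,
}


def build_reg_file(key, values):
    """Return .reg file text that sets each name to its REG_DWORD value."""
    lines = ["Windows Registry Editor Version 5.00", "", "[" + key + "]"]
    for name, number in values.items():
        # .reg files store DWORDs as 8 hex digits, so decimal 30 becomes 0000001e.
        lines.append('"{}"=dword:{:08x}'.format(name, number))
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical file name; review the file before importing it on a node.
    with open("lanmanserver_parameters.reg", "w", newline="") as handle:
        handle.write(build_reg_file(KEY, VALUES))
```

The generated file can be reviewed in a text editor and then imported on each node. Back up the existing key first; changes to the lanmanserver parameters are generally expected to take effect only after the Server service is restarted or the node is rebooted.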
This fixed the issue.