Channel: High Availability (Clustering) forum

Scale Out File Servers for IIS .NET Data

Hello,

I am having a difficult time finding a direct answer online about whether or not my specific scenario warrants the use of an SOFS cluster, or just a traditional File Server cluster. 

SCENARIO:

  • Roughly 30 Windows Server 2012 R2 Standard Web servers running IIS 8.X
  • Approximately 600 websites all running ASP.NET 4.0 applications
  • Each website requires 5 different Virtual Directories. 
  • Each Website is actually a radio station, and their content is very heavy on streaming media, pictures, and dynamic live content including streaming live radio broadcasts.
  • The cluster is connected to a SAN over redundant (MPIO) 8 Gbps Fibre Channel, which provides the shared storage
  • We have 3 physical hosts in the Windows cluster that will accommodate either the SOFS or the traditional File Server

So basically here's my quandary.  I've set this up a number of times, but never got it to the point where I could put any real load on the platform and see a difference.  I fully grasp the concepts of both, but I see very little documentation out there about others using SOFS technology for IIS Virtual Directories.  I do see posts stating that this is actually part of its core function, alongside Hyper-V and SQL.

So, anyone out there using this in a real world scenario?  Any recommendations on whether I should or should NOT be deploying an SOFS based on the information above?  If yes, or no, then why in either direction? 

I'm really looking to put a nail in the coffin on this question.  I actually have an advisory case opened up right now too, and even the first guy I talked to couldn't answer the question.  He said he'd have to get back to me.

Any help is GREATLY appreciated!  Have a good one!


How to take Cluster Quorum drive backup in Windows 2008

Dear All,

We have a Windows 2008 Enterprise failover cluster environment. I want to know whether I need to back up the quorum drive, and whether it is safe to take the backup while the servers are online. Can anyone please tell me how to back up the quorum drive? Can I use Windows Backup or the Symantec Backup Exec utility, or should I manually copy all the folders?

Thanks

Agha

Access is denied messages in Win2012 R2 Failover Cluster validation report and CSV entering a paused state

Been having some issues with nodes basically dropping out of the cluster's config.
The error shown was:

"Cluster Shared Volume 'Volume1' ('Data') has entered a paused state because of '(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished."

All nodes (PowerEdge 420) are connected to a Dell MD3200 shared SAS storage array.

Nodes point to virtual 2012 R2 DCs.

Upon running validation with just two nodes, I get the same errors over and over again.

Bemused!

----------------

List Software Updates
Description: List software updates that have been applied on each node.
An error occurred while executing the test.
An error occurred while getting information about the software updates installed on the nodes.

One or more errors occurred.

Creating an instance of the COM component with CLSID {4142DD5D-3472-4370-8641-DE7856431FB0} from the IClassFactory failed due to the following error: 80070005 Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED)).


and

List Disks
Description: List all disks visible to one or more nodes. If a subset of disks is specified for validation, list only disks in the subset.
An error occurred while executing the test.
Storage cannot be validated at this time. Node 'zhyperv2.KISLNET.LOCAL' could not be initialized for validation testing. Possible causes for this are that another validation test is being run from another management client, or a previous validation test was unexpectedly terminated. If a previous validation test was unexpectedly terminated, the best corrective action is to restart the node and try again.

Access is denied

-----------

The event viewer on one of the hosts shows
-------------
Cluster node 'zhyperv2' lost communication with cluster node 'zhyperv1'.  Network communication was reestablished. This could be due to communication temporarily being blocked by a firewall or connection security policy update. If the problem persists and network communication are not reestablished, the cluster service on one or more nodes will stop.  If that happens, run the Validate a Configuration wizard to check your network configuration. Additionally, check for hardware or software errors related to the network adapters on this node, and check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

The only other warning is because the 4 NIC ports in each node are teamed on one IP address, split over two switches. I am not concerned about this and could, if required, split them into pairs. I think this is a red herring?
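
For anyone reading the two hex codes quoted above (80070005 in the validation report and c000020c in the CSV event), here is a minimal Python sketch that splits them into their documented bit fields. The function names are made up for illustration. It only decodes the layout; it does not diagnose the cluster.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and code."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31: set means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26
        "code": value & 0xFFFF,             # bits 0-15
    }


def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS into its severity, facility, and code."""
    value &= 0xFFFFFFFF
    return {
        "severity": value >> 30,            # bits 30-31: 3 means error
        "facility": (value >> 16) & 0xFFF,  # bits 16-27
        "code": value & 0xFFFF,             # bits 0-15
    }


if __name__ == "__main__":
    print(decode_hresult(0x80070005))   # code from the validation report
    print(decode_ntstatus(0xC000020C))  # code from the CSV paused-state event
```

For 80070005 this gives facility 7 (the Win32 facility) and code 5, which matches the "Access is denied" text in the report. For c000020c it gives an error-severity status with code 0x20C; I believe that value is STATUS_CONNECTION_DISCONNECTED in ntstatus.h, but please verify that against the header before relying on it.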

How to remove a node from NLB at runtime?

Hello,

I need to temporarily exclude a node from an NLB cluster.

It may happen that a server is up and working, but the web application I'm balancing is out of sync with the same application on the other nodes.

E.g. some static variables differ from the same static variables on the other nodes because of a timeout, a write error, and so on, but the server is still working.

In this case I need to stop the server in NLB, because the information in the web application is not in sync with the other nodes.

I need to prevent users from being served by this out-of-date server until it has been updated, and I need to do this programmatically.

How can I do it?
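
One possible approach, offered as a sketch rather than a tested recipe: the NLB feature ships with a command-line tool, nlb.exe, whose drainstop command stops a host from accepting new connections while letting existing ones finish, and whose start command brings it back. The Python wrapper below only builds and runs that command line. The function names, the cluster address 10.0.0.50, and the host name WEB07 are hypothetical, and the exact argument syntax should be checked with "nlb.exe help" on your OS version, since the post does not say which one is in use.

```python
import subprocess
from typing import List, Optional

# Subset of nlb.exe commands this sketch allows.
ALLOWED_ACTIONS = {"drainstop", "stop", "start", "query"}


def build_nlb_command(action: str,
                      cluster: Optional[str] = None,
                      host: Optional[str] = None) -> List[str]:
    """Build an nlb.exe command line such as: nlb.exe drainstop cluster:host."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    if host and not cluster:
        raise ValueError("a host can only be given together with a cluster")
    command = ["nlb.exe", action]
    if cluster:
        # Assumed target syntax: cluster, or cluster:host for a single node.
        command.append(f"{cluster}:{host}" if host else cluster)
    return command


def run_nlb(action: str,
            cluster: Optional[str] = None,
            host: Optional[str] = None) -> subprocess.CompletedProcess:
    """Run the command and return the result; the caller checks returncode."""
    return subprocess.run(build_nlb_command(action, cluster, host),
                          capture_output=True, text=True, check=False)


if __name__ == "__main__":
    # Hypothetical values: take the local node out of rotation.
    print(build_nlb_command("drainstop", "10.0.0.50", "WEB07"))
```

The application's own health check could call run_nlb("drainstop") when it detects stale state and run_nlb("start") once it has caught up. On Windows Server 2008 R2 and later, the Stop-NlbClusterNode PowerShell cmdlet with its -Drain switch should be an alternative to shelling out to nlb.exe.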


event id 6237 on a Hyper-V cluster

Hi. We're getting an error with event ID 6237 (source FailoverClustering-WMIProvider) on a 2008 R2 SP1 Hyper-V cluster.

There was an older thread http://social.technet.microsoft.com/Forums/en-US/bae84aea-5a80-4637-989b-c31bbc1aa55a/event-id-6237-failover-cluster-wmi-provider-detected-an-invalid-character?forum=windowsserver2008r2highavailability which suggests installing KB974930, but that update is not applicable to my nodes (probably part of SP1?).

How do we solve this issue?

My message concerns a different property (PreviousOfflineAction, which I failed to find using the GUI, PowerShell, or cluster.exe):

Failover Cluster WMI Provider detected an invalid character. The private property name 'PreviousOfflineAction' had an invalid character but the provider failed to change it to a valid property name. Property names must start with A-Z or a-z, and valid characters for WMI property names are A-Z, a-z, 0-9, and '_'.

This message is repeated 6x every 15 minutes.
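
The naming rule quoted in the event text can be written down as a small check. This is a Python sketch of the rule exactly as the message states it, not of what the WMI provider actually does internally. The helper names are made up.

```python
import re

# Rule as quoted in the event: first character A-Z or a-z,
# remaining characters A-Z, a-z, 0-9, or '_'.
WMI_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_wmi_name(name: str) -> bool:
    """Return True if the whole name satisfies the quoted rule."""
    return WMI_NAME.fullmatch(name) is not None


def offending_characters(name: str) -> list:
    """List (index, code point) for every character that breaks the rule."""
    bad = []
    for index, ch in enumerate(name):
        if index == 0:
            allowed = ch.isascii() and ch.isalpha()
        else:
            allowed = ch.isascii() and (ch.isalnum() or ch == "_")
        if not allowed:
            bad.append((index, f"U+{ord(ch):04X}"))
    return bad


if __name__ == "__main__":
    print(is_valid_wmi_name("PreviousOfflineAction"))
    print(offending_characters("Previous\u200bOfflineAction"))
```

As typed in the event text, PreviousOfflineAction passes this check. So if the rule is stated accurately, the stored name may differ from what is displayed (a trailing space or a non-printing character, say). That is speculation on my part, not a confirmed cause.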


Removing DFS namespace gives error "dfs_cluster: The cluster file share resource for share cannot be found in group . The resource cannot be found."

We have a Microsoft cluster running on two Windows Server 2008 R2 Enterprise servers with the File Services role (including the DFS role services).  We have 3 standalone DFS namespaces (roots) which have been up and running for a couple of months.  Users were complaining about performance, so we added the DfsDnsConfig registry entry to force the FQDN format in the namespace (KB244380).

Although it doesn't say to do so in the KB article, I found several threads that said you also need to delete and recreate the namespaces afterwards.  This is where I ran into the problem.

I deleted one of my namespaces from the Failover Cluster Manager utility.  I then recreated the namespace.  It works, but the share was empty.  This is because by default it placed the namespace in DFSRoots and we have our namespaces in DFSroot (no trailing "s").

So I am trying to delete the new namespace in DFSRoots and create it in the DFSRoot folder, but I get the error listed in the title.  This is what I've tried so far to resolve this:

1. dfsutil to remove the namespace and it fails saying the command is not supported in a cluster environment.  Fair enough.

2. I tried to delete the share (using "Share and "), but it fails saying the shared folder is a DFS namespace root.

3. I tried to find an option to add the namespace to the resource group (using "Failover Cluster Manager"), but I can't find the appropriate option.  Maybe it can't be done from "Failover Cluster Manager" unless you are creating a new one, I think.

4. Of course I tried to delete the namespace (using the "DFS Management" utility) and that's where I get the error in the Title of this post.

5. I deleted the registry entry for the namespace under hklm\software\Microsoft\dfs\roots\standalone.  I did this on both Fileserver nodes in the cluster.  However then all the utilities complained that they couldn't find the namespace and I couldn't do anything with the specific namespace.  So I put the registry key back in.

At this point I don't know what else to try to delete this namespace properly.  Any assistance is appreciated.



2012R2 Cluster "Protected Network" not working / no failover

Hi,

I have two 2012 R2 hosts running in a cluster. The hardware is exactly the same. I created a virtual network in Hyper-V Manager on both hosts. Live migration is working correctly.

However, I was testing one of the new features, "Protected Network", where the VM should fail over to the other node when the virtual network fails. When I try it by disconnecting the cables on the node where the VM is running, I see it go into an error state, but nothing happens (other than the VM no longer being available). I've waited for hours, but there is no failover.

Am I missing something?

Windows 2012 R2 cluster nodes hanging frequently

Hello all,

Expecting better performance and stability, we created a new failover cluster on Windows 2012 R2 with 3 HP BL460 G8 servers and SAN storage. But it seems that even moving a folder or deleting a file makes the server unresponsive, and it hangs for hours.  It will not even cleanly sign out the current user, eventually leading to a soft reboot.

Cluster validation doesn't produce any errors. This is another issue with Windows 2012 R2, after we solved the NIC teaming problems.

Is Windows Server 2012 R2 still not stable?

Thanks

Mohammed Uwaiz A


2008 R2 Multi-Site Cluster Possible without Third Party Software?

Is a 2008 R2 multi-site cluster possible without third-party software?  I know it has been asked, but I haven't seen a specific answer.  We are using Dell Compellent storage, and I created several read-only replicated volumes at our DR site.  I am having issues failing over to the volumes after I use the cluster CLI to mount the read-only volumes.

Any help would be appreciated greatly!

 

Cluster dependency in Domain

Hi,

I have come across a confusing question.

One of our customer environments has only one Windows 2008 Domain Controller in the entire forest and domain. We know that this is a design problem which needs to be corrected as a priority. But I got stuck answering some of these questions:

1) If I am running a Windows 2008 Failover Cluster for a SQL DB, what is the level of dependency on my domain/Domain Controller?

2) If the single Domain Controller in the environment fails due to a hardware or OS problem, will the failover cluster be impacted immediately, or will it keep running for some time? If it keeps running for some time and fails later, at what stage will it actually fail?

Please share if you know about any of these points; it's a good discussion point as well.


Regards:Mahesh

Server 2008 Cluster Random failover occurring on Fileserver Resource

We have a 2-node active/passive 2008 SQL cluster that also has a file share on it that randomly fails over. We get the following events.

Events from Cluster Admin

Event 1

Event ID 1230

Cluster resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)' (resource type '', DLL 'clusres.dll') either crashed or deadlocked. The Resource Hosting Subsystem (RHS) process will now attempt to terminate, and the resource will be marked to run in a separate monitor.

Event 2

Event ID 1146

The cluster resource host subsystem (RHS) stopped unexpectedly. An attempt will be made to restart it. This is usually due to a problem in a resource DLL. Please determine which resource DLL is causing the issue and report the problem to the resource vendor.

Event 3

Event ID 1069

Cluster resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)' in clustered service or application 'SQL Server (SQLPRODA)' failed.

Event 4

Event ID 1205

The Cluster service failed to bring clustered service or application 'SQL Server (SQLPRODA)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

We have updated the NIC drivers on each node, and the drivers and BIOS have been updated on the HBAs. We have updated the srv.sys and srv2.sys files, thinking it might be an SMB issue. TCP offloading is disabled on the NICs. We are running SP2 on both nodes and all the Windows updates are current.  In the cluster logs we are seeing what is listed below.

HYSQL02
========
00000cc8.00001364::2010/02/17-18:23:32.352 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:24:32.353 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:25:32.356 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:26:32.414 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ImportFiles), status 2114. Tolerating...
00000cc8.00001364::2010/02/17-18:29:32.369 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:32:32.431 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
00000cc8.00001364::2010/02/17-18:35:32.387 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:37:32.392 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:42:32.408 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:43:32.410 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:44:32.425 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:48:32.798 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:51:32.949 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:54:33.045 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-18:58:33.158 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
00000cc8.00001364::2010/02/17-19:01:33.192 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
00000cc8.00001364::2010/02/17-19:05:33.166 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:10:33.182 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:11:33.184 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:13:33.190 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:22:33.218 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:26:33.229 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:27:33.232 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:28:33.236 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:29:33.238 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ImportFiles), status 64. Tolerating...
00000cc8.00001364::2010/02/17-19:30:33.241 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ReportImages), status 64. Tolerating...

00000cc8.00000cd4::2010/02/17-19:30:34.000 ERR [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)'.
00000cc8.00000cd4::2010/02/17-19:30:34.000 ERR [RHS] Resource FileServer-(MSCS3)(Cluster Disk 4- Database) handling deadlock. Cleaning current operation and terminaiting RHS process.
000009ec.0000174c::2010/02/17-19:30:34.000 INFO [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'FileServer-(MSCS3)(Cluster Disk 4- Database)', gen(0) result 4.
000009ec.0000174c::2010/02/17-19:30:34.000 INFO [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)' consecutive failure count 1.
000009ec.0000174c::2010/02/17-19:30:34.002 ERR [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 3272 / 0xcc8
000009ec.0000174c::2010/02/17-19:30:34.004 INFO [RCM] Created monitor process 2248 / 0x8c8
000008c8.000010c8::2010/02/17-19:30:34.019 INFO [RHS] Initializing.
000009ec.0000174c::2010/02/17-19:30:34.030 INFO [RCM] rcm::RcmResource::ReattachToMonitorProcess: (FileServer-(MSCS3)(Cluster Disk 4- Database), Online)


000009ec.0000174c::2010/02/17-19:30:34.030 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) Initializing-->OpenCallIssued.
000009ec.0000174c::2010/02/17-19:30:34.030 INFO [RCM] rcm::RcmGroup::ProcessStateChange: (SQL Server (SQLPRODA), Online --> PartialOnline)
000009ec.0000174c::2010/02/17-19:30:34.055 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) Online-->ProcessingFailure.
000009ec.0000174c::2010/02/17-19:30:34.055 INFO [RCM] rcm::RcmGroup::ProcessStateChange: (SQL Server (SQLPRODA), PartialOnline --> Failed)
000009ec.0000174c::2010/02/17-19:30:34.055 ERR [RCM] rcm::RcmResource::HandleFailure: (FileServer-(MSCS3)(Cluster Disk 4- Database))
000009ec.0000174c::2010/02/17-19:30:34.055 INFO [RCM] resource FileServer-(MSCS3)(Cluster Disk 4- Database): failure count: 1, restartAction: 2.
000009ec.0000174c::2010/02/17-19:30:34.055 INFO [RCM] Will restart resource in 500 milliseconds.
000009ec.0000174c::2010/02/17-19:30:34.055 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) ProcessingFailure-->[Terminating to DelayRestartingResource].
000009ec.0000174c::2010/02/17-19:30:34.055 INFO [RCM] rcm::RcmGroup::ProcessStateChange: (SQL Server (SQLPRODA), Failed --> Pending)
000008c8.00001784::2010/02/17-19:30:34.112 INFO [RES] File Server : FileServerDoTerminate: Terminate called... !!!
000009ec.0000126c::2010/02/17-19:30:34.119 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) [Terminating to DelayRestartingResource]-->DelayRestartingResource.
000009ec.0000174c::2010/02/17-19:30:34.619 INFO [RCM] Delay-restarting FileServer-(MSCS3)(Cluster Disk 4- Database) and any waiting dependents.
000009ec.0000174c::2010/02/17-19:30:34.619 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) DelayRestartingResource-->OnlineCallIssued.
000009ec.0000126c::2010/02/17-19:30:34.620 INFO [RCM] HandleMonitorReply: ONLINERESOURCE for 'FileServer-(MSCS3)(Cluster Disk 4- Database)', gen(1) result 997.
000009ec.0000126c::2010/02/17-19:30:34.620 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) OnlineCallIssued-->OnlinePending.
000008c8.000016cc::2010/02/17-19:30:34.657 INFO [RES] File Server : Shares 'are being scoped to virtual name MSCS3



HYSQL01
=========

000015ac.00001200::2010/02/17-21:42:54.976 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-21:47:51.082 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-21:51:51.094 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-21:56:51.056 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000015ac.00001200::2010/02/17-22:06:51.139 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-22:09:51.148 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000009e0.00001b08::2010/02/17-22:17:51.431 INFO [NM] Received request from client address 10.1.0.220.
000015ac.00001200::2010/02/17-22:21:51.184 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-22:25:31.804 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-22:30:34.959 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000015ac.00001200::2010/02/17-22:31:36.518 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 2114. Tolerating...
000015ac.00001200::2010/02/17-22:34:41.036 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-22:39:48.514 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000015ac.00001200::2010/02/17-22:42:51.247 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000009e0.0000132c::2010/02/17-22:44:16.801 INFO [NM] Received request from client address 10.1.0.220.
000015ac.00001200::2010/02/17-22:47:51.209 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000015ac.00001200::2010/02/17-22:49:51.215 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
000009e0.000015f4::2010/02/17-22:51:27.511 INFO [NM] Received request from client address 10.1.0.220.
000015ac.00001200::2010/02/17-22:52:51.277 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-22:55:51.286 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-23:06:51.319 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000015ac.00001200::2010/02/17-23:12:51.284 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000015ac.00001200::2010/02/17-23:13:51.340 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 2114. Tolerating...
000015ac.00001200::2010/02/17-23:16:51.349 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...


2nd Issues
----------------

000018f0.0000137c::2010/02/16-18:03:23.988 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-18:07:23.947 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:11:23.959 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:13:23.965 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:14:24.021 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-18:20:23.986 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:23:23.996 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:26:24.005 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:27:24.007 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:28:24.063 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-18:37:24.038 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:38:24.094 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-18:41:24.102 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-18:44:24.059 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:50:24.129 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-18:54:24.089 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:55:24.091 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:56:24.095 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-18:57:24.151 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ImportFiles), status 2114. Tolerating...
000009e0.00000d2c::2010/02/16-19:13:04.903 INFO [NM] Received request from client address 10.1.0.220.
000018f0.0000137c::2010/02/16-19:18:24.213 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
000018f0.0000137c::2010/02/16-19:22:24.172 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
000018f0.0000137c::2010/02/16-19:24:24.178 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
000018f0.000012dc::2010/02/16-19:25:25.000 ERR [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)'.
000018f0.000012dc::2010/02/16-19:25:25.000 ERR [RHS] Resource FileServer-(MSCS3)(Cluster Disk 4- Database) handling deadlock. Cleaning current operation and terminaiting RHS process.
000009e0.00000f48::2010/02/16-19:25:25.000 INFO [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'FileServer-(MSCS3)(Cluster Disk 4- Database)', gen(1) result 4.
000009e0.00000f48::2010/02/16-19:25:25.000 INFO [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)' consecutive failure count 1.
000009e0.00000f48::2010/02/16-19:25:25.002 ERR [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 6384 / 0x18f0
000009e0.00000f48::2010/02/16-19:25:25.003 INFO [RCM] Created monitor process 6020 / 0x1784
00001784.00001b1c::2010/02/16-19:25:25.012 INFO [RHS] Initializing.
000009e0.00000f48::2010/02/16-19:25:25.023 INFO [RCM] rcm::RcmResource::ReattachToMonitorProcess: (FileServer-(MSCS3)(Cluster Disk 4- Database), Online)
000009e0.00000f48::2010/02/16-19:25:25.023 INFO [RCM] TransitionToState(FileServer-(MSCS3)(Cluster Disk 4- Database)) Initializing-->OpenCallIssued.
000009e0.00000f48::2010/02/16-19:25:25.023 INFO [RCM] rcm::RcmGroup::ProcessStateChange: (SQL Server (SQLPRODA), Online --> PartialOnline)



3)

00000d80.00000388::2010/02/16-12:15:13.281 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
00000d80.00000388::2010/02/16-12:19:19.253 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:24:22.132 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:25:22.187 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 2114. Tolerating...
00000d80.00000388::2010/02/16-12:29:22.146 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:42:22.185 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:50:22.209 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:51:22.212 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:53:22.218 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-12:54:22.274 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ImportFiles), status 2114. Tolerating...
00000d80.00000388::2010/02/16-13:01:31.308 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
00000d80.00000388::2010/02/16-13:10:22.322 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...
00000d80.00000388::2010/02/16-13:13:22.279 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-13:17:22.291 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-13:20:22.300 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00000d80.00000388::2010/02/16-13:22:22.305 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00000d80.00000388::2010/02/16-13:24:22.311 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00000d80.00000d8c::2010/02/16-13:24:23.000 ERR [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)'.
00000d80.00000d8c::2010/02/16-13:24:23.000 ERR [RHS] Resource FileServer-(MSCS3)(Cluster Disk 4- Database) handling deadlock. Cleaning current operation and terminaiting RHS process.
000009e0.000015dc::2010/02/16-13:24:23.000 INFO [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'FileServer-(MSCS3)(Cluster Disk 4- Database)', gen(0) result 4.
000009e0.000015dc::2010/02/16-13:24:23.000 INFO [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)' consecutive failure count 1.
000009e0.000015dc::2010/02/16-13:24:23.002 ERR [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 3456 / 0xd80



4)

00001770.00001594::2010/02/09-16:01:06.362 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ReportImages), status 2114. Tolerating...
00000aa4.0000183c::2010/02/09-16:01:15.630 INFO [RES] Physical Disk: HardDiskpGetDiskInfo: Disk is of type MBR, signature 0x3ba33338
00000aa4.0000183c::2010/02/09-16:01:19.036 INFO [RES] Physical Disk: HardDiskpGetDiskInfo: Disk is of type MBR, signature 0x3ba3333f
00000aa4.0000183c::2010/02/09-16:01:19.040 INFO [RES] Physical Disk: HardDiskpGetDiskInfo: Disk is of type MBR, signature 0x3ba3333a
00000aa4.0000183c::2010/02/09-16:01:19.044 INFO [RES] Physical Disk: HardDiskpGetDiskInfo: Disk is of type MBR, signature 0x3ba33339
00001770.00001910::2010/02/09-16:05:06.311 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...
00001770.00001910::2010/02/09-16:06:06.314 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...
00001770.00001910::2010/02/09-16:07:06.317 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_HelpSystem), status 64. Tolerating...
00001770.00001910::2010/02/09-16:08:06.320 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, PreProd_ImportFiles), status 64. Tolerating...
00001770.00000d14::2010/02/09-16:08:07.000 ERR [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)'.
00001770.00000d14::2010/02/09-16:08:07.000 ERR [RHS] Resource FileServer-(MSCS3)(Cluster Disk 4- Database) handling deadlock. Cleaning current operation and terminaiting RHS process.
000009f0.00001324::2010/02/09-16:08:07.000 INFO [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'FileServer-(MSCS3)(Cluster Disk 4- Database)', gen(4) result 4.
000009f0.00001324::2010/02/09-16:08:07.000 INFO [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)' consecutive failure count 1.
000009f0.00001324::2010/02/09-16:08:07.002 ERR [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 6000 / 0x1770
000009f0.00001324::2010/02/09-16:08:07.003 INFO [RCM] Created monitor process 4748 / 0x128c






Analysis
----------------
We are getting status 64 and 2114 from NetShareGetInfo, and the File Server resource then fails with an ISALIVE deadlock timeout.

Status 64 = the specified network name is no longer available.

Status 2114 = The Server service is not started.

We set up Netmon and ran traces yesterday when the issue happened, and they did not show anything. The Server service does not appear to log any errors.
We have also engaged EMC on the issue, and MS has escalated the case, but we wanted to see whether anyone else has experienced this issue or found a resolution. We have run out of options.
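
To see which shares and status codes dominate before each deadlock, the WARN lines can be tallied with a short script. This is a minimal sketch, assuming the cluster.log line layout shown in the excerpts above; the function name `summarize_share_failures` and the `sample` list are illustrative and not part of any cluster tooling.

```python
import re
from collections import Counter

# Status codes as described in the analysis above.
STATUS_TEXT = {
    64: "The specified network name is no longer available",
    2114: "The Server service is not started",
}

# Matches the File Server WARN lines as they appear in the excerpts above.
SHARE_FAILURE = re.compile(
    r"::(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3}) WARN \[RES\] File Server : "
    r"Failed in NetShareGetInfo\((?P<server>[^,]+), (?P<share>[^)]+)\), "
    r"status (?P<status>\d+)\."
)

def summarize_share_failures(lines):
    """Count NetShareGetInfo failures per (share, status) pair."""
    counts = Counter()
    for line in lines:
        match = SHARE_FAILURE.search(line)
        if match:
            counts[(match["share"], int(match["status"]))] += 1
    return counts

# A few lines copied from the log excerpt above.
sample = [
    "00000d80.00000388::2010/02/16-13:01:31.308 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...",
    "00000d80.00000388::2010/02/16-13:10:22.322 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 2114. Tolerating...",
    "00000d80.00000388::2010/02/16-13:13:22.279 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, HelpSystem), status 64. Tolerating...",
    "00000d80.00000388::2010/02/16-13:22:22.305 WARN [RES] File Server : Failed in NetShareGetInfo(MSCS3, ImportFiles), status 64. Tolerating...",
    "00000d80.00000d8c::2010/02/16-13:24:23.000 ERR [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'FileServer-(MSCS3)(Cluster Disk 4- Database)'.",
]

for (share, status), count in sorted(summarize_share_failures(sample).items()):
    print(f"{share}: status {status} x{count} ({STATUS_TEXT.get(status, 'unknown')})")
```

Running the same function over the full log from each node would show whether the switch from status 2114 to status 64 always precedes the ISALIVE timeout, as it does in the excerpts.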

The requested object does not exist. (Exception from HRESULT: 0x80010114)

Hello,

I have a 3-node cluster that is set up as active, active, passive.  Two of the nodes report the following error when trying to connect to the cluster:

One of the active nodes is able to connect to the cluster, while the other two are not.  The objects do exist in AD, and the virtual cluster name has full rights on itself.  I turned on DNS Client event logging and receive the following messages on nodes that are able to connect to the cluster:

  1. DNS FQDN Query operation for the name "ClusterNodeC" and for the type 28 is completed with result 0x251D
  2. DNS Cache lookup operation for the name "ClusterNodeC" and for the type 28 is completed with result 0x251D
  3. DNS Cache lookup is initiated for the name"ClusterNodeC" and for the type 28 with query options 0x40026010
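
For reference, the numeric values in these events can be decoded. Record type 28 is AAAA (an IPv6 address record), and result 0x251D is decimal 9501, the Win32 code DNS_INFO_NO_RECORDS. The sketch below only performs that lookup; the helper name `describe_dns_event` is illustrative. A missing AAAA record is common on IPv4-only networks, so these events may be unrelated to the connection failure.

```python
# Values taken from the DNS Client events quoted above.
DNS_RECORD_TYPES = {1: "A", 28: "AAAA"}  # IANA resource record type numbers
DNS_RESULTS = {0: "ERROR_SUCCESS", 9501: "DNS_INFO_NO_RECORDS"}  # Win32 codes

def describe_dns_event(record_type, result):
    """Turn the numeric type and result of a DNS Client event into text."""
    type_name = DNS_RECORD_TYPES.get(record_type, f"type {record_type}")
    result_name = DNS_RESULTS.get(result, "unknown result")
    return f"{type_name} query: {result_name} ({result})"

print(describe_dns_event(28, 0x251D))
```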

Any help or direction would be greatly appreciated.
Thanks,
zWindows

 


RhsCall::DeadlockMonitor: Call ONLINERESOURCE timed out for resource

Hi,

We have a two-node Windows Server 2008 R2 cluster that is experiencing issues on failover. Everything is currently running on Node 2; however, we are no longer able to fail over to Node 1. In the logs I see the following messages:

2014/01/27-23:24:29.840 INFO  [RES] Network Name <SQLMASTER>: DNS name SQLMASTER.here.net Registration with LSA was successful
2014/01/27-23:25:29.000 ERR   [RHS] RhsCall::DeadlockMonitor: Call ONLINERESOURCE timed out for resource 'SQLMASTER'.
2014/01/27-23:25:29.000 ERR   [RHS] Resource SQLMASTER handling deadlock. Cleaning current operation.
2014/01/27-23:25:29.000 WARN  [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'SQLMASTER', gen(0) result 5018.
2014/01/27-23:25:29.000 INFO  [RCM] TransitionToState(SQLMASTER) OnlinePending-->ProcessingFailure.
2014/01/27-23:25:29.000 ERR   [RCM] rcm::RcmResource::HandleFailure: (SQLMASTER)
2014/01/27-23:25:29.000 INFO  [RCM] resource SQLMASTER: failure count: 1, restartAction: 2.
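
The gap between the successful LSA registration and the ONLINERESOURCE timeout in the excerpt above can be measured directly from the timestamps. This is a small sketch, assuming the timestamp format used in this post (no process or thread prefix); the helper names are illustrative.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"

def parse_log_time(line):
    """Parse the leading timestamp of a log line as quoted in this post."""
    return datetime.strptime(line.split()[0], LOG_TIME_FORMAT)

def seconds_between(earlier_line, later_line):
    """Return the elapsed seconds between two log lines."""
    delta = parse_log_time(later_line) - parse_log_time(earlier_line)
    return delta.total_seconds()

registered = ("2014/01/27-23:24:29.840 INFO  [RES] Network Name <SQLMASTER>: "
              "DNS name SQLMASTER.here.net Registration with LSA was successful")
timed_out = ("2014/01/27-23:25:29.000 ERR   [RHS] RhsCall::DeadlockMonitor: "
             "Call ONLINERESOURCE timed out for resource 'SQLMASTER'.")

print(f"{seconds_between(registered, timed_out):.2f} s")  # prints 59.16 s
```

The resource therefore sat in its online call for roughly a minute after the last successful step before the deadlock monitor gave up, which narrows the search to whatever the Network Name resource does after LSA registration.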

The cluster then attempts to restart the resource, brings the network name online and encounters the following errors:

2014/01/27-23:26:26.445 WARN  [RES] Network Name <SQLMASTER>: WaitForTargetToComeUp: WSA_QOS_ADMISSION_FAILURE(11010)' because of '[cxl::Pinger-"SQLMASTER"] Could not send IPv4 echo.'
2014/01/27-23:26:26.445 WARN  [RES] Network Name <SQLMASTER>: WaitForTargetToComeUp: WSA_QOS_ADMISSION_FAILURE(11010)' because of '[cxl::Pinger-"SQLMASTER"] Could not send IPv4 echo.'
2014/01/27-23:26:29.048 INFO  [RES] Network Name <SQLMASTER>: [cxl::Pinger-"SQLMASTER"] Host registered, but no records of type 23
2014/01/27-23:26:29.048 INFO  [RES] Network Name <SQLMASTER>: [cxl::Pinger-"SQLMASTER"] Host registered, but no records of type 23
2014/01/27-23:26:29.048 WARN  [RES] Network Name <SQLMASTER>: [cxl::Pinger-"SQLMASTER"] Could not find any endpoints for remote target
2014/01/27-23:26:29.048 WARN  [RES] Network Name <SQLMASTER>: [cxl::Pinger-"SQLMASTER"] Could not find any endpoints for remote target
2014/01/27-23:26:29.049 INFO  [RES] Network Name <SQLMASTER>: Setting resource specific message to <Name Resolution Not Yet Available>.

Finally the cluster attempts to start SQL Server, which fails, and the cluster fails back to Node 2 successfully.

The cluster was previously running successfully on Node 1, and to my knowledge there have been no changes to the server or cluster configuration. 

Thank you in advance for any suggestions you can provide.

-Daniel

route add/delete broke my networking

I have run into a networking problem on my cluster, and I cannot figure out what changed.

I had a cluster configured and working.  Its access network is 192.168.10.0/24.  I have other networks, including a node management network on 10.29.130.0/24.  The access network is private to my lab; it's here I have my AD defined.  The management network is a 'lab' network that has a gateway available so we can access the lab from our corporate network with no issues.  In other words, the access network is only routed within my private lab, but the management network can be routed to corporate access.

But, as you know, Windows simply doesn't like creating two different networks with default gateways.  So before building the cluster, I removed the gateway from the management network, ensuring there was only a single gateway configured on each host.  Ran the validation and it came through fine (typical network warnings about non-routed networks not able to reach other networks, but that is expected and presents no problems).  Built the cluster.

I wanted to try to create an environment that would allow me to access the physical hosts through the management network, so I tried issuing a route add command specific to the management network. 

route add 10.29.130.0 mask 255.255.255.0 10.29.130.1 if 3

It didn't work as expected (I am by no means a networking expert, but I figured I would try it.)  Since it didn't work, I deleted it.

route delete 10.29.130.0 mask 255.255.255.0 10.29.130.1 if 3

Came back and ran another validation wizard on the cluster and now the validation fails with the following error (to each of the other nodes in the cluster):

Network interfaces FT4-Infra01.VSPEX.COM - Mgmt and FT4-Infra03.VSPEX.COM - Mgmt are on the same cluster network, yet address 10.29.130.37 is not reachable from 10.29.130.35 using UDP on port 3343.

I checked my firewall, and that port is open on all nodes for all firewall profiles.  From FT4-Infra01 (the machine I was messing with), I can ping the other nodes of the cluster.  From the other nodes in the cluster, I cannot ping FT4-Infra01.  Yes, I know ping is a different rule, but I always go back to basics.  And the fact that it is not responding to pings after playing with the route is strange.

Does anybody have any ideas about what playing with the route command could have changed?  I am assuming that is the cause because everything worked fine before issuing the command and now these errors are there.  No other changes were made to the cluster or the nodes.
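
As a sanity check on the addressing, the two node addresses from the validation error and the gateway from the route command all fall inside 10.29.130.0/24. Hosts on the same subnet reach each other directly, without a gateway, so traffic between the Mgmt interfaces should not depend on the route that was added and removed. The sketch below only verifies the subnet membership using Python's standard `ipaddress` module; it does not explain the failed validation.

```python
import ipaddress

# Network and mask from the route add command above.
MGMT_NETWORK = ipaddress.ip_network("10.29.130.0/255.255.255.0")

def is_on_link(address, network=MGMT_NETWORK):
    """Return True if the address belongs to the given subnet."""
    return ipaddress.ip_address(address) in network

# Node addresses from the validation error, plus the gateway from the route.
for addr in ("10.29.130.35", "10.29.130.37", "10.29.130.1"):
    print(addr, "on-link" if is_on_link(addr) else "needs a route")
```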


.:|:.:|:. tim

Hyper-V Clustering - Cluster Storage

I have created a Hyper-V failover cluster between 2 nodes.

On both nodes, a ClusterStorage folder was created on the C drive, where I will store the virtual machines. But I get an error that the user doesn't have access to the folder.

I also noticed that the ClusterStorage folder has a security lock icon on it.

Any ideas on how to give access to the administrator user?


Multipath driver updates required

I need to update the multipath drivers on my Windows Server 2008 Standard 64-bit Service Pack 2 server. I am installing netapp_windows_host_utilities_6.0.2_x64.msi, and it gives me this error:

{The following hotfixes are not available on the system:

1. Q2684681 - msiscsi.sys (Required: 6.0.6002.22814 / Installed: 6.0.6002.18005)

2. Q2754704 - mpio.sys (Required: 6.0.6002.22814 / Installed: 6.0.6002.18005)

Please install the hotfixes listed and retry the installation.}
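
The installer's check amounts to a component-wise comparison of the four-part file versions. The sketch below reproduces that comparison with the numbers from the error message; the dictionary layout and function names are illustrative and are not taken from the NetApp installer.

```python
def parse_version(text):
    """Split a dotted file version into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))

# Driver: (required version, installed version), copied from the error above.
DRIVER_VERSIONS = {
    "msiscsi.sys": ("6.0.6002.22814", "6.0.6002.18005"),
    "mpio.sys": ("6.0.6002.22814", "6.0.6002.18005"),
}

def outdated_drivers(versions):
    """Return the drivers whose installed version is below the required one."""
    return sorted(
        name
        for name, (required, installed) in versions.items()
        if parse_version(installed) < parse_version(required)
    )

print(outdated_drivers(DRIVER_VERSIONS))
```

Comparing as integer tuples matters here: a plain string comparison would also rank 18005 below 22814, but it would give the wrong answer for versions with differing digit counts.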

Can anybody help me solve this problem or point me to these hotfixes? I have searched online extensively and tried many hotfixes, but with no result.

Thanks,

Ravinder Kumar.

How to move cluster Hyper-V computer object to another ou in AD

2 Hyper-V hosts (2012 R2 DC) in Cluster config

Cluster got created BEFORE Hyper-V hosts AD Computer Objects got moved to a proper desired OU

Is there a way to move Hyper-V host AD objects now?

Will Cluster need to be re-created?

Thanks

Seb


Cluster Aware Updating Scheduling

I am having an issue with Cluster Aware Updating (CAU) on Server 2012 and Server 2012 R2. If I schedule self-updating and specify a time other than 03:00 (e.g. 07:00) in the wizard, the confirmation page at the end of the wizard shows the schedule as 03:00. If I click Apply, the schedule does appear to be set to 03:00. The same behaviour occurs whether I am setting up CAU for the first time or editing an existing configuration. Thus it is not possible to schedule cluster aware updating for any time other than 03:00.

I am assuming that this is a bug, although I am open to suggestions if anyone else can think of a possible cause. I haven't found this mentioned anywhere online, and on MS Connect, Server 2012 is not listed as open for bugs. Has anybody else been able to reproduce this? Any idea how to report a bug if Connect is closed?

I have two clusters, one on Server 2012 and one on 2012 R2 and I can reproduce on both:

OS Name    Microsoft Windows Server 2012 Datacenter
Version    6.2.9200 Build 9200

OS Name    Microsoft Windows Server 2012 R2 Datacenter
Version    6.3.9600 Build 9600

I am happy to supply further details if anyone is willing to help.

Thanks


Services do not auto start when Failover occurs

We have a two-node Windows Server 2008 R2 failover cluster in a Node and Disk Majority configuration.   We have several SQL services with the startup type set to Manual.   When we perform a failover test, these services do not start.   Is that normal?

Also, we use CommVault to back up our Quorum database.   It creates a GxClusPlugin instance in the cluster.   This instance fails and does not fail over.   Does anyone have advice on this?

Thanks

Replace 2 Nodes in Cluster

I have a 2-node SQL cluster and am looking for the best way to replace the nodes with two new servers. I was thinking of removing one SQL node and then removing that node from the Windows failover cluster MMC. Then I would unplug the crossover cable, plug it into the new server, and give the new server the same name as the one that was just removed. Then I would add it to the cluster and start with the SQL nodes. Thoughts? Any articles, etc. to follow?