CSV AND DPM Redirected Access event id 5125 sis driver error

September 16, 2014, 3:01 am

We have two DPM 2012 servers which we also need to use to run a Hyper-V cluster.

When we bring the cluster up everything is fine; all disk validations pass 100%.

As soon as you make the disk a CSV, you get an error that the disk is being redirected over the network.

The error states that there is a problem with the SIS filter driver (Single Instance Storage, which DPM installs to do its dedup).

Any idea how to work around this or if DPM on a cluster node is supported?

↧

Failover configuration on Cisco sf300-24 and two Windows 2008 R2

September 19, 2014, 9:26 am

I have two Windows 2008 R2 servers connected to a Cisco SF300-24 switch in a customer environment. The customer provided us with two IPs, xxx.xxx.40.156 and xxx.xxx.40.157, and made these IPs reachable from and to their network.

I want to make these servers active/standby (I implemented some websites and WCF services and installed them on both). Some of our services have to access the customer's apps via these IPs (they allow connections from xxx.xxx.40.156 and xxx.xxx.40.157).

My servers are dedicated servers, and we don't have any DNS, DHCP, or AD servers in our environment.

How can I do it?

↧

Windows server 2012 Datacenter Hyper-V Cluster + mixing HP Proliant DL380 G7 and G8

September 10, 2014, 6:58 am

I've heard that with Windows Server 2012, the requirement for similar hardware across cluster nodes is not as strict as it was before. I have a 3-node Windows 2012 Datacenter Hyper-V cluster; all 3 servers are HP DL380 G7. I'm thinking of adding one more node (an HP DL380 G8) to the cluster, but I'm not sure how well it will work with the other 3 servers. Has anyone done something similar to what I want to do? Thanks.

↧

Windows server 2012 Datacenter Hyper-V Cluster -- Failed to validate Operating System Installation Option?

September 19, 2014, 8:37 am

Hi, I have a 4-node Windows Server 2012 Hyper-V cluster. When I run a cluster validation report, everything else is fine, but it fails at the "Validate Operating System Installation Option" step. I did some research but couldn't find a solution. Does anyone know how to pass this test? Thanks.

Here's the error I get when I run the test:

An error occurred while executing the test.
The operation has failed. An error occurred while getting the operating system installation option for node "server1"

↧

server 2012 r2 cluster crash

September 20, 2014, 12:20 pm

Hi,

I have configured a Windows 2012 R2 Datacenter cluster on IBM PureFlex with IBM V7000 SAN disks.

My cluster has crashed and the cluster disk has gone offline. The cluster disk shows as reserved on both nodes, but when I try to bring it online I get 0x8007174b, "the disk is not connected to the node". Please check; the server is running in production.

I am also seeing event IDs 1146 and 1230 on the Windows server.

Please help.

Ravi

↧

Hyper-V Cluster in VMM

September 21, 2014, 1:31 am

I am trying to build a Hyper-V cluster with VMM 2012 R2 but need some advice, as it is not working how I want it to.

I have 2 Hyper-V servers, both with their own local storage and 1 iSCSI disk shared between them. I am trying to cluster the servers so that the shared iSCSI disk becomes a shared volume while maintaining the ability to use the local storage as well - some VMs will run from local storage while others will run from the CSV.

The issue I'm having is that when I cluster the 2 servers the iSCSI disk does not show up in VMM as a shared volume. In Windows Explorer the disk has the cluster icon but in VMM there is nothing. In the cluster properties I can add a shared volume... but it asks for a logical node which I cannot create because I have no storage pools (server manager says no groups of disks are available to pool).

I also noticed when I clustered the servers my 2 file shares to their local storage disappeared from VMM which isn't what I want.

Can someone please advise, or link to, a way to achieve my desired configuration?

Cheers,

MrGoodBytes

Note: Posts are provided “AS IS” without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.

↧

CAS Server not Migrating

September 7, 2014, 9:51 pm

Hope you are all doing well. We have 2 physical servers, HV1 and HV2; both are part of a VM cluster, and each server hosts a CAS and an MBX HA VM, i.e. HV1 hosts CAS1, CAS2, and HV2 hosts MBX1 and MBX2. A few days ago the CAS1 server crashed. We created a new VM, CAS3, from Failover Cluster Manager console >> Services and applications. The new VM was created in the CSV storage with the same settings (processors, RAM, NICs) as the crashed CAS1 server and recovered with setup /m:RecoverServer. All went well: the server was recovered successfully, added back to the CAS WNLB, and is working fine.

There is just one problem: the CAS3 VM cannot be live migrated, quick migrated, or even moved to the other host server. When we live migrate it, the transfer completes to 100%, but the VM comes back to the original host with the following error in Cluster Events.

Event ID: 1205

"The Cluster service failed to bring clustered service or application 'CAS3' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application."

Please suggest how I can dig into this error. Where should I start troubleshooting, and which resources might be missing? In Failover Cluster Manager all the resources are green.

Virgo

↧

SQL Server 2014 and CAU

September 22, 2014, 7:34 am

Hi,

I know that CAU with SQL Server 2012 Availability Groups was unsupported, but has this changed for SQL 2014?

We tried it out and the failover to the second node worked fine, but it failed to move back to the primary after the patching was completed.

Is it now supported and we're doing it wrong or still not supported? If not supported, what is the supported method of patching the servers in the cluster?

Thanks

Gary

↧

2012 R2 hangs on 'Forming Cluster' or error 1460 from cluster.exe

September 22, 2014, 5:58 am

I am trying to build a 2-node cluster, but it hangs on 'Forming Cluster' in the wizard. If I try via the cluster.exe command, it gets to '52% Forming Cluster' and then fails with error 1460, which I believe is a timeout. The two nodes are VMs on Hyper-V 2012 R2. I have about 4 other 2-node clusters working successfully. I also created this one successfully once, but we then had to implement IPSec for communications to other subnets (both nodes are in the same subnet, however). Once IPSec was enabled, the cluster failed. Since then the VMs have been completely rebuilt, but they are still unable to form a cluster. Possibly an AD issue? Permissions seem fine, and the cluster object gets created fine. Here is the end of the cluster log. Thanks for any ideas!

00000128.00000370::2014/09/22-12:20:40.657 INFO </ACL>
00000128.00000370::2014/09/22-12:20:40.657 INFO [DCM] Filter.SetMDSSecurityDescriptor(Sequence 1, Length=332)
00000128.00000370::2014/09/22-12:20:40.657 INFO [DCM] Launching CsvFs Listener
00000128.00000370::2014/09/22-12:20:40.657 INFO [DCM] Launching CsvFlt Listener
00000128.00000960::2014/09/22-12:20:40.657 INFO [DCM] CsvFs Listener: ping 10808
00000128.0000095c::2014/09/22-12:20:40.657 INFO [DCM] Opened CsvFlt event port: handle HDL( 770 )
00000128.00000370::2014/09/22-12:20:40.657 INFO [DCM] Launching Nflt Listener
00000128.00000938::2014/09/22-12:20:40.657 INFO [DCM] Opened NFlt event port: handle HDL( 778 )
00000128.00000370::2014/09/22-12:20:40.657 INFO [DCM] Filter.SetSecurityInfo (Sequence 2, NodeId=1, GlobalSequenceNumber=1, KeyBlobSize=0)
00000128.00000370::2014/09/22-12:20:40.657 INFO [DCM] SetSecurityInfo message sent
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] Generating key
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] Successfully initialized key
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] ExportState - Initializing blob from local registry
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] On first run
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] Writing last update time to local registry 130558620406570472
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] Reading account parameters from local registry
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] Creating account if needed
00000128.0000099c::2014/09/22-12:20:40.657 INFO [CLI] Configuring local account
00000204.00000308::2014/09/22-12:20:40.703 INFO [CAM] CAMTranslateNameToSID - Looking up local name
00000204.00000308::2014/09/22-12:20:40.719 INFO [CAM] CAMTranslateNameToSID - Finished looking up local name
00000128.0000099c::2014/09/22-12:20:40.719 INFO [CLI] Account Created
00000128.0000099c::2014/09/22-12:20:40.736 INFO [CLI] Users group set
00000128.0000099c::2014/09/22-12:20:40.736 INFO [CLI] Flags set, account configured
00000128.0000099c::2014/09/22-12:20:40.736 INFO [CLI] Initializing security
00000128.0000099c::2014/09/22-12:20:40.736 INFO [CLI] Notifying credentials to CAM, creation flags 0, control flags 7
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] CamApCallPackage
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] CallInfo: Proc 296 Thread 2460 Count 0 Att 512
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] ClientInfo: Logon 999 Proc 296 Thread 2460 TCB 1 Impersonating 1 Restrict 0 Flags 0
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] SetCNOCred 296 14 16 30 0 7
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] Setting CurrentUser CLIUSR, Dom FS01-VM (Proc 296)
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] New Process, old 0
00000204.00000308::2014/09/22-12:20:40.736 INFO [CAM] Creating new token when CNL credentials are set
00000204.00000308::2014/09/22-12:20:40.761 INFO [CAM] LsaLogon: c000015b
00000204.00000308::2014/09/22-12:20:40.761 ERR [CAM] Error in creating first token: -1073741477
00000204.00000308::2014/09/22-12:20:40.761 INFO [CAM] Obtaining current CNL SID
00000204.00000308::2014/09/22-12:20:40.761 INFO [CAM] CAMTranslateNameToSID - Looking up local name
00000204.00000308::2014/09/22-12:20:40.761 INFO [CAM] CAMTranslateNameToSID - Finished looking up local name
00000128.0000099c::2014/09/22-12:20:40.761 INFO [CLI] LsaCallAuthenticationPackage: -1073741477, 0 size: 0, buffer: HDL( 0 )
00000128.0000099c::2014/09/22-12:20:40.854 INFO [CLI] Credentials Failed to notify CAM
00000128.0000099c::2014/09/22-12:20:40.854 INFO [CLI] Initializing token
00000204.00000308::2014/09/22-12:20:40.854 INFO [CAM] CamApCallPackage
00000204.00000308::2014/09/22-12:20:40.854 INFO [CAM] CallInfo: Proc 296 Thread 2460 Count 0 Att 512
00000204.00000308::2014/09/22-12:20:40.854 INFO [CAM] ClientInfo: Logon 999 Proc 296 Thread 2460 TCB 1 Impersonating 1 Restrict 0 Flags 0
00000204.00000308::2014/09/22-12:20:40.854 INFO [CAM] GetCNO forceNew=0
00000204.00000308::2014/09/22-12:20:40.854 INFO [CAM] GetCNOToken: LUID 0:0, token: e32bea00, DuplicateHandle: c0000008
00000128.0000099c::2014/09/22-12:20:40.854 INFO [CLI] LsaCallAuthenticationPackage: -1073741816, 0 size: 0, buffer: HDL( 0 )
00000128.0000099c::2014/09/22-12:20:40.873 ERR mscs::QuorumAgent::FormLeaderWorker::operator (): (c0000008)' because of 'status'
00000be8.00000ae0::2014/09/22-12:23:37.527 DBG Cluster node cleanup thread started.
00000be8.00000ae0::2014/09/22-12:23:37.527 DBG Starting cluster node cleanup...
00000be8.00000ae0::2014/09/22-12:23:37.527 DBG Disabling the cluster service...
00000be8.00000ae0::2014/09/22-12:23:37.527 DBG Stopping the cluster service...
00000128.000008c8::2014/09/22-12:23:37.527 INFO [CS] Service Stopping...
00000128.000008c8::2014/09/22-12:23:37.527 INFO [CORE] Node quorum state is 'Not yet formed or joined a cluster'. Form/join status with other nodes is as follows:
00000128.000008c8::2014/09/22-12:23:37.527 INFO [DCM] UnregisterSwProvider(): CSV providers are not registered
00000128.000008c8::2014/09/22-12:23:37.527 WARN [QUORUM] Node 1: weight adjustment not performed, as all remanining voters have weight zero
00000128.000008c8::2014/09/22-12:23:37.527 INFO [RGP] node 1: MergeAndRestart +() -(1)
00000128.000008c8::2014/09/22-12:23:37.542 INFO [RGP] sending to 64 nodes 1: 001(1) => 101() +() -(1) [()]
00000128.00000950::2014/09/22-12:23:37.542 INFO [CORE] Node 1: Proposed View is <ViewChanged joiners=() downers=(1) newView=101() oldView=001(1) joiner=false form=false/>
00000128.000008c8::2014/09/22-12:23:37.652 INFO [DM]: Shutting down, so unloading the cluster database.
00000128.000008c8::2014/09/22-12:23:37.652 INFO [DM] Shutting down, so unloading the cluster database (waitForLock: true).
00000128.000008c8::2014/09/22-12:23:37.652 INFO [CS] Service Stopped...
00000128.000008c8::2014/09/22-12:23:37.652 INFO [CS] About to exit service...
00000be8.00000ae0::2014/09/22-12:23:39.555 DBG Releasing clustered storages...
00000be8.00000ae0::2014/09/22-12:23:39.556 DBG Getting clustered disks...
00000be8.00000ae0::2014/09/22-12:23:39.556 DBG Waiting for clusdsk to finish its cleanup...
00000be8.00000ae0::2014/09/22-12:23:39.556 DBG Clearing the clusdisk database...
00000be8.00000ae0::2014/09/22-12:23:39.556 DBG Waiting for clusdsk to finish its cleanup...
00000be8.00000ae0::2014/09/22-12:23:39.556 DBG Relinquishing clustered disks...
00000be8.00000ae0::2014/09/22-12:23:39.556 DBG Opening disk handle by index...
00000be8.00000ae0::2014/09/22-12:23:39.603 DBG Getting disk ID from layout...
00000be8.00000ae0::2014/09/22-12:23:39.603 DBG Reset CSV state ...
00000be8.00000ae0::2014/09/22-12:23:39.603 DBG Relinquish disk if clustered...
00000be8.00000ae0::2014/09/22-12:23:39.628 DBG Opening disk handle by index...
00000be8.00000ae0::2014/09/22-12:23:39.676 DBG Getting disk ID from layout...
00000be8.00000ae0::2014/09/22-12:23:39.676 DBG Reset CSV state ...
00000be8.00000ae0::2014/09/22-12:23:39.676 DBG Relinquish disk if clustered...
00000be8.00000ae0::2014/09/22-12:23:39.693 DBG Opening disk handle by index...
00000be8.00000ae0::2014/09/22-12:23:39.753 DBG Getting disk ID from layout...
00000be8.00000ae0::2014/09/22-12:23:39.768 DBG Reset CSV state ...
00000be8.00000ae0::2014/09/22-12:23:39.768 DBG Relinquish disk if clustered...
00000be8.00000ae0::2014/09/22-12:23:39.768 DBG Opening disk handle by index...
00000be8.00000ae0::2014/09/22-12:23:39.784 DBG Resetting cluster registry entries...
00000be8.00000ae0::2014/09/22-12:23:39.784 DBG Resetting NLBSFlags value ...
00000204.00000804::2014/09/22-12:23:39.800 INFO [CAM] In NotificationHandlerThread
00000204.00000804::2014/09/22-12:23:39.800 INFO [CAM] NotificationHandlerThread - Setting primary account refresh
00000be8.00000ae0::2014/09/22-12:23:39.852 DBG Unloading the cluster Windows registry hive...
00000be8.00000ae0::2014/09/22-12:23:39.852 DBG Getting the cluster Windows registry hive file path...
00000be8.00000ae0::2014/09/22-12:23:39.852 DBG Getting the cluster Windows registry hive file path...
00000be8.00000ae0::2014/09/22-12:23:39.852 DBG Getting the cluster Windows registry hive file path...
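
One way to read the failure in the log above: the negative numbers printed by [CAM] and [CLI] are signed decimal renderings of 32-bit NTSTATUS values, and they match the hex codes `c000015b` and `c0000008` printed on the neighbouring lines. The sketch below is only an illustration in Python (not a cluster tool); it converts the two values and looks them up in a tiny table taken from the public NTSTATUS list. STATUS_LOGON_TYPE_NOT_GRANTED would be consistent with a user-rights policy denying the local CLIUSR account the logon type it needs, but that is a guess from the code, not a confirmed diagnosis.

```python
# Convert the signed 32-bit status values printed in cluster.log to the
# unsigned hex form used in NTSTATUS documentation.

def to_ntstatus_hex(signed_value: int) -> str:
    """Return a signed 32-bit integer as an 8-digit hex NTSTATUS string."""
    return f"0x{signed_value & 0xFFFFFFFF:08X}"

# Only the two codes that appear in the log above are listed here.
KNOWN_STATUS = {
    "0xC000015B": "STATUS_LOGON_TYPE_NOT_GRANTED",
    "0xC0000008": "STATUS_INVALID_HANDLE",
}

def describe(signed_value: int) -> str:
    """Format a log value together with its hex code and symbolic name."""
    code = to_ntstatus_hex(signed_value)
    return f"{signed_value} = {code} ({KNOWN_STATUS.get(code, 'unknown')})"

if __name__ == "__main__":
    for value in (-1073741477, -1073741816):
        print(describe(value))
```

The first failure (the logon error) happens before the invalid-handle error, so the second is probably a follow-on effect of the token never being created.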

↧

Cluster network name resource 'Cluster Name' cannot be brought online. Unable to update password for computer account.

September 22, 2014, 12:54 pm

Hi,

I've got a functioning Windows 2008 R2 cluster. Node 2 is throwing this error every 15 minutes. The cluster validates OK. I've compared the permissions on the cluster name object and they look like those on all our other clusters. I'm not sure why the passive node is attempting to perform these operations. During installations, we make the cluster computer account a domain admin so it can create the service CNO.

------------------------------------------------------

Cluster network name resource 'Cluster Name' cannot be brought online. The computer object associated with the resource could not be updated in domain 'domain.root' for the following reason:

Unable to update password for computer account.

The text for the associated error code is: Access is denied.

The cluster identity 'CLUSTER-1$' may lack permissions required to update the object. Please work with your domain administrator to ensure that the cluster identity can update computer objects in the domain.

------------------------------------------------------

Then on the primary node every night at 10:35 we get this:

-------------------------------------------------------

Cluster network name resource '%1' failed to register dynamic updates for name '%2' over adapter '%4'. The DNS server may not be configured to accept dynamic updates. The error code was '%3'. Please contact your DNS server administrator to verify that the DNS server is available and configured for dynamic updates.

Alternatively, you can disable dynamic DNS updates by unchecking the 'Register this connection's addresses in DNS' setting in the advanced TCP/IP settings for adapter '%4' under the DNS tab.

Any suggestions on getting rid of these?

Thanks,

Sam

↧

Windows Multi Site/Geo Cluster

September 9, 2014, 12:10 am

Hi,

I am looking for a multi-site cluster solution for a file server and a SQL database. Could you please help with any case study or solution approach for building a multi-site cluster using Windows 2012?

What is recommended for replication?

Which applications/databases are compatible with multi-site clustering?

Are there any implementation documents?

Please suggest.

Regards:Mahesh

↧

Hyper-V Cluster went wacky, ERROR_TIMEOUT (1460), Remote Administration, Remote Registry

September 23, 2014, 3:22 pm

Each Hyper-V host is a Windows 2012 Server

Each host is Connected to a Dell Powervault MD3260 via Dual SAS Cables.

We started getting Errors similar to the Following:

Log Name:      System
Source:        Microsoft-Windows-FailoverClustering
Date:          9/22/2014 12:01:24 PM
Event ID:      5142
Task Category: Cluster Shared Volume
Level:         Error
Keywords:      
User:          SYSTEM
Computer:      SVR-HBG-VM4.HAYDON-MILL.COM
Description:
Cluster Shared Volume 'VSVR_File_Data' ('VSVR_File_Data') is no longer accessible from this cluster node because
of error 'ERROR_TIMEOUT(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

The CSV resources that were giving the timeout errors were all managed on one specific host; other CSVs on the Powervault were all working fine from the host reporting the timeout error.

Each of the hosts then started to have timeout issues with other CSVs. Dell Tech Support interrogated the Powervault and found no issues with the connectivity, and no events were present on the Powervault indicating an issue.

Each of the hosts would then hang in Explorer; the taskbar would show no icons, and the pointer would turn to an hourglass when moved over it.

Failover Cluster Manager would sometimes work, then stop working. It would not update the status of the VMs; they would just stay at "Loading...". Hyper-V Manager had similar issues.

We ended up shutting down all of the VMs and the cluster and then bringing it back online. In the process we had to tweak some things: the DNS settings on the hosts were pointing to a VM that was offline, and that was fixed. We could not reconnect to the cluster after we turned on the hosts; we were getting "RPC server unavailable".

I ended up turning off the firewalls and resetting the NICs, and things started to get better.

Eight minutes prior to the issues, I did make a few GPO changes:

Windows Firewall: Allow inbound Remote Desktop Exception		Domain Profile
was 10.0.0.0/8
Changed to 10.0.0.0/8,10.1.0.0/16

Windows Firewall: Allow inbound Remote Desktop Exception		Standard Profile
10.1.0.0/16, localsubnet

Windows Firewall: Allow inbound Remote administration Exception		Domain Profile
was localsubnet,10.0.0.0/8
Changed to localsubnet,10.0.0.0/8,10.1.0.0/16

Windows Firewall: Allow inbound Remote administration Exception		Standard Profile
10.1.0.0/16, localsubnet

Remote Registry Service
   Set to Manual Start
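
One part of the change list above can be checked mechanically: whether adding 10.1.0.0/16 to a scope that already contains 10.0.0.0/8 changes which addresses match. The following is a minimal Python sketch using the standard `ipaddress` module; the sample address 10.1.2.3 is hypothetical, and the `localsubnet` keyword is left out because its meaning depends on each host's own NIC configuration.

```python
import ipaddress

# Scopes copied from the Domain Profile entries in the GPO change list.
old_scope = [ipaddress.ip_network("10.0.0.0/8")]
added = ipaddress.ip_network("10.1.0.0/16")

def already_covered(new_net, existing):
    """True if new_net lies entirely inside one of the existing networks."""
    return any(new_net.subnet_of(net) for net in existing)

def is_allowed(address, scopes):
    """True if the address falls inside at least one scope."""
    return any(ipaddress.ip_address(address) in net for net in scopes)

if __name__ == "__main__":
    print("10.1.0.0/16 already inside old scope:", already_covered(added, old_scope))
    # 10.1.2.3 is a made-up host address used only for illustration.
    print("10.1.2.3 allowed before the change:", is_allowed("10.1.2.3", old_scope))
```

If this reasoning holds, the Domain Profile scope edits did not change the set of matching addresses, which would make the Standard Profile entries and the Remote Registry startup change the more interesting items to review.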

So my question is: could these changes have affected the Hyper-V hosts in the cluster?

Thanks!

Scott<-

↧

Loss of connection with storage

September 22, 2014, 12:17 pm

Hi,

I have 3 IBM server hosts (Server 2012) and 2 DS3500 SANs connected by Fibre Channel through a switch. The VMs and SANs are clustered. The MPIO setting is Active/Optimized and Standby for the controllers (this seems to be required for the DS3500). The last two times the hosts rebooted (during Cluster-Aware Updating), the Active/Optimized and Standby settings got reversed for some of the CSVs and the connection was lost. That caused some of the VMs to fail (no failover). I had to go into Disk Management on all the hosts and reset the Active/Optimized and Standby paths, then restart the VMs. Does anyone have any idea why the MPIO settings are not sticking? Thanks.

↧

Cluster VMs sometime fail while doing an export-vm

September 17, 2014, 11:59 pm

I'm using a powershell script to export some Clustered (2012 R2 Hyper-V) VMs through task scheduler.

Every now and then a VM is restarted by the cluster during the Export-VM. The errors found in the event viewer are listed at the end of this message. I cannot see a clear cause for the failure. Is there any way to debug this problem more deeply?

I would also like to know if there is some switch I could set on the cluster-resource, while doing the export-vm, to prevent cluster from trying to restart the VM, even if it is not responding for a while during the export-vm.

The powershell script used:

# Export each VM to the file share; log and e-mail on failure.
$VMS = Get-VM -Name VM1,VM2,VM3 -ErrorAction SilentlyContinue
foreach ($VM in $VMS.VMName) {
    # Remove the previous export of this VM before exporting again.
    Remove-Item "\\fileserver\HyperVexport\$VM" -Force -Recurse
    Export-VM -Name $VM -Path \\fileserver\HyperVexport
    # $? is $false when the last command (Export-VM) reported an error.
    if (-not $?) {
        $date = Get-Date -Format s
        "$date $VM Export failed" | Out-File -FilePath c:\hyper-v\scripts\ExportVMs.log -Append
        Send-MailMessage -From "xxx@xx.xx" -To "xxx@xx.xx" -Subject "Export of $VM in $env:COMPUTERNAME failed" -SmtpServer mailserver
    }
} # close foreach

Event logs from the Hyper-V host where the VM is running at the time of the failure:

Time	Log	Event-ID	Description
22:56:07	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state Online to state ProcessingFailure.
22:56:07	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state ProcessingFailure to state WaitingToTerminate. Cluster resource 'Virtual Machine VM1' is waiting on the following resources: .
22:56:07	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state WaitingToTerminate to state Terminating.
22:56:07	Windows Logs/System	1069	Cluster resource 'Virtual Machine VM1' of type 'Virtual Machine' in clustered role 'VM1' failed. Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.
22:57:07	Applications and Services Logs/Microsoft/Windows/Hyper-V-High-Availability/Admin	21128	Virtual Machine VM1' failed to shutdown the virtual machine during the resource termination. The virtual machine will be forcefully stopped.
22:57:07	Applications and Services Logs/Microsoft/Windows/Hyper-V-High-Availability/Admin	21119	Virtual Machine VM1' succesfully started the virtual machine during the resource termination. The virtual machine.
22:57:13	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state Terminating to state DelayRestartingResource.
22:57:13	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state DelayRestartingResource to state OnlineCallIssued.
22:57:13	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state OnlineCallIssued to state OnlinePending.
22:57:13	Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin	14070	Virtual machine 'VM1' (ID=9510686F-BE3C-4CAA-99A5-EB756ED8DED1) has quit unexpectedly.
22:57:13	Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin	15190	VM1' failed to take a checkpoint. (Virtual machine ID 9510686F-BE3C-4CAA-99A5-EB756ED8DED1)
22:57:13	Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin	15140	VM1' failed to turn off. (Virtual machine ID 9510686F-BE3C-4CAA-99A5-EB756ED8DED1)
22:57:13	Applications and Services Logs/Microsoft/Windows/Hyper-V-VMMS/Admin	18350	Export failed for virtual machine 'VM1' (9510686F-BE3C-4CAA-99A5-EB756ED8DED1) with error 'The process terminated unexpectedly.' (0x8007042B).
22:57:17	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1637	Cluster resource 'Virtual Machine VM1' in clustered role 'VM1' has transitioned from state OnlinePending to state Online.
22:57:17	Applications and Services Logs/Microsoft/Windows/FailoverClustering/Operational	1201	The Cluster service successfully brought the clustered role 'VM1' online.

In the VM1 event viewer I can only see "The previous system shutdown at ... was unexpected", so it was forcefully shut down, as can be seen from the logs above.
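
As a small aid for reading event 18350 above: the value 0x8007042B is an HRESULT, and its fields can be split out by bit position. The Python sketch below is illustrative only; it assumes the standard HRESULT layout (bit 31 severity, bits 16-26 facility, bits 0-15 code) and shows that the code is Win32 error 1067, which carries the same text as the event ("The process terminated unexpectedly").

```python
FACILITY_WIN32 = 7  # facility value used when an HRESULT wraps a Win32 error

def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility and code fields."""
    hresult &= 0xFFFFFFFF
    return {
        "failure": bool(hresult >> 31),       # bit 31: 1 means failure
        "facility": (hresult >> 16) & 0x7FF,  # bits 16-26
        "code": hresult & 0xFFFF,             # bits 0-15
    }

if __name__ == "__main__":
    fields = decode_hresult(0x8007042B)
    print(fields)
    if fields["facility"] == FACILITY_WIN32:
        print("Win32 error code:", fields["code"])
```

Read together with the timestamps, this suggests the export error is likely a consequence of the VM being terminated by the cluster at 22:57, rather than the original cause; the first event to explain is the transition to ProcessingFailure at 22:56:07.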

↧

You cannot destroy a cluster that contains services and applications

September 24, 2014, 8:22 pm

I take this to mean that I must delete everything on the Cluster Storage before I destroy the cluster. Is there anything else involved in this?

↧

Resource Hosting Subsystem Deadlocks - File Share Witness

September 24, 2014, 3:27 pm

Recently one of the SQL 2012 AlwaysOn clusters I manage that runs Windows Server 2008 R2 started experiencing problems with RHS Deadlocks on the File Share Witness resource for the cluster. When this happens the cluster triggers a failover to the other node. The cluster is running in VMware. Each node has 2 vCPUs and 10 GB of memory.

The deadlocks only seem to occur when CPU utilization is high on the active node. Typically if a large SQL restore is running, the deadlock will be triggered. There are other clusters that rely on the same File Share Witness (different shares) and they have not experienced any deadlocks. I doubt that this deadlock is related to a communication issue.

I have been searching online and cannot find a good way to troubleshoot this specific issue when dealing with a File Share Witness. Is it possible that starved CPU could be a trigger for an RHS deadlock? Has anyone got any tips or advice for digging further into this?

Here is an excerpt from the cluster.log. I can provide additional logs if they would be beneficial.

000008e8.00000184::2014/09/19-15:20:22.000 ERR   [RHS] RhsCall::DeadlockMonitor: Call ISALIVE timed out for resource 'File Share Witness'.
000008e8.00000184::2014/09/19-15:20:22.000 INFO [RHS] Enabling RHS termination watchdog with timeout 1200000 and recovery action 3.
000008e8.00000184::2014/09/19-15:20:22.000 ERR   [RHS] Resource File Share Witness handling deadlock. Cleaning current operation and terminating RHS process.
000008e8.00000184::2014/09/19-15:20:22.000 ERR   [RHS] About to send WER report.
0000074c.00000ccc::2014/09/19-15:20:22.000 WARN [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'File Share Witness', gen(0) result 4.
0000074c.00000ccc::2014/09/19-15:20:22.000 INFO [RCM] rcm::RcmResource::HandleMonitorReply: Resource 'File Share Witness' consecutive failure count 1.
000008e8.00000184::2014/09/19-15:20:22.100 ERR   [RHS] WER report is submitted. Result : WerReportQueued.
0000074c.00000ccc::2014/09/19-15:21:10.272 ERR   [RCM] rcm::RcmMonitor::RecoverProcess: Recovering monitor process 2280 / 0x8e8
0000074c.00000ccc::2014/09/19-15:21:10.274 INFO [RCM] Created monitor process 2560 / 0xa00
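
When digging through a longer cluster.log, it can help to parse the lines and measure the gaps between events. The following is a rough Python sketch, assuming every line follows the `pid.tid::timestamp LEVEL message` shape seen in the excerpt above; the regular expression and function names are my own, not part of any cluster tooling. It uses two lines copied from the excerpt.

```python
import re
from datetime import datetime

# Assumed line shape: 8 hex digits, '.', 8 hex digits, '::', timestamp, level, message.
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)

def parse_line(line):
    """Return a dict for one cluster.log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y/%m/%d-%H:%M:%S.%f")
    return entry

SAMPLE = [
    "000008e8.00000184::2014/09/19-15:20:22.000 ERR   [RHS] RhsCall::DeadlockMonitor: "
    "Call ISALIVE timed out for resource 'File Share Witness'.",
    "0000074c.00000ccc::2014/09/19-15:21:10.272 ERR   [RCM] rcm::RcmMonitor::RecoverProcess: "
    "Recovering monitor process 2280 / 0x8e8",
]

if __name__ == "__main__":
    deadlock, recovery = (parse_line(line) for line in SAMPLE)
    gap = (recovery["ts"] - deadlock["ts"]).total_seconds()
    print("RHS process id (decimal):", int(deadlock["pid"], 16))
    print("Seconds from deadlock to recovery:", gap)
```

The process id in the first field is hexadecimal, so `000008e8` is the same process that the RCM line reports as "2280 / 0x8e8". Filtering a full log on `level == "ERR"` and sorting by timestamp is a simple way to line it up against CPU utilization data for the same period.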

↧

NLB Problem Broken Links in share Point site

September 25, 2014, 1:16 am

Hi guys, I have already set up NLB (Network Load Balancing) for a SharePoint 2013 farm. I have 2 FE servers in an NLB cluster. The problem is that my site does not load correctly the first time; it needs more than one refresh to load completely. I used Fiddler2 to track broken links and found many of them, but I don't know how to create a role for each broken link. I also adjusted Alternate Access Mapping and put the cluster IP in the Default zone. Can anyone please help solve this issue? Thanks a lot. Note: I work on Windows Server 2012, and the client machine runs Windows 8.1 Pro.

↧

Virtual servers are rebooting automatically

September 24, 2014, 5:06 am

Hello All,

I am having an issue with Virtual Servers.

I have a 5-node cluster. Last week we had some problems due to a storage firmware upgrade. After that we contacted a storage expert and got it rectified. Now one of the cluster nodes has an issue: once I move VMs to it (so that this node is the owner of the VMs), the virtual servers reboot automatically. This node is now idle. In addition, we checked all updates and ran the failover test as well; everything shows green.

Does anyone have a solution for this? Please help me.

Thank you.

Karan

↧

Node availability

September 25, 2014, 12:10 pm

Example:

I want to configure a Windows 2012 R2 cluster with 6 nodes (Node Majority). How many nodes can I lose while the cluster continues to operate?

If I configure the same cluster with a File Share Witness (FSW), how many nodes can I lose while the cluster continues to operate?

Do you have a link to explain that?

Thanks
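
The basic majority arithmetic behind the question above can be sketched as follows. This is a simplified Python illustration that assumes a static quorum (every node has one vote, the witness adds one, and the failures happen at the same time). Windows Server 2012 R2 also has dynamic quorum, which adjusts votes as nodes leave one after another, so a cluster may survive more sequential failures than these numbers suggest.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the configured votes."""
    return total_votes // 2 + 1

def tolerated_failures(nodes: int, witness: bool = False) -> int:
    """Votes that can be lost at once while a majority remains (static quorum)."""
    total = nodes + (1 if witness else 0)
    return total - votes_needed(total)

if __name__ == "__main__":
    print("6 nodes, Node Majority:", tolerated_failures(6))
    print("6 nodes + File Share Witness:", tolerated_failures(6, witness=True))
```

Under these assumptions, 6 voting nodes need 4 votes and tolerate the simultaneous loss of 2 nodes; adding a witness gives 7 votes, still needs 4, and tolerates the loss of 3 nodes as long as the witness stays reachable.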

↧

Unable to replicate for some HA virtual machines yet others work fine

September 25, 2014, 8:51 pm

I have a unique issue which I am trying to resolve. I have 2 clusters with 5 nodes each; each node has a bunch of cluster shared volumes, and almost all the VMs are set to HA.

Most machines replicate to the secondary cluster perfectly, and any new HA VMs I create also replicate to the remote cluster perfectly. A few machines have issues with replication. I have checked and ensured that there is no outdated copy of the replica and no VM copy with broken replication on the remote cluster.

Yet when I try to enable replication for these machines I get the following error

Regards,

Medi

↧