Cluster Heartbeat is using the wrong network

May 23, 2015, 1:30 am

Hi there.

I have a very strange phenomenon at one of my customers and can't really understand why it happened.

This is a Windows 2008 R2 two-node cluster (up to date on patches).

We have a public network, a cluster network and a live migration network. Storage is Fibre Channel.

The live migration network is a crossover 10 Gbit connection; the rest are non-teamed 1 Gbit switch connections.

All three networks are enabled for cluster use. The cluster network was the network with the lowest metric.

Due to some network problems we decided to move the cluster traffic to the 10 Gbit connection, so I lowered the metric of the live migration network.
Today we tested the configuration. When we disabled the switch port for the "old" cluster network, we immediately ran into a split-brain situation! All the other networks were still up and running, so the nodes had three routes in total to see each other. In any case, we had only disabled a non-primary heartbeat network, so nothing should have happened.

I've dug through the cluster logs and can see entries showing that the route over the "old" cluster network vanished and that both nodes decided the other one was dead!
Cluster validation test is fine!

When I use Network Monitor to see where the heartbeat packets are actually running, I can see them on the "old" cluster network and not on the live migration network, which they should be using.

So according to all the settings I've checked, heartbeats should be running over the live migration network and all the other networks should provide redundancy. But the heartbeats are still running over the cluster network, and redundancy is not working.
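
The expectation described in this post can be written down as a small model. This is only a sketch of what the poster expected to happen, not of how the cluster service actually selects its heartbeat routes; the metric values are hypothetical, since the post gives none, and the rule "lowest metric among usable networks wins" is the poster's assumption.

```python
from dataclasses import dataclass


@dataclass
class ClusterNetwork:
    name: str
    metric: int        # hypothetical values; the post does not state them
    cluster_use: bool  # network is enabled for cluster use
    up: bool = True


def expected_heartbeat_network(networks):
    """Return the usable network with the lowest metric, or None if none is usable."""
    usable = [n for n in networks if n.cluster_use and n.up]
    return min(usable, key=lambda n: n.metric, default=None)


networks = [
    ClusterNetwork("Public", metric=3000, cluster_use=True),
    ClusterNetwork("Cluster", metric=2000, cluster_use=True),
    ClusterNetwork("LiveMigration", metric=1000, cluster_use=True),
]

before = expected_heartbeat_network(networks)
networks[1].up = False  # switch port of the "old" cluster network disabled
after = expected_heartbeat_network(networks)
print(before.name, after.name)
```

Under this model, disabling the old cluster network changes nothing, which is why the observed split-brain is surprising.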

Does anybody have an idea?

Regards
Olaf

↧

Renaming Hyper-V Cluster Node

May 29, 2015, 9:00 am

Good morning,

I recently configured a new Hyper-V cluster with 1 Hyper-V node attached to it. It is presently hosting 7 virtual machines.

I need to rename the node. Can I accomplish this without much work or will I have to re-deploy the cluster?

Thanks in advance!

↧

SQL Server 2014 Failing to install in a cluster

May 29, 2015, 8:44 am

Dear All,

When we attempt to install SQL Server 2014 on a one-node cluster, we get the following error. We know it is storage related; we just don't know what or why :P

Problem signature:
Problem Event Name: APPCRASH
Application Name: CPrepSrv.exe
Application Version: 6.3.9600.17093
Application Timestamp: 53477555
Fault Module Name: StackHash_6162
Fault Module Version: 6.3.9600.17736
Fault Module Timestamp: 550f4336
Exception Code: c0000374
Exception Offset: PCH_B8_FROM_ntdll+0x000000000009177A
OS Version: 6.3.9600.2.0.0.272.7
Locale ID: 2057
Additional Information 1: 6162
Additional Information 2: 61622186c57016e5ac2d3038f45ee3f2
Additional Information 3: 1139
Additional Information 4: 11398d6c53eb5dafdaf0ef26509b53e6
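
As a reading aid, the signature can be parsed and its exception code translated to a symbolic name. The sketch below is an illustration only: the lookup table is deliberately tiny and hand-written (0xC0000374 is the NTSTATUS value STATUS_HEAP_CORRUPTION), and the helper names are hypothetical.

```python
# Minimal hand-written NTSTATUS lookup; not a complete table.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}

# A few fields copied from the problem signature in the post.
SIGNATURE = """\
Problem signature:
Problem Event Name: APPCRASH
Application Name: CPrepSrv.exe
Fault Module Name: StackHash_6162
Exception Code: c0000374
"""


def parse_signature(text):
    """Turn 'Key: value' lines into a dict, skipping lines without a value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def exception_name(fields):
    """Translate the hexadecimal 'Exception Code' field to a symbolic name."""
    code = int(fields["Exception Code"], 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)


fields = parse_signature(SIGNATURE)
print(fields["Application Name"], exception_name(fields))
```

The code only says that heap corruption was detected in CPrepSrv.exe; it does not by itself identify the component that caused it.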

Anyone out there with some insights?

Cookies vrs Cake - the eternal struggle

↧

Live migrations fail but Quick Migrations successful

May 29, 2015, 4:51 pm

I am currently running a two-node cluster for virtual hosts that uses an SMB 3.0 share hosted on a two-node cluster with the SoFS role. I have had no issues running CAU or live migrations since the servers were first set up. About a month ago I noticed that CAU was failing because it could not drain the roles. I attempted a live migration and it failed. I went through the error log and noticed a general access denied error (see screenshot for error).

No permissions have been modified, nor have any changes been made to the domain in this time, so I am not sure why I am suddenly receiving these errors. I verified the permissions on the SMB share. I also ensured delegation is set up in ADUC: the option "Trust this computer for delegation to any service (Kerberos only)" is currently selected. I set that on both cluster names and even added it to the nodes for testing purposes.

I am currently thinking that I may need to use quick migrations to drain the host and then run Windows updates, in case I am missing something. Any ideas would be greatly appreciated. I searched the forums but did not find anything with this same error.

↧

Overlapping Cluster Disk Numbers on Windows 2012 R2 Cluster

May 11, 2015, 7:49 am

I have a Windows 2012 R2 cluster that is showing overlapping Cluster Disk numbers in the disk view. It doesn't seem to affect any cluster behavior. Just wondering why this would happen and whether it is a bug rather than expected behavior?

↧

Disable hang detection for Failover Clustering: what are the negative effects?

May 31, 2015, 10:10 am

Hi,

HangRecoveryAction on my Hyper-V 2012 R2 cluster is set to "Causes a Stop error (Bugcheck) on the cluster node".

This setting may reboot the server during production hours.

So I want to set it to "Logs an event in the system log of the Event Viewer".

What are the negative effects of changing the value of HangRecoveryAction?

B,R

Ramy

↧

Changing Quorum drive letter with One node in Cluster.

May 31, 2015, 12:22 pm

I am trying to change the drive letter of the quorum and MSDTC disks to another drive letter, as a requirement.

However, I am getting the error below.

Failed to get available drive letters because some of the nodes of the cluster are down.

We had a two-node cluster, but because of some issues one node is down.

In this setup, how can I change the drive letter of the quorum and MSDTC disks with one node in the cluster?

Thanks in advance.

↧

WIMMount (HSM) causing cluster storage to go redirected (2012r2 DC)

June 1, 2015, 1:23 pm

Looking for options to resolve this error and prevent it in the future.

Thanks for any help.

Hardware:

2-node Dell HV cluster running 2012 R2 DC

8 NICs each in multiplex mode: 4x Hyper-V, 4x hosts

Storage:

2x Synology NAS, accessed through iSCSI by the hosts and the cluster

Relevant logs:

Log Name:      Microsoft-Windows-FailoverClustering/Diagnostic
Source:        Microsoft-Windows-FailoverClustering
Date:
Event ID:      2050
Task Category: None
Level:         Warning
Keywords:
User:          SYSTEM
Computer:      l
Description:
[DCM] filter WIMMount found at unsafe altitude 180700

Log Name:      System
Source:        Microsoft-Windows-FailoverClustering
Date:
Event ID:      5125
Task Category: Cluster Shared Volume
Level:         Warning
Keywords:
User:          SYSTEM
Computer:
Description:
Cluster Shared Volume 'Volume1' ('Cluster Disk 1') has identified one or more active filter drivers on this device stack that could interfere with CSV operations. I/O access will be redirected to the storage device over the network through another Cluster node. This may result in degraded performance. Please contact the filter driver vendor to verify interoperability with Cluster Shared Volumes.

Active filter drivers found:
WIMMount (HSM)

PS C:\Windows\system32> fltmc instances
Filter                Volume Name                              Altitude        Instance Name       Frame   SprtFtrs VlStatus
-------------------- ------------------------------------- ------------ ---------------------- -----   -------- --------
CCFFilter             \Device\Mup                               261160     CCFFilter                 0     00000003
CsvFlt                \Device\HarddiskVolume50                  404800     CsvFlt Instance           0     00000003
CsvNSFlt              C:                                        404900     CsvNSFlt Instance         0     00000003
FsDepends             C:\ClusterStorage\Volume1                 407000     FsDepends                 0     00000003
FsDepends                                                       407000     FsDepends                 0     00000003
FsDepends             C:                                        407000     FsDepends                 0     00000003
FsDepends             D:                                        407000     FsDepends                 0     00000003
FsDepends             I:                                        407000     FsDepends                 0     00000003
FsDepends                                                       407000     FsDepends                 0     00000003
FsDepends             \Device\HarddiskVolume50                  407000     FsDepends                 0     00000003
FsDepends             \Device\Mup                               407000     FsDepends                 0     00000003
ResumeKeyFilter                                                 202000     ResumeKeyFilter           0     00000003
ResumeKeyFilter       \Device\HarddiskVolume50                  202000     ResumeKeyFilter           0     00000003
WIMMount                                                        180700     WIMMount                  0     00000000
WIMMount              C:                                        180700     WIMMount                  0     00000000
WIMMount              D:                                        180700     WIMMount                  0     00000000
WIMMount              I:                                        180700     WIMMount                  0     00000000
WIMMount                                                        180700     WIMMount                  0     00000000
WIMMount              \Device\HarddiskVolume50                  180700     WIMMount                  0     00000000
luafv                 C:                                        135000     luafv                     0     00000003
npsvctrig             \Device\NamedPipe                          46000     npsvctrig                 0     00000000
svhdxflt              \Device\HarddiskVolume50                  135100     svhdxflt                  0     00000003
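
To see the flagged filter in context, the `fltmc instances` listing can be grouped by volume. The following is a minimal sketch that parses a few rows copied from the output above and lists the filters attached to `\Device\HarddiskVolume50`, sorted by altitude. The function names are hypothetical, and the parser assumes the altitude is the first purely numeric column after the filter name; it makes no judgement about which altitudes the cluster considers unsafe.

```python
# Sample rows copied from the `fltmc instances` output in this post.
SAMPLE = r"""
Filter                Volume Name                              Altitude        Instance Name       Frame   SprtFtrs VlStatus
-------------------- ------------------------------------- ------------ ---------------------- -----   -------- --------
CsvFlt                \Device\HarddiskVolume50                  404800     CsvFlt Instance           0     00000003
FsDepends             \Device\HarddiskVolume50                  407000     FsDepends                 0     00000003
ResumeKeyFilter       \Device\HarddiskVolume50                  202000     ResumeKeyFilter           0     00000003
WIMMount                                                        180700     WIMMount                  0     00000000
WIMMount              C:                                        180700     WIMMount                  0     00000000
WIMMount              \Device\HarddiskVolume50                  180700     WIMMount                  0     00000000
svhdxflt              \Device\HarddiskVolume50                  135100     svhdxflt                  0     00000003
"""


def parse_fltmc_instances(text):
    """Return (filter, volume, altitude) tuples; volume is '' when the column is blank."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(tokens) < 2 or tokens[0] == "Filter" or tokens[0].startswith("-"):
            continue
        # The altitude is taken to be the first purely numeric token after the filter name.
        idx = next((i for i, t in enumerate(tokens[1:], start=1) if t.isdigit()), None)
        if idx is None:
            continue
        rows.append((tokens[0], " ".join(tokens[1:idx]), int(tokens[idx])))
    return rows


def filters_on_volume(rows, volume):
    """List (altitude, filter) pairs attached to one volume, highest altitude first."""
    return sorted(((alt, name) for name, vol, alt in rows if vol == volume), reverse=True)


rows = parse_fltmc_instances(SAMPLE)
for altitude, name in filters_on_volume(rows, r"\Device\HarddiskVolume50"):
    print(altitude, name)
```

Run against the full listing, the same grouping shows every volume on which WIMMount has an instance.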

↧

Node holds the SCSI PR on Test Disk 0 and brought the disk online, but failed in its attempt to write file data to partition table entry 1.

June 1, 2015, 3:33 pm

I keep getting this error when setting up a two-node 2012 R2 cluster.

It fails on storage validation. Any idea what is causing this? Both nodes are virtual servers.

↧

Problem with CAU: Analyze cluster updating readiness and Test-CauSetup don't work

June 2, 2015, 5:56 am

I have a two-node (TT2 and TT3) Hyper-V cluster (Microsoft Hyper-V 2012 R2).

When I launch the cluster MMC console on a remote computer and run "Analyze cluster updating readiness", I get the error: "there was an error analyzing cluster update readiness"

When I run the PowerShell command Test-CauSetup, I get an error too:

Test-CauSetup : The Microsoft/Windows/ClusterAwareUpdating BPA model is not found on "TT3". To run Test-CauSetup, you might need to reinstall the Failover Clustering Tools, or run the cmdlet on another computer.

When I run the CAU wizard, updates are downloaded but not installed on the two nodes.

How can I solve this problem?

Kind Regards Tomasz

↧

WFCM Is Not Restarting a Process After Exit

June 2, 2015, 3:34 pm

I have a two node cluster for availability purposes and find that it works quite well for the most part.

I do, however, have an issue where a process shuts itself down with a clean exit after an exception. WFCM continues to show the service as "online" even though services.msc shows that the service has stopped.

Any ideas what is going on?

↧

Network Drops for 30 seconds During Hyper-V Live Migration

May 18, 2015, 7:25 am

I have 3 physical Hyper-V hosts set up with clustered storage. I disabled VMQ because I was getting errors when trying to do live migrations. I have also run the network portion of the cluster validation tests without errors. Basically, when I do a live migration from any host to any other host, I lose network connectivity to every VM running on those hosts. During this time a SQL application that is running locks up and freezes for all users. Many have to use Task Manager to kill the application to get back in, or even reboot their machines to free it up.

I have been doing a ton of reading on network settings and configurations and have made no progress. Any help pointing me in a direction to get this solved would be appreciated. I need to be able to do live migrations on my cluster storage.

Thanks for any help.

↧

Creating a SMB Application share

June 3, 2015, 9:42 am

I have two test servers (both Server 2012 R2 with the File Server Resource Manager role) that I'm using to test Hyper-V clustering, and a FreeNAS box with a CIFS/SMB share configured. I can map a network drive from the servers and see the share contents, but when I try to create a share in File and Storage Services the drive letter isn't listed. How can I get Windows to see the shared storage?

↧

How to configure and setup cluster VM level and Host level

June 3, 2015, 6:15 am

How do I configure and set up clustering at the VM level and at the host level, and also the configuration between 2 physical hosts (servers)?

↧

Two NLB clusters on two nodes?

June 4, 2015, 1:07 am

Hi all,

I have some questions about NLB clustering.

I have 2 VMs, both running on Windows 2008 R2.

I would like to have BOTH virtual servers added to TWO NLB clusters.

Please have a look at the picture for more specific configuration info.

1st: Is it possible AT ALL to add 1 virtual server to 2 NLB clusters?

2nd: If this is supported, which configuration should be preferred (B?)

Thanks for your help

JBrew

Jelle

↧

Network Load Balancing Manager - Citrix Webinterface

June 4, 2015, 1:13 am

Hello,

I am using two Citrix Web Interface servers; to cluster them I used Microsoft Network Load Balancing Manager.

It worked for a long time, then one day it crashed.

Since the crash, interface 2 is the active host in the cluster and it works. As soon as I add interface 1, interface 2 goes offline, with no connectivity on its NIC anymore. Then, with interface 1 in the cluster, pinging the cluster name gets a response, but I cannot reach the website (IIS Citrix Web Interface) from a browser; when I open the website directly using the hostname http://interface1... it responds. After these steps I have to remove interface 1 from the cluster again, then disable and re-enable the NIC of interface 2 so that it comes online and the cluster name responds again (IP + web). I need to use two web interfaces to guarantee high availability. Do you have any tips on how to solve this problem?

No errors in the event log.

Thank you

↧

Internal netbios traffic on WSFC 2012

June 4, 2015, 6:57 am

Hello all.

Current configuration:
A two-node SQL Server 2012 availability group (AG) has been deployed on Windows Server 2012. Each node uses two network interfaces: one for public access, the second for the interconnect (heartbeat).

First node:
Eth1 10.16.0.41
Eth2 192.168.10.1

Second node:
Eth1 10.16.0.42
Eth2 192.168.10.2

The second interface, with IPs 192.168.10.1 and 192.168.10.2, is a private connection allocated for internal cluster communication.

The network administrator noticed a strange traffic pattern and suggested blocking it:
there is traffic from IP 10.16.0.41, under a cluster user, to the internal address 192.168.10.1 over UDP from port 137 to port 137, identified as the netbios-ns application; in the same way, 10.16.0.42 addresses 192.168.10.2.

Question:

Why does each node address itself?
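
The observation can be restated as a small check: in each reported flow, the source and the destination address belong to the same node. The sketch below only encodes the addresses listed in this post; the node labels `node1` and `node2` and the function name are hypothetical, and it does not explain why the NetBIOS name service generates this traffic.

```python
import ipaddress

# Addresses as listed in the post; the node labels are hypothetical.
NODE_ADDRESSES = {
    "node1": ["10.16.0.41", "192.168.10.1"],
    "node2": ["10.16.0.42", "192.168.10.2"],
}

# Reverse lookup: address -> owning node.
OWNER = {
    ipaddress.ip_address(addr): node
    for node, addrs in NODE_ADDRESSES.items()
    for addr in addrs
}


def classify(src, dst):
    """Say whether a flow stays on one node or crosses between nodes."""
    src_node = OWNER.get(ipaddress.ip_address(src))
    dst_node = OWNER.get(ipaddress.ip_address(dst))
    if src_node is None or dst_node is None:
        return "unknown host"
    return "same node" if src_node == dst_node else "node to node"


# The two UDP 137 -> 137 flows reported by the network administrator.
FLOWS = [("10.16.0.41", "192.168.10.1"), ("10.16.0.42", "192.168.10.2")]
for src, dst in FLOWS:
    print(src, "->", dst, ":", classify(src, dst))
```

Both reported flows classify as "same node", which is exactly what makes the traffic look odd.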

↧

Temporarily mix Intel and AMD hosts in Hyper-V cluster for migration purposes

June 4, 2015, 11:11 pm

Hi, we currently have a Hyper-V cluster (2012 R2) running on old AMD-based hosts. We are going to replace these with new Intel-based servers.

Is the following scenario possible (though likely not supported)?

1. Join the new (Intel-based) hosts to the existing cluster
2. Shut down all virtual machines
3. Migrate all virtual machines to the new hosts
4. Remove the old (AMD-based) hosts from the cluster

This would be the easiest option. Otherwise I need to create a new cluster and either create new storage, or connect the LUNs to the new cluster and import all VMs; in any case, much more work and more risk.

↧

MSMQ options in WSFC 2012 R2

June 4, 2015, 11:58 pm

Hi,

If you want to make MSMQ highly available, most guides I have seen are about creating the Message Queuing role. That's fine.

However, I'm in the process of sorting out the other options for MSMQ in a failover cluster, and I have done these tests:

1. Created a Message Queuing resource in an existing application group and made it dependent on the disk and network name already in there. Test ok.

2. The same as above but used a separate shared disk only for MSMQ. Test ok.

3. Created a Message Queuing role and moved that one into an existing application group. Test ok.

4. The other way around from point 3; Moved an existing application group into the Message Queuing group. Test ok.

"Test ok" in my case is that it didn't produce any errors, but I have not tested the MSMQ functionality because I'm actually not an application guy, more on the infrastructure level.

Also, one thing about creating a separate role for MSMQ is that you get "Manage Message Queuing" in the GUI; that's not the case with options 2 and 3 unless you tweak the registry or use gwmi (http://blogs.msdn.com/b/clustering/archive/2010/01/12/9946994.aspx).

Now to my question :-)

Are any or all of the other options described valid ways to set up MSMQ in a clustered environment? What is your experience out there?

↧

VM Windows deactivated

June 5, 2015, 12:38 am

Hi,

We are running a Hyper-V 2008 R2 failover cluster. The problem is that when we move virtual machines from one node to another, Windows in the VM is deactivated and needs to be activated again.

Please advise.

Regards

↧