Data move from ClusterA (iSCSI) to ClusterB (FC)

December 11, 2018, 9:13 am

We have an existing Server 2012 R2 cluster with iSCSI storage and need to move to a new Server 2019 cluster with FC storage.

Do I have it right that any of these would work?

1) Downtime (VMs moved via LAN from old cluster storage to new)

2) Add new 2019 hosts to the existing cluster, configure at least one 2019 host to have iSCSI, and then live migrate (LM)?

3) Add new 2019 hosts to the existing cluster, configure at least one old 2012 R2 host to have FC, and then LM?

How well does a 2019 host join a 2012 R2 cluster?

If 2 or 3, then after the data move, do I remove the 2012 R2 hosts and upgrade the cluster version?

Any other options?

Seb

↧

VMs Failing to Automatically Migrate

November 21, 2018, 8:55 am

I come in every morning to find a handful of my VMs indicating "Live Migration was canceled." This seems to happen around 12:00 - 1:00 AM, but I can't find anything configured to tell the VMs to migrate, so I'm not sure why it is happening to begin with.

The event logs are not helpful. The Cluster log shows Event ID 1155, "The pending move for the role 'server name' did not complete." The Hyper-V-High-Availability log shows Event ID 21150, "'Virtual Machine Cluster WMI' successfully taken the cluster WMI provider offline.", which was logged right before Event ID 21111, "Live migration of 'VM Instance Name' failed." It is typically the same VMs, but not always. I see the error on both nodes (2-node cluster, 2 CSVs).

The Hyper-V-VMMS log shows 1940, "The WMI provider 'VmmsWmiInstanceAndMethodProvider' has shut down.", then 20413, "The Virtual Machine Management service initiated the live migration of virtual machine 'VM Name' to destination host 'Other Node' (VMID).", for each of the VMs running on that node. Some are successful, but a few get 21014, "Virtual machine migration for 'VM Name' was not finished because the operation was canceled. (Virtual machine ID)", and finally 21024, "Virtual machine migration operation for 'VM Name' failed at migration source 'Host Name'. (Virtual machine ID)".

I can manually live migrate all VMs back and forth all day. I have plenty of resources on both nodes (RAM and CPU), and I have turned off the Hyper-V cluster balancer that automatically moves machines. We used to have SCVMM installed, but it was overkill for our small environment, so it was decommissioned. The cluster is not configured with CAU.

While I would like to resolve the failures, I would be happy just knowing what is causing the VMs to migrate in the first place, since it isn't necessary for them to do this every night. Any guidance would be greatly appreciated!

↧

Can't live migrate multiple machines from FCM, but can from powershell

February 22, 2018, 7:14 am

Hey everyone,

I started experiencing a weird issue this week. We have a 2-node cluster (Server 2016) set up with Hyper-V VDI and some pooled desktops. We cannot live migrate these VDI machines consistently. At some point during the live migration one fails, which causes the rest to stop migrating. Errors are sparse; the main one just says that the live migration failed, with no error message attached to that event (in FCM). There is an event in Event Viewer, in the Hyper-V-VMMS log, but its description can't be found. The event ID is 22040 and the error code at the end of the message is 0x800705B4, which from my research refers to a timeout.
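For reference, the timeout reading of that code can be checked by splitting the HRESULT into its bit fields: values of the form 0x8007xxxx carry a Win32 error number in the low 16 bits. The following is a minimal Python sketch of that arithmetic only; the helper name `decode_hresult` is hypothetical and not part of any Microsoft tool.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # severity bit: 1 means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility; 7 is FACILITY_WIN32
        "code": value & 0xFFFF,             # low 16 bits; a Win32 error number when facility is 7
    }


fields = decode_hresult(0x800705B4)
print(fields)
```

For 0x800705B4 this gives facility 7 and code 1460, and Win32 error 1460 is ERROR_TIMEOUT. That only confirms that something timed out; it does not say which operation did.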

There are two weird things about this problem. The first is that even though the machines fail to migrate as a group (I tested by draining the roles and it still fails), I can migrate them one at a time. If I migrate them one at a time there are no errors, ever, and every machine migrates perfectly fine. The second is that I wrote a PowerShell script to move the VMs in parallel with a foreach command, and all of the machines migrate just fine. I believe that is because the script calls one command at a time to migrate each virtual machine, but I am not sure why that would work.

We are currently in the process of rebuilding our master image to see if something has gone wrong with it, however I don't have much faith in that. I think the issue lies somewhere in the FCM, but I am also not sure.

I have already checked the simultaneous migrations setting in the Hyper-V settings. We use Kerberos, but have tried CredSSP as well. Since the machines live migrate fine one by one, I don't think that is the issue. The servers are connected with a 10GB direct attached link, which is the only network set up for live migration traffic. We also have a duplicate system in our primary location, identical servers with identical peripherals, and it doesn't have this issue. The only difference is the pools/master images. Both servers are connected to an iSCSI Nimble SAN, but so is the duplicate system, just on a different piece of hardware. Everything is identical between the two setups, from the switches they connect to down to the directly attached coax.

One other note: the two servers in each location are slightly different from each other. One server is running a V2 of the same processor with 8 cores, while the other is running V1 with 6 cores. However, the exact same situation exists at our other site and it works fine.

Thanks for anything that you all can provide. For now I can use the PowerShell script, but I need to figure out what this issue is in case it is a precursor of what's to come.

↧

Get-Volume returns all volumes within Windows Failover Cluster instead of just local

December 13, 2018, 9:12 am

Hello all.

This is my first entry in the forums, so apologies if I miss something or have this in the wrong place.

I am using the Get-Volume command in PowerShell to return all the volumes located on the server I am running it from.

However, our servers are members of Windows Failover Clusters.

On one of our clusters it does what I would expect. We get a list of all the volumes on this particular node. On the other cluster we get a list of all volumes within the cluster.

Does anyone know of any setting in the Windows failover cluster (or anywhere else) that could explain the difference in behavior?

In addition, if I try to create a new volume using New-Volume (PowerShell) in the cluster that behaves as expected, it works without issue.

If I try to create a new volume using New-Volume (PowerShell) in the cluster that shows all volumes, I get the error below:

Failover clustering could not be enabled for this storage object.
Activity ID: {<blanked>}
+ CategoryInfo : NotSpecified: (:) [New-Volume], CimException
+ FullyQualifiedErrorId : StorageWMI 46008,Microsoft.Management.Infrastructure.CimCmdlets.InvokeCimMethodCommand,New-Volume
+ PSComputerName : <blanked>

Any help on this would be greatly appreciated.

Thank you.

↧

error FailoverClustering-Manager - I don't have a cluster

December 14, 2018, 2:49 am

Hi everybody.

For a few days I have been seeing a long series of errors, as mentioned in the title.
I checked that the associated service is disabled, because I do not have a cluster, but nothing changes.
I checked whether there is any cluster configuration, and none actually exists.
I also cleaned the cache and old cluster configurations (although this machine has never been part of a cluster).

I have also noticed severe slowdowns over the past few days, which coincide with this error, but I cannot get the machine back to normal.

The error is as follows:

- System
  - Provider
    [ Name] Microsoft-Windows-FailoverClustering-Manager
    [ Guid] {11B3C6B7-E06F-4191-BBB9-7099FFF55614}
  EventID: 4657
  Version:
  Level:
  Task:
  Opcode:
  Keywords: 0x8000000000000000
  - TimeCreated
    [ SystemTime] 2018-12-14T10:29:33.170788700Z
  EventRecordID: 2448336
  - Correlation
    [ ActivityID] {5D23092D-9393-0005-2381-235D9393D401}
  - Execution
    [ ProcessID] 2792
    [ ThreadID] 7864
  Channel: Microsoft-Windows-FailoverClustering-Manager/Admin
  Computer: ENPAPNAS03.Ente_Enpap.local
  - Security
    [ UserID] S-1-5-18
- EventData
  Parameter1: Get-ClusterNode
  Parameter2: The Cluster service is not running. Make sure that the service is running on all nodes in the cluster.

Thanks for your support.

↧

Cluster startup

November 26, 2018, 7:23 pm

Suppose I have a 4-node cluster plus 1 file share witness (FSW), and at the beginning all 4 nodes are shut down.

I would like to start up only the first 2 nodes, but the cluster appears to stay down (maybe not enough votes for quorum).

When I start up the 3rd node, the cluster comes up.

Is this normal? I would have thought the FSW vote should also count toward quorum.
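The vote arithmetic behind the question can be sketched as follows. This is a simplified static-majority model in Python, assuming every node and the witness carry one vote each; it deliberately ignores dynamic quorum and dynamic witness adjustments, and it assumes the witness is actually reachable from the nodes that are starting. The function name `has_quorum` is hypothetical.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Static majority rule: more than half of all configured votes must be present."""
    return votes_online >= total_votes // 2 + 1


TOTAL = 4 + 1  # four node votes plus one file share witness vote

print(has_quorum(TOTAL, 2))      # two nodes only, witness vote not counted
print(has_quorum(TOTAL, 2 + 1))  # two nodes plus the witness vote
print(has_quorum(TOTAL, 3))      # three nodes, witness vote not counted
```

Under this model, two nodes alone hold 2 of 5 votes and fall short of the majority of 3, while two nodes plus the witness would reach it. If the cluster still stays down with two nodes, that points at the witness vote not being counted at startup rather than at the arithmetic.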

↧

for the second time in 6 weeks a 6 Node S2D cluster goes down after a simple disk failure. How is that possible ?

December 16, 2018, 7:59 am

We are using 6 Dell R740xd nodes. The cluster passed the cluster validation test.

The nodes did not reboot, but the Cluster Shared Volume went offline and only came back to life after a 22 (!) hour check...

After the CSV was back online, we found all VMs were being moved onto one host.

After examining the log files, it looked like the cluster service removed the owner node of the CSV from the cluster instead of deactivating the disk or the node containing the faulty disk.

Any suggestions as to why this is possible?

Peter

↧

windows 2019 s2d cluster failed to start event id 1809

October 4, 2018, 2:26 am

Hi, I have a lab with a Windows Server 2019 Insider cluster, which I upgraded in place to the RTM version of Server 2019. The cluster shuts down after a while and event ID 1809 is logged:

This node has been joined to a cluster that has Storage Spaces Direct enabled, which is not validated on the current build. The node will be quarantined.

Microsoft recommends deploying SDDC on WSSD [https://www.microsoft.com/en-us/cloud-platform/software-defined-datacenter] certified hardware offerings for production environments. The WSSD offerings will be pre-validated on Windows Server 2019 in the coming months. In the meantime, we are making the SDDC bits available early to Windows Server 2019 Insiders to allow for testing and evaluation in preparation for WSSD certified hardware becoming available.

Customers interested in upgrading existing WSSD environments to Windows Server 2019 should contact Microsoft for recommendations on how to proceed. Please call Microsoft support [https://support.microsoft.com/en-us/help/4051701/global-customer-service-phone-numbers].

It's kind of weird because my S2D cluster is running in VMs. Is there some registry switch to disable this stupid lock?

↧

2 nodes fail over cluster and disk witness quorum fail on fail over test

December 18, 2018, 12:33 am

Hi,

I have configured a Windows Server 2012 R2 failover cluster with 2 nodes.

I set the quorum mode to disk witness, using a disk mapped from SAN storage that is connected directly to the servers, and configured the heartbeat network over a direct server-to-server connection.

There were no warnings or failures in the cluster validation test, so I ran a failover test.

I pulled the LAN cables from the active node. The quorum disk tried to move to the standby node, but it could not come online.

This looks like a 'split brain'. Is it possible to solve this situation with a disk witness quorum?

Or is a file share witness the only solution?

thanks,

↧

When to run Cluster validation wizard on a new node

December 19, 2018, 2:29 pm

I have an existing production multi-site/subnet 2012 failover cluster and I'm going to be adding a new node per site. I'm just trying to verify when I run the Cluster Validation wizard, is it before I install the failover cluster feature on the new node or after the feature is installed?

Salvador Diaz III

↧

Cluster Shared Volumes Related

December 20, 2018, 7:33 am

Good Morning. I am in the process of setting up a 3 node Hyper-V Failover Cluster using Windows Server 2016 Datacenter Edition. I am using HPE DL380s and an HPE 2050 MSA. The MSA is configured for RAID 6 and has gobs of room (30+TB). I intend to create a 512 MB partition for use as a Quorum/Witness disk (kept in reserve in case of cluster node additions/removals) and the rest of the space for use as CSVs.

Should I create a single CSV using the remaining available space or 2-3 CSVs? Is there any advantage to multiple CSVs? Are all CSVs required to be hosted by a single node?

Thanks,
Vint

↧

computer not working

December 21, 2018, 4:31 pm

It locked up this afternoon. I turned it off for a while, but now it will not come online.

↧

add new node to existing failover cluster

December 26, 2018, 5:57 pm

Hi,

I have a problem adding a new node to an existing failover cluster. The existing failover cluster is a two-node cluster for a SQL AlwaysOn Availability Group. There are no shared volumes. I want to add a new node in a different domain and subnet to the cluster, but it failed. Are there any requirements for that?

↧

Guys can we get the RegKey back for the passing WSSD on the windows server SSD cluster please ...

November 30, 2018, 7:44 am

As per the official Microsoft position on Windows Server 2019 Datacenter:

"...When can I deploy Storage Spaces Direct in Windows Server 2019 into production?

Microsoft recommends deploying Storage Spaces Direct on hardware validated by the WSSD program. For Windows Server 2019, the first wave of WSSD offers will launch in February 2019, in about three months.

If you choose instead to build your own with components from the Windows Server 2019 catalog with the SDDC AQs, you may be able to assemble eligible parts sooner. In this case, you can absolutely deploy into production – you’ll just need to contact Microsoft Support for instructions to work around the advisory message. ..."

regards,

Alex

↧

Creating cluster (Windows 2016) for SQL 2014

December 27, 2018, 8:44 pm

Creating cluster for SQL

Hello friends.

Is it possible for me to assign the same volume to multiple resource groups?

My goal is to allocate a volume so that all instances have access to that volume at the same time.

A resource group will be created for each SQL instance, each with its own volume, but I would like the backup volume to be a single one shared by all nodes.

Otherwise I'll have to create one data volume and one backup volume for each instance.

Example:

Resource groups: SQL Instance A - Volume D: \ (Databases) - Volume Z: (Backup)
Resource groups: SQL Instance B - Volume E: \ (Databases) - Volume Z: (Backup)
Resource groups: SQL Instance C - Volume F: \ (Databases) - Volume Z: (Backup)
Resource groups: SQL Instance D - Volume G: \ (Databases) - Volume Z: (Backup)

Thank you.

↧

Windows Server 2016 Hyper-V cluster : Microsoft-Windows-Hyper-V-VMMS Event ID 20501

March 13, 2018, 7:27 pm

I found this warning on all Hyper-V cluster nodes.

The warning is generated every minute.

The description for Event ID 20501 from source Microsoft-Windows-Hyper-V-VMMS cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

If the event originated on another computer, the display information had to be saved with the event.

The following information was included with the event:

%%2147943788
0x8007056C

The locale specific resource for the desired message is not present
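The two values included with the event are the same number written in decimal and in hexadecimal, which a few lines of Python can confirm. This is only an arithmetic check on the values quoted above, not output from any Windows tool.

```python
decimal_code = 2147943788  # the value printed after "%%" in the event
hex_code = 0x8007056C      # the value printed on the following line

print(decimal_code == hex_code)  # both lines carry the same code
print(f"0x{decimal_code:08X}")   # decimal value rendered as hex
print(decimal_code & 0xFFFF)     # low 16 bits: the wrapped Win32 error number
```

The low 16 bits come out as 1388, which winerror.h defines as ERROR_INVALID_MEMBER (a member could not be added to a local group because it has the wrong account type). That may be a starting point for searching, though the event itself does not say which group or account is involved.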

↧

How can we move the Quorum Disk from Node1 to Node2 ? - Windows 2012 R2 - Hyper-V Clustering

December 3, 2018, 6:08 am

Hello,

We have created a cluster with 2 nodes and created a role for a file share. There are 3 disks in total in the cluster, one of which we have allocated as the quorum disk.

When Node1 is powered off, all 3 disks move to Node2 automatically. But I would like to know how we can move the quorum disk from Node1 to Node2 while both nodes are active.

We can move the other 2 disks from Node1 to Node2 while both nodes are powered on, but with the same option (right-click the disk -> Move -> Select Node) I am unable to move the quorum disk from Node1 to Node2.

Please advise!

Thanks & Regards,

Anoop Nair

↧

Cluster volume disconnected

December 31, 2018, 12:45 am

Hi,

We are receiving the following error on our hyper-v cluster nodes...

Cluster Shared Volume 'CSV01' ('CSV01') is no longer accessible from this cluster node because of error 'ERROR_TIMEOUT(1460)'. Each node is experiencing the same issue apart from the coordinator node. Each affected node is also experiencing intermittent flashing screens when RDP'ing into the server.

The error reported in Event Viewer was:

Cluster Shared Volume 'Volume1' ('CSVDisk') has entered a paused state because of '(c0000010)'. All I/O will temporarily be queued until a path to the volume is reestablished.

Could you please help..

↧

Generic service resource using cluster name

January 2, 2019, 2:58 am

Here's our issue:
We have a Windows 2012 MSCS failover cluster consisting of 2 nodes. The cluster has several Windows generic service resources set up, all of which are dependent on the cluster name/IP. For each of the clustered services we set it to not use the cluster name (clustered service properties -> Use network name for computer name). For some reason, one of the services is still registering its connection (a proprietary connection record in our application) in the database under the cluster name. I am looking for help or ideas on why this could be happening. Clients connect using that record in our database, and due to application limitations this causes some data corruption. We cannot recompile the application to fix it in code.

In our test environment we can reproduce the issue and mitigate it simply by checking or unchecking the box to use the cluster name (the setting noted above). For some reason, in production that check box doesn't seem to make a difference.

↧

The system detected an address conflict for IP address

January 3, 2019, 2:16 am

Hi All,

I have a 2-node Windows Server 2012 R2 failover cluster; the quorum witness is on a network share. The cluster has been up and running for some years.

I rebooted the passive node and the whole cluster went down, so I'm investigating the issue. The first error I find is a failure of the cluster IP address itself:

The system detected an address conflict for IP address 172.16.20.69 with the system having network hardware address 00-50-56-B7-4E-50. Network operations on this system may be disrupted as a result.

This error was logged on the passive node, and the MAC address with the conflicting IP is the MAC address of the public interface of the other node.

↧