Channel: High Availability (Clustering) forum
Viewing all 4519 articles

Failover Cluster Manager bug on Server 2019 after .NET 4.8 installed - unable to type more than two characters into the IP fields


We ran into a nasty bug on Windows Server 2019 and I can't find any KB articles on it. It's really easy to replicate. 

1. Install Windows Server 2019 Standard with Desktop Experience from an ISO. 

2. Install Failover Cluster Services.

3. Create a new cluster; on the fourth screen, add the current server name. This is what it shows:

[Screenshot: cluster services working correctly before .NET 4.8 is installed]

4. Install .NET 4.8 (KB4486153) from an offline installer and reboot.

5. After the reboot, go back to the same screen of the same Create Cluster Wizard and now it looks different:

[Screenshot: cluster services broken after .NET 4.8 is installed - unable to put in a 3-digit IP]

Now we are unable to type a 3-digit value into any of the octet fields. Each field accepts a maximum of two characters.

Has anyone else encountered this? It should be really easy to reproduce. 
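As a workaround while the wizard's IP fields are broken, the cluster can be created from PowerShell, which bypasses the GUI entirely. A minimal sketch; the cluster name, node name, and address below are placeholders:

```powershell
# Hypothetical names/address - substitute your own.
# Creates the cluster without touching the Create Cluster Wizard's IP fields.
Import-Module FailoverClusters
New-Cluster -Name "CLUS01" -Node "NODE01" -StaticAddress "192.168.100.10" -NoStorage
```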


Cluster name pointing to Passive node instead of active node


Hi!

I have a Windows Server 2008 R2 two node cluster.

Cluster name:   CLUSM00

Active node:      CLUSM001

Passive node:   CLUSM002

There are a number of SQL instances on the active node. When I RDP to the cluster name, it logs on to the passive node... This is causing havoc with the SQL backups (as they point to the cluster name).

Any idea how to make the cluster name point to the active node?
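Note that the cluster name's IP follows whichever node owns the core "Cluster Name" resource, which is not necessarily the node running the SQL instances. One way to find the right node before connecting is to query the owner nodes directly. A sketch, using the cluster name from the post:

```powershell
# List cluster groups and which node currently owns each one.
Import-Module FailoverClusters
Get-ClusterGroup -Cluster "CLUSM00" | Select-Object Name, OwnerNode, State

# Then RDP to the active node by its own name rather than the cluster name:
# mstsc /v:CLUSM001
```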

Thanks,

Zoe

Windows Server 2016 Failover Cluster Get-Volume lists all volumes


I created a 2-node failover cluster in my Hyper-V environment. 

My concern here is that when I ran:

Format-Volume -DriveLetter D

the D drives on both nodes were formatted.

When I ran Get-Volume on one of the nodes, I noticed that the D & E drives of each node were listed twice.

I noticed that 'Storage Replica' was added as a Cluster Resource Type and that the following device is installed:

Microsoft ClusPort HBA

Cursory research turned up this description:

"The Software Storage Bus (SSB) is a virtual storage bus spanning all the servers that make up the cluster. SSB essentially makes it possible for each server to see all disks across all servers in the cluster providing full mesh connectivity. SSB consists of two components on each server in the cluster; ClusPort and ClusBlft. ClusPort implements a virtual HBA that allows the node to connect to disk devices in all the other servers in the cluster. ClusBlft implements virtualization of the disk devices and enclosures in each server for ClusPort in other servers to connect to."

Is this by design? Is there a way to disable this? How do we fix this?

Windows Server 2016 Standard, running on Hyper-V
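Given that the Software Storage Bus exposes every node's disks locally, one defensive habit is to resolve which physical disk actually backs a drive letter before formatting, instead of formatting by letter. A sketch (the drive letter is from the post; verify the output against your own serial numbers):

```powershell
# Resolve the disk behind D: before doing anything destructive,
# since SSB/ClusPort can surface remote nodes' disks on this node.
$disk = Get-Partition -DriveLetter D | Get-Disk
$disk | Select-Object Number, FriendlyName, SerialNumber, IsClustered

# Only after confirming it is the intended local disk, format by disk number:
# Get-Partition -DiskNumber $disk.Number | Format-Volume
```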



Drain role Failed


We have three nodes: N-1, N-2, and N-3, all running Windows Server 2012 R2. I drained the roles from N-2 and 10 of the 14 VMs moved. The remaining 4 VMs are not moving and produce an error; I tried to move them manually, but I get the same error. Please assist.

Error message: "Operation did not complete on resource Virtual Machine Live Migration"
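When live migration keeps failing for a few stragglers, a quick (saved-state) migration can be attempted from PowerShell as a fallback. A sketch; the role name is a placeholder:

```powershell
Import-Module FailoverClusters
# Show the groups still sitting on node N-2 after the failed drain.
Get-ClusterGroup | Where-Object { $_.OwnerNode -eq "N-2" } | Select-Object Name, State

# Attempt a quick migration instead of a live migration (brief downtime).
Move-ClusterVirtualMachineRole -Name "VMRoleName" -Node "N-1" -MigrationType Quick
```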

Migrate Failover cluster role SQL server to new cluster shared volume (server 2019 - MSSQL 2017)


We have a 2-node failover cluster (Windows Server 2019) with an iSCSI Cluster Shared Volume, and a SQL Server 2017 role is running on the cluster. We now have to migrate to our new storage. A new Cluster Shared Volume is already configured on the failover cluster.

Searching the Internet for a migration guide, the only thing I can find is to rename the drive letter, but in our configuration we have no drive letters.

Can I just copy the data to the new CSV and rename the volume name in the ClusterStorage folder?
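Assuming a copy-based move with downtime is acceptable, one possible outline is below. This is only a sketch, not a validated SQL FCI migration procedure: the role name and volume folder names are placeholders, and the SQL role's disk dependencies would also need to be updated to the new cluster disk before bringing it back online.

```powershell
# 1. Take the SQL Server role offline so the database files are not in use.
Stop-ClusterGroup -Name "SQL Server (MSSQLSERVER)"

# 2. Copy the data from the old CSV to the new one, preserving ACLs.
robocopy C:\ClusterStorage\OldVolume C:\ClusterStorage\NewVolume /MIR /COPYALL /R:1 /W:1

# 3. Rename the CSV friendly names so the path under C:\ClusterStorage
#    that SQL Server references stays the same.

# 4. Bring the role back online and verify.
Start-ClusterGroup -Name "SQL Server (MSSQLSERVER)"
```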



S2D with 2 nodes


Hi all,

I need your help and advice with the scenario below:

A new 2-node S2D cluster will be created to host virtual machines that will be migrated from a standalone Hyper-V host to this newly created cluster.

The standalone Hyper-V host has three different virtual switches:

  1. One for the LAN network communication
  2. One for the CCTV
  3. One for the DMZ

Each node of the new S2D cluster has 2 NICs, which will be teamed into a single SET switch using the SET switch commands.

I will create three virtual network adapters on the SET switch (one for the CCTV, one for the DMZ, one for the SMB/heartbeat) and will assign the LAN IP to the SET switch itself.

I will then migrate the VMs from the standalone host to the new S2D cluster. VMs already connected to the LAN will be assigned to the SET switch (vSwitch) that was created.

The question is how to attach the migrated VMs that were connected to the CCTV/DMZ vSwitches on the standalone host, because in the settings of a VM we can only see the SET switch that was created, not the adapters.

Your help is appreciated; I'm going to deploy this project soon, and the network part is giving me a headache because I cannot test it in a lab.
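For what it's worth, the host-side virtual adapters (SMB/HB, etc.) are only visible to the management OS; guest VMs all connect to the one SET switch, and traffic separation for them is normally done with VLAN tagging on each VM's network adapter. A sketch of that pattern, assuming the physical switch ports trunk the relevant VLANs and that the names, NICs, and VLAN IDs below are placeholders:

```powershell
# Create the SET switch across both physical NICs.
New-VMSwitch -Name "SETswitch" -NetAdapterName "NIC1","NIC2" `
    -EnableEmbeddedTeaming $true -AllowManagementOS $true

# Host-only adapters for storage/heartbeat traffic.
Add-VMNetworkAdapter -ManagementOS -Name "SMB1" -SwitchName "SETswitch"

# Migrated guest VMs connect to the SET switch; CCTV/DMZ separation is
# done per-VM with VLANs rather than with separate vSwitches.
Set-VMNetworkAdapterVlan -VMName "CCTV-VM" -Access -VlanId 20
Set-VMNetworkAdapterVlan -VMName "DMZ-VM"  -Access -VlanId 30
```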

best regards,

Server 2019 - Clustering Issue with UDP Port 3343


Evening all,

We've recently run through an in-place upgrade of our cluster servers from 2016 to 2019 Datacenter edition. It all went very smoothly and everything seemed to be working.

However - a week later and suddenly the cluster is failing! A validation report shows that the three nodes cannot talk to each other on any cluster network via UDP Port 3343.

I've opened the necessary ports on the firewall, in addition to what Hyper-V adds, but it's still happening. I've restarted the machines: still happening. I've used Telnet on the servers to test that the ports are open, and they are.

This was working fine as 2016 - so something has gone wrong somewhere.

I'd be grateful if anyone could help or offer advice on what to look for.
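Two things worth checking: telnet only exercises TCP, so it cannot prove UDP 3343 is reachable; and after an in-place upgrade the built-in "Failover Clusters" firewall rule group (which covers the cluster heartbeat ports, including UDP 3343) may no longer be enabled. A sketch:

```powershell
# Inspect and re-enable the built-in failover clustering firewall rules.
Get-NetFirewallRule -DisplayGroup "Failover Clusters" |
    Select-Object DisplayName, Enabled, Direction, Action

Enable-NetFirewallRule -DisplayGroup "Failover Clusters"
```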

Thanks

Gareth

Windows 2012 r2 Cluster issues - Guest vms fail when one specific node hosts CSV


I have a Windows Server 2012 r2 cluster set up with 3 nodes.

2 nodes, vm3 and vm5, have no issues acting as owner of any role, including the CSV volumes, Quorum Disk Witness, and the individual VMs.  

1 node, vm1, has no issues owning any of the individual VM roles, one of the CSV volumes (high-speed-lun), or the Quorum Disk Witness. However, if vm1 is set as the owner of LUN_1 or LUN_2, any of the VMs that have their OS vhd(x) file hosted on those LUNs and are not owned by vm1 fail and can't be restarted.

The VMs that 

  • a) are owned by vm1 and have their OS vhd(x) files on a LUN that is owned by vm1, or
  • b) are owned by any VM host and have their OS vhd(x) files on the "high-speed-lun", no matter which node owns "high-speed-lun",

are not affected and have no issues booting or running.  It does not matter if LUN/CSV ownership fails over automatically, or if I manually change the owner node to vm1, any running VM that does not fit one of the above 2 descriptions will immediately die and not be able to restart.

Some scenarios that will hopefully clarify this issue a bit:

  1. vmguest1 and vmguest2 are hosted on vm1 node and their os storage is located on LUN_2, which is owned by vm5 node. this is not a problem and everything works.  also no issues if this is reversed.
  2. vmguest1 is owned by vm1 and vmguest2 is owned by vm3 node and their os storage is located on "high-speed-lun", which is owned by vm1 node.  This is not a problem and everything works.
  3. vmguest1 is owned by vm1 and vmguest2 is owned by vm3 node, with both os storage located on LUN_1, which is owned by vm1 node.  vmguest1 will be fine, while vmguest2 will fail to run/start.

When this issue occurs, I see the following errors in the Cluster Events/Event Viewer:

  • Error, Event ID 1069 "Cluster resource 'Virtual Machine vmguest1' of type 'Virtual Machine' in clustered role 'vmguest1' failed. The error code was '0x780' ('The file cannot be accessed by the system.').
  • Error, Event Id 1205 "The Cluster service failed to bring clustered role 'vmguest1' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role."

I know this is a lot of info, just trying to give as clear of an outline of the issues I'm seeing as possible up front.

Any thoughts anyone has to help get this all cleaned up would be greatly appreciated.
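Since only VMs doing remote (redirected) I/O through vm1 are dying, one quick diagnostic is to check the per-node CSV state, which shows whether each node is accessing each CSV in direct or redirected mode and why. A sketch:

```powershell
Import-Module FailoverClusters
# Per-node CSV access state: Direct, FileSystemRedirected, or BlockRedirected,
# plus the reason for any redirection.
Get-ClusterSharedVolumeState -Node "vm1" |
    Select-Object Name, Node, StateInfo,
        FileSystemRedirectedIOReason, BlockRedirectedIOReason
```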


In the interest of reducing questions about the cluster setup/environment, I'm going to try and get all of the potentially relevant info here in one fell swoop below.

Node info ("vm1", "vm3", "vm5"):

  • all 3 nodes are running 2012 r2,
  • all have the same updates [verified by cluster validation],
  • 2x xeon e5-2430l hex-core, 64gb memory,
  • 2x onboard nics teamed for cluster comms,
  • 2x onboard nics teamed and assigned to hyper-v switch,
  • 4x nics on individual subnets for communication with SAN
  • only known physical difference between the nodes is that vm1 has it's OS drive set up as a 2-disk 558GB RAID1, while vm3/vm5 have their OS drives set up as 4-disk 1.1tb RAID10.
  • all AD Joined with 3 DCs in 2 locations, 2 remote in the satellite office, 1 in the dc local to this cluster on separate hardware.  All AD tests/replication/etc have been tested and are, to the best of my knowledge, working properly.

Storage hardware ("dcsan"):

  • Dell MD3200i with dual controllers
  • each controller has 4 nics that are set up on individual subnets to match how the server nics are configured
  • One disk group set up as RAID10 across 8 physical 2tb, 7.2k rpm drives, with 7,430 gb total storage available ("Disk Group 0")
  • One disk group set up as RAID5 across 4 physical 600gb, 15k rpm drives, with 1,660 gb total storage available ("Disk Group 2")
  • MPIO is configured on each server node

Dell MDSM host mappings (see screenshot, actual host names changed for security):


The LUNs are available in Storage->Disks on each node as follows (LUN name in screenshot above, LUN Size, disk group, assigned to, Disk Number):

  1.     High-Speed-lun (HighSpeed1, 1.6 tb, Disk Group 2, Cluster Shared Volume, 4)
  2.     LUN_1 (Lun_1, 3.5tb, Disk Group 1, Cluster Shared Volume, 3)
  3.     LUN_2 (LUN_2, 3.5tb, Disk Group 1, Cluster Shared Volume, 3)
  4.     Quorum Witness (Cluster_Quorum, 520 mb, Disk Group 1, Disk Witness in Quorum, 1)

Cluster Roles:

    approx 20-25 guest vms, majority running 2012 r2, with a few running ubuntu (14.04-18.04 os)



Collecting Cluster Performance Data

I’ve been using Windows Admin Center to view performance data as I execute different types of workloads in a cluster with VMs and record the results. It works well visually, but if I run a workload for a specific amount of time, the data can be skewed depending on the moment the snapshot was taken when I record the results. I think results that show a high, low, and average value for a counter may be better for comparing against other results. I'm looking to collect the obvious counters: CPU, memory, IOPS, latency, throughput, etc.

I’m looking for an efficient method to collect this from nodes in a cluster running Hyper-V and S2D. Should I run PerfMon on all the nodes, or is there a way to calculate more efficiently using something like Get-ClusterPerformanceHistory, anything else I am missing?
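On S2D clusters, Get-ClusterPerformanceHistory (alias Get-ClusterPerf) already aggregates counters cluster-wide over time, so min/max/average can be computed without running PerfMon on every node. A sketch of the pattern; the series name and time frame are illustrative and the exact parameter/series names may vary by build (see Get-Help Get-ClusterPerf):

```powershell
# Pull an hour of per-node CPU history and summarize it.
Get-ClusterNode |
    Get-ClusterPerf -NodeSeriesName "ClusterNode.Cpu.Usage" -TimeFrame LastHour |
    Measure-Object -Property Value -Minimum -Maximum -Average
```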


T.J.


Admin Center hyper-converged cluster error ('There are no more endpoints available from the endpoint mapper.').


Hello! I have a hyper-converged S2D cluster on Windows Server 2016 nodes. I'm trying to manage it with Admin Center. Everything was done with the help of this article: https://docs.microsoft.com/en-us/windows-server/manage/windows-admin-center/use/manage-hyper-converged

But when I try to connect Admin Center to the S2D cluster, I get the error "Unable to create the "SDDC Management" cluster resource (required)", and in Cluster Events I receive this error:

Cluster resource 'SDDC Management' of type 'SDDC Management' in clustered role 'Cluster Group' failed. The error code was '0x6d9' ('There are no more endpoints available from the endpoint mapper.').

 

then

The Cluster service failed to bring clustered role 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

and then

Clustered role 'Cluster Group' has exceeded its failover threshold.  It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.  No additional attempts will be made to bring the role online or fail it over to another node in the cluster.  Please check the events associated with the failure.  After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.

What have I done wrong?
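One thing worth trying, based on the troubleshooting steps that accompany the linked article, is registering the SDDC Management resource type manually on one of the cluster nodes before reconnecting from Admin Center. A sketch:

```powershell
# Register the SDDC Management cluster resource type used by
# Windows Admin Center's hyper-converged cluster management.
Add-ClusterResourceType -Name "SDDC Management" `
    -dll "$env:SystemRoot\Cluster\sddcres.dll" -DisplayName "SDDC Management"
```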

Disk Sharing - Server 2016 Cluster


I'm sure this is probably a newbie question, but it is so hard to find information.

A person would think the ability to do this would be the most fundamental idea in failover clustering.

I have Failover Clustering set up, everything seems to be working fine. Also have SQL server (2017) clustering set up and everything seems to be working fine. I am able to run SQL queries, etc. from both nodes and a different computer.

The problem is support files. We have spreadsheets and document templates that users need to be able to access. I try to put them on the cluster nodes, but I cannot "Share" any of the folders in Active Directory.

The File Server role is installed on both nodes (I call them SQL2 and SQL3). The drive I wish to share is listed in the Failover Cluster Manager as "Available Storage".

I put the drive in maintenance mode and was able to share a folder on SQL3, but it seems to be shared only on that node (the path is '\\SQL3\HDrive').

In Computer Management on SQL2, I cannot even see the drive under "Sharing" or in File Manager. But it does show up in Cluster Manager...

Will SQL2 see it if SQL3 goes offline?

Have I got it right? Or what am I missing?
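A share created directly on one node (\\SQL3\HDrive) does not fail over. For highly available shares, the folder needs to live on a clustered File Server role whose network name and storage move together between nodes. A minimal sketch; the role name, cluster disk, IP, path, and group below are placeholders:

```powershell
Import-Module FailoverClusters
# Create a clustered File Server role on a cluster disk from Available Storage.
Add-ClusterFileServerRole -Name "FS01" -Storage "Cluster Disk 2" `
    -StaticAddress "10.0.0.50"

# Scope the share to the role's network name, so \\FS01\Support keeps
# working no matter which node (SQL2 or SQL3) currently hosts the role.
New-SmbShare -Name "Support" -Path "H:\Support" -ScopeName "FS01" `
    -FullAccess "DOMAIN\Domain Users"
```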

2019 Hyper-V Cluster - Quorum


Hi All,

I just finished setting up a Hyper-V cluster in our environment, a 3-node cluster.

Since the cluster node count is odd, is it still recommended, or even necessary, to configure a (disk) witness in this case?

What I normally practice is to only add a witness when the node count is even (for tie-breaking).

Hopefully you can share your thoughts.
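For what it's worth, with dynamic quorum (2012 R2 and later) the general guidance is to configure a witness regardless of node count; the cluster adjusts the witness vote automatically, so it only counts when it is needed to break a tie. Checking and setting it is brief (the cluster disk name is a placeholder):

```powershell
Import-Module FailoverClusters
# Show the current quorum configuration.
Get-ClusterQuorum

# Add a disk witness; dynamic quorum manages its vote automatically.
Set-ClusterQuorum -DiskWitness "Cluster Disk 1"
```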

Thanks

Cluster Aware Update


Hi,

I have a Windows Server 2012 R2 Hyper-V cluster with 3 nodes and 15 VMs. Normally, for Windows updates, we use a local WSUS: we download the updates on each cluster node, install them, reboot if required, and repeat the same procedure node by node.

Can I use the Cluster-Aware Updating (CAU) mechanism to update my cluster nodes? Please note that I install security updates, update rollups, etc.

Please comment.
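CAU automates exactly this pattern (drain, patch, reboot, resume, one node at a time) and its Windows Update plug-in honours the node's configured update source, including a local WSUS. A sketch of a readiness check and a one-off run; the cluster name is a placeholder:

```powershell
# Verify the cluster and nodes meet Cluster-Aware Updating prerequisites.
Test-CauSetup -ClusterName "CLUS2012R2"

# Run one updating pass, patching one node at a time.
Invoke-CauRun -ClusterName "CLUS2012R2" `
    -CauPluginName "Microsoft.WindowsUpdatePlugin" `
    -MaxFailedNodes 1 -RequireAllNodesOnline -Force
```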




CSV Autopause - Single client contification start.


Hi,

I've just got a warning from my cluster that one of my CSVs was stopped, but I just don't get what was going on.

In the FailoverClustering-CsvFs event log I get this message:

"Volume {44179469-89e8-4971-b9ff-057c4579c647} is autopaused. Status 0xC00000C4. Source: Single client contification start."

What does that even mean? "Single client contification"?

Best Regards

Daniel


Can't move VMs in cluster to a particular host


I have a 3-node, 2016 Datacenter cluster with multiple VMs. Right now, all VMs are on host 1 and host 2. If I try to live migrate to host 3, I get event IDs 1069 and 21502. I can migrate between hosts 1 and 2 at will with no problem. Even when I try a quick migration, the VM appears to move to host 3, but when I start it, it fails immediately.

The thing I've noticed is that I can access the Cluster Shared Volume from windows explorer on host 1 and host 2. If I try to access it on host 3 I get:

C:\clusterstorage\volume1 is not accessible. The referenced account is currently locked out and may not be logged on to.

The 1069 error reads:

Cluster resource 'Virtual Machine X' of type 'Virtual Machine' in clustered role 'X' failed. The error code was '0x775' ('The referenced account is currently locked out and may not be logged on to.').


Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

The 21502 error mentions:

'Virtual Machine X' failed to start.

'X' failed to start. (Virtual machine ID blah_blah)

'X' failed to start worker process: The referenced account is currently locked out and may not be logged on to. (0x80070775). (Virtual machine ID blah_blah)

'Virtual Machine Name Unavailable' could not initialize. (Virtual machine ID blah_blah)

'Virtual Machine Name Unavailable' could not read or update virtual machine configuration: The referenced account is currently locked out and may not be logged on to. (0x80070775). (Virtual machine ID blah_blah)

What account is it referring to? Shouldn't this be happening across all hosts instead of host 3?

Any help is much appreciated.
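Error 0x775 maps to ERROR_ACCOUNT_LOCKED_OUT, so one avenue is to find out which AD account is actually locked; the cluster name object or a service/run-as account used on host 3 are plausible suspects. A sketch using the ActiveDirectory module (the identity in the last line is a placeholder):

```powershell
Import-Module ActiveDirectory
# List every locked-out account in the domain to identify the culprit.
Search-ADAccount -LockedOut | Select-Object Name, SamAccountName, LastLogonDate

# Unlock the specific account once identified:
# Unlock-ADAccount -Identity "SOMEACCOUNT"
```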

Windows cluster - Hardware migration with Windows 2016 upgrade


Dear Cluster Gurus.

This is my scenario:

Existing setup: HP Gen7 physical hosts with Windows 2008 R2. We have a two-node Windows cluster (node1 & node2) running an Oracle database and 3rd-party applications, with 4 virtual IPs and 10 shared disks.

New setup: we want to migrate the above cluster to new HP Gen10 hardware with Windows 2016 LTSB, without changing the hostnames and cluster names.

Proposed method 1: take an image of node1 and node2 (Windows 2008 R2), clone that image to the new hardware node1 & node2, shut down the old node1 & node2, start up the new node1 & node2, and then start the Windows 2016 upgrade on node1 & node2.

Does the above method work? Has anyone tried this?

what are the alternate solutions?

Thanks
