Channel: High Availability (Clustering) forum
Viewing all 4519 articles

Cluster-aware updating - Self updating not working

Hi,

I have a Windows Server 2012 failover cluster with 2 nodes, and I am having problems getting self-updating to work properly.

The Analyze CAU Readiness check does not report any issues, and I have been able to run a remote update with no problems. I don't get any errors or failure messages in the CAU client, only this message: "WARNING: The Updating Run has been triggered, but it has not yet started and might take a long time or might fail. You can use Get-CauRun to monitor an Updating Run in progress."

In the Event Viewer I see 2 errors and 1 warning for each run: events 1015, 1007 and 1022.

1015: Failed to acquire lock on node "node2". This could be due to a different instance of orchestrator that owns the lock on this node.

1007: Error Message: There was a failure in a Common Information Model (CIM) operation, that is, an operation performed by software that Cluster-Aware Updating depends on.

Does anyone have any idea what is causing this to fail?

Thanks!


Storage Spaces Direct does not reclaim pool disk space for a removed virtual disk

We have a Server 2016 CU8 2-node S2D Hyper-V cluster with several cluster shared volumes. Four of them have been decommissioned and deleted from the storage pool, including Remove-VirtualDisk for the detached objects. The pool shows the correct amount of free space (capacity volume). However, while the space of the first three virtual disks was given back for new allocation, the fourth virtual disk has been deleted from a PowerShell perspective but the storage pool does not offer the reclaimed space for allocating a new virtual disk, so it is essentially still allocating the blocks somewhere in the background. There is a discrepancy between the free space displayed in the storage pool view and what the virtual disk wizard offers for volume creation.

Any ideas on how to clean up the storage pool to reclaim the dead disk space?

Thank you!

Thorsten

VMs went into the Stopping state during live migration

We have a 3-node Hyper-V cluster with around 50 VMs running on it. We initiated a live migration of one VM from node 2 to node 1. The VM was shown as moved to node 1, but its status remained Live Migrating. Within a minute the VM resource failed, and after some time all the VMs running on node 1 went into the Stopping state. After a hard reboot of node 1, all the VMs failed over to other nodes and came online.

We have experienced this in different clusters over the past two months.



OS Name: Microsoft Windows Server 2016 Datacenter - System Model: PowerEdge R630

A component on the server did not respond in a timely fashion. This caused the cluster resource 'Virtual Machine CHNXXXXXPS01' (resource type 'Virtual Machine', DLL 'vmclusres.dll') to exceed its time-out threshold. As part of cluster health detection, recovery actions will be taken. The cluster will try to automatically recover by terminating and restarting the Resource Hosting Subsystem (RHS) process that is running this resource. Verify that the underlying infrastructure (such as storage, networking, or services) that are associated with the resource are functioning correctly.


Solution to configure Hyper-V nodes in HA without a failover cluster

Currently, we have clustered Hyper-V nodes in multiple data centers with iSCSI storage. The failover cluster causes multiple platform downtimes, even when it is only a VM that did not respond to the cluster. We have disabled heartbeat monitoring for the VMs in the cluster, but we are still facing the issue.

Is there any solution for configuring the Hyper-V node VMs in HA without a failover cluster?


Expanded disk using diskpart, now cluster shared volume shows Failed to get the volume number \\&globalroot\device\***

I have a Nimble storage array that I expanded.

Then I ran diskpart on the server to expand the disk.

When I go into Failover Cluster Manager, under Storage, the disk size reports correctly but the volume does not show the correct size.

I also have two error messages: 1069 (Resource Control Manager) and 5150 (Cluster Shared Volume).

Unable to create a new database, or restore an existing database on SQL 2014 FCI

Hello Technet Support,

We recently built a SQL Server 2014 Standard Failover Cluster Instance consisting of only two nodes, running on Windows Server 2016.

When we try to create a new database or restore an existing database, we are unable to do so and are faced with the errors below.

Please assist.

Thank you, Anand


Anand Franklin

3-node HA cluster with 2 CSVs: only use 2 members for one CSV?

Hi, I'm just going through a planning stage at the moment and wondered about the following. With a 3-node cluster and one CSV, all members access it via iSCSI (if that matters). I want to add a second CSV but only want 2 members to access it (via direct SAS connection). Is this possible?

S2D Reserve Capacity

Hi,

We are starting to implement S2D in our environment. Can anyone confirm how reserve capacity works? Our nodes have 20 capacity disks each, so I want at least 2 disks' worth in each node for reserve.

How do I do that? Is it literally just a case of not filling up the pool to the amount you want? I.e:

20X 600GB disks per node (+ 4X 800GB SSD for cache), 2 nodes = 24TB total physical storage
minus 4X 600GB for reserve = 21.6TB
divided by 2 (2 nodes, 2-way mirror) = 10.8TB usable

So don't fill up the pool more than 10.8TB and you'll have 4 disks worth of reserve capacity?

Is that how it works?

Every time I read about setting reserve capacity I keep wanting to actually mark a physical disk as a reserve disk, like in the traditional hardware-RAID world.

Thanks
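
For readers who want to re-run the arithmetic in the question above, here is a minimal Python sketch. It only reproduces the poster's own calculation (raw capacity, minus the disks held back, divided by the number of mirror copies) and is not a statement of how S2D itself accounts for reserve. The function name and parameters are hypothetical, and decimal units (1 TB = 1000 GB) are assumed.

```python
def usable_capacity_tb(nodes, disks_per_node, disk_gb,
                       reserve_disks_per_node, mirror_copies):
    """Reproduce the poster's estimate; returns (raw, after_reserve, usable) in TB.

    Hypothetical helper: cache SSDs are left out, as in the original post,
    and 1 TB is taken to be 1000 GB.
    """
    raw_gb = nodes * disks_per_node * disk_gb
    reserve_gb = nodes * reserve_disks_per_node * disk_gb
    after_reserve_gb = raw_gb - reserve_gb
    # A 2-way mirror keeps two copies of every block, so halve the footprint.
    usable_gb = after_reserve_gb / mirror_copies
    return raw_gb / 1000, after_reserve_gb / 1000, usable_gb / 1000


# Figures from the post: 2 nodes, 20 x 600GB each, 2 disks per node held back.
raw_tb, after_reserve_tb, usable_tb = usable_capacity_tb(
    nodes=2, disks_per_node=20, disk_gb=600,
    reserve_disks_per_node=2, mirror_copies=2,
)
print(f"raw: {raw_tb:.1f} TB, after reserve: {after_reserve_tb:.1f} TB, "
      f"usable: {usable_tb:.1f} TB")
```

With the numbers from the post this gives 24.0, 21.6 and 10.8 TB, matching the figures quoted in the question.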



Error validating cluster computer resource name (Server 2016 Datacenter Cluster)

    An error occurred while executing the test.
    The operation has failed. An error occurred while checking the Active Directory organizational unit for the cluster name resource.

    The parameter is incorrect

    Interestingly enough, the cluster name was created successfully in the Computers OU and the cluster can be taken offline and brought back online with no problem. The DNS entry is correct and the cluster name pings to the correct IP. Changing the name of the cluster updates the cluster computer name in AD with no errors.


SCSI disk doesn't reconnect when failover cluster is mounted

Before creating the failover cluster, I created my storage LUNs to hold the VMs and the quorum for the failover cluster storage setup. But when creating the cluster, the SCSI disks don't become available to it, and in the SCSI target connection they stay in the "reconnecting" status.

What should I do to make this storage available in the cluster?

IIS Role Based Failover Clustering - Failover Issue

Hi All,

I have deployed Windows failover clustering and added two nodes to it. Now we have a requirement to add role-based failover for IIS. I tried adding the WWW service as a role, but the output is not as expected: failover is not happening.

I have also downloaded the failover script for websites from TechNet and implemented it via Generic Script. But again, the site does not fail over.

Can someone help me out with this?

Regards,

Ramesh k


Failover clustering event id 5120

Hello everyone,

Every few days we receive the following message from our Hyper-V Cluster.

Cluster Shared Volume 'disk 1' ('disk 1') has entered a paused state because of 'STATUS_USER_SESSION_DELETED(c0000203)'. All I/O will temporarily be queued until a path to the volume is reestablished.

We are running Windows Server 2016 on HP BL660 servers.

All the VMs on the CSV stay online and there are no interruptions within the VMs. Does anybody have an idea how I can resolve this message?

Windows Failover Cluster Best Practice - Should the Cluster Name and Resource Name be part of an OU where Baseline GPOs are applied

Hi all,

What is the best practice for a Windows failover cluster?

Should the Cluster Name and Resource Name be part of an OU where Baseline GPOs are applied, or not?

[Oracle, MSCS] Service is not brought to another node when the service is down

Hello, all

I deployed an environment in which two Oracle instances run on a Microsoft Cluster in HA.

The weird thing is that when I shut down the active node or move the service to another node manually, both work fine.

However, when the service for one Oracle instance is stopped in services.msc, it doesn't work.

The service is never moved to another node.

When I first configured this environment, I wanted to ensure that HA was possible even if either of the Oracle instances failed.

When I look at the cluster events, I see event ID 1196 and event ID 1069.

I looked at DNS, referring to other technical documents, but did not get a clear answer. I would appreciate any helpful opinions.

Thanks.

1090 & 7024 - adding new node

Hello,

I have created a new cluster on Hyper-V Server 2016. Currently it only has one node.

When I try to add the second one, the following errors appear:

Event ID 1090:

The Cluster service cannot be started. An attempt to read configuration data from the Windows registry failed with error '2'. Please use the Failover Cluster Management snap-in to ensure that this machine is a member of a cluster. If you intend to add this machine to an existing cluster use the Add Node Wizard. Alternatively, if this machine has been configured as a member of a cluster, it will be necessary to restore the missing configuration data that is necessary for the Cluster Service to identify that it is a member of a cluster. Perform a System State Restore of this machine in order to restore the configuration data.

Event ID 7024:

The Cluster Service service terminated with the following service-specific error: 
The system cannot find the file specified.

Event ID 7031:

The Cluster Service service terminated unexpectedly.  It has done this 67 time(s).  The following corrective action will be taken in 15000 milliseconds: Restart the service.

On this second node, the "Cluster Service" appears as disabled. I tried starting it manually and that also failed.

I removed the failover cluster feature, restarted, and installed it again, but it failed too.

Any ideas?

Thanks in advance

Regards


Windows 2016 file server stretch cluster - error 0x80071398 when moving to different node

Hi,

We want to deploy a highly available file server with automatic failover between two sites, and we followed the instructions from here:

https://docs.microsoft.com/en-us/windows-server/storage/storage-replica/stretch-cluster-replication-using-shared-storage

The only difference is that we have only one server in each site.

There were no issues during the configuration, but when we try to fail back or move the role to the node in the other site, we get the following error:

"Error Code: 0x80071398 The operation failed because either the specified cluster node is not the owner of the group, or the node is not a possible owner of the group"

We have checked the possible owners and it all looks good. We even ran the following commands to make sure that the permissions are set correctly:

Get-ClusterResource | Set-ClusterOwnerNode Server1,Server2

Get-ClusterGroup | Set-ClusterOwnerNode Server1,Server2

We also tried evicting the second node and adding it to the cluster again, but still no luck.

Any help would be greatly appreciated.


HP MSA 2040 move data from Virtual RAID to Linear

Hi 

I have an MSA 2040 that was initially configured as follows:

 - 2x set of 12 disks (2.5" 900GB, Virtual RAID) with no spares
 - 2x set of 5 disks (3.5" 4TB, Virtual RAID) with no spares

Now, with data on it, we have decided to use spare disks on all sets, so this is the configuration we want:

 - 2x set of 12 disks (2.5" 900GB, Linear RAID) with 1x spare each
 - 2x set of 5 disks (3.5" 4TB, Linear RAID) with 1x spare each (already created)

For this we need to move the data on the 2.5" disk volumes to the 3.5" disks and create a new RAID on the 2.5" disks. After that, we restore the data to its original location.

What is your advice on this, and what are the best practices?

Best Regards 

File share witness issue | frequently goes offline

I have a Windows Server 2012 2-node failover cluster with a file share witness configured. The file share witness frequently goes offline and triggers the events below. When I check the file share witness, it is accessible, and it comes online when I bring it online manually. This happens frequently and creates tickets. Please suggest a solution.


Event ID 1052
File share witness resource 'File Share Witness (2)' failed a periodic health check on file share '\\server\FileShareWitness'. Please ensure that file share '\\server\FileShareWitness' exists and is accessible by the cluster.

Event ID 1069

Cluster resource 'File Share Witness (2)' of type 'File Share Witness' in clustered role 'Cluster Group' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Event ID 1564
File share witness resource 'File Share Witness (2)' failed to arbitrate for the file share '\\Server\FileShareWitness'. Please ensure that file share '\\server\FileShareWitness' exists and is accessible by the cluster.
Event ID 1205
The Cluster service failed to bring clustered role 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

Event ID 1254
Clustered role 'Cluster Group' has exceeded its failover threshold.  It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.  No additional attempts will be made to bring the role online or fail it over to another node in the cluster.  Please check the events associated with the failure.  After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.

Can I mix NLB 2008 R2 with 2016 temporarily?

Hello, I have an IIS farm (NLB) on Windows Server 2008 R2, and we are migrating to Windows Server 2016.
Can I temporarily have a mixed farm (NLB) of Windows Server 2008 R2 and Windows Server 2016?

Thanks!

Enrique.

Windows 2019 S2D cluster failed to start, event ID 1809

Hi, I have a lab with a Windows Server 2019 Insider cluster, which I upgraded in place to the RTM version of Windows Server 2019. The cluster shuts down after a while, and event ID 1809 is logged:

This node has been joined to a cluster that has Storage Spaces Direct enabled, which is not validated on the current build. The node will be quarantined.
Microsoft recommends deploying SDDC on WSSD [https://www.microsoft.com/en-us/cloud-platform/software-defined-datacenter] certified hardware offerings for production environments. The WSSD offerings will be pre-validated on Windows Server 2019 in the coming months. In the meantime, we are making the SDDC bits available early to Windows Server 2019 Insiders to allow for testing and evaluation in preparation for WSSD certified hardware becoming available.

Customers interested in upgrading existing WSSD environments to Windows Server 2019 should contact Microsoft for recommendations on how to proceed. Please call Microsoft support [https://support.microsoft.com/en-us/help/4051701/global-customer-service-phone-numbers].

It's kind of weird because my S2D cluster is running in VMs. Is there some registry switch to disable this stupid lock?
