Channel: High Availability (Clustering) forum

Migrating Cluster File Server to SAN with new LUN

Hello,

Can you guide me on how to move my current clustered File Server role to a SAN with a new LUN? I want to do this without impacting my current file server configuration.

Current structure:

File server running on a physical machine with a CSV disk and LUN.

I now want to migrate this server to a newly purchased SAN by creating a new LUN.

Please guide me on how I should approach this.

Thank you !

Abdul Wahab


VM Windows Server licensing in Cluster


If a mix of OS editions and versions is running in VMs hosted on a multi-node cluster, how should the Windows OS be licensed?

Scenario: a 3-node cluster runs a total of 12 virtual machines with Windows Server 2008 R2 Enterprise, Windows Server 2012 Standard, and Windows Server 2012 R2 Standard, and DRS is active without any rules specified. It is virtualized with VMware technologies.

Regards,

Windows Server 2012-R2 or 2016 Failover cluster manager: multiple online resources

I was wondering if anybody has experienced and/or resolved the following issue:

Windows Failover cluster Setup:

  • Two Windows 2016 or 2012-R2 server nodes: A and B with current Windows patches.
  • Generic Application DLL resource: implements IsAlive(), LookAlive(), Online() and Offline()
  • Virtual IP address resource: as a dependency of the Generic Application
  • Policy: configured to fail over at the first failure
    1. Period for Restarts=15:00
    2. Maximum restarts in the specified period=0
    3. Delay between restarts=0

Issue:

When IsAlive() fails on the primary server A, the cluster manager:

  • Does not call Offline() on A (leaving A online)
  • Moves VIP address from A to B
  • Calls Online() on B

As a result, both A and B Application resources are online.



Access denied when remotely trying with Get-NlbClusterNode

I am trying to write a monitoring script for the status of an NLB cluster that has 2 nodes.

I have 2 VMs (Windows Server 2016 Standard): CB-1 and CB-2

When I run this command on these VMs:

Get-NlbClusterNode

I get the output I need.


But if I try the same from a remote server (same network and domain) I get an error:

Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
    + CategoryInfo          :
    + FullyQualifiedErrorId : AccessDenied,Microsoft.NetworkLoadBalancingClusters.PowerShell.GetNlbClusterNode

UAC is already DISABLED

The firewall is already DISABLED

WinRM is already RUNNING

It's a clean installation on a demo server, so I can rule out any kind of system problem.

Why is that? It is ONLY the "Get-NlbClusterNode" command that gives me "Access is denied". "Get-NlbCluster", for example, works just fine.

# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ #
$domainuser = "dom\administrator"
$domainpassword = "xxxxxxx" | ConvertTo-SecureString -AsPlainText -Force
$domaincredentials = New-Object System.Management.Automation.PSCredential($domainuser, $domainpassword)
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ #
Invoke-Command -ComputerName cb-1.dom.local -Credential $domaincredentials -ScriptBlock { Get-NlbClusterNode -HostName cb-1.dom.local }
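
For readers following the error text in this post, the HRESULT 0x80070005 can be split into its standard bit fields. This is a minimal sketch in Python, not part of the original post; `decode_hresult` is a hypothetical helper name. It only shows that the value wraps Win32 error 5 (access denied) and says nothing about which permission check produced it.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    hr &= 0xFFFFFFFF
    return {
        "failure": bool(hr >> 31),       # bit 31: set when the call failed
        "facility": (hr >> 16) & 0x7FF,  # bits 16-26: 7 is FACILITY_WIN32
        "code": hr & 0xFFFF,             # bits 0-15: the wrapped error code
    }


fields = decode_hresult(0x80070005)
print(fields)
```

Facility 7 with code 5 means the HRESULT is a wrapped Win32 ERROR_ACCESS_DENIED, the same code the message spells out as E_ACCESSDENIED.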

Large virtual machine reboot during live migration - WS 2012R2

Hello,

Live migration of large VMs often fails on a 2012 R2 cluster, the same situation as described here: https://social.technet.microsoft.com/Forums/en-US/3436c57b-8832-4981-a09f-47361ed5db1d/live-migration-of-big-vms-fail. There is even a solution provided in that thread:

On all hosts, go to the key below in the registry
HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization\Migration
Create a new value "NetworkBufferCount" as a DWORD with a value of "1024"
Reboot the host

Unfortunately, it does not say whether this value is hexadecimal or decimal. It would also be nice to know how this works.
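
The base matters because the registry editor's DWORD dialog lets you choose between hexadecimal and decimal input, and the same digits give different numbers in each. The short Python sketch below is an illustration only: it shows the two possible readings of "1024" and does not settle which one the quoted solution intended.

```python
# The two possible readings of the digits "1024" typed into a DWORD dialog.
as_decimal = int("1024", 10)  # digits read as base 10
as_hex = int("1024", 16)      # the same digits read as base 16

print(as_decimal, hex(as_decimal))
print(as_hex, hex(as_hex))
```

Read as decimal, the value is 1024 (0x400). Read as hexadecimal, it is 4132 (0x1024), roughly four times as many buffers.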

Any help would be highly appreciated!

Regards

Ueli

Unable to migrate from new Hyper-V 2016 host in failover cluster: errors 21502 and 1069

I have a two-node Hyper-V 2016 failover cluster. Both hosts are Dell PowerEdge R610s. I just added another host (Dell PE R620). The existing hosts have Xeon E5-2430 2.50GHz procs. The new host has Xeon E5-2690 2.90GHz procs.

I've changed my VM CPU settings by enabling "Migrate to a physical computer with a different processor version", but when I try to live migrate I get error 21502:

The virtual machine 'Win10test' is using processor-specific features not supported on physical computer 'CLUSTER1'. To allow for migration of this virtual machine to physical computers with different processors, modify the virtual machine settings to limit the processor features used by the virtual machine. 

When I quick migrate, the VM moves to one of the existing hosts but fails with error 1069:

Cluster resource 'Virtual Machine Win10test' of type 'Virtual Machine' in clustered role 'Win10test' failed. The error code was '0xc0370029' ('Cannot restore this virtual machine to the saved state because of hypervisor incompatibility. Delete the saved state data and then try to start the virtual machine.').

Adding Storage To existing cluster

I currently have a failover cluster set up. I'm starting to run out of storage space. I have some empty slots in the cluster and have hard drives to plug in. What do I need to consider before doing this? I was told that this is going to break both of my clusters and that I'll have to rebuild them from scratch; is that right? What information do I need to get off of the cluster before beginning?

iSCSI export on CSV

Hi.

Please bear with my inexperienced English... :D

I want to use S2D as network block storage, because I want to configure S2D storage and use it over iSCSI from other types of hypervisors (KVM, vSphere, etc.).

How do I export a CSV configured through Windows Server S2D as an iSCSI target?

When I try to add the iSCSI Target Server role in Failover Cluster Manager with all of the disks in the S2D cluster, the disk is missing.

I look forward to your answers.


Change number of RSS CPUs (Emulex under W2016 DC) - limited to 4

I am seeing odd behaviour with an Emulex OneConnect OCm14104B-N1-D 4-port 10GbE rNDC NIC.

Latest available driver: 11.4.1205.0

ONE NIC cannot be configured with the expected number of RSS queues. I want 8, but it only lets me set 4:

Set-NetAdapterRss : Value must be within the range 1 - 4
At line:1 char:1
+ Set-NetAdapterRss -Name NIC1 -MaxProcessors 8
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidArgument: (MSFT_NetAdapter...C0DC5BFE3473}"):ROOT/StandardCi...rRssSettingData) [Set-NetAdapterRss], CimException
    + FullyQualifiedErrorId : Windows System Error 87,Set-NetAdapterRss
Yet each OTHER adapter on the same daughterboard DOES allow 8.

For the config I used this and this

This makes NO sense to me at all. Has anybody seen it?

Seb


S2D CSV: unable to change owner node. The error code was '0x6f7' ('The stub received bad data.').

I have a 2-node PowerEdge R740xd cluster with S2D enabled. I am currently unable to change the owner node for a cluster shared volume in Failover Cluster Manager.

Cluster resource 'Cluster Virtual Disk (Volume2)' of type 'Physical Disk' in clustered role '906fbd9e-8861-441f-8191-dcb894585dd4' failed. The error code was '0x6f7' ('The stub received bad data.').

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Server 2008 R2 Cluster loses access to Network randomly

Hello,

Over the past few months, my Server 2008 R2 cluster has been reporting lost access to the cluster network and failing over.

The cluster log reports what appears to be a network stack reset, with all adapters being checked before it reports that the IP address check fails.

 [IM] Notify other nodes about local adapter disconnect. Issuing state change gum with interface result <class mscs::InterfaceResult>

I have already adjusted the cluster local subnet thresholds.  

There is no real network activity occurring at the time and no backups either.

Is this a fault with Windows?


Steven Wells

VM shuts down when trying to add highly available role in Windows 2016 Hyper-V Failover Cluster

Hi! We have a Hyper-V failover cluster on Windows 2016 (after a clean upgrade from 2012 R2). We have a SoFS cluster on 2016 as file storage for the VMs. We also have VMM.
Let's go through the problem in steps:
1.) Create a non-highly-available VM "vmtest1" in VMM on the Hyper-V 2016 cluster, and start it.
2.) Add the Virtual Machine role for VM "vmtest1" in the Failover Cluster snap-in.
3.) After ~5-10 min there is an error in the event log from Hyper-V-SynthStor (Event ID 12630): 'vmtest1': Virtual hard disk resiliency failed to recover the drive '\\test1.test.consto.ru\VD0\test1\vmtest1.vhdx'. The virtual machine will be powered off. Current status: Permanent Failure.
4.) The virtual machine "vmtest1" is now powered off.
This situation repeats on another cluster on Windows 2016.
We have had this problem since upgrading the cluster from Windows 2012 R2 to Windows 2016.
On 2012 R2 clusters the problem was not observed.
It happens only when we add the "highly available" role in the cluster while the VM is running. If we just create a "highly available" VM in VMM, everything goes well.
All cluster servers and the SoFS have the latest updates. Some events:


Microsoft-Windows-Hyper-V-Worker/Admin:
'vmtest1': Virtual hard disk '\\test1.test.consto.ru\VD0\test1\vmtest1.vhdx' received a resiliency status notification. Current status: Disconnected.
'vmtest1': Virtual hard disk '\\test1.test.consto.ru\VD0\test1\vmtest1.vhdx' has detected a recoverable error. Current status: Disconnected.
'vmtest1': Virtual hard disk resiliency failed to recover the drive '\\test1.test.consto.ru\VD0\test1\vmtest1.vhdx'. The virtual machine will be powered off. Current status: Permanent Failure.
'vmtest1' was paused for critical error
'vmtest1' was turned off as it could not recover from a critical error. 

Microsoft-Windows-Hyper-V-StorageVSP/Microsoft-Hyper-V-StorageVSP-Admin:
Storage device '\\?\UNC\test1.test.consto.ru\VD0\test1\vmtest1.vhdx' changed recovery state. Previous state = Recoverable Error Detected, New state = Unrecoverable Error.
Storage device '\\?\UNC\test1.test.consto.ru\VD0\test1\vmtest1.vhdx' received a recovery status notification. Current device state = Recoverable Error Detected, Last status = Disconnected, New status = Permanent Failure.
Storage device '\\?\UNC\test1.test.consto.ru\VD0\test1\vmtest1.vhdx' received a recovery status notification. Current device state = No Errors, Last status = No Errors, New status = Disconnected.
Storage device '\\?\UNC\test1.test.consto.ru\VD0\test1\vmtest1.vhdx' changed recovery state. Previous state = No Errors, New state = Recoverable Error Detected.

Microsoft-Windows-FailoverClustering/Diagnostic:
[RES] Virtual Machine Configuration <Virtual Machine Configuration vmtest1>: Current state 'Online', event 'UpdateVmConfigurationProperties'
[RES] Virtual Machine Configuration <Virtual Machine Configuration vmtest1>: Updated VmStoreRootPath property to '\\?\UNC\test1.test.consto.ru\VD0\test1\vmtest1.vhdx'
[RCM] HandleMonitorReply: LOCKEDMODE for 'Virtual Machine Configuration vmtest1', gen(0) result 0/0.
[RCM] Virtual Machine Configuration vmtest1: Flags 1 added to StatusInformation. New StatusInformation 1 
[RCM] vmtest1: Added Flags 1 to StatusInformation. New StatusInformation 1 
[RHS] Resource Virtual Machine vmtest1 called SetResourceLockedMode. LockedModeEnabled1, LockedModeReason0.
[RCM] HandleMonitorReply: LOCKEDMODE for 'Virtual Machine vmtest1', gen(0) result 0/0.
[RCM] Virtual Machine vmtest1: Flags 1 added to StatusInformation. New StatusInformation 1 
[GUM] Node 16: Processing RequestLock 16:1953
[RCM] HandleMonitorReply: INMEMORY_NODELOCAL_PROPERTIES for 'Virtual Machine vmtest1', gen(0) result 0/0.
[RHS] Resource Virtual Machine Configuration vmtest1 called SetResourceLockedMode. LockedModeEnabled0, LockedModeReason0.
[RCM] HandleMonitorReply: LOCKEDMODE for 'Virtual Machine Configuration vmtest1', gen(0) result 0/0.
[RCM] Virtual Machine Configuration vmtest1: Flags 1 removed from StatusInformation. New StatusInformation 0 
[RHS] Resource Virtual Machine vmtest1 called SetResourceLockedMode. LockedModeEnabled0, LockedModeReason0.
[RCM] HandleMonitorReply: LOCKEDMODE for 'Virtual Machine vmtest1', gen(0) result 0/0.
[RCM] Virtual Machine vmtest1: Flags 1 removed from StatusInformation. New StatusInformation 0 
[RCM] vmtest1: Removed Flags 1 from StatusInformation. New StatusInformation 0 
[RCM] HandleMonitorReply: INMEMORY_NODELOCAL_PROPERTIES for 'Virtual Machine vmtest1', gen(0) result 0/0.
[RCM] Virtual Machine vmtest1: Flags 1 removed from StatusInformation. New StatusInformation 0 
[RES] Virtual Machine <Virtual Machine vmtest1>: Current state 'Online', event 'VmStopped'
[RCM] vmtest1: Removed Flags 1 from StatusInformation. New StatusInformation 0 
[RES] Virtual Machine <Virtual Machine vmtest1>: State change 'Online' -> 'Offline'
[RCM] rcm::RcmApi::OfflineResource: (Virtual Machine vmtest1, 1)
[RCM] Res Virtual Machine vmtest1: Online -> WaitingToGoOffline( StateUnknown )
[RCM] TransitionToState(Virtual Machine vmtest1) Online-->WaitingToGoOffline.
[RCM] rcm::RcmGroup::UpdateStateIfChanged: (vmtest1, Online --> Pending)
[RCM] Res Virtual Machine vmtest1: WaitingToGoOffline -> OfflineCallIssued( StateUnknown )
[RCM] TransitionToState(Virtual Machine vmtest1) WaitingToGoOffline-->OfflineCallIssued.
[RCM] HandleMonitorReply: INMEMORY_NODELOCAL_PROPERTIES for 'Virtual Machine vmtest1', gen(0) result 0/0.








Query about DNS record TTL

Hi Team,

I have some questions about DNS records; please share your comments.

  • Refresh interval. Used to determine how often other DNS servers that load and host the zone must attempt to renew the zone.

  • Retry interval. Used to determine how often other DNS servers that load and host the zone are to retry a request for update of the zone each time that the refresh interval occurs.

  • Expire interval. Used by other DNS servers that are configured to load and host the zone to determine when zone data expires if it is not renewed.

  • Minimum TTL. This will be the minimum TTL of the records in the zone.

  • TTL for this record. This will be the TTL for the SOA record.
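
How the first three intervals interact on a secondary server can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original post: `SoaTimers`, `secondary_action`, and the timer values are hypothetical, and the logic follows the general primary/secondary behaviour described in RFC 1035 rather than any particular DNS server implementation.

```python
from dataclasses import dataclass


@dataclass
class SoaTimers:
    refresh: int  # seconds between checks of the primary's SOA serial
    retry: int    # seconds to wait after a failed refresh attempt
    expire: int   # seconds after which unrefreshed zone data is discarded


def secondary_action(t: SoaTimers, since_last_success: int, since_last_attempt: int) -> str:
    """Decide what a secondary does, given the seconds elapsed since its last
    successful refresh and since its last refresh attempt of any kind."""
    if since_last_success >= t.expire:
        return "expired: stop answering for the zone"
    if since_last_success < t.refresh:
        return "fresh: nothing to do"
    # The refresh is overdue; attempts are spaced by the retry interval.
    if since_last_attempt >= t.retry:
        return "query primary SOA now"
    return "wait for retry interval"


# Illustrative values only: 15 min refresh, 10 min retry, 1 day expire.
timers = SoaTimers(refresh=900, retry=600, expire=86400)
print(secondary_action(timers, since_last_success=1000, since_last_attempt=50))
```

With these values, a secondary that last refreshed 1000 seconds ago and last tried 50 seconds ago waits for the retry interval before asking the primary again.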

> Can someone please explain what the minimum and maximum TTL of resource records in DNS would be?

> Where can the maximum TTL of a DNS record be specified?

> What is the maximum TTL of the SOA record?

> If we use AD-integrated DNS, do we need to configure, or do we really use, the refresh interval, retry interval, and expire-after settings, since these settings are primarily used for the primary-secondary concept?

> Suppose we have not enabled DNS scavenging and a record is in a stale state: does the system use the stale record for name resolution, or will the server allow the client to renew the record?


Regards Sajin P S



Failover Cluster - Cluster IP Address is already in use

Hi All,

This weekend we had a healthy 2-node cluster get restarted. When the machines were brought back up, the cluster itself would not start, nor would its one role. Both the cluster and the role had the error "The Cluster IP Address is already in use". Yes, it was in use when the cluster itself was up and running, but now it's not running. To get the actual cluster working, I changed the IP of the Cluster Core Resources, restarted the cluster, and boom, the cluster itself came online. The role is a different story, as it is a SQL AG cluster.

I figure it would not be that hard to change the IPs, but since this is a pre-production box, I don't want to rely on that workaround when this goes into production.

Server A NIC IP : 10.172.193.89

Server A Cluster IP: 10.172.193.90

Server A Role/SQL AG IP: 10.172.193.91

Server B NIC IP : 10.172.193.89

Server B Cluster IP: 10.172.193.90

Server B Role/SQL AG IP: 10.172.193.91
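
To make the overlap in the list above explicit, the following Python sketch groups the addresses exactly as listed and reports every address that appears under more than one label. It is an illustration only; `find_shared` is a hypothetical helper. The cluster and role addresses are cluster resources that move between nodes, so seeing them under both servers is expected, whereas node NIC addresses are normally unique per node. Whether the identical NIC address is a typo in the post is not known.

```python
from collections import defaultdict

# Addresses exactly as listed in the post.
assignments = [
    ("Server A NIC", "10.172.193.89"),
    ("Server A Cluster", "10.172.193.90"),
    ("Server A Role/SQL AG", "10.172.193.91"),
    ("Server B NIC", "10.172.193.89"),
    ("Server B Cluster", "10.172.193.90"),
    ("Server B Role/SQL AG", "10.172.193.91"),
]


def find_shared(pairs):
    """Return a dict mapping each IP listed more than once to its labels."""
    owners = defaultdict(list)
    for label, ip in pairs:
        owners[ip].append(label)
    return {ip: labels for ip, labels in owners.items() if len(labels) > 1}


shared = find_shared(assignments)
for ip, labels in sorted(shared.items()):
    print(ip, "->", ", ".join(labels))
```

All three addresses show up under both servers in the list as posted, including the NIC address 10.172.193.89.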

Within the cluster events I get these errors:

Cluster IP address resource 'IP Address 10.172.193.91' cannot be brought online because a duplicate IP address '10.172.193.91' was detected on the network.  Please ensure all IP addresses are unique.

Cluster resource 'IP Address 10.172.193.91' of type 'IP Address' in clustered role 'CCTX-AG-01' failed. The error code was '0x13c1' ('The cluster IP address is already in use.').

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Lastly,

I disabled the NIC on the VM and tried to ping the same IPs, just in case someone had built other machines without me knowing, but I could not ping or communicate with those IPs. As soon as I enabled the NIC again, I could ping the IPs.

Thanks,

Jordan

Changing processor on existing Hyper-V cluster 2012 R2

Dear All,

We have a 2-node Hyper-V cluster; both nodes have 2 processors with 8 cores each, and we plan to upgrade to 2 processors with 12 cores each.

I just wanted to know whether this will cause any issues with the existing cluster configuration, as I will be changing the processors on each node one by one.


TechGUy,System Administrator.

How to add S2D disk to DTC resource in SQL cluster?

I have an Azure SQL cluster set up between 2 VMs and am trying to set up DTC as a resource within the SQL Server role.

I originally tried to set up DTC as a separate role, but found that I could not make that role dependent on the SQL Server role so it would follow the SQL role to the other node. This was a problem because DTC did not work if it was on a different node from SQL Server.

Why am I trying to set up a clustered DTC? Because, according to documentation I have read, if I don't use a clustered DTC then I can lose transactions that are in process when the cluster fails over.

Anyway, when I create a DTC resource under the SQL Server role I can add a dependency on the SQL Server name, but I have not found a way to add the S2D disk for DTC to use. This is apparently because of the way S2D disks are presented to the cluster. Under the SQL Server role I cannot see the disks assigned to the role, but I know they are there because the cluster is up and running and I have DBs on those disks.

I do not know if I need to assign the disk to DTC in any other way besides assigning a dependency on the disk to it.

If anyone can explain a solution or point me to documentation that addresses this, it will be greatly appreciated.

Thanks,

Chris

Hyper-V 2016 Failover Cluster - Cluster Aware Updating - Hostname Failing - FQDN Working

When setting up CAU, PSRemoting fails due to a Kerberos error saying that it can't reach the server. From what I can gather, when CAU goes through its process it tries to connect with just the hostname instead of the FQDN. When I use "Enter-PSSession" and connect via hostname, it fails, but when I use the FQDN it works.

Is there any way to have CAU use the FQDN instead of the hostname, or am I missing something else?

Thank you,

Steve

Creating the DHCP failover relation from remote server

Hi,

I am trying to establish a failover relationship between two Windows servers (w1 and w2, both running the 2012 version) with PowerShell commands like:
Add-DhcpServerv4Failover -ComputerName "dhcpserver.contoso.com" -Name "SFO-SIN-Failover" -PartnerServer 10.0.0.99 -ScopeId 10.10.10.0,10.20.20.0 -LoadBalancePercent 70 -MaxClientLeadTime 2:00:00 -AutoStateTransition $True -StateSwitchInterval 2:00:00

However, this command is executed on one of the Windows servers from an external (remote) Linux machine, with WinRM as the interface between Linux and Windows. WinRM carries the command from Linux and it gets executed on w1. At that point w1 tries to communicate with the other Windows server, w2, to check whether the failover relationship already exists, but it is unable to communicate with w2, and the stack trace gives me "permission denied":

Add-DhcpServerv4Failover : Failed to verify if a failover relationship by the name SFO-SIN-Failover exists on server 10.0.0.99.
At line:1 char:74
+ if (-not(Get-DhcpServerv4Failover | ? { $_.Name -eq "MSDHCP0037Peer"})){ Add-Dhc ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : PermissionDenied: (MSDHCP0037Peer:root/Microsoft/...erverv4Failover) [Add-DhcpServerv4Failover], CimException
    + FullyQualifiedErrorId : WIN32 5,Add-DhcpServerv4Failover

The above PowerShell command internally communicates with the partner server and checks whether a failover relationship with the same name already exists; if it does, it throws an exception, and if not, it proceeds to create it. But the first step, connecting to the partner server, is not happening, hence the "failed to verify" error. Are there any references for adding the failover relationship from another remote server?

Thanks



