Channel: High Availability (Clustering) forum

Windows NLB cluster: only one node ever becomes active, even after rebooting

Hi Team,

I have server A and server B running in a Windows NLB cluster with the Exchange 2013 CAS role. Antivirus is running on server A and is not installed on server B. The cluster IP currently directs clients to server A, the node with the antivirus.

To troubleshoot an issue I want to make server B the active node, to confirm whether the antivirus is causing the problem. The trouble is that even if I restart server A, server B only stays active until server A finishes rebooting; once server A is back up, it becomes the active node again. How do I make node B active?
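
For reference, here is a minimal sketch of one way to force traffic off node A (host names are placeholders for my servers), assuming the NetworkLoadBalancingClusters PowerShell module that ships with the NLB feature is available:

Import-Module NetworkLoadBalancingClusters

# Check which hosts are currently converged and handling traffic
Get-NlbClusterNode -HostName SERVERA

# Drain existing connections off node A and stop it, so new connections land on node B
Stop-NlbClusterNode -HostName SERVERA -Drain -Timeout 10

# ... test against node B, then bring node A back into the cluster ...
Start-NlbClusterNode -HostName SERVERA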

TechGUy,System Administrator.


CSV using S2D Storage Spaces Direct not working


Hi,

I have two servers running Server 2016 Data Center and I am trying to set up a high-availability fail-over environment using Storage Spaces Direct (S2D) for a bunch of Hyper-V VMs that will be running on the two servers.

I have created the cluster, enabled Storage Spaces Direct, created the S2D storage pool, created two virtual disks from the storage pool (formatted as ReFS) and then created a volume on each virtual disk. There is a file share witness configured on a local server. Cluster validation passes everything (except for an unsigned driver for the LogMeIn display mirror, and a complaint that one of the servers is on a slightly different version of the Windows Defender definitions).

On both servers, I can see that I have c:\clusterstorage\volume1\ and c:\clusterstorage\volume2\

However, when I shut one of the servers down, the other server loses visibility of one of the c:\clusterstorage\volumeX\ folders.

When I have VMs stored in c:\clusterstorage\volume1\ and ..\volume2\ and both nodes are up, everything is fine and I can do live migrations between hosts. But when one of the servers is shut down, the live migration happens, and then the other server loses sight of one of the c:\clusterstorage\volumeX\ folders and the VM locks up and stops working (not surprisingly).

It's my first time using S2D, so does this sound correct? I would have thought that even if I take one of the nodes down, the other node should still retain visibility of both its c:\clusterstorage\volume1\ and ..\volume2\ folders thanks to the magic of S2D.

Is there any troubleshooting I can do to figure out why this isn't working?

Possibly relevant: I am building this system in my lab for a client. The client has a single SBS 2011 server on their LAN at a different site. I have created a site-to-site VPN from my lab LAN to the customer's LAN. All NICs on the servers (excluding the 2 x 10 GbE NICs used for SMB/S2D/etc.) have their DNS settings pointing to the customer's DC over the site-to-site VPN. The Hyper-V hosts are joined to the customer's domain in this way.
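
In case it helps, here is a rough sketch of what I have been checking so far with the in-box Storage and FailoverClusters cmdlets (with only two nodes, my understanding is the volumes need two-way or better mirroring to stay online when a node is down):

# Resiliency of each S2D virtual disk (how many copies / how much redundancy)
Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, PhysicalDiskRedundancy, NumberOfDataCopies, HealthStatus

# Cluster-side view of the CSVs and which node owns them
Get-ClusterSharedVolume | Select-Object Name, OwnerNode, State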





Active Directory detached cluster creation failed - MSG_AUTH_PACKAGE::KerberosAuth failed with status 2148074254


I'm having trouble creating an Active Directory-detached cluster. From the log I can see there is a problem with authentication, but both users are local admin accounts with the same password. Here is an excerpt from the log file:

00001488.0000046c::2017/02/06-09:41:10.610 INFO  [ACCEPT] 0.0.0.0:~3343~: Accepted inbound connection from remote endpoint 10.32.6.201:~58602~.
00001488.00001890::2017/02/06-09:41:10.610 INFO  [SV] New real route: local (10.32.6.200:~3343~) to remote  (10.32.6.201:~58602~).
00001488.00001890::2017/02/06-09:41:10.610 INFO  [SV] Got a new incoming stream from 10.32.6.201:~58602~
00001488.00001890::2017/02/06-09:41:10.610 ERR   [SM] Sponsor: Setting package MSG_AUTH_PACKAGE::KerberosAuth failed with status 2148074254
00001488.00001890::2017/02/06-09:41:10.610 WARN  mscs::ListenerWorker::operator (): (-2146893042)' because of 'Status'
00001488.00001890::2017/02/06-09:41:13.892 INFO  [NM] Received request from client address PLSD001.
00001488.00001890::2017/02/06-09:41:13.892 WARN  [VER] Could not read version data from database for node PLSD002 (id 2).
00001488.00001890::2017/02/06-09:41:13.892 WARN  [VER] Could not read version data from database for node PLSD002 (id 2).

There is no Active Directory present and no DNS server.
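
For reference, this is a minimal sketch of how I understand a detached cluster is normally created (cluster name and address below are placeholders). From the documentation, an AD-detached cluster still registers its network name in DNS, so a reachable DNS server seems to be assumed even though no Active Directory is required:

New-Cluster -Name PLSDCLU01 -Node PLSD001, PLSD002 -StaticAddress 10.32.6.210 -AdministrativeAccessPoint DNS -NoStorage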

Can someone tell me what I am doing wrong?

Tnx, Robert.

Cluster Migration - Cluster operating system rolling upgrade vs. Setting up a new Cluster?



All: I currently have a Server 2012 R2 cluster running with two nodes. I have just received two new hosts/servers with Server 2016, and I was wondering what the best option would be: a rolling upgrade, or setting up a new cluster altogether and migrating the VMs to it.

Side note: I am worried about different CPU generations affecting live migration during the upgrade period (see the sketch after the notes). Only the hosts are being migrated at this time; storage migration will take place afterwards.

Notes on the migration 

1. The old Server 2012 R2 nodes would be retired or kept for DR use after the migration is complete.

2. Processors would still be Intel, but newer generations: 2014-era Xeons vs. 2018-era Xeons.

a. Current Cluster Processor Types - Intel Xeon E5-2640 v2

b. New Host Processor Types- Intel® Xeon® Silver 4116 
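
For the CPU concern, this is a hedged sketch of what I am considering (VM selection is just an example, and the setting requires the VMs to be off when changed): processor compatibility mode for migrations between different Intel generations, plus checking the cluster functional level during the rolling upgrade.

# Enable processor compatibility on VMs that are currently off
Get-VM | Where-Object State -eq 'Off' | Set-VMProcessor -CompatibilityForMigrationEnabled $true

# During a rolling upgrade the cluster stays at the 2012 R2 functional level
Get-Cluster | Select-Object Name, ClusterFunctionalLevel
# Update-ClusterFunctionalLevel   # only once every node is on 2016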

Always On VPN NLB/High Availability Solutions


Hello All

I have a question i'm hoping some one can answer

We are looking at implementing Always On VPN on behalf of our desktop support team, and I have been asked to advise on the design. It seems pretty straightforward until I hit the question of load balancing, and I am struggling to find advice on this. The issue I see is twofold:

1. The Remote Access servers. We have a Cisco network load balancer, so we could configure them as separate servers and use it to load balance from a network perspective, but my concern then is connection validation etc. (the risk of users swapping servers mid-connection and ending up in a validation loop). We could cluster this, but I'm not sure RAS supports clustering.

2. For the RADIUS/NPS side, having two separately configured servers isn't practical, and from what I have read I don't believe it would be a supported setup anyway. But I can't seem to find anything to confirm whether RADIUS/NPS supports clustering either (I suspect that if it does, it's just active/passive, not active/active as they want).

My preferred solution would be NLB for the RAS servers and a cluster for the RADIUS/NPS servers. Does anyone know if this is supported, or do I need to look at other options?

Storage Spaces Direct - Node Fault Tolerance


We have built a 3 node S2D cluster using SuperMicro 6029U-TRT SuperServer.  Each node has 2 Intel P4600 NVMe drives for cache and 8 10TB SATA drives (Seagate 3.5" 10TB,7.2K RPM,SATA 6Gb/s,256MB,512E,Helium) for capacity.

Everything ran through cluster validation and built fine.  When testing, the storage pool stays online if one node fails, but if you fail a second node the storage pool goes offline.

Running Get-StorageEnclosure shows 3 enclosures.

Using failover cluster manager, I can see the disks in each node.

This morning I rebuilt it, trying to change the cluster fault domain settings. When I re-enabled S2D on the cluster I got this message:

Performing operation 'Set rack fault tolerance on the S2D pool. This is normally recommended on setups with multiple
racks' on Target 'HVCL3'.

I thought that since I set each node to be in a different chassis and a different rack, the fault tolerance might have been fixed, but it still only withstood a single node failure.
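
Here is a sketch of the settings I have been comparing (in-box Storage and FailoverClusters cmdlets only). As far as I understand, with three nodes and three-way mirroring the pool is designed to survive one node failure at a time, so losing a second node before the first is back may simply be expected behaviour:

Get-StoragePool -FriendlyName "S2D*" | Select-Object FriendlyName, FaultDomainAwarenessDefault, HealthStatus

Get-VirtualDisk | Select-Object FriendlyName, ResiliencySettingName, NumberOfDataCopies, PhysicalDiskRedundancy, FaultDomainAwareness

# Fault domains as the cluster sees them (node / chassis / rack)
Get-ClusterFaultDomain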

The only other issue I have found close to mine is:
https://social.technet.microsoft.com/Forums/windowsserver/en-US/4fc1fb86-61fa-4976-8b3f-9e314586fef8/storage-spaces-direct-cluster-virtual-disk-goes-offline-when-rebooting-a-node?forum=winserverClustering


James - Right Size Solutions

Storage Spaces Direct -- virtual disks shown as "no redundancy" and "unhealthy"


For no apparent reason I have an S2D virtual disk with an OperationalStatus of "No Redundancy" and a HealthStatus of "Unhealthy". Running Repair-VirtualDisk gives this response:

PS C:\Users\administrator.PAO2K> get-virtualdisk "volume3" | repair-virtualdisk
repair-virtualdisk : There is not enough redundancy remaining to repair the virtual disk.
Activity ID: {a366982d-b602-4deb-8f6d-c8ec59c12217}
At line:1 char:29
+ get-virtualdisk "volume3" | repair-virtualdisk
+                             ~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (StorageWMI:ROOT/Microsoft/...SFT_VirtualDisk) [Repair-VirtualDisk], CimException
    + FullyQualifiedErrorId : StorageWMI 50001,Repair-VirtualDisk

PS C:\Users\administrator.PAO2K>

So how do I fix this? The virtual disk is tiered, with a three-way mirror performance tier and a dual-parity capacity tier, and it is an ReFS volume.
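
For context, a minimal sketch of the checks I am running alongside this (in-box Storage cmdlets; nothing here is a fix): looking for failed or missing physical disks and for repair jobs that are still outstanding before retrying the repair.

Get-PhysicalDisk | Sort-Object HealthStatus | Format-Table FriendlyName, SerialNumber, HealthStatus, OperationalStatus, Usage

Get-StorageJob          # any suspended/failed repair or regeneration jobs?

# Which physical disks back this particular virtual disk
Get-VirtualDisk -FriendlyName "volume3" | Get-PhysicalDisk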

Change IP address on Failover Cluster and Hosts


Hello,

So, I'm in a situation with a 2016 failover cluster with 2 hosts, where I want to change the IP addresses from:

cluster ip : 10.11.12.20

host1 ip : 10.11.12.21

host2 ip : 10.11.12.22

to

cluster ip : 10.40.12.20

host1 ip : 10.40.12.21

host2 ip : 10.40.12.22

How should I do this?

I've looked through these articles : 

https://blogs.technet.microsoft.com/chrad/2011/09/16/changing-hyper-v-cluster-virtual-ip-address-vip-after-layer-3-changes/

https://social.technet.microsoft.com/Forums/windowsserver/en-US/a640a75b-52e1-43c1-a1fb-acbc142c614a/how-to-change-hyperv-cluster-ip-address?forum=winserverClustering

http://bartvanvugt.blogspot.com/2012/01/change-ip-address-hyper-v-cluster.html

But I haven't found the entire solution yet - so hopefully someone in here can help me :)
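
This is the hedged sketch I have pieced together so far from those articles (resource names are the defaults on my cluster, and the subnet mask is an example, not verified): the cluster Network Name depends on an IP Address resource whose Address/SubnetMask parameters can be changed with Set-ClusterParameter, after which the node NICs (10.40.12.21 / .22) and cluster networks need updating as well.

Get-ClusterResource "Cluster IP Address" | Get-ClusterParameter

Get-ClusterResource "Cluster IP Address" | Set-ClusterParameter -Multiple @{ "Address" = "10.40.12.20"; "SubnetMask" = "255.255.255.0" }

# Cycle the resources so the change takes effect
Stop-ClusterResource  "Cluster IP Address"
Start-ClusterResource "Cluster IP Address"
Start-ClusterResource "Cluster Name"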


2-Node Stretch Cluster with Storage Replica?

I want to create a 2-node stretch cluster with Storage Replica. I was able to set up the replica, but once I create the cluster it breaks the storage replica; I'm not sure if that is by design or not. Is there something I am missing that prevents this type of scenario? If I create the cluster first, it does not let me create the replica: the error says it cannot find the volume on the source server, and that it must be a CSV on the cluster or added to a role on the cluster. Neither of those seems possible without the replica running between the cluster nodes.
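
For what it's worth, this is a rough sketch of the ordering I am now trying, based on my reading of the stretch-cluster documentation (all names, addresses, drive letters and replication group names below are placeholders): build the cluster first, put the source data disk into a group (CSV or clustered role), and only then create the partnership.

New-Cluster -Name SRCLUS01 -Node NODE1, NODE2 -StaticAddress 10.0.1.50

# Add the source data disk to Cluster Shared Volumes (or to a clustered role)
Add-ClusterSharedVolume -Name "Cluster Disk 1"

New-SRPartnership -SourceComputerName NODE1 -SourceRGName RG01 -SourceVolumeName D: -SourceLogVolumeName L: -DestinationComputerName NODE2 -DestinationRGName RG02 -DestinationVolumeName D: -DestinationLogVolumeName L: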

Storage Spaces Direct / Cluster Virtual Disk goes offline when rebooting a node


Hello

We have several hyper-converged environments based on HP ProLiant DL360/DL380.
We have 3-node and 2-node clusters running Windows Server 2016 with current patches; firmware updates are done and a witness is configured.

The following issue occurs on at least one 3-node and one 2-node cluster:
When we put one node into maintenance mode (correctly, as described in the Microsoft docs, and after checking that everything is fine) and reboot that node, it can happen that one of the cluster virtual disks goes offline. It is always the "Performance" disk with the SSD-only storage in each environment. The issue occurs only sometimes, not always: sometimes I can reboot the nodes one after the other several times in a row and everything is fine, but sometimes the disk "Performance" goes offline. I cannot bring this disk back online until the rebooted node comes back up. After the node that was down for maintenance is back online, the virtual disk can be brought online without any issues.
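
For reference, this is a sketch of the maintenance sequence we follow per node (node name is a placeholder; this is not presented as a fix): drain the node, confirm storage repair jobs have finished and all virtual disks are healthy, and only then reboot and move on to the next node.

Suspend-ClusterNode -Name NODE1 -Drain -Wait

# Rebuild/resync jobs must be finished and all virtual disks healthy before proceeding
Get-StorageJob
Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus

Restart-Computer -ComputerName NODE1 -Wait -For PowerShell

Resume-ClusterNode -Name NODE1 -Failback Immediate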

We have created 3 Cluster Virtual Disks & CSV Volumes on these clusters:
1x Volume with only SSD Storage, called Performance
1x Volume with Mixed Storage (SSD, HDD), called Mixed
1x Volume with Capacity Storage (HDD only), called Capacity

Disk Setup for Storage Spaces Direct (per Host):
- P440ar Raid Controller
- 2 x HP 800 GB NVME (803200-B21)
- 2 x HP 1.6 TB 6G SATA SSD (804631-B21)
- 4 x HP 2 TB 12G SAS HDD (765466-B21)
- No spare Disks
- Network Adapter for Storage: HP 10 GBit/s 546FLR-SFP+ (2 storage networks for redundancy)
- 3 Node Cluster Storage Network Switch: HPE FlexFabric 5700 40XG 2QSFP+ (JG896A), 2 Node Cluster directly connected with each other

Cluster Events Log is showing the following errors when the issue occurs:

Error 1069 FailoverClustering
Cluster resource 'Cluster Virtual Disk (Performance)' of type 'Physical Disk' in clustered role '6ca63b55-1a16-4bb2-ac53-2b23619e258a' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Warning 5120 FailoverClustering
Cluster Shared Volume 'Performance' ('Cluster Virtual Disk (Performance)') has entered a paused state because of 'STATUS_NO_SUCH_DEVICE(c000000e)'. All I/O will temporarily be queued until a path to the volume is reestablished.

Error 5150 FailoverClustering
Cluster physical disk resource 'Cluster Virtual Disk (Performance)' failed.  The Cluster Shared Volume was put in failed state with the following error: 'Failed to get the volume number for \\?\GLOBALROOT\Device\Harddisk10\ClusterPartition2\ (error 2)'

Error 1205 FailoverClustering
The Cluster service failed to bring clustered role '6ca63b55-1a16-4bb2-ac53-2b23619e258a' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

Error 1254 FailoverClustering
Clustered role '6ca63b55-1a16-4bb2-ac53-2b23619e258a' has exceeded its failover threshold.  It has exhausted the configured number of failover attempts within the failover period of time allotted to it and will be left in a failed state.  No additional attempts will be made to bring the role online or fail it over to another node in the cluster.  Please check the events associated with the failure.  After the issues causing the failure are resolved the role can be brought online manually or the cluster may attempt to bring it online again after the restart delay period.

Error 5142 FailoverClustering
Cluster Shared Volume 'Performance' ('Cluster Virtual Disk (Performance)') is no longer accessible from this cluster node because of error '(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

Any hints / inputs appreciated. Has anyone seen something similar?

Thanks in advance

Philippe



Hyper-V cluster - Performance problems with Cluster Shared Volume


Hi Experts,

We are in the process of implementing a SQL Server 2016 two-node (Node1 and Node2) traditional failover cluster on Hyper-V guest VMs, with cluster shared volumes and a disk witness quorum.

  • Each VM has two disks, C:\ and D:\
  • We have cluster shared volumes (cluster disks) of 250 GB, such as I:\ and J:\, for the SQL Server files
  • We have one cluster shared volume (cluster disk) of 25 GB for the quorum, as Q:\
  • All the local disks and the cluster volumes I:\ and Q:\ are carved out of the same 4 TB datastore available on the ESXi host

Now we are testing disk performance, and the results show poor performance on the cluster shared volumes compared to the local dedicated volumes.

What could be the problem, and are there any ways to improve cluster shared volume performance?
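
One small sketch of something worth checking on the cluster nodes (FailoverClusters module; this is only a first check, not a diagnosis): if a CSV is running in redirected I/O, writes travel over the cluster network to the coordinator node, which can make it look much slower than a local dedicated disk.

Get-ClusterSharedVolumeState | Format-Table Name, Node, StateInfo, FileSystemRedirectedIOReason, BlockRedirectedIOReason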


Ramesh M



Cluster Validation in Windows Server 2016


Gents,

I've upgraded a 3-node Hyper-V cluster from Windows Server 2012 R2 to Windows Server 2016.

And I'm a bit confused by the new interface of the "Validate Cluster" wizard.

I don't see any way to choose the disk for the storage tests, as there was in Windows Server 2012 R2.

In Win 2012 I could choose "Run all tests" and the wizard would ask which disk to use for failover testing. I don't see the same in Win 2016.

Can anyone explain: if I choose "Run all tests" in Win 2016, will there be disruption?

I don't see any explanation in the help. All articles are about Win 2012.
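
In the meantime, this is a hedged sketch of how I am trying the same choices from PowerShell instead of the wizard (the disk name is a placeholder, and I have not confirmed that this mirrors the 2016 wizard behaviour exactly):

Test-Cluster -List                      # list the available test names

# Run everything except the storage tests (no disruption to online disks)
Test-Cluster -Ignore "Storage"

# Or run the storage tests against one specific available-storage disk
Test-Cluster -Include "Storage" -Disk "Cluster Disk 5"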

Thanks


mcse^4




Problem with failover clustering and iSCSI target in Windows Server 2016


Hi

I have a problem with failover clustering and iSCSI target in Windows Server 2016.

I often demo the configuration of Failover Clustering for students and customers. And for that I have built a small lab consisting of a Domain Controller and three servers functioning as my cluster nodes.

All Servers run Windows Server 2016 (1607) and have been fully updated with the latest patches. All servers are virtual machines running in Hyper-V on Windows 10 v1709/1803.

The Domain Controller functions as a Domain Controller, and I have also added the iSCSI Target Server role service on it as well. And yes, I know that is not the way to do it, but this is only a test/demo environment. I have created two iSCSI disks, one for the witness (1 GB) and one for shared storage (100 GB).

My three nodes also run Windows Server 2016 (1607) and are configured with two network cards, one for management (LAN) and one for cluster traffic. I have installed the Failover Clustering feature on these servers and am using the built-in iSCSI initiator to connect to the shared storage published by my iSCSI target running on the Domain Controller. And this is where I run into the first problem.

In order to give the nodes (servers) access to the storage, they need to be added to the Initiators list from within Server Manager -> File and Storage Services -> iSCSI. If I choose the Query initiator computer for ID option and then click Browse (in order to browse for the names of the nodes), Server Manager crashes and I get a window stating that Server Manager has stopped working. If I look in the Event Viewer I see the following events:

Log Name:     Application

Source:       .NET Runtime

Date:         04-06-2018 17:36:52

Event ID:     1026

Task Category: None

Level:        Error

Keywords:     Classic

User:         N/A

Computer:     cph-dc-01.ad.petzfeed.com

Description:

Application: ServerManager.exe

Framework Version: v4.0.30319

Description: The process was terminated due to an unhandled exception.

 

Application: ServerManager.exe

Framework Version: v4.0.30319

Description: The process was terminated due to an unhandled exception.

Exception Info: System.ArgumentException

  at Microsoft.FileServer.Management.Plugin.Dialogues.BrowseDSObjectsNativeMethods+IDsObjectPicker.Initialize(DSOP_INIT_INFO ByRef)

  at Microsoft.FileServer.Management.Plugin.Dialogues.BrowseDSObjectsDialog.ShowDialog(System.Windows.Forms.IWin32Window, PickerTypes, Boolean, StartScope, ProviderTypes, System.String, System.Security.SecureString)

  at Microsoft.FileServer.Management.Plugin.Services.DialogService.ShowPickDialog(PickerTypes, Microsoft.FileServer.Management.Framework.ComputerName, StartScope, ReturnSourceTypes, System.String ByRef)

  at Microsoft.FileServer.Management.Plugin.Services.DialogService.ShowPickComputerDialog(Microsoft.FileServer.Management.Framework.ComputerName, Microsoft.FileServer.Management.Framework.ComputerName ByRef)

  at Microsoft.FileServer.Management.Plugin.Dialogues.AddInitiatorIdSectionDescriptor+<>c__DisplayClass41_0.<.ctor>b__0()

  at MS.Internal.Commands.CommandHelpers.CriticalExecuteCommandSource(System.Windows.Input.ICommandSource, Boolean)

  at System.Windows.Controls.Primitives.ButtonBase.OnClick()

  at System.Windows.Controls.Button.OnClick()

  at System.Windows.Controls.Primitives.ButtonBase.OnMouseLeftButtonUp(System.Windows.Input.MouseButtonEventArgs)

  at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)

  at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)

  at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)

  at System.Windows.UIElement.ReRaiseEventAs(System.Windows.DependencyObject, System.Windows.RoutedEventArgs, System.Windows.RoutedEvent)

  at System.Windows.UIElement.OnMouseUpThunk(System.Object, System.Windows.Input.MouseButtonEventArgs)

  at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)

  at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)

  at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)

  at System.Windows.UIElement.RaiseEventImpl(System.Windows.DependencyObject, System.Windows.RoutedEventArgs)

  at System.Windows.UIElement.RaiseTrustedEvent(System.Windows.RoutedEventArgs)

  at System.Windows.Input.InputManager.ProcessStagingArea()

  at System.Windows.Input.InputManager.ProcessInput(System.Windows.Input.InputEventArgs)

  at System.Windows.Input.InputProviderSite.ReportInput(System.Windows.Input.InputReport)

  at System.Windows.Interop.HwndMouseInputProvider.ReportInput(IntPtr, System.Windows.Input.InputMode, Int32, System.Windows.Input.RawMouseActions, Int32, Int32, Int32)

  at System.Windows.Interop.HwndMouseInputProvider.FilterMessage(IntPtr, MS.Internal.Interop.WindowMessage, IntPtr, IntPtr, Boolean ByRef)

  at System.Windows.Interop.HwndSource.InputFilterMessage(IntPtr, Int32, IntPtr, IntPtr, Boolean ByRef)

  at MS.Win32.HwndWrapper.WndProc(IntPtr, Int32, IntPtr, IntPtr, Boolean ByRef)

  at MS.Win32.HwndSubclass.DispatcherCallbackOperation(System.Object)

  at System.Windows.Threading.ExceptionWrapper.InternalRealCall(System.Delegate, System.Object, Int32)

  at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(System.Object, System.Delegate, System.Object, Int32, System.Delegate)

  at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(System.Windows.Threading.DispatcherPriority, System.TimeSpan, System.Delegate, System.Object, Int32)

  at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr, Int32, IntPtr, IntPtr)

  at MS.Win32.UnsafeNativeMethods.DispatchMessage(System.Windows.Interop.MSG ByRef)

  at System.Windows.Threading.Dispatcher.PushFrameImpl(System.Windows.Threading.DispatcherFrame)

  at System.Windows.Window.ShowHelper(System.Object)

  at System.Windows.Window.ShowDialog()

  at Microsoft.FileServer.Management.Plugin.Services.DialogService.ShowAddInitiatorIdDialog(Microsoft.FileServer.Management.Framework.ComputerName, Microsoft.FileServer.Management.Plugin.Model.InitiatorId ByRef)

  at Microsoft.FileServer.Management.Plugin.PropertyPages.IscsiTargetInitiatorsPropertySectionDescriptor.<.ctor>b__4_0()

  at MS.Internal.Commands.CommandHelpers.CriticalExecuteCommandSource(System.Windows.Input.ICommandSource, Boolean)

  at System.Windows.Controls.Primitives.ButtonBase.OnClick()

  at System.Windows.Controls.Button.OnClick()

  at System.Windows.Controls.Primitives.ButtonBase.OnMouseLeftButtonUp(System.Windows.Input.MouseButtonEventArgs)

  at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)

  at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)

  at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)

  at System.Windows.UIElement.ReRaiseEventAs(System.Windows.DependencyObject, System.Windows.RoutedEventArgs, System.Windows.RoutedEvent)

  at System.Windows.UIElement.OnMouseUpThunk(System.Object, System.Windows.Input.MouseButtonEventArgs)

  at System.Windows.RoutedEventArgs.InvokeHandler(System.Delegate, System.Object)

  at System.Windows.RoutedEventHandlerInfo.InvokeHandler(System.Object, System.Windows.RoutedEventArgs)

  at System.Windows.EventRoute.InvokeHandlersImpl(System.Object, System.Windows.RoutedEventArgs, Boolean)

  at System.Windows.UIElement.RaiseEventImpl(System.Windows.DependencyObject, System.Windows.RoutedEventArgs)

  at System.Windows.UIElement.RaiseTrustedEvent(System.Windows.RoutedEventArgs)

  at System.Windows.Input.InputManager.ProcessStagingArea()

  at System.Windows.Input.InputManager.ProcessInput(System.Windows.Input.InputEventArgs)

  at System.Windows.Input.InputProviderSite.ReportInput(System.Windows.Input.InputReport)

  at System.Windows.Interop.HwndMouseInputProvider.ReportInput(IntPtr, System.Windows.Input.InputMode, Int32, System.Windows.Input.RawMouseActions, Int32, Int32, Int32)

  at System.Windows.Interop.HwndMouseInputProvider.FilterMessage(IntPtr, MS.Internal.Interop.WindowMessage, IntPtr, IntPtr, Boolean ByRef)

  at System.Windows.Interop.HwndSource.InputFilterMessage(IntPtr, Int32, IntPtr, IntPtr, Boolean ByRef)

  at MS.Win32.HwndWrapper.WndProc(IntPtr, Int32, IntPtr, IntPtr, Boolean ByRef)

  at MS.Win32.HwndSubclass.DispatcherCallbackOperation(System.Object)

  at System.Windows.Threading.ExceptionWrapper.InternalRealCall(System.Delegate, System.Object, Int32)

  at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(System.Object, System.Delegate, System.Object, Int32, System.Delegate)

  at System.Windows.Threading.Dispatcher.LegacyInvokeImpl(System.Windows.Threading.DispatcherPriority, System.TimeSpan, System.Delegate, System.Object, Int32)

  at MS.Win32.HwndSubclass.SubclassWndProc(IntPtr, Int32, IntPtr, IntPtr)

  at MS.Win32.UnsafeNativeMethods.DispatchMessage(System.Windows.Interop.MSG ByRef)

  at System.Windows.Threading.Dispatcher.PushFrameImpl(System.Windows.Threading.DispatcherFrame)

  at System.Windows.Application.RunDispatcher(System.Object)

  at System.Windows.Application.RunInternal(System.Windows.Window)

  at Microsoft.Windows.ServerManager.SingleInstanceAppLauncher.StartApplication(Microsoft.Windows.ServerManager.Common.ArgumentsProcessor)

  at Microsoft.Windows.ServerManager.MainApplication.Main(System.String[])

 

And this error as well:

Faulting application name: ServerManager.exe, version: 10.0.14393.1358, time stamp: 0x593272e2

Faulting module name: KERNELBASE.dll, version: 10.0.14393.1532, time stamp: 0x5965ac8c

Exception code: 0xe0434352

Fault offset: 0x0000000000033c58

Faulting process id: 0xf2c

Faulting application start time: 0x01d3fc19440b98b8

Faulting application path: C:\Windows\system32\ServerManager.exe

Faulting module path: C:\Windows\System32\KERNELBASE.dll

Report Id: a9a4656c-453c-4ef6-8d83-165ee783cbfd

Faulting package full name:

Faulting package-relative application ID:

 

If I choose the Enter a value for the selected type option, select DNS and browse for the machine name, the same thing happens - Server Manager crashes and the same events are reported in Event Viewer.

 

However, I can work around this issue by using the IP addresses of my three nodes instead and adding them to the Initiators list. When all that is done, my three nodes have no problem connecting to the iSCSI storage.
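
For completeness, a PowerShell sketch of the same workaround on the DC/iSCSI target that avoids the crashing Browse dialog entirely (the target name and addresses below are examples from my lab, not the real values):

Import-Module IscsiTarget

Get-IscsiServerTarget

Set-IscsiServerTarget -TargetName "ClusterStorage" -InitiatorIds "IPAddress:192.168.1.11", "IPAddress:192.168.1.12", "IPAddress:192.168.1.13"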

 

The real problem comes when I try to run the Validate a Configuration wizard from within Failover Cluster Manager. When I click Browse (in order to search for the servers I want to validate), the wizard closes without any visible errors. In the Event Viewer the following error is reported:

 

Log Name:     Microsoft-Windows-FailoverClustering-Manager/Admin

Source:       Microsoft-Windows-FailoverClustering-Manager

Date:         04-06-2018 18:12:52

Event ID:     4709

Task Category: MMC Snapin

Level:        Error

Keywords:     

User:         AD\mbucadmin

Computer:     cph-hv-01.ad.petzfeed.com

Description:

Failover Cluster Manager encountered a fatal error.

 

System.ApplicationException: Unable to browse for computer objects. ---> System.ArgumentException: Value does not fall within the expected range.

  at MS.Internal.ServerClusters.NativeMethods.IDsObjectPicker.Initialize(DSOP_INIT_INFO& pInitInfo)

  at MS.Internal.ServerClusters.BrowseDSObjectsDialog.ShowDialog(IWin32Window owner, PickerTypes pickerType, Boolean multipleSelect)

  at MS.Internal.ServerClusters.BrowseDSObjectsDialog.ShowPickComputerDialog(IWin32Window owner, Boolean multipleSelect)

  --- End of inner exception stack trace ---

  at MS.Internal.ServerClusters.BrowseDSObjectsDialog.ShowPickComputerDialog(IWin32Window owner, Boolean multipleSelect)

  at MS.Internal.ServerClusters.Wizards.SelectItemsPage.OnBrowseClicked(Object sender, EventArgs e)

  at System.Windows.Forms.Control.OnClick(EventArgs e)

  at System.Windows.Forms.Button.OnClick(EventArgs e)

  at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)

  at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)

  at System.Windows.Forms.Control.WndProc(Message& m)

  at System.Windows.Forms.ButtonBase.WndProc(Message& m)

  at System.Windows.Forms.Button.WndProc(Message& m)

  at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)

 

System.ArgumentException: Value does not fall within the expected range.

  at MS.Internal.ServerClusters.NativeMethods.IDsObjectPicker.Initialize(DSOP_INIT_INFO& pInitInfo)

  at MS.Internal.ServerClusters.BrowseDSObjectsDialog.ShowDialog(IWin32Window owner, PickerTypes pickerType, Boolean multipleSelect)

  at MS.Internal.ServerClusters.BrowseDSObjectsDialog.ShowPickComputerDialog(IWin32Window owner, Boolean multipleSelect)

 

Again, I can work around this by typing the names of my nodes and adding them one at a time by clicking Add. The validation then begins, and the List Disks To Be Validated test fails every time with the following error:

 

* Failed while verifying removal of any Persistent Reservation on physical disk {24dea968-82e9-4dbf-aacd-a0fe00236632} at node cph-hv-01.ad.petzfeed.com.

 * Failed while verifying removal of any Persistent Reservation on physical disk {24dea968-82e9-4dbf-aacd-a0fe00236632} at node cph-hv-01.ad.petzfeed.com.

 

 

If I just ignore the error and create the cluster anyway, it doesn't function properly and complains about access to the storage.

 

Here is what I have tried:

  • Recreating the virtual machines using different installation media, but that makes no difference.
  • Using three different Windows 10 machines to run the virtual machines, but that doesn't help either.
  • Using two different Synology boxes as the iSCSI target, but it is the same.
  • Using both Gen 1 and Gen 2 virtual machines, with and without Secure Boot enabled.

 

But the really funny part is that if I use the Windows Server 2016 RTM media (1607) and choose not to patch (update) the nodes, everything works as expected. If I apply the updates after the cluster is up and running, it seems to work okay. If I update the servers before I set up the cluster, I run into the issues described above. If I set up the same environment using Windows Server 2012 R2, it just works.

 

Is anyone out there able to reproduce what I am seeing, or has anyone seen this problem before?

 


Windows 10 offline network drive file server cluster name nightmare


Hello,

My initial thread : https://social.technet.microsoft.com/Forums/en-US/651ed135-e72a-4371-838e-a8670c5070c2/windows-10-offline-network-drive-nightmare?forum=win10itpronetworking

To summarize  :

My problem only happens on W10 (no matter the hardware, domain joined or not, software installed, or build version) AND with a network drive mapped to the cluster name of our 2012 R2 file servers (it works OK with a drive mapped directly to any node of the cluster).

Here is the problem: when a workstation has a network drive mounted and is disconnected from the network (cable unplugged), the OS constantly tries to reach the file server instead of marking the drive as disconnected after a few attempts. The PC becomes slow and unresponsive until the network comes back. Since we have, for example, Word configured to save files to the network drive by default, users are unable to save a document because Word just waits forever for the network drive to become accessible again.

Regards

Gracefully/soft shutdown of Windows server 2012 R2


I use a Windows Hyper-V cluster with Windows Server 2012 R2.

If the Windows Server is not in clustered mode, then the graceful shutdown (soft shutdown) works well.

If the Windows Server is in clustered mode, then the graceful shutdown does not work.

In clustered mode, the graceful shutdown works only during the first 2 hours after the last OS shutdown or reboot. This could be an authentication timeout.

The local security policy "Shutdown: Allow system to be shut down without having to log on" has no influence on this behavior.

With VMware ESXi 6.7, graceful shutdown is OK.

To trigger the graceful shutdown I use Cisco UCSM or an IPMI tool, which sends the soft shutdown signal via Cisco CIMC and ACPI to the Windows OS.

How can I trace (examine) the ACPI soft shutdown signal on the Windows Server side?
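
This is not a direct ACPI trace, but here is a small sketch of two things I have been looking at on the Windows side (assuming the FailoverClusters module is installed on the node): shutdown requests that actually reach Windows are logged by User32 as event 1074 in the System log, and in clustered mode the cluster service drains roles before shutdown for a configurable amount of time.

Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 1074 } -MaxEvents 10 | Format-List TimeCreated, Message

# How long the cluster service is allowed to take when draining roles at shutdown
(Get-Cluster).ShutdownTimeoutInMinutes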

last update: 2018-07-02

Windows 2012 R2 Cluster Fails - RDM Disks


Hi Everyone,

We have a 2-node Windows 2012 R2 failover cluster configured with shared RDMs running on VMware. All of a sudden, the resources failed over to the other node.

Below is the sequence of events that was triggered.

Ownership of cluster disk 'Cluster Disk 3' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.

Cluster resource 'Cluster Disk 3' of type 'Physical Disk' in clustered role 'XXXX' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

Ownership of cluster disk 'Cluster Disk 2' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.

Ownership of cluster disk 'Cluster Disk 4' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.

Can someone let me know what actually happened here?
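
If it helps anyone advise, this is a sketch of how I plan to start digging (the time span and paths are just examples): generate the cluster debug log around the failure window and look at the Physical Disk resource entries, plus the host's System log for disk/LUN errors at the same time.

Get-ClusterLog -Destination C:\Temp -UseLocalTime -TimeSpan 60

Get-WinEvent -FilterHashtable @{ LogName = 'System'; ProviderName = 'disk' } -MaxEvents 20 | Format-List TimeCreated, Id, Message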


Cluster with two nodes


Hello...

I want to configure a cluster on two servers with Windows Server 2012 R2. Can I configure the cluster without shared storage?

The storage would be on the servers' local disks, with the data replicated between them. Does it work like this?

S2D disk performance problem - grinding to a halt.


Hi All,

I've recently built a 2016 S2D 4-node cluster and have run into major issues with disk performance:

  • barely getting kb/s throughput (yep, kilo and a small b - dial-up modem speeds for disk access)
  • VMs are unresponsive
  • multiple other issues associated with disk access to the CSVs

The hardware is all certified and as per Lenovo's most recent guidelines. Servers are ThinkSystem SR650; the networking is 100 Gb/s with 2x Mellanox ConnectX-4 adapters per node and 2x Lenovo NE10032 switches; each node has 12x Intel SSDs and 2x Intel NVMe drives for the storage pool. RoCE/RDMA, DCB etc. are all configured as per the guidelines and verified (as far as I can diagnose). It should be absolutely flying along.

I should point out that it was working OK (though with no thorough testing done) for approximately one week. The VMs (about 10 or so) were running fine, and any file transfers performed were limited by the Gb/s connectivity to the file share source (on older equipment served by a 10 Gb/s switch uplink and 1 Gb/s NIC connections at the source).

At about 3pm yesterday I decided to configure Cluster-Aware Updating, and this may or may not have been a factor. The servers were already fully patched with the exception of 2 updates: KB4284833 and a definition update for Defender. These were installed and a manual reboot was performed one node at a time. Ever since, I've had blue screens, nodes/pools/CSVs failing over, and almost non-existent disk throughput. There are no other significant errors in the event logs; there have been cluster alerts as things go down, but nothing that has led to a Google/Bing search for a solution. The immediate thought is going to be "it was KB4284833 what done it", but I'm not certain that is the cause.

Interestingly, when doing a file copy to/from the CSV volumes there is an initial spurt of disk throughput (nowhere near as fast as it should be - say up to 100 MB/s, but it could equally be as low as 7 MB/s) and then it dies off to kB/s and effectively 0. So it looks like there is some sort of cache that works to some extent, and then nothing.

I've been doing a lot of research for the past 24 hours or so - no smoking guns. I did find someone with similar issues that were traced back to the power mode settings; I've since set these to High Performance (rather than the default Balanced) but have seen no change (might be worth another reboot to double-check this though - will do that shortly).
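
These are some of the health checks I have been running while troubleshooting (a sketch only; nothing here is a fix): outstanding storage repair jobs, the S2D health report, and whether RDMA/SMB Direct is actually in use between the nodes.

Get-StorageJob                                         # stuck repair/regeneration jobs?

Get-StorageSubSystem Cluster* | Get-StorageHealthReport

Get-NetAdapterRdma | Format-Table Name, Enabled

Get-SmbMultichannelConnection                          # RDMA-capable paths should be listed per node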

Any suggestions or similar experience? 

Thanks for any help.

Event ID 80 Hyper-V-Shared-VHDX in SQL Server


Dears

We have a VM running on Hyper-V 2012 R2, and we are facing this issue after a backup runs:

Event ID 80 Hyper-V-Shared-VHDX

Error attaching to volume. Volume: \Device\HarddiskVolumeShadowCopy62. Error: The specified request is not a valid operation for the target device..
Error attaching to volume. Volume: \Device\HarddiskVolumeShadowCopy61. Error: The specified request is not a valid operation for the target device..
This issue is happening on the clustered servers; we are using a Veritas application for backup.

We keep the VM and its storage together on the same node, but we are still facing the same issue.

https://www.experts-exchange.com/questions/29003325/Error-Log-in-Microsoft-Hyper-V-Shared-VHDX-section-after-backup.html

https://forums.veeam.com/veeam-backup-replication-f2/failed-to-invoke-func-t44712.html

We need to solve this issue. There is no impact from the backup itself, but we want to know why this happens only after backup, and only in a cluster.

Regards


Cluster Client access point and generic application resources disappeared


We have a Windows 2008 R2 cluster hosting some generic applications, like Apache services and some OpenText applications. It uses node and disk majority with 2 nodes and a witness disk.

Yesterday we were removing some disks which were part of this Windows 2008 R2 cluster. While removing disks, the process hung on a disk which was mounted on another cluster disk (another resource in the same application group). After five minutes, all of the disks (46 disks) were moved to Available Storage, and when I checked my application group, I found it was showing just two resources, one cluster disk and one Apache application - both in a failed state.

There was a client access point with a name and IP address, and it is now missing, along with another two generic applications which were part of this application group. Also, all cluster disks which were part of this application group have now been moved to Available Storage. I have checked the cluster registry key and can find registry values for all those missing resources in the Resources hive.

I am still able to ping the client access point by name and IP, but I cannot find anything in the application group. I have tried cluster restarts and node restarts, but it still shows the same status. I have tried failover; though it failed over between the two nodes, the status is the same.

