CAU: Analyze Cluster Passes all nodes except local node

December 11, 2013, 11:31 am

No matter which node I run Test-CAUSetup on, it passes all nodes except the node running the test. Specifically, it fails the remote management tests via WMIv2 and PowerShell remoting, throwing WSManFault code 2150859027.
This is a Hyper-V cluster managed by an SC 2012 SP1 VMM server. For security purposes, VMM does not use the default remote management HTTP port 5985.
On each node, I've set the PowerShell execution policy to Unrestricted, enabled the two remote management firewall rules, and opened HTTP port 5985 with the following command: winrm create winrm/config/Listener?Address=IP:hyperV_mgt_Ip+Transport=HTTP '@{Port="5985"}'.
Can I safely ignore this issue and press forward with CAU, or does it need to be addressed? If the latter, how? Should I create an HTTPS listener on port 5986?
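
Error references usually list WinRM fault codes in hexadecimal, so converting the decimal value quoted above can make it easier to search for. A minimal sketch in Python, used here only for the arithmetic (it is not part of CAU or WinRM, and it says nothing about the cause of the fault):

```python
# Convert the decimal WSManFault code from the post into the
# hexadecimal form that error references typically use.
fault_code = 2150859027  # value reported in the post

hex_code = f"0x{fault_code:08X}"
print(hex_code)  # 0x80338113
```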

Thank You

↧

One VM Cluster Resources Regularly Failing

December 11, 2013, 3:49 pm

Hi All,

We run hundreds of Windows and Linux VMs in clustered and non-clustered environments. However, we're having issues with one particular VM that regularly restarts itself. The environment the problem VM is running in is a Windows 2012 R2 cluster.

The event log within the VM provides no BSOD information. The only entry of note is:

The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.

Therefore, I don't believe that the actual OS within the VM (Windows 2008 R2) is crashing.

The cluster log shows only a single entry:

Cluster resource 'Virtual Machine vps.xxxxxx.com' of type 'Virtual Machine' in clustered role 'vps.xxxxxx.com' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

How can I debug this? I've added extra RAM to the VM and moved it to other nodes and other storage, but the problem continues to occur.

Thanks

Will

↧

win server 2012 two node cluster, local "cliuser" issue

December 10, 2013, 8:05 am

Hello,

I have a two-node Windows Server 2012 STN Cluster with a few SQL instances installed on it. Recently I have been seeing these errors in the security event log on both nodes:

An attempt was made to reset an account's password.

Subject:
Security ID: SYSTEM
Account Name: <>$
Account Domain: <>
Logon ID: 0x3E7

Target Account:
Security ID: localmachinename\CLIUSR
Account Name: CLIUSR
Account Domain: localmachinename

When I look at the local account on both nodes, I see that the password is set to never expire and cannot be reset. I am quite confused, then, as to how the above could happen. Any advice or ideas would be greatly appreciated.

Thank you

↧

Getting SMB Witness Client Errors in Eventlog on Witness Disk Clusters

December 10, 2013, 12:45 pm

Hi there.

I saw this on my third Windows Server 2012 Hyper-V Failover Cluster.

There are a lot of SMB-Witness client error messages in the event logs of all nodes.

The problem is that these are Failover Clusters without SMB Witness Shares.

They have the "classic" Witness Disk attached. Quorum Configuration is "Node and Disk Majority (Witness Disk)"

I never use Witness Shares!

The nodes run Windows Server 2012. All current Windows updates are installed, and I've also applied the recommended cluster and Hyper-V hotfixes from the TechNet articles to get the nodes as up to date as possible.

Thanks in advance

Olaf

Regards
Olaf

↧

Windows 2008 Cluster question on using a new cluster drive source from shrinking existing disk

December 9, 2013, 10:27 am

I have a two-node Windows 2008 R2 Enterprise SP1 cluster. It has a basic cluster setup of one quorum disk (Q:) and one data disk (E:), which is 2.7 TB in size. The cluster is connected to a shared Dell disk array.

My question: can I safely shrink the 2.7 TB drive, carve out 500 GB from the same disk, and use that as a new cluster disk resource? We want to install Globalscape SFTP software on this new disk for use as a cluster resource.

Will this work without crashing the cluster?

Thanks,

Gonzolean

↧

Can't access Failover Cluster Manager in Windows 2012 R2 Cluster - The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server

November 19, 2013, 5:48 pm

Hello!

My client has a two-node Windows Server 2012 R2 Hyper-V Cluster which has been running for a month or two. The other day I found that from the Hyper-V1 host I couldn't run the Failover Cluster Manager - I got an error "The RPC Server is Unavailable - Exception from HRESULT: 0x800706BA". Then I tried to log into the other Hyper-V Host (Hyper-V2) and found that I couldn't log in using domain credentials - I had to log in locally. On this Hyper-V2 server I saw errors in the System event log:

The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server hyper-v2$. The target name used was HYPER-V2$. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Ensure that the target SPN is only registered on the account used by the server. This error can also happen if the target service account password is different than what is configured on the Kerberos Key Distribution Center for that target service. Ensure that the service on the server and the KDC are both configured to use the same password. If the server name is not fully qualified, and the target domain (XXXX.XXXX.com) is different from the client domain (XXXX.XXXX.COM), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.
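
For reference, the HRESULT quoted in the Failover Cluster Manager error (0x800706BA) can be split into its standard bit fields. The sketch below assumes the documented HRESULT layout (failure bit, 11-bit facility, 16-bit code); the helper name `decode_hresult` is made up for illustration. Facility 7 is FACILITY_WIN32, and Win32 error 1722 is RPC_S_SERVER_UNAVAILABLE, which matches the "RPC Server is Unavailable" text in the message.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into failure flag, facility, and code."""
    return {
        "failure": bool((hr >> 31) & 1),   # top bit set means failure
        "facility": (hr >> 16) & 0x7FF,    # 11-bit facility field
        "code": hr & 0xFFFF,               # low 16 bits: the error code
    }

fields = decode_hresult(0x800706BA)
print(fields)  # {'failure': True, 'facility': 7, 'code': 1722}
```

This only confirms that the GUI error is a wrapped RPC failure; it does not explain why the RPC endpoint is unreachable.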

I've done quite a bit of searching but I didn't find this exact scenario. And because this is a very busy season for my client I'd like to try to resolve this issue with minimal impact to the running VMs. As of right now all of the VMs are working - I just can't manage them from the Failover Cluster Manager (which means that I can't even run live-migrations). I feel pretty much stuck.

Can anyone give me any (safe) guidance on how to proceed? I'd certainly appreciate the help!!!

dave

↧

event id -5120

December 12, 2013, 3:14 pm

Hi All,

Can anyone help me resolve this issue? I am not getting a 100% validation report for my cluster configuration...

Cluster Shared Volume 'Volume1' ('Cluster Disk 1') is no longer available on this node because of 'STATUS_MEDIA_WRITE_PROTECTED(c00000a2)'. All I/O will temporarily be queued until a path to the volume is reestablished.

↧

Cluster Disk i/o Timeout

November 30, 2013, 11:27 am

Hi,

We are stuck on a problem with our private cloud protection. Whenever DPM tries to back up virtual machines, the Cluster Shared Volume hits an I/O timeout and the disk disappears for a moment, which causes my virtual machines to reboot unexpectedly and move to different nodes of the cluster.

The configuration of my infrastructure is as follows:

1. Windows Server 2012 cluster, 6 nodes

2. DPM 2012 SP1

Event generated when the backup starts:

Cluster Shared Volume 'Volume2' ('CSV 7TB Cluster Disk Production') is no longer available on this node because of 'STATUS_CLUSTER_CSV_AUTO_PAUSE_ERROR(c0130021)'. All I/O will temporarily be queued until a path to the volume is reestablished.
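
The value in parentheses in that event (c0130021) is an NTSTATUS code. As a rough aid to reading it, the sketch below splits it into severity, facility, and code, assuming the documented NTSTATUS layout (2-bit severity, 12-bit facility, 16-bit code). The helper name `decode_ntstatus` is hypothetical. To my knowledge, facility 0x13 is the one reserved for cluster status codes, which is consistent with the CSV symbol name in the message.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into severity, facility, and code."""
    severity_names = ["success", "informational", "warning", "error"]
    return {
        "severity": severity_names[(status >> 30) & 0x3],  # top two bits
        "facility": (status >> 16) & 0xFFF,                # 12-bit facility
        "code": status & 0xFFFF,                           # low 16 bits
    }

fields = decode_ntstatus(0xC0130021)
print(fields)  # {'severity': 'error', 'facility': 19, 'code': 33}
```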

I have applied the hotfix Microsoft recommends:

http://support.microsoft.com/kb/2813630/en-us

I have also disabled ODX, because my storage doesn't support this feature.

Please help me resolve this matter.

Best Regards,

Muzammil

Muzammil Ubaray

↧

Cluster Shared Volume is no longer accessible from cluster node

November 30, 2013, 9:40 pm

Hello,

We have a 3-node Hyper-V cluster running Windows Server 2012. Recently we started getting the error below intermittently on a node, and the VMs running on that host and LUN power off.

Alert: Cluster Shared Volume is no longer accessible from cluster node
Source: Cluster Service
Path: HV01.itl.local
Last modified by: System
Last modified time: 12/1/2013 12:27:18 AM
Alert description: Cluster Shared Volume 'Volume1' ('Cluster_Vol1_R6') is no longer accessible from this cluster node because of error 'ERROR_TIMEOUT(1460)'. Please troubleshoot this node's connectivity to the storage device and network connectivity.

The only recent change is that we installed Veeam on a test basis for DR replication. We switched off the Veeam server and stopped the Veeam services on the Hyper-V hosts, but we still have the same issue.

We are using an EMC SAN connected via FC as shared storage, with PowerPath for multipathing. No errors were found on the SAN.

I don't think the issue is related to I/O load, as we also experienced the issue at midnight during the weekend, when no one was working.

Any help would be very much appreciated.

Thanks.

Irfan

Irfan Goolab SALES ENGINEER (Microsoft UC) MCP, MCSA, MCTS, MCITP, MCT

↧

Windows 2012 R2 Failover Cluster Hyper-V Invalid Class error when I'm trying to create VM

December 13, 2013, 2:06 am

Hi, in my test lab environment I created a Windows 2012 R2 Failover Cluster with 4 servers to get Hyper-V HA.

I had no issues with Windows 2008 R2 or Windows 2012 before in the same setup (2 NICs, FC HBA, SAN storage), but this time I cannot create a VM using the Failover Cluster console:

Roles - Virtual Machines - New Virtual Machine - Select Host -> <ANY HOST>

I'm using the default settings during VM creation (except for manually setting the VM path to point to the desired disk).

During creation I see the disk being created, and both the VM configuration and .vhdx files appear on the target disk, but after that I see "The operation has failed. An error occurred creating a New Virtual Machine. Invalid Class."

In fact, I can see the virtual machine in Hyper-V Manager and it's fully functional, but it is not added to the Failover Cluster roles. When I use Configure Role - Virtual Machine to see eligible machines, it's not there.

Cluster Validation says that everything is OK.

I wasn't able to find anything in the event log(s) or in %SYSTEMROOT%\Cluster, as I could in Win2k8.

How should I troubleshoot this issue?

↧

SIMPLE QUESTION: HOW TO MIGRATE FROM WINDOWS 2008 R2 + SQL 2012 FAILOVER CLUSTER to WINDOWS SERVER 2012 CLUSTER WITH ALWAYS ON AVAILABILITY GROUP

September 5, 2013, 10:04 am

Hello,

We have a 2-node Windows 2008 R2 Enterprise Edition failover cluster with Fibre shared storage (SAN) running SQL Server 2012 SP1. Below is the current configuration - very simple and classic, I would say everything by the book:

This is what I think we want to achieve:

Objectives:

1. Upgrade Windows Operating System from Windows Server 2008 R2 to Windows Server 2012

2. Migrate to SQL Server 2012 Always On Availability Group (AAG) for High Availability and Disaster Recovery

My question is how to achieve both goals?

If possible, I would like to upgrade the OS first. Ideally I would like to upgrade on the same hardware (because it should have minimal impact, with no need to migrate data). If this is not possible, we also have new hardware I can use, but I guess that will have more impact and require actual data migration.

For AAG, what I'm honestly missing is what the name of the second SQL server would be. Let's say my servers are called DB1 and DB2, and the SQL server is called DB. If I create an AAG and fail over to the replica server, would the SQL server name be DB as well?

I know there is lots of documentation on AAG and I went through it, but I cannot find any specific information about names.

Another question I have: would a 3rd server (DB3) be part of the same MSCS cluster, or would it be a separate server? How exactly does failover work - do I use Failover Cluster Manager to initiate failover?

Sorry for lots of questions, but any information would be appreciated very much.

Thanks!

↧

Server 2012 Hyper-V Cluster Network Configuration

December 12, 2013, 11:28 am

I have not been able to find any documentation that explicitly says if my network configuration is supported or not. It passes cluster validation but after a crash last weekend I am questioning it. Hopefully one of the experts here can chime in.

Here is our setup:

Server 2012 Hyper-V cluster, two nodes with a disk witness. Dell R520 servers and a Dell EqualLogic iSCSI SAN. CSV and witness disk are running on the SAN (on different volumes). Currently running four VMs, will expand to eight, possibly nine in the near future.

Each host has (all gigabit):

a 4-port NIC dedicated to the SAN using Dell MPIO on its own subnet

a 4-port NIC team dedicated to the VM network

a 2-port NIC team for management, heartbeat and live migration

All are connected to a Dell two-switch stack with load balancing and failover at the switch level. All teams are using link aggregation and the SAN connections are using aggregation and jumbo frames.

My concern is having the live migration network on the same network team as management. I know this is not recommended, but I have not seen this configuration listed as "not supported" either. Redundancy shouldn't be an issue with failover at the team and the switches. I am concerned about possible bottlenecking, though.

If I were to add QoS to the 2-port team to cap the management and heartbeat bandwidth would that be enough? How much of a cap should I set?

Or should I break the team and create another network exclusively for live migration?

Or is the current configuration okay?
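
To put a rough number on the bottlenecking concern, here is a back-of-the-envelope sketch of how long one pass over a VM's memory would take on a gigabit link. Everything in it is an assumption for illustration: the 8 GiB VM size, the 80% usable-throughput factor, and the premise that a single migration stream is carried by one 1 Gbit/s team member. It also ignores the re-copying of pages that change during the migration, so real migrations would take longer.

```python
def transfer_seconds(memory_gib: float, link_gbps: float,
                     efficiency: float = 0.8) -> float:
    """Estimate seconds for one full pass over VM memory on a network link.

    memory_gib: VM memory in GiB (hypothetical value)
    link_gbps:  nominal link speed in Gbit/s
    efficiency: assumed fraction of the link usable for migration traffic
    """
    bits_to_send = memory_gib * 1024**3 * 8
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_send / usable_bits_per_second

seconds = transfer_seconds(memory_gib=8, link_gbps=1.0)
print(round(seconds))  # 86
```

During that window the migration stream would compete with management and heartbeat traffic on the same team, which is the scenario a QoS cap or a dedicated network is meant to address.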

↧

Upgrading from Server 2008 R2 Core to 2012 R2

December 13, 2013, 9:02 am

I am having a few problems with my 2008 R2 Core installation and am considering upgrading instead of just reinstalling, which would be easier. My question is this: I have a three-node failover cluster with Cluster Shared Volumes, where the VHDs and VM configs are stored on a NetApp SAN. I want to do a staged upgrade: take two nodes out of the cluster, install fresh 2012 R2 full installs on them, configure the servers, and set up the new cluster. Once that is done, I can have a couple of hours of downtime while I move the SAN over to the new cluster and then import all the machines into it.

The scenario works in my head, but I was wondering if I am missing something. Also, I am no iSCSI guru, so can someone give me a step-by-step guide to setting up the MPIO and iSCSI part of the cluster?

We have a limited window to do this work, as it will have to take place over the Christmas break, which this year consists of 7 working days.

Can I use the same cluster name, or is it best to use a new one?
Lastly, should I remove the nodes properly, leaving one node, or just shut them down and let the cluster whinge for a couple of days?
Would I have to make any other config changes on the NetApp?

↧

server 2012 cluster repair option

December 13, 2013, 6:13 am

Hello

Can anyone tell me what a repair actually does to an offline cluster? I cannot find it in TechNet or MSDN.

Thanks

↧

DNS name for sql clustering instance name

December 13, 2013, 10:55 am

Hi all,
SQL 2005 or SQL 2008 clustering on Windows 2008 R2.
We are creating a SQL 2005 or SQL 2008 cluster. The SQL cluster
instance name (DNS name) is created either manually or automatically in DNS.

Is it an issue if the SQL instance name was created manually in DNS?

Thank you.

↧

Cluster physical disk resource 'SQLVS01 Logs' cannot be brought online

December 14, 2013, 3:11 am

Hello,

I had a SQL failover cluster farm containing 4 servers; this week I expanded the farm to 7 servers.

The farm contains 3 instances (VS01, VS02 and VS03). After adding the new servers, I can move instances VS02 and VS03 to any of the new servers, but when I try to move instance VS01 (the most important instance) to any of the new servers I get the error below. I can only move this instance between the old servers (1, 2, 3 and 4); I cannot move it to servers 5, 6 and 7,

while, as I said, I can move the other instances to any of the servers, including the new ones.

I have already checked some of the topics in the forum regarding the same issue, but they did not help me.

Any help, please.

Error Message:

Cluster physical disk resource 'SQLVS01 Logs' cannot be brought online because the associated disk could not be found. The expected signature of the disk was 'B13B9D5A'. If the disk was replaced or restored, in the Failover Cluster Manager snap-in, you can use the Repair function (in the properties sheet for the disk) to repair the new or restored disk. If the disk will not be replaced, delete the associated disk resource.
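
For context on the 'B13B9D5A' value: an 8-hex-digit signature like this is consistent with an MBR disk signature, which is a 4-byte little-endian field at offset 0x1B8 of the disk's first sector. The sketch below shows how such a value is laid out and read back. It assumes an MBR-partitioned disk (GPT disks are identified by a GUID instead) and works on a synthetic in-memory sector, not a real disk; the function name `read_mbr_signature` is made up for illustration.

```python
import struct

MBR_SIGNATURE_OFFSET = 0x1B8  # byte 440: 4-byte disk signature in an MBR

def read_mbr_signature(sector: bytes) -> str:
    """Return the MBR disk signature of a 512-byte sector as 8 hex digits."""
    if len(sector) < 512:
        raise ValueError("need a full 512-byte sector")
    (signature,) = struct.unpack_from("<I", sector, MBR_SIGNATURE_OFFSET)
    return f"{signature:08X}"

# Build a synthetic sector carrying the signature from the error message.
sector = bytearray(512)
struct.pack_into("<I", sector, MBR_SIGNATURE_OFFSET, 0xB13B9D5A)
sector[510:512] = b"\x55\xAA"  # standard MBR boot signature

print(read_mbr_signature(bytes(sector)))  # B13B9D5A
```

If the new servers see the LUN with a different signature, or do not see it at all, the cluster would not be able to match it to the resource, which fits the wording of the error.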

↧

Linux NFS share to Windows 2008 R2 cluster as a resource

November 24, 2013, 7:59 pm

Hello,

I would like to share a directory on a RHEL 5 Linux server with a 2-node Windows 2008 R2 server cluster via NFS with read-only access, and make it a cluster resource accessible to cluster users.

I tried sharing it in the /etc/exports file as follows, but got "permission denied" on the Windows server node when I tried to open the folder after connecting to it.

The /etc/exports file looks like this:

/user/test_share windows_server.com(async)

Kindly let me know the best practice to accomplish this.

Thanks in advance.

↧

Change Cluster LUNs from Pass-through Disk to directly attached to VMs

December 15, 2013, 12:49 am

Hi all,

I have the situation below:

6 Hyper-V 2012 servers, all standalone (not a clustered scenario).
All of these Hyper-V servers are connected to Hitachi SAN storage using QLogic HBAs on the enclosure, with Brocade switches on the Hitachi side.
These 6 Hyper-V servers host clustered VMs (for example: on HV1 I have ClusterFIleServer1, ClusteredVMM1, MailboxDAG1 and NLBCAS1, and on HV2 I have ClusteredFileServer2, ClusteredVMM2, MailboxDAG2 and NLBCAS2).
For the VMM, Mailbox and File Servers, the LUNs are pass-through disks (in other words, the quorum is a pass-through disk!).

Now I will configure Virtual SAN Manager to be able to assign the LUNs to the VMs directly and remove the restrictions of pass-through disks.

My request:

Once I have Virtual SAN Manager in place and added to the VMs, can I just configure the LUNs to be mapped to the VMs? In other words, can I assign the File and Print Server quorum to the VMs directly and remove it as a pass-through disk?

Is it going to work? Or will the quorum be damaged, leaving my cluster service unable to locate the quorum?

Thanks

↧

Huge single storage pool for lots of VMs, or separate storage pools for each VM

December 9, 2013, 12:08 pm

Background

Two identical 2008 R2 servers running VMs with Failover Cluster (Hyper-V)
One Iomega NAS (old) with 15 VMs sharing one large (2 TB) clustered storage pool
One VNXe NAS (new), nothing set up yet

I am getting ready to start exporting all of my VMs from our old SAN to our new SAN.

Is it better to create one huge clustered storage pool that will be shared between all the VMs, or would it be better to create separate smaller clustered storage spaces for each VM?

Thanks,
Brian

↧

Virtual Machine Network Health not working

December 5, 2013, 5:38 am

Server 2012 R2 is supposed to have a feature that detects when the public LAN connection used to reach the VMs becomes unavailable; it is supposed to kick in around 60 seconds after the connection is lost. However, I have tried to test this by physically unplugging the network cable and by disabling the network adapter in ncpa.cpl, and the virtual machine doesn't seem to live migrate to the other node. The network adapter settings in the virtual machine configuration have the "Protected network" box checked by default. Is there something else we need to check or configure here?

It looks like it tries to migrate but fails. The message in the information details states:

Live migration of 'Virtual Machine Test-TERM' failed.

'Virtual Machine Test-TERM' failed to fixup network settings. Verify VM settings and update them as necessary

thanks

Steve

↧