Channel: High Availability (Clustering) forum

Hyper-V cluster changes volume numbers after power failure or storage reset

We have repeatedly encountered a case where the volume numbers, which the virtual machines depend on to locate their VHD/VHDX files and their configuration files, change completely after the storage restarts following a power failure or a similar event. The paths Hyper-V uses to locate the VHD files are mount points such as C:\ClusterStorage\Volume1 ... C:\ClusterStorage\Volume10. After the restart the paths change and the numbers Volume1-10 are shuffled, so VMs cannot start because they can no longer find their VHDX files, or even the configuration file of the VM as a whole. As a workaround we have to identify the changed volumes and correct them, or reattach the VHDs, or even redefine the VM. I believe there should be a better solution if we could first determine the cause.

We searched the internet extensively and did not find anyone reporting this issue, although it has happened to us many times, nor did we find a solution or a better workaround.

Appreciate your help.


Aymanq
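
The manual step described above ("find out the changed volumes") can be sketched as a small search across the mount points. This is only an illustration in Python, not a fix for the renumbering itself: the function name `find_disk_files` and the file names used with it are hypothetical, and it assumes the virtual disk file names are unique enough to identify each VM.

```python
from __future__ import annotations

from pathlib import Path


def find_disk_files(cluster_storage: Path, file_names: set[str]) -> dict[str, list[Path]]:
    """Map each wanted file name (lower-cased) to every path under a
    VolumeN mount point where a file with that name currently exists."""
    wanted = {name.lower() for name in file_names}
    found: dict[str, list[Path]] = {name: [] for name in wanted}
    # Walk Volume1, Volume2, ... in a stable order so results are repeatable.
    for volume in sorted(cluster_storage.glob("Volume*")):
        if not volume.is_dir():
            continue
        for path in volume.rglob("*"):
            # Windows file names are case-insensitive, so compare lower-cased.
            if path.is_file() and path.name.lower() in wanted:
                found[path.name.lower()].append(path)
    return found
```

Calling `find_disk_files(Path(r"C:\ClusterStorage"), {"web01.vhdx"})` on a cluster node would return the current location of that (hypothetical) disk file, which can then be compared with the path stored in the VM configuration.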


Windows 2012 R2 - cluster resource vs cluster disk

Hi all,

I created a 2-node WSFC and added a cluster disk to it.
I have not installed any clustered applications or roles yet.

I am playing around with the cluster resources (such as moving them around).
On the cluster itself -> More Actions -> Move Core Cluster Resources -> Select Node -> and choose the node I want.

With the above, I can see the cluster IP moving between the designated nodes BUT NOT the cluster disk I created.
I would have to go to Storage -> Disks -> Move Available Storage -> Select Node -> choose the node I want.

q1) Is the behavior observed above normal, or is my configuration wrong somewhere?

q2) Why isn't the cluster disk considered part of the core cluster resources? What would be a reason to want my cluster IP on node 2 and my cluster disk mounted on node 1?

q3) I also realized that the "Move Available Storage" option moves ALL cluster disks, not a particular cluster disk. What if I want some disks mounted on Node1 and some mounted on Node2?

Hope the gurus here can shed some light.

Regards,
Noob

Windows 2012 NLB + ARR (proxy server functions redundant)

Hello everybody,

I am developing a Windows Server 2012 R2 architecture (all servers are virtualized as virtual machines, VMs).

The architecture includes some applications hosted on IIS web servers.

The architecture includes 2 web servers in an NLB configuration.

REQUIREMENTS:

Reverse Proxy

The use of a proxy server is required. MS ARR is under consideration.

(I am not an expert in reverse proxies and therefore I am asking for your help.)

QUESTIONS:

1) Does MS ARR have to be installed on a different server from the NLB web server farm?

Or is it possible to install it on the 2 web servers (as configured in NLB)?

(Using additional servers would increase the number of VMs and the licensing costs.)

2) Is Windows NLB required, or does MS ARR provide the NLB functions?

3) Is MS ARR a good tool for building proxy server functions, or should third-party software be taken into consideration?

Could you please help in addressing this requirement?

Thanks

how to prep a Win2k12 R2 server for SQL Server 2014 clustering?

We are preparing to implement high availability or clustering for SQL Server 2014 Enterprise Edition with SP2. Are there any guidelines on how to prepare a Windows Server 2012 R2 server to support it?

Cluster shared volume disappear... STATUS_MEDIA_WRITE_PROTECTED(c00000a2)

Hi all, I am having an issue that hopefully someone can help me with. I have recently inherited a 2-node cluster; both nodes are one half of an ASUS RS702D-E6/PS8, so both nodes should be near identical. They are both running Hyper-V Server 2008 R2, hosting some 14 VMs.

Each node is hooked up via Cat5e to a Promise VessRAID 1830i over iSCSI, each using one of the server's onboard NICs. That cluster network is set to Disabled for cluster use (the way I think it is supposed to be, not the way I originally inherited it), on its own Class A subnet and on its own private physical switch...

The SAN hosts a 30 GB CSV Witness Disk and two 2 TB CSV volumes, one for each node, labeled Volume1 and Volume2, with some VHDs on each.

The cluster clients connect to the rest of the company via the virtual external NIC adapters created in Hyper-V Manager, which physically sit on Intel ET Dual Gigabit adapters wired into our main core switch, which is set up with Class C subnets.

I also have a crossover cable running between the other ports on the Intel ET Dual Port NICs, using yet a third subnet (Class B), which is configured in Failover Cluster Manager as internal, so there are 3 IPv4 cluster networks in total.

Even though the cluster passes the validation tests with flying colors, I am not convinced all is well. With Hyperv1 (node 1), I can move the CSVs and machines over to Hyperv2 (node 2), stop the cluster service on node 1, and perform maintenance such as a reboot or installing patches if needed. When it reboots or I restart the cluster service to bring it back online, it is well behaved, leaving Hyperv2 the owner of all 3 CSVs: Witness, Volume1 and Volume2. I can then pass them back or split them up any which way, and at no point is the cluster service interrupted or the change noticed by users. Duh, I know this is how it is SUPPOSED to work, but...

If I try the same thing with node 2, that is, move the witness and volumes to node 1 as owner and migrate all VMs over, stop the cluster service on node 2, do whatever I have to do and reboot, then as soon as node 2 tries to go back online it tries to snatch Volume2 back. It never succeeds, and the following error is logged in the cluster event log:

Hyperv1

Event ID: 5120

Source: Microsoft-Windows-FailoverClustering

Task Category: Cluster Shared Volume

The listed message is: Cluster Shared Volume 'Volume2' ('HyperV1 Disk') is no longer available on this node because of 'STATUS_MEDIA_WRITE_PROTECTED(c00000a2)'. All I/O will temporarily be queued until a path to the volume is reestablished.

Followed 4 seconds later by:

Hyperv1

event ID: 1069

Source: Microsoft-Windows-FailoverClustering

Task Category: Resource Control Manager

Message: Cluster Resource 'Hyperv1 Disk' in clustered service or application '75d88aa3-8ecf-47c7-98e7-6099e56a097d' failed.

- AND -

2 of the following:

Hyperv1

event ID: 1038

Source: Microsoft-Windows-FailoverClustering

Task Category: Physical Disk Resource

Message: Ownership of cluster disk 'HyperV1 Disk' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.

Followed 1 second later by another 1069 and then various machines are failing messages.

If you browse to \\hyperv-1\c$\clusterstorage\ or \\hyperv-2\c$\Clusterstorage\, Volume 2 is indeed missing!!

This has caused me to panic a few times, as the first time I saw this I thought everything was lost. I can get it back by stopping the service on node 1 or shutting it down, restarting node 2 or the service on node 2, and waiting forever for the disk to list as failed; shortly thereafter it comes back online. I can then boot node 1 back up and let it start servicing the cluster again. It doesn't pull the same craziness node 2 does when it comes online; it leaves all ownership with node 2 unless I tell it to move.

I am very new to clusters. All I know at this point is that this is pretty cool stuff, and "if it is running, don't mess with it" is the attitude I have taken with it. But there is a significant amount of money tied up in this hardware, and we should be able to leverage it as needed, not wonder if it is going to act up again.

To me it seems a 'failover' cluster should be way more robust than this...

I can go into way more detail if needed, but I didn't see any other posts on this specific issue no matter what forum I scoured. I'm obviously looking for advice on how to get this resolved, as well as advice on whether or not I wired the cluster networks correctly. I am also no longer sure which protocols are bound to which NICs and what the binding order should be. Could this be what is causing my issue?

I have NVSPBIND and NVSPSCRUB on both boxes if needed.

Thanks!

-LW

Live migration of 'Virtual Machine ADVM-01' failed. Event ID: 21502

I have an HA cluster running on Windows 2012 R2 with failover clustering configured. It runs Windows 2008, 2008 R2 and 2012 VMs.

Integration Services are already installed. When I try to live migrate a VM to the other node, it fails.

The event viewer shows the error message below.

" Live migration of 'Virtual Machine ADVM-01' failed.

Virtual machine migration operation for 'ADVM-01' failed at migration source 'NODE01'. (Virtual machine ID D840382C-194B-4B4F-8BF5-19552537D0EF)

'ADVM-01' failed to delete configuration: The request is not supported. (0x80070032). (Virtual machine ID D840382C-194B-4B4F-8BF5-19552537D0EF) "

Please advise.


Regards, COMDINI
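
The code 0x80070032 in the message above is an HRESULT, and splitting it into its fields shows which underlying Win32 error it wraps. The following Python sketch is only a reading aid; `decode_hresult` is a hypothetical helper name, and the two constants are the standard values from winerror.h.

```python
FACILITY_WIN32 = 7        # HRESULT facility used for wrapped Win32 error codes
ERROR_NOT_SUPPORTED = 50  # winerror.h: "The request is not supported."


def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility and code."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # bit 31: 1 means failure
        "facility": (value >> 16) & 0x7FF,  # bits 16-26: facility
        "code": value & 0xFFFF,             # bits 0-15: error code
    }


fields = decode_hresult(0x80070032)
print(fields)
```

For 0x80070032 the facility is 7 (FACILITY_WIN32) and the code is 50 (ERROR_NOT_SUPPORTED), which matches the "The request is not supported" text in the event. The decode identifies the error only; it does not say which component rejected the request.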

adding in a second ip address to SQL 2008 R2 cluster instance

This is a 2008 R2 cluster. I want to add a second IP address on a different subnet, for backups. I don't want a failure of that IP resource to be service impacting.

My plan is to:

- add a new IP resource to the SQL group and bring it online

- restart the whole SQL instance, so SQL picks up the new IP.

I am assuming that I don't have to add the new IP address as a dependency of the network name resource, because I don't want any failure of it to trigger a failover. Is this okay? I am aware I could add it as an AND dependency if forced to.



Cluster Aware Updating - Possible owners causes drain failure

Hello all,

I have a question regarding the possible owners a VM can have and Cluster Aware Updating. We have several servers that cannot be moved between Hyper-V hosts, either because high availability is configured in the application the VM is running or because moving is not supported for the application (Lync HA, Exchange DAG, etc.). We configured the possible owners of these VMs through SCVMM to be only a certain node. However, we want to use Cluster Aware Updating, and when we run it, the process fails. Of course there is no way to drain a role that is fixed to a cluster node, but we shut these servers down and had hoped that the roles would then not need to be drained. This was not the case, and the update process failed.

The workaround was to set the possible owners to include several other hosts as well (the VMs were shut down, so no problem). This was only for a couple of VMs on this cluster, but we have clusters with many more servers of this type, so we want to see if there is another solution.

Is there any way to make sure that Cluster Aware Updating does not require servers that are shut down to be drained to another node? Or any other brilliant idea? :-) I found out that you can do a "forced" drain through PowerShell, but I do not think that is a good solution, and I do not think we can apply it via Cluster Aware Updating. Anyone?



Can't add new node to existing failover cluster

Hi,

I have a problem adding a new node to an existing failover cluster. The existing failover cluster is a two-node cluster with node and file share majority. I'm using this cluster for a SQL AlwaysOn availability group. There are no shared volumes.

When I use the Failover Cluster Manager console I get this error:

The server 'N3.local' could not be added to the cluster.
An error occurred while adding node 'N3.local' to cluster 'Cluster1'.

The parameter is incorrect

I also have this error in Applications and Services Logs/Microsoft/FailoverClustering-Manager/Diagnostic:

Exception occurred in background operation - System.ApplicationException: An error occurred while adding nodes to the cluster 'Cluster1'. ---> System.ApplicationException: An error occurred while adding node 'N3.local' to cluster 'Cluster1'. ---> System.ComponentModel.Win32Exception: The parameter is incorrect
   --- End of inner exception stack trace ---
   at MS.Internal.ServerClusters.ClusApiExceptionFactory.CreateAndThrow(Cluster cluster, Int32 sc, String format, Object arg0, Object arg1)
   at MS.Internal.ServerClusters.Cluster.AddNode(String nodeName, ClusterActionCallback callback)
   at MS.Internal.ServerClusters.Configuration.AddNodeManagement.AddNodes(ActionArgs actionArgs, ActionUpdateHelper updateHelper)
   --- End of inner exception stack trace ---
   at MS.Internal.ServerClusters.Configuration.AddNodeManagement.AddNodes(ActionArgs actionArgs, ActionUpdateHelper updateHelper)
   at MS.Internal.ServerClusters.Configuration.AddNodeManagement.PerformAddNodes(ActionArgs actionArgs)
   at MS.Internal.ServerClusters.Configuration.ConfigurationBase.PerformActionWrapper(BackgroundOperationStatus backgroundOperationStatus, BackgroundOperationArgs parameter)
   at MS.Internal.ServerClusters.BackgroundOperation`2.BackgroundOperationProc(Object state)


I have tried to add the node from PowerShell, with the same error (parameter is incorrect). I have also tried to remove the Failover Cluster role and add it again, but I still get the same error.

Please advise,

Thank you

Node in cluster - status changes to "paused"

We have seven Windows 2012 R2 nodes in a Hyper-V cluster. They are all identical hardware (HP BladeSystem). For a while, we had only six nodes, and there were no problems.

Recently, we added the seventh node, and its status keeps reverting to "paused". I can't find any errors that directly point to why this is happening - either in the System or Application log of the server, in the various FailoverClustering logs, or in the Cluster Event logs. I created a cluster.log using the Get-ClusterLog cmdlet, but if it explains why this is happening, I can't figure it out (it's also a very large file - 150 MB, so it's difficult to determine which lines are the important ones).

As far as I can tell, everything on this new node is the same as the previous ones - the software versions, network settings, etc. The Cluster Validation report also doesn't give me anything helpful.

Any ideas on how to go about investigating this? Even before I can solve the problem, I'd like to at least know when and why the status reverts to paused.

Thanks,

David
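
For a log of the size mentioned above (150 MB), one way to narrow it down is to stream the file and keep only the lines that match a search pattern, along with their line numbers. This Python sketch assumes nothing about the cluster.log layout: the function names are hypothetical, the pattern "pause" is only a guess at a useful search term, and the file encoding may need adjusting for your system.

```python
import re
from typing import Iterable, Iterator, Tuple


def matching_lines(lines: Iterable[str], pattern: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for every line matching the regex,
    ignoring case. Works on any iterable of lines, so it never needs
    the whole log in memory."""
    rx = re.compile(pattern, re.IGNORECASE)
    for number, line in enumerate(lines, start=1):
        if rx.search(line):
            yield number, line.rstrip("\r\n")


def scan_log(path: str, pattern: str, encoding: str = "utf-8") -> Iterator[Tuple[int, str]]:
    """Stream a log file from disk and yield its matching lines."""
    # errors="replace" keeps the scan going if a few bytes fail to decode.
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        yield from matching_lines(handle, pattern)
```

For example, `for n, text in scan_log("cluster.log", r"pause"): print(n, text)` prints every line mentioning a pause; the timestamps on those lines then indicate where in the full log to read the surrounding context.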

Not able to see cluster resources in the failover cluster manager

We are using Windows Server 2008 R2 with SQL Server clustered services configured.

Randomly, every 10-15 days, the cluster service hangs, and loading the resource list in Failover Cluster Manager takes a very long time (almost 10-15 minutes). While this is happening, if we try to create a new database in SQL Server, the operation does not complete and the query runs on and on.

From the SQL Server logs we found the wait type PREEMPTIVE_CLUSAPI_CLUSTERRESOURCECONTROL, which suggests the cluster service is hung somewhere.

I am not sure how to troubleshoot this issue. I have checked the event log for the cluster service but found no clue.

When we restart the server, the cluster service returns to normal, and after 10-15 days it starts behaving the same way again.

We need help finding out where and why the cluster service hangs.

2008 R2 cluster network name resource ip addresses, versus ip address resource.

We have a lot of clusters in our company. Many have two IP addresses in an application, one for prod and one for backup, each on a different adapter. For example, we have SQL instances like this.

Looking at Failover Cluster Manager, some applications have two IP addresses on different subnets in the network name resource (in the IP addresses section).

Other applications have the second (backup) IP address as an IP address resource, located in the other resources section.

It made me wonder what the difference is between putting two IP addresses in the network name resource's IP addresses section and putting the second IP address in as an IP address resource under other resources. Both seem to work. Is there actually any difference, or is it purely cosmetic/graphical?

This distinction did not exist on older 2003 clusters, where every IP address was just an IP resource, with the network name having a dependency on it.

MPIO with Windows Failover Cluster

Hi Team,

I am looking for some pointers on configuring MPIO for a Windows failover cluster with shared storage. Is it mandatory to install the MPIO feature at the Windows level before, or together with, Windows Server Failover Clustering when shared storage is used, or is it something that can be taken care of at the storage level? Any article that could help with configuring the Windows Server MPIO feature would be appreciated. Thanks

Regards, 

Validate SCSI-3 Persistent Reservation failed during Cluster validation

Hi Team,

The 'Validate SCSI-3 Persistent Reservation' check failed during cluster validation.

The cluster nodes are running on the VMware virtualization platform, and the shared storage is EMC storage mapped as pass-through LUNs directly to the cluster node VMs. Does anyone have pointers on what the issue could be? Is there anything that needs to be corrected or checked in Windows Server, or is this more likely a VMware or storage-related issue? Any pointers will be appreciated. Thanks

Regards,

Witness Client failed to Register

I have a recently built pair of W2012 R2 servers with all of the applicable updates applied via WSUS.  They are clustered servers but are not using shared storage.  Instead they are using Vision Solutions' "High Availability for Windows" to replicate several drives.  Also installed is SQL 2012 (clustered) and all available updates.  I have used this solution several times previously on other clusters.  The servers are performing as expected but are not yet in production.  A few days ago I began to see the following notifications approximately every 20 seconds on the "owning" node (PCSCALEA) of the cluster:

Log Name:      WitnessClientAdmin
Source:        Microsoft-Windows-SMBWitnessClient
Date:          3/26/2015 9:54:23 AM
Event ID:      8
Task Category: None
Level:         Error
Keywords:     
User:          NETWORK SERVICE
Computer:      PCScaleA.rms.org
Description:
Witness Client failed to register with Witness Server PCSCALEB for notification on NetName\\Pcscale with error (The parameter is incorrect.)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-SMBWitnessClient" Guid="{32254F6C-AA33-46F0-A5E3-1CBCC74BF683}" />
    <EventID>8</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2015-03-26T13:54:23.951194200Z" />
    <EventRecordID>114860</EventRecordID>
    <Correlation />
    <Execution ProcessID="1624" ThreadID="9612" />
    <Channel>WitnessClientAdmin</Channel>
    <Computer>PCScaleA.rms.org</Computer>
    <Security UserID="S-1-5-20" />
  </System>
  <EventData>
    <Data Name="WitnessServerIP">PCSCALEB</Data>
    <Data Name="NetName">Pcscale</Data>
    <Data Name="Error">87</Data>
  </EventData>
</Event>

The following errors appear approximately every 20 seconds on the "non-owning node" (PCSCALEB):

Log Name:      WitnessServiceAdmin
Source:        Microsoft-Windows-SMBWitnessService
Date:          3/26/2015 9:51:43 AM
Event ID:      5
Task Category: None
Level:         Error
Keywords:     
User:          SYSTEM
Computer:      PCScaleB.rms.org
Description:
Witness Service registration request from Witness Client (PCSCALEA.RMS.ORG) for NetName\\PCSCALE failed with error (The parameter is incorrect.)
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-SMBWitnessService" Guid="{CE704B50-B105-4BC8-A24F-1792C0401C2A}" />
    <EventID>5</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8000000000000000</Keywords>
    <TimeCreated SystemTime="2015-03-26T13:51:43.906614200Z" />
    <EventRecordID>153093</EventRecordID>
    <Correlation />
    <Execution ProcessID="5252" ThreadID="5368" />
    <Channel>WitnessServiceAdmin</Channel>
    <Computer>PCScaleB.rms.org</Computer>
    <Security UserID="S-1-5-18" />
  </System>
  <EventData>
    <Data Name="ClientName">PCSCALEA.RMS.ORG</Data>
    <Data Name="NetName">PCSCALE</Data>
    <Data Name="ErrorCode">87</Data>
  </EventData>
</Event>

Any ideas?  Any suggestions?

Thanks,

Ryan
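
When events like the two above repeat every 20 seconds, it can help to pull the key fields out of the exported Event XML and compare them across nodes. The Python sketch below parses a trimmed copy of the first event from this post; `event_summary` is a hypothetical helper name, and only the standard-library XML parser is used.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <Event> element in the exported XML.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the Event ID 8 XML quoted in the post.
SAMPLE = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-SMBWitnessClient" />
    <EventID>8</EventID>
    <Computer>PCScaleA.rms.org</Computer>
  </System>
  <EventData>
    <Data Name="WitnessServerIP">PCSCALEB</Data>
    <Data Name="NetName">Pcscale</Data>
    <Data Name="Error">87</Data>
  </EventData>
</Event>
"""


def event_summary(xml_text: str) -> dict:
    """Return the provider, event ID, computer and all named EventData fields."""
    root = ET.fromstring(xml_text.strip())
    system = root.find("e:System", NS)
    summary = {
        "provider": system.find("e:Provider", NS).get("Name"),
        "event_id": int(system.find("e:EventID", NS).text),
        "computer": system.find("e:Computer", NS).text,
    }
    # Each <Data Name="..."> element becomes one key in the summary.
    for data in root.findall("e:EventData/e:Data", NS):
        summary[data.get("Name")] = data.text
    return summary


print(event_summary(SAMPLE))
```

The "Error" field comes back as the string "87", which is the Win32 code ERROR_INVALID_PARAMETER and corresponds to the "The parameter is incorrect." text in both event descriptions.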

 


Team up iScsi networks in windows 2012 R2 failover cluster

Hi Team,

I have a failover cluster with 3 nodes. We are using iSCSI initiators to connect to SAN storage. Each node has two iSCSI network adapters.

I need your advice: is it feasible or advisable to team both iSCSI networks? Currently they are not teamed.

Also, cluster use for the iSCSI networks is set to None. Is that right?

Regards,

KR

Assigning Permission to Cluster

Hi Team,

We have installed a Windows failover cluster with an account that has domain admin rights.

Now that the failover cluster is created, we are handing over cluster access to the SQL admin. We have a SQL installation user account named 'SQL_admin'. The SQL admin logs into the node using the SQL_admin account but is not able to connect to the cluster.

The SQL_admin account has been added to the local Administrators group on each node that is part of the cluster. I logged into one of the cluster nodes as a domain admin and tried to add the 'SQL_admin' account to the cluster permissions, but was not able to do so because of an error (screenshot attached).

We are getting the attached error and need some pointers on how to give the SQL_admin account full access so that the SQL admin can start installing SQL instances.

Any help would be highly appreciated. Thanks

Regards,

Failover Clustering Check - Validate CSV Settings

Hello,

I have a lab environment.

While checking the failover clustering requirements, I get a warning for my CSV storage:

"Failure while setting up to run cluster shared volumes support testing on node cluster1. The password does not meet the password policy requirements. check the minimum password length, password complexity and password history requirements."

I updated the password policy in my domain to have a lower requirement. The thing is, I'm testing this in a lab before applying it to a live environment, and I want to minimize the errors in cluster validation. I also don't want my network to exempt the cluster objects by giving them a lower password requirement.

Is there any way we can set the cluster to create a password that complies with my domain's password policy?


For God, and Country.


MPIO and moving the tempdb to a different disk

We need to move the temp database of our SQL Server cluster to a new partition on the same disk. I've found some instructions that discuss moving the temp database; however, I'm trying to find out whether there will be any additional issues as a result of the change.

1. The disk resides on SAN storage. Are there any additional steps required regarding MPIO?

2. Are there any steps that need to be carried out on the secondary (failover) cluster node?

(current steps identified)

USE master
GO
ALTER DATABASE TempDB MODIFY FILE
(NAME = tempdev, FILENAME = 'd:\datatempdb.mdf')
GO
ALTER DATABASE TempDB MODIFY FILE
(NAME = templog, FILENAME = 'e:\datatemplog.ldf')
GO

stop the SQL server instance

move files to the new location

restart the instance


(Environment Details)

OS: Windows Server 2008 R2

Platform: SQL Server 2008 R2

Clustering: Windows Failover Cluster Manager
