NetApp Clustered Data ONTAP 8.2
High-Availability Configuration Guide

NetApp, Inc.
495 East Java Drive
Sunnyvale, CA 94089 U.S.
Telephone: +1 (408) 822-60
Fax: +1 (408) 822-4501
Support telephone: +1 (888) 463-8277
Web:
Feedback: doccomments
Part number: 215-07970-A0
May 2013

Contents

Understanding HA pairs
What an HA pair is
How HA pairs support nondisruptive operations and fault tolerance
Where to find procedures for nondisruptive operations with HA pairs
How the HA pair improves fault tolerance
Connections and components of an HA pair
How HA pairs relate to the cluster
If you have a two-node switchless cluster
Understanding takeover and giveback
When takeovers occur
Failover event cause-and-effect table
How hardware-assisted takeover speeds up takeover
What happens during takeover
What happens during giveback
Background disk firmware update and takeover, giveback, and aggregate relocation
HA policy and giveback of the root aggregate and volume
How aggregate relocation works
Planning your HA pair configuration
Best practices for HA pairs
Setup requirements and restrictions for HA pairs
Requirements for hardware-assisted takeover
If your cluster consists of a single HA pair
Possible storage configurations in the HA pairs
HA pairs and storage system model types
Single-chassis and dual-chassis HA pairs
Interconnect cabling for systems with variable HA configurations
HA configuration and the HA state PROM value
Table of storage system models and HA configuration differences
Installing and cabling an HA pair
System cabinet or equipment rack installation
HA pairs in an equipment rack
HA pairs in a system cabinet
Required documentation
Required tools
Required equipment
Preparing your equipment
Installing the nodes in equipment racks
Installing the nodes in a system cabinet
Cabling an HA pair
Determining which Fibre Channel ports to use for Fibre Channel disk shelf connections
Cabling Node A to DS14mk2 or DS14mk4 disk shelves
Cabling Node B to DS14mk2 or DS14mk4 disk shelves
Cabling the HA interconnect (all systems except 32xx)
Cabling the HA interconnect (32xx systems in separate chassis)
Required connections for using uninterruptible power supplies with HA pairs
Configuring an HA pair
Enabling cluster HA and switchless-cluster in a two-node cluster
Enabling the HA mode and storage failover
Commands for enabling and disabling storage failover
Commands for setting the HA mode
Configuring a node for non-HA (stand-alone) use
Verifying the HA pair cabling and configuration
Configuring hardware-assisted takeover
Commands for configuring hardware-assisted takeover
Configuring automatic takeover
Commands for controlling automatic takeover
System events that always result in an automatic takeover
System events that trigger hardware-assisted takeover
Configuring automatic giveback
Understanding automatic giveback
Commands for configuring automatic giveback
Testing takeover and giveback
Monitoring an HA pair
Commands for monitoring an HA pair
Description of node states displayed by storage failover show-type commands
Commands for halting or rebooting a node without initiating takeover
Performing a manual takeover
Commands for performing and monitoring a manual takeover
Performing a manual giveback
If giveback is interrupted
If giveback is vetoed
Commands for performing a manual giveback
Managing DS14mk2 or DS14mk4 disk shelves in an HA pair
Adding DS14mk2 or DS14mk4 disk shelves to a multipath HA loop
Upgrading or replacing modules in an HA pair
About the disk shelf modules
Restrictions for changing module types
Best practices for changing module types
Testing the modules
Determining path status for your HA pair
Hot-swapping a module
Relocating aggregate ownership within an HA pair
How aggregate relocation works
Relocating aggregate ownership
Commands for aggregate relocation
Key parameters of the storage aggregate relocation start command
Veto and destination checks during aggregate relocation
Troubleshooting HA issues
Troubleshooting general HA issues
Troubleshooting if giveback fails for the root aggregate
Troubleshooting if giveback fails (SFO aggregates)
Troubleshooting aggregate relocation
Troubleshooting HA state issues
Copyright information
Trademark information
How to send your comments
Index

Understanding HA pairs

HA pairs provide hardware redundancy that is required for nondisruptive operations and fault tolerance and give each node in the pair the software functionality to take over its partner's storage and subsequently give back the storage.

What an HA pair is

An HA pair is two storage systems (nodes) whose controllers are connected to each other directly. In this configuration, one node can take over its partner's storage to provide continued data service if the partner goes down.

You can configure the HA pair so that each node in the pair shares access to a common set of storage, subnets, and tape drives, or each node can own its own distinct set of storage.

The controllers are connected to each other through an HA interconnect. This allows one node to serve data that resides on the disks of its failed partner node. Each node continually monitors its partner, mirroring the data for each other's nonvolatile memory (NVRAM or NVMEM). The interconnect is internal and requires no external cabling if both controllers are in the same chassis.

Takeover is the process in which a node takes over the storage of its partner. Giveback is the process in which that storage is returned to the partner. Both processes can be initiated manually or configured for automatic initiation.

How HA pairs support nondisruptive operations and fault tolerance

HA pairs provide fault tolerance and let you perform nondisruptive operations, including hardware and software upgrades, relocation of aggregate ownership, and hardware maintenance.

• Fault tolerance
  When one node fails or becomes impaired and a takeover occurs, the partner node continues to serve the failed node's data.
• Nondisruptive software upgrades or hardware maintenance
  During hardware maintenance or upgrades, when you halt one node and a takeover occurs (automatically, unless you specify otherwise), the partner node continues to serve data for the halted node while you upgrade or perform maintenance on the node you halted.
  During nondisruptive upgrades of Data ONTAP, the user manually enters the storage failover takeover command to take over the partner node to allow the software upgrade to occur. The takeover node continues to serve data for both nodes during this operation.
  For more information about nondisruptive software upgrades, see the Clustered Data ONTAP Upgrade and Revert/Downgrade Guide.

Nondisruptive aggregate ownership relocation can be performed without a takeover and giveback.

The HA pair supplies nondisruptive operation and fault tolerance due to the following aspects of its configuration:

• The controllers in the HA pair are connected to each other either through an HA interconnect consisting of adapters and cables, or, in systems with two controllers in the same chassis, through an internal interconnect.
  The nodes use the interconnect to perform the following tasks:
  - Continually check
    whether the other node is functioning
  - Mirror log data for each other's NVRAM or NVMEM
• They use two or more disk shelf loops, or storage arrays, in which the following conditions apply:
  - Each node manages its own disks or array LUNs.
  - In case of takeover, the surviving node provides read/write access to the partner's disks or array LUNs until the failed node becomes available again.
  Note: Disk ownership is established by Data ONTAP or the administrator rather than by which disk shelf the disk is attached to. For more information about disk ownership, see the Clustered Data ONTAP Physical Storage Management Guide.
• They own their spare disks, spare array LUNs, or both, and do not share them with the other node.
• They each have mailbox disks or array LUNs on the root volume that perform the following tasks:
  - Maintain consistency between the pair
  - Continually check whether the other node is running or whether it has performed a takeover
  - Store configuration information

Related concepts
Where to find procedures for nondisruptive operations with HA pairs

Where to find procedures for nondisruptive operations with HA pairs

By taking advantage of an HA pair's takeover and giveback operations, you can change hardware components and perform software upgrades in your configuration without disrupting access to the system's storage. You can refer to the specific documents for the required procedures.

You can perform nondisruptive operations on a system by having its partner take over the system's storage, performing maintenance, and then giving back the storage. Aggregate relocation extends the range of nondisruptive capabilities by enabling storage controller upgrade and replacement operations.

The following table lists where you can find information on specific procedures:

| If you want to perform this task nondisruptively... | See the... |
|---|---|
| Upgrade Data ONTAP | Clustered Data ONTAP Upgrade and Revert/Downgrade Guide |
| Replace a hardware FRU component | FRU procedures for your platform |

How the HA pair improves fault tolerance

A storage system has a variety of single points of failure, such as certain cables or hardware components. An HA pair greatly reduces the number of single points of failure because if a failure occurs, the partner can take over and continue serving data for the affected system until the failure is fixed.

Single point of failure definition

A single point of failure represents the failure of a single hardware component that can lead to loss of data access or potential loss of data.

Single point of failure does not include multiple/rolling hardware errors, such as triple disk failure, dual disk shelf module failure, and so on.

All hardware components included with your storage system have demonstrated very good reliability with low failure rates. If a hardware component such as a controller or adapter fails, you can use the controller failover function to provide continuous data availability and preserve data integrity for client applications and users.

Single point of failure analysis for HA pairs

Different individual hardware components and cables in the storage system are single points of failure, but an HA configuration can eliminate these points to improve data availability.

| Hardware components | Single point of failure: stand-alone | Single point of failure: HA pair | How storage failover eliminates single point of failure |
|---|---|---|---|
| Controller | Yes | No | If a controller fails, the node automatically fails over to its partner node. The partner (takeover) node serves data for both of the nodes. |
| NVRAM | Yes | No | If an NVRAM adapter fails, the node automatically fails over to its partner node. The partner (takeover) node serves data for both of the nodes. |
| CPU fan | Yes | No | If the CPU fan fails, the node automatically fails over to its partner node. The partner (takeover) node serves data for both of the nodes. |
| Multiple NICs with interface groups (virtual interfaces) | Maybe, if all NICs fail | No | If one of the networking links within an interface group fails, the networking traffic is automatically sent over the remaining networking links on the same node. No failover is needed in this situation. |
| FC-AL adapter or SAS HBA | Yes | No | If an FC-AL adapter for the primary loop fails for a configuration without multipath HA, the partner node attempts a takeover at the time of failure. With multipath HA, no takeover is required. If the FC-AL adapter for the secondary loop fails for a configuration without multipath HA, the failover capability is disabled, but both nodes continue to serve data to their respective applications and users, with no impact or delay. With multipath HA, failover capability is not affected. |
| FC-AL or SAS cable (controller-to-shelf, shelf-to-shelf) | No, if dual-path cabling is used | No | If an FC-AL loop or SAS stack breaks in a configuration that does not have multipath HA, the break could lead to a failover, depending on the shelf type. The partnered nodes invoke the negotiated failover feature to determine which node is best for serving data, based on the disk shelf count. When multipath HA is used, no failover is required. |
| Disk shelf module | No, if dual-path cabling is used | No | If a disk shelf module fails in a configuration that does not have multipath HA, the failure could lead to a failover. The partnered nodes invoke the negotiated failover feature to determine which node is best for serving data, based on the disk shelf count. When multipath HA is used, there is no impact. |
| Disk drive | No | No | If a disk fails, the node can reconstruct data from the RAID4 parity disk. No failover is needed in this situation. |
| Power supply | Maybe, if both power supplies fail | No | Both the controller and disk shelf have dual power supplies. If one power supply fails, the second power supply automatically kicks in. No failover is needed in this situation. If both power supplies fail, the node automatically fails over to its partner node, which serves data for both nodes. |
| Fan (controller or disk shelf) | Maybe, if both fans fail | No | Both the controller and disk shelf have multiple fans. If one fan fails, the second fan automatically provides cooling. No failover is needed in this situation. If both fans fail, the node automatically fails over to its partner node, which serves data for both nodes. |
| HA interconnect adapter | Not applicable | No | If an HA interconnect adapter fails, the failover capability is disabled but both nodes continue to serve data to their respective applications and users. |
| HA interconnect cable | Not applicable | No | The HA interconnect adapter supports dual HA interconnect cables. If one cable fails, the heartbeat and NVRAM data are automatically sent over the second cable with no delay or interruption. If both cables fail, the failover capability is disabled but both nodes continue to serve data to their respective applications and users. |

Connections and components of an HA pair

Each node in an HA pair requires a network connection, an HA interconnect between the controllers, and connections both to
its own disk shelves as well as its partner node's shelves.

[Figure: standard HA pair with native disk shelves and multipath HA. Legend: primary connection, redundant primary connection, standby connection, redundant standby connection.]

This diagram shows a standard HA pair with native disk shelves and multipath HA.

This diagram shows DS4243 disk shelves. For more information about cabling SAS disk shelves, see the Universal SAS and ACP Cabling Guide on the NetApp Support Site.

How HA pairs relate to the cluster

HA pairs are components of the cluster, and both nodes in the HA pair are connected to other nodes in the cluster through the data and cluster networks. But only the nodes in the HA pair can take over each other's storage.

Although the controllers in an HA pair are connected to other controllers in the cluster through the cluster network, the HA interconnect and disk-shelf connections are found only between the node and its partner and their disk shelves or array LUNs.

The HA interconnect and each node's connections to the partner's storage provide physical support for high-availability functionality. The high-availability storage
failover capability does not extend to other nodes in the cluster.

Note: Network failover does not rely on the HA interconnect and allows data network interfaces to fail over to different nodes in the cluster outside the HA pair. Network failover is different from storage failover because it enables network resiliency across all nodes in the cluster.

Non-HA
(or stand-alone) nodes are not supported in a cluster containing two or more nodes. Although single-node clusters are supported, joining two separate single-node clusters to create one cluster is not supported, unless you wipe clean one of the single-node clusters and join it to the other to create a two-node cluster that consists of an HA pair. For information on single-node clusters, see the Clustered Data ONTAP System Administration Guide for Cluster Administrators.

The following diagram shows two HA pairs. The multipath HA storage connections between the nodes and their storage are shown for each HA pair. For simplicity, only the primary connections to the data and cluster networks are shown.

[Figure: cluster containing two HA pairs. Recoverable labels: HA pair, HA pair, Node3, Storage.]
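
The relationship between HA pairs and the cluster described in this section can be illustrated with a small toy model. The Python sketch below is not Data ONTAP code and does not use any NetApp API. The class names, method names, and node names (node1 through node4) are hypothetical and chosen only for illustration. It assumes a four-node cluster made of two HA pairs and shows the two rules stated above: storage can be taken over only by the HA partner, while data network interfaces can fail over to any other healthy node in the cluster.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model of one node; not a Data ONTAP object."""
    name: str
    partner: str                      # name of the HA partner node
    up: bool = True
    serving: set = field(default_factory=set)  # nodes whose storage is served here

    def __post_init__(self):
        # A healthy node starts out serving its own storage.
        self.serving.add(self.name)


class Cluster:
    """Toy cluster built from a list of (node, partner) HA pairs."""

    def __init__(self, ha_pairs):
        self.nodes = {}
        for a, b in ha_pairs:
            self.nodes[a] = Node(a, partner=b)
            self.nodes[b] = Node(b, partner=a)

    def storage_failover_targets(self, name):
        # Only the HA partner can take over a node's storage.
        partner = self.nodes[self.nodes[name].partner]
        return [partner.name] if partner.up else []

    def network_failover_targets(self, name):
        # Network failover may use any other healthy node in the cluster.
        return sorted(n.name for n in self.nodes.values()
                      if n.up and n.name != name)

    def takeover(self, failed):
        # The partner begins serving the failed node's storage.
        node = self.nodes[failed]
        partner = self.nodes[node.partner]
        if not partner.up:
            raise RuntimeError("partner is down; takeover is not possible")
        node.up = False
        node.serving.discard(failed)
        partner.serving.add(failed)

    def giveback(self, name):
        # The partner returns the storage to its original owner.
        node = self.nodes[name]
        partner = self.nodes[node.partner]
        if name not in partner.serving:
            raise RuntimeError("partner is not serving this node's storage")
        partner.serving.discard(name)
        node.up = True
        node.serving.add(name)


if __name__ == "__main__":
    cluster = Cluster([("node1", "node2"), ("node3", "node4")])
    cluster.takeover("node1")
    print(sorted(cluster.nodes["node2"].serving))
    print(cluster.storage_failover_targets("node3"))
    print(cluster.network_failover_targets("node3"))
    cluster.giveback("node1")
    print(sorted(cluster.nodes["node1"].serving))
```

In this model, a takeover of node1 leaves node2 serving the storage of both nodes, and the second HA pair is unaffected. The real product applies many additional checks (for example, giveback vetoes) that the sketch deliberately leaves out.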