NETWORK ENGINEER BLOG

Tips and Reviews for Engineers

Manually Failing a Disk on NetApp

NetApp lets you fail a disk explicitly. This is useful when you suspect a problem with a disk that the system still considers healthy. In this example, we fail the virtual disk [v5.18].

Manually failing a disk

First, confirm that the virtual disk [v5.18] is in use by aggr0.

node1> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (normal, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   v5.16   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      parity    v5.17   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Spare disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.21   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.22   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.24   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.25   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.26   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.27   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.28   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.29   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.32   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
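
The same check can be scripted. The following is a minimal Python sketch, assuming the "sysconfig -r" output has already been captured as text. The names parse_sysconfig_r and precheck are hypothetical, and the regular expression is based only on the column layout shown above, so it may need adjusting for other systems. It reports where a disk is used and how many spares are available to receive its data.

```python
import re

# One disk row of `sysconfig -r`: role, device, five location columns,
# type, RPM, used and physical sizes, and an optional "(status)" suffix.
# Assumption: rows follow the layout shown in the output above.
ROW = re.compile(
    r"^\s*(?P<role>\S.*?)\s+(?P<device>\w+\.\d+)"
    r"(?:\s+\S+){5}\s+(?P<type>\S+)\s+(?P<rpm>\d+)"
    r"\s+\d+/\d+\s+\d+/\d+\s*(?:\((?P<status>[^)]*)\))?\s*$"
)


def parse_sysconfig_r(text):
    """Return one dict per disk row, tagged with the section it appears in."""
    disks, section = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Aggregate "):
            section = stripped.split()[1]      # e.g. "aggr0"
        elif stripped == "Spare disks":
            section = "spare"
        elif stripped == "Broken disks":
            section = "broken"
        match = ROW.match(line)
        if match:
            disks.append({"section": section, **match.groupdict()})
    return disks


def precheck(text, device):
    """Return ((section, role) or None, number of spare disks)."""
    disks = parse_sysconfig_r(text)
    target = next((d for d in disks if d["device"] == device), None)
    spares = sum(1 for d in disks if d["section"] == "spare")
    location = (target["section"], target["role"]) if target else None
    return location, spares


SAMPLE = """\
Aggregate aggr0 (online, raid_dp) (block checksums)
      data      v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Spare disks

Spare disks for block checksum
spare           v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
"""

print(precheck(SAMPLE, "v5.18"))  # prints (('aggr0', 'data'), 1)
```

A spare count of zero would be a reason to stop before failing the disk, since the copy shown later in this post needs a spare as its target.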

"disk fail" コマンドにより[v5.18]を手動 fail します。

node1> disk fail v5.18
*** You are about to prefail the following file system disk, ***
*** which will eventually result in it being failed ***
  Disk /aggr0/plex0/rg0/v5.18

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      data      v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
***
Really prefail disk v5.18? y
disk fail: The following disk was prefailed: v5.18
Disk v5.18 has been prefailed.  Its contents will be copied to a
replacement disk, and the prefailed disk will be failed out.

Once the command runs, the data on [v5.18] is copied to the spare disk [v5.19].

node1> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (normal, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   v5.16   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      parity    v5.17   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (prefail, copy in progress)
      -> copy   v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (copy 1% completed)

Spare disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.21   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.22   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.24   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.25   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.26   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.27   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.28   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.29   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.32   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
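
To follow the copy without reading the table by eye, the two status annotations can be extracted from the captured output. This is a rough Python sketch, assuming the "-> copy" row always directly follows the prefailed disk's row as it does above; copy_progress is a hypothetical helper name.

```python
import re

# Status annotations that `sysconfig -r` appends while a prefailed disk is
# being copied, exactly as they appear in the output above.
SOURCE = re.compile(
    r"^\s*\S+\s+(?P<device>\w+\.\d+)\s.*\(prefail, copy in progress\)\s*$"
)
TARGET = re.compile(
    r"^\s*->\s*copy\s+(?P<device>\w+\.\d+)\s.*\(copy (?P<pct>\d+)% completed\)\s*$"
)


def copy_progress(text):
    """Return a list of (source, target, percent) for copies in progress."""
    copies, source = [], None
    for line in text.splitlines():
        src = SOURCE.match(line)
        if src:
            source = src["device"]          # remember the prefailed disk
            continue
        tgt = TARGET.match(line)
        if tgt:
            copies.append((source, tgt["device"], int(tgt["pct"])))
            source = None
    return copies


SAMPLE = """\
      data      v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (prefail, copy in progress)
      -> copy   v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (copy 1% completed)
"""

print(copy_progress(SAMPLE))  # prints [('v5.18', 'v5.19', 1)]
```

An empty list means no copy is in progress, which is what the output looks like both before the fail and after the copy has finished.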

When the copy completes, [v5.19] joins aggr0 and [v5.18] is listed as a broken disk.

node1> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (normal, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   v5.16   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      parity    v5.17   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Spare disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.21   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.22   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.24   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.25   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.26   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.27   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.28   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.29   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.32   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Broken disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
admin failed    v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
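
The end state can be confirmed from the "Broken disks" section. The sketch below is a minimal Python example, assuming "Broken disks" is the last section of the output as it is above; broken_disks is a hypothetical helper name. It maps each broken device to the reason shown in the first column.

```python
import re

# First column (which may contain spaces, e.g. "admin failed") followed by
# a device name such as v5.18.
ROW = re.compile(r"^\s*(?P<reason>\S.*?)\s+(?P<device>\w+\.\d+)\s")


def broken_disks(text):
    """Map each device listed under "Broken disks" to its failure reason."""
    broken, in_section = {}, False
    for line in text.splitlines():
        if line.strip() == "Broken disks":
            in_section = True
            continue
        if in_section:
            match = ROW.match(line)
            if match:
                broken[match["device"]] = match["reason"]
    return broken


SAMPLE = """\
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Broken disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
admin failed    v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
"""

print(broken_disks(SAMPLE))  # prints {'v5.18': 'admin failed'}
```

Here the reason "admin failed" marks the disk as one that was failed manually.
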

Unfailing the disk

Switch to advanced mode.

node1> priv set advanced
Warning: These advanced commands are potentially dangerous; use them only when directed to do so by NetApp personnel.

Unfail the disk with the "disk unfail" command.

node1*> disk unfail v5.18
disk unfail: unfailing disk v5.18...

Return to non-advanced mode.

node1*> priv set

[v5.18] is now recognized as a spare disk.

node1> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (normal, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   v5.16   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      parity    v5.17   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Spare disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (not zeroed)
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.21   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.22   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.24   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.25   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.26   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.27   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.28   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.29   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.32   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Zero the disk if necessary.

node1> disk zero spares

node1> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
  Plex /aggr0/plex0 (online, normal, active)
    RAID group /aggr0/plex0/rg0 (normal, block checksums)

      RAID Disk Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
      --------- ------  ------------- ---- ---- ---- ----- --------------    --------------
      dparity   v5.16   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      parity    v5.17   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
      data      v5.19   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448

Spare disks

RAID Disk       Device  HA  SHELF BAY CHAN Pool Type  RPM  Used (MB/blks)    Phys (MB/blks)
---------       ------  ------------- ---- ---- ---- ----- --------------    --------------
Spare disks for block checksum
spare           v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (zeroing, 3% done)
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.21   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.22   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.24   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.25   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.26   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.27   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.28   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.29   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.32   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
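
The zeroing state of each spare can be read from the trailing annotation on its row. The following is a minimal Python sketch; spare_zero_state is a hypothetical helper name, and it assumes that a spare row without any annotation is already zeroed, which is how the outputs in this post read.

```python
import re

# A spare row starts with the literal role "spare" followed by the device.
SPARE = re.compile(r"^\s*spare\s+(?P<device>\w+\.\d+)\s")
# Optional trailing annotation such as "(not zeroed)" or "(zeroing, 3% done)".
NOTE = re.compile(r"\(([^)]*)\)\s*$")


def spare_zero_state(text):
    """Map each spare device to "zeroed", "not zeroed", or its zeroing note."""
    states = {}
    for line in text.splitlines():
        spare = SPARE.match(line)
        if not spare:
            continue
        note = NOTE.search(line)
        # Assumption: no annotation means the spare is already zeroed.
        states[spare["device"]] = note.group(1) if note else "zeroed"
    return states


SAMPLE = """\
Spare disks for block checksum
spare           v5.18   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (zeroing, 3% done)
spare           v5.20   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448
spare           v5.21   v5    ?   ?   FC:B   -  FCAL 15000 1020/2089984      1027/2104448 (not zeroed)
"""

for device, state in spare_zero_state(SAMPLE).items():
    print(device, state)
```

Spares reported as "not zeroed" are the ones that "disk zero spares" would act on.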