Tips and Tricks

Security and Zoning

While data sharing is in vogue, there are instances where you do not want all your servers to see all your storage. The most obvious example is an Internet firewall. There are three ways to provide security between disk volumes and physical ports.

  • Physical separation
    If you have a really critical server that must have exclusive access to its own storage, consider putting it into its own SAN fabric. It's a bit extreme and you will lose some management benefits, but security is guaranteed.
  • Zoning
    Zoning splits the SAN fabric into sections; for example, you might want to separate individual operating systems into their own zones. Three levels of zoning exist.

    1. Node World Wide Name
    2. Port World Wide Name
      Both these methods rely on a name server in the SAN fabric to supply the World Wide Names.
    3. Physical fabric port number (hard wired in the switches, and so-called Hard Zoning). This method is more secure, as it does not rely on access to World Wide Name information. It uses ASICs in the switch ports.
      Zones can overlap; that is, a server or LUN can belong to more than one zone. Zones can also share inter-switch links.
  • LUN masking
    LUN masking is host based. Servers all have access to a given fibre channel port, and can potentially access all LUNs served by that port. However, each server is masked so that it can only see its own LUNs.
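The two software layers above, zoning and LUN masking, amount to simple membership checks, so they can be sketched as set lookups. This is an illustrative model only; all the WWPN and LUN names below are invented:

```python
# Minimal model of zoning plus LUN masking (illustrative names only).
# A zone is a set of port WWNs; a server can reach a storage port only
# if both appear together in at least one zone (zones may overlap).

zones = {
    "zone_unix": {"wwpn_server_a", "wwpn_storage_1"},
    "zone_nt":   {"wwpn_server_b", "wwpn_storage_1"},  # storage_1 is in both zones
}

# LUN masking: per-server set of LUNs it may see behind a storage port.
lun_masks = {
    ("wwpn_server_a", "wwpn_storage_1"): {0, 1},
    ("wwpn_server_b", "wwpn_storage_1"): {2},
}

def can_reach(server, storage):
    """True if any zone contains both the server and the storage port."""
    return any(server in z and storage in z for z in zones.values())

def visible_luns(server, storage):
    """LUNs the server actually sees: zoning is checked first, then the mask."""
    if not can_reach(server, storage):
        return set()
    return lun_masks.get((server, storage), set())

print(visible_luns("wwpn_server_a", "wwpn_storage_1"))  # {0, 1}
print(visible_luns("wwpn_server_a", "wwpn_storage_2"))  # set() - not zoned
```

Note how the zoning check runs before the mask lookup: even a correctly masked server sees nothing unless zoning also permits the path.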

Dual Fabrics

The most robust way to design a SAN is to have two independent fabrics, and to connect every server and every storage device to both. That gives you total resilience. All the SAN switches have a LAN port, which you connect to for configuration and maintenance. If you connect the switches together using ISLs, you can see every switch from one LAN connection, so when you install new firmware, you install it on one switch and let it propagate out to the others. If you have a dual fabric, do not connect any switch in one fabric to any switch in the other; that keeps the fabrics totally independent. Then, if there is a problem with the firmware, it will only affect one half of your SAN, and the other half should keep working.

Size limitations on switched fabric

A switched fabric can contain at most 239 switches, with a maximum of 8 switches between server and storage.
If you use SCSI-FC adapters to connect non-SAN-ready devices, the available bandwidth to that server drops by 80%.
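These limits are easy to check mechanically. The sketch below encodes the two fabric limits stated above, plus the SCSI-FC bridge penalty; the 100 MB/s channel figure is an assumption for illustration:

```python
# Sanity checks against the stated switched-fabric limits.
MAX_SWITCHES = 239   # a fabric can contain at most 239 switches
MAX_PATH_SWITCHES = 8  # at most 8 switches between server and storage

def fabric_within_limits(switch_count, longest_path_switches):
    """True if the fabric stays inside both stated limits."""
    return (switch_count <= MAX_SWITCHES
            and longest_path_switches <= MAX_PATH_SWITCHES)

print(fabric_within_limits(12, 3))    # True
print(fabric_within_limits(250, 3))   # False: too many switches

# SCSI-FC bridge penalty: only 20% of the channel bandwidth remains.
FULL_MB_S = 100        # assumed ~100 MB/s usable channel bandwidth
PENALTY_PCT = 80
print(FULL_MB_S * (100 - PENALTY_PCT) // 100)   # 20
```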

Fan-out and Fan-in

Fan-out is how many hosts can be attached to a single storage path.
Fan-in is how many storage ports can be served from a single host channel.

How many servers can you run down a single fibre channel? As you might expect, there is no simple answer to this; it depends on the FC capacity and the amount of work your servers are doing. If you want to pay a SAN designer a few thousand pounds, they will work it out for you, but a starting rule of thumb is:

  • AIX 4:1
  • NT 5:1
  • Netware 5:1

The latest word here is that a non-blocking architecture could make fan-out ratios irrelevant. Does that mean you can force a quart into a pint pot? I don't think so. Even with non-blocking allowing path sharing, there has to be a limit on how much data you can put down a channel and still maintain reasonable performance.
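Whatever ratio you start from, the real question is whether the aggregate sustained demand fits the channel. A back-of-envelope check like the one below can replace a fixed ratio; the link speed, headroom factor, and per-server throughput figures are all assumptions for illustration:

```python
# Back-of-envelope fan-out check: does the combined sustained demand of
# the attached servers fit within one fibre channel link's usable bandwidth?
# Link speed and per-server figures below are illustrative assumptions.

LINK_MB_S = 100   # ~1 Gbit/s fibre channel, roughly 100 MB/s of payload
HEADROOM = 0.7    # keep utilisation under 70% for reasonable performance

def fan_out_ok(server_demands_mb_s):
    """True if the servers' combined sustained I/O fits under the headroom cap."""
    return sum(server_demands_mb_s) <= LINK_MB_S * HEADROOM

# Four hosts at ~15 MB/s each (in the spirit of the 4:1 rule of thumb):
print(fan_out_ok([15, 15, 15, 15]))       # True: 60 fits under 70
# Add a fifth, busier host and the channel is oversubscribed:
print(fan_out_ok([15, 15, 15, 15, 20]))   # False: 80 exceeds 70
```

This is exactly the calculation a SAN designer does more rigorously: the fixed ratios are just pre-computed answers for typical workloads.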

When designing your SAN, you must consider the future. Your growth rate will determine the number of connections you will require, and your SAN must be scalable, as far as you can predict the future. Your SAN also needs to be available, so you need two independent paths from every server to the data. The paths should be routed through two directors, or two independent switch paths.
Your initial design should include some free ports for growth, but eventually you will need more switches. This is where a two-tier switch approach can help, as it improves scalability. The top tier connects to the servers and the bottom tier connects to the storage, with each switch in the top tier connected to each switch in the bottom tier. This makes it easy to extend the fabric, while providing more redundant paths and better bandwidth.
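The arithmetic behind the two-tier mesh is worth seeing: a full mesh between the tiers costs ISLs up front, but growing either tier only costs one link per switch in the opposite tier. A small sketch, with illustrative switch counts:

```python
# Two-tier switch design: m server-tier switches, each linked to all
# n storage-tier switches. Switch counts below are illustrative.

def isl_count(server_switches, storage_switches):
    """Inter-switch links needed for a full mesh between the two tiers."""
    return server_switches * storage_switches

def links_to_grow_server_tier(storage_switches):
    """Adding one server-tier switch needs one ISL per storage-tier switch."""
    return storage_switches

print(isl_count(4, 2))               # 8 ISLs for a 4 x 2 fabric
print(links_to_grow_server_tier(2))  # a 5th top switch costs just 2 new ISLs
```

This is why the design extends easily: new switches slot into one tier without re-cabling the existing mesh.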

When working out how many paths are needed to a storage subsystem, remember that UNIX and NT cannot share physical paths with other operating systems.

Management

If switches are cascaded in a fabric, they can all be monitored and managed from a single screen.
16-port switches have an extra Ethernet connection; one switch needs to have this connected to the network.

Windows Clusters and SANs

Every Windows cluster should be configured into its own SAN zone. Storage LUNs must be available to every node in the cluster, and visible to that cluster only to prevent data corruption.

All HBAs in a cluster must be of the same type and running the same firmware level, and all the device drivers must be running the same software version.

Never add an arbitrated loop HBA into a switched fabric SAN, as this can cause the whole fabric to go down.

If you connect a server with multiple HBAs, always load the multi-path driver software; otherwise, when the server sees two HBAs it will assume they are on different buses and give each disk two different device numbers. It will then apparently see two disks with the same disk signature and try to re-write one of them. The disk will then fail and the data could be corrupted.

If you use a storage subsystem snapshot facility to create a copy of a volume, it will have the same disk signature as the original. If you try to mount the snapshot to the same server as that hosting the original disk, the server will overwrite the snapshot disk signature. If you mount the disk to another server in the cluster, you will have two identical disks in the cluster and will corrupt your data. The answer is to mount the snapshot disk to a server that is not in the cluster.

Disks must be added to the cluster as cluster resources. Zone the disk to one server first, add it as a cluster resource, then zone it to the rest of the servers in the cluster.
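The ordering of those three steps matters: the disk must never be visible to a second node before it is a cluster resource. A hypothetical sketch of the procedure as state changes (node and LUN names are invented):

```python
# Hypothetical model of the add-a-cluster-disk procedure: zone the LUN to
# one node, register it as a cluster resource, then zone it to the rest.
# Node and LUN names are invented for illustration.

cluster_nodes = ["node1", "node2", "node3"]
zoned_nodes = set()         # nodes currently zoned to see the new LUN
cluster_resources = set()   # LUNs registered as cluster resources

def add_disk_to_cluster(lun):
    """Apply the three steps in the order the text requires."""
    zoned_nodes.add(cluster_nodes[0])        # step 1: zone to one node only
    cluster_resources.add(lun)               # step 2: add as a cluster resource
    zoned_nodes.update(cluster_nodes[1:])    # step 3: zone to the remaining nodes
    return sorted(zoned_nodes), lun in cluster_resources

members, registered = add_disk_to_cluster("lun7")
print(members, registered)   # ['node1', 'node2', 'node3'] True
```

The key invariant the ordering preserves: at no point does an unregistered disk become visible to more than one node, which is the condition that causes the signature-rewrite corruption described above.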