Sunday, April 30, 2006

SATA drives and 520bps formatting

Almost all vendors who use Fibre Channel drives format them at 520 bytes per sector (520bps). In each sector, 512 bytes store data and the remaining 8 bytes store a block checksum (BCS) of those 512 bytes as a protection scheme.
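To make the 512 + 8 layout concrete, here is a minimal Python sketch of a 520-byte sector. The post doesn't say which checksum algorithm any vendor uses, so an 8-byte BLAKE2b digest stands in as an assumption, and the function names are made up for illustration.

```python
import hashlib

DATA_BYTES = 512      # payload per sector
CHECKSUM_BYTES = 8    # block checksum (BCS) appended to the payload


def checksum(data: bytes) -> bytes:
    # Assumption: an 8-byte BLAKE2b digest stands in for the real,
    # vendor-specific checksum algorithm.
    return hashlib.blake2b(data, digest_size=CHECKSUM_BYTES).digest()


def format_sector(data: bytes) -> bytes:
    """Build a 520-byte sector: 512 data bytes followed by the checksum."""
    if len(data) != DATA_BYTES:
        raise ValueError("expected exactly 512 data bytes")
    return data + checksum(data)


def read_sector(sector: bytes) -> bytes:
    """Return the 512 data bytes, or raise if the checksum does not match."""
    if len(sector) != DATA_BYTES + CHECKSUM_BYTES:
        raise ValueError("expected a 520-byte sector")
    data, stored = sector[:DATA_BYTES], sector[DATA_BYTES:]
    if checksum(data) != stored:
        raise IOError("block checksum mismatch")
    return data
```

Flipping any bit in the data portion makes read_sector raise, which is the point: corruption is caught when the sector is read.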

However, PATA/SATA drives have a fixed format of 512bps that can't be changed. So one question you need to ask your vendor, if you deploy SATA drives, is whether and how they implement block checksums on those drives. One vendor I know of, HDS, implements a technique called read-after-write: after they write to the drive, they read the data back and verify it. That also means that each write costs 2 disk I/Os, one write and one read. So for heavy write workloads the overhead can be significant.
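The I/O cost of read-after-write is easy to see in a toy model. This is only a sketch: CountingDisk and write_with_readback are hypothetical names, the "disk" is a Python dictionary, and it says nothing about how HDS actually implements the technique.

```python
class CountingDisk:
    """Toy in-memory disk that counts the I/Os issued to it."""

    def __init__(self):
        self.sectors = {}
        self.reads = 0
        self.writes = 0

    def write(self, lba: int, data: bytes) -> None:
        self.writes += 1
        self.sectors[lba] = bytes(data)

    def read(self, lba: int) -> bytes:
        self.reads += 1
        return self.sectors[lba]


def write_with_readback(disk: CountingDisk, lba: int, data: bytes) -> None:
    """Write a sector, then read it back and compare (read-after-write)."""
    disk.write(lba, data)
    if disk.read(lba) != data:
        raise IOError(f"read-after-write verification failed at LBA {lba}")


disk = CountingDisk()
for lba in range(100):
    write_with_readback(disk, lba, b"x" * 512)

# 100 host writes turn into 200 disk I/Os: 100 writes plus 100 reads.
total_ios = disk.reads + disk.writes
```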

Netapp has a very nice technique, largely attributable to the flexibility of DataONTAP and WAFL. Netapp implements BCS on SATA drives!!! How, you say?

Netapp uses what's called an 8/9ths scheme. A WAFL block is 4k. Because Netapp has complete control of RAID and the filesystem, ONTAP uses every 9th 512-byte sector as an area that contains the checksum of the previous 8 512-byte sectors (one 4k WAFL block). As a result, RAID treats the disk as if it were formatted with 520bps. Thus there's no need to immediately read back the data after it's written.
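The arithmetic behind the 8/9ths scheme can be sketched in a few lines of Python. The assumption here is that each 4k block and its checksum sector sit in one contiguous group of 9 sectors; the function names are hypothetical, and the real on-disk layout in ONTAP may differ in detail.

```python
SECTOR_BYTES = 512
DATA_SECTORS_PER_BLOCK = 8    # 8 x 512 bytes = one 4k WAFL block
SECTORS_PER_GROUP = 9         # 8 data sectors + 1 checksum sector


def block_layout(block_number: int):
    """Return (data sector range, checksum sector) for a 4k block.

    Assumption: groups of 9 sectors are laid out back to back.
    """
    first = block_number * SECTORS_PER_GROUP
    data_sectors = range(first, first + DATA_SECTORS_PER_BLOCK)
    checksum_sector = first + DATA_SECTORS_PER_BLOCK
    return data_sectors, checksum_sector


def usable_blocks(total_sectors: int) -> int:
    """Number of whole 4k blocks that fit on a drive with this many sectors."""
    return total_sectors // SECTORS_PER_GROUP


# Block 2 uses sectors 18..25 for data and sector 26 for its checksum.
data_sectors, checksum_sector = block_layout(2)
```

The tradeoff is capacity: one sector in nine holds checksums rather than data, so roughly 11% of the raw space is given up in exchange for not paying a read-back I/O on every write.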


EricS said...

I had no idea NTAP could do this. I learned something today.

Keep posting. Great info


Anonymous said...

Hi Nick:

Is RAID-DP as fast at serving files as the standard NTAP RAID 4? Also, how much of a performance penalty is there in using SATA as opposed to FC?

Would you keep your critical financial data on SATA disk or FC disk?

Nick Triantos said...

The correct answer is it depends. It depends on the workload. Obviously, for read-intensive environments there's no overhead. For write-intensive environments, we've seen RAID-DP, both internally and at production sites, run as much as 3% slower than RAID4, which, considering what you're getting by deploying it, is a tradeoff most folks are willing to make.

The performance of SATA vs FC drives is very much dependent on the workload type.

In general, you should expect PATA/SATA drives to deliver around 25%-35% of the performance of an equivalent FC drive configuration with disk-bound, small-block random I/O at a high thread count. The comparison looks much better (~55-65%) when the I/O characteristics change to sequential with a large block size and low thread counts.

By nature I'm a conservative person and my personal belief is that you should pick your spots depending on your application requirements, criticality of the data, RPO/RTO, as well as comfort and personal bias. :-)

I've seen a lot of SATA deployments, primarily for Exchange in environments that consider Exchange a non-mission-critical app, as well as for other non-mission-critical apps. These folks also had some pretty loose requirements.

I've also seen environments that considered Exchange a mission-critical app and deployed FC drives.

I don't believe folks should be deploying protocols or drive types, for that matter, purely based on cost. I believe it's the wrong approach. I think you should always look at the application's requirements and make intelligent decisions.

Anonymous said...

Hi, I am trying to measure how much faster a SATA II hard disk is than a SATA I hard disk. I did not find any benchmarks on this. Does anyone have one?

Nick Triantos said...

I haven't seen anything. The main difference between SATA I and SATA II is the interface speed, which can be, and is, very misleading, as the interface type by itself has essentially no tangible impact on the underlying disk drive hardware (i.e. the HDA). In fact, the HDA is the same.

The performance of a drive is determined not only by the interface speed but also by metrics such as seek time, RPM, and sustained throughput from the media, so avoid assuming that the interface type, by itself, makes a significant difference.

All you have to do is take a look at a couple of SATA I and SATA II drive specs, and what you'll find is that they have pretty much the same specs, excluding the interface.

My 0.02c