Copy-On-Write Strategy in ZFS




 http://blogs.sun.com/bjoyes/entry/zfs

 

ZFS is a transaction-based filesystem: whenever files are modified, it performs copy-on-write operations grouped into transactions. In theory, this means the disks hold at any given moment a consistent version of the data, an image from either before the modification or after it, but never from partway through, so filesystem inconsistency is normally impossible.

Transactional mode also means that when a transaction is committed to disk, it is committed across the whole pool atomically: all pointers to data are updated at the same time, along with the uberblock and the block checksums.
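To make the atomic commit a little more concrete, here is a minimal Python sketch. It is a toy model, not ZFS code: the `ToyPool` class, its JSON-encoded root block, and the `(txg, root_addr, root_checksum)` uberblock tuple are all assumptions made up for illustration. It only tries to show the idea that new data and new checksums are written to fresh blocks first, and that a single final assignment to the uberblock is what makes them visible.

```python
import hashlib
import json


def _sha(payload: bytes) -> str:
    """Checksum helper (SHA-256 is chosen here only for illustration)."""
    return hashlib.sha256(payload).hexdigest()


class ToyPool:
    """Hypothetical append-only pool: existing blocks are never overwritten."""

    def __init__(self):
        self.blocks = {}       # address -> bytes
        self.next_addr = 0
        self.uberblock = None  # (txg, root_addr, root_checksum)

    def _alloc(self, payload: bytes) -> int:
        # Always write to a fresh address, never to one already in use.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = payload
        return addr

    def _root_entries(self) -> dict:
        # Follow the uberblock to the root block and verify its checksum.
        if self.uberblock is None:
            return {}
        _, addr, checksum = self.uberblock
        payload = self.blocks[addr]
        if _sha(payload) != checksum:
            raise IOError("root block checksum mismatch")
        return json.loads(payload)

    def commit(self, changes: dict) -> None:
        """Apply all changes as one transaction."""
        entries = self._root_entries()
        for name, data in changes.items():
            # Each pointer records the address and checksum of its data block.
            entries[name] = [self._alloc(data), _sha(data)]
        root_payload = json.dumps(entries, sort_keys=True).encode()
        root_addr = self._alloc(root_payload)
        txg = 1 if self.uberblock is None else self.uberblock[0] + 1
        # The single step that publishes the whole transaction.
        self.uberblock = (txg, root_addr, _sha(root_payload))

    def read(self, name: str) -> bytes:
        addr, checksum = self._root_entries()[name]
        data = self.blocks[addr]
        if _sha(data) != checksum:
            raise IOError("data block checksum mismatch")
        return data


pool = ToyPool()
pool.commit({"a": b"one", "b": b"two"})
pool.commit({"a": b"ONE"})  # "b" keeps its old block and pointer
print(pool.read("a"), pool.read("b"), pool.uberblock[0])
```

Until the last line of `commit` runs, a reader following the uberblock still sees the previous root block and therefore the previous version of every file.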

Here is a fantastic and very clear diagram and explanation that I'd like to quote from http://www.sun.com/bigadmin/.

Transaction-Based Copy-on-Write Operations

ZFS is a combination of file system and volume manager; the file system-level commands require no concept of the underlying physical disks because of storage pool virtualization. All of the high-level interactions occur through the data management unit (DMU), a concept similar to a memory management unit (MMU), only for disks instead of RAM. All of the transactions committed through the DMU are atomic, so data is never left in an inconsistent state.

In addition to being a transaction-based file system, ZFS only performs copy-on-write operations. This means that the blocks containing the in-use data on disk are never modified. The changed information is written to alternate blocks, and the block pointer to the in-use data is only moved once the write transactions are complete. This happens all the way up the file system block structure to the top block, called the uberblock.
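The "all the way up to the top block" behaviour described above can be sketched as path copying on an immutable tree. This is a simplified illustration under assumed names (`Node`, `cow_update`, and `lookup` are hypothetical, not ZFS structures): changing one leaf produces new copies of every block on the path to the root, while untouched subtrees are shared and the old root still describes the old data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Node:
    """An immutable block: a leaf holding data, or an interior block of pointers."""
    data: Optional[bytes] = None
    children: Tuple[Tuple[str, "Node"], ...] = ()


def cow_update(node: Node, path: List[str], new_data: bytes) -> Node:
    """Return a new root reflecting the change; existing nodes are never mutated."""
    if not path:
        return Node(data=new_data)  # the new leaf goes into a fresh block
    name, rest = path[0], path[1:]
    kids = dict(node.children)
    # Only the child on the modified path is replaced; siblings are reused as-is.
    kids[name] = cow_update(kids.get(name, Node()), rest, new_data)
    return Node(children=tuple(sorted(kids.items(), key=lambda kv: kv[0])))


def lookup(node: Node, path: List[str]) -> Optional[bytes]:
    """Follow pointers from a root down to a leaf."""
    for name in path:
        node = dict(node.children)[name]
    return node.data


old_root = cow_update(cow_update(Node(), ["a", "x"], b"1"), ["b", "y"], b"2")
new_root = cow_update(old_root, ["a", "x"], b"9")

print(lookup(old_root, ["a", "x"]))  # the old version is still intact
print(lookup(new_root, ["a", "x"]))  # the new version is reachable from the new root
```

In this model, "moving the pointer" corresponds to switching from `old_root` to `new_root`; everything under `"b"` is the very same object in both trees.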

As shown in Figure 1, transactions select unused blocks to write modified data and only then change the location to which the preceding block points.

Figure 1: Copy-On-Write Transactions
Image source: Jeff Bonwick

If the machine were to suffer a power outage in the middle of a data write, no corruption occurs because the pointer to the "good" data is not moved until the entire write is complete. (Note: The pointer to the data is the only thing that is moved.) This eliminates the need for a journaling or logging file system and any need for a fsck or mirror resync when a machine reboots unexpectedly.
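The power-outage argument can be tried out with a small simulation. The sketch below is only a toy under stated assumptions: `ToyDisk`, `PowerLoss`, and the `crash_after` parameter are invented for this example and do not correspond to any ZFS interface. It contrasts a copy-on-write update, where the pointer moves only after the whole write has finished, with an in-place overwrite.

```python
class PowerLoss(Exception):
    """Simulated power outage in the middle of a write."""


class ToyDisk:
    def __init__(self, initial: bytes):
        self.blocks = [initial]
        self.root = 0  # pointer to the "good" data

    def read(self) -> bytes:
        return self.blocks[self.root]

    def cow_write(self, chunks, crash_after=None):
        """Write into a fresh block, then move the pointer as the last step."""
        new_index = len(self.blocks)
        self.blocks.append(b"")
        for i, chunk in enumerate(chunks):
            if crash_after is not None and i == crash_after:
                raise PowerLoss()
            self.blocks[new_index] += chunk
        self.root = new_index  # reached only once the write is complete

    def in_place_write(self, chunks, crash_after=None):
        """For contrast: overwrite the live block directly."""
        self.blocks[self.root] = b""
        for i, chunk in enumerate(chunks):
            if crash_after is not None and i == crash_after:
                raise PowerLoss()
            self.blocks[self.root] += chunk


cow = ToyDisk(b"old data")
try:
    cow.cow_write([b"new ", b"data"], crash_after=1)
except PowerLoss:
    pass
print(cow.read())  # still the complete old version

torn = ToyDisk(b"old data")
try:
    torn.in_place_write([b"new ", b"data"], crash_after=1)
except PowerLoss:
    pass
print(torn.read())  # a half-written block: neither old nor new
```

After the interrupted copy-on-write update, the half-written block is simply unreferenced, which is why no repair pass is needed in this model.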
