snfsdefrag(1) snfsdefrag(1)
NAME
snfsdefrag - Xsan File System Defrag Utility
SYNOPSIS
snfsdefrag [-DdPqsv] [-G group] [-K key] [-k key] [-m count] [-r] [-S file] <Target> [<Target>...]
snfsdefrag -e [-b] [-G group] [-K key] [-r] [-S file] <Target> [<Target>...]
snfsdefrag -E [-b] [-G group] [-K key] [-r] [-S file] <Target> [<Target>...]
snfsdefrag -c [-G group] [-K key] [-r] [-S file] <Target> [<Target>...]
snfsdefrag -p [-DvPq] [-G group] [-K key] [-m count] [-r] [-S file] <Target> [<Target>...]
snfsdefrag -l [-Dv] [-G group] [-K key] [-m count] [-r] [-S file] <Target> [<Target>...]
DESCRIPTION
snfsdefrag is a utility for defragmenting files on an Xsan volume by relocating the data in a file to a smaller set of extents. Reducing the number of extents in a file improves performance by minimizing disk head movement when performing I/O. In addition, with fewer extents, Xsan File System Manager (FSM) overhead is reduced.

By default, the new extents are created using the file's current storage pool affinity. However, the file can be "moved" to a new storage pool by using the -k option. This migration capability can be especially useful when a storage pool is going out of service. See the use of the -G option in the EXAMPLES section below.

In addition to defragmenting and migrating files, snfsdefrag can be used to list the extents in a file (see the -e option) or to prune away unused space that has been preallocated for the file (see the -p option).
OPTIONS
[-b]
      Show extent size in blocks instead of kilobytes. Only useful with the -e (list extents) option.

[-c]
      This option causes snfsdefrag to just display an extent count instead of defragmenting files.

[-D]
      Turns on debug messages.

[-d]
      Causes snfsdefrag to operate on files containing extents that have depths that are different than the current depth for the extent's storage pool. This option is useful for reclaiming disk space that has become "shadowed" after cvupdatefs has been run for bandwidth expansion. Note that when -d is used, a file may be defragmented due to the stripe depth in one or more of its extents OR due to the file's extent count.

[-e]
      This option causes snfsdefrag to not actually attempt the defragmentation, but instead report the list of extents contained in the file. The extent information includes the starting file relative offset, starting and ending storage pool block addresses, the size of the extent, the depth of the extent, and the storage pool number.

[-E]
      This option has the same effect as the -e option except that file relative offsets and starting and ending stripe group block addresses that are stripe-aligned are highlighted with an asterisk (*). Also, starting storage pool addresses that are equally misaligned with the file relative offset are highlighted with a plus sign (+). Currently, this option is intended for use by support personnel only.

[-G storagepool]
      This option causes snfsdefrag to only operate on files having at least one extent in the given storage pool. Note that multiple -G options can be specified to match files with an extent in at least one of the specified storage pools.

[-K key]
      This option causes snfsdefrag to only operate on source files that have the supplied affinity key. If key is preceded by '!' then snfsdefrag will only operate on source files that do not have the affinity key. See EXAMPLES below.

[-k key]
      Forces the new extent for the file to be created on the storage pool specified by key.

[-l]
      This option causes snfsdefrag to just list candidate files.

[-m count]
      This option tells snfsdefrag to only operate on files containing more than count extents. By default, the value of count is 1.

[-p]
      Causes snfsdefrag to perform a prune operation instead of defragmenting the file. During a prune operation, blocks beyond EOF that have been preallocated either explicitly or as part of inode expansion are freed, thereby reducing disk usage. Files are otherwise unmodified. Note: While prune operations reclaim unused disk space, performing them regularly can lead to free space fragmentation.

[-P]
      Lists skipped files.

[-q]
      Causes snfsdefrag to be quiet.

[-r [<TargetDirectory>]]
      This option instructs snfsdefrag to recurse through the <TargetDirectory> and attempt to defragment each fragmented file that it finds. If <TargetDirectory> is not specified, the current directory is assumed.

[-s]
      Causes snfsdefrag to perform allocations that line up on the beginning block modulus of the storage pool. This can help performance in situations where the I/O size perfectly spans the width of the storage pool's disks.

[-S file]
      Writes status monitoring information in the supplied file. This is used internally by Xsan and the format of this file may change.

[-v]
      Causes snfsdefrag to be verbose.
EXAMPLES
Count the extents in the file foo.

      rock% snfsdefrag -c foo

List the extents in the file foo.

      rock% snfsdefrag -e foo

Defragment the file foo.

      rock% snfsdefrag foo

Defragment the file foo if it contains more than 2 extents. Otherwise, do nothing.

      rock% snfsdefrag -m 2 foo

Traverse the directory abc and its sub-directories and defragment every file found containing more than one extent.

      rock% snfsdefrag -r abc

Traverse the directory abc and its sub-directories and defragment every file found having one or more extents whose depth differs from the current depth of the extent's storage pool OR having more than one extent.

      rock% snfsdefrag -rd abc

Traverse the directory abc and its sub-directories and only defragment files having one or more extents whose depth differs from the current depth of the extent's storage pool.

      rock% snfsdefrag -m 9999999999 -rd abc

Traverse the directory abc and recover unused preallocated disk space in every file visited.

      rock% snfsdefrag -rp abc

Force the file foo to be relocated to the storage pool with the affinity key "fast".

      rock% snfsdefrag -k fast -m 0 foo

If the file foo has the affinity fast, then move its data to a storage pool with the affinity slow.

      rock% snfsdefrag -K fast -k slow -m 0 foo

If the file foo does NOT have the affinity slow, then move its data to a storage pool with the affinity slow.

      rock% snfsdefrag -K '!slow' -k slow -m 0 foo

Traverse the directory abc and migrate any files containing at least one extent in storage pool 2 to any non-exclusive storage pool.

      rock% snfsdefrag -r -G 2 -m 0 abc

Traverse the directory abc and migrate any files containing at least one extent in storage pool 2 to storage pools with the affinity slow.

      rock% snfsdefrag -r -G 2 -k slow -m 0 abc

Traverse the directory abc and list any files that have the affinity fast and have at least one extent in storage pool 2. (The -K option selects source files by affinity; -k would instead name a destination storage pool.)

      rock% snfsdefrag -r -G 2 -K fast -l -m 0 abc
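
Because -l accepts the same selection options as a defragmentation run, one cautious workflow is to list the candidate files first and defragment only after reviewing the list. The following is a minimal sketch of that idea, not part of snfsdefrag itself: the function names defrag_preview and defrag_run and the SNFSDEFRAG override variable are hypothetical, and only the -l, -r, and -m options are taken from this page.

```shell
#!/bin/sh
# Hypothetical helpers: preview, then defragment, with identical selection options.
# SNFSDEFRAG is an assumption of this sketch; it lets the command be swapped
# for a stand-in when experimenting.
SNFSDEFRAG=${SNFSDEFRAG:-snfsdefrag}

defrag_preview() {
    # $1 = extent threshold passed to -m, $2 = directory to traverse
    # -l only lists candidate files; nothing is relocated.
    "$SNFSDEFRAG" -l -r -m "$1" "$2"
}

defrag_run() {
    # Same selection options without -l perform the actual defragmentation.
    "$SNFSDEFRAG" -r -m "$1" "$2"
}
```

For example, defrag_preview 10 abc lists files under abc with more than 10 extents, and defrag_run 10 abc then defragments them.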
NOTES
Only the owner of a file or superuser is allowed to defragment a file. (To act as superuser on an Xsan volume, in addition to becoming the user root, the configuration option GlobalSuperUser must be enabled. See cvfs_config(4) for more information.)

snfsdefrag will not operate on open files or files that have been modified in the past 10 seconds. If a file is modified while defragmentation is in progress, snfsdefrag will abort and the file will be skipped.

snfsdefrag skips special files and files containing holes.

snfsdefrag does not follow symbolic links.

When operating on a file marked for PerfectFit allocations, snfsdefrag will "do the right thing" and preserve the PerfectFit attribute.

While operating on a file, snfsdefrag creates a temporary file named <TargetFile>__defragtmp. If the command is interrupted, snfsdefrag will attempt to remove this file. However, if snfsdefrag is killed or a power failure occurs, this file may be left behind. Because it continues to consume space, it must then be found and removed manually.

snfsdefrag will fail if it cannot locate a set of extents that would reduce the current extent count on a file.
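
Leftover temporary files can be located by their __defragtmp suffix. The following is a minimal sketch using find(1); the function name find_defrag_tmp is hypothetical, and the sketch only lists files so that the result can be reviewed before anything is deleted.

```shell
#!/bin/sh
# List leftover snfsdefrag temporary files below a directory.
# The __defragtmp suffix comes from the NOTES above; the function name is
# an assumption of this sketch.
find_defrag_tmp() {
    # $1 = directory to search; -type f restricts the match to regular files.
    find "$1" -type f -name '*__defragtmp' -print
}
```

Run it only when no snfsdefrag process is active on the volume, since a running defragmentation legitimately owns such a file. After reviewing the output, the listed files can be removed with rm(1).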
ADVANCED FRAGMENTATION ANALYSIS
There are two major types of fragmentation to note: file fragmentation and free space fragmentation.

File fragmentation is measured by the number of file extents used to store a file. A file extent is a contiguous allocation unit within a file. When a large enough contiguous space cannot be found to allocate to a file, multiple smaller file extents are created. Each extent represents a different physical spot in a storage pool.

Requiring multiple extents to address file data impacts performance in a number of ways. First, the file system must do more work looking up locations for a file's data. In addition, for every ten (10) extents used to address a file's data, a new file inode must be allocated to that file. This will cause increased metadata reads while looking up the locations of data.

Also, having file data spread across many different locations in the file system requires the storage hardware to do more work while reading a file. On a disk there will be increased head movements, as the drive seeks around to read in each data extent. Many disks also attempt to optimize I/O performance, for example, by attempting to predict upcoming read locations. When a file's data is contiguous these optimizations work well. However, with a fragmented file the drive optimizations are not nearly as efficient.

A file's fragmentation should be viewed more as a percentage than as a hard number. While it's true that a file of nearly any size with 50000 fragments is extremely fragmented and should be defragmented, a file that has 500 fragments that are mostly one or two FsBlockSize in length is also very fragmented. Keeping files to under 10% fragmentation is the ideal, and how close you come to that ideal is a compromise based on real-world factors (file system use, file sizes and their life span, opportunities to run snfsdefrag, etc.).
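
One way to put an extent count in proportion to file size is to look at the average extent size together with the inode overhead implied by the ten-extents-per-inode rule. The following is a rough sketch only: the function name extent_stats is hypothetical, and reading "every ten extents" as an integer division by 10 is an assumption, not a documented formula.

```shell
#!/bin/sh
# Rough per-file fragmentation figures from a file size and an extent count.
# Assumption: one additional inode per ten extents, computed as extents / 10.
extent_stats() {
    # $1 = file size in bytes, $2 = extent count (for example from snfsdefrag -c)
    size=$1
    extents=$2
    if [ "$extents" -lt 1 ]; then
        echo "extent count must be at least 1" >&2
        return 1
    fi
    # Shell arithmetic truncates toward zero, so both figures are whole numbers.
    echo "avg_extent_bytes=$(( size / extents )) extra_inodes=$(( extents / 10 ))"
}
```

For instance, extent_stats 1048576 25 describes a 1MB file stored in 25 extents. A small average extent size relative to the file size is the signal worth acting on, more so than the raw extent count.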
When examining file fragmentation with snfsdefrag -e, be on the lookout for files that have many small fragments, especially if they have small fragments at the end of the list. If more than 10% of the fragments in the list are InodeExpandMax in size, you'll probably want to increase the InodeExpandMax parameter in the .cfg file. (See the following paragraph for some hints.) If the fragments are all smaller than InodeExpandMax, there are several possible causes. It could be the way the application writes the files; if so, look for alternate I/O options in the application (perhaps use a "buffered" mode instead of a "direct" or "DMA" mode). It could be because the file is opened by a second client as it's being written, or because the file was created as a "sparse" file, etc. The real goal is to see if the work flow can be changed such that files are not created with small fragments in the first place. This is better than spending time later trying to defragment them (prevention is always better than recovery).

Another possible source of fragmentation is the InodeExpandMin/InodeExpandInc/InodeExpandMax parameters. These parameters are used when a write is above the auto_dma_write_length threshold (default is 1MB+1byte). For this reason, most small files are not affected by these values (small files are typically written with small I/Os). However, large files that are written slowly with small I/Os take advantage of these settings once they grow to a threshold size. If you have large files, careful tuning of auto_dma_write_length and the InodeExpand parameters is the best way to keep your file system defragmented. Set the InodeExpandMax value to a value that is close to the size of the average large file on the file system, up to its maximum of 512M.
If the file system is composed primarily of multi-gigabyte files, an aggressive InodeExpandMin of 8M to 16M and the maximum InodeExpandMax of 512M will help the files have the fewest fragments possible (this gets the large files reserving contiguous space as quickly as possible). If the file system has many medium sized (less than 512M) and some large sized files (over 1G), you may want a conservative InodeExpandInc of 1M or 2M while keeping the InodeExpandMax of 512M. If the file system is composed primarily of small files, then these parameters have less of an impact because those small files probably don't even use these values, but they are still worth tuning towards your average file size. Over-allocated space can be reclaimed with the -p option, though when tuned correctly, there should be very little wasted space. Some experimentation with the InodeExpand* parameters may be necessary, and these parameters in the .cfg file can be adjusted with just a stop/start (or failover) of the file system.

Some common causes of fragmentation are having very full stripe groups (possibly because of affinities), a file system that has a lot of fragmented free space (deleting a fragmented file produces fragmented free space), heavy use of CIFS or NFS, which typically write out of order and cause unoptimized (uncoalesced) allocations, or an application that writes files in a random order.

snfsdefrag is designed to detect files which contain file fragmentation and coalesce that data onto a minimal number of file extents. The efficiency of snfsdefrag is dependent on the state of the file system's free data blocks, or free space.

The second type of fragmentation is free space fragmentation. The file system's free space is the pool of unallocated data blocks. Space allocation for new files, as well as allocations for extending existing files, comes from the file system's free space.
Free space fragmentation is measured by the number of fragments of contiguous free blocks. Fragmentation in the file system's free space affects the file system's ability to allocate large extents. A file can only be allocated an extent as large as the largest contiguous block of free space. Thus free space fragmentation can lead to file fragmentation in larger files.

As snfsdefrag processes fragmented files it attempts to use large enough free space fragments to create a new defragmented file space. If free space is too fragmented, snfsdefrag may not be able to allocate a large enough extent for the file's data. In the case that snfsdefrag must use multiple extents in the defragmented file, it will only proceed if the processed file will have fewer extents than the original. Otherwise snfsdefrag will abort that file's defrag process and move on to remaining defrag requests.
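
The interaction between free space fragmentation and the "fewer extents than the original" rule can be illustrated with a simplified model. The sketch below is not snfsdefrag's allocator: it assumes a greedy choice of the largest free fragments first, takes fragment sizes as plain numbers on standard input, and uses the hypothetical function names min_extents_needed and would_reduce.

```shell
#!/bin/sh
# Simplified model: how many free fragments, largest first, are needed to
# hold a file of a given size? Sizes are in blocks; this is an illustration,
# not the allocation policy used by the file system.
min_extents_needed() {
    # $1 = file size in blocks; stdin = free fragment sizes, one per line
    sort -nr | awk -v need="$1" '
        !found { got += $1; n++; if (got >= need) found = 1 }
        END { if (!found) exit 1; print n }'
}

would_reduce() {
    # $1 = current extent count, $2 = file size in blocks; stdin as above.
    # Succeeds only if the modeled result has fewer extents than the original.
    needed=$(min_extents_needed "$2") || return 1
    [ "$needed" -lt "$1" ]
}
```

With free fragments of 10, 50, 20, and 5 blocks, a 60-block file needs two fragments in this model, so a file currently in 4 extents would be worth processing while a file already in 2 extents would be skipped.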
FRAGMENTATION ANALYSIS EXAMPLES
The following examples include reporting from snfsdefrag as well as cvfsck. Some examples require additional tools such as awk and sort.

Report a specific file's fragmentation (extent count).

      # snfsdefrag -c <filename>

The following command creates a report showing each file's path followed by its extent count, sorted by extent count. Files with the greatest number of extents appear at the top of the list. Replace <fsname> in the following example with the name of your Xsan file system. The report is written to stdout and should be redirected to a file.

      # cvfsck -x <fsname> | awk -F, '{print$6", "$7}' | sort -uk1 -t, \
            | sort -nrk2 -t,

This next command displays all files with more than 10 extents and a size of more than 1MB (1048576 bytes). Replace <fsname> in the following example with the name of your Xsan file system. The report is written to stdout and can be redirected to a file.

      # echo "#extents file size av. extent size filename" ;\
        cvfsck -r <fsname> | awk '{if ($3+0 > 1048576 && $5+0 > 10)\
        { printf("%8d %16d %16d %s\n", $5, $3, $3/$5, $8); }}' | sort -nr

The next command displays a report of free space fragmentation. This allows an administrator to see if free space fragmentation may affect future allocation fragmentation. See the cvfsck(1) man page for a description of the report output.

      # cvfsck -f <fsname>
SEE ALSO
cvfsck(1), cvcp(1), cvmkfile(1), cvfs_config(4), cvaffinity(1)

Xsan File System December 2005 snfsdefrag(1)
Mac OS X 10.7 - Generated Sat Aug 20 09:59:29 CDT 2011