File: gawk.info,  Node: Extension Sample File Functions,  Next: Extension Sample Fnmatch,  Up: Extension Samples

17.7.1 File-Related Functions
-----------------------------

The 'filefuncs' extension provides three different functions, as
follows.  The usage is:

'@load "filefuncs"'
     This is how you load the extension.

'result = chdir("/some/directory")'
     The 'chdir()' function is a direct hook to the 'chdir()' system
     call to change the current directory.  It returns zero upon success
     or a value less than zero upon error.  In the latter case, it
     updates 'ERRNO'.

'result = stat("/some/path", statdata' [', follow']')'
     The 'stat()' function provides a hook into the 'stat()' system
     call.  It returns zero upon success or a value less than zero upon
     error.  In the latter case, it updates 'ERRNO'.  By default, it
     uses the 'lstat()' system call.  However, if passed a third
     argument, it uses 'stat()' instead.

     In all cases, it clears the 'statdata' array.  When the call is
     successful, 'stat()' fills the 'statdata' array with information
     retrieved from the filesystem, as follows:

     Subscript    Field in 'struct stat'             File type
     ----------------------------------------------------------------
     '"name"'     The file name                      All
     '"dev"'      'st_dev'                           All
     '"ino"'      'st_ino'                           All
     '"mode"'     'st_mode'                          All
     '"nlink"'    'st_nlink'                         All
     '"uid"'      'st_uid'                           All
     '"gid"'      'st_gid'                           All
     '"size"'     'st_size'                          All
     '"atime"'    'st_atime'                         All
     '"mtime"'    'st_mtime'                         All
     '"ctime"'    'st_ctime'                         All
     '"rdev"'     'st_rdev'                          Device files
     '"major"'    'st_major'                         Device files
     '"minor"'    'st_minor'                         Device files
     '"blksize"'  'st_blksize'                       All
     '"pmode"'    A human-readable version of the    All
                  mode value, like that printed by
                  'ls' (for example,
                  '"-rwxr-xr-x"')
     '"linkval"'  The value of the symbolic link     Symbolic links
     '"type"'     The type of the file as a          All
                  string--one of '"file"',
                  '"blockdev"', '"chardev"',
                  '"directory"', '"socket"',
                  '"fifo"', '"symlink"', '"door"',
                  or '"unknown"' (not all systems
                  support all file types)

'flags = or(FTS_PHYSICAL, ...)'
'result = fts(pathlist, flags, filedata)'
     Walk the file trees provided in 'pathlist' and fill in the
     'filedata' array, as described next.  'flags' is the bitwise OR of
     several predefined values, also described in a moment.  Return zero
     if there were no errors, otherwise return -1.

   The 'fts()' function provides a hook to the C library 'fts()'
routines for traversing file hierarchies.  Instead of returning data
about one file at a time in a stream, it fills in a multidimensional
array with data about each file and directory encountered in the
requested hierarchies.

   The arguments are as follows:

'pathlist'
     An array of file names.  The element values are used; the index
     values are ignored.

'flags'
     This should be the bitwise OR of one or more of the following
     predefined constant flag values.  At least one of 'FTS_LOGICAL' or
     'FTS_PHYSICAL' must be provided; otherwise 'fts()' returns an error
     value and sets 'ERRNO'.  The flags are:

     'FTS_LOGICAL'
          Do a "logical" file traversal, where the information returned
          for a symbolic link refers to the linked-to file, and not to
          the symbolic link itself.  This flag is mutually exclusive
          with 'FTS_PHYSICAL'.

     'FTS_PHYSICAL'
          Do a "physical" file traversal, where the information returned
          for a symbolic link refers to the symbolic link itself.  This
          flag is mutually exclusive with 'FTS_LOGICAL'.

     'FTS_NOCHDIR'
          As a performance optimization, the C library 'fts()' routines
          change directory as they traverse a file hierarchy.  This flag
          disables that optimization.

     'FTS_COMFOLLOW'
          Immediately follow a symbolic link named in 'pathlist',
          whether or not 'FTS_LOGICAL' is set.

     'FTS_SEEDOT'
          By default, the C library 'fts()' routines do not return
          entries for '.' (dot) and '..' (dot-dot).  This option causes
          entries for dot-dot to also be included.  (The extension
          always includes an entry for dot; more on this in a moment.)

     'FTS_XDEV'
          During a traversal, do not cross onto a different mounted
          filesystem.

'filedata'
     The 'filedata' array holds the results.  'fts()' first clears it.
     Then it creates an element in 'filedata' for every element in
     'pathlist'.  The index is the name of the directory or file given
     in 'pathlist'.  The element for this index is itself an array.
     There are two cases:

     _The path is a file_
          In this case, the array contains two or three elements:

          '"path"'
               The full path to this file, starting from the "root" that
               was given in the 'pathlist' array.

          '"stat"'
               This element is itself an array, containing the same
               information as provided by the 'stat()' function
               described earlier for its 'statdata' argument.  The
               element may not be present if the 'stat()' system call
               for the file failed.

          '"error"'
               If some kind of error was encountered, the array will
               also contain an element named '"error"', which is a
               string describing the error.

     _The path is a directory_
          In this case, the array contains one element for each entry in
          the directory.  If an entry is a file, that element is the
          same as for files, just described.  If the entry is a
          directory, that element is (recursively) an array describing
          the subdirectory.  If 'FTS_SEEDOT' was provided in the flags,
          then there will also be an element named '".."'.  This element
          will be an array containing the data as provided by 'stat()'.

          In addition, there will be an element whose index is '"."'.
          This element is an array containing the same two or three
          elements as for a file: '"path"', '"stat"', and '"error"'.

   The 'fts()' function returns zero if there were no errors.
Otherwise, it returns -1.

     NOTE: The 'fts()' extension does not exactly mimic the interface of
     the C library 'fts()' routines, choosing instead to provide an
     interface that is based on associative arrays, which is more
     comfortable to use from an 'awk' program.  This includes the lack
     of a comparison function, because 'gawk' already provides powerful
     array sorting facilities.  Although an 'fts_read()'-like interface
     could have been provided, this felt less natural than simply
     creating a multidimensional array to represent the file hierarchy
     and its information.

   See 'test/fts.awk' in the 'gawk' distribution for an example use of
the 'fts()' extension function.
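
   The following is a minimal sketch, not taken from the 'gawk'
distribution, of how the three functions might be combined.  It assumes
a 'gawk' new enough to support '@load' and 'isarray()', and it does not
use 'FTS_SEEDOT', so no '".."' elements appear.  The function names
'show()' and 'report()', the variable names, and the default directory
'"."' are illustrative assumptions.  The sketch tells a file entry from
a subdirectory by checking for a scalar '"path"' element; this is one
plausible convention, not something the extension prescribes.

```awk
@load "filefuncs"

# Print one line for a file entry (or a directory's "." entry).
# Such an entry holds "path", usually "stat", and possibly "error".
function report(entry)
{
    if ("error" in entry)
        printf "%s: %s\n", entry["path"], entry["error"]
    else if ("stat" in entry)
        printf "%s %10d %s\n", entry["stat"]["pmode"],
               entry["stat"]["size"], entry["path"]
}

# Walk one level of the array filled in by fts().  An element with a
# scalar "path" is treated as a file entry; any other subarray is
# assumed to describe a subdirectory and is walked recursively.
function show(node,    name)
{
    for (name in node) {
        if (! isarray(node[name]))
            continue
        if (("path" in node[name]) && ! isarray(node[name]["path"]))
            report(node[name])
        else
            show(node[name])
    }
}

BEGIN {
    # Hypothetical starting point: first argument, else ".".
    dir = (ARGC > 1) ? ARGV[1] : "."

    if (chdir(dir) < 0) {
        printf "chdir(%s): %s\n", dir, ERRNO > "/dev/stderr"
        exit 1
    }

    # Two arguments only, so lstat() semantics apply.
    if (stat(".", info) < 0) {
        printf "stat(.): %s\n", ERRNO > "/dev/stderr"
        exit 1
    }
    printf "%s is a %s owned by uid %d\n", dir, info["type"], info["uid"]

    # Element values of pathlist are used; the indices are ignored.
    pathlist[1] = "."
    flags = or(FTS_PHYSICAL, FTS_XDEV)
    if (fts(pathlist, flags, filedata) < 0)
        print "fts() reported an error" > "/dev/stderr"

    show(filedata)
}
```

   Saved under a hypothetical name such as 'walk.awk', the sketch could
be run as 'gawk -f walk.awk /some/directory'.  Because 'for (name in
node)' visits elements in no particular order, the output order is
unspecified unless the traversal order is controlled separately, for
example with 'PROCINFO["sorted_in"]'.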