author: Michael Kerrisk <mtk.manpages@gmail.com> 2004-11-03 13:51:07 +0000
committer: Michael Kerrisk <mtk.manpages@gmail.com> 2004-11-03 13:51:07 +0000
commit: fea681dafb1363a154b7fc6d59baa83d2a9ebc5c (patch)
tree: 8ea275c0f242af739617d0afc3e1b16c4eff3dc2 /man7/glob.7
Import of man-pages 1.70
Diffstat (limited to 'man7/glob.7')
-rw-r--r-- man7/glob.7 187
1 files changed, 187 insertions, 0 deletions
diff --git a/man7/glob.7 b/man7/glob.7
new file mode 100644
index 00000000..02ead1c5
--- /dev/null
+++ b/man7/glob.7
@@ -0,0 +1,187 @@
+.\" Copyright (c) 1998 Andries Brouwer
+.\"
+.\" This is free documentation; you can redistribute it and/or
+.\" modify it under the terms of the GNU General Public License as
+.\" published by the Free Software Foundation; either version 2 of
+.\" the License, or (at your option) any later version.
+.\"
+.\" The GNU General Public License's references to "object code"
+.\" and "executables" are to be interpreted as the output of any
+.\" document formatting or typesetting system, including
+.\" intermediate and printed output.
+.\"
+.\" This manual is distributed in the hope that it will be useful,
+.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
+.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+.\" GNU General Public License for more details.
+.\"
+.\" You should have received a copy of the GNU General Public
+.\" License along with this manual; if not, write to the Free
+.\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111,
+.\" USA.
+.\"
+.\" 2003-08-24 fix for / by John Kristoff + joey
+.\"
+.TH GLOB 7 2003-08-24 "Unix" "Linux Programmer's Manual"
+.SH NAME
+glob \- Globbing pathnames
+.SH DESCRIPTION
+Long ago, in Unix V6, there was a program
+.I /etc/glob
+that would expand wildcard patterns.
+Soon afterwards this became a shell built-in.
+
+These days there is also a library routine
+.BR glob (3)
+that will perform this function for a user program.
+
+The rules are as follows (POSIX 1003.2, 3.13).
+.SH "WILDCARD MATCHING"
+A string is a wildcard pattern if it contains one of the
+characters `?', `*' or `['. Globbing is the operation
+that expands a wildcard pattern into the list of pathnames
+matching the pattern. Matching is defined by:
+
+A `?' (not between brackets) matches any single character.
+
+A `*' (not between brackets) matches any string,
+including the empty string.
+
+.SS "Character classes"
+An expression `[...]' where the first character after the
+leading `[' is not an `!' matches a single character,
+namely any of the characters enclosed by the brackets.
+The string enclosed by the brackets cannot be empty;
+therefore `]' can be allowed between the brackets, provided
+that it is the first character. (Thus, `[][!]' matches the
+three characters `[', `]' and `!'.)
+
+.SS Ranges
+There is one special convention:
+two characters separated by `-' denote a range.
+(Thus, `[A-Fa-f0-9]' is equivalent to `[ABCDEFabcdef0123456789]'.)
+One may include `-' in its literal meaning by making it the
+first or last character between the brackets.
+(Thus, `[]-]' matches just the two characters `]' and `-',
+and `[--0]' matches the three characters `-', `.', `0', since `/'
+cannot be matched.)
+
+.SS Complementation
+An expression `[!...]' matches a single character, namely
+any character that is not matched by the expression obtained
+by removing the first `!' from it.
+(Thus, `[!]a-]' matches any single character except `]', `a' and `-'.)
+
+One can remove the special meaning of `?', `*' and `[' by
+preceding them by a backslash, or, in case this is part of
+a shell command line, enclosing them in quotes.
+Between brackets these characters stand for themselves.
+Thus, `[[?*\e]' matches the four characters `[', `?', `*' and `\e'.
+
+.SH PATHNAMES
+Globbing is applied on each of the components of a pathname
+separately. A `/' in a pathname cannot be matched by a `?' or `*'
+wildcard, or by a range like `[.-0]'. A range cannot contain an
+explicit `/' character; this would lead to a syntax error.
+
+If a filename starts with a `.', this character must be matched explicitly.
+(Thus, `rm *' will not remove .profile, and `tar c *' will not
+archive all your files; `tar c .' is better.)
+
+.SH "EMPTY LISTS"
+The nice and simple rule given above: `expand a wildcard pattern
+into the list of matching pathnames' was the original Unix
+definition. It allowed one to have patterns that expand into
+an empty list, as in
+.br
+.nf
+ xv -wait 0 *.gif *.jpg
+.fi
+where perhaps no *.gif files are present (and this is not
+an error).
+However, POSIX requires that a wildcard pattern is left
+unchanged when it is syntactically incorrect, or the list of
+matching pathnames is empty.
+With
+.I bash
+one can force the classical behaviour with `shopt \-s nullglob'
+(bash 1.x: allow_null_glob_expansion=true).
+
+(Similar problems occur elsewhere. E.g., where old scripts have
+.br
+.nf
+ rm `find . -name "*~"`
+.fi
+new scripts require
+.br
+.nf
+ rm -f nosuchfile `find . -name "*~"`
+.fi
+to avoid error messages from
+.I rm
+called with an empty argument list.)
+
+.SH NOTES
+.SS Regular expressions
+Note that wildcard patterns are not regular expressions,
+although they are a bit similar. First of all, they match
+filenames, rather than text, and secondly, the conventions
+are not the same: e.g., in a regular expression `*' means zero or
+more copies of the preceding thing.
+
+Now that regular expressions have bracket expressions where
+the negation is indicated by a `^', POSIX has declared the
+effect of a wildcard pattern `[^...]' to be undefined.
+
+.SS Character classes and Internationalization
+Of course ranges were originally meant to be ASCII ranges,
+so that `[ -%]' stands for `[ !"#$%]' and `[a-z]' stands
+for "any lowercase letter".
+Some Unix implementations generalized this so that a range X-Y
+stands for the set of characters with code between the codes for
+X and for Y. However, this requires the user to know the
+character coding in use on the local system, and moreover, is
+not convenient if the collating sequence for the local alphabet
+differs from the ordering of the character codes.
+Therefore, POSIX extended the bracket notation greatly,
+both for wildcard patterns and for regular expressions.
+In the above we saw three types of items that can occur in a bracket
+expression: namely (i) the negation, (ii) explicit single characters,
+and (iii) ranges. POSIX specifies ranges in an internationally
+more useful way and adds three more types:
+
+(iii) Ranges X-Y comprise all characters that fall between X
+and Y (inclusive) in the current collating sequence as defined
+by the LC_COLLATE category in the current locale.
+
+(iv) Named character classes, like
+.br
+.nf
+[:alnum:] [:alpha:] [:blank:] [:cntrl:]
+[:digit:] [:graph:] [:lower:] [:print:]
+[:punct:] [:space:] [:upper:] [:xdigit:]
+.fi
+so that one can say `[[:lower:]]' instead of `[a-z]', and have
+things work in Denmark, too, where there are three letters past `z'
+in the alphabet.
+These character classes are defined by the LC_CTYPE category
+in the current locale.
+
+(v) Collating symbols, like `[.ch.]' or `[.a-acute.]',
+where the string between `[.' and `.]' is a collating
+element defined for the current locale. Note that this may
+be a multi-character element.
+
+(vi) Equivalence class expressions, like `[=a=]',
+where the string between `[=' and `=]' is any collating
+element from its equivalence class, as defined for the
+current locale. For example, `[[=a=]]' might be equivalent
+to `[aáàäâ]' (warning: Latin-1 here), that is,
+to `[a[.a-acute.][.a-grave.][.a-umlaut.][.a-circumflex.]]'.
+
+.SH "SEE ALSO"
+.BR sh (1),
+.BR fnmatch (3),
+.BR glob (3),
+.BR locale (7),
+.BR regex (7)
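
The matching rules in the WILDCARD MATCHING, PATHNAMES and EMPTY LISTS sections of the page added by this commit can be tried out with a short script. The following is a minimal sketch in Python and is not part of the commit. It assumes that Python's `fnmatch` and `glob` modules are an acceptable stand-in for the shell; they are not POSIX fnmatch(3) or glob(3). In particular, `fnmatchcase` has no backslash escapes or named classes and does not treat `/` specially, and `glob.glob` returns an empty list when nothing matches (the classical behaviour rather than the POSIX shell one). The helper names `matches` and `glob_names` and the file names are made up for illustration.

```python
import glob
import os
import tempfile
from fnmatch import fnmatchcase


def matches(pattern, candidates):
    """Return the candidates, in order, that match the wildcard pattern."""
    return [c for c in candidates if fnmatchcase(c, pattern)]


# '?' matches exactly one character; '*' matches any string, including "".
single = matches("?", ["", "a", "ab"])
star = matches("a*", ["a", "abc", "ba"])

# ']' stands for itself when it is the first character between the brackets.
bracket = matches("[][!]", ["[", "]", "!", "a"])

# Two characters separated by '-' denote a range.
hexdigits = matches("[A-Fa-f0-9]", ["B", "e", "7", "g", "G"])

# A leading '!' complements the set; the trailing '-' is literal.
complement = matches("[!]a-]", ["]", "a", "-", "b"])


def glob_names(directory, pattern):
    """Expand pattern inside directory and return the sorted base names."""
    # glob.escape() protects any wildcard characters in the directory name.
    paths = glob.glob(os.path.join(glob.escape(directory), pattern))
    return sorted(os.path.basename(p) for p in paths)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, created empty just for the demonstration.
    for name in (".profile", "notes.txt", "pic.jpg"):
        open(os.path.join(tmp, name), "w").close()

    visible = glob_names(tmp, "*")      # a leading '.' is not matched by '*'
    hidden = glob_names(tmp, ".*")      # it has to be matched explicitly
    no_gifs = glob_names(tmp, "*.gif")  # no match: Python returns an empty list

print(single, star, bracket, hexdigits, complement)
print(visible, hidden, no_gifs)
```

The pattern `[--0]` from the Ranges section is deliberately left out: `fnmatchcase` would also accept `/` there, because it knows nothing about pathname components.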