Glob Patterns Examples and Syntax
Introduction
Glob patterns, also known as wildcard patterns or globbing patterns, are invaluable tools for matching and manipulating filenames and paths in a filesystem. They provide a flexible and efficient way to specify patterns and perform operations on multiple files at once. In this article, we will look at the syntax of glob patterns, work through examples, and see how they are used in various programming languages and tools.
Glob Patterns Syntax
At its core, a glob pattern is a string pattern that uses special characters called wildcards to represent variable parts of filenames or paths. These wildcards allow us to match filenames based on specific patterns rather than exact names. The most commonly used wildcards are “*”, “?”, and “[]”.
Note: normally, the path separator character (/ on Linux/Unix, macOS, etc., or \ on Windows) is never matched by a wildcard. Some shells, such as Bash, have functionality that allows users to circumvent this.
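The separator rule can be observed with Python's standard "glob" module. The following is a minimal sketch, assuming a throwaway directory layout (a.txt and sub/b.txt) that is invented purely for illustration: a plain "*" stays inside one directory level, while "**" with recursive=True also descends into subdirectories.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout used only for this demo: a.txt and sub/b.txt
    os.makedirs(os.path.join(root, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("demo")

    # Escape the temp path so any special characters in it are taken literally
    safe_root = glob.escape(root)

    # "*" does not cross the path separator: only the top-level file matches
    top_level = glob.glob(os.path.join(safe_root, "*.txt"))

    # "**" with recursive=True matches zero or more directory levels
    everything = glob.glob(os.path.join(safe_root, "**", "*.txt"), recursive=True)

    top_names = sorted(os.path.relpath(p, root) for p in top_level)
    all_names = sorted(os.path.relpath(p, root) for p in everything)

print(top_names)
print(all_names)
```

The first list should contain only the top-level file, while the second also includes the file inside "sub".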
Wildcard | Description | Example | Matches | Does not match |
---|---|---|---|---|
* | matches any number of any characters including none | Law* | Law Laws Lawyer | GrokLaw La aw |
* | matches any number of any characters including none | *Law* | Law GrokLaw Lawyer | La aw |
** | matches zero or more directories and their subdirectories. In many implementations, hidden entries (names that start with “.”) are skipped by default | src/** | src/a.js src/b/a.js src/b/ | |
\ | an escape character | |||
? | matches any single character | ?at | Cat cat Bat bat | at |
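^ | at the start of a bracket expression, negates the set, so the characters listed within the square brackets are not accepted (in Bash and many glob libraries this is equivalent to “!”). Outside square brackets it is an ordinary character in a glob pattern, not a start-of-string anchor as in regular expressions | [^CB]at | cat bat | Cat Bat |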
[abc] | matches one character given in the bracket | [CB]at | Cat Bat | cat bat CBat |
[a-z] | matches one character from the (locale-dependent) range given in the bracket | Letter[0-9] | Letter0 Letter1 Letter2 | Letters Letter Letter10 |
[^a-z] or [!a-z] | matches one character that is not in the set or range given in the bracket. In an interactive Bash session, “!” can be subject to history expansion, so [!abc] may need to be written as [\!abc] | [!A-Z]at | aat bat zat | Aat Bat Zat |
{x, y, ...} | use commas within curly brackets to separate patterns | a.{png,jp{,e}g} | a.png a.jpg a.jpeg | |
() | parentheses must be used following the characters ?, *, +, @, and !. The content within the parentheses consists of a set of patterns separated by the “|” symbol, such as abc|a?c|ac* | |||
?(pattern-list) | matches the given pattern zero or one time | a.?(txt|bin) | a. a.txt a.bin | a |
*(pattern-list) | matches the given pattern zero or more times | a.*(txt|bin) | a. a.txt a.bin a.txtbin | a |
+(pattern-list) | matches the given pattern one or more times | a.+(txt|bin) | a.txt a.bin a.txtbin | a a. |
@(pattern-list) | matches the given pattern | a.@(txt|bin) | a.txt a.bin | a. a.txtbin |
!(pattern-list) | matches a pattern that is not given | a.!(txt|bin) | a. a.txtbin | a.txt a.bin |
The pattern-list is a collection of patterns separated by the “|” symbol, such as abc|a?c|ac*. In Bash, the pattern-list forms are available when the extglob shell option is enabled.
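Several rows of the table can be checked with Python's standard "fnmatch" module. This is a small sketch rather than a full test of every wildcard: fnmatch supports only “*”, “?”, “[seq]”, and “[!seq]”, so the brace and pattern-list rows are left out, and negation is written with “!” rather than “^”. The CASES list and the check helper are names made up for this example.

```python
from fnmatch import fnmatchcase

# (pattern, names that should match, names that should not), copied from the table
CASES = [
    ("Law*", ["Law", "Laws", "Lawyer"], ["GrokLaw", "La", "aw"]),
    ("*Law*", ["Law", "GrokLaw", "Lawyer"], ["La", "aw"]),
    ("?at", ["Cat", "cat", "Bat", "bat"], ["at"]),
    ("[CB]at", ["Cat", "Bat"], ["cat", "bat", "CBat"]),
    ("Letter[0-9]", ["Letter0", "Letter1", "Letter2"], ["Letters", "Letter", "Letter10"]),
    ("[!A-Z]at", ["aat", "bat", "zat"], ["Aat", "Bat", "Zat"]),
]


def check(pattern, should_match, should_not_match):
    """Return True when the pattern behaves as the table row says."""
    # fnmatchcase compares case-sensitively on every platform
    return (all(fnmatchcase(name, pattern) for name in should_match)
            and not any(fnmatchcase(name, pattern) for name in should_not_match))


results = {pattern: check(pattern, yes, no) for pattern, yes, no in CASES}
for pattern, ok in results.items():
    print(f"{pattern!r}: {'as in the table' if ok else 'differs from the table'}")
```

Note that fnmatch works on plain strings, so unlike filesystem globbing its “*” will also match a path separator.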
Glob Patterns in Practice
Glob patterns find extensive use in various programming languages, command-line interfaces, and tools. Many programming languages provide built-in functions or libraries that support glob pattern matching for file operations. Tools like the Unix command “find” or the Python library “glob” leverage glob patterns to search for files in a directory hierarchy.
For example, in Python, the “glob” module allows us to easily retrieve files matching a specific pattern:
import glob
# Get all text files in the current directory
text_files = glob.glob("*.txt")
print(text_files)
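The same kind of search can also be written with the standard "pathlib" module. The sketch below assumes a temporary directory with invented file names (notes.txt, data.csv, docs/readme.txt) so that it can run anywhere; Path.glob looks only at the level described by the pattern, and Path.rglob searches subdirectories as well.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical files created only for this demo
    (root / "docs").mkdir()
    for rel in ("notes.txt", "data.csv", "docs/readme.txt"):
        (root / rel).write_text("demo")

    # Path.glob matches relative to the directory it is called on
    direct = sorted(p.name for p in root.glob("*.txt"))

    # Path.rglob also searches every subdirectory
    nested = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))

print(direct)
print(nested)
```

Both calls return iterators of Path objects, which is convenient when the matches are processed further instead of just printed.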
Excluding folders or files with glob patterns
To ignore all files in the public directory except for the “worker” folder, add the following to .gitignore:
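# Ignore each entry directly inside public/ (a deeper pattern such as /public/**/* would keep the contents of worker ignored)
/public/*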
!/public/worker
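The "ignore, then re-include" idea can be modelled in a few lines of Python. This is a simplified sketch and not Git's actual matching algorithm: the helper names rule_to_regex and is_ignored are hypothetical, only “*” and “?” are supported, and the rule list (/public/* followed by !/public/worker) is an assumption chosen to keep the model small. Rules are applied in order, the last matching rule wins, and a path counts as ignored as soon as one of its parent directories is ignored.

```python
import re


def rule_to_regex(rule):
    """Translate a tiny glob subset; '*' and '?' never match '/'."""
    pieces = []
    for ch in rule.lstrip("/"):
        if ch == "*":
            pieces.append("[^/]*")
        elif ch == "?":
            pieces.append("[^/]")
        else:
            pieces.append(re.escape(ch))
    return re.compile("".join(pieces))


def is_ignored(path, rules):
    """Check each parent directory, then the path itself, top-down."""
    segments = path.split("/")
    for depth in range(1, len(segments) + 1):
        prefix = "/".join(segments[:depth])
        ignored = False
        for rule in rules:
            negated = rule.startswith("!")
            pattern = rule_to_regex(rule[1:] if negated else rule)
            if pattern.fullmatch(prefix):
                # A later matching rule overrides an earlier one
                ignored = not negated
        if ignored:
            return True
    return False


# Assumed rule set for this sketch
RULES = ["/public/*", "!/public/worker"]

for candidate in ("public/index.html", "public/worker/sw.js", "src/app.js"):
    print(candidate, "->", "ignored" if is_ignored(candidate, RULES) else "kept")
```

In this model, files directly under public are ignored, while the worker folder and its contents are kept because the negated rule is the last one that matches public/worker.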
Differences between glob patterns and regular expressions
Glob patterns and regular expressions are both used for pattern matching, but they have distinct differences in syntax, functionality, and use cases. Here are some key differences between glob patterns and regular expressions:
1. Syntax
- Glob patterns use a simplified syntax with a limited set of wildcards, such as “*”, “?”, and “[]”, to represent patterns and match filenames or paths. They are typically used for simple pattern matching.
- Regular expressions have a more complex syntax with a wide range of metacharacters, quantifiers, and anchors, allowing for more advanced pattern matching and manipulation. They are used for complex pattern matching and text processing.
2. Matching Scope
- Glob patterns are primarily used for matching filenames or paths in filesystems. They are designed to work at a file or directory level, allowing for simple pattern matching based on file or directory names.
- Regular expressions, on the other hand, can match patterns within larger sets of text, not limited to just filenames. They can match patterns within strings, paragraphs, or even entire documents, providing more extensive text processing capabilities.
3. Wildcards vs. Metacharacters
- Glob patterns use specific wildcards like “*”, “?”, and “[]” to represent variable parts of filenames. These wildcards have predefined meanings and are limited in functionality.
- Regular expressions use metacharacters, such as “.”, “*”, “+”, and “()”. These metacharacters provide more powerful pattern matching capabilities, including character classes, quantifiers, capturing groups, and more.
4. Complexity
- Glob patterns are simpler and easier to understand, making them more accessible for basic pattern matching tasks. They have a more straightforward syntax and are suitable for simple matching operations.
- Regular expressions are more complex and require a deeper understanding of their syntax and functionality. They offer more advanced pattern matching and manipulation capabilities, making them more suitable for complex text processing tasks.
5. Tool and Language Support
- Glob patterns are widely supported in various programming languages and tools, often with dedicated functions or libraries for working with filenames and paths.
- Regular expressions are supported by nearly all programming languages and text-processing tools, with dedicated functions or libraries for working with regular expressions.
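The relationship between the two notations can be seen in Python, where fnmatch.translate converts a glob pattern into a regular expression. The sketch below uses sample names chosen for illustration; the exact text of the translated regex differs between Python versions, so it is compiled and used rather than printed as an expected value. It also shows a quantifier, something the basic wildcards (without the pattern-list forms) cannot express.

```python
import fnmatch
import re

glob_pattern = "Letter[0-9]"

# Convert the glob into an equivalent regular expression
translated = fnmatch.translate(glob_pattern)
compiled = re.compile(translated)

names = ["Letter0", "Letter10", "Letters"]

# Both forms select the same names
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, glob_pattern)]
regex_hits = [n for n in names if compiled.match(n)]

# A regex quantifier can express "one or more digits"
digits_regex = re.compile(r"Letter[0-9]+")
multi_digit_hits = [n for n in names if digits_regex.fullmatch(n)]

print(glob_hits)
print(regex_hits)
print(multi_digit_hits)
```

The glob and its translated regex agree on the sample names, while the hand-written regex additionally accepts names with several digits.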
Conclusion
Glob patterns provide a powerful and flexible mechanism for matching and manipulating filenames and paths. Their wildcards enable us to specify pattern-based searches rather than relying on exact name matches. By understanding the syntax and usage of glob patterns, we can leverage their capabilities in various programming languages and tools, making file operations more efficient and convenient.