Glob Patterns Examples and Syntax

Introduction

Glob patterns, also known as wildcard patterns or globbing patterns, match filenames and paths in a filesystem. They offer a flexible, compact way to select many files and operate on them at once. This article covers glob pattern syntax, with examples, and shows how the patterns are used in various programming languages and tools.

Glob Patterns Syntax

At its core, a glob pattern is a string pattern that uses special characters called wildcards to represent variable parts of filenames or paths. These wildcards allow us to match filenames based on specific patterns rather than exact names. The most commonly used wildcards are “*”, “?”, and “[]”.
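The three basic wildcards can be tried out without touching the filesystem. The following is a minimal sketch that uses Python's standard fnmatch module; the names in CASES are illustrative strings, not files from a real directory, and run_cases is a hypothetical helper introduced only for this example.

```python
from fnmatch import fnmatchcase

# Each entry: (pattern, name, expected result).
# The names are illustrative strings, not real files.
CASES = [
    ("Law*", "Lawyer", True),     # * matches "yer"
    ("Law*", "Law", True),        # * also matches the empty string
    ("Law*", "GrokLaw", False),   # nothing in the pattern can match "Grok"
    ("?at", "Cat", True),         # ? matches exactly one character
    ("?at", "at", False),         # ? cannot match nothing
    ("[CB]at", "Bat", True),      # [] matches one character from the set
    ("[CB]at", "CBat", False),    # ...but only one character
]


def run_cases(cases=CASES):
    """Return the actual fnmatchcase result for every (pattern, name) pair."""
    return [fnmatchcase(name, pattern) for pattern, name, _ in cases]


if __name__ == "__main__":
    for (pattern, name, expected), actual in zip(CASES, run_cases()):
        print(f"{pattern!r:10} vs {name!r:10} -> {actual} (expected {expected})")
```

fnmatchcase compares plain strings case-sensitively and does not treat the path separator specially, so it is suited to testing a single name rather than a whole path.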

Note: normally, the path separator character (/ on Linux, Unix, macOS, and similar systems, or \ on Windows) is never matched by a wildcard. Some shells, such as Bash, provide ways around this, for example the ** pattern enabled by the globstar option.
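To see the separator rule in action, here is a small sketch using Python's glob module on a throwaway directory. The layout (src/a.js and src/b/a.js) and the function name demo_separator_matching are assumptions made for the example. In Python, crossing directory levels requires both the ** pattern and the recursive=True argument.

```python
import glob
import os
import tempfile


def demo_separator_matching():
    """Build a throwaway tree and compare '*' with recursive '**'."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: src/a.js and src/b/a.js
        os.makedirs(os.path.join(root, "src", "b"))
        for rel in ("src/a.js", "src/b/a.js"):
            with open(os.path.join(root, *rel.split("/")), "w"):
                pass

        def relative(paths):
            # Normalise to forward slashes so results read the same on every OS.
            return sorted(os.path.relpath(p, root).replace(os.sep, "/") for p in paths)

        # Escape the temp dir name in case it contains wildcard characters.
        base = glob.escape(root)

        # '*' stays inside one directory level.
        single = relative(glob.glob(os.path.join(base, "src", "*.js")))
        # '**' spans directory levels, but only when recursive=True is passed.
        deep = relative(
            glob.glob(os.path.join(base, "src", "**", "*.js"), recursive=True)
        )
        return single, deep


if __name__ == "__main__":
    single, deep = demo_separator_matching()
    print("src/*.js    ->", single)
    print("src/**/*.js ->", deep)
```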

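Each entry below lists a wildcard, its description, an example pattern, names the example matches, and names it does not match.

*
  • Description: matches any number of any characters, including none
  • Example: Law*
  • Matches: Law, Laws, Lawyer
  • Does not match: GrokLaw, La, aw
  • Example: *Law*
  • Matches: Law, GrokLaw, Lawyer
  • Does not match: La, aw

**
  • Description: matches zero or more directories and their subdirectories. Directories that are hidden or whose names start with “.” or “..” are excluded
  • Example: src/**
  • Matches: src/a.js, src/b/a.js, src/b/

\
  • Description: an escape character; the character that follows it is matched literally

?
  • Description: matches any single character
  • Example: ?at
  • Matches: Cat, cat, Bat, bat
  • Does not match: at

^
  • Description: in regular-expression-style matchers, matches the beginning of the input string; standard shell glob syntax has no anchors, because a pattern always has to match the whole name. When used as the first character within square brackets, it indicates that the characters in the set are not accepted
  • Example (as a start-of-string anchor): ^a
  • Matches: a, ab
  • Does not match: b

[abc]
  • Description: matches one character given in the brackets
  • Example: [CB]at
  • Matches: Cat, Bat
  • Does not match: cat, bat, CBat

[a-z]
  • Description: matches one character from the (locale-dependent) range given in the brackets
  • Example: Letter[0-9]
  • Matches: Letter0, Letter1, Letter2
  • Does not match: Letters, Letter, Letter10

[^a-z] or [!a-z]
  • Description: matches one character that is not in the set or range given in the brackets. On the Bash command line, if ! triggers history expansion, escape it by writing [!abc] as [\!abc]
  • Example: [!A-Z]at
  • Matches: aat, bat, zat
  • Does not match: Aat, Bat, Zat

{x, y, ...}
  • Description: commas within curly brackets separate alternative patterns
  • Example: a.{png,jp{,e}g}
  • Matches: a.png, a.jpg, a.jpeg

()
  • Description: parentheses must follow one of the characters ?, *, +, @, or ! (extended globbing, which Bash enables with the extglob option). They enclose a pattern-list: a set of patterns separated by the “|” symbol, such as abc|a?c|ac*

?(pattern-list)
  • Description: matches the given patterns zero or one time
  • Example: a.?(txt|bin)
  • Matches: a., a.txt, a.bin
  • Does not match: a

*(pattern-list)
  • Description: matches the given patterns zero or more times
  • Example: a.*(txt|bin)
  • Matches: a., a.txt, a.bin, a.txtbin
  • Does not match: a

+(pattern-list)
  • Description: matches the given patterns one or more times
  • Example: a.+(txt|bin)
  • Matches: a.txt, a.bin, a.txtbin
  • Does not match: a, a.

@(pattern-list)
  • Description: matches exactly one of the given patterns
  • Example: a.@(txt|bin)
  • Matches: a.txt, a.bin
  • Does not match: a., a.txtbin

!(pattern-list)
  • Description: matches anything except the given patterns
  • Example: a.!(txt|bin)
  • Matches: a., a.txtbin
  • Does not match: a.txt, a.bin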

The pattern-list is a collection of patterns separated by the “|” symbol, such as abc|a?c|ac*
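Curly-bracket alternatives are expanded into separate patterns before any matching takes place. Python's glob and fnmatch modules do not expand braces themselves, so the sketch below shows one way the expansion could be done by hand. The function expand_braces is a hypothetical helper written for this article, not a library API, and it is simplified: unlike Bash, it also expands a single-item group such as {x}, and it does not handle escaped braces.

```python
from typing import List


def expand_braces(pattern: str) -> List[str]:
    """Expand {a,b} alternatives, including nested groups (simplified sketch)."""
    start = pattern.find("{")
    if start == -1:
        return [pattern]  # nothing to expand

    depth = 0
    parts = []               # alternatives found at the outermost level
    piece_start = start + 1  # where the current alternative begins
    for i in range(start, len(pattern)):
        ch = pattern[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Closing brace of the outermost group: finish the last piece.
                parts.append(pattern[piece_start:i])
                prefix, suffix = pattern[:start], pattern[i + 1:]
                results = []
                for part in parts:
                    # Recurse to expand nested or later groups.
                    results.extend(expand_braces(prefix + part + suffix))
                return results
        elif ch == "," and depth == 1:
            # A comma splits alternatives only at the outermost level.
            parts.append(pattern[piece_start:i])
            piece_start = i + 1

    return [pattern]  # unbalanced brace: treat the pattern literally


if __name__ == "__main__":
    print(expand_braces("a.{png,jp{,e}g}"))
```

Each expanded string can then be passed to a matcher such as glob.glob or fnmatch.fnmatchcase.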

Glob Patterns in Practice

Glob patterns find extensive use in various programming languages, command-line interfaces, and tools. Many programming languages provide built-in functions or libraries that support glob pattern matching for file operations. Tools like the Unix command “find” or the Python library “glob” leverage glob patterns to search for files in a directory hierarchy.

For example, in Python, the “glob” module allows us to easily retrieve files matching a specific pattern:

import glob

# Get all text files in the current directory
text_files = glob.glob("*.txt")
print(text_files)

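Using Glob Patterns to Exclude Folders or Files

Glob patterns can also exclude paths. For example, to ignore everything in the public directory except the “worker” folder, add the following to .gitignore:

/public/*
!/public/worker/

The single * matters here: it ignores only the direct children of public, so re-including the worker folder also keeps its contents. A pattern such as /public/**/* would match the files inside worker as well, and they would stay ignored even though the folder itself is re-included.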

Differences between glob patterns and regular expressions

Glob patterns and regular expressions are both used for pattern matching, but they have distinct differences in syntax, functionality, and use cases. Here are some key differences between glob patterns and regular expressions:

1. Syntax

  • Glob patterns use a simplified syntax with a limited set of wildcards, such as “*”, “?”, and “[]”, to represent patterns and match filenames or paths. They are typically used for simple pattern matching.
  • Regular expressions have a more complex syntax with a wide range of metacharacters, quantifiers, and anchors, allowing for more advanced pattern matching and manipulation. They are used for complex pattern matching and text processing.

2. Matching Scope

  • Glob patterns are primarily used for matching filenames or paths in filesystems. They are designed to work at a file or directory level, allowing for simple pattern matching based on file or directory names.
  • Regular expressions, on the other hand, can match patterns within larger sets of text, not limited to just filenames. They can match patterns within strings, paragraphs, or even entire documents, providing more extensive text processing capabilities.

3. Wildcards vs. Metacharacters

  • Glob patterns use specific wildcards like “*”, “?”, and “[]” to represent variable parts of filenames. These wildcards have predefined meanings and are limited in functionality.
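  • Regular expressions use metacharacters, such as “.”, “*”, “+”, and “()”. Some of these look like glob wildcards but behave differently: in a regular expression, “*” is a quantifier that repeats the preceding element, and “.” matches any single character, which is the role “?” plays in a glob. Metacharacters provide more powerful pattern matching capabilities, including character classes, quantifiers, capturing groups, and more.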
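The difference is easiest to see when the same text, Law*, is read once as a glob and once as a regular expression. The sketch below uses fnmatch.translate from Python's standard library to convert the glob into an equivalent regular expression; glob_to_regex is a hypothetical wrapper name chosen for this example.

```python
import fnmatch
import re


def glob_to_regex(pattern):
    """Compile the regular expression that fnmatch derives from a glob pattern."""
    return re.compile(fnmatch.translate(pattern))


# Glob reading: '*' stands on its own and means "any run of characters".
glob_rx = glob_to_regex("Law*")

# Regex reading: '*' is a quantifier on the previous token, so 'Law*'
# means "La" followed by zero or more "w".
plain_rx = re.compile(r"Law*")

if __name__ == "__main__":
    for name in ("Lawyer", "La", "Lawww"):
        as_glob = bool(glob_rx.match(name))
        as_regex = bool(plain_rx.fullmatch(name))
        print(f"{name:8} glob={as_glob} regex={as_regex}")
```

The exact string returned by fnmatch.translate varies between Python versions, so the sketch compiles it rather than printing or comparing it.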

4. Complexity

  • Glob patterns are simpler and easier to understand, making them more accessible for basic pattern matching tasks. They have a more straightforward syntax and are suitable for simple matching operations.
  • Regular expressions are more complex and require a deeper understanding of their syntax and functionality. They offer more advanced pattern matching and manipulation capabilities, making them more suitable for complex text processing tasks.

5. Tool and Language Support

  • Glob patterns are widely supported in various programming languages and tools, often with dedicated functions or libraries for working with filenames and paths.
  • Regular expressions are supported by nearly all programming languages and text-processing tools, with dedicated functions or libraries for working with regular expressions.

Conclusion

Glob patterns provide a powerful and flexible mechanism for matching and manipulating filenames and paths. Their wildcards enable us to specify pattern-based searches rather than relying on exact name matches. By understanding the syntax and usage of glob patterns, we can leverage their capabilities in various programming languages and tools, making file operations more efficient and convenient.
