voussoirkit/voussoirkit/winglob.py
Ethan Dalool b288cca519
Rewrite a lot of pathclass, spinal.walk using tuple-based Path.
I was inspired by the idea of "making impossible states impossible"
and using a data model that accurately represents what we intend for
it to represent. Instead of storing the path as a string where "it's
a string but actually you're supposed to know that the parts between
os.seps are different parts and the first one is special and...", we
can use a data model that directly says that. Storing the path as a
tuple of (Drive, Part, Part) helps me focus on the semantics of the
Path as a collection of parts joined by the os.sep.

Furthermore, storing the path as a string made some operations slow.
Every time we call one of the os.path functions with a string, it
has to do a lot of normalization and edge-case handling even when we
know it wouldn't be needed. By storing the path as a tuple, we can
instantly get the drive name, parent dir name, and basename without
asking os.path to split it for us every single time. It also makes
relative path / common ancestor checks a lot easier to understand.
Fewer operations need to go into the slow functions.
2021-11-30 21:16:47 -08:00


'''
On Windows, square brackets do not have a special meaning in glob strings.
However, Python's glob module is written for Unix-style globs, in which
brackets represent character classes / ranges.
On Windows we should escape those brackets to get results that are consistent
with a Windows user's expectations. But calling glob.escape would also escape
asterisks and question marks, which may not be desired. So this module just
provides modified versions of glob.glob and fnmatch which escape only square
brackets when called on Windows, and behave normally on Linux.
'''
import fnmatch as python_fnmatch
import glob as python_glob
import os
import re

# On Windows, brackets are ordinary filename characters, so only the asterisk
# and question mark count as glob symbols there.
if os.name == 'nt':
    GLOB_SYMBOLS = {'*', '?'}
else:
    GLOB_SYMBOLS = {'*', '?', '['}

def fix(pattern):
    if os.name == 'nt':
        # Wrap each bracket in its own character class, [ -> [[] and ] -> []],
        # so that the glob module matches it literally.
        pattern = re.sub(r'(\[|\])', r'[\1]', pattern)
    return pattern

def fnmatch(name, pat):
    return python_fnmatch.fnmatch(name, fix(pat))

def fnmatch_filter(names, pat):
    return python_fnmatch.filter(names, fix(pat))

def glob(pathname, *, recursive=False):
    return python_glob.glob(fix(pathname), recursive=recursive)

def glob_many(patterns, *, recursive=False):
    '''
    Given many glob patterns, yield the results as a single generator.
    Saves you from having to write the nested loop.
    '''
    for pattern in patterns:
        yield from glob(pattern, recursive=recursive)

def is_glob(pattern):
    '''
    Improvements can be made to validate [] ranges for unix, but properly
    parsing the range syntax is not something I'm interested in doing right now
    and it would become the largest function in the whole module.
    '''
    return len(set(pattern).intersection(GLOB_SYMBOLS)) > 0
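
The bracket escaping performed by fix() can be tried out on any platform with a small standalone sketch. The helper name escape_brackets is hypothetical and not part of this module; it applies the same substitution unconditionally, and fnmatch.fnmatchcase is used so the comparison does not depend on the operating system's case rules.

```python
import fnmatch
import re

def escape_brackets(pattern):
    # Hypothetical helper: the same substitution fix() applies on Windows,
    # but applied unconditionally so it can be tested anywhere.
    return re.sub(r'(\[|\])', r'[\1]', pattern)

raw = 'photo [1].*'
escaped = escape_brackets(raw)

# Unescaped, [1] is a character class that matches the single character "1".
matches_plain_name = fnmatch.fnmatchcase('photo 1.jpg', raw)
matches_bracket_name_raw = fnmatch.fnmatchcase('photo [1].jpg', raw)

# Escaped, each bracket is matched literally, while the asterisk stays a
# wildcard.
matches_bracket_name_escaped = fnmatch.fnmatchcase('photo [1].jpg', escaped)
```

With the unescaped pattern, a file literally named "photo [1].jpg" is not matched, which is the surprise this module is meant to remove for Windows users.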