Description
VS Code implements its own solution for glob pattern matching and does not rely on a third-party module. The reasons are mostly historical, but our glob patterns also behave slightly differently from others (e.g. **/foo.txt matches foo.txt, and both backslash and slash are supported).
Our glob matching works in 2 stages:
- if it matches a typical pattern (such as **/*.js), we optimize the matching to avoid a regex use (from here)
- otherwise we try to convert the pattern into a single regex to match (from here)
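The two stages can be sketched roughly as follows. This is a minimal illustration, not the code in glob.ts: the names `compile`, `EXTENSION_PATTERN`, and the `toRegExp` callback are hypothetical, and only a single fast path (the `**/*.ext` shape) is shown.

```typescript
// Hypothetical two-stage matcher; names and structure are assumptions,
// not the actual implementation in glob.ts.
type Matcher = (path: string) => boolean;

// Stage 1 recognises the common shape **/*.ext (for example **/*.js).
const EXTENSION_PATTERN = /^\*\*\/\*(\.[\w.-]+)$/;

function compile(pattern: string, toRegExp: (p: string) => RegExp): Matcher {
  const fast = EXTENSION_PATTERN.exec(pattern);
  if (fast) {
    // Fast path: a plain suffix check, no regex needed at match time.
    const ext = fast[1];
    return (path) => path.endsWith(ext);
  }
  // Stage 2: fall back to a single regex built from the whole pattern.
  const regex = toRegExp(pattern);
  return (path) => regex.test(path);
}
```

The caller supplies `toRegExp`, so the sketch stays independent of how the general conversion is done.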
Specifically, the conversion of ** to a regular expression has a flaw that causes the issue at hand here. The regex is:
vscode/src/vs/base/common/glob.ts
Lines 57 to 60 in c7fcb30
// Matches: (Path Sep OR Path Val followed by Path Sep OR Path Sep followed by Path Val) 0-many times
// Group is non capturing because we don't need to capture at all (?:...)
// Overall we use non-greedy matching because it could be that we match too much
return `(?:${PATH_REGEX}|${NO_PATH_REGEX}+${PATH_REGEX}|${PATH_REGEX}${NO_PATH_REGEX}+)*?`;
For the pattern **/p*, the resulting regex is:
/^(?:[/\\]|[^/\\]+[/\\]|[/\\][^/\\]+)*?p[^/\\]*?$/
Unfortunately, given the string /foo/ap, the alternative [/\\][^/\\]+ can match just /a, stopping in the middle of the path segment ap. The remaining p then satisfies the rest of the pattern, so the overall result is a match, even though that is unexpected.
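The behaviour can be reproduced in isolation with the sketch below. The values of `PATH_REGEX` and `NO_PATH_REGEX` are inferred from the expanded regex quoted above, and `starsToRegExp` is a hypothetical wrapper around the quoted return statement. The `wholeSegments` variant is only an illustration of consuming complete path segments; it has not been checked against glob.test.ts and is not a proposed fix.

```typescript
// Character classes inferred from the expanded regex quoted in this issue.
const PATH_REGEX = '[/\\\\]';     // a path separator: / or \
const NO_PATH_REGEX = '[^/\\\\]'; // any character that is not a separator

// Hypothetical wrapper around the ** translation quoted from glob.ts.
function starsToRegExp(): string {
  return `(?:${PATH_REGEX}|${NO_PATH_REGEX}+${PATH_REGEX}|${PATH_REGEX}${NO_PATH_REGEX}+)*?`;
}

// **/p* becomes: the ** part, a literal p, then any non-separator characters.
const current = new RegExp(`^${starsToRegExp()}p${NO_PATH_REGEX}*?$`);

console.log(current.test('/foo/p'));  // true
console.log(current.test('/foo/ap')); // true, which is the unexpected match

// Illustration only (an assumption, untested against glob.test.ts):
// let the ** part consume whole segments, each ending in a separator.
const wholeSegments = new RegExp(`^(?:${NO_PATH_REGEX}*${PATH_REGEX})*p${NO_PATH_REGEX}*$`);

console.log(wholeSegments.test('/foo/p'));  // true
console.log(wholeSegments.test('/foo/ap')); // false
```

Because every repetition in `wholeSegments` must end at a separator, the ** part can no longer stop in the middle of ap.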
Help wanted
Opening up for others to contribute, the solution should:
- preserve our tests for globs to succeed (glob.test.ts)
- enable the now skipped test to succeed (code)
- somewhat preserve our strategy for computing a single regex for matching [1]
[1] However, if this is not possible, I am also open to ideas on how to solve this differently, provided the solution is not overly complex.