Windows paths in glob

on 30.03.2008 21:09:18 by Dmitry

OK, so there's a well-known difficulty with handling Windows-style paths in glob: it doesn't
like backslashes, nor does it like spaces. One solution to that is to use Unix-style paths:

glob('C:\Documents and Settings\*'); # Doesn't work
glob('C:/Documents\ and\ Settings/*'); # Works

Problem is, the rest of Perl's built-in file-handling functionality behaves the other way around.
For instance, with -d:

-d 'C:\Documents and Settings'; # Works
-d 'C:/Documents\ and\ Settings'; # Doesn't work

Question: is there any way to use the same path string with glob and with the rest of Perl,
without having to convert them back and forth?

Re: Windows paths in glob

on 30.03.2008 21:16:24 by someone

Dmitry wrote:
> OK, so there's a well-known difficulty with handling Windows-style paths in glob: it doesn't
> like backslashes, nor does it like spaces. One solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality behaves the other way around.
> For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and with the rest of Perl,
> without having to convert them back and forth?

perldoc File::DosGlob
perldoc File::Spec
perldoc File::Basename


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall

Re: Windows paths in glob

on 30.03.2008 21:27:16 by Martijn Lievaart

On Sun, 30 Mar 2008 19:09:18 +0000, Dmitry wrote:

> OK, so there's a well-known difficulty with handling Windows-style paths
> in glob: it doesn't like backslashes, nor does it like spaces. One
> solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality
> behaves the other way around. For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and
> with the rest of Perl, without having to convert them back and forth?

I don't have Windows to test here, but I recall that using either a
forward slash '/' or a backward slash -- properly escaped -- '\\' works
either way in both situations.

In the examples you gave, the versions with backslashes cannot work, the
backslashes are not escaped.

M4

Re: Windows paths in glob

on 30.03.2008 22:05:49 by Gunnar Hjalmarsson

Dmitry wrote:
> OK, so there's a well-known difficulty with handling Windows-style paths in glob: it doesn't
> like backslashes, nor does it like spaces. One solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality behaves the other way around.
> For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and with the rest of Perl,
> without having to convert them back and forth?

A long time ago I decided to use opendir() and readdir() instead of
glob(). It may not be as 'elegant', but it works flawlessly without
escaping spaces.
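
A minimal sketch of the opendir()/readdir() approach, for illustration only. The helper name list_dir is hypothetical and not from this thread; the path is the one from the original question. The point is that the same plain string is handed to both -d and opendir(), with no escaping of spaces.

```perl
use strict;
use warnings;

# Return the entries of $dir (without "." and ".."), each prefixed with
# the directory so the results can be passed straight to -d, -f, open(), etc.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return map { "$dir/$_" } sort @entries;
}

# Path taken from the question; spaces need no special treatment here.
my $dir = 'C:/Documents and Settings';
if (-d $dir) {
    print "$_\n" for list_dir($dir);
}
```

Nothing in the directory name is interpreted as a pattern, so spaces, asterisks, and brackets in names are all harmless.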

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Windows paths in glob

on 30.03.2008 23:44:11 by hjp-usenet2

On 2008-03-30 19:27, Martijn Lievaart wrote:
> On Sun, 30 Mar 2008 19:09:18 +0000, Dmitry wrote:
>> OK, so there's a well-known difficulty with handling Windows-style paths
>> in glob: it doesn't like backslashes, nor does it like spaces. One
>> solution to that is to use Unix-style paths:
>>
>> glob('C:\Documents and Settings\*'); # Doesn't work
>> glob('C:/Documents\ and\ Settings/*'); # Works

I didn't expect that but on second thought it makes sense.


>> Problem is, the rest of Perl's built-in file-handling functionality
>> behaves the other way around. For instance, with -d:
>>
>> -d 'C:\Documents and Settings'; # Works
>> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>>
>> Question: is there any way to use the same path string with glob and
>> with the rest of Perl, without having to convert them back and forth?
>
> I don't have Windows to test here, but I recall that using either a
> forward slash '/' or a backward slash -- properly escaped -- '\\' works
> either way in both situations.

You misunderstood the problem. The problem is that glob patterns, like
regexps, are mini-languages where some characters (or sequences of
characters) have a special meaning. Just as you cannot just use any
string as a regexp and expect it to match itself (or even be a
well-formed regexp) you cannot use any filename as a glob pattern and
expect it to expand to itself. Actually, for globs the situation is
worse: While any string can be converted to a regexp matching that
string, this is not true for globs. Spaces can be escaped with a
backslash, but I didn't find any way to escape an asterisk or question
mark.

So I guess Gunnar's advice is the best: If you need to deal with
arbitrary file and directory names, avoid glob and use opendir/readdir.
Or maybe File::Find or a similar module (which uses opendir/readdir
internally).
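
For the narrower case where spaces are the only problem, one option not mentioned so far in this thread is to call bsd_glob() from the core File::Glob module directly. Unlike the built-in glob(), it treats its whole argument as a single pattern and does not split on whitespace. The sketch below assumes forward slashes and a directory name without glob metacharacters; glob_children is a made-up helper name.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# List everything directly under $dir. $dir becomes a literal prefix of the
# pattern, so it must not itself contain glob metacharacters (*, ?, [, {, ~).
sub glob_children {
    my ($dir) = @_;
    return bsd_glob("$dir/*");
}

# The same unescaped string is usable with -d and with bsd_glob().
my $dir = 'C:/Documents and Settings';
if (-d $dir) {
    print "$_\n" for glob_children($dir);
}
```

This does not remove the limitation described above: metacharacters in the directory name are still interpreted as pattern syntax, so opendir/readdir remains the safer choice for arbitrary names.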

hp

Re: Windows paths in glob

on 31.03.2008 00:48:55 by Ben Morrow

Quoth Gunnar Hjalmarsson:
>
> A long time ago I decided to use opendir() and readdir() instead of
> glob(). It may not be as 'elegant', but it works flawlessly without
> escaping spaces.

To save Uri the trouble of pointing it out :), File::Slurp now has a
read_dir function.

Ben

Re: Windows paths in glob

on 31.03.2008 01:20:48 by Uri Guttman

>>>>> "BM" == Ben Morrow writes:

BM> Quoth Gunnar Hjalmarsson:
>>
>> A long time ago I decided to use opendir() and readdir() instead of
>> glob(). It may not be as 'elegant', but it works flawlessly without
>> escaping spaces.

BM> To save Uri the trouble of pointing it out :), File::Slurp now has a
BM> read_dir function.

it has always had a read_dir function! its advantages are a simpler API
(no need for a handle, opendir, closedir calls) and it filters out . and
.. for you. a minor disadvantage (and very minor IMO) is that it can't
iterate in scalar mode to give you one dir entry at a time. that would
only matter if your dir was enormous and i mean very big.

future plans include passing in a regex or code ref to filter for
you. yeah, you can use grep on the output but it is slightly shorter
that way.

uri

Re: Windows paths in glob

on 31.03.2008 07:54:54 by Dmitry

"John W. Krahn" wrote in
news:cGRHj.9264$9X3.7583@edtnps82:

> Dmitry wrote:
>> OK, so there's a well-known difficulty with handling Windows-style
>> paths in glob: it doesn't like backslashes, nor does it like spaces.
>> One solution to that is to use Unix-style paths:
>>
>> glob('C:\Documents and Settings\*'); # Doesn't work
>> glob('C:/Documents\ and\ Settings/*'); # Works
>>
>> Problem is, the rest of Perl's built-in file-handling functionality
>> behaves the other way around. For instance, with -d:
>>
>> -d 'C:\Documents and Settings'; # Works
>> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>>
>> Question: is there any way to use the same path string with glob and
>> with the rest of Perl, without having to convert them back and forth?
>
> perldoc File::DosGlob
> perldoc File::Spec
> perldoc File::Basename

I tried DosGlob, but when I passed it 'C:\Documents and Settings\*' it bugged out with an
error somewhere in the module...

Re: Windows paths in glob

on 31.03.2008 07:58:49 by Dmitry

Martijn Lievaart wrote in news:pan.2008.03.30.19.27.16@rtij.nl.invlalid:

> On Sun, 30 Mar 2008 19:09:18 +0000, Dmitry wrote:
>
>> OK, so there's a well-known difficulty with handling Windows-style paths
>> in glob: it doesn't like backslashes, nor does it like spaces. One
>> solution to that is to use Unix-style paths:
>>
>> glob('C:\Documents and Settings\*'); # Doesn't work
>> glob('C:/Documents\ and\ Settings/*'); # Works
>>
>> Problem is, the rest of Perl's built-in file-handling functionality
>> behaves the other way around. For instance, with -d:
>>
>> -d 'C:\Documents and Settings'; # Works
>> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>>
>> Question: is there any way to use the same path string with glob and
>> with the rest of Perl, without having to convert them back and forth?
>
> I don't have Windows to test here, but I recall that using either a
> forward slash '/' or a backward slash -- properly escaped -- '\\' works
> either way in both situations.
>
> In the examples you gave, the versions with backslashes cannot work, the
> backslashes are not escaped.
>
> M4

Spaces are a more serious problem than slashes. But anyway, the examples work,
because I used single quotes. BTW, current core glob seems to ignore backslashes
altogether, unless they escape something other than a backslash.
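
To illustrate the point about single quotes: inside a single-quoted Perl string a backslash is special only before another backslash or a single quote, so the path in the original examples reaches glob and -d with its backslashes intact. A small check (the variable names are just for this sketch):

```perl
use strict;
use warnings;

# Single quotes: \D is not an escape sequence, so the backslash is kept.
my $single = 'C:\Documents and Settings';

# Double quotes: the backslash has to be doubled to get one literal backslash.
my $double = "C:\\Documents and Settings";

print $single eq $double ? "same string\n" : "different strings\n";
```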

Re: Windows paths in glob

on 31.03.2008 08:09:33 by Dmitry

Gunnar Hjalmarsson wrote in
news:65aa8vF2euim1U1@mid.individual.net:

> Dmitry wrote:
>> OK, so there's a well-known difficulty with handling Windows-style
>> paths in glob: it doesn't like backslashes, nor does it like spaces.
>> One solution to that is to use Unix-style paths:
>>
>> glob('C:\Documents and Settings\*'); # Doesn't work
>> glob('C:/Documents\ and\ Settings/*'); # Works
>>
>> Problem is, the rest of Perl's built-in file-handling functionality
>> behaves the other way around. For instance, with -d:
>>
>> -d 'C:\Documents and Settings'; # Works
>> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>>
>> Question: is there any way to use the same path string with glob and
>> with the rest of Perl, without having to convert them back and forth?
>
> A long time ago I decided to use opendir() and readdir() instead of
> glob(). It may not be as 'elegant', but it works flawlessly without
> escaping spaces.
>

OK, thanks. I guess if I wanted to process wildcards in the file name, I would pass them
through grep?

Re: Windows paths in glob

on 31.03.2008 09:16:27 by szr

Dmitry wrote:
> OK, so there's a well-known difficulty with handling Windows-style
> paths in glob: it doesn't like backslashes, nor does it like spaces.
> One solution to that is to use Unix-style paths:
>
> glob('C:\Documents and Settings\*'); # Doesn't work
> glob('C:/Documents\ and\ Settings/*'); # Works
>
> Problem is, the rest of Perl's built-in file-handling functionality
> behaves the other way around. For instance, with -d:
>
> -d 'C:\Documents and Settings'; # Works
> -d 'C:/Documents\ and\ Settings'; # Doesn't work
>
> Question: is there any way to use the same path string with glob and
> with the rest of Perl, without having to convert them back and forth?

I find, just as in general under Win32, putting double quotes around the
path gets around problems like this:

C:\>perl -e "my @d = glob('"""C:/Documents and Settings"""/*'); print qq{\n}, join(qq{\n}, @d), qq{\n};"

C:/Documents and Settings/Administrator
C:/Documents and Settings/All Users
[...]

*** Note that """, when used in a double quoted string, under the
cmd.exe shell yields a literal ", so the glob statement is effectively:

glob('"C:/Documents and Settings"/*');

*** This is only because the command was run from the command line; in
an actual script you would of course use a normal double quote around
the path (just like in the linux examples below.)


And this works for tests like -d as well:

C:\>perl -e "print int (-d """C:/Documents and Settings""")"
1
C:\>perl -e "print int (-d """C:/123Documents and Settings""")"
0


And this form works under linux as well:

$ perl -e 'my @d = glob(q{"/mnt/samba/win_hd/Documents and Settings"/*}); print qq{\n}, join(qq{\n}, @d), qq{\n};'

/mnt/samba/win_hd/Documents and Settings/Administrator
/mnt/samba/win_hd/Documents and Settings/All Users

$ perl -e 'print int (-d "/mnt/samba/win_hd/Documents and Settings")'
1
$ perl -e 'print int (-d "/mnt/samba/win_hd/123Documents and Settings")'
0

This was tested under ActivePerl 5.6.1 and 5.8.7, and under linux using
5.10.0, 5.8.8, and 5.6.1.


So if you want to do it in a way that works on most platforms (at the
very least windows and *nix),

1) Use a forward slash, not a backslash, as a path delimiter,
e.g. C:/path to/somewhere/file.ext, and

2) Surround the path with quotes,
e.g. "C:/path to/somewhere/a long filename.ext", or
"C:/path to/somewhere"/file.ext, or
"C:/Documents and Settings/"

and you should be fine.
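
In a script, the two rules above might be wrapped up roughly as follows. This is only a sketch: glob_in_dir is a hypothetical helper, and it assumes the directory name contains no double quotes, backslashes, or glob metacharacters. The directory is stored unquoted, so the same string still works with -d; the double quotes are added only when building the glob pattern.

```perl
use strict;
use warnings;

# Build a pattern for the core glob() by wrapping the directory part in
# double quotes and leaving the wildcard outside them.
# Assumes $dir contains no double quotes, backslashes, or glob metacharacters.
sub glob_in_dir {
    my ($dir, $wildcard) = @_;
    my @matches = glob(qq{"$dir"/$wildcard});
    return @matches;
}

# Unquoted form of the path: usable directly with -d.
my $dir = 'C:/Documents and Settings';
if (-d $dir) {
    print "$_\n" for glob_in_dir($dir, '*');
}
```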

Hope this helps.

--
szr

Re: Windows paths in glob

on 31.03.2008 09:52:56 by Joe Smith

Dmitry wrote:

> OK, thanks. I guess if I wanted to process wildcards in the file name, I would pass them
> through grep?

Yes, after converting wildcard characters into regex characters, of course.
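
A rough sketch of that conversion, under stated assumptions: only "*" and "?" are treated as wildcards, every other character is matched literally, and matching is case-insensitive because Windows file names are. The function names wildcard_to_regex and match_in_dir are invented for this example.

```perl
use strict;
use warnings;

# Translate a simple wildcard ("*" = any run of characters, "?" = any single
# character) into an anchored regex; all other characters are quoted.
# The /i flag is an assumption that suits case-insensitive Windows names.
sub wildcard_to_regex {
    my ($wildcard) = @_;
    my $re = join '', map {
          $_ eq '*' ? '.*'
        : $_ eq '?' ? '.'
        :             quotemeta($_)
    } split //, $wildcard;
    return qr/\A$re\z/is;
}

# Return the names in $dir that match $wildcard, using readdir plus grep.
sub match_in_dir {
    my ($dir, $wildcard) = @_;
    my $re = wildcard_to_regex($wildcard);
    opendir(my $dh, $dir) or die "Cannot open directory '$dir': $!";
    my @names = sort grep { $_ ne '.' && $_ ne '..' && $_ =~ $re } readdir($dh);
    closedir($dh);
    return @names;
}

my $dir = 'C:/Documents and Settings';
if (-d $dir) {
    print "$_\n" for match_in_dir($dir, 'A*');
}
```

Since the directory itself is never part of a pattern, spaces and backslashes in it need no escaping; only the file-name wildcard is translated.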