Can wget preserve perms or is there a umask on my face?

am 18.01.2008 21:27:04 von paintedjazz

I just downloaded wget to obtain a mirror of a website for which I'm
volunteering to create additional content. After installing wget, I
used the following command to download the entire site:

wget --mirror http://www.mysite.com

The download succeeded perfectly as far as I can tell at this
point ... except for at least one little detail. The download did not
preserve the permissions for files and folders. I looked at the man
page and searched for permissions, umask and preserve but no mention
is made of any of the three terms. I presume that the perms are being
set by my own default umask. Is there a way to get the proper perms?
Many thanks.

Re: Can wget preserve perms or is there a umask on my face?

am 18.01.2008 22:06:15 von Bill Marcum

On 2008-01-18, paintedjazz@gmail.com wrote:
>
>
> I just downloaded wget to obtain a mirror of a website for which I'm
> volunteering to create additional content. After installing wget, I
> used the following command to download the entire site:
>
> wget --mirror http://www.mysite.com
>
> The download succeeded perfectly as far as I can tell at this
> point ... except for at least one little detail. The download did not
> preserve the permissions for files and folders. I looked at the man
> page and searched for permissions, umask and preserve but no mention
> is made of any of the three terms. I presume that the perms are being
> set by my own default umask. Is there a way to get the proper perms?
> Many thanks.
>
>
To preserve permissions you might have to use tar or rsync.
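For example, assuming you have an ssh login on the web server and that
the document root is /var/www/mysite (both placeholders here), rsync's
archive mode copies the permission bits along with the files:

    # -a keeps permissions, times and symlinks (owners too, if run as root)
    rsync -av -e ssh you@www.mysite.com:/var/www/mysite/ ./mysite-mirror/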

Re: Can wget preserve perms or is there a umask on my face?

am 18.01.2008 22:09:33 von PK

paintedjazz@gmail.com wrote:

> I just downloaded wget to obtain a mirror of a website for which I'm
> volunteering to create additional content. After installing wget, I
> used the following command to download the entire site:
>
> wget --mirror http://www.mysite.com
>
> The download succeeded perfectly as far as I can tell at this
> point ... except for at least one little detail. The download did not
> preserve the permissions for files and folders. I looked at the man
> page and searched for permissions, umask and preserve but no mention
> is made of any of the three terms. I presume that the perms are being
> set by my own default umask. Is there a way to get the proper perms?

HTTP does not preserve permissions. It just transfers files, with no added
attributes (or at least, not that kind of attribute). How can you know
what the "proper perms" are in the first place?
In theory, the site you're downloading from might even be hosted on a file
system that does not support permissions (e.g., FAT32). What would HTTP be
expected to do in that case?
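
You can check what the server actually sends by asking wget to print the
response headers (the URL is just the placeholder from the original post):

    wget -S --spider http://www.mysite.com/

Typically you'll see things like Date, Server, Content-Type,
Content-Length and Last-Modified, but nothing describing ownership or
mode bits, so the downloaded files simply end up with whatever your
umask produces.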

Re: Can wget preserve perms or is there a umask on my face?

am 19.01.2008 08:18:13 von Kaz Kylheku

On Jan 18, 12:27 pm, paintedj...@gmail.com wrote:
> I just downloaded wget to obtain a mirror of a website for which I'm
> volunteering to create additional content.  After installing wget, I
> used the following command to download the entire site:
>
>    wget --mirror http://www.mysite.com
>
> The download succeeded perfectly as far as I can tell at this
> point ... except for at least one little detail.  The download did not
> preserve the permissions for files and folders.

There is no HTTP header, standard or de facto, for representing
permissions. A Unix-based HTTP server could easily be hacked to push out
some non-standard header line in its response, like:

POSIX-Perms: rwxr-xr-x
POSIX-Owner: bob
POSIX-Group: users

Propose it; it might catch on. :)
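
As a rough sketch, an entirely hypothetical CGI wrapper on the server
could emit those made-up headers by stat'ing the file it serves (GNU
stat syntax; the header names are not standard):

    #!/bin/sh
    # serve the requested file and tack on the made-up POSIX-* headers
    file="$DOCUMENT_ROOT$PATH_INFO"
    printf 'Content-Type: application/octet-stream\r\n'
    printf 'POSIX-Perms: %s\r\n' "$(stat -c %A "$file" | cut -c2-)"
    printf 'POSIX-Owner: %s\r\n' "$(stat -c %U "$file")"
    printf 'POSIX-Group: %s\r\n' "$(stat -c %G "$file")"
    printf '\r\n'
    cat "$file"

A patched wget would then have to read those headers back and chmod/chown
its local copies, which is the other half of the proposal.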

> I looked at the man page and searched for permissions, umask and
> preserve but no mention is made of any of the three terms.  I presume
> that the perms are being set by my own default umask.  Is there a way
> to get the proper perms?

You need to get a copy of the website in the proper manner. This can't
be done over HTTP.

What about dynamic content? Some pages are actually scripts. When you
access the script over HTTP, the server runs the code and you get the
output of that code, rather than the code itself. That's useless if
you need to work on the original page.

Simply put, sucking down a website with wget is the wrong way to obtain
the development materials that constitute it.
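
If the developers give you shell access, the usual route is to copy the
document root directly, for example with tar over ssh (host and path are
placeholders):

    ssh you@www.mysite.com 'tar czf - -C /var/www/mysite .' | tar xpzf -

That hands you the scripts themselves rather than their output, and -p on
the extracting side keeps the recorded permission bits.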