url to ps/pdf

on 18.11.2007 13:42:07 by Ed Morton

I'm trying to write a script to convert URLs to either postscript or PDF (or
send them to a printer). I can use htmldoc (http://www.htmldoc.org/) on pages
written in HTML, but not on pages generated on-the-fly using PHP or Java, or on
files that are already PDF or text or.....

I'm just looking for something I can run on UNIX that, without popping up a GUI,
would perform the same magic as if I pointed a web browser at a link and clicked
"Print" with a PS or PDF printer driver. Anyone got that already or have any
suggestions?

Ed.

Re: url to ps/pdf

on 18.11.2007 15:11:10 by Steven J Masta

Ed Morton wrote:
> I'm trying to write a script to convert URLs to either postscript or PDF (or
> send them to a printer). I can use htmldoc (http://www.htmldoc.org/) on pages
> written in HTML, but not on pages generated on-the-fly using PHP or Java, or on
> files that are already PDF or text or.....
>
> I'm just looking for something I can run on UNIX that, without popping up a GUI,
> would perform the same magic as if I pointed a web browser at a link and clicked
> "Print" with a PS or PDF printer driver. Anyone got that already or have any
> suggestions?

I haven't tried it myself (yet), but there's a perl script html2ps
available from http://user.it.uu.se/~jan/html2ps.html that might work
for you.

Steve

Re: url to ps/pdf

on 18.11.2007 15:28:34 by Ed Morton

On 11/18/2007 8:11 AM, Steven J Masta wrote:
> Ed Morton wrote:
>
>>I'm trying to write a script to convert URLs to either postscript or PDF (or
>>send them to a printer). I can use htmldoc (http://www.htmldoc.org/) on pages
>>written in HTML, but not on pages generated on-the-fly using PHP or Java, or on
>>files that are already PDF or text or.....
>>
>>I'm just looking for something I can run on UNIX that, without popping up a GUI,
>>would perform the same magic as if I pointed a web browser at a link and clicked
>>"Print" with a PS or PDF printer driver. Anyone got that already or have any
>>suggestions?
>
>
> I haven't tried it myself (yet), but there's a perl script html2ps
> available from http://user.it.uu.se/~jan/html2ps.html that might work
> for you.
>
> Steve

I tried it previously and it seems to have all the same drawbacks as htmldoc,
but produces worse formatting in the success cases.

Thanks anyway,

Ed.

Re: url to ps/pdf

on 20.11.2007 08:29:41 by Michael DeBusk

On Sun, 18 Nov 2007 06:42:07 -0600, Ed Morton wrote:

> I'm just looking for something I can run on UNIX that, without
> popping up a GUI, would perform the same magic as if I pointed a web
> browser at a link and clicked "Print" with a PS or PDF printer
> driver. Anyone got that already or have any suggestions?

How are you getting the HTML pages now?

Sounds to me like if you used wget or curl to pull the page down to a
particular directory and had a script check that directory every so
often for HTML files, you could run your html-to-pdf program against any
HTML you might find in there.
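
A rough sketch of that polling idea, in Python. It assumes some other
job (wget, curl, cron) has already dropped the pages into the directory,
and that htmldoc is the converter, invoked as
"htmldoc --webpage -f out.pdf in.html". The function names and the
choice to write the PDF next to the input are made up for illustration.

```python
import subprocess
from pathlib import Path


def find_new_html(directory, seen):
    """Return HTML files in `directory` that are not yet in `seen`, sorted."""
    found = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in (".html", ".htm")
    )
    return [p for p in found if p not in seen]


def build_command(html_path):
    """Build the converter command line (assumed here to be htmldoc)."""
    pdf_path = html_path.with_suffix(".pdf")
    return ["htmldoc", "--webpage", "-f", str(pdf_path), str(html_path)]


def process_once(directory, seen, run=subprocess.run):
    """Convert each new HTML file once and remember it in `seen`.

    `run` is injectable so the loop can be exercised without htmldoc
    installed; by default it is subprocess.run.
    """
    new_files = find_new_html(directory, seen)
    for path in new_files:
        run(build_command(path), check=True)
        seen.add(path)
    return new_files
```

Calling process_once() from cron, or in a loop with a sleep between
passes, gives the "check that directory every so often" behaviour. It
only helps for content that really is HTML.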

--
Registered Linux User #450983 * Ubuntu Counter Project #10548
Home: http://nlphilia.com * Blog: http://nlphilia.net
The "mypacks.net" address from which this message was sent is
legitimate and not spam-trapped. It is, however, disposable.

Re: url to ps/pdf

on 20.11.2007 14:55:30 by Ed Morton

On 11/20/2007 1:29 AM, Michael DeBusk wrote:
> On Sun, 18 Nov 2007 06:42:07 -0600, Ed Morton wrote:
>
>
>> I'm just looking for something I can run on UNIX that, without
>> popping up a GUI, would perform the same magic as if I pointed a web
>> browser at a link and clicked "Print" with a PS or PDF printer
>> driver. Anyone got that already or have any suggestions?
>
>
> How are you getting the HTML pages now?

Using a GUI web browser.

> Sounds to me like if you used wget or curl to pull the page down to a
> particular directory and had a script check that directory every so
> often for HTML files, you could run your html-to-pdf program against any
> HTML you might find in there.
>

I don't need to do that as htmldoc can just take a url and produce the ps/pdf
file, but it's all the stuff that gets displayed on the web that ISN'T an HTML
file that's the problem. For example:

http://www.nissanusa.com/apps/brochure?intcmp=HTML_HP.Promo.Brochure.P2

Ed.
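
One possible way to frame the non-HTML problem is to look at the
Content-Type the server reports and pick a handling step from that,
rather than assuming every URL is HTML. The sketch below is only that:
the action names ("render-html", "use-as-is", "text-to-ps",
"unsupported") are hypothetical labels, not real tools, and it does
nothing for pages that need a browser to run scripts or plugins before
there is anything to print.

```python
from urllib.request import urlopen


def fetch(url):
    """Download `url`; return (Content-Type header value, body bytes)."""
    with urlopen(url) as response:
        return response.headers.get("Content-Type", ""), response.read()


def choose_action(content_type):
    """Map a Content-Type header value to a hypothetical handling step."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in ("text/html", "application/xhtml+xml"):
        return "render-html"   # e.g. hand the page to an HTML converter
    if media_type in ("application/pdf", "application/postscript"):
        return "use-as-is"     # already printable, just save or spool it
    if media_type.startswith("text/"):
        return "text-to-ps"    # plain text needs a text-to-PS step
    return "unsupported"
```

A wrapper script would call fetch(), pass the header to
choose_action(), and branch on the result. A page generated by PHP or a
servlet normally still arrives as text/html, so it would take the same
path as a static HTML file.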