extracting from a pdf files txt file

on 08.01.2008 10:10:01 by franzi

Hi there, is there a method to extract the text from a PDF file using
sed and awk?
Thanks in advance

Re: extracting from a pdf files txt file

on 08.01.2008 11:59:15 by Bill Marcum

On 2008-01-08, franzi wrote:
>
>
> Hi there, is there a method to extract the text from a PDF file using
> sed and awk?
> Thanks in advance

Try pdftotext, but note that it is not a standard Unix command.
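
A minimal sketch of how that could look, assuming pdftotext is installed and that a file named report.pdf exists (both the file name and the clean_text helper are made up for illustration). pdftotext does the actual extraction; sed and awk only tidy up the resulting plain text:

```shell
#!/bin/sh
# Sketch: convert a PDF to text with pdftotext, then post-process with sed/awk.
# "report.pdf" and clean_text are hypothetical names used for illustration.

# Drop form feeds (page breaks), strip trailing whitespace with sed,
# then number the non-empty lines with awk.
clean_text() {
    tr -d '\f' | sed -e 's/[[:space:]]*$//' | awk 'NF { printf "%d: %s\n", ++n, $0 }'
}

# Only run the extraction if the tool and the input file are present.
if command -v pdftotext >/dev/null 2>&1 && [ -f report.pdf ]; then
    # "-" as the output file sends the extracted text to standard output.
    pdftotext -layout report.pdf - | clean_text
fi
```

The point of the split is that sed and awk never see the PDF itself, only the text stream that pdftotext produces.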

Re: extracting from a pdf files txt file

on 09.01.2008 23:24:02 by Edward Rosten

On Jan 8, 2:10 am, franzi wrote:
> Hi there, is there a method to extract the text from a PDF file using
> sed and awk?
> Thanks in advance


One poster suggested pdftotext. An alternative is ps2ascii. Since it's
based on Ghostscript (GS), it can interpret PDF files as well as
PostScript files, so it should work too.
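
A rough sketch of the ps2ascii route, assuming Ghostscript's ps2ascii script is on the PATH; the file name paper.pdf, the search word, and the find_word helper are hypothetical. As with pdftotext, awk works on the extracted text, not on the PDF:

```shell
#!/bin/sh
# Sketch: extract text with ps2ascii (Ghostscript) and filter it with awk.
# "paper.pdf", "Abstract" and find_word are hypothetical, for illustration only.

# Print every line that contains the given word, prefixed with its line number.
find_word() {
    awk -v word="$1" 'index($0, word) { printf "%d: %s\n", NR, $0 }'
}

# Only run the extraction if the tool and the input file are present.
if command -v ps2ascii >/dev/null 2>&1 && [ -f paper.pdf ]; then
    # With only an input file given, ps2ascii writes the text to standard output.
    ps2ascii paper.pdf | find_word "Abstract"
fi
```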

-Ed
--
(You can't go wrong with psycho-rats.)(http://mi.eng.cam.ac.uk/~er258)

/d{def}def/f{/Times s selectfont}d/s{11}d/r{roll}d f 2/m{moveto}d -1 r
230 350 m 0 1 179{1 index show 88 rotate 4 mul 0 rmoveto}for/s 12 d f
pop 235 420 translate 0 0 moveto 1 2 scale show showpage