Converting .gif to .txt

on 09.02.2006 07:42:17 by heisspf

Hi,

How can one convert a text document that has been scanned, and has therefore
become a .gif file, back to a .txt file so that one can copy and paste text
from it?

I was told it can be done with some software in Windows. Is there such
software in Linux?

I was able to convert the .gif file in question to .pdf and open it with
acroread. Acroread has an option to convert to text; however, when I try
it I get an empty file.

Thanks for any information.

Peter

-
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs

Re: Converting .gif to .txt

on 09.02.2006 09:08:44 by Andrew

heisspf@skyinet.net wrote:

>Hi,
>
>How can one convert a text document which has been scanned and therefore
>has become a
>.gif file back to a .txt file in order that one can copy and paste text
>from it.
>
>I was told it can be done with some software in windows. Is there such
>software in
>Linux?
>
>
I've done this sort of thing under Windows using Optical Character
Recognition software bundled with a scanner. There are OCR programs
available for Linux (I think there's something called clara - not sure
now), but they take rather a lot of work (or did when I looked at them a
year or so ago).

>I was able to convert a .gif file in question to .pdf and open it with
>acroread. Acroread has an option to convert to text, however, trying to do
>it I get an empty file.
>
>
If you can convert to PDF, it is possible to cut and paste from xpdf into
certain editors (OpenOffice, gedit; not AbiWord), but the results are
hardly perfect and will probably need quite a bit of editing.

>Thanks for any information.
>
>Peter
>
>
HTH

Andrew

Re: Converting .gif to .txt

on 09.02.2006 14:21:40 by James Miller

On Thu, 9 Feb 2006, Andrew wrote:
> heisspf@skyinet.net wrote:
>>
>> How can one convert a text document which has been scanned and therefore
>> has become a
>> .gif file back to a .txt file in order that one can copy and paste text
>> from it.
>>
>> I was told it can be done with some software in windows. Is there such
>> software in
>> Linux?
>>
> I've done this sort of thing under Windows using Optical Character
> Recognition software bundled with a scanner. There are OCR programs available
> for Linux (I think there's something called clara - not sure now), but they
> take rather a lot of work (or did when I looked at them a year or so ago).

I've never tried OCR with Linux myself. I recall reading an article
recently decrying the poor state of OCR under Linux. It described several
programs that do some semblance of OCR (I seem to recall the name Kooka,
and it being a KDE app, as one of them), but all were found to be not very
adequate, and some were very difficult to get working. The article was on
OCR functionality using scanners, so not totally in line with Peter's
query. I'd guess that, if you find some Linux program that can do OCR on
the .gif file you mention, it will not provide results anywhere near the
quality of the Windows program(s) you mention (and those give far from
perfect results in my experience) and that it will be difficult to find
and set up. Were I to need to do something like this, I'd probably check
with the local Kinkos to see if they had OCR programs (running on Win or
Mac) that could do this. Don't know if there is a Kinkos or some suitable
counterpart where you're at. Caveat: I'm not much of a geek, and could be
wrong. I would appreciate being corrected if so.

James

Re: Converting .gif to .txt

on 09.02.2006 15:08:41 by joy merwin monteiro

>
> I was able to convert a .gif file in question to .pdf and open it with
> acroread. Acroread has an option to convert to text, however, trying to do
> it I get an empty file.


Never used any OCR proggies myself, but the above won't work because
converting a GIF to PDF will merely embed the file as an image, not as text.
Write the same thing in OpenOffice and convert it to PDF, and you'll see the
difference in the PDF output.

Joy

>
> Thanks for any information.
>
> Peter
--
The only true wisdom is in knowing you know nothing.
--- Socrates

Re: Converting .gif to .txt

on 09.02.2006 15:14:07 by Yawar Amin

heisspf@skyinet.net wrote:
> Hi,
>
> How can one convert a text document which has been scanned and therefore
> has become a
> .gif file back to a .txt file in order that one can copy and paste text
> from it.

This is the job of OCR software, as mentioned.
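
For illustration only, here is a minimal Python sketch of driving a
command-line OCR engine. It assumes Tesseract (a program not mentioned
elsewhere in this thread) is installed and on the PATH; the function names
and file names are hypothetical. Whether Tesseract reads GIF directly
depends on how it was built, so converting the scan to PNG or TIFF first
may be necessary.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, out_base):
    """Build the argument list: tesseract <image> <output base name>."""
    return ["tesseract", str(image), str(out_base)]


def ocr_to_text(image, out_base):
    """Run Tesseract (assumed installed) and return the text file path."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(tesseract_command(image, out_base), check=True)
    # Tesseract itself appends ".txt" to the output base name.
    return Path(str(out_base) + ".txt")
```

Something like ocr_to_text("scan.png", "scan") would then leave the
recognised text in scan.txt. The result will most likely still need
proofreading by hand.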

> I was told it can be done with some software in windows. Is there such
> software in
> Linux?

Yes, but it's not up to the mark. I did some research on Freshmeat
some time ago. I recommend, if this is a one-time thing, getting the
OCR done at a shop.

> I was able to convert a .gif file in question to .pdf and open it with
> acroread. Acroread has an option to convert to text, however, trying to do
> it I get an empty file.

Basically the image is being embedded, pixel by pixel, into a PDF
file. The conversion program doesn't understand that the image is
showing English (?) text, and neither does Acroread. Acroread's
convert to text feature usually works because usually PDF files
contain text. You can verify this by opening a sampling of PDF files
with `less'. Then try opening your scanned image's PDF file with
`less'. You won't see the scanned text, but you will see a truckload
of gibberish data -- more or less the pixel-by-pixel description of
the image.
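
The same check can be roughly automated. The Python sketch below is only
a heuristic under stated assumptions: it guesses that a PDF is a bare
scan when the raw bytes mention an image object but no font. The function
name is hypothetical, and PDFs that pack their objects into compressed
streams can fool it in either direction.

```python
def looks_image_only(pdf_bytes):
    """Rough guess: True when a PDF declares images but no fonts."""
    # Text in a PDF needs a font resource; an embedded scan does not.
    has_font = b"/Font" in pdf_bytes
    has_image = b"/Image" in pdf_bytes
    return has_image and not has_font
```

To try it, read the file in binary mode and pass the bytes in, for
instance looks_image_only(open("scan.pdf", "rb").read()), where scan.pdf
is a placeholder name.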

--
Yawar


Re: Converting .gif to .txt

on 10.02.2006 07:46:18 by heisspf

On Thu, 09 Feb 2006 09:08:44 +0100
Andrew wrote:

> heisspf@skyinet.net wrote:
>
> >Hi,
> >
> >How can one convert a text document which has been scanned and therefore
> >has become a
> >.gif file back to a .txt file in order that one can copy and paste text
> >from it.
> >
> >I was told it can be done with some software in windows. Is there such
> >software in
> >Linux?
> >
> >
> I've done this sort of thing under Windows using Optical Character
> Recognition software bundled with a scanner. There are OCR programs
> available for Linux (I think there's something called clara - not sure
> now), but they take rather a lot of work (or did when I looked at them a
> year or so ago).
>
> >I was able to convert a .gif file in question to .pdf and open it with
> >acroread. Acroread has an option to convert to text, however, trying to do
> >it I get an empty file.
> >
> >
> If you can convert to pdf, it is possible to cut and paste from xpdf to
> certain editors (Open Office, gedit. Not abiword), but results are
> hardly perfect and will probably need quite a bit of editing).

Yes, that seems to work when the file was apparently written as PDF, but
not with a file converted from gif to pdf. I did this conversion with xpaint,
and I think all it does is change the suffix. Opened with mc, the pdf looks
the same as the gif file.
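
One way to confirm a suspicion like this is to look at the first few bytes
of the file rather than its suffix: GIF files begin with GIF87a or GIF89a,
and PDF files begin with %PDF-. A small Python sketch follows; the
function name is made up for illustration.

```python
def sniff_format(data):
    """Guess a file's real type from its leading bytes, not its suffix."""
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    if data.startswith(b"%PDF-"):
        return "pdf"
    return "unknown"
```

Reading the first eight bytes of the converted file in binary mode and
passing them to sniff_format is enough. If the answer is "gif" for a file
named .pdf, only the name was changed.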

Thanks for all the information.

Peter

> Andrew
>

--
Peter