Convert plain text to html formatted text

on 26.04.2006 00:14:51 by Justin C

To keep users away from the dirty end of html/perl-cgi I want to
automate some web-pages. The easiest way I can see of doing this is that
the users submit their data for this web-site in plain text. I know the
first line will always be <h1> and the second line onwards will be just
plain <p>, each paragraph will be separated by an extra CR (these text
files will be generated on OS X clients so I shouldn't have trouble with
MS CR/LF combinations).

I can see how I can get the first line to format it accordingly but I'm
not sure how to read the rest of the text, specifically: how to spot two
CRs (must remember not to chomp!).

So, I suppose the question is, how do I spot two consecutive CRs in a
string?

I have to do this without modules as the site will be hosted on a server
that has unknown modules and also on which I can't add or request
modules.

If external modules would make it all simpler then I can, probably,
pre-process the text files with a perl script locally before uploading
so they can just be dropped into the CGI output.

Thanks for any suggestions you care to make.

Justin.

--
Justin C, by the sea.

Re: Convert plain text to html formatted text

on 26.04.2006 01:15:43 by Matt Garrish

"Justin C" wrote in message
news:107d.444e9f5b.ecbaa@stigmata...
>
> To keep users away from the dirty end of html/perl-cgi I want to
> automate some web-pages. The easiest way I can see of doing this is that
> the users submit their data for this web-site in plain text. I know the
> first line will always be <h1> and the second line onwards will be just
> plain <p>, each paragraph will be separated by an extra CR (these text
> files will be generated on OS X clients so I shouldn't have trouble with
> MS CR/LF combinations).
>

my @paras = split(/\r\r/, $userinput);

my $h1 = shift @paras;

foreach my $para (@paras) {
    # do whatever
}

Matt
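
A minimal sketch of how the split idea above could be carried through to HTML output without any modules. It assumes the line endings are plain "\n", that the first line is the heading, and that paragraphs are separated by one or more blank lines; the helper names escape_html and text_to_html are made up for this illustration.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML (core Perl only).
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# First line becomes <h1>; each blank-line-separated block becomes <p>.
sub text_to_html {
    my ($input) = @_;
    my ($heading, $body) = split /\n/, $input, 2;
    $heading = '' unless defined $heading;
    $body    = '' unless defined $body;

    # Split on runs of two or more newlines and skip empty pieces.
    my @paras = grep { /\S/ } split /\n{2,}/, $body;

    my $html = '<h1>' . escape_html($heading) . "</h1>\n";
    for my $para (@paras) {
        $para =~ s/\A\s+|\s+\z//g;    # trim surrounding whitespace
        $html .= '<p>' . escape_html($para) . "</p>\n";
    }
    return $html;
}

print text_to_html("My Title\nFirst paragraph,\nsecond line.\n\nSecond & last paragraph.\n");
```

Taking the heading off with a two-field split first means it does not matter whether the heading is followed by a blank line or not.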

Re: Convert plain text to html formatted text

on 26.04.2006 01:25:50 by Gunnar Hjalmarsson

Justin C wrote:
> how do I spot two consecutive CRs in a string?

/\n\n/

> I have to do this without modules as the site will be hosted on a server
> that has unknown modules and also on which I can't add or request
> modules.

You can always add modules, at least many of the pure Perl modules.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl

Re: Convert plain text to html formatted text

on 26.04.2006 02:04:20 by Jim Gibson

In article <107d.444e9f5b.ecbaa@stigmata>, Justin C
wrote:

> To keep users away from the dirty end of html/perl-cgi I want to
> automate some web-pages. The easiest way I can see of doing this is that
> the users submit their data for this web-site in plain text. I know the
> first line will always be <h1> and the second line onwards will be just
> plain <p>, each paragraph will be separated by an extra CR (these text
> files will be generated on OS X clients so I shouldn't have trouble with
> MS CR/LF combinations).
>
> I can see how I can get the first line to format it accordingly but I'm
> not sure how to read the rest of the text, specifically: how to spot two
> CRs (must remember not to chomp!).
>
> So, I suppose the question is, how do I spot two consecutive CRs in a
> string?
>
> I have to do this without modules as the site will be hosted on a server
> that has unknown modules and also on which I can't add or request
> modules.
>
> If external modules would make it all simpler then I can, probably,
> pre-process the text files with a perl script locally before uploading
> so they can just be dropped into the CGI output.
>
> Thanks for any suggestions you care to make.

If you are reading one line at a time, then two CRs will result in a
blank line being read for the second one, whether or not you use chomp.
You can also read more than one line at a time to detect the end of a
paragraph signified by two CRs: perldoc -q paragraphs "How can I read
in a file by paragraphs?"

If you already have the entire text in a string, then you can use index
to locate two CRs in the string, or use split on two CRs to split the
string into paragraphs.
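
A short sketch of the "read by paragraphs" approach from that FAQ entry: setting $/ to the empty string puts Perl into paragraph mode, so each read returns one paragraph. The sample text and the in-memory filehandle are only there to keep the example self-contained; a real script would open the uploaded file instead, and "\n" line endings are assumed.

```perl
use strict;
use warnings;

# Sample text; an in-memory filehandle stands in for a real file here.
my $text = "Heading line\n\nFirst paragraph\nstill first.\n\n\nSecond paragraph.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @paras;
{
    local $/ = '';           # paragraph mode: a record ends at one or more blank lines
    while (my $para = <$fh>) {
        chomp $para;         # in paragraph mode chomp strips all trailing newlines
        push @paras, $para;
    }
}
close $fh;

printf "%d: %s\n", $_ + 1, $paras[$_] for 0 .. $#paras;
```

Using local inside a bare block restores the normal line-at-a-time behaviour as soon as the block ends.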

Re: Convert plain text to html formatted text

on 26.04.2006 10:08:19 by Joe Smith

Justin C wrote:
> each paragraph will be separated by an extra CR (these text
> files will be generated on OS X clients so I shouldn't have trouble with
> MS CR/LF combinations).

Are you sure the data being read by the CGI will have only CR
and not LF? The fact that Macs use \r inside files is somewhat
irrelevant when it comes to data posted from an HTTP client.

Will you be handling multipart/form-data and application/x-www-form-urlencoded
in Content-Type headers?

-Joe
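
One way to sidestep the question of which line ending actually arrives is to normalise everything to "\n" before looking for paragraph breaks. The following is a sketch under that assumption; normalize_newlines is a name invented for this example.

```perl
use strict;
use warnings;

# Convert CRLF pairs and bare CRs to a single LF.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;
    return $text;
}

# Deliberately mixed line endings to show that all of them are handled.
my $mixed = "Title\r\n\r\nPara one\r\rPara two\n\nPara three";
my @paras = split /\n{2,}/, normalize_newlines($mixed);

print scalar(@paras), " pieces\n";
```

After this step a single pattern such as /\n{2,}/ is enough, whatever the client or the transport did to the text.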

Re: Convert plain text to html formatted text

on 26.04.2006 23:52:46 by Justin C

On 2006-04-26, Joe Smith wrote:
> Justin C wrote:
>> each paragraph will be separated by an extra CR (these text
>> files will be generated on OS X clients so I shouldn't have trouble with
>> MS CR/LF combinations).
>
> Are you sure the data being read by the CGI will have only CR
> and not LF? The fact that Macs use \r inside files is somewhat
> irrelevant when it comes to data posted from an HTTP client.
>
> Will you be handling multipart/form-data and application/x-www-form-urlencoded
> in Content-Type headers?

Turns out they're \n's anyway. The files aren't being uploaded by
users, they're being saved to a specific place on a server and they're
being sourced from there by another perl script that then
uploads/mirrors our local web-site on that hosted by our ISP.

Thanks anyway.


Justin.

--
Justin C, by the sea.

Re: Convert plain text to html formatted text

on 26.04.2006 23:55:29 by Justin C

On 2006-04-25, Gunnar Hjalmarsson wrote:
> Justin C wrote:
>> how do I spot two consecutive CRs in a string?
>
> /\n\n/
>
>> I have to do this without modules as the site will be hosted on a server
>> that has unknown modules and also on which I can't add or request
>> modules.
>
> You can always add modules, at least many of the pure Perl modules.

Do you mean instead of using CPAN to install modules I can just install
them to somewhere I have control and then call them in the script? I'm
guessing I'd have to specify the path too but that's no problem.

That's an interesting idea if I have understood you correctly.

It seems with Perl I just keep learning!


Justin.

--
Justin C, by the sea.

Re: Convert plain text to html formatted text

on 26.04.2006 23:57:04 by Justin C

On 2006-04-25, Matt Garrish wrote:
>
> "Justin C" wrote in message
> news:107d.444e9f5b.ecbaa@stigmata...
>>
>> To keep users away from the dirty end of html/perl-cgi I want to
>> automate some web-pages. The easiest way I can see of doing this is that
>> the users submit their data for this web-site in plain text. I know the
>> first line will always be <h1> and the second line onwards will be just
>> plain <p>, each paragraph will be separated by an extra CR (these text
>> files will be generated on OS X clients so I shouldn't have trouble with
>> MS CR/LF combinations).
>>
>
> my @paras = split(/\r\r/, $userinput);
>
> my $h1 = shift @paras;
>
> foreach my $para (@paras) {
> # do whatever
> }

That's interesting, thanks for posting. I'm going with another poster's
suggestion of getting the text as one whole string, the substitutions
seem quite trivial after that.


Justin.

--
Justin C, by the sea.

Re: Convert plain text to html formatted text

on 27.04.2006 00:07:42 by Justin C

On 2006-04-26, Jim Gibson wrote:
>
[snip]

> If you are reading one line at a time, then two CRs will result in a
> blank line being read for the second one, whether or not you use chomp.
> You can also read more than one line at a time to detect the end of a
> paragraph signified by two CRs: perldoc -q paragraphs "How can I read
> in a file by paragraphs?"
>
That would do it exactly, the trouble I have is that I don't know what I
should search for in perldoc!


> If you already have the entire text in a string, then you can use index
> to locate two CRs in the string, or use split on two CRs to split the
> string into paragraphs.

I can see from 'index' how to locate parts of the text but not how to
replace. I've gone with putting it all into one string and then doing
basic substitution - it's a bit ugly but doesn't appear too slow.

Thanks to all who replied. I still feel I'm new to Perl, I read the
Llama book two or three years ago now but don't get to write much code.
I've had to write a whole bunch recently for various automated web-pages
and I've learned a lot from doing so. When I look back over my early
code I find so many ways of improving it - especially its readability.
Perhaps, in a few more years I'll look at this lot I've just written and
be able to improve it to the same extent. I like it when you can review
your code and you can find a way of making that 20 lines of ugly code
into 5 lines of clean, readable code.

Thanks again for your time and suggestions.


Justin.

--
Justin C, by the sea.

Re: Convert plain text to html formatted text

on 27.04.2006 00:25:55 by Gunnar Hjalmarsson

Justin C wrote:
> On 2006-04-25, Gunnar Hjalmarsson wrote:
>>Justin C wrote:
>>>I have to do this without modules as the site will be hosted on a server
>>>that has unknown modules and also on which I can't add or request
>>>modules.
>>
>>You can always add modules, at least many of the pure Perl modules.
>
> Do you mean instead of using CPAN to install modules I can just install
> them to somewhere I have control and then call them in the script?

Yes.

> I'm guessing I'd have to specify the path too but that's no problem.

See the docs of a module I wrote for an example:
http://search.cpan.org/perldoc?CGI%3A%3AContactForm#Manual_Installation

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
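
A small sketch of what "specify the path" usually looks like in practice, using the core lib pragma. The directory /home/user/perl-lib is purely hypothetical; it stands for any directory you control into which a pure-Perl module's .pm files have been copied (keeping their directory layout, e.g. CGI/ContactForm.pm).

```perl
use strict;
use warnings;

# Hypothetical directory holding hand-installed pure-Perl modules.
# 'use lib' adds it to the front of @INC at compile time.
use lib '/home/user/perl-lib';

# A later 'use Some::Module;' would now be searched for in that
# directory before the system-wide library directories.
print "Module search path:\n";
print "  $_\n" for @INC;
```

This only helps with modules that need no compilation; anything with C/XS parts would still have to be built for the server's platform.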

Re: Convert plain text to html formatted text

on 27.04.2006 22:30:27 by Justin C

On 2006-04-26, Gunnar Hjalmarsson wrote:
> Justin C wrote:
>> On 2006-04-25, Gunnar Hjalmarsson wrote:
>>>Justin C wrote:
>>>>I have to do this without modules as the site will be hosted on a server
>>>>that has unknown modules and also on which I can't add or request
>>>>modules.
>>>
>>>You can always add modules, at least many of the pure Perl modules.
>>
>> Do you mean instead of using CPAN to install modules I can just install
>> them to somewhere I have control and then call them in the script?
>
> Yes.
>
>> I'm guessing I'd have to specify the path too but that's no problem.
>
> See the docs of a module I wrote for an example:
> http://search.cpan.org/perldoc?CGI%3A%3AContactForm#Manual_Installation

That's very interesting, thank you. I may have use for this... at least,
I did have a use for this! I'll have to try and remember what it was.

Justin.

--
Justin C, by the sea.

Re: Convert plain text to html formatted text

on 28.04.2006 02:07:10 by Jim Gibson

In article <2792.444fef2e.40b77@stigmata>, Justin C
wrote:

> On 2006-04-26, Jim Gibson wrote:
> >
> [snip]
>
> > If you are reading one line at a time, then two CRs will result in a
> > blank line being read for the second one, whether or not you use chomp.
> > You can also read more than one line at a time to detect the end of a
> > paragraph signified by two CRs: perldoc -q paragraphs "How can I read
> > in a file by paragraphs?"
> >
> That would do it exactly, the trouble I have is that I don't know what I
> should search for in perldoc!

Yes, I know the problem! The Perl Cookbook is a good source of complete
programs doing common tasks. If you can find one there that is close,
you can adapt it for your needs.


> > If you already have the entire text in a string, then you can use index
> > to locate two CRs in the string, or use split on two CRs to split the
> > string into paragraphs.
>
> I can see from 'index' how to locate parts of the text but not how to
> replace. I've gone with putting it all into one string and then doing
> basic substitution - it's a bit ugly but doesn't appear too slow.

The substr function can be used in conjunction with index to do a
substitution and avoid regular expressions. You can use substr on the
left-hand-side of an assignment statement or use the 4-argument form
with a replacement string. It is not as elegant or as concise as using
regular expressions and the substitute operator, but it can be quicker.
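
To make the index/substr combination concrete, here is a minimal sketch that replaces every blank-line separator with paragraph tags using the 4-argument form of substr. The function name replace_all and the sample text are invented for the illustration, and "\n" line endings are assumed.

```perl
use strict;
use warnings;

# Replace every occurrence of $from with $to using index and 4-argument substr.
sub replace_all {
    my ($text, $from, $to) = @_;
    return $text if $from eq '';    # an empty search string would never advance

    my $pos = 0;
    while (($pos = index($text, $from, $pos)) != -1) {
        substr($text, $pos, length($from), $to);   # replace in place
        $pos += length($to);                       # resume after the replacement
    }
    return $text;
}

my $body = "First paragraph.\n\nSecond paragraph.\n\nThird.";
print '<p>', replace_all($body, "\n\n", "</p>\n<p>"), "</p>\n";
```

The regex equivalent would be a single s/\n\n/<\/p>\n<p>/g, which shows why the substitution operator is the more concise choice.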


Re: Convert plain text to html formatted text

on 30.04.2006 14:45:14 by Justin C

On 2006-04-28, Jim Gibson wrote:
> In article <2792.444fef2e.40b77@stigmata>, Justin C
> wrote:
>
>> On 2006-04-26, Jim Gibson wrote:
>> >
>> [snip]
>>
>> > If you are reading one line at a time, then two CRs will result in a
>> > blank line being read for the second one, whether or not you use chomp.
>> > You can also read more than one line at a time to detect the end of a
>> > paragraph signified by two CRs: perldoc -q paragraphs "How can I read
>> > in a file by paragraphs?"
>> >
>> That would do it exactly, the trouble I have is that I don't know what I
>> should search for in perldoc!
>
> Yes, I know the problem! The Perl Cookbook is a good source of complete
> programs doing common tasks. If you can find one there that is close,
> you can adapt it for your needs.

Dammit! I've got that book too! I've hardly picked it up. I should
really use it from time to time... it cost enough.
>
>
>> > If you already have the entire text in a string, then you can use index
>> > to locate two CRs in the string, or use split on two CRs to split the
>> > string into paragraphs.
>>
>> I can see from 'index' how to locate parts of the text but not how to
>> replace. I've gone with putting it all into one string and then doing
>> basic substitution - it's a bit ugly but doesn't appear too slow.
>
> The substr function can be used in conjunction with index to do a
> substitution and avoid regular expressions. You can use substr on the
> left-hand-side of an assignment statement or use the 4-argument form
> with a replacement string. It is not as elegant or as concise as using
> regular expressions and the substitute operator, but it can be quicker.

Oooh. That looks messy (combining index and substr). Still, I think I'll
set myself a task and give it a try, then compare the solutions. If
nothing else I'll learn more perl.

Thank you for your reply.


Justin.

--
Justin C, by the sea.