Shameless plug: Techworld covers mod_perl based CMS

on 06.03.2009 19:31:13 by Hendrik Van Belleghem

All,

The Australian Techworld website recently published an interview with me on
Spine, a mod_perl based CMS I created, as part of their Open Source Identity
series. It discusses the differences from existing CMSs, and the RDBMS and
language choices.
Maybe a good read. Article link below:

http://www.techworld.com.au/article/278847/open_source_identity_spine_cms_creator_hendrik_van_belleghem

Ciao
Hendrik Van Belleghem
Spine - The backbone for your website - http://spine.sf.net

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 10:31:49 by aw

Iosif Fettich wrote:
[...]
Hi.
I have not looked at your code in detail, but in general: there is
nothing in Apache or mod_perl that will automatically and magically wrap
a response in any HTML tag sequence.
So the only reasonable explanation is that your back-end server
generates these tags (or your browser?).
Why don't you put some tracing code in your handler, to dump what it
really receives from the back-end?
Like:

$r->log_error("going to get : $new_url");
...
my $content = $response->content;
$r->log_error("got : " . $content);
...

and look in your Apache error log.
You could also use lwp-request (comes with perl), to retrieve directly
what you think your handler is retrieving.
lwp-request, as the name indicates, uses the LWP module, just like your
handler does.

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 11:53:57 by Perrin Harkins

On Mon, Mar 23, 2009 at 4:10 AM, Iosif Fettich wrote:
> The problem is that what I want to be the handler's proxied response is
> actually embedded instead in a construct like
>
> [HTML tags stripped by the archive]
> ...
> [HTML tags stripped by the archive]
>
> which I seem not to be able to get rid of. What am I doing wrong..?

I don't see you printing any content type or other headers. Those
aren't in $response->content.
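
To illustrate the point about headers, here is a minimal sketch that assumes the handler gets an HTTP::Response back from LWP. The response is built by hand so no network is needed; the header value and body are made up for the example.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built response standing in for what LWP would return.
# The header value and the body are invented for illustration.
my $response = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=ISO-8859-1' ],
    "<html><body>hello</body></html>\n",
);

# content() returns the body only; it carries no header lines.
my $body = $response->content;

# The media type has to be read from the headers separately.
# In scalar context content_type() returns the type without parameters.
my $type = $response->content_type;

print "type: $type\n";
print "body: $body";
```

A proxying handler would therefore need to copy the type (and any other headers it cares about) to the outgoing response itself before printing the body.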

> Are there any obvious/better ways to get the functionality I hope to get ?

Everything you've shown so far could be done more efficiently by a
couple of lines of mod_rewrite.

- Perrin

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 12:14:13 by Iosif Fettich

Hi Perrin,

> I don't see you printing any content type or other headers. Those
> aren't in $response->content.

I've omitted printing headers explicitly :(

Have to see when and how I should do this; simply inserting a

$r->content_type( 'text/html' );

before my

$r->print( $content );

seems to be a NOOP..

> Everything you've shown so far could be done more efficiently by a
> couple of lines of mod_rewrite.

I'll re-read mod_rewrite then ;)

I'm just not aware yet that I could check the outcome of a subrequest and
put some proxied response in place if the subrequest is unsuccessful.
Isn't mod-rewrite just a _request_ rewrite ?

Thanks,

Iosif Fettich

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 12:35:09 by Perrin Harkins

On Mon, Mar 23, 2009 at 7:14 AM, Iosif Fettich wrote:
> I've omitted printing headers explicitly :(

HTTP won't work without headers.

> Have to see when and how I should do this; simply inserting a
>
> $r->content_type( 'text/html' );
>
> before my
>
> $r->print( $content );
>
> seems to be a NOOP..

Nope, it's not a NOOP. Maybe you're setting it too late.

>> Everything you've shown so far could be done more efficiently by a couple
>> of lines of mod_rewrite.
>
> I'll re-read mod_rewrite then ;)
>
> I'm just not aware yet that I could check the outcome of a subrequest and
> put some proxied response in place if the subrequest is unsuccessful. Isn't
> mod-rewrite just a _request_ rewrite ?

It can do just about anything:
http://httpd.apache.org/docs/1.3/misc/rewriteguide.html

- Perrin

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 12:57:18 by aw

Perrin Harkins wrote:
> On Mon, Mar 23, 2009 at 7:14 AM, Iosif Fettich wrote:
>> I've omitted printing headers explicitly :(
>
> HTTP won't work without headers.
>
>> Have to see when and how I should do this; simply inserting a
>>
>> $r->content_type( 'text/html' );
>>
>> before my
>>
>> $r->print( $content );
>>
>> seems to be a NOOP..
>
> Nope, it's not a NOOP. Maybe you're setting it too late.
>
You may also want to have a look at this previous thread :
http://marc.info/?l=apache-modperl&m=123072345912551&w=2

In the second or third message, there is a paragraph by Rainer Jung
which explains why setting the Content-Type may sometimes appear not to
work. A solution is also given later in that thread.

This does not mean that what Perrin says above is wrong.
Apparently, with the Content-Type header, timing is really of the essence.

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 16:28:14 by Iosif Fettich

Hi Perrin,

>> I'm just not aware yet that I could check the outcome of a subrequest and
>> put some proxied response in place if the subrequest is unsuccessful. Isn't
>> mod-rewrite just a _request_ rewrite ?
>
> It can do just about anything:
> http://httpd.apache.org/docs/1.3/misc/rewriteguide.html

Looks like it should be simple. And still I can't get it to do what I want.
There is actually an almost _exact_ FAQ-like answer to my problem in the
doc, http://httpd.apache.org/docs/2.2/rewrite/rewrite_guide_advanced.html:

---
Redirect Failing URLs to Another Web Server

Description:

A typical FAQ about URL rewriting is how to redirect failing requests
on webserver A to webserver B. Usually this is done via ErrorDocument CGI
scripts in Perl, but there is also a mod_rewrite solution. But note that
this performs more poorly than using an ErrorDocument CGI script!

Solution:

The first solution has the best performance but less flexibility, and
is less safe:

RewriteEngine on
RewriteCond /your/docroot/%{REQUEST_FILENAME} !-f
RewriteRule ^(.+) http://webserverB.dom/$1

The problem here is that this will only work for pages inside the
DocumentRoot. While you can add more Conditions (for instance to also
handle homedirs, etc.) there is a better variant:

RewriteEngine on
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^(.+) http://webserverB.dom/$1

This uses the URL look-ahead feature of mod_rewrite. The result is
that this will work for all types of URLs and is safe. But it does have a
performance impact on the web server, because for every request there is
one more internal subrequest. So, if your web server runs on a powerful
CPU, use this one. If it is a slow machine, use the first approach or
better an ErrorDocument CGI script.

---

So it seems to be very, very easy. Still, when using the above recipe
like
RewriteEngine on
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^\/(.+) http://OLDDOMAIN.COM/$1 [QSA,P]

instead of getting the proxied content, I get

---
Not Found

The requested URL /index.php was not found on this server.
---

for a GET request like http://mydomain.com/index.php and the rewrite log
(with RewriteLogLevel 9) looks like

[rid#2ad8dc3dfb98/initial] (2) init rewrite engine with requested uri /index.php
[rid#2ad8dc3dfb98/initial] (3) applying pattern '^\/(.+)' to uri '/index.php'
[rid#2ad8dc3e5bc8/subreq] (2) init rewrite engine with requested uri /index.php
[rid#2ad8dc3e5bc8/subreq] (3) applying pattern '^\/(.+)' to uri '/index.php'
[rid#2ad8dc3e5bc8/subreq] (4) RewriteCond: input='/index.php' pattern='!-U' => matched
[rid#2ad8dc3e5bc8/subreq] (2) rewrite '/index.php' -> 'http://OLDDOMAIN.COM/index.php'
[rid#2ad8dc3e5bc8/subreq] (2) forcing proxy-throughput with http://OLDDOMAIN.COM/index.php
[rid#2ad8dc3e5bc8/subreq] (1) go-ahead with proxy request proxy:http://OLDDOMAIN.COM/index.php [OK]
[rid#2ad8dc3dfb98/initial] (5) RewriteCond URI (-U) check: path=/index.php -> status=200
[rid#2ad8dc3dfb98/initial] (4) RewriteCond: input='/index.php' pattern='!-U' => not-matched
[rid#2ad8dc3dfb98/initial] (1) pass through /index.php

If I do the proxying unconditionally, like

RewriteEngine on
# RewriteCond %{REQUEST_URI} !-U
RewriteRule ^\/(.+) http://OLDDOMAIN.COM/$1 [QSA,P]

it works OK.

Any clue/idea of what might be wrong here ?

Many thanks,

Iosif Fettich

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 17:20:38 by torsten.foertsch

On Mon 23 Mar 2009, Iosif Fettich wrote:
> So it seems to be very, very easy. Still, when using the above
> recipe like
>      RewriteEngine on
>      RewriteCond   %{REQUEST_URI}   !-U
>      RewriteRule   ^\/(.+)          http://OLDDOMAIN.COM/$1 [QSA,P]

The engine tries to resolve the request uri with a subrequest. Hence it
goes through this rule twice. Try to add "NS" (no subrequest) to the
flags [QSA,P,NS].

Torsten

--
Need professional mod_perl support?
Just hire me: torsten.foertsch@gmx.net

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 17:32:02 by Iosif Fettich

Hi Torsten,

On Mon, 23 Mar 2009, Torsten Foertsch wrote:

> On Mon 23 Mar 2009, Iosif Fettich wrote:
>> So it seems to be very, very easy. Still, when using the above
>> recipe like
>>      RewriteEngine on
>>      RewriteCond   %{REQUEST_URI}   !-U
>>      RewriteRule   ^\/(.+)          http://OLDDOMAIN.COM/$1 [QSA,P]
>
> The engine tries to resolve the request uri with a subrequest. Hence it
> goes through this rule twice. Try to add "NS" (no subrequest) to the
> flags [QSA,P,NS].

I have; the log now shows

[rid#2ba8b0168af8/initial] (3) applying pattern '^\/(.+)' to uri '/index.php'
[rid#2ba8b04b9e18/subreq] (2) init rewrite engine with requested uri /index.php
[rid#2ba8b04b9e18/subreq] (1) pass through /index.php
[rid#2ba8b0168af8/initial] (5) RewriteCond URI (-U) check: path=/index.php -> status=200
[rid#2ba8b0168af8/initial] (4) RewriteCond: input='/index.php' pattern='!-U' => not-matched
[rid#2ba8b0168af8/initial] (1) pass through /index.php

which looks a bit different, but still leaves me with the "The requested
URL /index.php was not found on this server." error.

I'm not really understanding what the log is saying.

Or is the [P]roxy flag not working as it should or as I expect it to ?
It seems to work fine for the subrequest (status=200) ...?

So far, I just cannot understand what's going on here or where to look and
what to try.

Many thanks,

Iosif Fettich


Re: who's putting that <pre> tag in the output...?

on 23.03.2009 18:18:11 by torsten.foertsch

On Mon 23 Mar 2009, Iosif Fettich wrote:
> Or is the [P]roxy flag not working as it should or as I expect it to
> ? It seems to work fine for the subrequest (status=200) ...?

This is exactly the problem. The 404 is normally generated in the
response phase from the default response handler. The subreq lookup
won't check that. It does not run the subreq but only checks if after
fixup there is still no error.

So the simplest solution for you would be an ErrorDocument I think.

Something like this:

ErrorDocument 404 /-/404 <- put an url here you are not using otherwise
RewriteRule ^/-/404 http://...%{REDIRECT_URI} [P]

Write a simple printenv and activate it as ErrorDocument first to get
the right environment variable names (not sure if it is REDIRECT_URI).
If you use mod_include that would be

<!--#printenv -->

in an .shtml file.

This solution should work for GET/HEAD. POST requests are still a
problem. If you use them we'll make it work without ErrorDocument. BTW,
why don't you use the file lookup (-f). That is much easier. Is your
uri->file mapping so complicated?

Torsten

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 19:08:50 by Iosif Fettich

Hi Torsten,

>> ? It seems to work fine for the subrequest (status=200) ...?
>
> This is exactly the problem. The 404 is normally generated in the
> response phase from the default response handler. The subreq lookup
> won't check that. It does not run the subreq but only checks if after
> fixup there is still no error.

Hmm... Would that mean that the example recipe in the rewrite guide
simply doesn't work as given/explained ? For no one ...?!

> So the simplest solution for you would be an ErrorDocument I think.
>
> Something like this:
>
> ErrorDocument 404 /-/404 <- put an url here you are not using otherwise
> RewriteRule ^/-/404 http://...%{REDIRECT_URI} [P]
>
> Write a simple printenv and activate it as ErrorDocument first to get
> the right environment variable names (not sure if it is REDIRECT_URI).
> If you use mod_include that would be
>
> <!--#printenv -->
>
> in an .shtml file.

I must be doing something wrong again. Commenting out RewriteEngine and setting

ErrorDocument 404 /404.shtml

with 404.shtml containing the above shows
[...]
REDIRECT_URL=/404.xxx
GATEWAY_INTERFACE=CGI/1.1
SERVER_PROTOCOL=HTTP/1.1
REQUEST_METHOD=GET
QUERY_STRING=
REQUEST_URI=/404.xxx
SCRIPT_NAME=/404.shtml
DATE_LOCAL=Monday, 23-Mar-2009 19:49:31 EET
DATE_GMT=Monday, 23-Mar-2009 17:49:31 GMT
LAST_MODIFIED=Monday, 23-Mar-2009 19:30:26 EET
DOCUMENT_URI=/404.shtml
DOCUMENT_NAME=404.shtml

(Nothing like REDIRECT_URI).

So I believe it is actually REQUEST_URI I'm after ?

Now, re-enabling the RewriteEngine and setting

ErrorDocument 404 /-/404

RewriteLog /tmp/REWRITE.log
RewriteLogLevel 99

RewriteEngine on
RewriteRule ^/-/404 http://OLD_SITE.COM%{REQUEST_URI} [P]

brings me the _error_ page of the old site, and the rewrite log for
my http://ACTUAL_SITE.COM/index.php looks like

[rid#2b5dc09e5c68/initial] (2) init rewrite engine with requested uri /index.php
[rid#2b5dc09e5c68/initial] (3) applying pattern '^/-/404' to uri '/index.php'
[rid#2b5dc09e5c68/initial] (1) pass through /index.php
[rid#2b5dc09e4d48/initial/redir#1] (2) init rewrite engine with requested uri /-/404
[rid#2b5dc09e4d48/initial/redir#1] (3) applying pattern '^/-/404' to uri '/-/404'
[rid#2b5dc09e4d48/initial/redir#1] (2) rewrite '/-/404' -> 'http://OLD_SITE.COM/-/404'
[rid#2b5dc09e4d48/initial/redir#1] (2) forcing proxy-throughput with http://OLD_SITE.COM/-/404
[rid#2b5dc09e4d48/initial/redir#1] (1) go-ahead with proxy request proxy:http://OLD_SITE.COM/-/404 [OK]

Still not there yet, not sure how I could get what you mean by
REDIRECT_URI.

> This solution should work for GET/HEAD. POST requests are still a
> problem. If you use them we'll make it work without ErrorDocument.

I'm not sure if POSTs are used, that would be the next step then.

> BTW, why don't you use the file lookup (-f). That is much easier. Is
> your uri->file mapping so complicated?

I'd say yes, it's some CMS in the back and almost all of the content is
dynamic, most requests look like

http://MYDOMAIN.COM/index.php?qs_sect_id=3456 etc.

Many thanks,

Iosif Fettich

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 19:18:21 by Iosif Fettich

Hi Torsten,

just following up my previous message:

> [...]
> REDIRECT_URL=/404.xxx
> GATEWAY_INTERFACE=CGI/1.1
> [...]
>
> (Nothing like REDIRECT_URI).

Using the (obvious...!) REDIRECT_URL as you suggested works! :)

I think there's just a new member in the 'Torsten, you're the greatest!'
club ;)

Many many thanks, once again.

Iosif Fettich

Re: who's putting that <pre> tag in the output...?

on 23.03.2009 19:59:56 by torsten.foertsch

On Mon 23 Mar 2009, Iosif Fettich wrote:
> Using the (obvious...!) REDIRECT_URL as you suggested works! :)

An ErrorDocument is an internal redirect. These REDIRECT_... environment
variables are copied from the previous ($r->prev) request's
$r->subprocess_env just by copying everything and prepending REDIRECT_
to each key. So if the original request has an environment variable
named REQUEST_URI the error document should have a
REDIRECT_REQUEST_URI, see rename_original_env() in
httpd-x.y/modules/http/http_request.c.

Since REQUEST_URI is the standard CGI environment variable (see
ap_add_cgi_vars() httpd-x.y/server/util_script.c) I'd take
REDIRECT_REQUEST_URI.
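
The renaming described above can be pictured with a small standalone sketch. This is only an illustration of the prefixing rule, not the httpd code; the function name redirected_env and the sample environment are hypothetical.

```perl
use strict;
use warnings;

# Illustration only: mimic how an internal redirect exposes the previous
# request's environment, by prepending REDIRECT_ to every key.
# redirected_env is a made-up helper, not part of Apache or mod_perl.
sub redirected_env {
    my (%orig) = @_;
    return map { ("REDIRECT_$_" => $orig{$_}) } keys %orig;
}

# Hypothetical environment of the original (failing) request.
my %orig = (
    REQUEST_URI  => '/index.php?qs_sect_id=3456',
    QUERY_STRING => 'qs_sect_id=3456',
);

my %env = redirected_env(%orig);
print "$_=$env{$_}\n" for sort keys %env;
```

Only variables that were actually present in the original request's environment get a REDIRECT_ counterpart, which is why dumping the environment first (for example with printenv) is the safest way to find the right name.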

Torsten

Re: who's putting that <pre> tag in the output...?

on 24.03.2009 13:02:19 by Iosif Fettich

Hi Torsten,

> An ErrorDocument is an internal redirect. These REDIRECT_... environment
> variables are copied from the previous ($r->prev) request's
> $r->subprocess_env just by copying everything and prepending REDIRECT_
> to each key. So if the original request has an environment variable
> named REQUEST_URI the error document should have a REDIRECT_REQUEST_URI,
> see rename_original_env() in httpd-x.y/modules/http/http_request.c.
>
> Since REQUEST_URI is the standard CGI environment variable (see
> ap_add_cgi_vars() httpd-x.y/server/util_script.c) I'd take
> REDIRECT_REQUEST_URI.

As it turned out, I was (entirely) wrong when I thought it was
working. It was wishful thinking, not a real solution: none of the
REQUEST_URI, REDIRECT_URL and/or REDIRECT_QUERY_STRING environment
variables seemed to be good enough for a mod_rewrite solution, or at least
I was unable to build one. (I made some errors in testing while repeatedly
commenting various httpd.conf settings out and back in, so it wasn't
_really_ working when I thought it was.)

Summing up what I have so far (which might be incomplete or even wrong):

I was looking for a cheap/good/working way to solve what

http://httpd.apache.org/docs/2.2/rewrite/rewrite_guide_advanced.html

describes under the title "Redirect Failing URLs to Another Web Server",
but with the (it seems important) difference that I want to hide the new
server from the eyes of the customers and as such _proxy_ the failing
requests instead of redirecting them. The given recipe

RewriteEngine on
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^(.+) http://webserverB.dom/$1

turned out NOT to work when I changed it to

RewriteEngine on
RewriteCond %{REQUEST_URI} !-U
RewriteRule ^(.+) http://webserverB.dom/$1 [P]

Neither was I able to use the ErrorDocument trick you suggested and use
mod_rewrite with it.

I've given up my first attempt, the PerlResponse handler shown earlier in
the thread, as I was unable to output the Content-Type header
as 'text/html'. I haven't, however, tried the suggested solution of adding
an extra filter for the end phase to replace the 'text/plain' that I
was seeing, which is what actually prompted the initial question for this
thread and its subject.

I finally went the 'standard way' [?] and added

ErrorDocument 404 /cgi-bin/404_to_oldserver.pl

in httpd.conf, with /cgi-bin/404_to_oldserver.pl being

----------------------------
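#!/usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# REQUEST_URI already starts with "/", so no extra slash is added here.
my $url = 'http://OLD.SERVER.COM' . $ENV{'REQUEST_URI'};

# Fetch the same path (and query string) from the old server.
my $response = $ua->get( $url );

my $body = $response->content;

# Pass the old server's status back to Apache via the CGI "Status" header.
my $h = $response->headers;
$h->push_header( 'Status' => $response->code );
my $header = $h->as_string;

# CGI output: header lines, an empty line, then the body.
print $header;
print "\n";
print $body;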
-----------------------------------------------

This way might have its own special problems too, but at least it seems
to work OK so far and gives me a start.

I'm still [a bit] convinced that a mod_perl solution might or should be
available and be both better and more effective, but I wasn't able to get
it working - even after spending much more effort than I thought initially
that it will take - and gave up for now.

Many thanks to all those that offered advice or help.

Iosif Fettich

PS. Firebug once again proved to be an invaluable resource in helping
understand what's up and find a solution.