Hard time with a regex...

on 30.11.2007 02:39:49 by landemaine

Hello,

I'm trying to extract the home page URL out of any URL from the same
web site.
For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
I want to extract http://www.regular-expressions.info
So, I have my regex ready, it's working fine: http://[\w.-]+
But when I add it to PHP's preg_match function this way:

preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);

I get this error message:

Warning: preg_match() [function.preg-match]: Delimiter must not be
alphanumeric or backslash
I'm a little lost, I read this page http://www.php.net/manual/en/function.preg-match.php
and I wasn't able to find what is wrong. On the page they put slashes
around the regex, sometimes they don't... Do you know what is causing
the error message?
Thanks,

--
Charles.

Re: Hard time with a regex...

on 30.11.2007 03:25:37 by zeldorblat

On Nov 29, 8:39 pm, Charles wrote:
> Hello,
>
> I'm trying to extract the home page URL out of any URL from the same
> web site.
> For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
> I want to extract http://www.regular-expressions.info
> So, I have my regex ready, it's working fine: http://[\w.-]+
> But when I add it to PHP's preg_match function this way:
>
> preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
>
> I get this error message:
>
> Warning: preg_match() [function.preg-match]: Delimiter must not be
> alphanumeric or backslash
> I'm a little lost, I read this page http://www.php.net/manual/en/function.preg-match.php
> and I wasn't able to find what is wrong. On the page they put slashes
> around the regex, sometimes they don't... Do you know what is causing
> the error message?
> Thanks,
>
> --
> Charles.

Why not make life easy and use this instead:

Re: Hard time with a regex...

on 30.11.2007 07:34:34 by taps128

Charles wrote:
> Hello,
>
> I'm trying to extract the home page URL out of any URL from the same
> web site.
> For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
> I want to extract http://www.regular-expressions.info
> So, I have my regex ready, it's working fine: http://[\w.-]+
> But when I add it to PHP's preg_match function this way:
>
> preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
>
> I get this error message:
>
> Warning: preg_match() [function.preg-match]: Delimiter must not be
> alphanumeric or backslash
> I'm a little lost, I read this page http://www.php.net/manual/en/function.preg-match.php
> and I wasn't able to find what is wrong. On the page they put slashes
> around the regex, sometimes they don't... Do you know what is causing
> the error message?
> Thanks,
>
> --
> Charles.

Or, if the script runs on the server whose name you are trying to
find, just use this: $_SERVER["SERVER_NAME"]
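
A minimal sketch of that suggestion, assuming the script is served by a web server that populates $_SERVER. SERVER_NAME holds only the host name, so the scheme has to be added separately. The helper name home_page_url and the 'localhost' fallback are made up for illustration and are not part of PHP.

```php
<?php
// Hypothetical helper: builds "scheme://host" from a $_SERVER-like array.
function home_page_url(array $server): string
{
    // $_SERVER['HTTPS'] is non-empty for HTTPS requests; some servers set it to "off" for plain HTTP.
    $https  = isset($server['HTTPS']) && $server['HTTPS'] !== '' && $server['HTTPS'] !== 'off';
    $scheme = $https ? 'https' : 'http';

    // Assumed fallback for contexts (such as the command line) where SERVER_NAME is not set.
    $host = isset($server['SERVER_NAME']) ? $server['SERVER_NAME'] : 'localhost';

    return $scheme . '://' . $host;
}

echo home_page_url($_SERVER), "\n";
```

This only helps when the URL in question belongs to the server running the script; for an arbitrary URL string, a regex or URL parser is still needed.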

Re: Hard time with a regex...

on 30.11.2007 11:34:28 by Toby A Inkster

Charles wrote:

> preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);

Firstly, don't backslash-escape the square brackets:

preg_match('/http\:\/\/[\\w.-]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);

That should work. But there's still room for improvement -- to make it
more readable. You don't need to backslash escape the backslash itself,
nor the colon:

preg_match('/http:\/\/[\w.-]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);

Also, PHP (and Perl) allows you to choose a character other than slash as
a delimiter (i.e. the character at the beginning and end of the
expression). In this case, let's choose a hash instead:

preg_match('#http:\/\/[\w.-]+#','http://www.regular-expressions.info/javascriptexample.html',$matches);

Because we're not using a slash as a delimiter, it means we no longer need
to backslash-escape the slashes within the expression:

preg_match('#http://[\w.-]+#','http://www.regular-expressions.info/javascriptexample.html',$matches);

That's much more readable, right?
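
As a quick check of the steps above, the following sketch runs all four versions of the pattern against the same URL and collects what each one matches. It also tries the bare regex with no delimiters at all, which is one way to trigger a "Delimiter must not be alphanumeric or backslash" style warning (the exact wording depends on the PHP version). The variable names are only for illustration.

```php
<?php
$url = 'http://www.regular-expressions.info/javascriptexample.html';

// The four patterns from the steps above, from most to least escaped.
$patterns = [
    '/http\:\/\/[\\w.-]+/',
    '/http:\/\/[\w.-]+/',
    '#http:\/\/[\w.-]+#',
    '#http://[\w.-]+#',
];

$results = [];
foreach ($patterns as $pattern) {
    // preg_match() returns 1 on a match and puts the matched text in $matches[0].
    if (preg_match($pattern, $url, $matches) === 1) {
        $results[] = $matches[0];
    }
}

// Print the distinct matches; all four patterns are expected to agree.
print_r(array_unique($results));

// For contrast: with no delimiters, the leading "h" is read as the delimiter
// and PHP rejects the pattern. The @ only silences the warning for this demo.
$bare = @preg_match('http://[\w.-]+', $url);
var_dump($bare);
```

Inside a single-quoted PHP string, \w reaches the regex engine unchanged, which is why the doubled backslash in the first pattern is optional.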

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 5 days, 17:19.]

Sharing Music with Apple iTunes
http://tobyinkster.co.uk/blog/2007/11/28/itunes-sharing/

Re: Hard time with a regex...

on 30.11.2007 15:39:14 by John Dunlop

Charles:

> I'm trying to extract the home page URL out of any URL from the same
> web site.
> For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
> I want to extract http://www.regular-expressions.info

If you know your subject string is a well-formed URL then use the
regular expression from RFC3986:

^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?

$1 and $3 give you the scheme and authority, e.g., http://host.invalid

http://www.apps.ietf.org/rfc/rfc3986.html
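
A sketch of how that expression might be used with preg_match(), assuming "~" as the delimiter since the pattern itself contains both "/" and "#". The variable names are illustrative. Note that group 1 includes the trailing colon and group 3 includes the leading "//", so concatenating them gives the scheme and authority together.

```php
<?php
// Regular expression from RFC 3986, Appendix B, wrapped in "~" delimiters.
$rfc3986 = '~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~';

$url = 'http://www.regular-expressions.info/javascriptexample.html';

$home = null;
if (preg_match($rfc3986, $url, $m) === 1) {
    // $m[1] is the scheme with its colon, $m[3] the authority with its "//".
    $home = $m[1] . $m[3];
}

echo $home, "\n";
```

Groups 2 and 4 hold the scheme and authority without the punctuation, if those are needed separately.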

--
Jock

Re: Hard time with a regex...

on 30.11.2007 17:07:33 by landemaine

On Nov 29, 11:25 pm, ZeldorBlat wrote:
> Why not make life easy and use this instead:
>

Thanks! Exactly what I needed :)

--
Charles.

Re: Hard time with a regex...

on 30.11.2007 17:12:51 by landemaine

On Nov 30, 7:34 am, Toby A Inkster wrote:
> preg_match('#http://[\w.-]+#','http://www.regular-expressions.info/javascriptexample.html',$matches);
>
> That's much more readable, right?

Thanks Toby, incredible, but it works! :)

--
Charles.