Hard time with a regex...
on 30.11.2007 02:39:49 by landemaine
Hello,
I'm trying to extract the home page URL out of any URL from the same
web site.
For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
I want to extract http://www.regular-expressions.info
So, I have my regex ready, it's working fine: http://[\w.-]+
But when I add it to PHP's preg_match function this way:
preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
I get this error message:
Warning: preg_match() [function.preg-match]: Delimiter must not be
alphanumeric or backslash
I'm a little lost, I read this page http://www.php.net/manual/en/function.preg-match.php
and I wasn't able to find what is wrong. On the page they put slashes
around the regex, sometimes they don't... Do you know what is causing
the error message?
Thanks,
--
Charles.
Re: Hard time with a regex...
on 30.11.2007 03:25:37 by zeldorblat
On Nov 29, 8:39 pm, Charles wrote:
> Hello,
>
> I'm trying to extract the home page URL out of a any URL from the same
> web site
> For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
> I want to extract http://www.regular-expressions.info
> So, I have my regex ready, it's working fine: http://[\w.-]+
> But when I add it to PHP's preg_match function this way:
>
> preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
>
> I get this error message:
>
> Warning: preg_match() [function.preg-match]: Delimiter must not be
> alphanumeric or backslash
> I'm a little lost, I read this page http://www.php.net/manual/en/function.preg-match.php
> and I wasn't able to find what is wrong. On the page they put slashes
> around the regex, sometimes they don't... Do you know what is causing
> the error message?
> Thanks,
>
> --
> Charles.
Why not make life easy and use this instead:
Re: Hard time with a regex...
on 30.11.2007 07:34:34 by taps128
Charles wrote:
> Hello,
>
> I'm trying to extract the home page URL out of a any URL from the same
> web site
> For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
> I want to extract http://www.regular-expressions.info
> So, I have my regex ready, it's working fine: http://[\w.-]+
> But when I add it to PHP's preg_match function this way:
>
> preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
>
> I get this error message:
>
> Warning: preg_match() [function.preg-match]: Delimiter must not be
> alphanumeric or backslash
> I'm a little lost, I read this page http://www.php.net/manual/en/function.preg-match.php
> and I wasn't able to find what is wrong. On the page they put slashes
> around the regex, sometimes they don't... Do you know what is causing
> the error message?
> Thanks,
>
> --
> Charles.
Or just use this if you are running the script on the server whose name
you are trying to find: $_SERVER["SERVER_NAME"]
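A minimal sketch of that idea, assuming the script is served by the site in question. The helper name current_site_home() is hypothetical, and the HTTPS check is an assumption about how the server reports the scheme; the array is passed in as a parameter so the function can also be tried from the command line.

```php
<?php
// Sketch only: current_site_home() is a hypothetical helper, not part of PHP.
// It builds the home page URL from server variables instead of parsing a URL.
function current_site_home(array $server): string
{
    // Assumption: treat the request as https only when the HTTPS entry
    // is set to a non-empty value other than "off".
    $https = !empty($server['HTTPS']) && $server['HTTPS'] !== 'off';

    return ($https ? 'https' : 'http') . '://' . $server['SERVER_NAME'];
}

// Command-line demonstration with a hand-built array.
echo current_site_home(['SERVER_NAME' => 'www.regular-expressions.info']), "\n";
```

Inside a real web request you would call current_site_home($_SERVER) instead of passing a hand-built array.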
Re: Hard time with a regex...
on 30.11.2007 11:34:28 by Toby A Inkster
Charles wrote:
> preg_match('/http\:\/\/\[\\w.-\]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
Firstly, don't backslash-escape the square brackets:
preg_match('/http\:\/\/[\\w.-]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
That should work. But there's still room for improvement -- to make it
more readable. You don't need to backslash escape the backslash itself,
nor the colon:
preg_match('/http:\/\/[\w.-]+/','http://www.regular-expressions.info/javascriptexample.html',$matches);
Also, PHP (and Perl) allow you to choose a character other than slash as
the delimiter (i.e. the character at the beginning and end of the
expression). In this case, let's choose a hash instead:
preg_match('#http:\/\/[\w.-]+#','http://www.regular-expressions.info/javascriptexample.html',$matches);
Because we're not using a slash as a delimiter, it means we no longer need
to backslash-escape the slashes within the expression:
preg_match('#http://[\w.-]+#','http://www.regular-expressions.info/javascriptexample.html',$matches);
That's much more readable, right?
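For reference, here is a small runnable sketch that wraps the final pattern in a function. The name home_page_url() is hypothetical and only there for illustration; the pattern is the hash-delimited one from the steps above.

```php
<?php
// Sketch only: home_page_url() is a hypothetical wrapper around preg_match().
function home_page_url(string $url): ?string
{
    // '#' delimits the pattern, so the slashes in "http://" need no escaping.
    // In a single-quoted PHP string, \w reaches the regex engine unchanged.
    if (preg_match('#http://[\w.-]+#', $url, $matches) === 1) {
        return $matches[0]; // the whole match: scheme plus host name
    }

    return null; // no "http://host" found in the input
}

echo home_page_url('http://www.regular-expressions.info/javascriptexample.html'), "\n";
// prints http://www.regular-expressions.info
```

Note that the pattern only covers http:// URLs; an https:// or ftp:// address returns null here.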
--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 5 days, 17:19.]
Sharing Music with Apple iTunes
http://tobyinkster.co.uk/blog/2007/11/28/itunes-sharing/
Re: Hard time with a regex...
on 30.11.2007 15:39:14 by John Dunlop
Charles:
> I'm trying to extract the home page URL out of a any URL from the same
> web site
> For instance if I'm on http://www.regular-expressions.info/javascriptexample.html
> I want to extract http://www.regular-expressions.info
If you know your subject string is a well-formed URL then use the
regular expression from RFC3986:
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
$1 and $3 give you the scheme (with its trailing colon) and the authority
(with its leading //); concatenated, e.g., http://host.invalid
http://www.apps.ietf.org/rfc/rfc3986.html
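A rough sketch of how that expression could be used from PHP. The helper name split_home_url() and the choice of '~' as delimiter are assumptions made for this example; the pattern itself is the one quoted above.

```php
<?php
// Sketch only: split_home_url() is a hypothetical helper name.
function split_home_url(string $url): ?string
{
    // '~' is used as the delimiter because the pattern itself
    // contains '/', '#' and '?' but no '~'.
    $rfc3986 = '~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~';

    if (preg_match($rfc3986, $url, $m) !== 1) {
        return null; // only reached if preg_match() reports an error
    }

    // $m[1] is the scheme including its colon ("http:"),
    // $m[3] is "//" followed by the authority.
    return $m[1] . $m[3];
}

echo split_home_url('http://www.regular-expressions.info/javascriptexample.html'), "\n";
// prints http://www.regular-expressions.info
```

Unlike the http-only pattern, this one works for any scheme, but it assumes the input is already a well-formed URL.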
--
Jock
Re: Hard time with a regex...
on 30.11.2007 17:07:33 by landemaine
On Nov 29, 11:25 pm, ZeldorBlat wrote:
> Why not make life easy and use this instead:
>
Thanks! Exactly what I needed :)
--
Charles.
Re: Hard time with a regex...
on 30.11.2007 17:12:51 by landemaine
On Nov 30, 7:34 am, Toby A Inkster
wrote:
> preg_match('#http://[\w.-]+#','http://www.regular-expressions.info/javascriptexample.html',$matches);
>
> That's much more readable, right?
Thanks Toby, incredible, but it works! :)
--
Charles.