escaping strings properly for use with REGEXP

escaping strings properly for use with REGEXP

am 02.05.2003 15:27:28 von Mark Stosberg

Hello,

I'm writing a web application using DBD::mysql. I would like for the
user to enter some keyword search terms that become part of a REGEXP
clause, like:

field REGEXP 'term1|term2|term3',

However, I don't want anything the user enters to have special meaning
in my regular expression. For example, you can see with this test query,
MySQL throws an error:

# mysql> select 'big blue truck!' regexp 'Big|Blu~e|[[.charactersfu)ck{';
# ERROR 1139: Got error 'brackets ([ ]) not balanced' from regexp

I think with Perl regular expressions this is easy using "\Q" and "\E".
There doesn't currently seem to be an easy way to do this with MySQL.
Any help is appreciated.

Thanks,

Mark

--
http://mark.stosberg.com/


--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules=x 7+yCfqV/g9AfugRpC6u6w@public.gmane.org

Re: escaping strings properly for use with REGEXP

am 03.05.2003 01:22:15 von Garry Williams

On Fri, May 02, 2003 at 13:27:28 +0000, Mark Stosberg wrote:
> I'm writing a web application using DBD::mysql. I would like for the
> user to enter some keyword search terms that become part of a REGEXP
> clause, like:
>
> field REGEXP 'term1|term2|term3',
>
> However, I don't want anything the user enters to have special meaning
> in my regular expression. For example, you can see with this test query,
> MySQL throws an error:
>
> # mysql> select 'big blue truck!' regexp 'Big|Blu~e|[[.charactersfu)ck{';
> # ERROR 1139: Got error 'brackets ([ ]) not balanced' from regexp


Your requirement seems to be ambiguous. By your examples, I take it
that meta | is okay, but meta [ is not. What if the user wanted a
character class? Is meta [ okay in that case? Perhaps you just want
to tell the users to use regular expressions, so they will give you
`\[', if that is what they mean.

I'd just send in the query and feed back the exception. Otherwise you
will be parsing a regular expression, I fear.


> I think with Perl regular expressions this is easy using "\Q" and "\E".


So what would `\Qterm1|term2|term3\E' mean? Is that what you'd really
want?

If that is what you really want, simply transform any meta character
to `\$1' with s///g before handing it to mysql. That's all \Q does.

--
Garry Williams, Zvolve Systems, Inc., +1 770 551-4504

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org

Re: escaping strings properly for use with REGEXP

am 03.05.2003 01:22:15 von Garry Williams

On Fri, May 02, 2003 at 13:27:28 +0000, Mark Stosberg wrote:
> I'm writing a web application using DBD::mysql. I would like for the
> user to enter some keyword search terms that become part of a REGEXP
> clause, like:
>
> field REGEXP 'term1|term2|term3',
>
> However, I don't want anything the user enters to have special meaning
> in my regular expression. For example, you can see with this test query,
> MySQL throws an error:
>
> # mysql> select 'big blue truck!' regexp 'Big|Blu~e|[[.charactersfu)ck{';
> # ERROR 1139: Got error 'brackets ([ ]) not balanced' from regexp


Your requirement seems to be ambiguous. By your examples, I take it
that meta | is okay, but meta [ is not. What if the user wanted a
character class? Is meta [ okay in that case? Perhaps you just want
to tell the users to use regular expressions, so they will give you
`\[', if that is what they mean.

I'd just send in the query and feed back the exception. Otherwise you
will be parsing a regular expression, I fear.


> I think with Perl regular expressions this is easy using "\Q" and "\E".


So what would `\Qterm1|term2|term3\E' mean? Is that what you'd really
want?

If that is what you really want, simply transform any meta character
to `\$1' with s///g before handing it to mysql. That's all \Q does.

--
Garry Williams, Zvolve Systems, Inc., +1 770 551-4504

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org

Re: escaping strings properly for use with REGEXP

am 03.05.2003 22:00:40 von Mark Stosberg

In article <20030502192215.B4198=MvZwh4zv0oPQT0dZR+AlfA@public.gmane.org>, Garry Williams wrote:
> On Fri, May 02, 2003 at 13:27:28 +0000, Mark Stosberg wrote:
>> I'm writing a web application using DBD::mysql. I would like for the
>> user to enter some keyword search terms that become part of a REGEXP
>> clause, like:
>>
>> field REGEXP 'term1|term2|term3',
>>
>> However, I don't want anything the user enters to have special meaning
>> in my regular expression. For example, you can see with this test query,
>> MySQL throws an error:
>>
>> # mysql> select 'big blue truck!' regexp 'Big|Blu~e|[[.charactersfu)ck{';
>> # ERROR 1139: Got error 'brackets ([ ]) not balanced' from regexp
>
>
> Your requirement seems to be ambiguous.

Thanks for the response, and sorry for being ambiguous. My users are
just regular web users who won't have any ability to create regular
expressions. Any use of a meta character should be treated as
unintentional.

To clarify, I'm thinking that a user might enter the following in a
search box:

"my three keywords"

I was then thinking about implementing the SQL search logic of

'If field matches "my" or "three" or "keywords", return the record'.

Maybe all I need is simple function to replace all meta characters
with "\meta". This seems like a common use that someone might already
have a standard routine for.

Thanks,

Mark


--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules=x 7+yCfqV/g9AfugRpC6u6w@public.gmane.org

Re: escaping strings properly for use with REGEXP

am 03.05.2003 22:00:40 von Mark Stosberg

In article <20030502192215.B4198=MvZwh4zv0oPQT0dZR+AlfA@public.gmane.org>, Garry Williams wrote:
> On Fri, May 02, 2003 at 13:27:28 +0000, Mark Stosberg wrote:
>> I'm writing a web application using DBD::mysql. I would like for the
>> user to enter some keyword search terms that become part of a REGEXP
>> clause, like:
>>
>> field REGEXP 'term1|term2|term3',
>>
>> However, I don't want anything the user enters to have special meaning
>> in my regular expression. For example, you can see with this test query,
>> MySQL throws an error:
>>
>> # mysql> select 'big blue truck!' regexp 'Big|Blu~e|[[.charactersfu)ck{';
>> # ERROR 1139: Got error 'brackets ([ ]) not balanced' from regexp
>
>
> Your requirement seems to be ambiguous.

Thanks for the response, and sorry for being ambiguous. My users are
just regular web users who won't have any ability to create regular
expressions. Any use of a meta character should be treated as
unintentional.

To clarify, I'm thinking that a user might enter the following in a
search box:

"my three keywords"

I was then thinking about implementing the SQL search logic of

'If field matches "my" or "three" or "keywords", return the record'.

Maybe all I need is simple function to replace all meta characters
with "\meta". This seems like a common use that someone might already
have a standard routine for.

Thanks,

Mark


--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules=x 7+yCfqV/g9AfugRpC6u6w@public.gmane.org

Re: escaping strings properly for use with REGEXP

am 03.05.2003 23:36:51 von Garry Williams

On Sat, May 03, 2003 at 20:00:40 +0000, Mark Stosberg wrote:
> In article <20030502192215.B4198@zvolve.com>, Garry Williams wrote:
> > On Fri, May 02, 2003 at 13:27:28 +0000, Mark Stosberg wrote:

[snip]

> Maybe all I need is simple function to replace all meta characters
> with "\meta". This seems like a common use that someone might
> already have a standard routine for.

You don't need a function. Just code something like this in-line:

my $meta = qr/[\[\]{}.|()?*+\\]/;
...
$user_input =~ s/($meta)/\\$1/g;

--
Garry Williams, Zvolve Systems, Inc., +1 770 551-4504

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org

Re: escaping strings properly for use with REGEXP

am 03.05.2003 23:36:51 von Garry Williams

On Sat, May 03, 2003 at 20:00:40 +0000, Mark Stosberg wrote:
> In article <20030502192215.B4198@zvolve.com>, Garry Williams wrote:
> > On Fri, May 02, 2003 at 13:27:28 +0000, Mark Stosberg wrote:

[snip]

> Maybe all I need is simple function to replace all meta characters
> with "\meta". This seems like a common use that someone might
> already have a standard routine for.

You don't need a function. Just code something like this in-line:

my $meta = qr/[\[\]{}.|()?*+\\]/;
...
$user_input =~ s/($meta)/\\$1/g;

--
Garry Williams, Zvolve Systems, Inc., +1 770 551-4504

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org