Comparing tsearch2 vectors.
on 12.07.2004 10:46:06 by mallah
Hi,
We want to compare strings after stemming. Can anyone
tell me what the best method is? I was thinking of comparing
the tsvectors, but there is no operator for that.
Regds
Mallah.
tradein_clients=# SELECT to_tsvector('handicraft exporters');
+---------------------------+
|        to_tsvector        |
+---------------------------+
| 'export':2 'handicraft':1 |
+---------------------------+
(1 row)
Time: 710.315 ms
tradein_clients=#
tradein_clients=# SELECT to_tsvector('handicrafts exporter');
+---------------------------+
|        to_tsvector        |
+---------------------------+
| 'export':2 'handicraft':1 |
+---------------------------+
(1 row)
Time: 400.679 ms
tradein_clients=# SELECT to_tsvector('Hi there') = to_tsvector('Hi there');
ERROR: operator does not exist: tsvector = tsvector
HINT: No operator matches the given name and argument type(s). You may
need to add explicit type casts.
tradein_clients=#
--
regds
Mallah.
Rajesh Kumar Mallah
+---------------------------------------------------+
| Tradeindia.com (3,11,246) Registered Users |
| Indias' Leading B2B eMarketPlace |
| http://www.tradeindia.com/ |
+---------------------------------------------------+
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
Re: Comparing tsearch2 vectors.
on 12.07.2004 12:59:46 by achill
Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
>
> Hi,
>
> We want to compare strings after stemming. Can anyone
> tell me what the best method is? I was thinking of comparing
> the tsvectors, but there is no operator for that.
I'd tokenize each string and then apply lexize() to each token to get the
equivalent stemmed word, but what exactly are you trying to accomplish?
--
-Achilleus
Re: Comparing tsearch2 vectors.
on 12.07.2004 14:34:25 by mallah
Dear Mantzios,
I have to get a set of banners from the database in
response to a search term. I want the search term
to be compared to the keywords corresponding to the
banners stored in the database. Currently I am doing an
equality match, but I would like to do it after stemming
both sides (search term and keywords),
so that the banners for the adword, say, 'incense exporter' are
shown even if 'incenses exporter' or 'incense exporters' is
searched.
I hope I have been able to clarify.
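One way to picture this requirement is a lookup table keyed on the stemmed form of each banner keyword. The Python sketch below is only an illustration under stated assumptions: `toy_stem` is a hypothetical suffix stripper standing in for whatever stemmer tsearch2 is configured with, and the banner id and helper names are made up for the example.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few English suffixes."""
    word = word.lower()
    for suffix in ("ers", "ing", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_key(phrase, stem=toy_stem):
    """Normalize a phrase to a tuple of stems, preserving word order."""
    return tuple(stem(token) for token in phrase.split())


def add_banner(index, keyword, banner):
    """Store a banner under the stemmed form of its keyword."""
    index.setdefault(stem_key(keyword), []).append(banner)


def find_banners(index, search_term):
    """Return the banners whose stemmed keyword equals the stemmed search term."""
    return index.get(stem_key(search_term), [])


index = {}
add_banner(index, "incense exporter", "banner-42")  # banner id is made up

print(find_banners(index, "incenses exporter"))           # ['banner-42']
print(find_banners(index, "incense exporters"))           # ['banner-42']
print(find_banners(index, "incense exporters of delhi"))  # []
```

Because the key is computed once when the keyword is stored, the lookup stays an ordinary equality match; only the normalization step changes.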
Regds
Mallah.
--
regds
Mallah.
Re: Comparing tsearch2 vectors.
on 12.07.2004 15:40:38 by achill
Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
> Achilleus Mantzios wrote:
>
> >Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
> >
> >>Dear Mantzios,
> >>
> >>I have to get a set of banners from the database in
> >>response to a search term. I want the search term
> >>to be compared to the keywords corresponding to the
> >>banners stored in the database. Currently I am doing an
> >>equality match, but I would like to do it after stemming
> >>both sides (search term and keywords).
> >
> >You could transform your search terms so that there is the "&"
> >separator between them ("&" stands for "AND").
> >E.g. "handicrafts exporter" becomes "handicrafts&exporter".
> >And then
> >select * from <table> where idxfti @@ to_tsquery(<transformed search string>);
> >
> >
>
> But I do not want 'handicraft exporters of delhi' to show up if I search
> for 'handicrafts exporters', whereas
>
> SELECT to_tsvector('handycrafts exporters of delhi') @@ to_tsquery('handycraft&exporting');
>
> will be true.
Define what you want, and then read the tsearch2 user guide.
I'm sure you'll find your way :)
> >where idxfti is your tsvector column.
> >
> >E.g.
> ># SELECT to_tsvector('handycrafts exporters') @@ to_tsquery('handycraft&exporting');
> > ?column?
> >----------
> > t
> >(1 row)
--
-Achilleus
Re: Comparing tsearch2 vectors.
on 12.07.2004 16:06:03 by mallah
Achilleus Mantzios wrote:
>Mr. Rajesh Kumar Mallah wrote on Jul 12, 2004:
>
>>Dear Mantzios,
>>
>>I have to get a set of banners from the database in
>>response to a search term. I want the search term
>>to be compared to the keywords corresponding to the
>>banners stored in the database. Currently I am doing an
>>equality match, but I would like to do it after stemming
>>both sides (search term and keywords).
>
>You could transform your search terms so that there is the "&"
>separator between them ("&" stands for "AND").
>E.g. "handicrafts exporter" becomes "handicrafts&exporter".
>And then
>select * from <table> where idxfti @@ to_tsquery(<transformed search string>);
>
>
But I do not want 'handicraft exporters of delhi' to show up if I search
for 'handicrafts exporters', whereas
SELECT to_tsvector('handycrafts exporters of delhi') @@ to_tsquery('handycraft&exporting');
will be true.
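The distinction at issue here is containment versus equality. A minimal Python sketch, assuming the strings have already been reduced to stems: the stem lists are written by hand for illustration, 'of' is assumed to be dropped as a stop word, and the function names are hypothetical.

```python
def matches_all_terms(document_stems, query_stems):
    """AND-style match: every query stem occurs somewhere in the document."""
    return set(query_stems) <= set(document_stems)


def same_stems(document_stems, query_stems):
    """Equality-style match: both sides contain exactly the same stems."""
    return set(document_stems) == set(query_stems)


# Hand-written stems, assumed for illustration only.
query = ["handicraft", "export"]             # 'handicrafts exporters'
short_kw = ["handicraft", "export"]          # 'handicraft exporters'
long_kw = ["handicraft", "export", "delhi"]  # 'handicraft exporters of delhi'

print(matches_all_terms(long_kw, query))  # True: the extra stem 'delhi' is ignored
print(same_stems(long_kw, query))         # False: the stem sets differ
print(same_stems(short_kw, query))        # True
```

Note that comparing sets ignores word order and repeated words; whether that is acceptable depends on how strict the banner match should be.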
Regds
Mallah.
>where idxfti is your tsvector column.
>
>E.g.
># SELECT to_tsvector('handycrafts exporters') @@ to_tsquery('handycraft&exporting');
> ?column?
>----------
> t
>(1 row)
--
regds
Mallah.
Re: Comparing tsearch2 vectors.
on 13.07.2004 05:31:53 by mallah
Achilleus Mantzios wrote:
>Define what you want, and then read tsearch2 userguide.
>I'm sure you'll find your way :)
>
>
The requirement is different from full-text search.
I am not searching for a word in a collection of words (text),
but rather comparing two strings after all the words in those
strings are stemmed. I hope my requirement is clear now.
Regds
mallah.
--
regds
Mallah.
Re: Comparing tsearch2 vectors.
on 13.07.2004 08:36:46 by achill
Mr. Rajesh Kumar Mallah wrote on Jul 13, 2004:
> The requirement is different from full-text search.
> I am not searching for a word in a collection of words (text),
> but rather comparing two strings after all the words in those
> strings are stemmed. I hope my requirement is clear now.
OK, so we go back to the initial suggestion.
Tokenize both strings into arrays of strings.
Let them be String[] string1 and String[] string2.
If the arrays are not of the same length, then the strings are not equal.
Otherwise, for each index i, compare
lexize(<dictionary>, string1[i]) against
lexize(<dictionary>, string2[i]).
The tokenization is your job, while the lexize function comes with
tsearch2.
I don't know whether this can be done in SQL, since it requires some sort
of iteration.
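The procedure above can be sketched outside the database, for example in Python. This is a minimal illustration, not the tsearch2 implementation: `toy_stem` is a hypothetical suffix stripper that stands in for a call to lexize() with a real dictionary, and tokenization is plain whitespace splitting.

```python
def toy_stem(word):
    """Hypothetical stand-in for lexize(): strips a few English suffixes."""
    word = word.lower()
    for suffix in ("ers", "ing", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stemmed_equal(a, b, stem=toy_stem):
    """Compare two strings token by token after stemming each token."""
    tokens_a = a.split()
    tokens_b = b.split()
    # Arrays of different length can never be equal.
    if len(tokens_a) != len(tokens_b):
        return False
    # Otherwise compare the stems position by position.
    return all(stem(x) == stem(y) for x, y in zip(tokens_a, tokens_b))


print(stemmed_equal("handicraft exporters", "handicrafts exporter"))            # True
print(stemmed_equal("handicrafts exporters", "handicraft exporters of delhi"))  # False
```

Passing the stemmer in as the `stem` argument keeps the comparison logic independent of which dictionary is eventually used.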
--
-Achilleus