Roundup Tracker - Issues

Issue 2550805

classification
Postgres should search title attribute case insensitive like sqlite
Type: behavior Severity: normal
Components: Database Versions: 1.4
process
Status: closed fixed
:
: : ber, rouilj, tekberg
Priority: : patch

Created on 2013-04-11 21:23 by tekberg, last changed 2013-05-10 21:25 by ber.

Files
File name Uploaded Description Edit Remove
like.tar.gz tekberg, 2013-04-26 19:46 3 files: test case and 2 files to change
rdbms_common.py.diff tekberg, 2013-04-29 14:41 updated patch file with updated line numbers
back_postgresql.py-20130508.diff ber, 2013-05-07 23:03
like-class-variable-patch.diff tekberg, 2013-05-09 15:55 Changes implementation to use a class variable instead of a method
like-class-variable-patch2.diff tekberg, 2013-05-10 15:17 Patch using getattr on a class variable
Messages
msg4841 Author: [hidden] (tekberg) Date: 2013-04-11 21:23
I just changed the backend from sqlite to postgres. When using the 
sqlite backend if I do an issue search on 'title' and enter xyz, the 
search results show all %xyz% (as in LIKE) title matches ignoring case. 
Doing the same thing when using the postgres backend, the search results 
only show records that have an exact LIKE match with %xyz%. I ran into 
this with another program and wrote a insensitive_like function to 
handle the differences between MySQL, sqlite and Postgres. MySQL and 
sqlite do a case insensitive LIKE, but with Postgres one has to use 
ILIKE.

I searched the roundup/ code and only found one place where LIKE was 
being used in a database context: ./backends/rdbms_common.py:2437. That 
code didn't put %'s around the search string so that wasn't it. Sorry I 
wasn't able to determine where the problem was to provide a patch.
msg4842 Author: [hidden] (ber) Date: 2013-04-12 07:43
Tom,
thanks for the report and the analysis!

I believe the next step would be to write a test function
that shows how the behaviour should be and that fails.
This way, we can be sure of what the code should be.

Tom, has you have a pretty good understanding about the problem now.
Are you willing to write a test function for it?

Best,
Bernhard
msg4843 Author: [hidden] (tekberg) Date: 2013-04-12 15:14
I am having trouble writing tests that fail for Postgres. I put these 
two functions in in test/db_test_base.py in the DBTest class:

    def testStringFindCase(self):
        self.assertRaises(TypeError, self.db.issue.stringFind, 
status='1')

        ids = []
        ids.append(self.db.issue.create(title="SPAM"))
        self.db.issue.create(title="not spam")
        ids.append(self.db.issue.create(title="spam"))
        ids.sort()
        got = self.db.issue.stringFind(title='spam')
        got.sort()
        self.assertEqual(got, ids)
        self.assertEqual(self.db.issue.stringFind(title='fubar'), [])

    def testReindexingChangeCase(self):
        search = self.db.indexer.search
        issue = self.db.issue
        i1 = issue.create(title="FLEBBLE plop")
        i2 = issue.create(title="FLEBBLE frooz")
        self.db.commit()
        self.assertEquals(search(['plop'], issue), {i1: {}})
        self.assertEquals(search(['flebble'], issue), {i1: {}, i2: {}})

        # change i1's title
        issue.set(i1, title="plop")
        self.db.commit()
        self.assertEquals(search(['plop'], issue), {i1: {}})
        self.assertEquals(search(['flebble'], issue), {i2: {}})

Then I ran these commands - all 4 databases passed.

$ python run_test.py . '^testStringFindCase$'
testStringFindCase (test.test_anydbm.anydbmDBTest) ... ok
testStringFindCase (test.test_memorydb.memorydbDBTest) ... ok
testStringFindCase (test.test_postgresql.postgresqlDBTest) ... ok
testStringFindCase (test.test_sqlite.sqliteDBTest) ... ok
$ python run_test.py . '^testReindexingChangeCase$'
testReindexingChangeCase (test.test_anydbm.anydbmDBTest) ... ok
testReindexingChangeCase (test.test_memorydb.memorydbDBTest) ... ok
testReindexingChangeCase (test.test_postgresql.postgresqlDBTest) ... ok
testReindexingChangeCase (test.test_sqlite.sqliteDBTest) ... ok

I expected Postgres to fail on at least one of these.

Obviously I don't know enough about the issue search on title to 
formulate a test for the postgres failure case.

Can someone point me to the code that implements the issue title search?
msg4844 Author: [hidden] (ber) Date: 2013-04-15 08:31
Tom,
thanks for trying to write the test.
I would need to look into it more closely to see what the issue is
and I don't know when I get around doing this. At least there seems
to be a need for clarification. 

Best,
Bernhard
msg4861 Author: [hidden] (tekberg) Date: 2013-04-25 19:25
I turned logging on in PostgreSQL so I could see what query it generates 
when doing a search on title:
  log_statement = 'all'
in postgresql.conf. The query is:

select _issue.id,_priority3._order,(_priority3._order is not 
NULL),_issue._activity,(_issue._activity is not NULL) from _issue LEFT 
OUTER JOIN _priority as _priority3 on _issue._priority=_priority3.id  
where (_issue._title LIKE '%%cpoe%%') and _issue.__retired__=0 order by 
(_priority3._order is not NULL),_priority3._order,(_issue._activity is 
not NULL),_issue._activity,_issue.id

Note the LIKE '%%cpoe%%'. This one returned no rows, while searching for 
CPOE in the title returned 172 rows.

I'm going to see if I can determine where this query is generated. If 
you have a pointer that would be nice, then I can write a test case and 
a patch.
msg4862 Author: [hidden] (tekberg) Date: 2013-04-25 19:44
OK. I found the code. It's in roundup/backends/rdbms_common.py in the 
_filter_sql method. For my version its at line 2437:

  where.append('('
      +' and '.join(["_%s._%s LIKE '%s'"%(pln, k, s) for s in v])
      +')')

This exactly matches the LIKE clause in the query. I don't know why I had 
so much trouble finding this earlier.
msg4865 Author: [hidden] (ber) Date: 2013-04-26 06:41
Tom,
good that you kept going on the issue.
My knowledge of the code is not that good either, so I could not have
gotting you a useful hint.

Bernhard
msg4866 Author: [hidden] (tekberg) Date: 2013-04-26 19:46
I have created a test case and a patch to fix the problem. Both are in 
the attached TAR file. The test case fails with an unpatched roundup, 
and succeeds with a patched roundup.

Please take a look to see if this looks OK to you. There are many ways 
to fix this problem - I picked one.

I applied the patch to version 1.4.21. If you want it applied to another 
version let me know.

Here is an overview of the fix:

test/db_test_base.py.diff
  Added the test case testFilteringStringCase.
roundup/backends/back_postgresql.py.diff
  Added an override for the method insensitive_like in the Class
  class that uses ILIKE.
roundup/backends/rdbms_common.py.diff
  Defined the method insensitive_like in the CLASS class that uses LIKE.
  Changed the string case in _filter_sql to call the method
  insensitive_like instead hardcoding 'LIKE'.
msg4867 Author: [hidden] (tekberg) Date: 2013-04-26 19:47
I have created a test case and a patch to fix the problem. Both are in 
the attached TAR file. The test case fails with an unpatched roundup, 
and succeeds with a patched roundup.

Please take a look to see if this looks OK to you. There are many ways 
to fix this problem - I picked one.

I applied the patch to version 1.4.21. If you want it applied to another 
version let me know.

Here is an overview of the fix:

test/db_test_base.py.diff
  Added the test case testFilteringStringCase.
roundup/backends/back_postgresql.py.diff
  Added an override for the method insensitive_like in the Class
  class that uses ILIKE.
roundup/backends/rdbms_common.py.diff
  Defined the method insensitive_like in the CLASS class that uses LIKE.
  Changed the string case in _filter_sql to call the method
  insensitive_like instead hardcoding 'LIKE'.
msg4868 Author: [hidden] (ber) Date: 2013-04-29 09:55
Tom, thanks for preparing the patch. I will look at it as soon as 
possible. 

A few hints for your next contribution: 
 * Patches are adopted slightly faster, if they apply to the current hg
tip.
 * If you can find a different tester, uptake usually is even smoother.
:)
msg4869 Author: [hidden] (tekberg) Date: 2013-04-29 14:41
Bernhard: thanks for the advice. Two of the 3 files I changed have not 
changed in the hg tip. The file roundup/backends/rdbms_common.py did 
change. The only change between the old/new diff files were the line 
numbers. I attached the updated diff file.
msg4870 Author: [hidden] (ber) Date: 2013-04-29 14:48
Tom,
all good! 
Usually "patch" just applies the change anyway so it is not an issue.
msg4871 Author: [hidden] (ber) Date: 2013-05-04 21:36
Tom,
took a first look with postgresql 9.2 on OpenSuse 12.3.
I get the expected failure for postgresql when adding your test function.
However adding your code changes still let me see the test failure
and I get an additional one:


FAIL: testFilteringRangeGeekInterval (test.test_postgresql.postgresqlDBTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib64/python2.7/unittest/case.py", line 327, in run
    testMethod()
  File "/home/bernhard/hacking/roundup/roundup-hg/./test/db_test_base.py", line 
1351, in testFilteringRangeGeekInterval
    ae(filt(None, {'deadline': '-2d;'}), ['5', '6'])
  File "/usr/lib64/python2.7/unittest/case.py", line 511, in assertEqual
    assertion_func(first, second, msg=msg)
[...]
AssertionError: Lists differ: ['6'] != ['5', '6']

First differing element 0:
6
5

Second list contains 1 additional elements.
First extra element 1:
6

- ['6']
+ ['5', '6']

Do you also see that second test failure?
I have not analysed further yet.


One hint about creating patches: Usually a unified context diff is easier to read, "-
u" option for "diff". And you can put several changes in one patch files, which 
makes it easier to apply.
msg4872 Author: [hidden] (ber) Date: 2013-05-04 22:06
Hmm, the failure in testFilteringRangeGeekInterval was half an hour to mightnight 
and now, five minutes after midnight it does not happen anymore. Maybe it was 
unrelated.
msg4873 Author: [hidden] (rouilj) Date: 2013-05-04 22:47
In message <1367705161.54.0.137067178029.issue2550805@psf.upfronthosting.co.za> 
<1367705161.54.0.137067178029.issue2550805@psf.upfronthosting.co.za>,
Bernhard Reiter writes:
>Bernhard Reiter added the comment:
>
>Hmm, the failure in testFilteringRangeGeekInterval was half an hour to might
>night and now, five minutes after midnight it does not happen anymore.
>Maybe it was unrelated.

You just reminded me of something I saw and couldn't reproduce when I
was adding my tx_Source patch.

I had what I think was the same failure where the lists hd different
numbers of elements for sqlite on one test run, but when I did another
test everything passed ok. I just chalked it up so something broke on
my test system.

I did a run of all tests (db and sqlite back ends) about 20 minutes
ago on the current head roundup release and I don't see any failures.
msg4875 Author: [hidden] (ber) Date: 2013-05-06 07:57
Okay, let assume the testFilteringRangeGeekInterval failure is
unrelated to Tom's patch -> Issue2550810

As for the failure I have on the test, postgresql's docs say that ILIKE
behaviour depends on the locale. Here the commands that I've used to
create the database, maybe my choice of the "C" locale matters.

Created a new postgresql DB:
        rm -r tmpdb/
        mkdir tmpdb
        initdb -D tmpdb/ -U rounduptest --locale=c --encoding=UTF8
        pg_ctl -D tmpdb/ -l logfile -o '-k ./' start
msg4876 Author: [hidden] (ber) Date: 2013-05-07 21:59
Further analysis:
It is not an issue with the postgresl locale.
When running with  hg current tip 4786:6018617c55ab
   LOGGING_LEVEL=DEBUG python run_tests.py .  testFilteringStringCase

I can see that there is always "LIKE" returned from the call.
I've added a call to show me what gets called see
--- a/roundup/backends/rdbms_common.py  Sun May 05 00:21:27 2013 +0200
+++ b/roundup/backends/rdbms_common.py  Tue May 07 23:57:58 2013 +0200
@@ -2439,8 +2439,10 @@
                     v = ['%%'+self.db.sql_stringquote(s)+'%%' for s in v]
 
                     # now add to the where clause
+                    like = self.insensitive_like()
+                    print repr(self),repr(self.insensitive_like),self.insensitive_like()

And the result is always:
testFilteringStringCase (test.test_postgresql.postgresqlDBTest) ... 
<hyperdb.Class "issue"> <bound method IssueClass.insensitive_like of 
<hyperdb.Class "issue">> LIKE
msg4877 Author: [hidden] (ber) Date: 2013-05-07 23:03
In back_postgresql.py your patch in like.tar.gz places the def
insensitive_like(self) inside class Class():. I've tried adding it to
class PostgresqlClass: and now the test succeeds for me.

Tom,
can you recheck why this works for you?
And see if my patch  back_postgresql.py-20130508.diff also works for you?
msg4878 Author: [hidden] (tekberg) Date: 2013-05-08 16:09
I'm shocked! I retested my fix and it failed the test!!! I applied your 
patch (actually I cut/pasted my change to the class PostgresqlClass and 
added 'pass' to class Class) and it passed the test. I don't understand 
why my patch worked before. ????

I'll replace my .diff file with yours so I apply the correct patch. 
Thanks for taking the time to look at it and fix the problem.

Is there anything more you expect from me for this issue?
msg4879 Author: [hidden] (ber) Date: 2013-05-09 14:14
Hi Tom,

good to know that your behaviour is similiar to mine. I was worried a bit about this.

I haven't committed the change, because I'm not confident enough that this is the 
current patch is a good solution.

One thing that worries me is the python method resolution order (mro), because 
class IssueClass(PostgresqlClass, rdbms_common.IssueClass) inherits the 
method  insensitive_like() from both parents. On what basis does the 
PostgresqlClass' method win? Does this work on all circumstances?

Also is this way of using  a method the most elegant way?
Why not using a class variable like it is done with "order_by_null_values"?
And than just check if we have this variable and if so, use the contents?
The dynamics of methods may not be necessary and the default value
would be closer to place in code where it is used. 

If the variable is only defined in case of PostgresqlClass, then there would be no 
need to think about the MRO, thus resolving my first issue.

If you get to it earlier than me, feel free to create a more elegant patch
trying the variable approach.
msg4880 Author: [hidden] (tekberg) Date: 2013-05-09 15:55
Bernhard, The variable approach looks a bit simpler. Attached is a patch 
file for this. I had to break one line up because it was too long. I'm 
not real happy with the result but there is a rule about that 
(http://roundup.sourceforge.net/docs/developers.html#project-rules).

Please let me know what you think.
msg4881 Author: [hidden] (ber) Date: 2013-05-10 08:57
Yes, I like the variable approach better thanks for the patch.

Could be do something like
   if self has attribute (case_insensitive_like) use it, otherwise use
"LIKE"?

Would solve my mro worries and would have the default value closer to the
code where it is actually used.
msg4882 Author: [hidden] (tekberg) Date: 2013-05-10 13:39
How about removing the initialization to 'LIKE' and changing the 
reference to self.case_insensitive_like to this:

getattr(self,'case_insensitive_like','LIKE')

I'll code up a patch if you think this is OK.
msg4883 Author: [hidden] (ber) Date: 2013-05-10 14:26
Sound good, please do and ran the tests. :)
msg4884 Author: [hidden] (tekberg) Date: 2013-05-10 15:17
I made the change, rebuilt roundup and ran

  sudo /usr/local/bin/python run_tests.py . testFilteringString

There were 12 tests, all OK. Attached is the updated patch file.
msg4885 Author: [hidden] (ber) Date: 2013-05-10 21:25
Committed with rev4787:4a017661e414 .
Tom, thanks again for the contribution!
History
Date User Action Args
2013-05-10 21:25:31bersetstatus: new -> closed
resolution: fixed
messages: + msg4885
2013-05-10 15:17:47tekbergsetfiles: + like-class-variable-patch2.diff
messages: + msg4884
2013-05-10 14:26:09bersetmessages: + msg4883
2013-05-10 13:39:51tekbergsetmessages: + msg4882
2013-05-10 08:57:47bersetmessages: + msg4881
2013-05-09 15:55:09tekbergsetfiles: + like-class-variable-patch.diff
messages: + msg4880
2013-05-09 14:14:36bersetmessages: + msg4879
title: Postgres searches title attribute case insensitive while sqlite does not -> Postgres should search title attribute case insensitive like sqlite
2013-05-08 16:09:31tekbergsetmessages: + msg4878
2013-05-07 23:03:43bersetfiles: + back_postgresql.py-20130508.diff
messages: + msg4877
2013-05-07 21:59:52bersetmessages: + msg4876
2013-05-06 07:57:04bersetmessages: + msg4875
2013-05-04 22:47:52rouiljsetmessages: + msg4873
2013-05-04 22:06:01bersetmessages: + msg4872
2013-05-04 21:36:01bersetmessages: + msg4871
2013-04-29 14:48:55bersetmessages: + msg4870
2013-04-29 14:41:24tekbergsetfiles: + rdbms_common.py.diff
keywords: + patch
messages: + msg4869
2013-04-29 09:55:28bersetmessages: + msg4868
2013-04-26 19:47:36tekbergsetmessages: + msg4867
2013-04-26 19:46:25tekbergsetfiles: + like.tar.gz
messages: + msg4866
2013-04-26 06:41:04bersetmessages: + msg4865
2013-04-25 19:44:11tekbergsetmessages: + msg4862
2013-04-25 19:25:17tekbergsetmessages: + msg4861
2013-04-15 08:31:53bersetnosy: + rouilj, - admin
messages: + msg4844
2013-04-12 15:14:44tekbergsetmessages: + msg4843
2013-04-12 07:43:24bersetmessages: + msg4842
2013-04-11 21:23:34tekbergcreate