Roundup Tracker - Issues

Issue 1195739

Classification
Title: search in russian does not work
Type:
Severity: normal
Components: Web interface
Versions:
Process
Status: fixed
Resolution: fixed
Assigned To:
Nosy List: a1s, ber, richard, rouilj
Priority: low
Keywords: patch

Created on 2005-05-05 08:24 by anonymous, last changed 2019-10-30 22:05 by rouilj.

Files
File name Uploaded Description
roundup.diff anonymous, 2005-05-05 08:24
Messages
msg1946 Author: [hidden] (anonymous) Date: 2005-05-05 08:24
Search does not work in Russian. It is possible to correct
this by using the Unicode features of Python. See the
enclosed diff.
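
For readers without the attached diff, here is a minimal sketch of the general idea, not the contents of roundup.diff. It assumes Python 3 and UTF-8 input; the names `to_text` and `split_words` and the 2 to 25 character word-length bounds are illustrative assumptions, not Roundup's API.

```python
import re

# ASCII-only pattern: with re.ASCII, \w matches only [a-zA-Z0-9_],
# so Cyrillic words are never seen by the indexer.
ASCII_WORD = re.compile(r"\w+", re.ASCII)

# Unicode-aware pattern: on str input in Python 3, \w matches
# letters and digits of any script, including Cyrillic.
UNICODE_WORD = re.compile(r"\w+")


def to_text(raw, charset="utf-8"):
    """Decode bytes to str; the utf-8 default is an assumption."""
    if isinstance(raw, bytes):
        return raw.decode(charset, "replace")
    return raw


def split_words(raw, pattern=UNICODE_WORD, min_len=2, max_len=25):
    """Return upper-cased words whose length lies within the bounds."""
    text = to_text(raw)
    return [w.upper() for w in pattern.findall(text)
            if min_len <= len(w) <= max_len]


if __name__ == "__main__":
    print(split_words("поиск search", ASCII_WORD))   # Russian word is dropped
    print(split_words("поиск search"))               # both words are kept
```

Passing the charset as a parameter rather than hard-coding it inside the splitter is one way to keep the encoding decision in a single place.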
msg1947 Author: [hidden] (a1s) Date: 2005-05-06 06:43
confirmed: current HEAD checkout does not find russian words

the patch fixes this problem for rdbms backend only, dbm and
xapian indexers are not affected.  perhaps i could port this
change to dbm backend.

more serious problem is that this patch contains hard-coded
charset, i.e. all database contents are assumed to be in
utf-8.  if we apply this patch, we lose the last chance to
ever support databases with different text encodings. (the
chances were not high anyway.)

Richard, do you think this is acceptable?
msg1948 Author: [hidden] (richard) Date: 2005-05-18 05:43
This patch is as good as we're going to get for now. 
msg1949 Author: [hidden] (a1s) Date: 2005-05-22 18:32
the patch is applied to HEAD and maint-0-8.

dbm indexer needs a bit more work: if i change text_splitter
to use unicode in a way similar to indexer_rdbms, dbm
indexer fails in .save_index() because there are no slots
for russian letters.

Richard, could you update docs, please?  i think we should
mention that from now on the database and all templates
*must* be in utf-8; other character sets are not and will
never be supported.
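
The dbm failure mentioned in the message above can be illustrated with a rough sketch. This is not the actual Roundup code: it only assumes that an index is saved in a fixed set of slots keyed by the first letter of each word. The names `SEGMENTS` and `bucket_words` and the `fallback` slot are hypothetical.

```python
import string

# Hypothetical fixed set of slots: one per ASCII upper-case letter.
SEGMENTS = string.ascii_uppercase


def bucket_words(words, segments=SEGMENTS, fallback=None):
    """Group words into slots keyed by their upper-cased first character.

    Without a fallback slot, a word whose first character has no slot
    (for example a Cyrillic letter) raises KeyError.
    """
    buckets = {slot: [] for slot in segments}
    if fallback is not None:
        buckets.setdefault(fallback, [])
    for word in words:
        if not word:
            continue  # skip empty strings, which have no first character
        key = word[0].upper()
        if key not in buckets:
            if fallback is None:
                raise KeyError("no slot for first character %r" % key)
            key = fallback
        buckets[key].append(word)
    return buckets


if __name__ == "__main__":
    print(bucket_words(["search"])["S"])
    print(bucket_words(["поиск"], fallback="~")["~"])
```

A catch-all slot is only one possible remedy; whether it matches the eventual fix is not stated in this thread.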
msg1950 Author: [hidden] (richard) Date: 2005-06-24 07:10
Where would you recommend that I put that information? 
 
Have you tried to index utf-8 with the Xapian indexer? If that 
works, then I'd be happy to say that if people wish to index utf-8 
then they can't use the built-in dbm indexer, but need to use 
the Xapian indexer. 
 
Having said that, I actually don't have a problem with just telling 
people they need to use one of the RDBMS backends. The dbm 
backend is really just for playing around, not serious work... 
msg1951 Author: [hidden] (a1s) Date: 2005-07-05 06:58
no, i didn't look at Xapian at all.

as for documentation update, it would be best to add a note
to three documents:
 * upgrade
 * html template editing
 * database setup
msg4614 Author: [hidden] (ber) Date: 2012-08-21 08:55
Reading up on the history. I believe this issue would need a retest.
msg6742 Author: [hidden] (rouilj) Date: 2019-10-13 22:50
See also: issue 1344046 - Search for "All text" can't find some Unicode words

Looks like a similar fix.
msg6784 Author: [hidden] (rouilj) Date: 2019-10-30 22:05
Final fix in rev5964:5bf7b5debb09.

Works for all indexers in Python 2 and Python 3.

Closing.
History
Date User Action Args
2019-10-30 22:05:37 rouilj set status: open -> fixed
resolution: fixed
messages: + msg6784
2019-10-13 22:50:57 rouilj set nosy: + rouilj
messages: + msg6742
2016-06-27 03:17:17 rouilj set keywords: + patch
2012-08-21 08:55:36 ber set assignee: richard ->
messages: + msg4614
nosy: + ber
2005-05-05 08:24:26 anonymous create