Issue 2551119
Created on 2021-03-18 12:13 by schlatterbeck, last changed 2021-06-14 01:55 by rouilj.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | Remove |
empty_multilink_expr.diff | ber, 2021-03-18 17:12 | |||
roundup-keyword-expr-20210414-1.png | ber, 2021-04-14 13:35 |
Messages | |||
---|---|---|---|
msg7124 | Author: [hidden] (schlatterbeck) | Date: 2021-03-18 12:13 | |
There is a feature in the query generator that can combine ID search with boolean expressions AND, OR, NOT, see commit f1fe6fd0aa61. This change came without a regression test and breaks for certain cases. Without a test of the *intended* feature I dare not touch that code. As far as I understand this uses *negative* IDs to implement a prefix or suffix expression parser. My use-case that breaks: We have two tables, time_project and time_wp with the following data structures in roundup (excerpt): time_project wps = Multilink('time_wp') status = Link('project_status') name = String() time_wp bookers = Multilink(user) Now a user enters a query with a work-package name that doesn't exist in the web-interface. The resulting call to filter has the following arguments: sort = [('+', 'name')] filterspec = {'status': ['2', '1'], 'wps.bookers': ['2428'], 'wps': []} This produces a traceback: <class 'psycopg2.errors.UndefinedTable'>: missing FROM-clause entry for table "_time_wp3" LINE 1: ...t from _time_wp where _time_wp.__retired__=0) and _time_wp3.... ^ ... return render(ob, econtext.vars) File "/usr/local/lib/python2.7/dist-packages/roundup/cgi/PageTemplates/Expressions.py", line 99, in render ob = ob() File "/usr/local/lib/python2.7/dist-packages/roundup/cgi/templating.py", line 3257, in batch l = [id for id in klass.filter(matches, filterspec, sort, group) File "/usr/local/lib/python2.7/dist-packages/roundup/backends/rdbms_common.py", line 2858, in filter sq = self._filter_sql (search_matches, filterspec, sort, group, File "/usr/local/lib/python2.7/dist-packages/roundup/backends/rdbms_common.py", line 233, in sql cursor.execute(sql, args) UndefinedTable: missing FROM-clause entry for table "_time_wp3" LINE 1: ...t from _time_wp where _time_wp.__retired__=0) and _time_wp3.... ^ It looks like the mentioned buggy code is triggered by the empty wps multilink in the search. Note how it tries to materialize the whole time_wp table (which contains 43665 entries in our case!) in a subquery. And it doesn't seem to correctly plug into the query generator: the from clause is missing an entry which is later referenced. I'd appreciate if someone using this extended query feature can turn on SQL statement logging (either in roundup or in the backend database) and provide me with pairs of filter calls and the generated SQL. Then I'd write an regression test for the intended behaviour. @Bernhard Reiter: Do you still use roundup with that feature? Examples of usage would be welcome! |
|||
msg7125 | Author: [hidden] (ber) | Date: 2021-03-18 14:59 | |
Hi Ralf, you seem to use the postgresl backen. AFAIK we still use this feature (though I don't know how many searches are done with it). We haven't looked at the code for a long while, the idea is to be able to compose complex searches. What is the fastest way to get a reproducable problem setup? |
|||
msg7126 | Author: [hidden] (ber) | Date: 2021-03-18 17:12 | |
Ralf, here is an into-the-blue-patch by one of our devs, more like a guess, but maybe it helps you anyway. (Sorry for not being more solid here, this certainly could have been documented better.) Best, Bernhard |
|||
msg7127 | Author: [hidden] (schlatterbeck) | Date: 2021-03-18 19:30 | |
Hi Bernhard, thanks for the quick reaction. On Thu, Mar 18, 2021 at 02:59:46PM +0000, Bernhard Reiter wrote: > > Hi Ralf, > you seem to use the postgresl backen. Yes. But I guess the problem is also reproduceable with mysql, the problem is in the query generator not backend specific. > AFAIK we still use this feature (though I don't know how many searches > are done with it). We haven't looked at the code for a long while, the > idea is to be able to compose complex searches. Cool. > What is the fastest way to get a reproducable problem setup? It would be nice if you could come up with examples of all the usual queries involving AND, OR, NOT and send me - The parameters to the filter function for a specific query - Optionally the generated sql (I can infer this myself). So this probable involves specifying some common queries via the web interface and looking what the filter method in backends/rdbms_common.py is called with. Unfortunately there is no logging for this, so you have to either instrument filter or set a breakpoint there in the python debugger. Accompanying the examples with the expression specified in the web interface would be nice, so I won't have to reverse-engineer that :-) The whole idea of all this is to come up with a test that covers functionality people (your customers among them :-) rely on before proceeding. On Thu, Mar 18, 2021 at 05:12:36PM +0000, Bernhard Reiter wrote: > > Ralf, > here is an into-the-blue-patch by one of our devs, > more like a guess, but maybe it helps you anyway. > (Sorry for not being more solid here, this certainly > could have been documented better.) Hmm, such a patch without a proper test is just what I want to avoid at this time :-) Once the behaviour is nailed down it should be easy to come up with a proper patch. Note that I have other systems where this misbehaves. I ideally would find out what goes wrong there and come up with additional tests (which would fail at this time) and proceed from there. I already have some analyses on what I *think* is wrong. But first I have to find out what it does when it's right :-) Thanks Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7128 | Author: [hidden] (ber) | Date: 2021-03-19 08:35 | |
Hi Ralf, just tried to be helpful as fast as I could yesterday. :) The original idea was to generally be able to have queries, where e.g. one status or two statuses, or priorities are excluded while others are allowed. One of the standard templates has the javascript code to form the queries. So you should be able to find test cases as well. The implementation idea had to find a few to express this via the url, and thus the notation was invented. If I have more time, I'll look it up here. Best, Bernhard |
|||
msg7129 | Author: [hidden] (rouilj) | Date: 2021-03-22 00:56 | |
Including link to issue2550648 to see all tickets related to this. There are 3 or 4 opened tickets. |
|||
msg7195 | Author: [hidden] (ber) | Date: 2021-04-14 13:35 | |
== Use Case (at Intevation) We want to search for a combination of keywords, for example all issues that have both the keywords "roundup" and "admin" === How it is used The query build from our template allows this, here is an url shortened to the significant parts, which will only list issues that have both keyword 1 and keyword 4. URL/issue?&keyword=1,4,-3&@action=search For this the expression editor is opened, the expression is constructed and "apply"ed to it is used as value for the keyword to search. If "OR" was selected, the URL would be URL/issue?&keyword=1,4,-4&@action=search So we have a postfix notation. roundup-keyword-expr-20210414-1.png shows an editor with "(a and b) or c" after pressing apply it will be entered in the current search page. === Template It is in the default classic roundup. testing with changeset: 6377:a7e7314fb7d9 tag: tip user: John Rouillard <rouilj@ieee.org> date: Sat Apr 10 22:10:07 2021 -0400 python demo.py nuke * login as demo, demo * create keywords a, b, c * create issues with a combination of the keywords. * use issue search page and expression editor to form expressions. A template set used by some of our still running elder trackers has the similar code https://hg.intevation.de/roundup/fast-decomposed/file/tip/html or hg clone https://hg.intevation.de/roundup/fast-decomposed fast-decomposed-hg Ralf, is that helping you? Thanks for your interest in this feature. :) (Sorry for having taken a while to respond here.) |
|||
msg7196 | Author: [hidden] (schlatterbeck) | Date: 2021-04-14 14:20 | |
On Wed, Apr 14, 2021 at 01:35:01PM +0000, Bernhard Reiter wrote: [...] > Ralf, is that helping you? Thanks for your interest in this feature. :) > (Sorry for having taken a while to respond here.) Yes! Thanks a lot, this is helping me to write tests for the indended semantics. I'm currently busy with another project (also roundup) so it will take me one or two weeks to come back to this. Thanks again for taking the time! Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7212 | Author: [hidden] (rouilj) | Date: 2021-05-06 19:10 | |
Hi Ralf, I am starting to plan the next roundup release. I would like to get a beta out end of may (no alpha this time). That way I can get the 20 year anniversary edition on in July. I'll probably release it on the 13th again a few days later then the first release: 0.1.0 2001-07-11. This is one of the few active bugs, do you think you can get it sorted (with or without docs) by end May or July? -- rouilj |
|||
msg7217 | Author: [hidden] (schlatterbeck) | Date: 2021-05-07 12:24 | |
On Thu, May 06, 2021 at 07:10:58PM +0000, John Rouillard wrote: > > I am starting to plan the next roundup release. I would like to get a > beta out end of may (no alpha this time). That way I can get the 20 > year anniversary edition on in July. I'll probably release it on the > 13th again a few days later then the first release: 0.1.0 2001-07-11. Cool! > This is one of the few active bugs, do you think you can get it > sorted (with or without docs) by end May or July? I'll try my best. Unfortunately the multilink expression feature is in worse shape than I anticipated: The very first test (the one suggested by Bernhard) fails for mysql. When a '-1' id (meaning "empty multilink") is involved, at least the tests I came up with fail for *all* backends. I'll commit what I have later today (when I'm reasonably sure I didn't break other tests, I've introduced a new class "keyword" in the tests) Once I have this in place the idea is - to get mysql working - to get the feature working with '-1' in a query - to allow '-1' mixed with other ids as an implicit "or" (my memory says that this used to work before the multilink-expression feature was introduced but I may be wrong and it may well be that this never worked for multilinks) - To ensure that if only '-1' (and no other boolean operator with lower negative id) is part of the query we do not materialize the whole multilink table in a subquery (that would be fine for 'keyword' but I have some tables where this would materialize *large* datasets) - Finally try to reproduce the bug for which the issue was originally created by writing a test for it - and fix it :-) Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7218 | Author: [hidden] (schlatterbeck) | Date: 2021-05-07 13:31 | |
The generated sql for mysql reads: 'select distinct(_issue.id) from issue_keywords,_issue where (1=0) and _issue.__retired__=0 order by _issue.id' not very likely this will ever return the correct set of items :-) |
|||
msg7219 | Author: [hidden] (rouilj) | Date: 2021-05-07 15:53 | |
Hi Ralf, it looks like anydbm and sqlite are working now and you got mysql running. Really just a single line change? I am surprised this bug didn't cause other issues. If I entered -1 for unset in a search form would that line be hit? If so would it work both before and after your change? I am pushing to CI now and will see how many times that line is covered. I wonder if we have a test for -1 (unset) somewhere in the test suite. Nice find. |
|||
msg7220 | Author: [hidden] (schlatterbeck) | Date: 2021-05-07 19:18 | |
On Fri, May 07, 2021 at 03:53:57PM +0000, John Rouillard wrote: > Hi Ralf, it looks like anydbm and sqlite are working now > and you got mysql running. Really just a single line change? > > I am surprised this bug didn't cause other > issues. If I entered -1 for unset in a search form would > that line be hit? If so would it work both before and after > your change? No this is the code with the expression parser for multilinks, e.g. the list '1', '2', '-3' would specify '1' AND '2' ('-3' is the "opcode" for AND). I think this has never worked for mysql. The constants in the expression representation are integers and IDs in roundup always were strings. So this never matched. A single '-1' would have worked. And now (latest commit some minutes ago) a list like '-1', '1', '2' would also work, this is implicit OR so it would mean a multilink containing '1' or '2' or being empty. I think this case has *never* worked. It *did* work for Link properties but not for multilinks. > I am pushing to CI now and will see how many times that line > is covered. I wonder if we have a test for -1 (unset) somewhere > in the test suite. Yes, be sure to include my latest commit. Note that I'm still planning further changes. The original bug (the reason I created this issue in the first place) is still not found. See my todo-list in msg7217, I've finished point 3 in that list :-) And I think point 4 is impossible, we always need a subquery for multilinks. And I'd like to make the expression parser work with: - '-1' being part of an expression, e.g., '1', '2', '-3', '-1', '-4' meaning "1 and 2 or empty" - reverse multilinks, I'm not sure this would work with anydbm, I'm fairly sure it should work with the sql backends. But this needs a test. Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7222 | Author: [hidden] (schlatterbeck) | Date: 2021-05-10 14:36 | |
OK, I've finally found + fixed the problem. During the search I discovered more cases where the query generator did not produce correct results. And the search method for anydbm (Proptree in hyperdb) was originally built on the assumption that an empty list [] would not produce any results when searching for links and multilinks. This has been changed without considering the semantics and also needed fixing (originally searching for empty Link/Multilink was done with the value '-1'). Due to similar reasons the expression parser interacted with the search method of Proptree when in a subtree results need to be combined. These two cases involve fairly rarely occurring use-cases like searching for two different attributes (say name and order) of the same transitive property, so this is probably the reason it was not found earlier. Another area littered with bugs was the interaction with newer features like reverse multilinks: this also was never tested with the multilink expression feature. And finally I'm allowing '-1' in expressions. E.g. we can now express "not empty" with ['-1', '-2'] ('-1' standing for empty and '-2' being the postfix operator for negation) searching e.g. for all issues that have any keyword (no matter which keyword). Note that the multilink expression feature probably has never worked all those years with mysql. It was comparing strings (IDs are strings in the database) with integers and did never find anything. I'm quite happy that the technical dept in that part of the code was finally cleaned up. All existing tests pass and I've created a bunch of new tests that also pass for me. Lets see if the CI test also passes those tests. |
|||
msg7223 | Author: [hidden] (rouilj) | Date: 2021-05-10 17:48 | |
In message <1620657365.75.0.788253625663.issue2551119@roundup.psfhosted.org>, Ralf Schlatterbeck writes: >Ralf Schlatterbeck added the comment: > >OK, I've finally found + fixed the problem. > [...] >I'm quite happy that the technical dept in that part of the code was finally cleaned up. > >All existing tests pass and I've created a bunch of new tests that >also pass for me. Lets see if the CI test also passes those tests. https://www.travis-ci.com/github/roundup-tracker/roundup/builds/225467127 looks good from here. All is green. Python 3.9 runs reports: 1581 passed, 2 skipped, 1 warning skipped: TrackerConfig::testInvalidIndexerLanguage_w_empty_no_xapian HTMLClassTestCase::test_string_stext warning: gpgme is still using imp Can you do a first draft on writing up the expression parser? I am not sure where in the docs it should go. I think the user guide is a good place, but there are a few issues with entering the expression (javascript editor dissapears after use so there is no way to edit the expression, I think there was also an issues with saving a search with an expression). So maybe putting it there is not ideal. Except for documentation is there anything else that has to be done? Thanks for your efforts. I have been lost in the templating code. Have a great day. |
|||
msg7226 | Author: [hidden] (schlatterbeck) | Date: 2021-05-11 08:57 | |
On Mon, May 10, 2021 at 05:48:17PM +0000, John Rouillard wrote: > Can you do a first draft on writing up the expression parser? I am not > sure where in the docs it should go. I can provide some documentation on the filter method -- should probably go into the relevant chapter of doc/customizing.txt. I have no experience with the javascript code for editing expressions. I've never used it myself and I'm not using it in any of my trackers. So I can't contribute anything meaningful to the user-guide part. Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7227 | Author: [hidden] (schlatterbeck) | Date: 2021-05-11 09:35 | |
I've pushed some documentation in both customizing.txt and in hyperdb.filter (almost the same wording). I notice that we have almost no documentation on the hyperdb-level API (There is some documentation of the HTML wrapper but not on the underlying hyperdb API) |
|||
msg7285 | Author: [hidden] (rouilj) | Date: 2021-06-14 01:55 | |
Closing this as fixed. Doc done and Ralf I assume you agree it is complete. Reopen if you disagree. Another blocker for 2.1.0 gone. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2021-06-14 01:55:52 | rouilj | set | keywords:
- Blocker priority: normal messages: + msg7285 status: open -> fixed resolution: fixed |
2021-06-10 01:57:05 | rouilj | set | status: pending -> open |
2021-06-10 01:54:19 | rouilj | set | status: open -> pending |
2021-05-11 09:35:22 | schlatterbeck | set | messages: + msg7227 |
2021-05-11 08:57:52 | schlatterbeck | set | messages: + msg7226 |
2021-05-10 17:48:17 | rouilj | set | messages: + msg7223 |
2021-05-10 14:36:05 | schlatterbeck | set | messages: + msg7222 |
2021-05-07 19:18:19 | schlatterbeck | set | messages: + msg7220 |
2021-05-07 15:53:57 | rouilj | set | messages: + msg7219 |
2021-05-07 13:31:18 | schlatterbeck | set | messages: + msg7218 |
2021-05-07 12:24:59 | schlatterbeck | set | messages: + msg7217 |
2021-05-06 19:16:27 | rouilj | set | keywords: + Blocker |
2021-05-06 19:10:58 | rouilj | set | status: new -> open type: behavior messages: + msg7212 |
2021-04-14 14:20:50 | schlatterbeck | set | messages: + msg7196 |
2021-04-14 13:35:01 | ber | set | files:
+ roundup-keyword-expr-20210414-1.png messages: + msg7195 |
2021-03-22 00:56:19 | rouilj | set | nosy:
+ rouilj messages: + msg7129 |
2021-03-19 08:35:49 | ber | set | messages: + msg7128 |
2021-03-18 19:30:33 | schlatterbeck | set | messages: + msg7127 |
2021-03-18 17:12:36 | ber | set | files:
+ empty_multilink_expr.diff keywords: + patch messages: + msg7126 |
2021-03-18 14:59:46 | ber | set | messages: + msg7125 |
2021-03-18 12:13:36 | schlatterbeck | create |