Roundup Tracker - Issues

Issue 2551374

classification
Add error handling for filter expressions
Type: behavior Severity: normal
Components: Database, Web interface Versions:
process
Status: fixed fixed
:
: rouilj : rouilj, schlatterbeck
Priority: :

Created on 2024-12-02 18:37 by rouilj, last changed 2025-01-01 07:17 by rouilj.

Messages
msg8205 Author: [hidden] (rouilj) Date: 2024-12-02 18:37
Roundup has a search expression language based on an RPN grammar. E.G. -1,-2 finds all
links/multilinks that have a value (-1 -> empty, -2 -> NOT, -3 -> AND, -4 -> OR).

Errors in the RPN filter are not reported to the user. This allows invalid expressions
like: -2,... (-2 needs an argument to NOT) to be sent without an error.

Done for this ticket looks like:

  1) exception is raised for malformed expressions
  2) The exception is reported to the user in some form like:
       Your expression is incorrect. While processing the keyword filter, the
         -2 (not) operator at position 1, of "-2,1,8,-3..." no operand was found.

      that:
         1 identifies the operator,
         2 the location in the filter
         3 the property being filtered
         4 the error
         5 the whole filter

      Another example: status=3,2,-2 leaves the 3 on the stack. Currently this returns
        not(status2), but should report an error like:

         Not all values were used in the filter for status. The remaining values were
          '3' after processing the rest of the expression.

In the referenced messages I suggest creating a new ExpressionError and propagating that
back to the user.

I will add the method: NOtestFilteringLinkInvalid to db_test_base.py as an example of
running the two cases above. Code to trap/verify a raised exception TBD.
Since it starts with NO it won't be run and it currently fails if enabled (by renaming
without the NO).

Initial ideas (see msg8201, msg8202):

  Trap IndexError if a pop fails in mlink_expr.py:compile_expression
     Raise a new exception let's call it ExpressionError with the parse info (1,2,4,5 above).

  Verify that there is only one item on the stack when returning. If not raise
     ExpressionError.

  Propagate any other error up the stack.

  Have the caller trap ExpressionError  and modify the ExpressionError message adding
      the property name (3 above) and reraise.

  Make sure the error is displayed to the user.

  Any other exception operates as it does currently.

Not sure that these changes will work as I haven't run through the code but...

Excerpts from msg8201 msg8202:

====
Ralf:
> >On Mon, Dec 02, 2024 at 03:30:54AM +0000, John Rouillard wrote:
> >[...]
> >> Another example is: ``creator=3,-2,1,-2,-3``. This is the same as the
> >> expression: ``(not user3) and (not user1)``. Using the rules of logic,
> >> this is the same as: ``not (user3 or user1)`` which is expressed in
> >> RPN as ``creator=3,1,-4,-2``. Compare this to ``creator=3,1,-2`` which
> >> returns issues created by user3 or any user other than user1.
> >
> >Have actually tried the last example? I'm not sure the implicit 'OR'
> >applies when other expressions have been matched. So it may well be that
> >creator=3,1,-2 is a syntax error (and would return *all* issues) and you
> >really need the final 'OR' like in creator=3,1,-2,-4
>
> I did test it. Both link (using status) and multilink (using keyword)
> cases. I just retested manually in case I screwed up the URL
> editing. It returns what I said, all entries without user1. The 3 is
> effectively a no-op since '1,-2' includes 3 anyway.
>
> Looking at mlink_expr.py:compile_expression, I'll bet '3' is just left
> on the stack and "1,-2" is popped/returned rather than a syntax
> error. compile_expression probably should check stack depth before
> returning and if it's not 1, raise an exception.

Yes that was my hunch here, too, that when in 'expression mode' it would
not do the implicit OR.
Checking the stack depth is a good idea.

> Also since "keyword=-2,1,8,-3,-2,-3" doesn't throw an error, I suspect
> your intuition is right. (However it acts like the expression was
> "keyword=-1,-2" not sure if that's intentional.)

That would be an operator on an empty stack... we probably should check
that, too. (Ah it is caught by the bare except statement)

[...]
> I wonder if wrapping the opcodes loop in
> mlink_expr.py::compile_expression in a try/except IndexError and
> raising a new ExpressionError exception that propigates through
> rdbms_common.py:_filter_multilink_expression's bare except would work?
...
> This fix would need to be in rdbms and anydbm back ends.
> Also it needs a new issue.

Yes, I also think fixing the expression parsing deserves its own
issue... Or we just document that an expression error is silently
ignored and returns the OR of all valid issue ids.
====

See also: msg8201 msg8202
msg8239 Author: [hidden] (rouilj) Date: 2024-12-31 01:23
Two cases are handled:

1. pop on the stack when there is nothing to pop. Error (wrapped for reading) looks like:

     There was an error searching issue by status using: [-2, 3, 4, -4].
     The operator -2 (not) at position 1 has too few arguments.

   a second case with two errors (-4 has too few arguments too) reports only
   the first error:

     There was an error searching issue by status using: [-2, 4, -4]. The operator
     -2 (not) at position 1 has too few arguments.

  where the terms "issue", "status", "[-2, 4, -4]" '-2 (not)' and position number (1)
  are all supplied from a context attribute on the new ExpressionError exception.

2. The stack has more than one item on it when returning. Error looks like:

     Error when searching issue by status using: [-1, 6, 1, 2, -4]. There are too many
     arguments for the existing operators. The values on the stack are: [ISEMPTY(-1),
     Value 6, (Value 1 OR Value 2)]

   where "issue", "status", "[-1, 6, 1, 2, -4]", and "[ISEMPTY(-1), Value 6,
   (Value 1 OR Value 2)]" are supplied by the context property on the
   ExpressionError exception.

The ExpressionError exception is called like:

        raise ExpressionError(
            "There was an error searching %(class)s by %(attr)s using: %(opcodes)s. "
            "The operator %(opcode)s (%(opcodename)s) at position "
            "%(position)d has too few arguments.",
             context={
                 "opcode": opcode,
                 "opcodename": opcode_names[opcode],
                 "position": position + 1,
                 "opcodes": opcodes,
             })

it's __repr__ and __str__ methods use args[0] as a % print template with context as the
dict. If using % fails, and alternate output format is used. The context argument is 
added to by callers of the Expression class since the class and attribute being searched
is only available to the callers.

Replaced NOtestFilteringLinkInvalid with testFilteringBrokenMultilinkExpression and
testFilteringBrokenLinkExpression. Also a test was added to test_liveserver.

Note that the error displayed on the web isn't very nice. It falls back to a basic error
page with no frame or other components. It would be better if we could display an empty
index page with an error message at the top. Handling a nice display is still to be done.

Even with this poor UI, it does help identify issues when I make a mistake with a filter
expression.

changeset:   8241:741ea8a86012
msg8244 Author: [hidden] (rouilj) Date: 2025-01-01 07:17
in changeset:   8253:cae1bbf2536b, replaced:

  self.renderError(e.args[0])

with:

   self.template = "search"
   self.write_html(self.renderContext())

self.template is "index" which tries to execute the search and fails over and over
again. By changing it to "search", I get the search page with the error and the
bad entry is displayed in the text input.

If the search was not generated from an alternate search page, the user is redirected
to the base search page. This might not allow the user to fix the problem (e.g. if
the search field is not present).

I think this is done for now. If we have somebody run into the alternate search page 
problem we can handle it then. A couple of ways might work:

  1) submit the page using an '@template=oktemplate|errortemplate' and change the
     core code to store that template info so I can use it when I handle the
     ExpressionError. This is probably the cleaner way to do it. Modifying
     the template element of the Client class to add oktemplate/errortemplate
     and parsing/setting them in determine_context().

  2) the search template can look for fields that an alternate search template
     handles and redirect the user by adding @template=altsearch to the url.
     This can be implemented in a tracker while 1 needs core code changes.
History
Date User Action Args
2025-01-01 07:17:34rouiljsetstatus: open -> fixed
assignee: rouilj
resolution: fixed
messages: + msg8244
2024-12-31 01:24:03rouiljsetnosy: + schlatterbeck
2024-12-31 01:23:34rouiljsetstatus: new -> open
messages: + msg8239
2024-12-02 18:56:34rouiljlinkissue2550698 superseder
2024-12-02 18:37:42rouiljcreate