Message3774
First, what we agree about:
- Semicolon and 'to' should continue to work as range operator
- the optional 'from' could be supported as well
- for several ranges to be combined, parentheses are needed
- all existing search expression syntax should continue to work
- 'or' should be supported to combine range expressions
- the comma currently works as an 'or' operator, which I admit
I wasn't aware of (I didn't use it, since whitespace works as well)
and should continue to work.
And, of course, in real world usage I will yield SQL code (e.g. a
"WHERE" clause without "WHERE", but with dict-style format placeholders
for the variable name). My first goal was expression evaluation, and it
won't be difficult to create an expression for SQL instead of Python.
The Python expression is a good first shot, and it can be tested very
easily.
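To illustrate the two targets, here is a minimal sketch, not the uploaded code. The function names, the "?" parameter style and the %(name)s placeholder for the column are assumptions made for this example only.

```python
def range_to_python(var, low, high, low_closed=True, high_closed=True):
    """Build a Python expression string for a range; None means 'no edge'."""
    parts = []
    if low is not None:
        parts.append("%s %s %r" % (var, ">=" if low_closed else ">", low))
    if high is not None:
        parts.append("%s %s %r" % (var, "<=" if high_closed else "<", high))
    # A range without any edge restricts nothing.
    return " and ".join(parts) or "True"


def range_to_sql(low, high, low_closed=True, high_closed=True):
    """Build a WHERE fragment (without the keyword) and its parameters.

    The column name stays a %(name)s placeholder (hypothetical convention),
    the values are passed separately as '?' parameters.
    """
    parts, params = [], []
    if low is not None:
        parts.append("%(name)s " + (">=" if low_closed else ">") + " ?")
        params.append(low)
    if high is not None:
        parts.append("%(name)s " + ("<=" if high_closed else "<") + " ?")
        params.append(high)
    return " AND ".join(parts) or "1=1", params
```

The Python string can be checked directly with eval() in a unit test, which is why it makes a convenient first target; the SQL fragment only needs the column name filled in, e.g. fragment % {"name": "id"}.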
I agree that some of my examples are somewhat outdated; as I wrote in
msg3772, the syntax is too liberal, and I will restrict it (see below).
As mentioned above, I was not aware of the current nature of the comma.
BTW, as you may have noticed, I uploaded a unit test for float ranges
as well (file412); and of course I know that enumerations don't make
much sense for them.
Then something where you are IMO wrong:
- The whitespace syntax is *not* new; it is supported e.g. when seeking
issues by id. Thus, it *must* continue to work.
- *Requiring* the comma (or 'or') as an enumeration operator thus
would *break* compatibility (yes, it is supported; but it is
currently not *required*).
I disagree about the decimal comma case. You and I are IT experts
who are used to decimal points; but many people who are supposed to
simply *use* Roundup are not. One major strength of Roundup is its huge
customizability which makes it suitable for non-geek usage (for people
who won't use Bugzilla because it is -- or looks -- too complicated).
Roundup should treat them well.
Of course it would be possible to recognise floats in both flavours,
with decimal points or decimal comma, regardless of the locale setting.
It just occurred to me: we *have* a difference already between search
syntax for date and integers, which is derived from the value syntax;
and we know it doesn't make very much sense to support enumerations for
floats, because of the nature of the data. Thus we could simply drop
enumeration support for floats (only open intervals and 'or'), and we
could support decimal commas right away. Probably we should require at
least one digit behind the comma, and when enumeration syntax is used,
the expression generator could yield a useful message.
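A minimal sketch of recognising both flavours regardless of locale, assuming enumeration commas are not allowed in float expressions. The name parse_float and the exact pattern are hypothetical; literals like ".5" are deliberately not handled here.

```python
import re

# Digits, optionally followed by a point or comma and at least one digit.
_FLOAT = re.compile(r"[+-]?\d+(?:[.,]\d+)?\Z")


def parse_float(token):
    """Accept '3.5' and '3,5' alike; reject '3,' and '3.' (no digit behind)."""
    if not _FLOAT.match(token):
        raise ValueError("not a float literal: %r" % token)
    return float(token.replace(",", "."))
```

With at least one digit required behind the comma, a trailing comma can never be mistaken for part of a number, so the generator is free to report it as misplaced enumeration syntax.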
I strictly disagree about whether curly brackets would make the syntax
harder to understand; and I disagree about the "problem" of having both
ranges and enumerations in the same expression.
I consider it a quite common need to specify both a range and an
enumeration. It is easy, it is implemented, and it should be possible.
What is difficult about "[3; 7] or {2 3 5}"? The most difficult bit is
the difference between open and closed interval edges. The curly
brackets are a possibility to explicitly state you want an enumeration.
This is very "pythonic": explicit is better than implicit ;-)
There is a problem about the range "(3; 5)" and an enumeration "(3, 5)":
The range doesn't contain the edges (thus, for integers it would yield
4), while the enumeration *does* contain the edges. This is
inconsistent. The better solution would be to enforce the usage of
curly brackets (instead of others; see below) for enums, which make
clear that an enumeration is a completely different beast.
There is another problem. Considering "(a;)" and "(;a)", logically
"(;)" must be a match-all expression, because a range spec contains
restrictions which are not present here. An empty set "(,)" in contrast
can never ever match *all* but must match *nothing*, because the set
values are positive specifications. It is A Good Thing to make this
difference visible, and to write the set "{}".
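The two differences can be shown with a small sketch (the function names are made up for this example; round brackets are taken to mean open edges, as above):

```python
def in_range(value, low=None, high=None, low_closed=False, high_closed=False):
    """Range test; every edge given is a restriction, None restricts nothing."""
    if low is not None:
        if value < low or (value == low and not low_closed):
            return False
    if high is not None:
        if value > high or (value == high and not high_closed):
            return False
    return True


def in_set(value, members=()):
    """Set test; every member is a positive specification."""
    return value in members
```

For integers, the open range (3; 5) matches only 4, while the set {3, 5} matches exactly 3 and 5. With no edges, in_range accepts everything; with no members, in_set accepts nothing.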
My updated proposal is:
- semicolons are for ranges, commas for enumerations/sets (your point)
- whitespace is for enumerations/sets (support required and convenient)
- "a; b" -- a range
- "a b", "a, b" -- equivalent sets
- "[a; b]" -- the above range, but combinable
- "{a, b}", "{a b}" -- the above set, but combinable
- 'or' can be used anywhere instead of commas
- 'to' can be used anywhere instead of semicolons
- commas (...) inside range brackets are errors
- semicolons (...) inside set brackets are errors
- expressions like "a; b; c" (a range can have at most 2 edges)
or "a; b, c" are errors (change of expression type requires brackets)
- a single value in the expression is a "set of itself"
(and of course doesn't require a comma)
- when an error occurs, a nice message (including the position)
explains it.
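The unbracketed part of these rules can be sketched as follows. This is only an illustration under simplifying assumptions: brackets, decimal commas and date specs are ignored, and parse_simple and TOKEN are hypothetical names, not the uploaded implementation.

```python
import re

# A token is either a separator (';' or ',') or a run of other characters.
TOKEN = re.compile(r"\s*(?:(?P<sep>[;,])|(?P<word>[^\s;,]+))")


def parse_simple(text):
    """Return ('range', low, high) or ('set', [values]) for one expression."""
    kind, values = None, []
    after_sep = True          # True at the start and right after a separator
    range_sep_seen = False
    for m in TOKEN.finditer(text):
        tok, pos = m.group(m.lastgroup), m.start(m.lastgroup)
        tok = {"to": ";", "or": ","}.get(tok, tok)   # word operators
        if tok in (";", ","):
            new_kind = "range" if tok == ";" else "set"
            if kind not in (None, new_kind):
                raise ValueError("position %d: %r not allowed in a %s"
                                 % (pos, tok, kind))
            kind = new_kind
            if kind == "range":
                if range_sep_seen:
                    raise ValueError(
                        "position %d: a range has at most 2 edges" % pos)
                range_sep_seen = True
                if after_sep:             # nothing before ';': open low edge
                    values.append(None)
            elif after_sep:
                raise ValueError("position %d: value expected" % pos)
            after_sep = True
        else:
            if not after_sep:             # values separated by whitespace only
                if kind == "range":
                    raise ValueError(
                        "position %d: enumeration inside a range" % pos)
                kind = "set"
            values.append(tok)
            after_sep = False
    if kind == "range":
        if after_sep:                     # nothing after ';': open high edge
            values.append(None)
        return ("range", values[0], values[1])
    return ("set", values)
```

For example, parse_simple("3 to 7") gives ('range', '3', '7'), parse_simple("2 3 5") gives ('set', ['2', '3', '5']), and parse_simple("a; b, c") raises a ValueError naming position 4.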
Now the compatibility topic.
We need a new syntax anyway; for numeric fields, the new syntax will be
the *only* possibility to specify ranges. To stay compatible, it *must*
support whitespace enumerations for numbers.
For date values, whitespace can't be used this way, because Roundup's
date specs can contain whitespace. There is nothing we can do about
this difference: Whitespace was supported for integers, and so must
continue to be; whitespace could not be supported for date values, and
won't ever be. For both cases (and for floats as well, of course),
there are two possibilities they have in common:
- the 'or' operator
- inside curly brackets, the semicolon divides values, but changes
nothing about the nature of the enumeration-implying curly bracket.
I can't see a problem with my proposal of a configuration switch
(msg3765). The current functionality for the old date search syntax is
there, and so are hopefully the unittests. There will be unittests for
the new syntax as well. What else do you need?!
If it is already true that date specs like '-1d' refer to yesterday,
0:00, this is fine; nobody wants to change this. The 2nd edge seems to
be seldom used anyway; however, we should be as consistent between data
types as possible, and thus a specification of "yesterday;today" should
find *all* of yesterday and *all* of today (some date values can lie in
the future).
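One possible way to get "all of yesterday and all of today" is to turn the closed range of days into a half-open range of timestamps. The sketch below uses the standard datetime module rather than Roundup's own date class, and closed_day_range is a hypothetical helper name.

```python
from datetime import date, datetime, time, timedelta


def closed_day_range(first_day, last_day):
    """Map a closed range of days to timestamps (start, end).

    A value matches when start <= value < end, so the whole of last_day
    is included.
    """
    start = datetime.combine(first_day, time.min)
    end = datetime.combine(last_day + timedelta(days=1), time.min)
    return start, end
```

With first_day = 2009-07-12 and last_day = 2009-07-13 this yields 2009-07-12 00:00 up to (but not including) 2009-07-14 00:00.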
It is *possible* that there are scripts around that rely on the
[date1;date2) logic of the old syntax. Given enough time, it won't be a
problem to update the scripts; this wouldn't be complicated. Given a
configuration switch, every admin can decide himself if and when to
switch. Stored queries could be updated easily by a script as well. It
would be worth the effort.
Let's take the opportunity to do it right. I'll implement it -- maybe
with a little help with the dates --, including the unit tests, and
everyone will like it; I promise ;-)
Date | User | Action | Args
2009-07-13 17:57:46 | tobias-herp | set | messageid: <1247507866.37.0.282981276955.issue1182919@psf.upfronthosting.co.za>
2009-07-13 17:57:46 | tobias-herp | set | recipients: + tobias-herp, richard, schlatterbeck, ber, ajaksu2
2009-07-13 17:57:46 | tobias-herp | link | issue1182919 messages
2009-07-13 17:57:45 | tobias-herp | create |