> Think about rates, scientific parameters or small prices
Ok, you're right;  I was thinking along the lines of identifiers or at
least enumerable types (which floating point numbers are not).
With whitespace following, a comma could some day be accepted as an
operator (once we have a solution which accepts formatted input, e.g.
with decimal commas);  AFAIK, punctuation marks (such as commas in
enumerations) are separated by whitespace from the following text in
all natural languages using Latin letters anyway.
> supporting "and" for intervals will confuse the typical user
Agreed.  Maybe we can some day come up with a suitable keyword or symbol
for 'intersection'.
> Of course the same syntax should be generalized for date values, too.
Agreed.
This raises the problem of compatibility with the existing syntax, which
must be handled somehow.  However, I don't insist on total compatibility,
since the existing syntax doesn't support enumerations, and IMO we need
them.
Thus, here is my new proposal:
expression                resulting set  remarks     
==========                =============  =======
1; 2; 3                   1, 2, 3        no parentheses nor 'to'
                                         `-> enumeration (';' is 'or')
1 to 5                    1, 2, 3, 4, 5  'to' -> range
1; 5                      1, 5           <> existing date syntax!
[1; 5]                    1, 2, 3, 4, 5  parentheses -> range
                                         (';' is 'to')
1 to 3; 6                 1, 2, 3, 6
[1; 5)                    1, 2, 3, 4     like current "1; 5" for dates
(1; 5)                    2, 3, 4        for consistency
[1; 3] or (6; 9)          1, 2, 3, 7, 8
[3;]                      3, 4, 5 ...
(3;)                      4, 5, 6 ...
(3;) or 1                 1, 4, 5, 6 ...
(3;);1                    1, 4, 5, 6 ...
With comma support (whitespace after semicolon is optional; whitespace
after comma is significant):
1, 2, 3                   1, 2, 3        natural language
1,2,3                     ERROR          possible ambiguity
1,5                       ERROR          possible ambiguity
(1,5)                     ERROR          possible ambiguity
(1, 5)                    2, 3, 4
[1,]                      1, 2, 3, ...
(all examples with ";" should work, with ";\s*" replaced by ", ";
commas raise errors unless followed by whitespace or parentheses)
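The comma rule could be checked before any real parsing takes place.  A
minimal sketch, assuming that any of ()[]{} counts as a "parenthesis" and
that a comma at the very end of the input is an error as well;  the name
check_commas is made up for this illustration:

```python
import re

# A comma is accepted only when whitespace or a bracket follows it.
# Assumption: any of ()[]{} counts as a "parenthesis" here.
_BAD_COMMA = re.compile(r",(?![\s()\[\]{}])")


def check_commas(expr):
    """Return expr unchanged, or raise ValueError for an ambiguous comma."""
    match = _BAD_COMMA.search(expr)
    if match:
        raise ValueError(
            "possibly ambiguous comma at position %d in %r"
            % (match.start(), expr)
        )
    return expr
```

With this, "1, 2, 3", "(1, 5)" and "[1,]" pass, while "1,2,3", "1,5" and
"(1,5)" are rejected.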
Thus, the 'to' keyword would imply a closed range, *including* the
boundary values: "a to b" is the same as "[a; b]" (or "[a to b]").
Logic:
- parentheses are considered first:
  * every parenthesis is unambiguously opening ('(', '[', perhaps '{')
    or closing
  * assumptions about the possible content are implied (ranges, open
    or closed; perhaps '{' enforcing non-range)
- 'to' implies a range; unless the nature of the range is specified
  by parentheses, it implies a *closed* range
- 'or' implies alternatives
- every ';' can be replaced by 'to' (range meaning) or 'or'
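To make the logic above more tangible, here is a rough sketch of a parser
which turns an expression into a test function.  It is only an
illustration, not a proposed implementation:  it assumes non-negative
numbers, handles only ';' (no commas, no '{'), and all names (tokenize,
parse, ...) are made up.

```python
import re

# One token: a number, a keyword ('to', 'or') or a punctuation mark.
TOKEN = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|(to\b|or\b)|([;\[\]()]))")


def tokenize(expr):
    """Split expr into ("num", value) and ("op", text) tokens."""
    tokens, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if not m:
            raise ValueError("unexpected input at position %d" % pos)
        number, word, punct = m.groups()
        if number is not None:
            tokens.append(("num", float(number)))
        else:
            tokens.append(("op", word or punct))
        pos = m.end()
    return tokens


def _op(tokens, pos):
    """Return the operator text at pos, or None if there is none."""
    if pos < len(tokens) and tokens[pos][0] == "op":
        return tokens[pos][1]
    return None


def _number(tokens, pos):
    if pos < len(tokens) and tokens[pos][0] == "num":
        return tokens[pos][1], pos + 1
    raise ValueError("number expected")


def _range(low, high, low_closed, high_closed):
    """Build a test for a range; high=None means 'no upper limit'."""
    def test(x):
        if x < low or (x == low and not low_closed):
            return False
        if high is None:
            return True
        return x < high or (x == high and high_closed)
    return test


def _term(tokens, pos):
    """Parse a bracketed range, 'a to b' or a single number."""
    opener = _op(tokens, pos)
    if opener in ("[", "("):
        # Inside parentheses, ';' means 'to'.
        low, pos = _number(tokens, pos + 1)
        if _op(tokens, pos) not in (";", "to"):
            raise ValueError("';' or 'to' expected inside parentheses")
        pos += 1
        high = None
        if pos < len(tokens) and tokens[pos][0] == "num":
            high, pos = _number(tokens, pos)
        closer = _op(tokens, pos)
        if closer not in ("]", ")"):
            raise ValueError("closing parenthesis expected")
        return _range(low, high, opener == "[", closer == "]"), pos + 1
    low, pos = _number(tokens, pos)
    if _op(tokens, pos) == "to":
        # Without parentheses, 'to' implies a closed range.
        high, pos = _number(tokens, pos + 1)
        return _range(low, high, True, True), pos
    return (lambda x: x == low), pos


def parse(expr):
    """Return a test function; top-level ';' and 'or' mean alternatives."""
    tokens = tokenize(expr)
    test, pos = _term(tokens, 0)
    tests = [test]
    while pos < len(tokens):
        if _op(tokens, pos) not in (";", "or"):
            raise ValueError("operator expected")
        test, pos = _term(tokens, pos + 1)
        tests.append(test)
    return lambda x: any(t(x) for t in tests)
```

For example, parse("[1; 3] or (6; 9)") accepts 1, 2, 3, 7 and 8, while
"1 to 3 to 5" and "(3;)1" raise a ValueError.  Expressions with ';' or
'or' inside parentheses, or with blanks as separators, are deliberately
rejected by this sketch.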
More examples:
expression                resulting set  remarks     
==========                =============  =======
1 to 3; 5                 1, 2, 3, 5
1 to 3 to 5               ERROR
[1; 3] or 5               1, 2, 3, 5
[1; 3]; 5                 1, 2, 3, 5
(1; 3) or 5               2, 5
1 or 3 or 5               1, 3, 5        {without parentheses,
1; 3 or 5                 1, 3, 5         ";" means 'or'}
To be discussed:
(1 to 3; 5)               1, 2, 3, 5     internally turned to
                                         ((1 to 3) or 5)
1 to 3 or 5               1, 2, 3, 5     like "(1 to 3) or 5"
[1 or 3]                  1, 3           because of "[1; 3]", which
                                         includes 1 and 3 as well
(1 or 3)                  1, 3           should be allowed
{1; 3}                    1, 3           non-range implying parentheses
{1 to 3}                  ERROR          contradiction
(3;)1                     ERROR          missing operator
(1; 3;)                   1, 3; or ERROR ';' before closing parenthesis
                                         could imply "no limit"
(;)                       all            as long as no 'and' is imple-
                                         mented, this will cause the
                                         entire expression to match
                                         every value
1 5                       1, 5
1;3 5                     1, 3, 5
(1;3 5)                   ERROR          ambiguity
Some thoughts on dates:
- the syntax is incompatible with the existing one;  the current
  "a;b" date expression would need to be expressed as "[a; b)"
- "a to b" (or "[a; b]") would *contain* the records of day b
  (it doesn't currently, does it?), while "[a; b)" would not;
  as long as no hours or minutes (...) are specified, only the date
  parts of the values should be considered
- we should have a configuration switch for date expression
  compatibility:
  * first, the switch should be set by default to evaluate *date*
    expressions the traditional way (since they might be used by
    scripts, which is not true for non-date range expressions)
  * after a reasonable time, the next major Roundup version
    (Roundup v1.5?) should switch this default to the new syntax
  * again after a reasonable time, the support for old date expressions
    could be removed
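The day granularity mentioned in the second point could look roughly like
this.  It is a sketch only:  matches_days and its arguments are
hypothetical names, and nothing here reflects how Roundup compares dates
today.

```python
from datetime import date, datetime


def matches_days(value, first, last, last_included=True):
    """Compare only the date part of value against the days first..last.

    last_included=True corresponds to "a to b" / "[a; b]",
    last_included=False to "[a; b)".
    """
    # datetime is a subclass of date, so strip the time part explicitly.
    day = value.date() if isinstance(value, datetime) else value
    if day < first:
        return False
    return day <= last if last_included else day < last
```

A record created late in the evening of day b would then match "a to b",
but not "[a; b)".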
A function which takes an expression and returns a function
could/should return a normalized version of the expression as well, e.g.
replacing all semicolons with 'or' or 'to', like this:
>>> evaluate("[3;8)")
("[3 to 8)", lambda x: 3 <= x < 8)
If this normalized version were used in a "refine search" input
field of the result page -- or displayed/inserted by some AJAX
functionality -- it could make the whole feature quite self-explanatory.
The only difficult bit would be about "open" ("()") or "closed" ("[]")
ranges.
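A very reduced sketch of such a function, assuming a single range in
parentheses with both limits given as integers;  everything beyond the
name evaluate and the shape of its result is an assumption made for this
illustration:

```python
import operator
import re

# One range in parentheses with two integer limits, e.g. "[3;8)".
_RANGE = re.compile(r"\s*([\[(])\s*(\d+)\s*(?:;|to)\s*(\d+)\s*([\])])\s*")


def evaluate(expr):
    """Return (normalized expression, test function) for one range."""
    match = _RANGE.fullmatch(expr)
    if match is None:
        raise ValueError("unsupported expression: %r" % expr)
    opener, low, high, closer = match.groups()
    low, high = int(low), int(high)
    # '[' and ']' include the limit, '(' and ')' exclude it.
    above = operator.le if opener == "[" else operator.lt
    below = operator.le if closer == "]" else operator.lt
    normalized = "%s%d to %d%s" % (opener, low, high, closer)
    return normalized, lambda x: above(low, x) and below(x, high)
```

The first element of the result is what would go back into the input
field;  the second is the test applied to the values.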
Longing for comments!