> Think about rates, scientific parameters or small prices

OK, you're right; I was thinking along the lines of identifiers or at
least enumerable types (which floating point numbers are not).

With whitespace following, a comma could some day be accepted as an
operator (once we have a solution which accepts formatted input, e.g.
with decimal commas); AFAIK, punctuation marks (such as commas in
enumerations) are separated by whitespace from the following text in all
natural languages using Latin letters anyway.
> supporting "and" for intervals will confuse the typical user

Agreed. Maybe we can some day come up with a suitable keyword or symbol
for 'intersection'.

> Of course the same syntax should be generalized for date values, too.

Agreed.

This raises the problem of compatibility with the existing syntax, which
must be handled somehow. However, I don't insist on total compatibility,
since the existing syntax doesn't support enumerations, and IMO we need
them.
Thus, here is my new proposal:
expression         resulting set      remarks
==========         =============      =======
1; 2; 3            1, 2, 3            no parentheses nor 'to'
                                      `-> enumeration (';' is 'or')
1 to 5             1, 2, 3, 4, 5      'to' -> range
1; 5               1, 5               <> existing date syntax!
[1; 5]             1, 2, 3, 4, 5      parentheses -> range
                                      (';' is 'to')
1 to 3; 6          1, 2, 3, 6
[1; 5)             1, 2, 3, 4         like current "1; 5" for dates
(1; 5)             2, 3, 4            for consistency
[1; 3] or (6; 9)   1, 2, 3, 7, 8
[3;]               3, 4, 5 ...
(3;)               4, 5, 6 ...
(3;) or 1          1, 4, 5, 6 ...
(3;);1             1, 4, 5, 6 ...
With comma support (whitespace after semicolon is optional; whitespace
after comma is significant):
1, 2, 3            1, 2, 3            natural language
1,2,3              ERROR              possible ambiguity
1,5                ERROR              possible ambiguity
(1,5)              ERROR              possible ambiguity
(1, 5)             2, 3, 4
[1,]               1, 2, 3, ...
(all examples with ";" should work, with ";\s*" replaced by ", ";
commas raise errors unless followed by whitespace or parentheses)
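The comma rule can be sketched as a small pre-processing step that runs before the real parser. This is only a sketch: `commas_to_semicolons` is a hypothetical helper, and it reads "parentheses" as any opening or closing bracket, which is an assumption.

```python
import re

# A comma is accepted only when followed by whitespace or a bracket.
# Anything else (e.g. "1,5", which could be a decimal comma) is rejected.
# A trailing comma at the very end of the input is rejected as well.
_BAD_COMMA = re.compile(r",(?![\s()\[\]{}])")


def commas_to_semicolons(expression):
    """Reject ambiguous commas, then rewrite the rest as semicolons."""
    match = _BAD_COMMA.search(expression)
    if match:
        raise ValueError(
            "ambiguous comma at position %d in %r"
            % (match.start(), expression)
        )
    return expression.replace(",", ";")


print(commas_to_semicolons("1, 2, 3"))   # 1; 2; 3
print(commas_to_semicolons("[1,]"))      # [1;]
```

After this step, only the semicolon syntax would have to be parsed.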
Thus, the 'to' keyword would imply a closed range, *including* the
boundary values: "a to b" is the same as "[a; b]" (or "[a to b]").
Logic:
- parentheses are considered first:
  * every parenthesis is unambiguously opening ('(', '[', perhaps '{')
    or closing
  * assumptions about the possible content are implied (ranges, open
    or closed; perhaps '{' enforcing non-range)
- 'to' implies a range; unless the nature of the range is determined
  by parentheses, it implies a *closed* range
- 'or' implies alternatives
- every ';' can be replaced by 'to' (range meaning) or 'or'
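To make these rules concrete, here is a minimal sketch of a parser for integer values. All names (`parse_expression`, `_TERM`, ...) are hypothetical, and it assumes that a bracketed term holds exactly one range. It does not handle '{', whitespace as an operator, or 'or' inside parentheses.

```python
import re

# One term: a bracketed range such as "[1; 5)" or "(3;)", a bare closed
# range such as "1 to 5", or a single integer.
_TERM = re.compile(r"""
    \s*(?:
        (?P<open>[\[(]) \s* (?P<low>-?\d+)? \s* (?:;|\bto\b) \s*
        (?P<high>-?\d+)? \s* (?P<close>[\])])
      | (?P<first>-?\d+) (?:\s+to\s+(?P<last>-?\d+))?
    )\s*
""", re.VERBOSE)

# Outside of parentheses, ';' and 'or' both mean "alternative".
_SEPARATOR = re.compile(r";|or\b")


def _term_predicate(match):
    if match.group("open"):
        low, high = match.group("low"), match.group("high")
        low = None if low is None else int(low)      # None: no lower limit
        high = None if high is None else int(high)   # None: no upper limit
        low_open = match.group("open") == "("
        high_open = match.group("close") == ")"

        def in_range(x):
            if low is not None and (x < low or (low_open and x == low)):
                return False
            if high is not None and (x > high or (high_open and x == high)):
                return False
            return True

        return in_range

    # Bare "a to b" is a closed range; a single value is the range "a to a".
    first = int(match.group("first"))
    last = match.group("last")
    last = first if last is None else int(last)
    return lambda x: first <= x <= last


def parse_expression(expression):
    """Return a predicate that is true for every value in the expression."""
    predicates = []
    pos = 0
    while True:
        term = _TERM.match(expression, pos)
        if not term:
            raise ValueError("expected a value or range at position %d" % pos)
        predicates.append(_term_predicate(term))
        pos = term.end()
        if pos == len(expression):
            break
        separator = _SEPARATOR.match(expression, pos)
        if not separator:
            raise ValueError("missing operator at position %d" % pos)
        pos = separator.end()
    return lambda x: any(p(x) for p in predicates)


matches = parse_expression("[1; 3] or (6; 9)")
print([x for x in range(12) if matches(x)])   # [1, 2, 3, 7, 8]
```

With this sketch, "1 to 3 to 5" and "(3;)1" both raise ValueError ("missing operator").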
More examples:
expression         resulting set      remarks
==========         =============      =======
1 to 3; 5          1, 2, 3, 5
1 to 3 to 5        ERROR
[1; 3] or 5        1, 2, 3, 5
[1; 3]; 5          1, 2, 3, 5
(1; 3) or 5        2, 5
1 or 3 or 5        1, 3, 5            {without parentheses,
1; 3 or 5          1, 3, 5             ";" means 'or'}
To be discussed:
(1 to 3; 5)        1, 2, 3, 5         internally turned to
                                      ((1 to 3) or 5)
1 to 3 or 5        1, 2, 3, 5         like "(1 to 3) or 5"
[1 or 3]           1, 3               because of "[1; 3]", which
                                      includes 1 and 3 as well
(1 or 3)           1, 3               should be allowed
{1; 3}             1, 3               non-range implying parentheses
{1 to 3}           ERROR              contradiction
(3;)1              ERROR              missing operator
(1; 3;)            1, 3; or ERROR     ';' before closing parenthesis
                                      could imply "no limit"
(;)                all                as long as no 'and' is implemented,
                                      this will cause the entire
                                      expression to match every value
1 5                1, 5
1;3 5              1, 3, 5
(1;3 5)            ERROR              ambiguity
Some thoughts on dates:
- the syntax is incompatible with the existing one; the current
  "a;b" date expression would need to be expressed as "[a; b)"
- "a to b" (or "[a; b]") would *contain* the records of day b
  (it doesn't currently, does it?), while "[a; b)" would not;
  as long as no hours or minutes (...) are specified, only the date
  parts of the values should be considered
- we should have a configuration switch for date expression
  compatibility:
  * at first, the switch should default to evaluating *date*
    expressions the traditional way (since they might be used by
    scripts, which is not true for non-date range expressions)
  * after a reasonable time, the next major Roundup version
    (Roundup v1.5?) should switch this default to the new syntax
  * again after a reasonable time, the support for old date expressions
    could be removed
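A rough sketch of what such a switch would decide for a bare "a; b" date expression. `bare_date_pair` and its `legacy` flag are hypothetical names, not existing Roundup configuration, and the dates are arbitrary. The legacy branch relies on the reading given above ("a;b" behaves like "[a; b)"), which I have not verified against the current code.

```python
from datetime import date, datetime


def bare_date_pair(first_day, second_day, legacy=True):
    """Return a predicate for the bare expression "first_day; second_day".

    Only the date part of each timestamp is compared, as suggested above
    for expressions that specify no hours or minutes.
    """
    if legacy:
        # Traditional reading: a range that excludes the second day.
        return lambda stamp: first_day <= stamp.date() < second_day
    # Proposed reading: without parentheses, ';' means 'or'.
    return lambda stamp: stamp.date() in (first_day, second_day)


old_style = bare_date_pair(date(2024, 3, 1), date(2024, 3, 5), legacy=True)
new_style = bare_date_pair(date(2024, 3, 1), date(2024, 3, 5), legacy=False)

afternoon_of_day_b = datetime(2024, 3, 5, 14, 30)
print(old_style(afternoon_of_day_b))   # False: day b is excluded
print(new_style(afternoon_of_day_b))   # True: day b is one of the alternatives
```

The same input string matching different records is exactly why the default should only change with a major version.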
A function which takes an expression and returns a function
could/should return a normalized version of the expression as well, e.g.
replacing all semicolons with 'or' or 'to', like this:
    >>> evaluate("[3;8)")
    ("[3 to 8)", lambda x: 3 <= x < 8)
If this normalized version were used in a "refine search" input
field of the result page (or displayed/inserted by some AJAX
functionality), it could make the whole feature quite self-explanatory.
The only difficult bit would be conveying the difference between "open"
("()") and "closed" ("[]") ranges.
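The normalization half of this idea could look roughly like the sketch below. `normalize` is a hypothetical name. It assumes that ';' means 'to' inside parentheses and 'or' outside of them, performs no validation, and ignores the "to be discussed" cases.

```python
import re


def normalize(expression):
    """Spell out every ';' as 'to' (inside parentheses) or 'or' (outside)."""
    pieces = []
    depth = 0
    for char in expression:
        if char in "[(":
            depth += 1
        elif char in "])":
            depth -= 1
        if char == ";":
            pieces.append(" to " if depth > 0 else " or ")
        else:
            pieces.append(char)
    # Collapse runs of whitespace introduced by the replacement.
    text = re.sub(r"\s+", " ", "".join(pieces)).strip()
    # Drop the space right after an opening or before a closing parenthesis.
    return re.sub(r"(?<=[\[(]) | (?=[\])])", "", text)


print(normalize("[3;8)"))      # [3 to 8)
print(normalize("(3;);1"))     # (3 to) or 1
print(normalize("1; 2; 3"))    # 1 or 2 or 3
```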
Longing for comments!