Roundup Tracker - Issues

Message3782

Author tobias-herp
Recipients ajaksu2, ber, richard, schlatterbeck, tobias-herp
Date 2009-07-14.12:09:18
Message-id <1247573360.02.0.452146124266.issue1182919@psf.upfronthosting.co.za>
In-reply-to
Some recapitulation for the start:

- we agree to support ranges with "open interval" edges; otherwise
  a search for "[date1;date2)" (which is the current understanding of
  "date1;date2") won't be possible.
- thus, we consequently need to support
  "(a; b)"           <-->    a <  x <  b
  "[a; b]"           <-->    a <= x <= b
  "[a; b)"           <-->    a <= x <  b
  "(a; b]"           <-->    a <  x <= b
  and
  "(a;)"             <-->    a <  x
  "[a;)"             <-->    a <= x
  "(;b)"             <-->         x <  b
  "(;b]"             <-->         x <= b
  (maybe we can support things like "[;b]" as well, but I'm not sure
  whether we should)
- my proposal is to support commonly recognised set syntax for sets
  ("{1, 2, 5}"); *wherever possible*, separation of values can be
  done with whitespace.
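
To make the table of edge types above concrete, here is a minimal sketch of how one bracketed range could be turned into a membership test. It assumes integer values only; `parse_range` and the regular expression are hypothetical illustrations, not existing Roundup code. On an open-ended side the bracket is simply ignored.

```python
import re
from typing import Callable, Optional

# Hypothetical pattern for one bracketed range such as "[2; 5)" or "(;7]".
_RANGE = re.compile(r"^\s*([\[(])\s*([^;]*?)\s*;\s*([^;]*?)\s*([\])])\s*$")


def parse_range(expr: str) -> Callable[[int], bool]:
    """Return a predicate telling whether an integer lies in the range."""
    m = _RANGE.match(expr)
    if m is None:
        raise ValueError(f"not a bracketed range: {expr!r}")
    left, lo_text, hi_text, right = m.groups()
    # An empty side means "no bound on this side".
    lo: Optional[int] = int(lo_text) if lo_text else None
    hi: Optional[int] = int(hi_text) if hi_text else None

    def contains(x: int) -> bool:
        # "(" excludes the lower bound itself, "[" includes it.
        if lo is not None and (x < lo or (left == "(" and x == lo)):
            return False
        # ")" excludes the upper bound itself, "]" includes it.
        if hi is not None and (x > hi or (right == ")" and x == hi)):
            return False
        return True

    return contains


print(parse_range("[2; 5)")(5))   # False: upper edge is open
print(parse_range("(2; 5]")(5))   # True: upper edge is closed
print(parse_range("[9;)")(1000))  # True: no upper bound
```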

Now for the compatibility and current date search issue.

I think we agree that, throughout *the new* search syntax, the default
nature of ranges (which are specified without braces and thus cannot be
combined with other ranges and sets) should be as consistent as possible.

But your proposal to make "[a; b)" the default behaviour would imply
that users searching for "1; 5" would get {1, 2, 3, 4}, but not 5.

For computer scientists, this might look logical, since they are used to
this kind of indexing, e.g. from Python slices.  But for the ordinary user
who issues a search request (and, for this use case -- think of ids --
the ordinary computer scientist as well), it is not; we would expect to
get "5" as well, and if it is missing, rather assume that "5" doesn't exist.

Think of someone who is about to search for "2 5"; then he changes his
mind and decides, just to be sure, to check for the values in between as
well: he changes his search expression to "2; 5".  This is the kind of
thing that should work as expected.
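
The difference between the two defaults can be shown in a few lines of Python. This is only an illustration of the two readings of an unbracketed range; the function names are made up for this example.

```python
from typing import List


def half_open(a: int, b: int) -> List[int]:
    """Ids matched if "a; b" meant a <= x < b (like a Python slice)."""
    return list(range(a, b))


def inclusive(a: int, b: int) -> List[int]:
    """Ids matched if "a; b" meant a <= x <= b."""
    return list(range(a, b + 1))


picked = {2, 5}                         # the original search "2 5"
print(half_open(1, 5))                  # [1, 2, 3, 4]
print(inclusive(1, 5))                  # [1, 2, 3, 4, 5]
print(picked <= set(inclusive(2, 5)))   # True
print(picked <= set(half_open(2, 5)))   # False: "2; 5" would lose id 5
```

Only the inclusive reading guarantees that widening "2 5" to "2; 5" never drops a value the user had already asked for.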

The current date syntax is inappropriate for numbers; the open-interval
edge default was chosen with past-only dates in mind.  Most date fields
won't ever be set to a future value, and so this was accepted.  But it
is not a good idea to keep this default in the future and carry it over
to numbers and ids.

Question: is it currently possible to search for "-1d; ." and get all
values in the 48-hour range of yesterday and today?  IMO, this search
should be possible.

We can leave it to the admin whether and when to switch date search
syntax to the new logic (we should), and we can provide a simple script
to convert all old search expressions in stored searches to new ones (we
should, as well).
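
A conversion script of that kind could be quite small. The following is a rough sketch under the assumption stated at the top of this message, namely that an old "date1;date2" is understood as "[date1;date2)"; `convert_old_range` is a hypothetical helper, and how stored queries are actually read and written back is left out entirely.

```python
def convert_old_range(expr: str) -> str:
    """Rewrite an old-style "a;b" range as an explicit new-style range.

    Assumption: the old syntax means a <= x < b, so the result is "[a;b)".
    An open start is written with "(" because "[;b" is not among the
    forms proposed above.
    """
    text = expr.strip()
    if ";" not in text or text[0] in "[({":
        return text  # single value, or already in the new notation
    start, end = (part.strip() for part in text.split(";", 1))
    left = "[" if start else "("
    return f"{left}{start};{end})"


print(convert_old_range("2009-06-01;2009-06-02"))  # [2009-06-01;2009-06-02)
print(convert_old_range("-3d;"))                   # [-3d;)
print(convert_old_range(";."))                     # (;.)
```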

Just to clarify, in reply to what you have written:
> One syntax. Which passes the old unittests. Otherwise users
> will have to change *all* their stored queries.

- it is not necessary for the *new* syntax to pass the *old*
  unittests; the *old* syntax must pass the *new* unittest.
  Otherwise we couldn't introduce a new syntax at all
  (which is what we are about to do).
- the old syntax will continue to be valid and, perhaps
  (if "." keeps its meaning of current date and time),
  would yield the same results.

> Roundup is deployed for large installations.
> - Think about the python.org tracker
> - Think about a customer of mine with >200 users
>   and *lots* of stored queries.

Yes; both cases are covered nicely by my proposal about a switch and a
conversion script.

> But I think we can achieve that when using semicolon for ranges.

Of course; that's what we do: we use semicolon for ranges.

We *shouldn't* tie ourselves to a default understanding of ranges which
was suitable for dates (given past-only values!) and thereby bugger up
the numeric and id search.  Days are ranges by nature (24-hour ranges),
and numbers and ids are not.

> If you're referring to example date lets use a concrete example:
>  2009-06-01;2009-06-02
> includes all times after 2009-06-01 midnight (including midnight)
> up to and *not* including 2009-06-02 midnight.

If I understand this right, "midnight" is 0:00, and thus the specified
range spans 24 hours.

- What if someone searches for "2009-06-01"; wouldn't this
  find the same range? (IMO, it should!)
- Is this really an issue for stored queries
  (given that we can convert those)?
  I'd reckon most of these search for relatively specified
  ranges like "-3d;" or "-1d;." which would make the whole problem
  a non-issue.

We have the chance to introduce more convenient semantics. "." will
continue to mean "now", including the time.

For ranges, we should consider days atomic (unless a time is specified).
Carrying over our range considerations from numbers to atomic days, I get:
- "[day1; ..."   means   *including* day1, thus "day1, 0:00" <= x
- "... ;day2]"   means   *including* day2, thus x <= "day2, 24:00"
                      (or, with databases in mind, x <= date(day2))
- "(day1; ..."   means   *excluding* day1, thus date(day1)   <  x
- "... ;day2)"   means   *excluding* day2, thus x  < "day2, 0:00"
                      (or, with databases in mind, x < date(day2))
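
The four rules above can be sketched as a single test on the date part of a timestamp. This follows the "with databases in mind" reading, i.e. the time of day is dropped before comparing, so the instant "day2, 24:00" itself counts as the next day. `in_day_range` and its argument layout are assumptions made for this example only.

```python
from datetime import date, datetime
from typing import Optional


def in_day_range(x: datetime, left: str, day1: Optional[date],
                 day2: Optional[date], right: str) -> bool:
    """Test a timestamp against a day range with atomic days.

    left is "[" or "(", right is "]" or ")"; None means no bound.
    """
    d = x.date()  # atomic days: ignore the time of day
    if day1 is not None and (d < day1 or (left == "(" and d == day1)):
        return False
    if day2 is not None and (d > day2 or (right == ")" and d == day2)):
        return False
    return True


june1, june2 = date(2009, 6, 1), date(2009, 6, 2)
late = datetime(2009, 6, 2, 23, 59)
print(in_day_range(late, "[", june1, june2, "]"))  # True: day2 included
print(in_day_range(late, "[", june1, june2, ")"))  # False: day2 excluded
```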

Currently, if I want to search for records from day1 to day2, I *must*
specify day1 and day2+1; for scripts, this requires date calculations.

To search for the records of 2009-06-01 and 2009-06-02, I must currently
specify
  "2009-06-01;2009-06-03"
This is frankly a PITA.  Given the choice, I'm sure almost everyone
(including the computer scientists) would prefer to use inclusive ranges
for dates.
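
For illustration, this is the kind of date calculation a script has to do today to cover day1 through day2. It assumes the current half-open understanding of "date1;date2"; `current_style_range` is a made-up name.

```python
from datetime import date, timedelta


def current_style_range(day1: date, day2: date) -> str:
    """Build an old-style range that covers day1 through day2 inclusive."""
    end = day2 + timedelta(days=1)  # the upper edge is excluded, so add a day
    return f"{day1.isoformat()};{end.isoformat()}"


print(current_style_range(date(2009, 6, 1), date(2009, 6, 2)))
# 2009-06-01;2009-06-03
```

With inclusive day ranges the script could pass day1 and day2 through unchanged.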

For convenience, we could have a symbol 'now' which is equivalent to "."
(there is an SQL function "NOW()").  And we could have a symbol "*",
meaning (all of) today.

Then we could search for...
- ;.         (like current ";." -- different in theory,
             but not in practice)
- ;now       (the same; more explicitly saying "this includes time")
- ;*         (all up to today, 24:00)
- (;*)       (all before today, 0:00; remember atomic days)
- ;-1d       (the same, expressed as "all up to yesterday")

For sets of dates, there are several possibilities. I already called the
comma an 'or' operator; it can be used without any problem e.g. to make

  {1, 2, 5}, [9;)

the same as

  {1, 2, 5} or [9;)

Since a set can consist of a single element, date sets can be noted as
unions of sets:

  {date1} or {date2}
  {date1}, {date2}
or simply (dates can't contain commas, can they?):
  {date1, date2}

There is no problem with

  date1, date2

(no semicolon there, thus it can't be a range), and we can even support

  date1 or date2

... which some might prefer, being an even more human-readable variant.
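
A minimal sketch of how such unions might be evaluated, restricted to integers and to braced sets and bracketed ranges as terms. `parse_union` and its helpers are hypothetical; bare values such as "date1, date2" and real date parsing are deliberately left out.

```python
import re
from typing import Callable, Optional

# One term is either a braced set "{...}" or a bracketed range "[a;b)".
_TERM = re.compile(r"\{[^{}]*\}|[\[(][^\[\](){}]*[\])]")


def _set_pred(term: str) -> Callable[[int], bool]:
    # Values may be separated by commas, whitespace, or the word "or".
    tokens = term[1:-1].replace(",", " ").split()
    values = {int(tok) for tok in tokens if tok != "or"}
    return lambda x: x in values


def _range_pred(term: str) -> Callable[[int], bool]:
    left, right = term[0], term[-1]
    lo_text, sep, hi_text = term[1:-1].partition(";")
    if not sep:
        raise ValueError(f"range without semicolon: {term!r}")
    lo: Optional[int] = int(lo_text) if lo_text.strip() else None
    hi: Optional[int] = int(hi_text) if hi_text.strip() else None

    def contains(x: int) -> bool:
        if lo is not None and (x < lo or (left == "(" and x == lo)):
            return False
        if hi is not None and (x > hi or (right == ")" and x == hi)):
            return False
        return True

    return contains


def parse_union(expr: str) -> Callable[[int], bool]:
    """Combine terms joined by "," or "or" into one membership test."""
    terms = _TERM.findall(expr)
    leftover = _TERM.sub(" ", expr).replace(",", " ").split()
    if not terms or any(tok != "or" for tok in leftover):
        raise ValueError(f"cannot parse: {expr!r}")
    preds = [_set_pred(t) if t[0] == "{" else _range_pred(t) for t in terms]
    return lambda x: any(p(x) for p in preds)


match = parse_union("{1, 2, 5}, [9;)")
print([x for x in range(12) if match(x)])  # [1, 2, 5, 9, 10, 11]
```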

(We don't necessarily need to support the 'or' inside the braces, but it
is logical: "{date1, date2}" -- or "{date1 or date2}" -- is the same as
"{date1} or {date2}")

>> Let's take the opportunity to do it right. (...)
>
> Fine. But lets start with a specification not an implementation.

That's why we are discussing the specification ;-)
(Of course, "it can't be done" would be a killer argument. But it
clearly *can* be done).