functions are undefined and the array_of_scalars
should reasonably contain a NaN (“not a number”, the
IEEE 754 floating-point value commonly used to mark
missing values) at the corresponding position (or an
equivalent value, if NaN is not supported). One
exception is the counting function “n”, which returns
a zero in this case. Another special case is the
corrected sample standard deviation, “SD”, which is
also undefined for a single measurement. In this
case, one could either consistently return a NaN or
revert to the uncorrected sample standard deviation,
which is zero in this case.
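The handling of empty selections described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `aggregate`, the flag `sd_fallback_to_uncorrected`, and the restriction to the keywords “n”, “mean”, “median”, and “SD” are assumptions made for the example.

```python
import math
import statistics


def aggregate(name, values, sd_fallback_to_uncorrected=False):
    """Evaluate one statistical function on a possibly empty list of scalars.

    Hypothetical helper: the name, signature, and keyword set are
    illustrative assumptions, not part of the specification.
    """
    n = len(values)
    if name == "n":
        # Counting is always defined; an empty selection yields zero.
        return n
    if n == 0:
        # All other functions are undefined on an empty selection.
        return math.nan
    if name == "mean":
        return statistics.fmean(values)
    if name == "median":
        return statistics.median(values)
    if name == "SD":
        if n == 1:
            # Corrected SD is undefined for one measurement: either return
            # NaN or fall back to the uncorrected SD, which is zero here.
            return 0.0 if sd_fallback_to_uncorrected else math.nan
        return statistics.stdev(values)  # corrected (n - 1) estimator
    raise ValueError(f"unknown function keyword: {name!r}")
```

Whichever option is chosen for “SD” on a single measurement, it should be applied consistently across all requests.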
2.5 Conditions
Each individual condition can be either true or false
with respect to a single scalar measurement. It
consists of a keyword from the
list_of_condition_keywords supplemented with a
certain number of parameters. We use the following
syntax for such parameterized keywords:
condition_keyword(parameter1, parameter2, …,
parameterN)
Depending on the particular implementation, other
syntactic formulations may be used just as well.
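As an illustration of the syntax above, a parameterized keyword can be split into its keyword and parameter list with a few lines of code. The sketch below is an assumption-laden example: `parse_condition` is a hypothetical name, parameters are kept as raw strings, and parameters that themselves contain commas (such as coordinate tuples) are not handled.

```python
import re

# keyword, optionally followed by a parenthesized parameter list
_CONDITION_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?:\((.*)\))?\s*$")


def parse_condition(text):
    """Split 'keyword(p1, p2, ...)' into (keyword, [p1, p2, ...]).

    Hypothetical helper; parameters are returned as stripped strings and
    are not validated against list_of_condition_keywords.
    """
    match = _CONDITION_RE.match(text)
    if match is None:
        raise ValueError(f"not a valid condition: {text!r}")
    keyword, raw_params = match.group(1), match.group(2)
    if raw_params is None or raw_params.strip() == "":
        return keyword, []  # keyword without parameters
    return keyword, [p.strip() for p in raw_params.split(",")]
```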
2.5.1 Temporal Constraints
Conditions related to the time of the measurements
will mostly have one or two parameters, specifying a
point in time or a time range, e.g.
• “time_of_day(10:00, 12:00)” – to select data
measured between 10:00 and 12:00. More precisely, all
data measured at times t fulfilling the condition
10:00 ≤ t < 12:00. Note that t=12:00 is excluded
here to prevent the same data appearing twice in
selections like “time_of_day(10:00, 12:00)” and
“time_of_day(12:00, 14:00)”.
• “time_of_day(10:00)” – to select data from a
single point in time, here 10:00.
Similarly, “day_of_week(Mon, Fri)” would restrict
data to be chosen between Monday and Friday and
“day_of_week(Mon)” would restrict data to be
chosen from Monday only. In all cases except
“time_of_day”, the second argument is meant to be
inclusive, e.g. “year(2012, 2014)” selects data from
years 2012, 2013, and 2014. Accordingly, the filters
“day_of_month”, “week_of_year”, and “month_of_year”
can take one or two integers as parameters,
whereas the filter “last_n_days” has only one
(positive) integer as parameter.
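The difference between the half-open range of “time_of_day” and the inclusive ranges of the other temporal filters can be made concrete with a short sketch. The Python function signatures below are illustrative assumptions; in particular, the single-parameter form of `time_of_day` is assumed to require an exact match, and weekdays are passed as integers (1 for Monday to 7 for Sunday) rather than as names.

```python
from datetime import datetime, time


def time_of_day(t, start, end=None):
    """True if start <= t < end (end excluded); exact match if end is omitted."""
    if end is None:
        return t.time() == start  # assumption: exact match, no tolerance
    return start <= t.time() < end


def year(t, first, last=None):
    """True if first <= year <= last (second argument inclusive)."""
    last = first if last is None else last
    return first <= t.year <= last


def day_of_week(t, first, last=None):
    """True if first <= weekday <= last, with Monday = 1 and Sunday = 7."""
    last = first if last is None else last
    return first <= t.isoweekday() <= last
```

With these definitions, a measurement taken at exactly 12:00 is selected by “time_of_day(12:00, 14:00)” but not by “time_of_day(10:00, 12:00)”, so adjacent selections never overlap.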
As an alternative, in accordance with ISO 8601,
one might also consider defining all times uniformly
in terms of single numbers: time in the format
HH:MM could be defined as HHMM (or
HH:MM:SS as HHMMSS). The days of the week
could be enumerated from 1 (Monday) to 7
(Sunday).
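A minimal sketch of this numeric encoding is given below. The helper name `time_as_number` is an assumption made for illustration; the weekday numbering relies on Python's standard `isoweekday()`, which follows the ISO 8601 convention of 1 for Monday to 7 for Sunday.

```python
from datetime import date, time


def time_as_number(t, with_seconds=False):
    """Encode a time of day as the integer HHMM, or HHMMSS if requested.

    Hypothetical helper: leading zeros are lost in the integer form,
    so 09:05 becomes 905.
    """
    number = t.hour * 100 + t.minute
    if with_seconds:
        number = number * 100 + t.second
    return number


def weekday_as_number(d):
    """Return the ISO 8601 weekday number: 1 (Monday) to 7 (Sunday)."""
    return d.isoweekday()
```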
The following conditions implement special
functions:
• “continuous_binning(start_time, time_interval,
end_time)” – this specifies a whole list of
consecutive time intervals of length
time_interval (in seconds), starting at a given
point in time, start_time, and ending at end_time.
These timestamps can be specified, e.g.,
according to RFC 3339 or ISO 8601, as already
suggested in section 2.2. We recommend using
“continuous_binning” as the sole element in
either list_of_conditions1 or list_of_conditions2.
It is useless in list_of_conditions0 (where it
should be evaluated consistently as an
unfulfillable condition). Aside from statistical
assessments, continuous binning can be used
with “mean” or “median” to (sub)sample the
measurement data equidistantly and thus to obtain
a regularized representation of the data.
• “all” – always fulfilled, to impose no restrictions
(has no parameters).
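The use of “continuous_binning” together with “mean” to resample data onto a regular grid can be sketched as follows. Several details are assumptions made for the example and are not fixed by the text: the function names, the half-open bins [start, end) chosen by analogy with “time_of_day”, the truncation of a final bin that would extend past end_time, and the NaN returned for empty bins.

```python
import math
import statistics
from datetime import datetime, timedelta, timezone


def continuous_bins(start_time, interval_seconds, end_time):
    """Return consecutive (bin_start, bin_end) pairs covering the range.

    Assumption: the last bin is truncated if the range is not a whole
    multiple of the interval.
    """
    step = timedelta(seconds=interval_seconds)
    bins = []
    lower = start_time
    while lower < end_time:
        upper = min(lower + step, end_time)
        bins.append((lower, upper))
        lower = upper
    return bins


def binned_mean(samples, bins):
    """Mean of the values in each bin; NaN where a bin holds no data.

    samples is a list of (timestamp, value) pairs; bins are treated as
    half-open intervals (assumption).
    """
    result = []
    for lower, upper in bins:
        values = [v for t, v in samples if lower <= t < upper]
        result.append(statistics.fmean(values) if values else math.nan)
    return result
```

Applied to irregularly spaced samples, `binned_mean` yields one value per interval, which is the regularized representation mentioned above.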
2.5.2 Spatial Constraints
Conditions related to the location of the
measurements can be defined, e.g., as follows:
• “within_distance_of(location, 0, 20)” – to select
data measured at a distance between 0 and 20
meters of a certain location. A location can be
specified as appropriate, for example as
Cartesian coordinates or in terms of latitude,
longitude, and elevation. Alternatively, the server
could also provide a list of names (text strings)
defining particular locations of interest in the
“locations” (key, value)-pair, each name being
accepted as a valid location in
“within_distance_of”.
• “within_area_of(area)” – to select data measured
within a defined area. As stated in section 2.2, an
area can be specified, for example, as a polygon
on a two-dimensional surface or just by a name
(text string) provided in the “areas” (key, value)-
pair, implicitly defining a particular area of
interest.
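Both spatial conditions can be sketched for the simplest case of Cartesian coordinates. The code below is an illustration under stated assumptions: the Python signatures are hypothetical, both distance bounds are taken as inclusive, the area is a simple polygon given as a list of (x, y) vertices, and the point-in-polygon test uses the standard ray-casting technique, which the text does not prescribe. Points lying exactly on the polygon boundary are not treated specially.

```python
import math


def within_distance_of(point, location, d_min, d_max):
    """True if the Euclidean distance from point to location lies in
    [d_min, d_max] (both bounds inclusive by assumption)."""
    return d_min <= math.dist(point, location) <= d_max


def within_area_of(point, polygon):
    """Ray-casting test: True if point lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the closing edge back
    to the first vertex is implied.
    """
    x, y = point
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Does this edge cross the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right toggles
    return inside
```

Named locations or areas, as provided in the “locations” and “areas” (key, value)-pairs, could be resolved to coordinates or polygons by a simple lookup before calling these functions.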
KMIS 2015 - 7th International Conference on Knowledge Management and Information Sharing