Knowledge Discovery from Time Series

Outline of the Approach

In the DFG (Deutsche Forschungsgemeinschaft) project Kl 648/1 we were concerned with the question of how to learn rules about qualitative / semi-quantitative dependencies in multivariate time series (keywords: knowledge discovery from time series, rule discovery in time series, temporal pattern discovery). The approach can be sketched as follows: First, the time series are segmented and thereby transformed into sequences of labeled intervals. The labels denote qualitative aspects of the signal in the respective intervals. Then, from the sequence of labeled intervals, we discover temporal patterns that occur more often than a certain threshold. The temporal patterns are sets of intervals, where Allen's interval logic is used to capture their temporal relationships. From these patterns, rules can be derived with temporal patterns in the premise and conclusion. Rules may be specialized with respect to numerical attributes like the length of the intervals or the slope of the signal within the interval. Finally, we obtain rules like "when signal A decreases while signal B increases with slope greater than 2, then signal C will decrease". Since humans use a similar syntax when discussing such aspects, the proposed methodology may support a human in learning from temporal data. New insights into the examined time series may then motivate extracting different features from the time series, in which case the process is restarted.
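The first step, turning a time series into a sequence of labeled intervals, can be illustrated with a minimal sketch. It is not the abstraction method used in the project (which relies on scale-space techniques); it simply labels each step by the sign of the difference between neighbouring samples and merges equal labels. The function name `segment_by_trend`, the tolerance `eps`, and the label names are assumptions made for this illustration.

```python
from typing import List, Tuple

# (start index, end index, label); hypothetical representation for this sketch
LabeledInterval = Tuple[int, int, str]


def segment_by_trend(values: List[float], eps: float = 0.0) -> List[LabeledInterval]:
    """Split a series into maximal runs with the same qualitative trend."""

    def label(delta: float) -> str:
        # eps is an assumed tolerance below which a change counts as "constant"
        if delta > eps:
            return "increasing"
        if delta < -eps:
            return "decreasing"
        return "constant"

    intervals: List[LabeledInterval] = []
    for i in range(1, len(values)):
        lab = label(values[i] - values[i - 1])
        if intervals and intervals[-1][2] == lab:
            # Same trend as the previous step: extend the current interval.
            start, _, _ = intervals[-1]
            intervals[-1] = (start, i, lab)
        else:
            # Trend changed: open a new interval starting at the previous sample.
            intervals.append((i - 1, i, lab))
    return intervals


series = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.5]
segments = segment_by_trend(series)
for start, end, lab in segments:
    print(f"[{start}, {end}] {lab}")
```

On real, noisy data such a naive differencing scheme would produce many short intervals, which is one reason why a more robust abstraction step is needed in practice.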

The approach can also be applied to sequential data other than time series (for instance biological sequences, medical profiles, etc.). For sequential learning the time series abstraction step has to be replaced by an appropriate procedure that yields a sequence of intervals. Very often no such method is necessary because the data is already in this format (e.g. diseases of a patient, insurance contracts, periods in which a certain DNA sequence occurs, etc.).

The figure above depicts the approach graphically. The arrows indicate the processing steps necessary to reach the next representation. Terms with question marks indicate important points that need special attention in the respective step.

The approach utilizes techniques from artificial intelligence, machine learning, data mining and signal processing. This project was funded by the DFG (Deutsche Forschungsgemeinschaft) under grant Kl-648.
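To make the temporal relationships between labeled intervals concrete, the following sketch classifies a pair of intervals into one of the thirteen relations of Allen's interval logic. It is an illustration only, not the project's implementation; the function name `allen_relation`, the tuple representation of intervals, and the relation names as strings are assumptions.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (start, end) with start < end; assumed representation


def allen_relation(a: Interval, b: Interval) -> str:
    """Return the Allen relation that holds between interval a and interval b."""
    a_start, a_end = a
    b_start, b_end = b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")

    # Disjoint or merely touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"

    # From here on the intervals share more than a single point.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if a_start < b_start:
        return "overlaps" if a_end < b_end else "contains"
    return "during" if a_end < b_end else "overlapped-by"


# "B starts some time after A" can show up as three different relations:
a = (0.0, 5.0)
relations = [allen_relation(a, b) for b in [(7.0, 10.0), (5.0, 10.0), (3.0, 10.0)]]
print(relations)  # ['before', 'meets', 'overlaps']
```

The usage example shows why a single element of the pattern space may be too specific: small shifts in an interval boundary change the relation, although a human would describe all three cases in the same way.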

To probe further...

You can find my publications related to temporal patterns below. Some of them are available on-line as gzipped PostScript (.ps.gz) and Portable Document Format (.pdf) files.

  • Overview: For a brief overview of the approach, see
    • F. Höppner: Learning Dependencies in Multivariate Time Series. Proc. of the ECAI'02 Workshop on Knowledge Discovery in (Spatio-) Temporal Data, Lyon, France, pp. 25-31, July 2002. [ .ps.gz ] [ .pdf ]
    • F. Höppner: Lernen lokaler Zusammenhänge in multivariaten Zeitreihen. Tagungsband zum 5. Göttinger Symposium Soft Computing, Göttingen, pp. 113-125, June 2002. [ .ps.gz ] [ .pdf ]
  • Pattern Space: To capture the interval relationships we use Allen's interval logic. A pattern thus consists of a set of intervals, their labels, and their interval relationships (like before, meets, overlaps, etc.). Sometimes, the true patterns in the data cannot be represented by a single pattern of our pattern space; for instance, "B starts some time after A" may manifest as the pattern "A before B", "A meets B" or even "A overlaps B". This paper describes an approach to overcome such difficulties:
  • Feature Selection: Which labels do we want to consider? Since humans are used to hierarchically refining contexts, we start with increasing/decreasing or concave/convex labels, which are then (qualitatively or quantitatively) refined during the process. For quantitative constraints see
    • F. Höppner, F. Klawonn: Finding Informative Rules in Interval Sequences. Advances in Intelligent Data Analysis, Proc. of the 4th International Symposium, Lecture Notes in Computer Science 2189, Springer. Lisbon, Portugal, pp. 123-132, Sept. 2001. [ .ps.gz ] [ .pdf ] © Springer
  • Noise Handling: How do we want to distinguish between (possibly non-Gaussian) noise and features of the observed system during time series abstraction? We use scale-space filtering and scale-space lifetime to extract robust and perceptually important features.
  • Feature Ambiguity: It is often not a priori clear which aspects of a time series (at which scale) are of interest for the patterns we want to discover. Therefore, we use a multiscale description to reflect the ambiguity in the labels (a decreasing segment may become an increasing segment if we zoom out).
  • Efficiency: Techniques from association rule mining are adopted to find all patterns that occur more often than a certain threshold. A number of pruning techniques are used to make the process as efficient as possible. See the following references (the last one is the most detailed):
    • F. Höppner: Learning Temporal Rules from State Sequences. IJCAI Workshop on Learning from Temporal and Spatial Data, Seattle, USA, pp. 25-31, 2001. [ .ps.gz ] [ .pdf ]
    • F. Höppner: Discovery of Temporal Patterns - Learning Rules about the Qualitative Behaviour of Time Series. Proc. of the 5th European Conference on Principles and Practice of Knowledge Discovery in Databases, Lecture Notes in Artificial Intelligence 2168, Springer. Freiburg, Germany, pp. 192-203, Sept. 2001. [ .ps.gz ] [ .pdf ] © Springer
    • F. Höppner, F. Klawonn: Learning Rules about the Development of Variables over Time. In: C.T. Leondes (editor): Intelligent Systems - Techniques and Applications, vol IV, CRC Press, 201-228, 2002.
  • Generalization: Some kinds of patterns cannot be expressed by single elements of our pattern space. In this case, we can try to find a set of elements that together approximate the true relationship in the data. A possible approach is described in
© F. Höppner. Last update: Mon, 1 Mar 2010, 07:43:38 CET