1.3. Namespace Dimensionality

class Dimensionality

Khiva Dimensionality class containing several dimensionality reduction methods.

Public Static Functions

static KhivaArray Khiva.Dimensionality.Paa(KhivaArray arr, int bins)

Piecewise Aggregate Approximation (PAA) reduces a time series \(X\) of length \(n\) to a vector \(\bar{X}=(\bar{x}_{1},\ldots,\bar{x}_{M})\) of arbitrary length \(M \leq n\), where each \(\bar{x}_{i}\) is calculated as follows:

\[ \bar{x}_{i} = \frac{M}{n} \sum_{j=\frac{n}{M}(i-1)+1}^{\frac{n}{M}i} x_{j}. \]
In other words, to reduce the dimensionality from \(n\) to \(M\), the original time series is first divided into \(M\) equally sized frames, and then the mean value of each frame is computed. The sequence assembled from these mean values is the PAA approximation (i.e., transform) of the original time series.

Return
An array of points with the reduced dimensionality.
Parameters
  • arr: Set of points.
  • bins: Sets the total number of divisions.
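
The frame-averaging formula above can be sketched in a few lines of plain Python. This is a minimal illustration of the PAA definition, not the Khiva implementation: the function name `paa` is hypothetical, the input is assumed to be a one-dimensional list of values, and the sketch assumes that the series length is an exact multiple of the number of bins.

```python
def paa(series, bins):
    """Reduce `series` (length n) to `bins` frame means.

    Hypothetical helper for illustration; assumes n is divisible by bins.
    """
    n = len(series)
    if bins <= 0 or n % bins != 0:
        raise ValueError("this sketch needs bins > 0 and len(series) divisible by bins")
    frame = n // bins  # number of samples per frame (n / M in the formula)
    # Mean of each consecutive, equally sized frame.
    return [sum(series[i * frame:(i + 1) * frame]) / frame for i in range(bins)]


reduced = paa([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], 4)
print(reduced)  # [0.5, 2.5, 4.5, 6.5]
```

Here \(n = 8\) and \(M = 4\), so every frame holds two samples and each output value is the mean of one pair.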

static KhivaArray Khiva.Dimensionality.Pip(KhivaArray arr, int numberIps)

Calculates the requested number of Perceptually Important Points (PIP) of the time series.

[1] Fu TC, Chung FL, Luk R, and Ng CM. Representing financial time series based on data point importance. Engineering Applications of Artificial Intelligence, 21(2):277-300, 2008.

Return
KhivaArray with the numberIps most Perceptually Important Points.
Parameters
  • arr: Expects an input array whose dimension zero is the length of the time series.
  • numberIps: The number of points to be returned.

static KhivaArray Khiva.Dimensionality.PlaBottomUp(KhivaArray arr, float maxError)

Applies the bottom-up Piecewise Linear Approximation (PLA Bottom-Up) to the time series.

[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.

Return
KhivaArray with the reduced set of points.
Parameters
  • arr: Expects a KhivaArray containing the set of points to be reduced, with the first component of the points in the first column and the second component in the second column.
  • maxError: The maximum approximation error allowed.

static KhivaArray Khiva.Dimensionality.PlaSlidingWindow(KhivaArray arr, float maxError)

Applies the sliding-window Piecewise Linear Approximation (PLA Sliding Window) to the time series.

[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.

Return
KhivaArray with the reduced set of points.
Parameters
  • arr: Expects a KhivaArray containing the set of points to be reduced, with the first component of the points in the first column and the second component in the second column.
  • maxError: The maximum approximation error allowed.

static KhivaArray Khiva.Dimensionality.RamerDouglasPeucker(KhivaArray points, double epsilon)

The Ramer–Douglas–Peucker algorithm (RDP) reduces the number of points in a curve that is approximated by a series of points. It discards points according to their perpendicular distance to the simplified line and the threshold epsilon: the greater epsilon is, the more points are deleted.

[1] Urs Ramer, “An iterative procedure for the polygonal approximation of plane curves”, Computer Graphics and Image Processing, 1(3), 244–256 (1972) doi:10.1016/S0146-664X(72)80017-0.

[2] David Douglas & Thomas Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature”, The Canadian Cartographer 10(2), 112–122 (1973). doi:10.3138/FM57-6770-U75U-7727

Return
KhivaArray with the x-coordinates and y-coordinates of the selected points (x in column 0 and y in column 1).
Parameters
  • points: KhivaArray with the x-coordinates and y-coordinates of the input points (x in column 0 and y in column 1).
  • epsilon: It acts as the threshold value to decide which points should be considered meaningful or not.
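
The role of epsilon can be illustrated with a small recursive sketch in plain Python. It follows the textbook formulation of RDP and is not the Khiva implementation: the names `rdp` and `_perp_distance` are hypothetical, and the points are assumed to be `(x, y)` tuples rather than a two-column KhivaArray.

```python
import math


def _perp_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # a and b coincide: fall back to the plain point-to-point distance.
        return math.hypot(px - ax, py - ay)
    return abs(dy * px - dx * py + bx * ay - by * ax) / length


def rdp(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the segment first-last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perp_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax > epsilon:
        # Keep that point and simplify both halves recursively.
        left = rdp(points[:index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    # Every interior point is within epsilon: keep only the end points.
    return [first, last]


curve = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0), (3.0, 3.0), (4.0, 0.0)]
print(rdp(curve, 0.5))  # [(0.0, 0.0), (2.0, 0.0), (3.0, 3.0), (4.0, 0.0)]
print(rdp(curve, 5.0))  # [(0.0, 0.0), (4.0, 0.0)]
```

With epsilon 0.5 only the nearly collinear point `(1.0, 0.1)` is dropped, while epsilon 5.0 exceeds every distance and collapses the curve to its two end points.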

static KhivaArray Khiva.Dimensionality.SAX(KhivaArray arr, int alphabetSize)

Symbolic Aggregate approXimation (SAX) transforms a numeric time series into a time series of symbols of the same size. The algorithm was proposed by Lin et al. [1] and extends the PAA-based approach, inheriting the simplicity and low computational complexity of the original algorithm while providing satisfactory sensitivity and selectivity in range query processing. Moreover, the use of a symbolic representation opens the door to the existing wealth of data structures and string-manipulation algorithms in computer science, such as hashing, regular expressions, pattern matching, suffix trees, and grammatical inference.

[1] Lin, J., Keogh, E., Lonardi, S. & Chiu, B. (2003) A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. San Diego, CA. June 13.

Return
An array of symbols.
Parameters
  • arr: KhivaArray with the input time series.
  • alphabetSize: Number of elements in the alphabet.
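
The symbol mapping can be sketched as follows, under the usual SAX assumptions from Lin et al.: the series is z-normalised and each value is assigned the index of the equiprobable region of the standard normal distribution it falls into. This is an illustration only and may differ from what Khiva does internally; the function name `sax` is hypothetical, symbols are represented as integers starting at 0, and no PAA step is applied, so the output has the same length as the input.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def sax(series, alphabet_size):
    """Map each value of `series` to a symbol index in [0, alphabet_size)."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    mean = fmean(series)
    std = pstdev(series)
    # z-normalise; a constant series maps to all zeros.
    z = [(x - mean) / std if std > 0 else 0.0 for x in series]
    # Breakpoints splitting N(0, 1) into alphabet_size equiprobable regions.
    normal = NormalDist()
    breakpoints = [normal.inv_cdf(k / alphabet_size) for k in range(1, alphabet_size)]
    # The symbol is the number of breakpoints at or below the value.
    return [bisect_right(breakpoints, value) for value in z]


print(sax([-2.0, -1.0, 0.0, 1.0, 2.0], 3))  # [0, 0, 1, 2, 2]
```

For an alphabet of size 3 the breakpoints lie at roughly ±0.43, so the normalised values below, between, and above them receive the symbols 0, 1, and 2 respectively.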

static KhivaArray Khiva.Dimensionality.Visvalingam(KhivaArray points, int numPoints)

Reduces a set of points by applying the Visvalingam method (minimum triangle area) until the number of points is reduced to numPoints.

[1] M. Visvalingam and J. D. Whyatt, Line generalisation by repeated elimination of points, The Cartographic Journal, 1993.

Return
KhivaArray with the x-coordinates and y-coordinates of the selected points (x in column 0 and y in column 1).
Parameters
  • points: KhivaArray with the x-coordinates and y-coordinates of the input points (x in column 0 and y in column 1).
  • numPoints: Sets the number of points returned after the execution of the method.
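
The elimination loop can be sketched in plain Python as shown below. This is a simple quadratic-time illustration of the minimum-triangle-area idea, not the Khiva implementation: the names `visvalingam` and `_triangle_area` are hypothetical, points are assumed to be `(x, y)` tuples, and the two end points are always kept.

```python
def _triangle_area(a, b, c):
    """Area of the triangle formed by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def visvalingam(points, num_points):
    """Drop the least significant interior point until num_points remain."""
    if num_points < 2:
        raise ValueError("this sketch keeps both end points, so num_points >= 2")
    pts = list(points)
    while len(pts) > num_points:
        # Area of the triangle each interior point forms with its neighbours.
        areas = [_triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(range(len(areas)), key=areas.__getitem__)
        del pts[smallest + 1]  # +1 because areas[0] belongs to pts[1]
    return pts


line = [(0.0, 0.0), (1.0, 0.1), (2.0, 2.0), (3.0, 0.2), (4.0, 0.0)]
print(visvalingam(line, 3))  # [(0.0, 0.0), (2.0, 2.0), (4.0, 0.0)]
```

In this example `(3.0, 0.2)` is removed first (area 0.8), then `(1.0, 0.1)` (area 0.9), leaving the peak and the two end points. Recomputing all areas on every pass keeps the sketch short; a priority queue would be the usual choice for long series.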