Estimators
The Estimators module contains a variety of statistical estimators that can be applied to multivariate datasets.
- conaction.estimators.angular_disimilarity(X: ndarray) float64
Computes the multilinear angular dissimilarity. When given an m x 2 data matrix, it is equivalent to the angular distance.
This function computes
\[\text{angular dissimilarity} \triangleq \frac{\theta}{\pi}\]
where \(\theta\) is the arccosine of the reflective correlation coefficient.
- Parameters:
X (array-like) – m x n data matrix
- Returns:
angular dissimilarity
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import angular_disimilarity
>>> data = np.arange(100).reshape(10,10)
>>> angular_disimilarity(data)
0.00981604173368436
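The definition above can be sketched in plain NumPy. This is an illustration only, assuming the reflective correlation is the uncentered moment ratio documented for reflective_correlation; the names reflective_r and angular_dissimilarity_sketch are hypothetical and are not part of conaction.

```python
import numpy as np

def reflective_r(X):
    """Uncentered n-ary correlation: E[prod_j X_j] / prod_j (E|X_j|^n)^(1/n)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    numerator = np.mean(np.prod(X, axis=1))
    denominator = np.prod(np.mean(np.abs(X) ** n, axis=0) ** (1.0 / n))
    return numerator / denominator

def angular_dissimilarity_sketch(X):
    """theta / pi, where theta is the arccosine of the reflective correlation."""
    # Clip to guard against rounding slightly outside [-1, 1].
    r = np.clip(reflective_r(X), -1.0, 1.0)
    return np.arccos(r) / np.pi

# Two orthogonal columns: theta = pi / 2, so the dissimilarity is 0.5.
orthogonal = np.array([[1.0, 0.0], [0.0, 1.0]])
half = angular_dissimilarity_sketch(orthogonal)
```

For an m x 2 matrix the reflective correlation reduces to the cosine similarity of the two columns, which is why the result matches the usual angular distance in that case.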
- conaction.estimators.circular_correlation(X: ndarray) float64
This function calculates the n-ary circular correlation coefficient. When given an m x 2 data matrix, it is equivalent to the circular correlation coefficient.
This function estimates
\[R_c \left[ X_1, \cdots, X_n \right] = \frac{\mathbb{E} \left[ \prod_{j=1}^{n} \sin \left( X_j - \mathbb{E}[X_j] \right) \right]}{\prod_{j=1}^{n} \sqrt[n]{\mathbb{E}\left[ |\sin \left( X_j - \mathbb{E}[X_j] \right)|^n \right]}}\]
- Parameters:
X (array-like) – m x n data matrix
- Returns:
r – Circular correlation coefficient.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import circular_correlation
>>> data = np.arange(100).reshape(10,10)
>>> circular_correlation(data)
0.9999999999999999
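As a rough illustration of the estimator above, the following sketch replaces each expectation with a sample mean over the rows. It assumes the arithmetic mean is used for centering, exactly as the formula is written (a library implementation might use a circular mean instead); circular_correlation_sketch is a hypothetical name.

```python
import numpy as np

def circular_correlation_sketch(X):
    """Sample version of R_c: sines of centered angles, normalized by n-th moments."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    S = np.sin(X - X.mean(axis=0))          # assumption: arithmetic mean for centering
    numerator = np.mean(np.prod(S, axis=1))
    denominator = np.prod(np.mean(np.abs(S) ** n, axis=0) ** (1.0 / n))
    return numerator / denominator

rng = np.random.default_rng(0)
angles = rng.uniform(-0.5, 0.5, size=50)

# Identical columns should score 1; a sign-flipped copy should score -1.
same = circular_correlation_sketch(np.column_stack([angles, angles]))
flipped = circular_correlation_sketch(np.column_stack([angles, -angles]))
```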
- conaction.estimators.correlation_ratio(X: ndarray, y: ndarray) float64
Warning
Not implemented yet.
This function calculates the multilinear correlation ratio of a collection of response variables given their classes. The classic Fisher’s correlation ratio is a special case.
- Parameters:
X (array-like) – m x n data matrix
y (array-like) – m-dimensional vector of class labels
- Returns:
Correlation ratio score.
- Return type:
np.float64
- Raises:
NotImplementedError –
- conaction.estimators.grade_entropy(X, normalize=True)
Computes a grade entropy for a strict product order on the row space points.
This function computes
\[H_g = \frac{-\sum_{i=1}^{k} p (g_i) \ln (p (g_i))}{\ln{m}}\]
where \(p\) is a probability distribution over the grades \(g\) of the point \(x_i\) among the indexed set of points \(i \in \{1, \cdots, m\}\) according to a strict product order relation.
- Parameters:
X (array-like) – An m x n data matrix.
- Returns:
entropy – Grade entropy of product order.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import grade_entropy
>>> data = np.arange(100).reshape(10,10)
>>> grade_entropy(data)
1.0
- conaction.estimators.kendall_tau(X: ndarray, method='A', n_jobs=1) float64
Multivariate Kendall’s tau.
- Parameters:
X (array-like) – An m x n data matrix.
method ({'A', 'B', 'C'}, optional) –
- The method used to account for tied points.
The following methods are available (default is ‘A’):
’A’: Original Kendall’s Tau.
’B’: \(\tau_B = \frac{m_c - m_d}{\sqrt[n]{\prod_{j=1}^{n} (m_0 - m_j)}}\)
’C’: \(\tau_C = \frac{2 (m_c - m_d)}{m^2 \frac{(\max(m,n) - 1)}{\max(m,n)}}\)
- Returns:
Multivariate Kendall’s tau score
- Return type:
np.float64
- Raises:
NotImplementedError – Method B is not implemented yet.
Examples
>>> import numpy as np
>>> from conaction.estimators import kendall_tau
>>> data = np.arange(100).reshape(10,10)
>>> kendall_tau(data)
1.0
- conaction.estimators.median_correlation(X: ndarray, transform=<function <lambda>>) float64
Median (multilinear) correlation.
The function estimates
\[R_{\mathcal{M}} \left[ X_1, \cdots, X_n \right] = \frac{\mathcal{M} \left[ \prod_{j=1}^{n} \left( X_j - \mathcal{M}[X_j] \right) \right]}{\prod_{j=1}^{n} \sqrt[n]{\mathcal{M}\left[ |X_j - \mathcal{M}[X_j]|^n \right]}}\]
- Parameters:
X (array-like[np.float64]) – An m x n data matrix.
transform (function) – A data transform before computing coefficient.
- Returns:
r – The calculated median correlation coefficient.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import median_correlation
>>> data = np.arange(100).reshape(10,10)
>>> median_correlation(data)
0.9999999999999982
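A minimal sketch of the median-based estimator, assuming \(\mathcal{M}\) is the sample median taken over rows and ignoring the transform argument; median_correlation_sketch is a hypothetical name used only for illustration.

```python
import numpy as np

def median_correlation_sketch(X):
    """Median analogue of the n-ary correlation: medians replace expectations."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    D = X - np.median(X, axis=0)             # center each column at its median
    numerator = np.median(np.prod(D, axis=1))
    denominator = np.prod(np.median(np.abs(D) ** n, axis=0) ** (1.0 / n))
    return numerator / denominator

# A heavy outlier (50) barely matters because only medians are used.
x = np.array([1.0, 2.0, 3.0, 10.0, 50.0])
same = median_correlation_sketch(np.column_stack([x, x]))
flipped = median_correlation_sketch(np.column_stack([x, -x]))
```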
- conaction.estimators.misiak_correlation(x: ndarray, y: ndarray, X: ndarray) float64
Misiak’s n-inner correlation coefficient based on the n-inner product space presented in Misiak and Ryz 2000.
- Parameters:
x (array-like) – 1-D data vector
y (array-like) – 1-D data vector
X (array-like) – m x n data matrix
- Returns:
Misiak correlation score.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import misiak_correlation
>>> np.random.seed(0)
>>> x = np.random.normal(size=10)
>>> y = np.random.normal(size=10)
>>> X = np.random.normal(size=100).reshape(10,10)
>>> misiak_correlation(x,y,X)
-0.11209570083901074
- conaction.estimators.nightingale_correlation(X: ndarray, p=1, alphas=None) float64
Calculates the Nightingale correlation, which is the Nightingale covariance normalized onto the interval [0, 1].
- Parameters:
X (array-like) – m x n data matrix
- Returns:
Nightingale correlation
- Return type:
np.float64
See also
nightingale_deviation – Nightingale’s deviation of order p.
nightingale_covariance – Nightingale’s covariance of order p.
- conaction.estimators.nightingale_covariance(X: ndarray, p=1) float64
This function calculates the Nightingale covariance, which is the multisemimetric between a collection of random variables and their expectations. The multisemimetric is induced by a multiseminorm, a generalization of the seminorm.
- Parameters:
X (array-like) – m x n data matrix
- Returns:
Nightingale covariance.
- Return type:
np.float64
See also
numpy.std – Standard deviation.
nightingale_deviation – Nightingale’s deviation of order p.
Examples
>>> import numpy as np
>>> from conaction.estimators import nightingale_covariance
>>> data = np.arange(100).reshape(10,10)
>>> nightingale_covariance(data)
7381024072265624.0
- conaction.estimators.nightingale_deviation(x: ndarray, p=2) float64
Calculates the Nightingale deviation of order p. When p = 2, it is the same as the standard deviation.
This function estimates
\[\text{Dev}_p \left[ X \right] \triangleq \sqrt[p]{\mathbb{E}\left[ |X - \mathbb{E}[X]|^p \right]}\]
- Parameters:
x (array-like) – Instances of a variable.
- Returns:
result
- Return type:
np.float64
See also
numpy.std – Standard deviation.
Examples
>>> import numpy as np
>>> from conaction.estimators import nightingale_deviation
>>> data = np.arange(10)
>>> nightingale_deviation(data)
2.8722813232690143
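The deviation of order p can be reproduced directly from the formula by replacing the expectation with a sample mean. The sketch below is illustrative, and deviation_of_order_p is a hypothetical name rather than the library function.

```python
import numpy as np

def deviation_of_order_p(x, p=2):
    """p-th root of the mean absolute deviation from the mean, raised to the power p."""
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(x - x.mean()) ** p) ** (1.0 / p)

data = np.arange(10)
dev2 = deviation_of_order_p(data, p=2)  # agrees with np.std(data)
dev1 = deviation_of_order_p(data, p=1)  # mean absolute deviation
```

With p = 2 this coincides with numpy.std at its default settings (no Bessel correction), and with p = 1 it is the mean absolute deviation about the mean.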
- conaction.estimators.partial_agnesian(X: ndarray, t=None, k=0)
Computes the partial Agnesian of order k on a data matrix. If a vector of parameters is not provided, then a parameter step size of unity is assumed.
- Parameters:
X (array-like (2D)) – m x n data matrix
t (array-like (1D)) – Vector of parameters corresponding to the rows of X.
k (Non-negative int) – Non-negative order of the partial Agnesian operator.
- Returns:
Sequence of partial Agnesian scores.
- Return type:
array-like[float]
Examples
>>> import numpy as np
>>> from conaction.estimators import partial_agnesian
>>> X = np.arange(5*3).reshape(5,3)
>>> partial_agnesian(X,k=1)
array([27, 27, 27, 27])
>>> partial_agnesian(X, k=2)
array([0, 0, 0])
>>> t = np.linspace(0, 10, 5)
>>> partial_agnesian(X, k=1, t=t)
array([1.728, 1.728, 1.728, 1.728])
- conaction.estimators.pearson_correlation(X: ndarray) float64
This function calculates the n-ary Pearson’s r correlation coefficient. When given an m x 2 data matrix, it is equivalent to the Pearson’s r correlation coefficient.
This function estimates
\[R_p \left[ X_1, \cdots, X_n \right] = \frac{\mathbb{E} \left[ \prod_{j=1}^{n} \left( X_j - \mathbb{E}[X_j] \right) \right]}{\prod_{j=1}^{n} \sqrt[n]{\mathbb{E}\left[ |X_j - \mathbb{E}[X_j]|^n \right]}}\]
- Parameters:
X (array-like) – An m x n data matrix.
- Returns:
r – The calculated Pearson r correlation coefficient.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import pearson_correlation
>>> data = np.arange(100).reshape(10,10)
>>> pearson_correlation(data)
0.9999999999999978
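To see why the n-ary coefficient reduces to the ordinary Pearson r when n = 2, the estimator can be sketched with sample means and compared against numpy.corrcoef. The function name nary_pearson_sketch is hypothetical and the code is an illustration under that reading of the formula.

```python
import numpy as np

def nary_pearson_sketch(X):
    """Mean product of centered columns over the product of n-th root n-th moments."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    C = X - X.mean(axis=0)                   # center each column at its mean
    numerator = np.mean(np.prod(C, axis=1))
    denominator = np.prod(np.mean(np.abs(C) ** n, axis=0) ** (1.0 / n))
    return numerator / denominator

rng = np.random.default_rng(0)
pair = rng.normal(size=(200, 2))

r_sketch = nary_pearson_sketch(pair)
r_numpy = np.corrcoef(pair, rowvar=False)[0, 1]  # classical Pearson r
```

For two columns the denominator becomes the product of the two standard deviations and the numerator the covariance, so r_sketch and r_numpy agree up to rounding.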
- conaction.estimators.pnorm(x: ndarray, p=2) float64
Computes the p-norm of a given vector.
- Parameters:
x (1D array-like) – An m-dimensional vector.
p (float) – Order of the norm.
- Returns:
np.float64
P-norm of input vector.
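For reference, the p-norm of a vector x is \(\left( \sum_i |x_i|^p \right)^{1/p}\). A minimal sketch, with the hypothetical name pnorm_sketch, compared against numpy.linalg.norm:

```python
import numpy as np

def pnorm_sketch(x, p=2):
    """(sum_i |x_i|^p)^(1/p) for a 1-D vector x."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

v = np.array([3.0, -4.0])
euclidean = pnorm_sketch(v, p=2)  # 5.0
taxicab = pnorm_sketch(v, p=1)    # 7.0
```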
- conaction.estimators.product_percentiles(X: array) array
Compute joint percentiles under a product order.
- Parameters:
X (np.array[float]) – Data matrix.
- Returns:
float
Joint percentiles.
- conaction.estimators.product_rank(X, monotone=False)
Assign product order rank to each point.
- Parameters:
X (np.array) – Data matrix.
monotone (bool) – Whether to rank monotonically or antimonotonically.
- conaction.estimators.pseudograde_entropy(X: ndarray, n_jobs=1) float64
Computes a pseudograde entropy for a strict product order on the row space points.
This function computes
\[H_g = \frac{-\sum_{i=1}^{k} p (g_i) \ln (p (g_i))}{\ln{m}}\]
where \(p\) is a probability distribution over the pseudogrades \(g\) of the point \(x_i\) among the indexed set of points \(i \in \{1, \cdots, m\}\) according to a strict product order relation.
- Parameters:
X (array-like) – An m x n data matrix.
- Returns:
entropy – Pseudograde entropy of product order.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import pseudograde_entropy
>>> data = np.arange(100).reshape(10,10)
>>> pseudograde_entropy(data)
1.0
- conaction.estimators.reach_percentiles(g: DiGraph) Dict[int, float]
Compute reach percentiles from a digraph.
- Parameters:
g (nx.DiGraph) – NetworkX DiGraph.
- Return type:
dict[int,float]
- conaction.estimators.reach_rank(g: DiGraph) Dict[int, int]
Reachable rank of nodes in a digraph.
- Parameters:
g (nx.DiGraph) – NetworkX DiGraph.
- Returns:
dict[int,int]
Reach rank for each node.
- conaction.estimators.reflective_correlation(X: ndarray) float64
Calculates the multilinear reflective correlation coefficient. When given an m x 2 data matrix, it is equivalent to the reflective correlation coefficient.
This function estimates
\[R_r \left[ X_1, \cdots, X_n \right] = \frac{\mathbb{E} \left[ \prod_{j=1}^{n} X_j \right]}{\prod_{j=1}^{n} \sqrt[n]{\mathbb{E}\left[ |X_j|^n \right]}}\]
- Parameters:
X (array-like) – The m x n data matrix.
- Returns:
r – Reflective correlation coefficient score.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import reflective_correlation
>>> data = np.arange(100).reshape(10,10)
>>> reflective_correlation(data)
0.9995245464170066
- conaction.estimators.signum_correlation(X: ndarray) float64
Signum correlation coefficient.
This function estimates
\[R_{\text{sign}} \left[ X_1, \cdots, X_n \right] = \frac{\mathbb{E} \left[ \prod_{j=1}^{n} \text{sign} \left( X_j - \mathbb{E}[X_j] \right) \right]}{\prod_{j=1}^{n} \sqrt[n]{\mathbb{E}\left[ |\text{sign} \left( X_j - \mathbb{E}[X_j] \right)|^n \right]}}\]
- Parameters:
X (array-like) – m x n data matrix
- Returns:
Signum correlation score.
- Return type:
float
See also
scipy.stats.kendalltau – Kendall’s \(\tau\)
Notes
At first glance this coefficient resembles Kendall’s \(\tau\) because both take products of signs, but the two are distinct. Kendall’s \(\tau\) averages the difference between the numbers of concordant and discordant pairs of points, whereas this coefficient uses the signs of each variable’s deviations from its mean.
Examples
>>> import numpy as np
>>> from conaction.estimators import signum_correlation
>>> data = np.arange(100).reshape(10,10)
>>> signum_correlation(data)
0.9999999999999998
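The distinction drawn in the Notes is easier to see in code: the signum coefficient compares each observation with its column mean, not with every other observation. The sketch below follows the formula with sample means; signum_correlation_sketch is a hypothetical name, not the library function.

```python
import numpy as np

def signum_correlation_sketch(X):
    """Mean product of the signs of mean-centered columns, normalized as in R_sign."""
    X = np.asarray(X, dtype=float)
    n = X.shape[1]
    S = np.sign(X - X.mean(axis=0))          # entries are -1, 0, or +1
    numerator = np.mean(np.prod(S, axis=1))
    denominator = np.prod(np.mean(np.abs(S) ** n, axis=0) ** (1.0 / n))
    return numerator / denominator

x = np.array([1.0, 2.0, 3.0, 10.0])

# x and x**3 fall on the same side of their respective means in every row.
agree = signum_correlation_sketch(np.column_stack([x, x ** 3]))
# x and -x always fall on opposite sides.
oppose = signum_correlation_sketch(np.column_stack([x, -x]))
```

Only the side of the mean matters here, so the nonlinear but sign-preserving map x ** 3 still yields a score of 1.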
- conaction.estimators.spearman_correlation(X: ndarray, method='average') float64
This function calculates the n-ary Spearman correlation coefficient. When given an m x 2 data matrix, it is equivalent to the Spearman’s Rho correlation coefficient.
This function estimates
\[R_c \left[ X_1, \cdots, X_n \right] = \frac{\mathbb{E} \left[ \prod_{j=1}^{n} \left( \text{rank} \left( X_j \right) - \mathbb{E}[\text{rank} \left( X_j \right)] \right) \right]}{\prod_{j=1}^{n} \sqrt[n]{\mathbb{E}\left[ |\text{rank} \left( X_j \right) - \mathbb{E}[\text{rank} \left( X_j \right)]|^n \right]}}\]
- Parameters:
X (array-like) – m x n data matrix.
method ({'average', 'min', 'max', 'dense', 'ordinal'}, optional) –
- The method used to assign ranks to tied elements.
The following methods are available (default is ‘average’):
’average’: The average of the ranks that would have been assigned to all the tied values is assigned to each value.
’min’: The minimum of the ranks that would have been assigned to all the tied values is assigned to each value. (This is also referred to as “competition” ranking.)
’max’: The maximum of the ranks that would have been assigned to all the tied values is assigned to each value.
’dense’: Like ‘min’, but the rank of the next highest element is assigned the rank immediately after those assigned to the tied elements.
’ordinal’: All values are given a distinct rank, corresponding to the order in which the values occur in the input.
- Returns:
Spearman’s correlation coefficient.
- Return type:
np.float64
See also
scipy.stats.rankdata
Notes
The available data ranking options are directly from scipy.stats.rankdata.
Examples
>>> import numpy as np
>>> from conaction.estimators import spearman_correlation
>>> data = np.arange(100).reshape(10,10)
>>> spearman_correlation(data)
0.9999999999999991
- conaction.estimators.taylor_correlation(X: ndarray) float64
Taylor’s multi-way correlation coefficient.
Taylor 2020 defines this function to be
\[\frac{1}{\sqrt{d}} \sqrt{\frac{1}{d-1} \sum_{i}^{d} ( \lambda_i - \bar{\lambda})^2 }\]
where \(d\) is the number of variables, \(\lambda_1, \cdots, \lambda_d\) are the eigenvalues of the correlation matrix for a given set of variables, and \(\bar{\lambda}\) is the mean of those eigenvalues.
- Parameters:
X (array-like) – m x n data matrix
- Returns:
Taylor correlation score
- Return type:
np.float64
Notes
Taylor’s multi-way correlation coefficient is a rescaling of the Bessel-corrected standard deviation of the eigenvalues of the correlation matrix of the set of variables.
Examples
>>> import numpy as np
>>> from conaction.estimators import taylor_correlation
>>> data = np.arange(100).reshape(10,10)
>>> taylor_correlation(data)
0.9486832980505138
- conaction.estimators.trencevski_malceski_correlation(X: ndarray, Y: ndarray) float64
Generalized n-inner product correlation coefficient.
Computes a correlation coefficient based on Trencevski and Malceski 2006.
- Parameters:
X (array-like) – m x n data matrix
Y (array-like) – m x n data matrix
- Returns:
Correlation score.
- Return type:
np.float64
Examples
>>> import numpy as np
>>> from conaction.estimators import trencevski_malceski_correlation
>>> np.random.seed(0)
>>> Y = np.random.normal(size=1000).reshape(100,10)
>>> X = np.random.normal(size=1000).reshape(100,10)
>>> trencevski_malceski_correlation(X,Y)
3.1886981411745035e-08
- conaction.estimators.wang_zheng_correlation(X: ndarray) float64
Correlation coefficient due to Wang & Zheng 2014.
This correlation coefficient is equivalent to
\[R_{wz} \triangleq 1 - \det (R_{n \times n})\]
where \(R_{n \times n}\) is the correlation matrix computed on a collection of n variables. In other words, this correlation coefficient is the complement of the determinant of the correlation matrix.
- Parameters:
X (array-like) – m x n data matrix
- Returns:
result – Unsigned correlation coefficient.
- Return type:
np.float64
Notes
The complement of this statistic is the unsigned incorrelation coefficient.
Examples
>>> import numpy as np
>>> from conaction.estimators import wang_zheng_correlation
>>> data = np.arange(100).reshape(10,10)
>>> wang_zheng_correlation(data)
1.0
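Because the coefficient is defined as one minus the determinant of the correlation matrix, it can be sketched in a few lines of NumPy. The name wang_zheng_sketch is hypothetical, and the sketch assumes the variables are the columns of X.

```python
import numpy as np

def wang_zheng_sketch(X):
    """1 - det(R), with R the n x n correlation matrix of the columns of X."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    return 1.0 - np.linalg.det(R)

# Perfectly collinear columns make R singular, so the score is 1.
collinear = np.arange(100).reshape(10, 10)
score_collinear = wang_zheng_sketch(collinear)

# Exactly uncorrelated columns make R the identity, so the score is 0.
uncorrelated = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
score_uncorrelated = wang_zheng_sketch(uncorrelated)
```

The two extremes show why the result is unsigned: the determinant only measures how far the variables are from linear dependence, not the direction of the association.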
- conaction.estimators.weak_inner_correlation()
- Raises:
NotImplementedError –