easyclimate.core.eof

The analysis of the EOF and MCA

Functions

get_EOF_model(→ xeofs.single.eof.EOF)

Build the model of the Empirical Orthogonal Functions (EOF) analysis, more commonly known as Principal Component Analysis (PCA).

save_EOF_model(model, path[, overwrite, save_data, engine])

Save the model.

load_EOF_model(→ xeofs.single.eof.EOF)

Load a saved EOF model.

calc_EOF_analysis(→ xarray.Dataset)

Calculate the results of the EOF model.

get_EOF_projection(model, data[, normalized])

Project data onto the components.

get_REOF_model(→ xeofs.single.EOFRotator)

Build the model of the Rotated Empirical Orthogonal Functions (REOF) analysis.

save_REOF_model(model, path[, overwrite, save_data, ...])

Save the model.

load_REOF_model(→ xeofs.single.EOFRotator)

Load a saved REOF model.

calc_REOF_analysis(→ xarray.Dataset)

Calculate the results of the REOF model.

get_REOF_projection(model, data[, normalized])

Project data onto the components.

get_MCA_model(→ xeofs.cross.MCA)

Build the model of the Maximum Covariance Analysis (MCA). MCA is a statistical method that finds patterns of maximum covariance between two datasets.

save_MCA_model(model, path[, overwrite, save_data, engine])

Save the model.

load_MCA_model(→ xeofs.cross.MCA)

Load a saved MCA model.

calc_MCA_analysis(→ easyclimate.core.datanode.DataNode)

Calculate the results of the MCA model.

get_MCA_projection(→ easyclimate.core.datanode.DataNode)

Get the expansion coefficients of "unseen" data. The expansion coefficients are obtained by projecting data onto the singular vectors.

calc_eof_projection_coefficient(data_field, eof_mode)

Compute the EOF projection coefficients for projecting a data field onto an EOF mode over common spatial dimensions.

Module Contents

easyclimate.core.eof.get_EOF_model(data_input: xarray.DataArray | list, lat_dim: str, lon_dim: str, time_dim: str = 'time', n_modes: int = 10, remove_seasonal_cycle_mean=False, center: bool = False, standardize: bool = False, use_coslat: bool = True, random_state: int | None = None, solver: Literal['auto', 'full', 'randomized'] = 'auto', solver_kwargs: dict = {}) → xeofs.single.eof.EOF

Build the model of the Empirical Orthogonal Functions (EOF) analysis, more commonly known as Principal Component Analysis (PCA).

Parameters

data_input: xarray.DataArray or list

The spatio-temporal data to be calculated.

lat_dim: str.

Latitude coordinate dimension name.

lon_dim: str.

Longitude coordinate dimension name.

time_dim: str, default: time.

The time coordinate dimension name.

n_modes: int, default 10.

Number of modes to calculate.

remove_seasonal_cycle_mean: bool, default False.

Whether to remove seasonal cycle mean of the input data. If it is True, the function will use easyclimate.remove_seasonal_cycle_mean to remove seasonal cycle mean of the input data.

center: bool, default False.

Whether to center the input data.

standardize: bool, default False.

Whether to standardize the input data.

use_coslat: bool, default True.

Whether to use cosine of latitude for scaling.

random_state: int, default None.

Seed for the random number generator.

solver: {“auto”, “full”, “randomized”}, default: “auto”.

Solver to use for the SVD computation.

solver_kwargs: dict, default {}.

Additional keyword arguments to be passed to the SVD solver.

Returns

xeofs.single.EOF
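The preprocessing switches above (center, standardize, use_coslat) can be pictured with a small NumPy sketch. It is not the library's implementation: the helper name `preprocess` is hypothetical, and the square root of cos(latitude) is an assumed weighting convention, chosen so that the covariance ends up weighted by cos(latitude).

```python
import numpy as np

def preprocess(data, lat, center=False, standardize=False, use_coslat=True):
    """Prepare a (time, lat, lon) array for EOF analysis (illustrative only)."""
    out = np.asarray(data, dtype=float).copy()
    if center or standardize:
        # Remove the time mean at every grid point.
        out = out - out.mean(axis=0, keepdims=True)
    if standardize:
        # Scale every grid point to unit variance along time.
        std = out.std(axis=0, ddof=1, keepdims=True)
        out = out / np.where(std == 0, 1.0, std)
    if use_coslat:
        # Assumed convention: sqrt(cos(lat)), so variances scale with cos(lat).
        weights = np.sqrt(np.clip(np.cos(np.deg2rad(lat)), 0.0, None))
        out = out * weights[np.newaxis, :, np.newaxis]
    return out

rng = np.random.default_rng(0)
lat = np.array([-60.0, 0.0, 60.0])
field = rng.normal(size=(20, 3, 4))
prepared = preprocess(field, lat, center=True, standardize=True)
```

With standardization switched on, grid points at the equator keep unit variance while those at 60 degrees latitude are scaled to a variance of cos(60°) = 0.5.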

easyclimate.core.eof.save_EOF_model(model: xeofs.single.eof.EOF, path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)

Save the model.

Parameters

model: xeofs.single.EOF

A fitted xeofs.single.EOF model, as returned by easyclimate.eof.get_EOF_model or xeofs.single.eof.EOF.fit.

path: str

Path to save the model.

overwrite: bool, default False

Whether or not to overwrite the existing path if it already exists. Ignored unless engine = “zarr”.

save_data: bool, default False

Whether or not to save the full input data along with the fitted components.

engine: {“zarr”, “netcdf4”, “h5netcdf”}, default “zarr”

Xarray backend engine to use for writing the saved model.

**kwargs: dict.

Additional keyword arguments to pass to xarray.DataTree.to_netcdf() or xarray.DataTree.to_zarr().

easyclimate.core.eof.load_EOF_model(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) → xeofs.single.eof.EOF

Load a saved EOF model.

Parameters

path: str

Path to the saved model.

engine: {“zarr”, “netcdf4”, “h5netcdf”}, default “zarr”

Xarray backend engine to use for reading the saved model.

**kwargs: dict.

Additional keyword arguments to pass to open_datatree().

Returns

The loaded xeofs.single.EOF model, equivalent to the result of easyclimate.eof.get_EOF_model or xeofs.single.eof.EOF.fit.

easyclimate.core.eof.calc_EOF_analysis(model: xeofs.single.eof.EOF, PC_normalized: bool = True) → xarray.Dataset

Calculate the results of the EOF model.

Parameters

model: xeofs.single.EOF

A fitted xeofs.single.EOF model, as returned by easyclimate.eof.get_EOF_model or xeofs.single.eof.EOF.fit.

PC_normalized: bool, default True.

Whether to normalize the scores by the L2 norm (singular values).

Returns

The results of the EOF model (xarray.Dataset).

  • EOF: The (EOF) components. The components in EOF analysis are the eigenvectors of the covariance/correlation matrix. Other names include the principal components or EOFs.

  • PC: The (PC) scores. The scores in EOF analysis are the projection of the data matrix onto the eigenvectors of the covariance (or correlation) matrix. Other names include the principal component (PC) scores or just PCs.

  • explained_variance: The explained variance. The explained variance \(\lambda_i\) is the variance explained by each mode. It is defined as

\[\lambda_i = \frac{\sigma_i^2}{N-1}\]

where \(\sigma_i\) is the singular value of the \(i\)-th mode and \(N\) is the number of samples. Equivalently, \(\lambda_i\) is the \(i\)-th eigenvalue of the covariance matrix.

  • explained_variance_ratio: The explained variance ratio. The explained variance ratio \(\gamma_i\) is the variance explained by each mode normalized by the total variance. It is defined as

\[\gamma_i = \frac{\lambda_i}{\sum_{j=1}^M \lambda_j}\]

where \(\lambda_i\) is the explained variance of the \(i\)-th mode and \(M\) is the total number of modes.

  • singular_values: The singular values of the Singular Value Decomposition (SVD).
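The relations between the singular values, the explained variance and the explained variance ratio can be checked with plain NumPy. This is a minimal sketch on synthetic data with all modes retained; the variable names are illustrative and not part of the easyclimate API.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_features = 50, 6
X = rng.normal(size=(n_samples, n_features))
X = X - X.mean(axis=0)  # center along the sample (time) axis

# Thin SVD: X = U @ diag(s) @ Vt; the rows of Vt play the role of the EOFs.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

explained_variance = s**2 / (n_samples - 1)                               # lambda_i
explained_variance_ratio = explained_variance / explained_variance.sum()  # gamma_i

# lambda_i equals the i-th eigenvalue of the sample covariance matrix.
cov_eigvals = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]

pcs = U * s          # unnormalized PC scores; their L2 norms are the singular values
pcs_normalized = U   # scores divided by their L2 norm
```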

easyclimate.core.eof.get_EOF_projection(model: xeofs.single.eof.EOF, data: xarray.DataArray, normalized: bool = True)

Project data onto the components.

Parameters

model: xeofs.single.EOF

A fitted xeofs.single.EOF model, as returned by easyclimate.eof.get_EOF_model or xeofs.single.eof.EOF.fit.

data: xarray.DataArray

Data to be transformed.

normalized: bool, default True.

Whether to normalize the scores by the L2 norm.

Returns

projections: xarray.DataArray

Projections of the data onto the components.
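As a rough picture of what projecting data onto the components means, the sketch below projects new samples onto EOFs obtained from a training set. It assumes the training mean is removed before projection and that normalization divides the scores by the singular values; the helper `project` is hypothetical and only mirrors the idea, not the library code.

```python
import numpy as np

rng = np.random.default_rng(2)
X_train = rng.normal(size=(40, 5))
mean = X_train.mean(axis=0)
U, s, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
components = Vt[:2]  # two leading EOFs, shape (mode, feature)

def project(data, components, mean, singular_values=None):
    """Project samples onto the components; optionally normalize the scores."""
    scores = (data - mean) @ components.T
    if singular_values is not None:
        # Dividing by the singular values gives scores with unit L2 norm
        # for the training data.
        scores = scores / singular_values
    return scores

X_new = rng.normal(size=(3, 5))
scores_new = project(X_new, components, mean)
scores_train = project(X_train, components, mean, singular_values=s[:2])
```

Projecting the training data itself and normalizing recovers the left singular vectors, which is a convenient sanity check.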

easyclimate.core.eof.get_REOF_model(data_input: xarray.DataArray, lat_dim: str, lon_dim: str, time_dim: str = 'time', n_modes: int = 2, power: int = 1, max_iter: int = None, rtol: float = 1e-08, remove_seasonal_cycle_mean=False, standardize: bool = False, use_coslat: bool = True, random_state=None, solver: Literal['auto', 'full', 'randomized'] = 'auto', solver_kwargs={}) → xeofs.single.EOFRotator

Build the model of the Rotated Empirical Orthogonal Functions (REOF) analysis.

Parameters

data_input: xarray.DataArray

The spatio-temporal data to be calculated.

lat_dim: str.

Latitude coordinate dimension name.

lon_dim: str.

Longitude coordinate dimension name.

time_dim: str, default: time.

The time coordinate dimension name.

n_modes: int, default 2.

Number of modes to calculate.

remove_seasonal_cycle_mean: bool, default False.

Whether to remove seasonal cycle mean of the input data. If it is True, the function will use easyclimate.remove_seasonal_cycle_mean to remove seasonal cycle mean of the input data.

standardize: bool, default False.

Whether to standardize the input data.

use_coslat: bool, default True.

Whether to use cosine of latitude for scaling.

random_state: int, default None.

Seed for the random number generator.

solver: {“auto”, “full”, “randomized”}, default: “auto”.

Solver to use for the SVD computation.

solver_kwargs: dict, default {}.

Additional keyword arguments to be passed to the SVD solver.

Returns

xeofs.single.EOFRotator

Reference

Richman, M.B. (1986), Rotation of principal components. J. Climatol., 6: 293-335. https://doi.org/10.1002/joc.3370060305
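For orientation, an orthogonal Varimax rotation of a loading matrix can be sketched as below. This is the textbook iterative SVD scheme, not the code used by the library; the function name `varimax` and its defaults are hypothetical, and it is an assumption here that the default power = 1 corresponds to a plain Varimax rotation.

```python
import numpy as np

def varimax(loadings, max_iter=1000, rtol=1e-8):
    """Rotate a (feature, mode) loading matrix with the Varimax criterion."""
    A = np.asarray(loadings, dtype=float)
    n_rows, n_modes = A.shape
    R = np.eye(n_modes)  # rotation matrix, kept orthogonal throughout
    previous = 0.0
    for _ in range(max_iter):
        L = A @ R
        # Gradient-like target of the Varimax criterion.
        target = L**3 - L * (np.sum(L**2, axis=0) / n_rows)
        u, s, vt = np.linalg.svd(A.T @ target)
        R = u @ vt  # closest orthogonal matrix to the target direction
        current = s.sum()
        if previous != 0.0 and current < previous * (1.0 + rtol):
            break  # relative improvement below the tolerance
        previous = current
    return A @ R, R

# A simple two-mode structure, deliberately mixed by a 30 degree rotation.
simple = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.0],
                   [0.0, 1.0], [0.1, 0.9], [0.0, 0.8]])
theta = np.deg2rad(30.0)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
rotated, R = varimax(simple @ mix)
```

Because the rotation is orthogonal, the total sum of squared loadings is unchanged; only its distribution across modes differs.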

easyclimate.core.eof.save_REOF_model(model: xeofs.single.EOFRotator, path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)

Save the model.

Parameters

model: xeofs.single.EOFRotator

A fitted xeofs.single.EOFRotator model, as returned by easyclimate.eof.get_REOF_model or xeofs.single.EOFRotator.fit.

path: str

Path to save the model.

overwrite: bool, default False

Whether or not to overwrite the existing path if it already exists. Ignored unless engine = “zarr”.

save_data: bool, default False

Whether or not to save the full input data along with the fitted components.

engine: {“zarr”, “netcdf4”, “h5netcdf”}, default “zarr”

Xarray backend engine to use for writing the saved model.

**kwargs: dict.

Additional keyword arguments to pass to xarray.DataTree.to_netcdf() or xarray.DataTree.to_zarr().

easyclimate.core.eof.load_REOF_model(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) → xeofs.single.EOFRotator

Load a saved REOF model.

Parameters

path: str

Path to the saved model.

engine: {“zarr”, “netcdf4”, “h5netcdf”}, default “zarr”

Xarray backend engine to use for reading the saved model.

**kwargs: dict.

Additional keyword arguments to pass to open_datatree().

Returns

The loaded xeofs.single.EOFRotator model, equivalent to the result of easyclimate.eof.get_REOF_model or xeofs.single.EOFRotator.fit.

easyclimate.core.eof.calc_REOF_analysis(model: xeofs.single.EOFRotator, PC_normalized: bool = True) → xarray.Dataset

Calculate the results of the REOF model.

Parameters

model: xeofs.single.EOFRotator

A fitted xeofs.single.EOFRotator model, as returned by easyclimate.eof.get_REOF_model or xeofs.single.EOFRotator.fit.

PC_normalized: bool, default True.

Whether to normalize the scores by the L2 norm (singular values).

Returns

The results of the REOF model (xarray.Dataset).

  • EOF: The (EOF) components. The components in EOF analysis are the eigenvectors of the covariance/correlation matrix. Other names include the principal components or EOFs.

  • PC: The (PC) scores. The scores in EOF analysis are the projection of the data matrix onto the eigenvectors of the covariance (or correlation) matrix. Other names include the principal component (PC) scores or just PCs.

  • explained_variance: The explained variance. The explained variance \(\lambda_i\) is the variance explained by each mode. It is defined as

\[\lambda_i = \frac{\sigma_i^2}{N-1}\]

where \(\sigma_i\) is the singular value of the \(i\)-th mode and \(N\) is the number of samples. Equivalently, \(\lambda_i\) is the \(i\)-th eigenvalue of the covariance matrix.

  • explained_variance_ratio: The explained variance ratio. The explained variance ratio \(\gamma_i\) is the variance explained by each mode normalized by the total variance. It is defined as

\[\gamma_i = \frac{\lambda_i}{\sum_{j=1}^M \lambda_j}\]

where \(\lambda_i\) is the explained variance of the \(i\)-th mode and \(M\) is the total number of modes.

  • singular_values: The singular values of the Singular Value Decomposition (SVD).

easyclimate.core.eof.get_REOF_projection(model: xeofs.single.EOFRotator, data: xarray.DataArray, normalized: bool = True)

Project data onto the components.

Parameters

model: xeofs.single.EOFRotator

A fitted xeofs.single.EOFRotator model, as returned by easyclimate.eof.get_REOF_model or xeofs.single.EOFRotator.fit.

data: xarray.DataArray

Data to be transformed.

normalized: bool, default True.

Whether to normalize the scores by the L2 norm.

Returns

projections: xarray.DataArray

Projections of the data onto the components.

easyclimate.core.eof.get_MCA_model(data_left: xarray.DataArray, data_right: xarray.DataArray, lat_dim: str, lon_dim: str, time_dim: str = 'time', n_modes=10, standardize: bool = False, use_coslat: bool = False, n_pca_modes: int = 'auto', weights_left: xarray.DataArray = None, weights_right: xarray.DataArray = None, random_state: int = None, solver: Literal['auto', 'full', 'randomized'] = 'auto', solver_kwargs: dict = {}) → xeofs.cross.MCA

Build the model of the Maximum Covariance Analysis (MCA). MCA is a statistical method that finds patterns of maximum covariance between two datasets.

Note

MCA is similar to Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA), but while PCA finds modes of maximum variance and CCA finds modes of maximum correlation, MCA finds modes of maximum covariance.

Parameters

data_left: xarray.DataArray

Left input data.

data_right: xarray.DataArray

Right input data.

lat_dim: str.

Latitude coordinate dimension name.

lon_dim: str.

Longitude coordinate dimension name.

time_dim: str, default: time.

The time coordinate dimension name.

n_modes: int, default 10.

Number of modes to calculate.

standardize: bool, default False.

Whether to standardize the input data.

use_coslat: bool, default False.

Whether to use cosine of latitude for scaling.

n_pca_modes: int, default ‘auto’ (i.e., the same as n_modes).

The number of principal components to retain during the PCA preprocessing step applied to both data sets prior to executing MCA. If set to None, PCA preprocessing will be bypassed, and the MCA will be performed on the original datasets. Specifying an integer value greater than 0 for n_pca_modes will trigger the PCA preprocessing, retaining only the specified number of principal components. This reduction in dimensionality can be especially beneficial when dealing with high-dimensional data, where computing the cross-covariance matrix can become computationally intensive or in scenarios where multicollinearity is a concern.

weights_left: xarray.DataArray

Weights to be applied to the left input data.

weights_right: xarray.DataArray

Weights to be applied to the right input data.

random_state: int, default None.

Seed for the random number generator.

solver: {“auto”, “full”, “randomized”}, default: “auto”.

Solver to use for the SVD computation.

solver_kwargs: dict, default {}.

Additional keyword arguments to be passed to the SVD solver.

Returns

xeofs.cross.MCA

Reference

  • Bretherton, C. S., Smith, C., & Wallace, J. M. (1992). An Intercomparison of Methods for Finding Coupled Patterns in Climate Data. Journal of Climate, 5(6), 541-560. https://doi.org/10.1175/1520-0442(1992)005<0541:AIOMFF>2.0.CO;2

  • Cherry, S. (1996). Singular Value Decomposition Analysis and Canonical Correlation Analysis. Journal of Climate, 9(9), 2003-2009. https://doi.org/10.1175/1520-0442(1996)009<2003:SVDAAC>2.0.CO;2
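The core of MCA, as described in the references above, is a singular value decomposition of the cross-covariance matrix between the two fields. The NumPy sketch below illustrates that idea on synthetic data without PCA preprocessing or weighting; all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
shared = rng.normal(size=n)  # common signal driving both fields
X = np.outer(shared, [1.0, -1.0, 0.5]) + 0.1 * rng.normal(size=(n, 3))
Y = np.outer(shared, [2.0, 1.0, 0.0, -1.0]) + 0.1 * rng.normal(size=(n, 4))
X = X - X.mean(axis=0)
Y = Y - Y.mean(axis=0)

# Cross-covariance matrix between the left and right field, shape (3, 4).
C = X.T @ Y / (n - 1)
U, s, Vt = np.linalg.svd(C, full_matrices=False)

scores_x = X @ U      # left expansion coefficients
scores_y = Y @ Vt.T   # right expansion coefficients

# The covariance of each pair of scores equals the matching singular value.
cov_pairs = np.sum(scores_x * scores_y, axis=0) / (n - 1)
```

The columns of U and the rows of Vt are the left and right singular vectors, and the leading pair captures the largest covariance between the two fields.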

easyclimate.core.eof.save_MCA_model(model: xeofs.cross.MCA, path: str, overwrite: bool = False, save_data: bool = False, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs)

Save the model.

Parameters

model: xeofs.cross.MCA

A fitted xeofs.cross.MCA model, as returned by easyclimate.eof.get_MCA_model or xeofs.cross.mca.MCA.fit.

path: str

Path to save the model.

overwrite: bool, default False

Whether or not to overwrite the existing path if it already exists. Ignored unless engine = “zarr”.

save_data: bool, default False

Whether or not to save the full input data along with the fitted components.

engine: {“zarr”, “netcdf4”, “h5netcdf”}, default “zarr”

Xarray backend engine to use for writing the saved model.

**kwargs: dict.

Additional keyword arguments to pass to xarray.DataTree.to_netcdf() or xarray.DataTree.to_zarr().

easyclimate.core.eof.load_MCA_model(path: str, engine: Literal['zarr', 'netcdf4', 'h5netcdf'] = 'zarr', **kwargs) → xeofs.cross.MCA

Load a saved MCA model.

Parameters

path: str

Path to the saved model.

engine: {“zarr”, “netcdf4”, “h5netcdf”}, default “zarr”

Xarray backend engine to use for reading the saved model.

**kwargs: dict.

Additional keyword arguments to pass to open_datatree().

Returns

The loaded xeofs.cross.MCA model, equivalent to the result of easyclimate.eof.get_MCA_model or xeofs.cross.mca.MCA.fit.

easyclimate.core.eof.calc_MCA_analysis(model: xeofs.cross.MCA, correction=None, alpha=0.05, PC_normalized: bool = True) → easyclimate.core.datanode.DataNode

Calculate the results of the MCA model.

Parameters

model: xeofs.cross.MCA

A fitted xeofs.cross.MCA model, as returned by easyclimate.eof.get_MCA_model or xeofs.cross.mca.MCA.fit.

correction: str, default None

Method to apply a multiple testing correction. If None, no correction is applied. Available methods are:

  • bonferroni : one-step correction

  • sidak : one-step correction

  • holm-sidak : step down method using Sidak adjustments

  • holm : step-down method using Bonferroni adjustments

  • simes-hochberg : step-up method (independent)

  • hommel : closed method based on Simes tests (non-negative)

  • fdr_bh : Benjamini/Hochberg (non-negative)

  • fdr_by : Benjamini/Yekutieli (negative)

  • fdr_tsbh : two stage fdr correction (non-negative)

  • fdr_tsbky : two stage fdr correction (non-negative)

alpha: float, default 0.05

The desired family-wise error rate. Not used if correction is None.

PC_normalized: bool, default True.

Whether to normalize the scores by the L2 norm (singular values).
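To give a feel for what the correction and alpha parameters control, the sketch below applies two of the listed methods (bonferroni and fdr_bh) to a handful of made-up p-values. The helper functions are hypothetical and are not what the library calls internally; they only show how the rejection decision changes with the method.

```python
import numpy as np

def bonferroni(p_values, alpha=0.05):
    """One-step correction: compare every p-value with alpha / m."""
    p = np.asarray(p_values, dtype=float).ravel()
    return p <= alpha / p.size

def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up FDR control: reject the k smallest p-values, where k is the
    largest rank whose sorted p-value lies below alpha * rank / m."""
    p = np.asarray(p_values, dtype=float).ravel()
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

# Made-up p-values for illustration only.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected_bonferroni = bonferroni(p_vals)
rejected_bh = benjamini_hochberg(p_vals)
```

For these values the Bonferroni rule keeps only the smallest p-value, while the Benjamini/Hochberg rule keeps the two smallest, reflecting its less conservative nature.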

Returns

The results of the MCA model (easyclimate.DataNode).

  • EOF: The singular vectors of the left and right field.

  • PC: The scores of the left and right field. The scores in MCA are the projection of the left and right field onto the left and right singular vector of the cross-covariance matrix.

  • correlation_coefficients_X: Get the correlation coefficients for the scores of \(X\).

The correlation coefficients of the scores of \(X\) are given by:

\[c_{x, ij} = \text{corr} \left(\mathbf{r}_{x, i}, \mathbf{r}_{x, j} \right)\]

where \(\mathbf{r}_{x, i}\) and \(\mathbf{r}_{x, j}\) are the \(i\) th and \(j\) th scores of \(X\).

  • correlation_coefficients_Y: Get the correlation coefficients for the scores of \(Y\).

The correlation coefficients of the scores of \(Y\) are given by:

\[c_{y, ij} = \text{corr} \left(\mathbf{r}_{y, i}, \mathbf{r}_{y, j} \right)\]

where \(\mathbf{r}_{y, i}\) and \(\mathbf{r}_{y, j}\) are the \(i\) th and \(j\) th scores of \(Y\).

  • covariance_fraction_CD95: Get the covariance fraction (CF).

Cheng and Dunkerton (1995) define the CF as follows:

\[CF_i = \frac{\sigma_i}{\sum_{j=1}^{m} \sigma_j}\]

where \(m\) is the total number of modes and \(\sigma_i\) is the \(i\)-th singular value of the cross-covariance matrix.

This implementation estimates the sum of singular values from the first n modes only. One should therefore retain as many modes as possible to obtain a good estimate of the covariance fraction.

Note

In MCA, the focus is on maximizing the squared covariance (SC). As a result, this quantity is preserved during decomposition - meaning the SC of both datasets remains unchanged before and after decomposition. Each mode explains a fraction of the total SC, and together, all modes can reconstruct the total SC of the cross-covariance matrix. However, the (non-squared) covariance is not invariant in MCA; it is not preserved by the individual modes and cannot be reconstructed from them. Consequently, the squared covariance fraction (SCF) is invariant in MCA and is typically used to assess the relative importance of each mode. In contrast, the covariance fraction (CF) is not invariant. Cheng and Dunkerton (1995) introduced the CF to compare the relative importance of modes before and after Varimax rotation in MCA. Notably, when the data fields in MCA are identical, the CF corresponds to the explained variance ratio in Principal Component Analysis (PCA).

  • cross_correlation_coefficients: Get the cross-correlation coefficients.

The cross-correlation coefficients between the scores of \(X\) and \(Y\) are computed as:

\[c_{xy, i} = \text{corr} \left(\mathbf{r}_{x, i}, \mathbf{r}_{y, i} \right)\]

where \(\mathbf{r}_{x, i}\) and \(\mathbf{r}_{y, i}\) are the \(i\) th scores of \(X\) and \(Y\).

Note

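When \(\alpha=0\), the cross-correlation coefficients are equivalent to the canonical correlation coefficients. Here \(\alpha\) denotes the power parameter of Continuum Power CCA (Swenson, 2015), not the alpha argument of this function.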

  • fraction_variance_X_explained_by_X: Get the fraction of variance explained (FVE X).

The FVE X is the fraction of variance in \(X\) explained by the scores of \(X\).

It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[FVE_{X|X,i} = 1 - \frac{\|\mathbf{d}_{X,i}\|_F^2}{\|X\|_F^2}\]

where \(\mathbf{d}_{X,i}\) are the residuals of the input data \(X\) after reconstruction by the \(i\) th scores of \(X\).

  • fraction_variance_Y_explained_by_X: Get the fraction of variance explained (FVE YX).

The FVE YX is the fraction of variance in \(Y\) explained by the scores of \(X\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[FVE_{Y|X,i} = 1 - \frac{\|(X^TX)^{-1/2} \mathbf{d}_{X,i}^T \mathbf{d}_{Y,i}\|_F^2}{\|(X^TX)^{-1/2} X^TY\|_F^2}\]

where \(\mathbf{d}_{X,i}\) and \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(X\) and \(Y\) after reconstruction by the \(i\) th scores of \(X\) and \(Y\), respectively.

  • fraction_variance_Y_explained_by_Y: Get the fraction of variance explained (FVE Y).

The FVE Y is the fraction of variance in \(Y\) explained by the scores of \(Y\). It is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[FVE_{Y|Y,i} = 1 - \frac{\|\mathbf{d}_{Y,i}\|_F^2}{\|Y\|_F^2}\]

where \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(Y\) after reconstruction by the \(i\) th scores of \(Y\).

  • squared_covariance_fraction: Get the squared covariance fraction (SCF).

The SCF is computed as a weighted mean-square error (see equation (15) in Swenson (2015)):

\[SCF_{i} = 1 - \frac{\|\mathbf{d}_{X,i}^T \mathbf{d}_{Y,i}\|_F^2}{\|X^TY\|_F^2}\]

where \(\mathbf{d}_{X,i}\) and \(\mathbf{d}_{Y,i}\) are the residuals of the input data \(X\) and \(Y\) after reconstruction by the \(i\) th scores of \(X\) and \(Y\), respectively.

  • heterogeneous_patterns: The heterogeneous patterns of the left and right field.

The heterogeneous patterns are the correlation coefficients between the input data and the scores of the other field.

More precisely, the heterogeneous patterns \(r_{\mathrm{het}}\) are defined as

\[r_{\mathrm{het}, x} = \text{corr} \left(X, A_y \right), \ r_{\mathrm{het}, y} = \text{corr} \left(Y, A_x \right)\]

where \(X\) and \(Y\) are the input data, \(A_x\) and \(A_y\) are the scores of the left and right field, respectively.

  • homogeneous_patterns: The homogeneous patterns of the left and right field.

The homogeneous patterns are the correlation coefficients between the input data and the scores.

More precisely, the homogeneous patterns \(r_{\mathrm{hom}}\) are defined as

\[r_{\mathrm{hom}, x} = \text{corr} \left(X, A_x \right), \ r_{\mathrm{hom}, y} = \text{corr} \left(Y, A_y \right)\]

where \(X\) and \(Y\) are the input data, \(A_x\) and \(A_y\) are the scores of the left and right field, respectively.
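The cross-correlation coefficients and the homogeneous and heterogeneous patterns listed above are all Pearson correlations between input data and scores. The following NumPy sketch computes them for synthetic data; `corr_columns` is a hypothetical helper, and the scores are obtained from a plain SVD of the cross-covariance matrix rather than from the library.

```python
import numpy as np

def corr_columns(A, B):
    """Pearson correlation between every column of A and every column of B."""
    A = (A - A.mean(axis=0)) / A.std(axis=0)
    B = (B - B.mean(axis=0)) / B.std(axis=0)
    return A.T @ B / A.shape[0]

rng = np.random.default_rng(4)
n = 80
shared = rng.normal(size=n)
X = np.outer(shared, [1.0, -1.0, 0.5]) + 0.3 * rng.normal(size=(n, 3))
Y = np.outer(shared, [2.0, 1.0, -1.0, 0.3]) + 0.3 * rng.normal(size=(n, 4))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

U, s, Vt = np.linalg.svd(X.T @ Y / (n - 1), full_matrices=False)
A_x, A_y = X @ U, Y @ Vt.T  # scores of the left and right field

hom_x = corr_columns(X, A_x)  # corr(X, A_x), shape (feature_x, mode)
hom_y = corr_columns(Y, A_y)  # corr(Y, A_y)
het_x = corr_columns(X, A_y)  # corr(X, A_y)
het_y = corr_columns(Y, A_x)  # corr(Y, A_x)

# Cross-correlation between the i-th left and right scores.
cross_corr = np.diag(corr_columns(A_x, A_y))
```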

Reference

  • Cheng, X., & Dunkerton, T. J. (1995). Orthogonal Rotation of Spatial Patterns Derived from Singular Value Decomposition Analysis. Journal of Climate, 8(11), 2631-2643. https://doi.org/10.1175/1520-0442(1995)008<2631:OROSPD>2.0.CO;2

  • Swenson, E. (2015). Continuum Power CCA: A Unified Approach for Isolating Coupled Modes. Journal of Climate, 28(3), 1016-1030. https://doi.org/10.1175/JCLI-D-14-00451.1
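The difference between the covariance fraction (CF) and the squared covariance fraction (SCF) can be made concrete with a few lines of NumPy. The sketch retains all modes and reads "residuals after reconstruction by the \(i\) th scores" as removing the rank-one contribution of mode \(i\) from each field; that reading is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = rng.normal(size=(n, 4))
Y = X @ rng.normal(size=(4, 5)) + rng.normal(size=(n, 5))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

C = X.T @ Y
U, s, Vt = np.linalg.svd(C, full_matrices=False)

cf = s / s.sum()            # covariance fraction (Cheng and Dunkerton, 1995)
scf = s**2 / np.sum(s**2)   # squared covariance fraction

# The total squared covariance is the squared Frobenius norm of X^T Y.
total_sc = np.linalg.norm(C, "fro") ** 2

# Residual-based SCF for the leading mode (assumed reading of the formula).
i = 0
d_X = X - np.outer(X @ U[:, i], U[:, i])
d_Y = Y - np.outer(Y @ Vt[i], Vt[i])
scf_residual = 1 - np.linalg.norm(d_X.T @ d_Y, "fro") ** 2 / total_sc
```

Under that reading the residual-based expression reproduces the ratio of squared singular values, and the leading mode always has an SCF at least as large as its CF.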

easyclimate.core.eof.get_MCA_projection(model: xeofs.cross.mca.MCA, data_left: xarray.DataArray | xarray.Dataset, data_right: xarray.DataArray | xarray.Dataset, normalized: bool = True) → easyclimate.core.datanode.DataNode

Get the expansion coefficients of “unseen” data. The expansion coefficients are obtained by projecting data onto the singular vectors.

Parameters

model: xeofs.cross.MCA

A fitted xeofs.cross.MCA model, as returned by easyclimate.eof.get_MCA_model or xeofs.cross.mca.MCA.fit.

data_left: xarray.DataArray or xarray.Dataset

Left input data. Must be provided if data_right is not provided.

data_right: xarray.DataArray or xarray.Dataset

Right input data. Must be provided if data_left is not provided.

normalized: bool, default True.

Whether to return L2 normalized scores.

Returns

scores: easyclimate.DataNode

  • scores1: Left scores.

  • scores2: Right scores.

easyclimate.core.eof.calc_eof_projection_coefficient(data_field: xarray.DataArray, eof_mode: xarray.DataArray, time_dim: str = 'time')

Compute the EOF projection coefficients for projecting a data field onto an EOF mode over common spatial dimensions. This is useful in EOF (Empirical Orthogonal Function) analysis for climate or geophysical data, where the field is projected onto spatial modes to obtain time-varying coefficients.

The mathematical foundation is based on the decomposition (\(\mathbf{X} = \mathbf{V} \mathbf{T}\)), solving for the coefficients (\(\mathbf{T} = \mathbf{V}^{-1} \mathbf{X}\)). For a single normalized EOF mode (\(\mathbf{V}\)), this simplifies to the projection:

\[t = \frac{\sum (x \cdot v)}{\sum v^2}\]

where the summation is over the stacked spatial (pattern) dimensions, and (\(\mathbf{X}\)) is the data field (potentially with a ‘time’ dimension), (\(\mathbf{V}\)) is the EOF mode (spatial pattern).

The spatial pattern dimensions are automatically detected as the intersection of the input dimensions, excluding ‘time’ (if present). Both inputs are stacked along these pattern dimensions into a temporary ‘pattern’ dimension, and the projection is computed along it. NaN values are filled with 0 before computation.

  • If data_field lacks ‘time’, the result is a scalar.

  • If data_field has ‘time’ and eof_mode does not, the result preserves the ‘time’ dimension.

  • Broadcasting occurs automatically for compatible shapes.
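The projection formula and the NaN and zero-norm handling described above can be mirrored in plain NumPy. This is a sketch under the stated behaviour, not the function's source: `projection_coefficient` is a hypothetical helper that expects the spatial axes of the mode to be the trailing axes of the field.

```python
import numpy as np

def projection_coefficient(field, mode):
    """t = sum(x * v) / sum(v**2) over the spatial axes shared with the mode."""
    x = np.nan_to_num(np.asarray(field, dtype=float), nan=0.0)  # NaN -> 0
    v = np.nan_to_num(np.asarray(mode, dtype=float), nan=0.0)
    spatial_axes = tuple(range(x.ndim - v.ndim, x.ndim))
    numerator = np.sum(x * v, axis=spatial_axes)
    denominator = np.sum(v * v)
    if denominator == 0:
        # Zero-norm mode: return 0 instead of dividing by zero.
        return np.zeros_like(numerator)
    return numerator / denominator

mode = np.array([[1.0, 2.0], [3.0, 4.0]])

# A single spatial field gives a scalar coefficient.
t_scalar = projection_coefficient(2.0 * mode, mode)

# A (time, lat, lon) field gives one coefficient per time step.
timed = np.stack([mode, 3.0 * mode, np.zeros_like(mode)])
t_series = projection_coefficient(timed, mode)

# A missing value contributes nothing to the numerator.
gappy = np.array([[np.nan, 2.0], [3.0, 4.0]])
t_gappy = projection_coefficient(gappy, mode)
```

A field that is an exact multiple of the mode returns that multiple, which is a quick way to check the sign and scaling conventions.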

Parameters

data_field: xarray.DataArray

The input data field to project (e.g., time series of spatial fields \(\mathbf{X}\)).

eof_mode: xarray.DataArray

The EOF spatial mode \(\mathbf{V}\) (must have compatible spatial dimensions).

time_dim: str, default: time.

The time coordinate dimension name.

Returns

coefficients: xarray.DataArray or scalar

The EOF projection coefficients (\(\mathbf{T}\)). Dimensions match the non-spatial dimensions of data_field (e.g., ‘time’ if present).

Note

  • Assumes inputs have compatible shapes and the only differing dimension is ‘time’ in data_field.

  • NaNs are filled with 0 to avoid propagation; adjust if needed.

  • For zero-norm cases in the denominator, the result is set to 0.

Examples

Scalar projection for a single spatial field:

>>> import xarray as xr
>>> import numpy as np
>>> import easyclimate as ecl
>>> # Create a random number generator with a fixed seed.
>>> rng = np.random.default_rng(42)
>>> field = xr.DataArray(rng.random((2, 3)), dims=['lat', 'lon'])
>>> eof_v = xr.DataArray(rng.random((2, 3)), dims=['lat', 'lon'])
>>> coeff = ecl.eof.calc_eof_projection_coefficient(field, eof_v)
>>> print(coeff)
<xarray.DataArray 'eof_projection' ()> Size: 8B
array(0.95208032)
Attributes:
    long_name:  EOF Projection Coefficient
    units:

Time series projection:

>>> # Create a random number generator with a fixed seed.
>>> rng = np.random.default_rng(42)
>>> time = xr.DataArray(np.arange(4), dims=['time'])
>>> timed_field = xr.DataArray(rng.random((4, 2, 3)), dims=['time', 'lat', 'lon'])
>>> coeff_time = ecl.eof.calc_eof_projection_coefficient(timed_field, eof_v)
>>> print(coeff_time)
<xarray.DataArray 'eof_projection' (time: 4)> Size: 32B
array([0.95208032, 1.        , 0.64684219, 1.06549741])
Dimensions without coordinates: time
Attributes:
    long_name:  EOF Projection Coefficient
    units: