cr.cube package¶
Submodules¶
cr.cube.dimension module¶
Contains implementation of the Dimension class, for Crunch Cubes.
-
class cr.cube.dimension.Dimension(dim, selections=None)[source]¶
Bases: object
Implementation of the Dimension class for Crunch Cubes.
This class contains all the utility functions for working with Crunch Cube dimensions. It also hides some of the internal implementation details from the user, especially for Multiple Response variables.
-
elements¶ Get elements of the crunch Dimension.
For categorical variables, the elements are represented by categories internally. For other variable types, actual ‘elements’ of the Crunch Cube JSON response are returned.
-
type¶ Get type of the Crunch Dimension.
-
valid_indices(include_missing)[source]¶ Gets valid indices of Crunch Cube Dimension’s elements.
The CrunchCube class uses this function to correctly calculate the indices of the result that is returned to the user. In most cases, the invalid indices are those of the missing values.
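As a rough illustration of the idea, the sketch below filters element positions by a `missing` flag. The element dictionaries and the helper name `valid_indices_sketch` are assumptions made for this example; they are not the library's actual data model or implementation.

```python
def valid_indices_sketch(elements, include_missing):
    """Return positions of the elements that should appear in a result.

    `elements` is a hypothetical list of dicts, each with a boolean
    "missing" key; the real Dimension internals may differ.
    """
    if include_missing:
        # Every element is kept when missing values are requested.
        return list(range(len(elements)))
    # Otherwise keep only the positions of non-missing elements.
    return [i for i, el in enumerate(elements) if not el.get("missing", False)]


elements = [
    {"name": "Yes", "missing": False},
    {"name": "No", "missing": False},
    {"name": "No Data", "missing": True},
]
print(valid_indices_sketch(elements, include_missing=False))  # [0, 1]
print(valid_indices_sketch(elements, include_missing=True))   # [0, 1, 2]
```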
-
cr.cube.crunch_cube module¶
Home of the CrunchCube class.
This module contains the definition of the CrunchCube class. It represents the open-source library used for manipulating the crunch cubes (JSON responses from the Crunch.io platform).
-
class cr.cube.crunch_cube.CrunchCube(response)[source]¶
Bases: object
Implementation of the CrunchCube API class.
This class implements the main API functions needed for seamless integration with crunch cube responses (from the Crunch.io platform).
- Main API functions are:
- as_array
- margin
- proportions
- percentages
These functions retrieve statistical information of interest from the JSON-like crunch cubes. Complete usage of each API function is described within the appropriate docstring.
Crunch Cubes contain richer metadata than standard Python objects, and they also conceal certain complexity in the data structures from the user. In particular, Multiple Response variables are generally represented as single dimensions in result tables, but in the actual data they may consist of two dimensions. These API methods understand the subtleties of the Crunch data types and correctly compute margins and percentages from them.
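To make the Multiple Response point concrete, here is a minimal NumPy sketch of a single MR variable stored as two dimensions. The layout (one row per response item, columns for selected and not selected) and the counts are assumptions chosen for illustration, not the exact structure of a Crunch response.

```python
import numpy as np

# Hypothetical MR counts: one row per response item, columns are
# (selected, not selected). This layout is an assumption for the sketch.
mr_counts = np.array([
    [3, 7],
    [6, 4],
    [1, 9],
])

selected = mr_counts[:, 0]
# Margin per item: the sum of the selected and non-selected slices.
mr_margin = mr_counts.sum(axis=1)
# Proportion of respondents who selected each item.
mr_proportions = selected / mr_margin

print(mr_margin.tolist())       # [10, 10, 10]
print(mr_proportions.tolist())  # [0.3, 0.6, 0.1]
```

In a result table the user would see only one value per item, even though two slices were needed to compute it.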
-
as_array(include_missing=False, weighted=True)[source]¶ Get crunch cube as ndarray.
Returns the tabular representation of the crunch cube. The returned value has as many dimensions as the crunch cube itself. For example, for a cross-tab of a categorical and a numerical variable, the resulting array has two dimensions.
- Args
- include_missing (bool): Include rows/cols for missing values
- Returns
- (ndarray): Tabular representation of the crunch cube
- Example 1 (Categorical x Categorical):
>>> cube = CrunchCube(response)
>>> cube.as_array()
np.array([
    [5, 2],
    [5, 3],
])
- Example 2 (Categorical x Categorical, include missing values):
>>> cube = CrunchCube(response)
>>> cube.as_array(include_missing=True)
np.array([
    [5, 3, 2, 0],
    [5, 2, 3, 0],
    [0, 0, 0, 0],
])
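One way to picture the effect of `include_missing` is as a selection of the valid rows and columns from the full table. The sketch below uses plain NumPy with made-up counts; the positions of the missing row and column are assumptions, and this is not necessarily how the library performs the selection.

```python
import numpy as np

# Hypothetical full table: the last row and the last column are assumed
# to belong to missing categories.
full = np.array([
    [4, 1, 0],
    [2, 3, 0],
    [0, 0, 1],
])
valid_rows = [0, 1]
valid_cols = [0, 1]

# np.ix_ builds an open mesh so rows and columns are selected together.
without_missing = full[np.ix_(valid_rows, valid_cols)]
print(without_missing.tolist())  # [[4, 1], [2, 3]]
```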
-
dimensions¶ Dimensions of the crunch cube.
-
labels(include_missing=False)[source]¶ Gets labels for each of the cube’s dimensions.
- Args
- include_missing (bool): Include labels for missing values
- Returns
- labels (list of lists): Labels for each dimension
-
margin(axis=None, weighted=True)[source]¶ Get margin for the selected axis.
For MR variables, this is the sum of the selected and non-selected slices.
- Args
- axis (int): Axis across which the margin is calculated. If no axis is provided, the margin is calculated across all axes. For Categorical, Numeric, Datetime, and Text variables, this translates to the summation of all elements.
- Returns
- Calculated margin for the selected axis
- Example 1:
>>> cube = CrunchCube(fixt_cat_x_cat)
>>> cube.as_array()
np.array([
    [5, 2],
    [5, 3],
])
>>> cube.margin(axis=0)
np.array([10, 5])
- Example 2:
>>> cube = CrunchCube(fixt_cat_x_num_x_datetime)
>>> cube.as_array()
np.array([
    [[1, 1], [0, 0], [0, 0], [0, 0]],
    [[2, 1], [1, 1], [0, 0], [0, 0]],
    [[0, 0], [2, 3], [0, 0], [0, 0]],
    [[0, 0], [0, 0], [3, 2], [0, 0]],
    [[0, 0], [0, 0], [1, 1], [0, 1]],
])
>>> cube.margin(axis=0)
np.array([
    [3, 2],
    [3, 4],
    [4, 3],
    [0, 1],
])
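For non-MR variables, the margins shown above match a plain sum over the chosen axis. The following NumPy sketch reproduces the numbers from Example 1; it illustrates the arithmetic only and makes no claim about the library's internals.

```python
import numpy as np

# Counts from Example 1 (Categorical x Categorical).
counts = np.array([
    [5, 2],
    [5, 3],
])

# Margin along axis 0: sum the counts down each column.
margin_axis0 = counts.sum(axis=0)
# No axis: sum every element of the table.
margin_total = counts.sum()

print(margin_axis0.tolist())  # [10, 5]
print(int(margin_total))      # 15
```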
-
percentages(axis=None)[source]¶ Get the percentages for crunch cube values.
This function calculates the percentages for crunch cube values. The percentages are based on the values of the ‘proportions’.
- Args
- axis (int): Base axis of the percentages calculation. If no axis is provided, calculations are done across the entire table.
- Returns
- (ndarray): Calculated array of crunch cube percentages.
- Example 1:
>>> cube = CrunchCube(fixt_cat_x_cat)
>>> cube.as_array()
np.array([
    [5, 2],
    [5, 3],
])
>>> cube.percentages()
np.array([
    [33.33333, 13.33333],
    [33.33333, 20.00000],
])
- Example 2:
>>> cube = CrunchCube(fixt_cat_x_cat)
>>> cube.as_array()
np.array([
    [5, 2],
    [5, 3],
])
>>> cube.percentages(axis=0)
np.array([
    [50., 40.],
    [50., 60.],
])
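The figures in both examples are consistent with scaling proportions by 100. A small NumPy check of that arithmetic, using the same counts, might look like this (a sketch of the calculation, not the library's code):

```python
import numpy as np

counts = np.array([
    [5, 2],
    [5, 3],
])

# Whole-table percentages: each cell as a share of the grand total.
table_pct = counts / counts.sum() * 100
# Percentages with axis 0 as the base: each cell as a share of its column total.
col_pct = counts / counts.sum(axis=0) * 100

print(np.round(table_pct, 2).tolist())  # [[33.33, 13.33], [33.33, 20.0]]
print(np.round(col_pct, 2).tolist())    # [[50.0, 40.0], [50.0, 60.0]]
```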
-
proportions(axis=None)[source]¶ Get proportions of a crunch cube.
This function calculates the proportions across the selected axis of a crunch cube. For most variable types, this means the value divided by the margin value. For Multiple Response types, the value is divided by the sum of the selected and non-selected slices.
- Args
- axis (int): Base axis of the proportions calculation. If no axis is provided, calculations are done across the entire table.
- Returns
- (ndarray): Calculated array of crunch cube proportions.
- Example 1:
>>> cube = CrunchCube(fixt_cat_x_cat)
>>> cube.as_array()
np.array([
    [5, 2],
    [5, 3],
])
>>> cube.proportions()
np.array([
    [0.3333333, 0.1333333],
    [0.3333333, 0.2000000],
])
- Example 2:
>>> cube = CrunchCube(fixt_cat_x_cat)
>>> cube.as_array()
np.array([
    [5, 2],
    [5, 3],
])
>>> cube.proportions(axis=0)
np.array([
    [0.5, 0.4],
    [0.5, 0.6],
])
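For a Categorical x Categorical table, "value divided by the margin value" can be checked by hand with NumPy. The sketch below reproduces the numbers from both examples; it covers only the non-MR case and is not taken from the library's source.

```python
import numpy as np

counts = np.array([
    [5, 2],
    [5, 3],
])

# No axis: divide every cell by the grand total (15).
table_props = counts / counts.sum()
# axis=0: divide each cell by its column margin ([10, 5]); broadcasting
# aligns the margin with the columns.
col_props = counts / counts.sum(axis=0)

print(np.round(table_props, 7).tolist())
print(col_props.tolist())  # [[0.5, 0.4], [0.5, 0.6]]
```

Each column of `col_props` sums to 1, which is a quick sanity check that the margin was taken along the intended axis.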