Sparse

class pybear.new_numpy.random.Sparse(minimum, maximum, shape, sparsity=0, engine='default', dtype=<class 'float'>)

Bases: object

Return random values from a “discrete uniform” (integer) or “uniform” (float) distribution of the specified dtype in the “half-open” interval [minimum, maximum) (includes the minimum, but excludes the maximum), with the desired sparsity.

Samples are uniformly distributed over the interval. In other words, any value within the given interval is equally likely to be drawn.
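As a rough illustration of the half-open interval, the sketch below draws from NumPy's Generator API directly rather than from Sparse itself. The bounds 0 and 10, the sample size, and the seed are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Integer draws: discrete uniform on [0, 10), so 10 itself never appears.
ints = rng.integers(0, 10, size=10_000)

# Float draws: continuous uniform on [0, 10).
floats = rng.uniform(0, 10, size=10_000)

print(ints.min(), ints.max())
print(floats.min() >= 0, floats.max() < 10)
```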

The sparse array is constructed at instantiation of the Sparse class, and is accessible via the sparse_array_ attribute of the instance. This means that Sparse cannot generate arrays dynamically; one instance, one array. If you want to change any of the parameters of the instance to create a new array, you need to create a new instance.

Engine

Sparse has different engines for populating the output array with zeros to the desired sparsity. Some engines offer higher speed with lower accuracy, while others have higher accuracy at the expense of speed. “default” behavior is a hybrid of “filter” and “iterative”.

“choice”

Build a full-size mask whose sparse locations are drawn by numpy.random.choice from [0,1], with the ‘p’ argument set to achieve the desired sparsity. Then build a full-size, 100% dense numpy.ndarray filled as dictated by the parameters, and apply the mask to populate it with zeros.
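The “choice” idea can be sketched with plain NumPy as follows. This is a minimal illustration, not pybear's implementation: the function name sparse_via_choice and the seed argument are hypothetical, only float output is handled, and sparsity is assumed to be a percentage from 0 to 100.

```python
import numpy as np

def sparse_via_choice(minimum, maximum, shape, sparsity, seed=None):
    """Hypothetical sketch of a mask-by-choice approach (floats only)."""
    rng = np.random.default_rng(seed)
    p_zero = sparsity / 100  # assumption: sparsity is a percentage
    # Mask of 0s and 1s; 0 marks a sparse location, 1 a dense one.
    mask = rng.choice([0, 1], size=shape, p=[p_zero, 1 - p_zero])
    # Fully dense array of floats in [minimum, maximum).
    dense = rng.uniform(minimum, maximum, size=shape)
    # Multiplying by the mask zeroes out the sparse locations.
    return dense * mask

arr = sparse_via_choice(1, 10, (200, 200), 50, seed=0)
achieved = 100 * (arr == 0).mean()
print(f"achieved sparsity: {achieved:.2f}%")
```

Because each position is zeroed independently, the achieved sparsity only approximates the target, and the approximation is looser for small arrays.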

“filter”

Generate an array filled randomly from [1,100000] and convert the array to a mask that fixes the sparse locations by applying a number filter derived from the target sparsity. Generate a 100% dense array of integers or floats then apply the mask to it to achieve sparsity.
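A minimal sketch of the “filter” idea, under stated assumptions: the function name sparse_via_filter and the seed argument are hypothetical, the exact cutoff rule is a guess at what “a number filter derived from the target sparsity” could look like, and only float output is handled.

```python
import numpy as np

def sparse_via_filter(minimum, maximum, shape, sparsity, seed=None):
    """Hypothetical sketch of a threshold-filter approach (floats only)."""
    rng = np.random.default_rng(seed)
    # Random integers in [1, 100000]; the upper bound of integers() is exclusive.
    draws = rng.integers(1, 100_001, size=shape)
    # Assumed filter rule: draws at or below the cutoff become zeros.
    cutoff = sparsity / 100 * 100_000
    mask = draws > cutoff  # True marks a dense location
    # Fully dense array of floats in [minimum, maximum).
    dense = rng.uniform(minimum, maximum, size=shape)
    return np.where(mask, dense, 0.0)

arr = sparse_via_filter(1, 10, (200, 200), 30, seed=0)
achieved = 100 * (arr == 0).mean()
print(f"achieved sparsity: {achieved:.2f}%")
```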

“serialized”

Generate a serialized list of unique indices and random values (or zeros) then map the values (or zeros) into a fully sparse (or dense) array.

1. Determine the number of dense (or sparse) positions in the target array.

2. Generate that number of random dense (or sparse) indices serially using pybear.new_numpy.random.choice without replacement. This guarantees no duplicate indices.

3. Generate an equally-sized vector of dense values (or zeros).

4. Map the vector of values (or zeros) to the index positions in a 100% sparse (or dense) full-sized array.
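The four steps above can be sketched as follows. This is an illustration under assumptions, not pybear's code: it substitutes NumPy's Generator.choice for pybear.new_numpy.random.choice, covers only the variant that places dense values into an all-zero array, handles floats only, and uses the hypothetical name sparse_via_serialized.

```python
import numpy as np

def sparse_via_serialized(minimum, maximum, shape, sparsity, seed=None):
    """Hypothetical sketch of the unique-index approach (floats only)."""
    rng = np.random.default_rng(seed)
    total = int(np.prod(shape))
    # Step 1: number of dense positions (sparsity assumed to be a percentage).
    n_dense = total - round(total * sparsity / 100)
    # Step 2: unique flat indices, drawn without replacement.
    idxs = rng.choice(total, size=n_dense, replace=False)
    # Step 3: an equally-sized vector of dense values.
    values = rng.uniform(minimum, maximum, size=n_dense)
    # Step 4: map the values into a 100% sparse array.
    out = np.zeros(total, dtype=float)
    out[idxs] = values
    return out.reshape(shape)

arr = sparse_via_serialized(1, 10, (10, 10), 30, seed=0)
n_zeros = int((arr == 0).sum())
print(n_zeros)
```

Since the indices are unique, the number of zeros matches the target exactly, provided no drawn value is itself zero.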

“iterative”

Generate a serialized list of not-necessarily-unique indices and random values (or zeros), then map the values (or zeros) into a fully sparse (or dense) array. Repeat iteratively until the desired sparsity is achieved. Same as ‘serialized’ except these indices are not necessarily unique and the process is iterative.

1. Determine the number of dense (or sparse) positions in the target array.

2. Generate that number of random dense (or sparse) indices serially with replacement; this does not guarantee non-duplicate indices.

3. Generate an equally-sized vector of values (or zeros).

4. Map the vector of values (or zeros) to the index positions in a 100% sparse (or dense) full-sized array.

5. Because there may have been duplicate indices, repeat steps 2-4 until desired sparsity is achieved.
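The loop described in these steps might look like the sketch below. It is an assumption-laden illustration rather than pybear's implementation: the name sparse_via_iterative is hypothetical, only the variant that fills dense values into an all-zero array is shown, floats only, and each pass requests just the number of positions still missing.

```python
import numpy as np

def sparse_via_iterative(minimum, maximum, shape, sparsity, seed=None):
    """Hypothetical sketch of the repeat-until-filled approach (floats only)."""
    rng = np.random.default_rng(seed)
    total = int(np.prod(shape))
    # Step 1: number of dense positions (sparsity assumed to be a percentage).
    target_dense = total - round(total * sparsity / 100)
    out = np.zeros(total, dtype=float)
    filled = np.zeros(total, dtype=bool)  # tracks which positions are dense
    # Step 5: repeat until enough distinct positions have been filled.
    while filled.sum() < target_dense:
        need = target_dense - int(filled.sum())
        # Step 2: indices drawn with replacement, so duplicates can occur.
        idxs = rng.integers(0, total, size=need)
        # Step 3: an equally-sized vector of values.
        values = rng.uniform(minimum, maximum, size=need)
        # Step 4: map the values to the index positions.
        out[idxs] = values
        filled[idxs] = True
    return out.reshape(shape)

arr = sparse_via_iterative(1, 10, (10, 10), 30, seed=0)
n_zeros = int((arr == 0).sum())
print(n_zeros)
```

Each pass can add at most the number of positions still missing, so the loop never overshoots the target density.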

“default”

A hybrid of “filter” and “iterative” that balances speed and accuracy. When the size of the target object is less than 1,000,000, the fastest methods, “filter” and “choice”, have difficulty achieving the target sparsity, so the more accurate but slower “iterative” method is used. For target sizes over 1,000,000, the law of large numbers prevails and the “filter” method achieves sufficiently close sparsities at speeds much faster than “iterative”.
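The size-based selection can be pictured with a small dispatcher. The function name pick_engine is hypothetical, and the treatment of a size of exactly 1,000,000 (sent to “filter” here) is an assumption, since the text above only covers sizes below and above that figure.

```python
import math

def pick_engine(shape, threshold=1_000_000):
    """Hypothetical sketch: choose an engine name from the target size."""
    # Accept either a single int or a sequence of ints, like the shape parameter.
    size = shape if isinstance(shape, int) else math.prod(shape)
    # Small targets favor accuracy; large targets favor speed.
    return "iterative" if size < threshold else "filter"

print(pick_engine((3, 3)))
print(pick_engine((2000, 1000)))
```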

Parameters:
minimum : numbers.Real

Lowest (signed) value to be drawn from the distribution.

maximum : numbers.Real

Upper boundary of the output interval. All values generated will be less than this number.

shape : int | Sequence[int]

Dimensions of the returned array.

sparsity : numbers.Real, default = 0

Desired percentage of zeros in the returned array.

engine : EngineType, default = “default”

Selects the desired engine for generating the returned array. See the ‘Engine’ section of the docs for a detailed explanation.

dtype : object, default = float

Desired dtype of the result.

Attributes:
sparse_array_ : numpy.ndarray[numbers.Real]

ndarray of shape ‘shape’ with desired dtype and sparsity.

See also

numpy.random.randint
numpy.random.uniform

Notes

Type Aliases

EngineType:

Literal["choice", "filter", "serialized", "iterative", "default"]

Examples

>>> import numpy as np
>>> from pybear.new_numpy import random as pb_random
>>> instance = pb_random.Sparse(0, 10, (3,3), 50, dtype=np.int8)
>>> sparse_array = instance.sparse_array_
>>> print(sparse_array)
[[0 6 0]
 [8 8 0]
 [0 0 1]]