Datasets generators manual

This module generates customizable datasets for use as inputs to Gudhi complexes and data structures.

Points generators

The points module generates random points on a sphere, and points on a torus either at random or on a grid.

Points on sphere

The function sphere generates random i.i.d. points uniformly on a (d-1)-sphere in \(R^d\). The user must provide the number of points to generate, n_samples, and the ambient dimension, ambient_dim. The radius of the sphere is optional and defaults to 1. Only random point generation is currently available.

The generated points are given as an array of shape \((n\_samples, ambient\_dim)\).

Example

from gudhi.datasets.generators import points
from gudhi import AlphaComplex

# Generate 50 points on a sphere in R^2
gen_points = points.sphere(n_samples = 50, ambient_dim = 2, radius = 1, sample = "random")

# Create an alpha complex from the generated points
alpha_complex = AlphaComplex(points = gen_points)
gudhi.datasets.generators.points.sphere(n_samples: int, ambient_dim: int, radius: float = 1.0, sample: str = 'random') -> numpy.ndarray[numpy.float64]

Generate random i.i.d. points uniformly on a (d-1)-sphere in R^d

Parameters
  • n_samples (integer) – The number of points to be generated.

  • ambient_dim (integer) – The ambient dimension d.

  • radius (float) – The radius. Default value is 1.

  • sample (string) – The sample type. Default and only available value is “random”.

Returns

The generated points on a sphere, as an array of shape (n_samples, ambient_dim).
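For intuition about what uniform sampling on a (d-1)-sphere involves, here is a minimal dependency-free sketch. It uses the standard technique of normalizing Gaussian vectors, which is not necessarily how Gudhi implements sphere. The name sample_sphere and its seed parameter are hypothetical and are not part of the Gudhi API.

```python
import math
import random


def sample_sphere(n_samples, ambient_dim, radius=1.0, seed=None):
    """Illustrative sketch (not Gudhi's code): uniform points on a sphere.

    A vector of independent standard normal coordinates has a rotation
    invariant distribution, so rescaling it to a fixed length gives a point
    distributed uniformly on the sphere of that radius.
    """
    rng = random.Random(seed)
    points = []
    while len(points) < n_samples:
        vec = [rng.gauss(0.0, 1.0) for _ in range(ambient_dim)]
        norm = math.sqrt(sum(x * x for x in vec))
        if norm == 0.0:
            # Practically never happens; draw again to avoid dividing by zero.
            continue
        points.append([radius * x / norm for x in vec])
    return points


# 50 points on a circle of radius 2 in R^2, as a list of 50 rows of length 2.
circle = sample_sphere(n_samples=50, ambient_dim=2, radius=2.0, seed=0)
```

The result is a plain list of rows with the same layout as the (n_samples, ambient_dim) array described above; numpy.asarray converts it to an array if needed.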

Points on a flat torus

You can also generate points on a flat torus.

Two functions are available and give the same output: the first depends on CGAL, while the second does not and is written in pure Python.

In addition, two sample types are provided: points on a d-torus in \(R^{2d}\) can be generated either randomly (i.i.d.) or on a grid.
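To make the two sample types concrete, the sketch below builds points on a flat d-torus as a product of d circles, each contributing a (cos, sin) pair of coordinates. It is an illustration only, under the assumptions that each circle has radius 1 and that random angles are drawn uniformly. The helper names (flat_torus_point, random_torus_points, grid_torus_points) and the side and seed parameters are hypothetical and are not Gudhi functions.

```python
import itertools
import math
import random


def flat_torus_point(angles):
    """Map dim angles to one point of the flat dim-torus in R^(2*dim)."""
    point = []
    for theta in angles:
        # Each angle selects a point on one unit circle (assumed radius 1).
        point.extend((math.cos(theta), math.sin(theta)))
    return point


def random_torus_points(n_samples, dim, seed=None):
    """'random' sample type: independent uniform angles for every point."""
    rng = random.Random(seed)
    return [
        flat_torus_point([rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim)])
        for _ in range(n_samples)
    ]


def grid_torus_points(side, dim):
    """'grid' sample type: side evenly spaced angles per circle."""
    angles = [2.0 * math.pi * k / side for k in range(side)]
    # All combinations of angles, hence side**dim points in total.
    return [
        flat_torus_point(combo)
        for combo in itertools.product(angles, repeat=dim)
    ]


random_pts = random_torus_points(n_samples=50, dim=3, seed=0)  # 50 rows of 6
grid_pts = grid_torus_points(side=3, dim=3)                    # 27 rows of 6
```

The grid variant shows why the number of grid points is always a perfect dim-th power: every circle receives the same number of angles, and the grid is their Cartesian product.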

First function: ctorus

The user must provide the number of points to generate, n_samples, and the dimension dim of the torus; the points lie in \(R^{2dim}\). The sample argument is optional and defaults to ‘random’, in which case the returned points form an array of shape \((n\_samples, 2*dim)\). If it is set to ‘grid’, the points are generated on a grid and returned as an array of shape:

\[\left( \left\lfloor n\_samples^{\frac{1}{dim}} \right\rfloor^{dim},\ 2*dim \right)\]

Note 1: The first dimension of the output array is n_samples rounded down to the nearest perfect \(dim^{th}\) power.
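The rounding in Note 1 can be checked with a short calculation. The helper below is a hypothetical illustration (grid_point_count is not a Gudhi function) that computes the number of rows to expect for the ‘grid’ sample type. It corrects the floating-point root with exact integer arithmetic, because a naive floor of n_samples ** (1 / dim) can come out one too small when n_samples is itself a perfect power.

```python
def grid_point_count(n_samples: int, dim: int) -> int:
    """Largest perfect dim-th power that does not exceed n_samples."""
    if n_samples < 1 or dim < 1:
        raise ValueError("n_samples and dim must be positive integers")
    # Start from the rounded floating-point root, then fix it with integer
    # comparisons: the float root of a perfect power such as 64 ** (1 / 3)
    # may land slightly below the exact value 4.
    side = int(round(n_samples ** (1.0 / dim)))
    while side ** dim > n_samples:
        side -= 1
    while (side + 1) ** dim <= n_samples:
        side += 1
    return side ** dim


print(grid_point_count(50, 3))   # 27, since 3^3 = 27 <= 50 < 4^3 = 64
print(grid_point_count(150, 2))  # 144, since 12^2 = 144 <= 150 < 13^2 = 169
```

With n_samples = 50 and dim = 3, the grid therefore has 3 points per circle and 27 points in total.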

Note 2: This version is recommended for the ‘grid’ sample type, or for ‘random’ with a relatively small number of samples (roughly fewer than 150).

Example

from gudhi.datasets.generators import points

# Generate 50 points randomly on a torus in R^6
gen_points = points.ctorus(n_samples = 50, dim = 3)

# Generate points on a torus as a grid in R^6: n_samples = 50 is rounded down to 3^3 = 27 points
gen_points = points.ctorus(n_samples = 50, dim = 3, sample = 'grid')
gudhi.datasets.generators.points.ctorus(n_samples: int, dim: int, sample: str = 'random') -> numpy.ndarray[numpy.float64]

Generate points on a d-torus in R^(2d), either i.i.d. at random or on a grid

Parameters
  • n_samples (integer) – The number of points to be generated.

  • dim (integer) – The dimension of the torus on which points are generated, in R^(2*dim).

  • sample (string) – The sample type. Available values are: “random” and “grid”. Default value is “random”.

Returns

the generated points on a torus.

The shape of returned numpy array is:

If sample is ‘random’: (n_samples, 2*dim).

If sample is ‘grid’: (⌊n_samples**(1./dim)⌋**dim, 2*dim), where shape[0] is rounded down to the closest perfect ‘dim’th power.

Second function: torus

The user must provide the number of points to generate, n_samples, and the dimension dim of the torus; the points lie in \(R^{2dim}\). The sample argument is optional and defaults to ‘random’. The other allowed value is ‘grid’.

Note: This version is recommended for the ‘random’ sample type with a large number of samples and a low dimension.

Example

from gudhi.datasets.generators import points

# Generate 50 points randomly on a torus in R^6
gen_points = points.torus(n_samples = 50, dim = 3)

# Generate points on a torus as a grid in R^6: n_samples = 50 is rounded down to 3^3 = 27 points
gen_points = points.torus(n_samples = 50, dim = 3, sample = 'grid')
gudhi.datasets.generators.points.torus(n_samples, dim, sample='random')

Generate points on a flat dim-torus in R^(2*dim), either randomly or on a grid

Parameters
  • n_samples – The number of points to be generated.

  • dim – The dimension of the torus on which points are generated, in R^(2*dim).

  • sample – The sample type of the generated points. Can be ‘random’ or ‘grid’.

Returns

numpy array containing the generated points on a torus.

The shape of returned numpy array is:

If sample is ‘random’: (n_samples, 2*dim).

If sample is ‘grid’: (⌊n_samples**(1./dim)⌋**dim, 2*dim), where shape[0] is rounded down to the closest perfect ‘dim’th power.