Mapper/GIC/Nerve complexes reference manual
MapperComplex reference manual
- class gudhi.cover_complex.MapperComplex[source]#
This is a class for computing Mapper simplicial complexes on point clouds or distance matrices.
- __init__(*, input_type='point cloud', colors=None, min_points_per_node=0, filter_bnds=None, resolutions=None, gains=None, clustering=DBSCAN(), N=100, beta=0.0, C=10.0, verbose=False)[source]#
Constructor for the MapperComplex class.
- Parameters:
input_type¶ (string) – type of input data. Either “point cloud” or “distance matrix”.
min_points_per_node¶ (int) – threshold on the size of the cover complex nodes (default 0). Any node associated with a subpopulation of fewer than min_points_per_node points is removed.
filter_bnds¶ (list of lists or numpy array of shape (num_filters) x 2) – limits of each filter, of the form [[f_1^min, f_1^max], …, [f_n^min, f_n^max]]. If one of the values is numpy.nan, it can be computed from the dataset with the fit() method.
resolutions¶ (list or numpy array of shape num_filters containing integers) – resolution of each filter function, i.e., the number of intervals required to cover each filter image. If None, it is estimated from data.
gains¶ (list or numpy array of shape num_filters containing doubles in [0,1]) – gain of each filter function, i.e., the overlap percentage of the intervals covering each filter image. If None, it is set to 1/3 for all filters: in the automatic parameter selection method of http://www.jmlr.org/papers/volume19/17-291/17-291.pdf, any value between 1/3 and 1/2 works, so the minimal one is used (which ensures that the complex is a graph when only one filter is given).
clustering¶ (class) – clustering class (default sklearn.cluster.DBSCAN()). Common clustering classes can be found in the scikit-learn library (AgglomerativeClustering, for instance). If None, it is set to hierarchical clustering, with scale estimated from data.
N¶ (int) – subsampling iterations (default 100) for estimating scale and resolutions. Used only if clustering or resolutions is None. See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
beta¶ (float) – exponent parameter (default 0.) for estimating scale and resolutions. Used only if clustering or resolutions is None. See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
C¶ (float) – constant parameter (default 10.) for estimating scale and resolutions. Used only if clustering or resolutions is None. See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
verbose¶ (bool) – whether to display info while computing.
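To make the roles of resolutions and gains concrete, here is a minimal sketch of one way to cover a filter range with overlapping intervals. The function name interval_cover and the exact convention (equal-length intervals, consecutive ones sharing a fraction gain of their length) are assumptions chosen for illustration; GUDHI's internal construction may differ.

```python
def interval_cover(f_min, f_max, resolution, gain):
    """Return `resolution` equal-length intervals covering [f_min, f_max].

    Consecutive intervals share a fraction `gain` of their length.
    Assumed convention for illustration only, not GUDHI's internals.
    """
    if resolution < 1 or not 0 <= gain < 1:
        raise ValueError("need resolution >= 1 and 0 <= gain < 1")
    # r intervals of length L overlapping by gain * L span
    # r * L - (r - 1) * gain * L, which must equal f_max - f_min.
    length = (f_max - f_min) / (resolution - (resolution - 1) * gain)
    step = length * (1 - gain)
    return [(f_min + i * step, f_min + i * step + length)
            for i in range(resolution)]


# Three intervals on [0, 1] with the default gain of 1/3.
intervals = interval_cover(0.0, 1.0, resolution=3, gain=1 / 3)
```

With a gain below 1/2, intervals that are not consecutive stay disjoint under this convention, which is why a single filter then yields a graph rather than a higher-dimensional complex.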
- estimate_scale(X, N=100, beta=0.0, C=10.0)[source]#
Compute estimated scale of a point cloud or a distance matrix.
- Parameters:
X¶ (numpy array of shape (num_points) x (num_coordinates) if point cloud and (num_points) x (num_points) if distance matrix) – input point cloud or distance matrix.
N¶ (int) – subsampling iterations (default 100). See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
beta¶ (float) – exponent parameter (default 0.). See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
C¶ (float) – constant parameter (default 10.). See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
- Returns:
delta – estimated scale that can be used with, e.g., agglomerative clustering.
- Return type:
float
- fit(X, y=None, filters=None, colors=None)[source]#
Fit the MapperComplex class on a point cloud or a distance matrix: compute the Mapper complex and store it in a simplex tree called simplex_tree_.
- Parameters:
X¶ (numpy array of shape (num_points) x (num_coordinates) if point cloud and (num_points) x (num_points) if distance matrix) – input point cloud or distance matrix.
y¶ (n x 1 array) – point labels (unused).
filters¶ (list of lists or numpy array of shape (num_points) x (num_filters)) – filter functions (sometimes called lenses) used to compute the cover. Each column of the numpy array defines a scalar function defined on the input points.
colors¶ (list of lists or numpy array of shape (num_points) x (num_colors)) – functions used to color the nodes of the cover complex. More specifically, each node is colored with the means of these functions over the subpopulation corresponding to that node. If None, the first coordinate is used if the input is a point cloud, and eccentricity is used if the input is a distance matrix.
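The node coloring described for colors amounts to averaging a function over each node's subpopulation. The sketch below shows this for a single color function; node_colors, node_members and the toy data are hypothetical names and values, not part of the GUDHI API.

```python
from statistics import mean


def node_colors(node_members, color_values):
    """Map each node to the mean of `color_values` over its member points."""
    return {node: mean(color_values[i] for i in members)
            for node, members in node_members.items()}


# Hypothetical toy data: 5 points, 2 nodes that share point 2.
color_values = [0.0, 1.0, 2.0, 3.0, 4.0]
node_members = {0: [0, 1, 2], 1: [2, 3, 4]}
colors = node_colors(node_members, color_values)
```

With several color functions, the same averaging would be applied to each column independently.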
- get_metadata_routing()#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_networkx(set_attributes_from_colors=False)#
Turn the 1-skeleton of the cover complex computed after calling fit() method into a networkx graph. This function requires networkx (https://networkx.org/documentation/stable/install.html).
- Parameters:
set_attributes_from_colors¶ (bool) – if True, the color functions will be used as attributes for the networkx graph.
- Returns:
G – graph representing the 1-skeleton of the cover complex.
- Return type:
networkx graph
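As an illustration of what "1-skeleton" means here, the following sketch extracts the vertices and edges from a list of simplices in plain Python. The helper one_skeleton and the input format (a list of vertex lists that already contains every face) are assumptions for the example; with networkx installed, the two results could be passed to Graph.add_nodes_from and Graph.add_edges_from.

```python
def one_skeleton(simplices):
    """Return (vertices, edges) of the 1-skeleton of a list of simplices.

    Assumes the list already contains all faces, so only simplices with
    one or two vertices need to be inspected.
    """
    vertices, edges = set(), set()
    for simplex in simplices:
        if len(simplex) == 1:
            vertices.add(simplex[0])
        elif len(simplex) == 2:
            edges.add(tuple(sorted(simplex)))
    return sorted(vertices), sorted(edges)


# A filled triangle: the 2-simplex [0, 1, 2] is dropped from the skeleton.
simplices = [[0], [1], [2], [0, 1], [1, 2], [0, 2], [0, 1, 2]]
vertices, edges = one_skeleton(simplices)
```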
- get_optimal_parameters_for_agglomerative_clustering(X, beta=0.0, C=10.0, N=100)[source]#
Compute optimal scale and resolutions for a point cloud or a distance matrix.
- Parameters:
X¶ (numpy array of shape (num_points) x (num_coordinates) if point cloud and (num_points) x (num_points) if distance matrix) – input point cloud or distance matrix.
beta¶ (float) – exponent parameter (default 0.). See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
C¶ (float) – constant parameter (default 10.). See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
N¶ (int) – subsampling iterations (default 100). See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
- Returns:
delta (float) – optimal scale that can be used with agglomerative clustering.
resolutions (numpy array of shape (num_filters)) – optimal resolutions associated to each filter.
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep¶ (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- save_to_dot(file_name='cover_complex', color_name='color', eps_color=0.1, eps_size=0.1)#
Write the 0-skeleton of the cover complex in a DOT file called “{file_name}.dot”, which can be processed with, e.g., neato. The vertices of the cover complex are colored with the first color function, i.e., the first column of self.colors. This function also produces an extra PDF file “colorbar_{color_name}.pdf” containing a colorbar corresponding to the node colors in the DOT file.
- Parameters:
file_name¶ (string) – name for the output .dot file, default “cover_complex”
color_name¶ (string) – name for the output .pdf showing the colorbar of the color used for the Mapper nodes, default “color”
eps_color¶ (float) – scale the node colors between [eps_color, 1-eps_color]. Should be between 0 and 1/2. When close to 0, the color varies a lot across the nodes; when close to 1/2, the color tends to be more uniform.
eps_size¶ (float) – scale the node sizes between [eps_size, 1-eps_size]. Should be between 0 and 1/2. When close to 0, the size varies a lot across the nodes; when close to 1/2, the nodes tend to have the same size.
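The effect of eps_color and eps_size can be pictured as an affine rescaling of the node values onto [eps, 1-eps]. The sketch below assumes a simple min-max rescaling; the function rescale is hypothetical and only illustrates why values near 1/2 make nodes look alike.

```python
def rescale(values, eps):
    """Affinely map `values` onto [eps, 1 - eps].

    Assumed min-max convention for illustration; constant input maps to 0.5.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [eps + (1 - 2 * eps) * (v - lo) / (hi - lo) for v in values]


# With eps = 0.1 the values spread over [0.1, 0.9].
scaled = rescale([0.0, 5.0, 10.0], eps=0.1)
# With eps = 0.5 the target interval collapses and every value becomes 0.5.
flat = rescale([0.0, 5.0, 10.0], eps=0.5)
```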
- save_to_html(file_name='cover_complex', data_name='data', cover_name='cover', color_name='color')#
Write the cover complex to an HTML file called “{file_name}.html”, which can be visualized in a browser. This function is based on a fork of MLWave/kepler-mapper.
- Parameters:
file_name¶ (string) – name for the output .html file, default “cover_complex”
data_name¶ (string) – name to use for the data on which the cover complex was computed, default “data”.
cover_name¶ (string) – name to use for the cover used to compute the cover complex, default “cover”.
color_name¶ (string) – name to use for the color used to color the cover complex nodes, default “color”.
- save_to_txt(file_name='cover_complex', data_name='data', cover_name='cover', color_name='color')#
Write the cover complex to a TXT file called “{file_name}.txt”, that can be processed with the KeplerMapper Python script “KeplerMapperVisuFromTxtFile.py” available under “src/Nerve_GIC/utilities/”.
- Parameters:
file_name¶ (string) – name for the output .txt file, default “cover_complex”
data_name¶ (string) – name to use for the data on which the cover complex was computed, default “data”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
cover_name¶ (string) – name to use for the cover used to compute the cover complex, default “cover”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
color_name¶ (string) – name to use for the color used to color the cover complex nodes, default “color”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
- set_fit_request(*, colors: bool | None | str = '$UNCHANGED$', filters: bool | None | str = '$UNCHANGED$') MapperComplex #
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
colors¶ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for colors parameter in fit.
filters¶ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for filters parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params¶ (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
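The <component>__<parameter> naming can be illustrated with a small sketch that splits such a name at the first double underscore. The helper split_param is hypothetical; a name such as clustering__eps would presumably address the eps parameter of the clustering object passed to the constructor.

```python
def split_param(name):
    """Split 'component__parameter' into (component, parameter).

    A name without '__' addresses the estimator itself: (None, name).
    Only the first '__' is split; deeper nesting stays in the remainder.
    """
    component, sep, parameter = name.partition("__")
    return (component, parameter) if sep else (None, name)


nested = split_param("clustering__eps")
plain = split_param("verbose")
```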
GraphInducedComplex reference manual
- class gudhi.cover_complex.GraphInducedComplex[source]#
This is a class for computing graph induced simplicial complexes on point clouds or distance matrices.
- __init__(*, input_type='point cloud', cover='functional', min_points_per_node=0, voronoi_samples=100, assignments=None, filter_bnds=None, resolution=None, gain=None, N=100, beta=0.0, C=10.0, graph='rips', rips_threshold=None, verbose=False)[source]#
Constructor for the GraphInducedComplex class.
- Parameters:
input_type¶ (string) – type of input data. Either “point cloud” or “distance matrix”.
cover¶ (string) – specifies the cover. Either “functional” (preimages of filter function), “voronoi” or “precomputed”.
min_points_per_node¶ (int) – threshold on the size of the cover complex nodes (default 0). Any node associated with a subpopulation of fewer than min_points_per_node points is removed.
voronoi_samples¶ (int) – number of Voronoi germs used for partitioning the input dataset. Used only if cover = “voronoi”.
assignments¶ (list of length (num_points) of lists of integers) – cover assignment for each point. Used only if cover = “precomputed”.
filter_bnds¶ (list or numpy array of shape 2) – limits of the filter function, of the form [f^min, f^max]. If one of the values is numpy.nan, it can be computed from the dataset with the fit() method. Used only if cover = “functional”.
resolution¶ (int) – resolution of the filter function, i.e., the number of intervals required to cover the filter image. Used only if cover = “functional”. If None, it is estimated from data.
gain¶ (double in [0,1]) – gain of the filter function, i.e., the overlap percentage of the intervals covering the filter image. Used only if cover = “functional”.
N¶ (int) – subsampling iterations (default 100) for estimating scale and resolutions. Used only if cover = “functional”. See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
beta¶ (double) – exponent parameter (default 0.) for estimating scale and resolutions. Used only if cover = “functional”. See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
C¶ (double) – constant parameter (default 10.) for estimating scale and resolutions. Used only if cover = “functional”. See http://www.jmlr.org/papers/volume19/17-291/17-291.pdf for details.
graph¶ (string) – type of graph to use for GIC. Currently accepts “rips” only.
rips_threshold¶ (float) – Rips parameter. Used only if graph = “rips”.
verbose¶ (bool) – whether to display info while computing.
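To show how graph, rips_threshold and a cover interact, here is a minimal sketch of the 1-skeleton of a graph induced complex for a cover that partitions the points (as a Voronoi cover does). The helpers rips_edges and induced_edges and the toy data are assumptions for illustration, not GUDHI's implementation: two cover cells are joined whenever some edge of the Rips graph connects them.

```python
def rips_edges(dist, threshold):
    """Edges (i, j), i < j, of the Rips graph: pairs at distance <= threshold."""
    n = len(dist)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if dist[i][j] <= threshold]


def induced_edges(graph_edges, cell_of):
    """Edges between cover cells joined by at least one graph edge."""
    edges = set()
    for i, j in graph_edges:
        a, b = cell_of[i], cell_of[j]
        if a != b:
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


# Hypothetical data: 4 points on a line at 0, 1, 2, 3; cells {0, 1} and {2, 3}.
positions = [0.0, 1.0, 2.0, 3.0]
dist = [[abs(p - q) for q in positions] for p in positions]
cell_of = [0, 0, 1, 1]
gic_edges = induced_edges(rips_edges(dist, threshold=1.0), cell_of)
```

Lowering the threshold below 1 removes every Rips edge in this example, so the two cells become disconnected.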
- fit(X, y=None, filter=None, color=None)[source]#
Fit the GraphInducedComplex class on a point cloud or a distance matrix: compute the graph induced complex and store it in a simplex tree called simplex_tree_.
- Parameters:
X¶ (numpy array of shape (num_points) x (num_coordinates) if point cloud and (num_points) x (num_points) if distance matrix) – input point cloud or distance matrix.
y¶ (n x 1 array) – point labels (unused).
filter¶ (list or numpy array of shape (num_points)) – filter function (sometimes called lens) used to compute the cover. Used only if cover = “functional”.
color¶ (list or numpy array of shape (num_points)) – function used to color the nodes of the cover complex. More specifically, each node is colored with the mean of this function over the subpopulation corresponding to that node. If None, the first coordinate is used if the input is a point cloud, and eccentricity is used if the input is a distance matrix.
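A "functional" cover assigns each point to the intervals that contain its filter value. The sketch below computes such assignments in the list-of-lists format described for the assignments parameter; functional_assignments and the hard-coded intervals are hypothetical and only illustrate the preimage idea.

```python
def functional_assignments(filter_values, intervals):
    """For each point, list the indices of the intervals containing its value."""
    return [[k for k, (lo, hi) in enumerate(intervals) if lo <= value <= hi]
            for value in filter_values]


# Three overlapping intervals on [0, 1] and five filter values.
intervals = [(0.0, 0.4), (0.3, 0.7), (0.6, 1.0)]
filter_values = [0.1, 0.35, 0.5, 0.65, 0.9]
assignments = functional_assignments(filter_values, intervals)
```

Points whose value falls in an overlap belong to two cover elements, which is what creates edges in the resulting complex.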
- get_metadata_routing()#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_networkx(set_attributes_from_colors=False)#
Turn the 1-skeleton of the cover complex computed after calling fit() method into a networkx graph. This function requires networkx (https://networkx.org/documentation/stable/install.html).
- Parameters:
set_attributes_from_colors¶ (bool) – if True, the color functions will be used as attributes for the networkx graph.
- Returns:
G – graph representing the 1-skeleton of the cover complex.
- Return type:
networkx graph
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep¶ (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- save_to_dot(file_name='cover_complex', color_name='color', eps_color=0.1, eps_size=0.1)#
Write the 0-skeleton of the cover complex in a DOT file called “{file_name}.dot”, which can be processed with, e.g., neato. The vertices of the cover complex are colored with the first color function, i.e., the first column of self.colors. This function also produces an extra PDF file “colorbar_{color_name}.pdf” containing a colorbar corresponding to the node colors in the DOT file.
- Parameters:
file_name¶ (string) – name for the output .dot file, default “cover_complex”
color_name¶ (string) – name for the output .pdf showing the colorbar of the color used for the Mapper nodes, default “color”
eps_color¶ (float) – scale the node colors between [eps_color, 1-eps_color]. Should be between 0 and 1/2. When close to 0, the color varies a lot across the nodes; when close to 1/2, the color tends to be more uniform.
eps_size¶ (float) – scale the node sizes between [eps_size, 1-eps_size]. Should be between 0 and 1/2. When close to 0, the size varies a lot across the nodes; when close to 1/2, the nodes tend to have the same size.
- save_to_html(file_name='cover_complex', data_name='data', cover_name='cover', color_name='color')#
Write the cover complex to an HTML file called “{file_name}.html”, which can be visualized in a browser. This function is based on a fork of MLWave/kepler-mapper.
- Parameters:
file_name¶ (string) – name for the output .html file, default “cover_complex”
data_name¶ (string) – name to use for the data on which the cover complex was computed, default “data”.
cover_name¶ (string) – name to use for the cover used to compute the cover complex, default “cover”.
color_name¶ (string) – name to use for the color used to color the cover complex nodes, default “color”.
- save_to_txt(file_name='cover_complex', data_name='data', cover_name='cover', color_name='color')#
Write the cover complex to a TXT file called “{file_name}.txt”, that can be processed with the KeplerMapper Python script “KeplerMapperVisuFromTxtFile.py” available under “src/Nerve_GIC/utilities/”.
- Parameters:
file_name¶ (string) – name for the output .txt file, default “cover_complex”
data_name¶ (string) – name to use for the data on which the cover complex was computed, default “data”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
cover_name¶ (string) – name to use for the cover used to compute the cover complex, default “cover”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
color_name¶ (string) – name to use for the color used to color the cover complex nodes, default “color”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
- set_fit_request(*, color: bool | None | str = '$UNCHANGED$', filter: bool | None | str = '$UNCHANGED$') GraphInducedComplex #
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
color¶ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for color parameter in fit.
filter¶ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for filter parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params¶ (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance
NerveComplex reference manual
- class gudhi.cover_complex.NerveComplex[source]#
This is a class for computing nerve simplicial complexes on point clouds or distance matrices.
- __init__(*, input_type='point cloud', min_points_per_node=0, verbose=False)[source]#
Constructor for the NerveComplex class.
- Parameters:
input_type¶ (string) – type of input data. Either “point cloud” or “distance matrix”.
min_points_per_node¶ (int) – threshold on the size of the cover complex nodes (default 0). Any node associated with a subpopulation of fewer than min_points_per_node points is removed.
verbose¶ (bool) – whether to display info while computing.
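The min_points_per_node threshold can be sketched as a filter on the cover elements: group the points by cover element, then keep only the elements that are large enough. The helper filter_small_nodes and the toy assignments are hypothetical and not taken from GUDHI.

```python
from collections import defaultdict


def filter_small_nodes(assignments, min_points_per_node):
    """Drop cover elements containing fewer than `min_points_per_node` points."""
    members = defaultdict(list)
    for point, cells in enumerate(assignments):
        for cell in cells:
            members[cell].append(point)
    return {cell: pts for cell, pts in members.items()
            if len(pts) >= min_points_per_node}


# Cover element 2 contains only point 3, so a threshold of 2 removes it.
assignments = [[0], [0, 1], [1], [1, 2]]
kept = filter_small_nodes(assignments, min_points_per_node=2)
```

With the default threshold of 0, every cover element is kept.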
- fit(X, y=None, assignments=None, color=None)[source]#
Fit the NerveComplex class on a point cloud or a distance matrix: compute the nerve complex and store it in a simplex tree called simplex_tree_.
- Parameters:
X¶ (numpy array of shape (num_points) x (num_coordinates) if point cloud and (num_points) x (num_points) if distance matrix) – input point cloud or distance matrix.
y¶ (n x 1 array) – point labels (unused).
assignments¶ (list of length (num_points) of lists of integers) – cover assignment for each point.
color¶ (numpy array of shape (num_points) x (num_colors)) – functions used to color the nodes of the cover complex. More specifically, each node is colored with the means of these functions over the subpopulation corresponding to that node. If None, the first coordinate is used if the input is a point cloud, and eccentricity is used if the input is a distance matrix.
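The nerve of a cover contains one simplex for every set of cover elements that share at least one point. The sketch below builds it directly from assignments given in the list-of-lists format above; the function nerve and its max_dim cap are assumptions for illustration, not GUDHI's implementation.

```python
from itertools import combinations


def nerve(assignments, max_dim=2):
    """Simplices of the nerve: sets of cover elements sharing a point.

    Each point contributes the simplex spanned by its cover elements,
    together with all of its faces, up to dimension `max_dim`.
    """
    simplices = set()
    for cells in assignments:
        cells = sorted(set(cells))
        for k in range(1, min(len(cells), max_dim + 1) + 1):
            simplices.update(combinations(cells, k))
    return sorted(simplices, key=lambda s: (len(s), s))


# Points 1 and 2 lie in two cover elements each, creating two edges.
assignments = [[0], [0, 1], [1, 2], [2]]
simplices = nerve(assignments)
```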
- get_metadata_routing()#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
routing – A MetadataRequest encapsulating routing information.
- Return type:
MetadataRequest
- get_networkx(set_attributes_from_colors=False)#
Turn the 1-skeleton of the cover complex computed after calling fit() method into a networkx graph. This function requires networkx (https://networkx.org/documentation/stable/install.html).
- Parameters:
set_attributes_from_colors¶ (bool) – if True, the color functions will be used as attributes for the networkx graph.
- Returns:
G – graph representing the 1-skeleton of the cover complex.
- Return type:
networkx graph
- get_params(deep=True)#
Get parameters for this estimator.
- Parameters:
deep¶ (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
params – Parameter names mapped to their values.
- Return type:
dict
- save_to_dot(file_name='cover_complex', color_name='color', eps_color=0.1, eps_size=0.1)#
Write the 0-skeleton of the cover complex in a DOT file called “{file_name}.dot”, which can be processed with, e.g., neato. The vertices of the cover complex are colored with the first color function, i.e., the first column of self.colors. This function also produces an extra PDF file “colorbar_{color_name}.pdf” containing a colorbar corresponding to the node colors in the DOT file.
- Parameters:
file_name¶ (string) – name for the output .dot file, default “cover_complex”
color_name¶ (string) – name for the output .pdf showing the colorbar of the color used for the Mapper nodes, default “color”
eps_color¶ (float) – scale the node colors between [eps_color, 1-eps_color]. Should be between 0 and 1/2. When close to 0, the color varies a lot across the nodes; when close to 1/2, the color tends to be more uniform.
eps_size¶ (float) – scale the node sizes between [eps_size, 1-eps_size]. Should be between 0 and 1/2. When close to 0, the size varies a lot across the nodes; when close to 1/2, the nodes tend to have the same size.
- save_to_html(file_name='cover_complex', data_name='data', cover_name='cover', color_name='color')#
Write the cover complex to an HTML file called “{file_name}.html”, which can be visualized in a browser. This function is based on a fork of MLWave/kepler-mapper.
- Parameters:
file_name¶ (string) – name for the output .html file, default “cover_complex”
data_name¶ (string) – name to use for the data on which the cover complex was computed, default “data”.
cover_name¶ (string) – name to use for the cover used to compute the cover complex, default “cover”.
color_name¶ (string) – name to use for the color used to color the cover complex nodes, default “color”.
- save_to_txt(file_name='cover_complex', data_name='data', cover_name='cover', color_name='color')#
Write the cover complex to a TXT file called “{file_name}.txt”, that can be processed with the KeplerMapper Python script “KeplerMapperVisuFromTxtFile.py” available under “src/Nerve_GIC/utilities/”.
- Parameters:
file_name¶ (string) – name for the output .txt file, default “cover_complex”
data_name¶ (string) – name to use for the data on which the cover complex was computed, default “data”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
cover_name¶ (string) – name to use for the cover used to compute the cover complex, default “cover”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
color_name¶ (string) – name to use for the color used to color the cover complex nodes, default “color”. It will be used when generating an HTML visualization with KeplerMapperVisuFromTxtFile.py.
- set_fit_request(*, assignments: bool | None | str = '$UNCHANGED$', color: bool | None | str = '$UNCHANGED$') NerveComplex #
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
assignments¶ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for assignments parameter in fit.
color¶ (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for color parameter in fit.
- Returns:
self – The updated object.
- Return type:
object
- set_params(**params)#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
**params¶ (dict) – Estimator parameters.
- Returns:
self – Estimator instance.
- Return type:
estimator instance