Subsampling

Enumerations

enum  : std::size_t { Gudhi::subsampling::random_starting_point = std::size_t(-1) }
 

Functions

template<typename Distance , typename Point_range , typename PointOutputIterator , typename DistanceOutputIterator = Null_output_iterator>
void Gudhi::subsampling::choose_n_farthest_points (Distance dist, Point_range const &input_pts, std::size_t final_size, std::size_t starting_point, PointOutputIterator output_it, DistanceOutputIterator dist_it={})
 Subsample by a greedy strategy of iteratively adding the farthest point from the current chosen point set to the subsampling. More...
 
template<typename Point_container , typename OutputIterator >
void Gudhi::subsampling::pick_n_random_points (Point_container const &points, std::size_t final_size, OutputIterator output_it)
 Subsample a point set by picking random vertices. More...
 
template<typename Kernel , typename Point_range , typename OutputIterator >
void Gudhi::subsampling::sparsify_point_set (const Kernel &k, Point_range const &input_pts, typename Kernel::FT min_squared_dist, OutputIterator output_it)
 Outputs a subset of the input points so that the squared distance between any two points is greater than min_squared_dist. More...
 

Detailed Description

Author
Clément Jamin, Siargey Kachanovich

Introduction

This Gudhi component offers methods to subsample a set of points.

Example: sparsify_point_set

This example outputs a subset of the input points so that the squared distance between any two points is greater than or equal to 0.4.

#include <gudhi/sparsify_point_set.h>
#include <CGAL/Epick_d.h>
#include <CGAL/Random.h>
#include <iostream>
#include <vector>
#include <iterator>
int main(void) {
typedef CGAL::Epick_d<CGAL::Dimension_tag<4> > K;
typedef typename K::Point_d Point_d;
CGAL::Random rd;
std::vector<Point_d> points;
for (int i = 0; i < 500; ++i)
points.push_back(Point_d(rd.get_double(-1., 1), rd.get_double(-1., 1),
rd.get_double(-1., 1), rd.get_double(-1., 1)));
K k;
std::vector<Point_d> results;
Gudhi::subsampling::sparsify_point_set(k, points, 0.4, std::back_inserter(results));
std::clog << "Before sparsification: " << points.size() << " points.\n";
std::clog << "After sparsification: " << results.size() << " points.\n";
return 0;
}

Example: choose_n_farthest_points

This example outputs a subset of 100 points obtained by González algorithm, starting with a random point.

#include <gudhi/choose_n_farthest_points.h>
#include <CGAL/Epick_d.h>
#include <CGAL/Random.h>
#include <iostream>
#include <vector>
#include <iterator>
int main(void) {
typedef CGAL::Epick_d<CGAL::Dimension_tag<4> > K;
typedef typename K::Point_d Point_d;
CGAL::Random rd;
std::vector<Point_d> points;
for (int i = 0; i < 500; ++i)
points.push_back(Point_d(rd.get_double(-1., 1), rd.get_double(-1., 1),
rd.get_double(-1., 1), rd.get_double(-1., 1)));
K k;
std::vector<Point_d> results;
Gudhi::subsampling::choose_n_farthest_points(k.squared_distance_d_object(), points, 100,
std::back_inserter(results));
std::clog << "Before sparsification: " << points.size() << " points.\n";
std::clog << "After sparsification: " << results.size() << " points.\n";
return 0;
}

Example: pick_n_random_points

This example outputs a subset of 100 points picked randomly.

#include <gudhi/pick_n_random_points.h>
#include <CGAL/Epick_d.h>
#include <CGAL/Random.h>
#include <iostream>
#include <vector>
#include <iterator>
int main(void) {
typedef CGAL::Epick_d<CGAL::Dimension_tag<4> > K;
typedef typename K::Point_d Point_d;
CGAL::Random rd;
std::vector<Point_d> points;
for (int i = 0; i < 500; ++i)
points.push_back(Point_d(rd.get_double(-1., 1), rd.get_double(-1., 1),
rd.get_double(-1., 1), rd.get_double(-1., 1)));
K k;
std::vector<Point_d> results;
Gudhi::subsampling::pick_n_random_points(points, 100, std::back_inserter(results));
std::clog << "Before sparsification: " << points.size() << " points.\n";
std::clog << "After sparsification: " << results.size() << " points.\n";
return 0;
}

Enumeration Type Documentation

◆ anonymous enum

anonymous enum : std::size_t
Enumerator
random_starting_point 

Argument for choose_n_farthest_points to indicate that the starting point should be picked randomly.

Function Documentation

◆ choose_n_farthest_points()

template<typename Distance , typename Point_range , typename PointOutputIterator , typename DistanceOutputIterator = Null_output_iterator>
void Gudhi::subsampling::choose_n_farthest_points ( Distance  dist,
Point_range const &  input_pts,
std::size_t  final_size,
std::size_t  starting_point,
PointOutputIterator  output_it,
DistanceOutputIterator  dist_it = {} 
)

Subsample by a greedy strategy of iteratively adding the farthest point from the current chosen point set to the subsampling.

The iteration starts with the landmark starting point or, if starting point==random_starting_point, with a random landmark. It chooses final_size points from a random access range input_pts (or the number of distinct points if final_size is larger) and outputs them in the output iterator output_it. It also outputs the distance from each of those points to the set of previous points in dist_it.

Template Parameters
Distancemust provide an operator() that takes 2 points (value type of the range) and returns their distance (or some more general proximity measure) as a double.
Point_rangeRandom access range of points.
PointOutputIteratorOutput iterator whose value type is the point type.
DistanceOutputIteratorOutput iterator for distances.
Parameters
[in]distA distance function.
[in]input_ptsThe input points.
[in]final_sizeThe size of the subsample to compute.
[in]starting_pointThe seed in the farthest point algorithm.
[out]output_itThe output iterator for points.
[out]dist_itThe optional output iterator for distances.
Warning
Older versions of this function took a CGAL kernel as argument. Users need to replace k with k.squared_distance_d_object() in the first argument of every call to choose_n_farthest_points.
Examples:
Subsampling/example_choose_n_farthest_points.cpp, Subsampling/example_custom_distance.cpp, Witness_complex/example_strong_witness_complex_off.cpp, Witness_complex/example_witness_complex_off.cpp, Witness_complex/example_witness_complex_sphere.cpp, Witness_complex/strong_witness_persistence.cpp, and Witness_complex/weak_witness_persistence.cpp.

◆ pick_n_random_points()

template<typename Point_container , typename OutputIterator >
void Gudhi::subsampling::pick_n_random_points ( Point_container const &  points,
std::size_t  final_size,
OutputIterator  output_it 
)

Subsample a point set by picking random vertices.

It chooses final_size distinct points from a random access range points and outputs them to the output iterator output_it. Point_container::iterator should be ValueSwappable and RandomAccessIterator.

Examples:
Subsampling/example_pick_n_random_points.cpp.

◆ sparsify_point_set()

template<typename Kernel , typename Point_range , typename OutputIterator >
void Gudhi::subsampling::sparsify_point_set ( const Kernel &  k,
Point_range const &  input_pts,
typename Kernel::FT  min_squared_dist,
OutputIterator  output_it 
)

Outputs a subset of the input points so that the squared distance between any two points is greater than min_squared_dist.

Template Parameters
Kernelmust be a model of the SearchTraits concept, such as the CGAL::Epick_d class, which can be static if you know the ambiant dimension at compile-time, or dynamic if you don't.
Point_rangeRange whose value type is Kernel::Point_d. It must provide random-access via operator[] and the points should be stored contiguously in memory.
OutputIteratorOutput iterator whose value type is Kernel::Point_d.
Parameters
[in]kA kernel object.
[in]input_ptsConst reference to the input points.
[in]min_squared_distMinimum squared distance separating the output points.
[out]output_itThe output iterator.
Examples:
Subsampling/example_sparsify_point_set.cpp, and Tangential_complex/example_with_perturb.cpp.
GUDHI  Version 3.4.1  - C++ library for Topological Data Analysis (TDA) and Higher Dimensional Geometry Understanding.  - Copyright : MIT Generated on Fri Jan 22 2021 09:41:16 for GUDHI by Doxygen 1.8.13