2.2.1. Datasets

The load_datasets submodule of the Machine Learning module provides persistence diagrams precomputed from real datasets used in machine learning applications. Three datasets can be loaded directly, and each has an accompanying notebook that walks through a machine learning pipeline. An additional dataset is available for download.

“A Comparative Study of Machine Learning Methods for Persistence Diagrams” gives an overview of these datasets and details the type of persistence used to compute the persistence diagrams from each one.
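
For readers new to the format, a persistence diagram is a multiset of (birth, death) pairs, one per topological feature. The sketch below assumes a diagram is stored as a plain list of such pairs; the container actually returned by the loaders is not described in this section, and the helper name `lifetimes` and the toy values are hypothetical.

```python
import math


def lifetimes(diagram):
    """Return death - birth for every finite point of a persistence diagram.

    `diagram` is assumed to be an iterable of (birth, death) pairs.
    Points with an infinite death value are skipped.
    """
    return [death - birth for birth, death in diagram if not math.isinf(death)]


# Toy diagram for illustration only; not taken from any dataset in this module.
toy_diagram = [(0.0, 1.5), (0.25, 0.75), (1.0, math.inf)]
print(lifetimes(toy_diagram))
```

Lifetimes like these are a common starting point for turning a diagram into features, since long-lived points usually correspond to the more prominent topological structure.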


Load the persistence diagrams computed from the training portion of the MNIST dataset (http://yann.lecun.com/exdb/mnist/)


Load the persistence diagrams from the MPEG7 dataset


Load the persistence diagrams from the shrec14 dataset
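
The three loaders described above might be called as in the following minimal sketch. The function signatures are not shown in this section, so the module path `teaspoon.ML.load_datasets` and the loader names `mnist`, `mpeg7`, and `shrec14` are assumptions, as is the idea that each loader takes no arguments. The wrapper `load_diagrams` is a hypothetical helper that returns `None` instead of failing when any of those assumptions does not hold.

```python
import importlib

# Assumed loader names; this section does not list the function signatures.
ASSUMED_LOADERS = ("mnist", "mpeg7", "shrec14")


def load_diagrams(name, module_path="teaspoon.ML.load_datasets"):
    """Call the loader named `name`, or return None if it cannot be found.

    Both `module_path` and the loader names are assumptions, so a missing
    package or attribute yields None rather than an exception.
    """
    if name not in ASSUMED_LOADERS:
        raise ValueError(f"unknown dataset {name!r}; expected one of {ASSUMED_LOADERS}")
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return None  # package not installed, or the assumed path is wrong
    loader = getattr(module, name, None)
    return loader() if callable(loader) else None
```

A call such as `load_diagrams("mpeg7")` would then return whatever the underlying loader produces, or `None` if the package is unavailable. Check the result before passing it into a pipeline.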