2.5.1. Datasets

The load_datasets submodule of the Machine Learning module provides persistence diagrams computed from real datasets used in machine learning applications. Three datasets are available to load, each with an accompanying notebook containing a machine learning pipeline. An additional dataset is available for download.

“A Comparative Study of Machine Learning Methods for Persistence Diagrams” gives an overview of these datasets and describes the type of persistence used to compute the persistence diagrams from each one.

teaspoon.ML.load_datasets.mnist()

Load the persistence diagrams computed from the training portion of the MNIST dataset (http://yann.lecun.com/exdb/mnist/).

Columns available:

- zero_dim_rtl: 0-dimensional persistence diagrams computed using right to left Euler transform
- zero_dim_ltr: 0-dimensional persistence diagrams computed using left to right Euler transform
- zero_dim_btt: 0-dimensional persistence diagrams computed using bottom to top Euler transform
- zero_dim_ttb: 0-dimensional persistence diagrams computed using top to bottom Euler transform
- one_dim_rtl: 1-dimensional persistence diagrams computed using right to left Euler transform
- one_dim_ltr: 1-dimensional persistence diagrams computed using left to right Euler transform
- one_dim_btt: 1-dimensional persistence diagrams computed using bottom to top Euler transform
- one_dim_ttb: 1-dimensional persistence diagrams computed using top to bottom Euler transform
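
The column names listed above follow a regular pattern: a dimension prefix (zero_dim or one_dim) followed by a direction code (rtl, ltr, btt, ttb). The following is a minimal sketch of how a column could be selected by dimension and direction. The helpers mnist_column and load_mnist_diagrams are hypothetical names introduced here for illustration, not part of teaspoon, and the sketch assumes that mnist() returns a table that can be indexed by column name (for example, a pandas DataFrame).

```python
from itertools import product

# Dimension prefixes and direction codes taken from the column list above.
DIMENSIONS = {0: "zero_dim", 1: "one_dim"}
DIRECTIONS = ("rtl", "ltr", "btt", "ttb")


def mnist_column(dim, direction):
    """Build a column name such as 'zero_dim_rtl' (hypothetical helper)."""
    if dim not in DIMENSIONS:
        raise ValueError(f"dim must be one of {sorted(DIMENSIONS)}, got {dim!r}")
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}, got {direction!r}")
    return f"{DIMENSIONS[dim]}_{direction}"


# All eight column names, in the order they are listed above.
ALL_COLUMNS = [mnist_column(d, s) for d, s in product(DIMENSIONS, DIRECTIONS)]


def load_mnist_diagrams(dim, direction):
    """Return one column of MNIST persistence diagrams (hypothetical helper).

    Assumption: mnist() returns an object indexable by column name.
    """
    # Imported lazily so the naming helpers work without teaspoon installed.
    from teaspoon.ML import load_datasets

    data = load_datasets.mnist()
    return data[mnist_column(dim, direction)]
```

With teaspoon installed, load_mnist_diagrams(1, "ttb") would select the one_dim_ttb column under the assumption stated above.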

teaspoon.ML.load_datasets.mpeg7()

Load the persistence diagrams from the MPEG7 dataset

teaspoon.ML.load_datasets.shrec14()

Load the persistence diagrams from the SHREC14 dataset
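
Since the three loaders documented in this section are all called without arguments, they can be dispatched by name. The sketch below shows one way to do this. The function load_dataset is a hypothetical wrapper introduced for illustration and is not part of teaspoon. It assumes teaspoon.ML.load_datasets is importable as a module, and it makes no assumption about the type of object each loader returns.

```python
import importlib

# Loader names documented in this section.
AVAILABLE = ("mnist", "mpeg7", "shrec14")


def load_dataset(name):
    """Call the teaspoon loader with the given name (hypothetical wrapper)."""
    if name not in AVAILABLE:
        raise ValueError(f"unknown dataset {name!r}; expected one of {AVAILABLE}")
    # Imported only after validation, so bad names fail without teaspoon installed.
    module = importlib.import_module("teaspoon.ML.load_datasets")
    return getattr(module, name)()
```

For example, load_dataset("mpeg7") would be equivalent to calling teaspoon.ML.load_datasets.mpeg7() directly.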