Compare dimensionality reduction methods
This notebook demonstrates how to perform dimensionality reduction on a column in a tlc.Table using two different algorithms, UMAP and PaCMAP.

The Table used in this notebook contains a single column of points in 3 dimensions, which we reduce to points in 2 dimensions. Reducing from 3 to 2 dimensions is not a typical use case, but because both the input and the output can be plotted directly, it is a good way to visualize and compare the effects of different dimensionality reduction algorithms.
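Besides visual inspection, one possible way to put a number on such a comparison is to check how many of each point's nearest neighbors in 3D are still its nearest neighbors in 2D. The sketch below is only an illustration under stated assumptions: the helper names `knn_indices` and `neighbor_preservation` are hypothetical (they are not part of 3lc), the toy points are made up, and the brute-force search is only suitable for small samples.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p (brute force).
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(set(others[:k]))
    return result


def neighbor_preservation(high_dim, low_dim, k):
    """Mean fraction of each point's k nearest neighbors kept after reduction."""
    high = knn_indices(high_dim, k)
    low = knn_indices(low_dim, k)
    return sum(len(h & l) / k for h, l in zip(high, low)) / len(high_dim)


# Hypothetical toy data: five 3D points lying almost on a line.
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.1), (3.0, 0.0, 0.0),
             (7.0, 0.0, 0.1), (12.0, 0.0, 0.0)]

# A "good" 2D embedding (drop the nearly constant z coordinate) and a
# deliberately bad one (the same 2D points assigned to the wrong rows).
good_2d = [(x, y) for x, y, _ in points_3d]
scrambled_2d = [good_2d[i] for i in (2, 4, 0, 3, 1)]

score_good = neighbor_preservation(points_3d, good_2d, k=2)
score_scrambled = neighbor_preservation(points_3d, scrambled_2d, k=2)
print(score_good, score_scrambled)
```

A score of 1.0 means every neighborhood survived the reduction, and lower scores mean more neighborhoods were broken up. The same function could, in principle, be applied to a sample of the original points and each reduced column to compare the parameter sets used in this notebook.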
To run this notebook, you must also have run:
Install dependencies
[ ]:
%pip install "3lc[umap,pacmap]"
[ ]:
import tlc
# Load the table from the previous example. It contains a single column containing the 3D points.
table = tlc.Table.from_names(table_name="mammoth-10k", dataset_name="Mammoth", project_name="3LC Tutorials - Mammoth")
table.columns
[ ]:
# First UMAP configuration: small neighborhoods, tightly packed embedding
umap_params_1 = {
    "n_components": 2,  # Project the data to 2 dimensions
    "n_neighbors": 15,  # Size of the local neighborhood; fewer neighbors emphasize local structure
    "min_dist": 0.1,  # Minimum distance between embedded points; small values pack points tightly and preserve local structure
    "metric": "euclidean",  # Use Euclidean distance to measure similarity
    "retain_source_embedding_column": True,
    "source_embedding_column": "points",  # The column containing the 3D points
}
reduced_umap_1 = tlc.reduce_embeddings(table, method="umap", **umap_params_1)

# Second UMAP configuration: larger neighborhoods, more spread-out embedding
umap_params_2 = {
    "n_components": 2,  # Project the data to 2 dimensions
    "n_neighbors": 50,  # Size of the local neighborhood; more neighbors emphasize global structure
    "min_dist": 0.5,  # Minimum distance between embedded points; larger values give a more spread-out embedding
    "metric": "manhattan",  # Use Manhattan distance to measure similarity
    "retain_source_embedding_column": True,
    "source_embedding_column": "points",  # The column containing the 3D points
}
reduced_umap_2 = tlc.reduce_embeddings(table, method="umap", **umap_params_2)
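The two UMAP configurations above also differ in their "metric" setting. As a small, self-contained illustration of what that choice means, the sketch below computes both distances for a pair of 3D points. The points and the helper functions are made up for this example and do not reflect how UMAP computes distances internally.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two hypothetical 3D points, chosen only for illustration.
p = (0.0, 0.0, 0.0)
q = (1.0, 2.0, 2.0)

d_euclidean = euclidean(p, q)  # sqrt(1 + 4 + 4) = 3.0
d_manhattan = manhattan(p, q)  # 1 + 2 + 2 = 5.0
print(d_euclidean, d_manhattan)
```

Manhattan distance is never smaller than Euclidean distance for the same pair of points, and the two can rank neighbors differently. Changing the metric can therefore change which points count as neighbors.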
[ ]:
# First PaCMAP configuration: few neighbors, many further pairs
pacmap_params_1 = {
    "n_components": 2,  # Project the data to 2 dimensions
    "n_neighbors": 10,  # Number of neighbors to consider; fewer neighbors emphasize local structure
    "MN_ratio": 0.5,  # Ratio of mid-near pairs to neighbors; balances local and global structure
    "FP_ratio": 2.0,  # Ratio of further pairs to neighbors; a higher value puts more emphasis on global structure
    "retain_source_embedding_column": True,
    "source_embedding_column": "points",  # The column containing the 3D points
}
reduced_pacmap_1 = tlc.reduce_embeddings(reduced_umap_2, method="pacmap", **pacmap_params_1)

# Second PaCMAP configuration: more neighbors, equal ratios
pacmap_params_2 = {
    "n_components": 2,  # Project the data to 2 dimensions
    "n_neighbors": 30,  # Number of neighbors to consider; more neighbors emphasize global structure
    "MN_ratio": 1.0,  # Ratio of mid-near pairs to neighbors; as many mid-near pairs as neighbors
    "FP_ratio": 1.0,  # Ratio of further pairs to neighbors; as many further pairs as neighbors
    "retain_source_embedding_column": True,
    "source_embedding_column": "points",  # The column containing the 3D points
}
reduced_pacmap_2 = tlc.reduce_embeddings(reduced_pacmap_1, method="pacmap", **pacmap_params_2)
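To get a feel for how the two PaCMAP configurations above differ, it can help to translate the ratios into rough pair counts per point. The sketch below assumes, following the PaCMAP documentation, that both ratios are defined relative to n_neighbors. The helper `pacmap_pair_counts` is a hypothetical name used only for this illustration; it is not part of 3lc or pacmap.

```python
def pacmap_pair_counts(n_neighbors, MN_ratio, FP_ratio):
    """Approximate number of pairs sampled per point, by pair type.

    Assumes the mid-near and further pair counts are n_neighbors scaled
    by the respective ratio, rounded to the nearest integer.
    """
    return {
        "neighbor": n_neighbors,
        "mid_near": int(round(n_neighbors * MN_ratio)),
        "further": int(round(n_neighbors * FP_ratio)),
    }


# The two parameter sets used in this notebook.
counts_1 = pacmap_pair_counts(n_neighbors=10, MN_ratio=0.5, FP_ratio=2.0)
counts_2 = pacmap_pair_counts(n_neighbors=30, MN_ratio=1.0, FP_ratio=1.0)

print(counts_1)  # {'neighbor': 10, 'mid_near': 5, 'further': 20}
print(counts_2)  # {'neighbor': 30, 'mid_near': 30, 'further': 30}
```

Under this assumption, the first configuration uses twice as many further pairs as neighbor pairs, while the second uses equal numbers of all three pair types.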