Hopefully this is enough to convince you that the embedding parameters can be profitably twiddled with in more than a random way to give visualizations that improve over the default settings. Of min_dist and spread, modifying min_dist between 0 and 1, as suggested by the UMAP docs, seems to be the most fruitful of the parameters to meddle with. I personally prefer to use a and b directly.

To find good values for a and b, you can start with them at a = 1 and b = 1, which gives a t-SNE-like output function, and you can use the tumap function to generate the initial plot much faster. I also recommend doing any PCA dimensionality reduction outside of uwot and using ret_nn = TRUE for the first plot, so you can re-use the nearest neighbors data in subsequent runs of umap. This is a substantial speed-up and makes repeating runs with different values of a and b a lot more tolerable. Here's an example workflow:

# PCA to 50 dimensions first
mnist_pca <- irlba::prcomp_irlba(mnist, n = 50, retx = TRUE, center = center,
# remember to get the nearest neighbor data back too
mnist_a1b1 <- tumap(mnist_pca, ret_nn = TRUE)
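For anyone who wants to try the re-use step without MNIST to hand, here is a minimal sketch of the same idea on the built-in iris data. It rests on a few assumptions that are mine rather than part of the workflow above: stats::prcomp stands in for irlba::prcomp_irlba, three principal components are kept only because iris has four columns, the values a = 1.5 and b = 0.9 are arbitrary, and the names x_pca, res_a1b1 and emb_tuned are made up for illustration. As far as I understand uwot, the nn item returned with ret_nn = TRUE can be handed straight back through nn_method, but do check the documentation for your installed version.

```r
# Sketch only: iris stands in for MNIST, and stats::prcomp stands in for
# irlba::prcomp_irlba, so the only non-base dependency is uwot.
library(uwot)

set.seed(42)
x <- as.matrix(iris[, 1:4])

# PCA outside of uwot; keep the scores for the first 3 principal components
x_pca <- stats::prcomp(x, center = TRUE, scale. = FALSE, rank. = 3)$x

# First run: tumap (the a = 1, b = 1 case), asking for the neighbor data back
res_a1b1 <- tumap(x_pca, n_neighbors = 15, ret_nn = TRUE)

# Later runs: pass the stored neighbors via nn_method and vary a and b.
# a = 1.5 and b = 0.9 are arbitrary illustrative values, not recommendations.
emb_tuned <- umap(x_pca, nn_method = res_a1b1$nn, a = 1.5, b = 0.9)
```

With ret_nn = TRUE the coordinates are in res_a1b1$embedding, while emb_tuned is a plain matrix because the second call does not ask for the neighbors again. Only the optimization is repeated in the second call, which is where the time saving comes from.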