@BernalFA

While using dedenser as a Python package, I noticed the need for two modifications.

  1. The UMAP call in make_cloud has no fixed random_state, so repeatedly calling make_cloud and Dedenser.downsample() on the same dataset gives different downsampling results. I added a random_state parameter to make_cloud that seeds UMAP.
  2. To make dedenser easier to use as a Python package, I added an output option to make_cloud. When enabled, the function returns the point cloud array so it can be passed directly to Dedenser for downsampling. The array is still saved to file.

Minimal example:

from dedenser import make_cloud, Dedenser

# define random seed
SEED = 21

# create chemical point cloud
point_clouds = make_cloud(
    path=path_to_file, 
    path_out=path_out,
    position=5,
    heady=True,
    random_state=SEED, # new
    output=True # new
)

# run downsampling
dd = Dedenser(
    data=point_clouds,
    target=0.35,
    random_seed=SEED
)
out_cloud = dd.downsample()
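
For reviewers who want to see the intended behaviour in isolation, here is a rough, standard-library-only sketch of the pattern behind both changes. It is not dedenser's code: `make_cloud_sketch`, its `n_points` argument, and the JSON output file are hypothetical stand-ins, and a seeded Gaussian sampler replaces UMAP. It only illustrates that a fixed `random_state` makes repeated calls agree, and that `output=True` returns the array while the file is still written.

```python
import json
import random
import tempfile
from pathlib import Path


def make_cloud_sketch(n_points, path_out, random_state=None, output=False):
    """Hypothetical stand-in for a stochastic embedding step (not dedenser's API)."""
    # A local generator seeded with random_state; None falls back to OS entropy.
    rng = random.Random(random_state)
    cloud = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_points)]
    # The point cloud is always saved, regardless of the output flag.
    Path(path_out).write_text(json.dumps(cloud))
    # Only return the array when the caller asks for it.
    return cloud if output else None


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "cloud.json"
    first = make_cloud_sketch(5, out, random_state=21, output=True)
    second = make_cloud_sketch(5, out, random_state=21, output=True)
    unseeded = make_cloud_sketch(5, out, output=True)
    saved = json.loads(out.read_text())  # contents written by the last call
    silent = make_cloud_sketch(5, out, random_state=21)  # output left at False
```

With the same seed, `first` and `second` are identical, while `unseeded` differs between runs. Keeping `output=False` as the default in the sketch means existing callers that ignore the return value see no change.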
