MinHash clustering
The MinHash clustering service subscribes to the Redis database and listens for jobs on the minhash queue.
Tasks
- minhash_service.tasks.add_signature(sample_id: str, signature) → str
Add a signature to the database and index.
- minhash_service.tasks.remove_signature(sample_id: str) → Dict[str, str | bool]
Remove a signature from the database and index.
- minhash_service.tasks.similar(sample_id: str, min_similarity: float = 0.5, limit: int | None = None) → List[SimilarSignature]
Find signatures similar to reference signature.
- Parameters:
sample_id (str) – The id of the reference sample
min_similarity (float) – Minimum similarity score
limit (int | None) – Maximum number of samples to return; defaults to None
- Returns:
list of the similar signatures
- Return type:
List[SimilarSignature]
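
The filtering behaviour described by min_similarity and limit can be illustrated with a minimal sketch. It is not the service's implementation: the in-memory index, the jaccard helper, and the fields of the SimilarSignature record are all assumptions made for illustration, and the real task presumably works on sourmash signatures rather than plain sets. Whether the reference sample itself appears in the result is also an assumption here.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class SimilarSignature(NamedTuple):
    """Hypothetical result record; the real class may carry other fields."""

    sample_id: str
    similarity: float


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard similarity of two hash sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar(
    sample_id: str,
    index: Dict[str, Set[int]],  # assumed stand-in for the signature index
    min_similarity: float = 0.5,
    limit: Optional[int] = None,
) -> List[SimilarSignature]:
    """Rank indexed samples by similarity to the reference sample."""
    reference = index[sample_id]
    hits = [
        SimilarSignature(other_id, jaccard(reference, hashes))
        for other_id, hashes in index.items()
    ]
    # Keep only hits at or above the threshold, best match first.
    hits = [hit for hit in hits if hit.similarity >= min_similarity]
    hits.sort(key=lambda hit: hit.similarity, reverse=True)
    return hits if limit is None else hits[:limit]


index = {
    "sample_a": {1, 2, 3, 4},
    "sample_b": {1, 2, 3, 5},
    "sample_c": {7, 8, 9},
}
result = similar("sample_a", index, min_similarity=0.5)
```

In this sketch the reference sample matches itself with similarity 1.0, sample_b passes the threshold, and sample_c is dropped.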
- minhash_service.tasks.cluster(sample_ids: List[str], cluster_method: str = 'single') → str
Cluster multiple samples on their sourmash signatures.
- Parameters:
sample_ids (List[str]) – The sample ids to cluster
cluster_method (str) – The linkage or clustering method to use; defaults to 'single'
- Raises:
ValueError – if cluster_method is not a valid MSTree clustering method.
- Returns:
Clustering result in Newick format
- Return type:
str
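
To make the contract concrete (validate the method name, cluster, return a Newick string), here is a rough pure-Python sketch. It is an illustration under stated assumptions, not the service's code: the LINKAGE table is a hypothetical subset of method names, the pairwise distances are passed in explicitly rather than computed from signatures, and branch lengths are omitted from the output.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List

# Hypothetical subset of linkage methods, for this sketch only.
LINKAGE: Dict[str, Callable] = {"single": min, "complete": max}


def cluster(
    distances: Dict[FrozenSet[str], float],  # assumed precomputed distances
    sample_ids: List[str],
    cluster_method: str = "single",
) -> str:
    """Agglomerate samples and return the tree topology in Newick format."""
    if cluster_method not in LINKAGE:
        raise ValueError(f"'{cluster_method}' is not a valid clustering method")
    linkage = LINKAGE[cluster_method]
    # Each cluster maps its Newick label to the sample ids it contains.
    clusters: Dict[str, List[str]] = {sid: [sid] for sid in sample_ids}

    def dist(left: str, right: str) -> float:
        # Single linkage takes the closest pair, complete the farthest.
        return linkage(
            distances[frozenset((a, b))]
            for a in clusters[left]
            for b in clusters[right]
        )

    while len(clusters) > 1:
        # Merge the two closest clusters into one labelled subtree.
        left, right = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        members = clusters.pop(left) + clusters.pop(right)
        clusters[f"({left},{right})"] = members
    return next(iter(clusters)) + ";"


distances = {
    frozenset(("a", "b")): 0.1,
    frozenset(("a", "c")): 0.8,
    frozenset(("b", "c")): 0.7,
}
newick = cluster(distances, ["a", "b", "c"])
```

Here a and b are merged first because they are closest, and c joins the resulting subtree afterwards. An unknown method name raises ValueError before any clustering work is done.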
- minhash_service.tasks.find_similar_and_cluster(sample_id: str, min_similarity: float = 0.5, limit: int | None = None, cluster_method: str = 'single') → str
Find similar samples and cluster them on their minhash profile.
- Parameters:
sample_id (str) – The id of the reference sample
min_similarity (float) – Minimum similarity score
limit (int | None) – Maximum number of samples to return; defaults to None
cluster_method (str) – The linkage or clustering method to use; defaults to 'single'
- Raises:
ValueError – if cluster_method is not a valid MSTree clustering method.
- Returns:
Clustering result in Newick format
- Return type:
str
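
Judging by its name and parameters, this task likely chains a similarity search with a clustering step. The sketch below shows that composition only. The injected find_similar and cluster callables and the two fake helpers are hypothetical stand-ins, not the service's functions, and the real task may add validation or error handling not shown here.

```python
from typing import Callable, List, Optional


def find_similar_and_cluster(
    sample_id: str,
    find_similar: Callable[[str, float, Optional[int]], List[str]],  # assumed
    cluster: Callable[[List[str], str], str],  # assumed
    min_similarity: float = 0.5,
    limit: Optional[int] = None,
    cluster_method: str = "single",
) -> str:
    """Search for similar samples, then cluster the hits into a Newick tree."""
    sample_ids = find_similar(sample_id, min_similarity, limit)
    return cluster(sample_ids, cluster_method)


def fake_similar(
    sample_id: str, min_similarity: float, limit: Optional[int]
) -> List[str]:
    """Stand-in search that returns a fixed, already ranked list of ids."""
    hits = ["s1", "s2", "s3"]
    return hits if limit is None else hits[:limit]


def fake_cluster(sample_ids: List[str], cluster_method: str) -> str:
    """Stand-in clustering that returns a flat Newick tree."""
    return "(" + ",".join(sample_ids) + ");"


newick = find_similar_and_cluster("s1", fake_similar, fake_cluster, limit=2)
```

Passing the two steps in as callables keeps the sketch self-contained; in the service they would be the search and clustering routines documented above.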