deduplication | SigPort

Approximate Hashing for Bioinformatics

Deduplication is a form of lossless data compression. It is often applied when a large data repository already exists and we wish to store a new, updated version of it in which the changes account for only a tiny fraction of the accumulated information. The idea is to find the duplicated parts and store only one copy P of each; the second and subsequent occurrences of a part can then be replaced by pointers to P.
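The store-one-copy-and-point idea can be illustrated with a minimal Python sketch. It is not the method presented in the talk: it assumes fixed-size chunks and exact SHA-256 digests rather than approximate hashing, and the names `deduplicate`, `restore`, and `chunk_size` are hypothetical, chosen only for this illustration.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and keep one copy of each distinct chunk.

    Assumption for illustration: fixed-size chunking with exact SHA-256 digests.
    """
    store = {}     # digest -> the single stored copy P of a chunk
    pointers = []  # one digest per chunk, in original order
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # store only the first occurrence
        pointers.append(digest)          # every occurrence becomes a pointer
    return store, pointers


def restore(store, pointers) -> bytes:
    """Rebuild the original data by following the pointers in order."""
    return b"".join(store[digest] for digest in pointers)


if __name__ == "__main__":
    data = b"ACGTACGT" * 4 + b"TTTTGGGG"
    store, pointers = deduplicate(data, chunk_size=8)
    print(len(pointers), "chunks,", len(store), "stored")
    assert restore(store, pointers) == data
```

Exact digests only detect chunks that are byte-for-byte identical, so a real system would also have to choose chunk boundaries and a hash suited to near-duplicate data.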

DCC-Approx.pptx

Categories: Other
