Searches a directory and its immediate subdirectories for duplicate WAV files,
identified by the MD5 hash of their contents. Returns one representative file
path for each set of duplicates found.
Usage
find_duplicate_wavs(dir, parallel = TRUE, n_cores = NULL)
Arguments
- dir
Character. Path to the directory to search. Must exist.
- parallel
Logical. If TRUE, use parallel processing for hashing. Default is TRUE.
- n_cores
Integer. Number of cores to use. Default is NULL (auto-detect).
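As an illustration of how parallel and n_cores might interact, here is a minimal sketch. It is not the package's implementation: the helper name hash_files is hypothetical, the "detected cores minus one" rule for n_cores = NULL is an assumption, and it uses tools::md5sum from base R in place of digest so it runs without extra packages.

```r
# Hypothetical helper, for illustration only; not part of the package.
hash_files <- function(files, parallel = TRUE, n_cores = NULL) {
  # Assumed auto-detect rule: leave one core free, never drop below one.
  if (is.null(n_cores)) {
    detected <- parallel::detectCores()
    n_cores <- if (is.na(detected)) 1L else max(1L, detected - 1L)
  }

  # Fall back to sequential hashing when parallelism would not help.
  if (!parallel || n_cores < 2L || length(files) < 2L) {
    return(unname(tools::md5sum(files)))
  }

  # Never start more workers than there are files to hash.
  cl <- parallel::makeCluster(min(n_cores, length(files)))
  on.exit(parallel::stopCluster(cl), add = TRUE)

  # Each worker computes the MD5 hash of one file's binary contents.
  unname(unlist(parallel::parLapply(cl, files, tools::md5sum)))
}
```

Under these assumptions, both modes return the same hashes in the same order as files. Sequential hashing may be faster for a small number of files because starting worker processes has a fixed cost.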
Value
A named character vector containing one representative file path for each set
of duplicates.
Returns an empty character vector if no duplicates are found.
Returns NULL invisibly if no WAV files are found.
Details
The function searches the specified directory and its immediate subdirectories
(one level deep) for WAV files. File matching is case-insensitive. Duplicates
are identified by computing MD5 hashes of the binary file contents.
Requires the digest package to be installed.
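The search and grouping described above can be sketched roughly as follows. This is not the package's source: the function name sketch_find_duplicate_wavs is hypothetical, naming the result by MD5 hash is an assumption, and tools::md5sum from base R stands in for digest (the equivalent digest call would be digest::digest(path, algo = "md5", file = TRUE)).

```r
# Hypothetical sketch, for illustration only; not the package's code.
sketch_find_duplicate_wavs <- function(dir) {
  stopifnot(dir.exists(dir))

  # The directory itself plus its immediate subdirectories (one level deep).
  dirs <- c(dir, list.dirs(dir, full.names = TRUE, recursive = FALSE))

  # Case-insensitive match on the .wav extension.
  files <- unlist(lapply(dirs, list.files, pattern = "\\.wav$",
                         ignore.case = TRUE, full.names = TRUE))
  # Drop any directories whose names happen to end in .wav.
  files <- files[!dir.exists(files)]
  if (length(files) == 0) return(invisible(NULL))

  # Hash the binary contents, then group the paths by hash.
  hashes <- unname(tools::md5sum(files))
  groups <- split(files, hashes)
  groups <- groups[lengths(groups) > 1]
  if (length(groups) == 0) return(character(0))

  # One representative path per duplicate set, named by hash (an assumption).
  vapply(groups, function(paths) paths[1], character(1))
}
```

Because files are compared by content hash, two files with different names or in different subdirectories are still reported as duplicates when their bytes are identical.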
Examples
if (FALSE) { # \dontrun{
# Find duplicate WAV files
dups <- find_duplicate_wavs("/path/to/audio/library")
# Print duplicate sets
if (length(dups) > 0) {
print(dups)
}
} # }