Archive Support
checkle can hash files within archives without extracting them.
Supported Formats
TAR Archives
- .tar - Uncompressed
- .tar.gz / .tgz - Gzip compressed
- .tar.bz2 - Bzip2 compressed
- .tar.xz - XZ compressed
ZIP Archives
- .zip - Various compression methods
Basic Usage
Hash Specific File in Archive
checkle hash archive.tar:path/to/file.txt
Hash All Files in Archive
checkle hash archive.tar.gz:*
Hash Files Matching Pattern
checkle hash data.zip:*.csv
checkle hash backup.tar:logs/*.log
Archive Path Syntax
Use a colon (:) to separate the archive path from the path inside the archive:
archive_path:internal_path
Examples:
# Specific file
data.tar.gz:results/output.txt
# All files
data.tar.gz:*
# Pattern matching
data.zip:*.fastq
data.tar:experiments/*/results.csv
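To make the split concrete, here is a minimal Python sketch of how an archive_path:internal_path specification could be parsed. It is an illustration only, not checkle's actual parser: the name `split_archive_spec` is hypothetical, and the rule of splitting at the first colon that follows a supported archive extension is an assumption chosen so that other colons in the path (for example a Windows drive letter) are left alone.

```python
import re
from typing import Optional

# Extensions come from the Supported Formats list; the matching rule itself
# is an assumption made for this sketch, not documented checkle behavior.
_SPEC = re.compile(
    r"^(?P<archive>.+?\.(?:tar(?:\.(?:gz|bz2|xz))?|tgz|zip)):(?P<inner>.+)$",
    re.IGNORECASE,
)


def split_archive_spec(spec: str) -> tuple[str, Optional[str]]:
    """Split 'archive:inner' into (archive, inner).

    inner is None when the spec names only the archive file itself.
    """
    match = _SPEC.match(spec)
    if match is None:
        return spec, None
    return match.group("archive"), match.group("inner")
```

With this rule, `split_archive_spec("data.tar.gz:results/output.txt")` separates the archive from the internal path, while a plain `archive.tar.gz` comes back with no internal path, which corresponds to hashing the archive file itself.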
Pattern Matching
Wildcards
- * - Match any characters (except /)
- ** - Match any characters (including /)
- ? - Match single character
Examples
# All CSV files in root
checkle hash archive.zip:*.csv
# All files in subdirectory
checkle hash archive.tar:data/*
# Recursive pattern
checkle hash archive.tar.gz:**/*.txt
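The wildcard rules can be illustrated with a small Python sketch that translates a pattern into a regular expression. This is a literal reading of the three rules listed under Wildcards, not checkle's matcher: the names `compile_glob` and `glob_match` are hypothetical, and treating ? as "any single character except /" is an assumption.

```python
import re


def compile_glob(pattern: str) -> re.Pattern:
    """Translate a wildcard pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # ** crosses directory separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # * stops at a directory separator
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")    # assumption: ? never matches "/"
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def glob_match(pattern: str, path: str) -> bool:
    """Return True when the whole internal path matches the pattern."""
    return compile_glob(pattern).fullmatch(path) is not None
```

Under this literal reading, `*.csv` matches only entries in the archive root, and `**/*.txt` requires at least one directory level, so a root-level `b.txt` does not match. Whether checkle also matches root-level files for `**/` patterns is not stated here and should be checked against the tool.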
Performance
Streaming Processing
- Files are processed without full extraction
- Memory usage stays bounded regardless of archive size
- Decompression happens on the fly
Limitations
- Entries within an archive are read sequentially
- Individual entries of one archive cannot be hashed in parallel
- Compressed archives must be decompressed while reading, which adds overhead
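The streaming behavior and its sequential-access limitation can be sketched with Python's standard `tarfile` module. This is not checkle's implementation: the function name `hash_tar_members`, the 1 MiB chunk size, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import tarfile


def hash_tar_members(fileobj, chunk_size=1 << 20):
    """Yield (member_name, sha256_hex) for each regular file, in archive order.

    fileobj is a binary file object positioned at the start of the archive.
    """
    # "r|*" opens the archive as a forward-only stream and detects the
    # compression automatically; nothing is written to disk.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            digest = hashlib.sha256()
            reader = tar.extractfile(member)
            # Reading in fixed-size chunks keeps memory use independent of
            # the size of the entry.
            while chunk := reader.read(chunk_size):
                digest.update(chunk)
            yield member.name, digest.hexdigest()
```

Because the stream is forward-only, each entry has to be consumed before the next header can be read. That is why entries of a single archive are handled one after another rather than in parallel.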
Examples
Genomics Data
# Hash FASTQ files in compressed archive
checkle hash sequencing_run.tar.gz:*.fastq
# Verify specific sample
checkle verify reads.tar.gz:sample_001.fastq --hash abc123
Backup Verification
# Hash all files in backup
checkle hash backup.tar.gz:* -o backup_checksums.txt
# Verify backup integrity later
checkle verify-many --checksum-file backup_checksums.txt
Data Transfer
# Before transfer - hash archive contents
checkle hash data.tar.gz:* > checksums_before.txt
# After transfer - verify integrity
checkle hash data.tar.gz:* > checksums_after.txt
diff checksums_before.txt checksums_after.txt
Archive vs Regular File
Without a colon, checkle hashes the archive file itself:
checkle hash archive.tar.gz
# Output: abc123def456 archive.tar.gz
With a colon, checkle hashes the named file inside the archive:
checkle hash archive.tar.gz:file.txt
# Output: 789xyz012 archive.tar.gz:file.txt
Compressed Archives
Compression is handled transparently:
# All work the same way
checkle hash data.tar:file.txt # Uncompressed
checkle hash data.tar.gz:file.txt # Gzip
checkle hash data.tar.bz2:file.txt # Bzip2
checkle hash data.tar.xz:file.txt # XZ
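What "transparent" means can be shown with a short Python sketch: the digest of a member depends only on its bytes, not on how the surrounding archive is compressed. The sketch uses the standard `tarfile` module with automatic compression detection. The function name `hash_tar_member` and the use of SHA-256 are assumptions, and this is not checkle's code.

```python
import hashlib
import tarfile


def hash_tar_member(archive_path, member_name, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one member of a TAR archive."""
    digest = hashlib.sha256()
    # "r:*" lets tarfile detect gzip, bzip2, xz, or no compression itself.
    with tarfile.open(archive_path, mode="r:*") as tar:
        reader = tar.extractfile(member_name)  # raises KeyError if absent
        if reader is None:
            raise ValueError(f"{member_name!r} is not a regular file")
        while chunk := reader.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling this on data.tar, data.tar.gz, data.tar.bz2, and data.tar.xz built from the same file.txt yields the same digest each time; only the time spent decompressing differs.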
Verification
Single File in Archive
checkle verify archive.tar:important.dat --hash e3b0c44298fc1c14
Multiple Files
Create checksum file:
checkle hash archive.tar:* -o archive_checksums.txt
Verify later:
checkle verify-many --checksum-file archive_checksums.txt
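The logic behind checking many saved checksums can be sketched in Python, here applied to a ZIP archive using the standard `zipfile` module. Several things are assumptions rather than documented checkle behavior: the function names, the use of SHA-256, the checksum line layout (a digest, whitespace, then a name, as in the output examples on this page), and the simplification that names are internal paths with the archive prefix already removed.

```python
import hashlib
import zipfile


def parse_checksum_lines(lines):
    """Parse '<digest> <name>' lines into a {name: digest} dict.

    Blank lines and lines starting with '#' are skipped.
    """
    expected = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, name = line.split(maxsplit=1)
        expected[name] = digest.lower()
    return expected


def verify_zip_members(zip_file, expected, chunk_size=1 << 20):
    """Return the sorted member names that are missing or whose digest differs."""
    failed = []
    with zipfile.ZipFile(zip_file) as zf:
        present = set(zf.namelist())
        for name, want in expected.items():
            if name not in present:
                failed.append(name)
                continue
            digest = hashlib.sha256()
            with zf.open(name) as member:
                while chunk := member.read(chunk_size):
                    digest.update(chunk)
            if digest.hexdigest() != want:
                failed.append(name)
    return sorted(failed)
```

An empty result means every listed member was found and matched its saved digest.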
Tips
- Use patterns to hash multiple files - More efficient than running one command per file
- Save checksums for archives - Verify integrity later against the saved values
- Compressed archives are slower - Decompression adds overhead
- Large archives work fine - Streaming keeps memory usage bounded
- Archive path must exist - The archive file itself must be accessible