Const
Detect quality encoding from single quality string using sequencing technology patterns
Implements intelligent quality encoding detection that accounts for the complex history of sequencing technology evolution. The algorithm uses ASCII range analysis combined with pattern recognition to distinguish between Solexa, Phred+64, and Phred+33 encodings. This is essential for processing legacy sequencing data and ensuring compatibility across the 15+ year history of high-throughput sequencing.
Detection Algorithm Strategy: The algorithm uses a multi-stage approach to handle overlapping ASCII ranges:
Technological Context:
Detection Challenges: The overlapping ASCII ranges create fundamental ambiguity:
Biological Impact of Quality Scores:
Platform-Specific Characteristics:
Quality string from FASTQ record
Detected quality encoding with confidence assessment
// Typical Illumina 1.5 quality string (Phred+64)
const illumina15 = "@@CDEFGHIJKLMNOPQRSTUVWXYZ[\\]";
const encoding = detectEncodingImmediate(illumina15);
console.log(encoding); // "phred64" - legacy Illumina format
// Modern Illumina data with high quality scores
const modernHQ = "IIIIIIIIIIIIIIIIIIIII"; // Q40 across entire read
const encoding = detectEncodingImmediate(modernHQ);
console.log(encoding); // "phred33" - modern standard
// Original Solexa scoring (odds-based, rare in modern data)
const solexa = ";;;;;;;;;;"; // Poor quality, Solexa-specific range
const encoding = detectEncodingImmediate(solexa);
console.log(encoding); // "solexa" - historical format
🔥 NATIVE CANDIDATE: ASCII min/max finding with SIMD acceleration
Primary quality encoding detection (re-export for intuitive API) Uses immediate detection as the primary use case for most workflows