Create a new SeqOps pipeline
Input sequences (async iterable)
Static from: Create SeqOps pipeline from delimiter-separated file
Supports auto-detection of delimiter and format. Files can be compressed (.gz, .zst) and will be automatically decompressed during streaming.
Path to DSV file (TSV, CSV, or custom delimiter)
Optional options: Parsing options (delimiter auto-detected if not specified)
New SeqOps pipeline for sequence processing
Static from: Create SeqOps pipeline from TSV (tab-separated) file
Convenience method for TSV files with tab delimiter pre-configured.
Path to TSV file
Optional options: Parsing options (delimiter forced to tab)
New SeqOps pipeline
Static from: Create SeqOps pipeline from CSV (comma-separated) file
Convenience method for CSV files with comma delimiter pre-configured. Handles Excel-exported CSV files with proper quote escaping.
Path to CSV file
Optional options: Parsing options (delimiter forced to comma)
New SeqOps pipeline
Static from: Create SeqOps pipeline from JSON file
Parses JSON files containing sequence arrays. Supports both simple array format and wrapped format with metadata. Suitable for datasets under 100K sequences (loads entire file into memory).
Path to JSON file
Optional options: JSONParseOptions - Parsing options (format, quality encoding)
New SeqOps pipeline
Static from: Create SeqOps pipeline from JSONL (JSON Lines) file
Parses JSONL files where each line is a separate JSON object. Provides streaming with O(1) memory usage, suitable for datasets with millions of sequences.
Path to JSONL file
Optional options: JSONParseOptions - Parsing options (format, quality encoding)
New SeqOps pipeline
Static from: Create SeqOps pipeline from array of sequences
Convenience method for converting arrays to SeqOps pipelines. The most common entry point for examples and small datasets.
Array of sequences
New SeqOps instance
Filter sequences based on criteria
Remove sequences that don't meet specified criteria. All criteria within a single filter call are combined with AND logic.
After calling .enumerate(), the index parameter becomes available in
predicate functions, enabling position-based filtering.
Filter criteria or custom predicate (with index after enumerate)
New SeqOps instance for chaining
// Filter by length and GC content
seqops(sequences)
.filter({ minLength: 100, maxGC: 60 })
.filter({ hasAmbiguous: false });
// Custom filter function
seqops(sequences)
.filter((seq) => seq.id.startsWith('chr'));
// With index (after enumerate) - keep even positions
seqops(sequences)
.enumerate()
.filter((seq, idx) => idx % 2 === 0);
// Async predicate with index
seqops(sequences)
.enumerate()
.filter(async (seq, idx) => {
const valid = await validateSequence(seq);
return valid && idx < 1000;
});
// Type preservation with FastqSequence
seqops<FastqSequence>(reads)
.filter((seq) => seq.quality !== undefined);
// Type: SeqOps<FastqSequence> ✅
Filter sequences based on membership in a SequenceSet
Efficiently filters the stream based on whether sequences are present in the provided set. Useful for contamination removal, whitelist/blacklist filtering, and set-based operations.
SequenceSet to filter against
Optional options: { exclude?: boolean; by?: "sequence" | "id" } - Filtering options
Filtered SeqOps instance
// Remove contamination sequences
const contaminants = await seqops("contaminants.fasta").collectSet();
await seqops("reads.fastq")
.filterBySet(contaminants, { exclude: true })
.writeFastq("clean_reads.fastq");
Transform sequence content
Apply transformations that modify the sequence string itself.
Transform options
New SeqOps instance for chaining
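A typical content transformation is reverse complementation. The sketch below is illustrative only; `reverseComplement` is a hypothetical standalone helper, not the library's internal implementation, and the exact transform option names are not shown in this section.

```typescript
// Minimal sketch of a reverse-complement transform (hypothetical helper).
const COMPLEMENT: Record<string, string> = {
  A: "T", T: "A", C: "G", G: "C",
  a: "t", t: "a", c: "g", g: "c",
  N: "N", n: "n",
};

function reverseComplement(seq: string): string {
  let out = "";
  // Walk the sequence backwards, complementing each base.
  for (let i = seq.length - 1; i >= 0; i--) {
    out += COMPLEMENT[seq[i]] ?? seq[i]; // pass unknown symbols through
  }
  return out;
}
```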
Extract amplicons via primer sequences
Finds primer pairs within sequences and extracts the amplified regions. Supports mismatch tolerance, degenerate bases (IUPAC codes), windowed search for long-read performance, canonical matching for BED-extracted primers, and flexible region extraction. Provides complete seqkit amplicon parity with enhanced biological validation and type safety.
// Simple amplicon extraction (90% use case)
seqops(sequences)
.amplicon('ATCGATCG', 'CGATCGAT')
.writeFasta('amplicons.fasta');
// With mismatch tolerance (common case)
seqops(sequences)
.amplicon('ATCGATCG', 'CGATCGAT', 2)
.filter({ minLength: 50 });
// Single primer (auto-canonical matching)
seqops(sequences)
.amplicon('UNIVERSAL_PRIMER')
.stats();
// Real-world COVID-19 diagnostics
seqops(samples)
.quality({ minScore: 20 })
.amplicon(
primer`ACCAGGAACTAATCAGACAAG`, // N gene forward
primer`CAAAGACCAATCCTACCATGAG`, // N gene reverse
2 // Allow sequencing errors
)
.validate({ mode: 'strict' });
// Long reads with windowed search (massive performance boost)
seqops(nanoporeReads)
.amplicon('FORWARD', 'REVERSE', {
searchWindow: { forward: 200, reverse: 200 } // 100x+ speedup
});
// Advanced features (10% use case)
seqops(sequences)
.amplicon({
forwardPrimer: primer`ACCAGGAACTAATCAGACAAG`,
reversePrimer: primer`CAAAGACCAATCCTACCATGAG`,
maxMismatches: 3, // Long-read tolerance
canonical: true, // BED-extracted primers
flanking: true, // Include primer context
region: '-100:100', // Biological context
searchWindow: { forward: 200, reverse: 200 }, // Performance optimization
outputMismatches: true // Debug information
})
.rmdup('sequence')
.writeFasta('advanced_amplicons.fasta');
Clean and sanitize sequences
Fix common issues in sequence data such as gaps, ambiguous bases, and whitespace.
Clean options
New SeqOps instance for chaining
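The cleaning step amounts to simple string sanitation. A minimal sketch, assuming a `removeGaps` option (which does appear in the split examples later in this reference) and a hypothetical `removeWhitespace` option:

```typescript
// Illustrative sketch of sequence cleaning; option names are assumptions
// except removeGaps, which appears in this reference's pipeline examples.
interface CleanOpts {
  removeGaps?: boolean;       // strip '-' and '.' gap characters
  removeWhitespace?: boolean; // strip spaces, tabs, newlines (default true here)
}

function cleanSequence(seq: string, opts: CleanOpts = {}): string {
  let s = seq;
  if (opts.removeWhitespace ?? true) s = s.replace(/\s+/g, "");
  if (opts.removeGaps) s = s.replace(/[-.]/g, "");
  return s;
}
```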
FASTQ quality operations
Filter, trim, and bin sequences based on quality scores. All operations are optional and can be combined. Only affects FASTQ sequences; FASTA sequences pass through unchanged.
Quality filtering, trimming, and binning options
New SeqOps instance for chaining
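The pipeline examples later in this reference pass `{ minScore: 20 }` to this method. A sketch of what such a mean-quality filter computes, assuming Phred+33 encoding and that `minScore` is compared against the mean Phred score (an assumption; the library may use a different aggregate):

```typescript
// Phred+33: subtracting the ASCII offset 33 from each quality character
// yields the Phred score for that base.
function meanQuality(qual: string, offset = 33): number {
  let total = 0;
  for (const ch of qual) total += ch.charCodeAt(0) - offset;
  return total / qual.length;
}

// Hypothetical predicate matching a quality({ minScore }) style filter.
function passesMinScore(qual: string, minScore: number): boolean {
  return meanQuality(qual) >= minScore;
}
```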
Convert FASTQ quality score encodings
Convert quality scores between different encoding schemes (Phred+33, Phred+64, Solexa). Essential for legacy data processing and tool compatibility. Only affects FASTQ sequences; FASTA sequences pass through unchanged.
New SeqOps instance for chaining
// Primary workflow: Auto-detect source encoding (matches seqkit)
seqops(legacyData)
.convert({ targetEncoding: 'phred33' })
.writeFastq('modernized.fastq');
// Legacy Illumina 1.3-1.7 to modern standard
seqops(illumina15Data)
.convert({
sourceEncoding: 'phred64', // Skip detection for known encoding
targetEncoding: 'phred33' // Modern standard
})
// Real-world pipeline: QC → standardize encoding → analysis
const results = await seqops(mixedEncodingFiles)
.quality({ minScore: 20 }) // Filter first
.convert({ targetEncoding: 'phred33' }) // Standardize
.stats({ detailed: true });
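Under the hood, Phred+64 and Phred+33 differ only in their ASCII offsets (64 vs 33), so converting a quality string is a fixed character-code shift. A standalone sketch of that transform (not the library's implementation, which also handles detection and Solexa scores):

```typescript
// Shift each quality character from the Phred+64 range down to Phred+33.
// 64 - 33 = 31, so 'h' (Phred+64 score 40) becomes 'I' (Phred+33 score 40).
function phred64to33(qual: string): string {
  let out = "";
  for (const ch of qual) {
    out += String.fromCharCode(ch.charCodeAt(0) - 31);
  }
  return out;
}
```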
Convert FASTA sequences to FASTQ format
Converts FASTA sequences to FASTQ by adding uniform quality scores. This method is only available when working with FASTA sequences and will cause a compile-time error if called on FASTQ sequences.
New SeqOps instance with FASTQ sequences
// Convert with default quality (Phred+33 score 40)
await seqops(fastaSeqs)
.toFastqSequence()
.writeFastq('output.fastq');
// Convert with custom quality character
await seqops(fastaSeqs)
.toFastqSequence({ quality: 'I' }) // Valid
.writeFastq('output.fastq');
// These will cause compile-time errors:
// seqops(fastaSeqs).toFastqSequence({ quality: '€' }); // Invalid character
// seqops(fastqSeqs).toFastqSequence(); // Cannot convert FASTQ to FASTQ
Convert FASTQ sequences to FASTA format
Converts FASTQ sequences to FASTA by removing quality scores. This method is only available when working with FASTQ sequences and will cause a compile-time error if called on FASTA sequences.
New SeqOps instance with FASTA sequences
// Convert FASTQ to FASTA for BLAST database
await seqops(fastqSeqs)
.toFastaSequence()
.writeFasta('blast_db.fasta');
// Preserve quality metrics for QC tracking
await seqops(fastqSeqs)
.toFastaSequence({ includeQualityStats: true })
.writeFasta('assembly_input.fasta');
// This will cause a compile-time error:
// seqops(fastaSeqs).toFastaSequence(); // Cannot convert FASTA to FASTA
Validate sequences
Check sequences for validity and optionally fix or reject invalid ones.
Validation options
New SeqOps instance for chaining
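For nucleotide data, validity typically means every character is one of the 15 IUPAC DNA codes (including ambiguity codes and N). A minimal sketch of such a check; the library's validation modes and fix strategies are richer than this:

```typescript
// The full IUPAC nucleotide alphabet: ACGT plus ambiguity codes.
const IUPAC_DNA = /^[ACGTRYSWKMBDHVN]+$/i;

function isValidDna(seq: string): boolean {
  return seq.length > 0 && IUPAC_DNA.test(seq);
}
```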
Search sequences by pattern
Pattern matching and filtering similar to Unix grep. Supports both simple string patterns and complex options for advanced use cases.
// Simple sequence search (most common case)
seqops(sequences)
.grep('ATCG') // Search sequences for 'ATCG'
.grep(/^chr\d+/, 'id') // Search IDs with regex
// Advanced options for complex scenarios
seqops(sequences)
.grep({
pattern: 'ATCGATCG',
target: 'sequence',
allowMismatches: 2,
searchBothStrands: true
})
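The `allowMismatches` option shown above presumably corresponds to Hamming-distance matching over a sliding window. A naive sketch of that idea (the library likely uses something more optimized):

```typescript
// Slide the pattern across the sequence; report a hit if any alignment
// has at most maxMismatches substitutions (Hamming distance).
function matchesWithMismatches(
  seq: string,
  pattern: string,
  maxMismatches: number
): boolean {
  if (pattern.length > seq.length) return false;
  for (let i = 0; i + pattern.length <= seq.length; i++) {
    let mm = 0;
    for (let j = 0; j < pattern.length && mm <= maxMismatches; j++) {
      if (seq[i + j] !== pattern[j]) mm++;
    }
    if (mm <= maxMismatches) return true;
  }
  return false;
}
```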
Static concat: Concatenate multiple sequence files into a single pipeline
Static factory function that creates a SeqOps pipeline from multiple files. Elegant API for combining sequence sources with simple duplicate handling.
Array of file paths to concatenate
How to handle duplicate IDs: 'suffix' | 'ignore' (default: 'ignore')
New SeqOps instance for chaining
Concatenate sequences from multiple sources
Combines sequences from multiple file paths and/or AsyncIterables with sophisticated ID conflict resolution. Maintains streaming behavior for memory efficiency with large datasets.
Array of file paths and/or AsyncIterables to concatenate
Optional options: Omit<ConcatOptions, "sources"> - Concatenation options
New SeqOps instance for chaining
// Simple concatenation from files
seqops(sequences)
.concat(['file1.fasta', 'file2.fasta'])
.concat([anotherAsyncIterable])
// Advanced options for complex scenarios
seqops(sequences)
.concat(['file1.fasta', 'file2.fasta'], {
idConflictResolution: 'suffix',
validateFormats: true,
sourceLabels: ['batch1', 'batch2'],
onProgress: (processed, total, source) =>
console.log(`Processed ${processed} from ${source}`)
})
Extract subsequences
Mirrors seqkit subseq functionality for region extraction.
Extraction options
New SeqOps instance for chaining
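seqkit subseq uses 1-based inclusive coordinates, with negative values counted from the 3' end (-1 is the last base). A sketch of that coordinate convention, assuming SeqOps mirrors it:

```typescript
// 1-based inclusive extraction; negative coordinates count from the end,
// seqkit-style (start=-3, end=-1 means the last three bases).
function subseq(seq: string, start: number, end: number): string {
  const len = seq.length;
  const s = start < 0 ? len + start + 1 : start;
  const e = end < 0 ? len + end + 1 : end;
  return seq.slice(s - 1, e);
}
```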
Generate sliding windows (k-mers) from sequences
Extracts overlapping or non-overlapping windows from sequences with compile-time k-mer size tracking. Essential for k-mer analysis, motif discovery, and sequence decomposition.
Window size (k-mer size)
New SeqOps instance with KmerSequence
// Simple usage - just specify size
const kmers = await seqops(sequences).windows(21).toArray();
// With options - step, circular, greedy modes
seqops(sequences).windows(21, { step: 3, circular: true })
// Non-overlapping tiles
seqops(sequences).windows(100, { step: 100 })
// Greedy mode - include short final window
seqops(sequences).windows(50, { greedy: true })
Generate sliding windows (k-mers) from sequences with options
Window size (k-mer size)
Additional window options (step, circular, greedy, etc.)
New SeqOps instance with KmerSequence
Generate sliding windows (k-mers) from sequences (legacy object form)
Window generation options with k-mer size
New SeqOps instance with KmerSequence
Alias for .windows() - emphasizes sliding window concept
Window size
SeqOps yielding KmerSequence objects
Alias for .windows() - emphasizes sliding window concept
Window size
SeqOps yielding KmerSequence objects
Alias for .windows() - emphasizes sliding window concept
SeqOps yielding KmerSequence objects
Alias for .windows() - emphasizes k-mer generation
K-mer size
SeqOps yielding KmerSequence objects
Alias for .windows() - emphasizes k-mer generation
K-mer size
SeqOps yielding KmerSequence objects
Alias for .windows() - emphasizes k-mer generation
SeqOps yielding KmerSequence objects
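The window semantics described above (fixed size, configurable step, optional greedy trailing window) can be sketched as a standalone generator. This is illustrative only and ignores the circular mode and KmerSequence typing:

```typescript
// Yield fixed-size windows at the given step; in greedy mode also emit a
// short final window covering the remaining tail of the sequence.
function* slidingWindows(
  seq: string,
  size: number,
  step = 1,
  greedy = false
): Generator<string> {
  let i = 0;
  for (; i + size <= seq.length; i += step) {
    yield seq.slice(i, i + size);
  }
  if (greedy && i < seq.length) yield seq.slice(i);
}
```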
Take first N sequences (alias for head)
Returns the first N sequences from the stream. This is an alias for
head() provided for developers familiar with this naming convention.
Mirrors seqkit head functionality.
Number of sequences to take
New SeqOps instance for chaining
Sample sequences from the stream
Supports two modes: exact count sampling with strategy selection, or fraction-based streaming sampling for large datasets.
Number of sequences to sample
Sample sequences from the stream
Supports two modes: exact count sampling with strategy selection, or fraction-based streaming sampling for large datasets.
Number of sequences to sample
Sampling strategy ('reservoir', 'systematic', or 'random')
Sample sequences from the stream
Supports two modes: exact count sampling with strategy selection, or fraction-based streaming sampling for large datasets.
Detailed sampling options
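The 'reservoir' strategy named above refers to reservoir sampling (Algorithm R), which draws an exact-size uniform sample from a stream of unknown length in one pass. A generic sketch, independent of the library:

```typescript
// Algorithm R: keep the first k items, then replace a random slot with
// probability k / n for the n-th item seen, yielding a uniform sample.
function reservoirSample<T>(items: Iterable<T>, k: number): T[] {
  const reservoir: T[] = [];
  let seen = 0;
  for (const item of items) {
    seen++;
    if (reservoir.length < k) {
      reservoir.push(item);
    } else {
      const j = Math.floor(Math.random() * seen);
      if (j < k) reservoir[j] = item;
    }
  }
  return reservoir;
}
```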
Sort sequences by specified criteria
High-performance sorting optimized for genomic data compression. Automatically switches between in-memory and external sorting based on dataset size. Proper sequence ordering dramatically improves compression ratios for genomic datasets.
Sort criteria and options
New SeqOps instance for chaining
// Sort by length for compression optimization
seqops(sequences)
.sort({ by: 'length', order: 'desc' })
// Sort by GC content for clustering similar sequences
seqops(sequences)
.sort({ by: 'gc', order: 'asc' })
// Custom sorting for specialized genomic criteria
seqops(sequences)
.sort({
custom: (a, b) => a.sequence.localeCompare(b.sequence)
})
Remove duplicate sequences with configurable deduplication strategies
Streaming deduplication with multiple key extraction methods and conflict resolution strategies. Memory-efficient for large datasets when using the default "first" strategy.
Optional options: UniqueOptions - Deduplication options
New SeqOps with deduplicated sequences
// Remove duplicate sequences (most common)
seqops(sequences).unique();
// Remove sequences with duplicate IDs
seqops(sequences).unique({ by: "id" });
// Case-insensitive sequence deduplication
seqops(sequences).unique({ by: "sequence", caseSensitive: false });
Replace sequence names/content by regular expression
Performs pattern-based substitution on sequence IDs (default) or sequence content (FASTA only). Supports capture variables, special placeholders ({nr}, {kv}, {fn}), and grep-style filtering.
Replace options with pattern and replacement string
New SeqOps instance for chaining
// Remove descriptions from sequence IDs
seqops(sequences).replace({ pattern: '\\s.+', replacement: '' })
// Add prefix to all sequence IDs
seqops(sequences).replace({ pattern: '^', replacement: 'PREFIX_' })
// Use capture variables to restructure IDs
seqops(sequences).replace({
pattern: '^(\\w+)_(\\w+)',
replacement: '$2_$1'
})
// Key-value lookup from file
seqops(sequences).replace({
pattern: '^(\\w+)',
replacement: '$1_{kv}',
kvFile: 'aliases.txt'
})
Translate DNA/RNA sequences to proteins
High-performance protein translation supporting all 31 NCBI genetic codes with progressive disclosure for optimal developer experience.
Optional geneticCode: number | TranslateOptions - Genetic code number (1-33) or full options object
New SeqOps instance for chaining
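Translation itself is a codon-table lookup. The sketch below hard-codes NCBI transl_table=1 (the standard code) using the conventional TCAG index ordering; it is a simplified standalone illustration, not the library's implementation, and emits 'X' for codons containing non-ACGT characters:

```typescript
const BASES = "TCAG";
// NCBI standard genetic code (transl_table=1), codons ordered TTT, TTC, TTA, ...
const AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

function translate(dna: string): string {
  let protein = "";
  for (let i = 0; i + 3 <= dna.length; i += 3) {
    // Accept RNA input by mapping U -> T before lookup.
    const codon = dna.slice(i, i + 3).toUpperCase().replace(/U/g, "T");
    const a = BASES.indexOf(codon[0]);
    const b = BASES.indexOf(codon[1]);
    const c = BASES.indexOf(codon[2]);
    protein += a < 0 || b < 0 || c < 0 ? "X" : AA[a * 16 + b * 4 + c];
  }
  return protein;
}
```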
Split sequences into multiple files
Terminal operation that writes pipeline sequences to separate files with comprehensive seqkit split/split2 compatibility. Integrates seamlessly with all SeqOps pipeline operations for sophisticated genomic workflows.
Split configuration options
Promise resolving to split results summary
// Basic usage - split after processing
const result = await seqops(sequences)
.filter({ minLength: 100 })
.clean({ removeGaps: true })
.split({ mode: 'by-size', sequencesPerFile: 1000 });
// Real-world genomics: Quality control → split for parallel processing
const qcResults = await seqops(rawReads)
.quality({ minScore: 20, trim: true }) // Quality filter
.filter({ minLength: 50, maxLength: 150 }) // Length filter
.clean({ removeAmbiguous: true }) // Clean sequences
.split({ mode: 'by-length', basesPerFile: 1000000 }); // 1MB chunks
// Genome assembly: Split chromosomes for parallel analysis
const chrResults = await seqops(genome)
.grep({ pattern: /^chr[1-9]/, target: 'id' }) // Autosomal only
.transform({ upperCase: true }) // Normalize case
.split({ mode: 'by-id', idRegex: 'chr(\\d+)' }); // Group by chromosome
// Amplicon sequencing: Process primers → split by target
const amplicons = await seqops(sequences)
.grep({ pattern: forwardPrimer, target: 'sequence' }) // Has forward primer
.grep({ pattern: reversePrimer, target: 'sequence' }) // Has reverse primer
.subseq({ region: '20:-20' }) // Trim primers
.split({ mode: 'by-parts', numParts: 8 }); // Parallel processing
console.log(`Created ${result.filesCreated.length} files`);
Split sequences with streaming results for advanced processing
Returns AsyncIterable of split results following the locate() pattern. Enables sophisticated post-processing workflows where each split result needs individual handling during the splitting process.
Split configuration options
AsyncIterable of split results for processing
// Basic streaming - process each split file as it's created
for await (const result of seqops(sequences).splitToStream(options)) {
await compressFile(result.outputFile);
console.log(`Split ${result.sequenceCount} sequences to ${result.outputFile}`);
}
// Large genome processing: Split → compress → upload pipeline
for await (const chunk of seqops(largeGenome).splitToStream({
mode: 'by-length',
basesPerFile: 50_000_000 // 50MB chunks
})) {
// Process each chunk immediately to manage memory
await compressWithBgzip(chunk.outputFile);
await uploadToCloud(chunk.outputFile + '.gz');
await deleteLocalFile(chunk.outputFile); // Clean up
console.log(`Processed chunk ${chunk.partId}: ${chunk.sequenceCount} sequences`);
}
// Quality control: Split → validate → report pipeline
const qualityReports = [];
for await (const batch of seqops(sequencingRun).splitToStream({
mode: 'by-size',
sequencesPerFile: 10000
})) {
const qc = await runQualityControl(batch.outputFile);
qualityReports.push({
file: batch.outputFile,
sequences: batch.sequenceCount,
qcScore: qc.overallScore
});
}
Split by sequence count (convenience method)
Most common splitting mode - divide sequences into files with N sequences each. Ideal for creating manageable chunks for parallel processing.
Number of sequences per output file
Output directory (default: './split')
Promise resolving to split results
// Simple case - just split
await seqops(sequences).splitBySize(1000);
// Common workflow: Filter → process → split for downstream analysis
await seqops(rawSequences)
.filter({ minLength: 100 })
.clean({ removeGaps: true })
.splitBySize(5000, './chunks');
// RNA-seq: Quality filter → deduplicate → split for differential expression
await seqops(rnaseqReads)
.quality({ minScore: 20 })
.rmdup({ by: 'sequence' })
.splitBySize(100000, './de-analysis');
Split into equal parts (convenience method)
Number of output files to create
Output directory (default: './split')
Promise resolving to split results
Split by base count (convenience method)
Implements seqkit split2's key functionality for splitting by total sequence bases rather than sequence count. Essential for genome processing where you need consistent data sizes regardless of sequence count.
Number of bases per output file
Output directory (default: './split')
Promise resolving to split results
// Genome assembly: Split into 10MB chunks for parallel processing
await seqops(scaffolds).splitByLength(10_000_000);
// Metagenomics: Process → bin → split by data size
await seqops(contigs)
.filter({ minLength: 1000 })
.sort({ by: 'length', order: 'desc' }) // Longest first
.splitByLength(5_000_000, './metagenome-bins');
// Long-read sequencing: Quality control → split for analysis
await seqops(nanoporeReads)
.quality({ minScore: 7 }) // Nanopore quality threshold
.filter({ minLength: 5000, maxLength: 100000 })
.splitByLength(50_000_000, './nanopore-chunks');
Split by sequence ID pattern (convenience method)
Groups sequences by ID patterns for organized analysis. String patterns are automatically converted to RegExp for better developer experience.
String pattern or RegExp to group sequences by ID
Output directory (default: './split')
Promise resolving to split results
// Genome assembly: Split by chromosome
await seqops(scaffolds).splitById('chr(\\d+)'); // chr1, chr2, chr3...
// Multi-species analysis: Group by organism
await seqops(sequences)
.splitById('(\\w+)_gene'); // Groups: human_gene, mouse_gene, etc.
// Transcriptome: Split by gene families
await seqops(transcripts)
.filter({ minLength: 200 })
.transform({ upperCase: true })
.splitById('(HOX\\w+)_transcript', './gene-families');
// Advanced: Use RegExp for complex patterns
await seqops(sequences)
.splitById(/^(chr[XY]|chrM)_/, './sex-chromosomes');
Split by genomic region with compile-time validation (convenience method)
Uses advanced TypeScript template literal types to parse and validate genomic regions at compile time, preventing coordinate errors.
Promise resolving to split results
// ✅ Type-safe region parsing - validated at compile time
await seqops(sequences).splitByRegion('chr1:1000-2000');
await seqops(sequences).splitByRegion('scaffold_1:500-1500');
await seqops(sequences).splitByRegion('chrX:0-1000'); // 0-based OK
// ❌ These cause TypeScript compilation errors:
// await seqops(sequences).splitByRegion('chr1:2000-1000'); // end < start
// await seqops(sequences).splitByRegion('chr1:1000-1000'); // end = start
// await seqops(sequences).splitByRegion('invalid-format'); // bad format
// 🔥 Compile-time coordinate extraction available:
type Coords = ExtractCoordinates<'chr1:1000-2000'>;
// → { chr: 'chr1'; start: 1000; end: 2000; length: 1000 }
Calculate sequence statistics
Terminal operation that processes all sequences to compute statistics.
Mirrors seqkit stats functionality.
Statistics options
Promise resolving to statistics
Write sequences to FASTA file
Terminal operation that writes all sequences in FASTA format.
Output file path
Writer options
Promise resolving when write is complete
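FASTA output is a header line followed by the sequence wrapped at a fixed column width. A sketch of the per-record formatting; the `lineWidth` default of 60 is an assumption, not a documented option of this writer:

```typescript
// Format one FASTA record: '>id' header plus sequence lines wrapped
// at lineWidth columns (60 is a common convention, assumed here).
function formatFasta(id: string, sequence: string, lineWidth = 60): string {
  const lines: string[] = [`>${id}`];
  for (let i = 0; i < sequence.length; i += lineWidth) {
    lines.push(sequence.slice(i, i + lineWidth));
  }
  return lines.join("\n") + "\n";
}
```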
Write sequences to FASTQ file
Terminal operation that writes all sequences in FASTQ format. If input sequences don't have quality scores, uses default quality.
Output file path
Default quality string for FASTA sequences
Promise resolving when write is complete
Write sequences to JSON file
Convenience method that converts sequences to tabular format and writes as JSON. Supports both simple array format and wrapped format with metadata. Loads entire dataset into memory before writing.
Output file path
Optional options: Fx2TabOptions<readonly string[]> & JSONWriteOptions - Combined column selection and JSON formatting options
Promise resolving when write is complete
// Simple JSON array
await SeqOps.fromFasta('input.fa')
.writeJSON('output.json');
// With selected columns
await SeqOps.fromFasta('input.fa')
.writeJSON('output.json', {
columns: ['id', 'sequence', 'length', 'gc']
});
// Pretty-printed with metadata
await SeqOps.fromFasta('input.fa')
.writeJSON('output.json', {
columns: ['id', 'sequence', 'length'],
pretty: true,
includeMetadata: true
});
Write sequences to JSONL (JSON Lines) file
Convenience method that converts sequences to tabular format and writes as JSONL (one JSON object per line). Provides streaming with O(1) memory usage, ideal for large datasets.
Note: JSONL format does not support metadata or pretty-printing. Each line is a separate, compact JSON object.
Output file path
Optional options: Fx2TabOptions<readonly string[]> (column selection options; JSON formatting options not applicable)
Promise resolving when write is complete
// Basic JSONL output
await SeqOps.fromFasta('input.fa')
.writeJSONL('output.jsonl');
// With selected columns
await SeqOps.fromFasta('input.fa')
.writeJSONL('output.jsonl', {
columns: ['id', 'sequence', 'length', 'gc']
});
// Large dataset streaming
await SeqOps.fromFasta('huge-dataset.fa')
.filter({ minLength: 100 })
.writeJSONL('filtered.jsonl'); // O(1) memory
Convert sequences to tabular format
Transform sequences into a tabular representation with configurable columns. This is the primary method for tabular conversion, providing a more intuitive name than the seqkit-inspired fx2tab.
Optional options: Fx2TabOptions<Columns> (column selection and formatting options)
TabularOps instance for further processing or writing
// Basic conversion to tabular format
await seqops(sequences)
.toTabular({ columns: ['id', 'seq', 'length', 'gc'] })
.writeTSV('output.tsv');
// With custom columns
await seqops(sequences)
.toTabular({
columns: ['id', 'seq', 'gc'],
customColumns: {
high_gc: (seq) => seq.gc > 60 ? 'HIGH' : 'NORMAL'
}
})
.writeCSV('analysis.csv');
Convert sequences to tabular format (SeqKit compatibility)
Alias for .toTabular() maintained for SeqKit parity and backward compatibility.
New code should prefer .toTabular() for better clarity.
Optional options: Fx2TabOptions<Columns> (column selection and formatting options)
TabularOps instance for further processing or writing
toTabular - Primary method for tabular conversion
Convert sequences to row-based format
Clearer alias for .toTabular() that emphasizes the row-based structure
used for output to various formats (TSV, CSV, JSON, JSONL).
This method converts sequences into a structured row format that can be written to tabular formats (TSV/CSV) or object formats (JSON/JSONL). Use this when the term "tabular" feels semantically incorrect for your output format (e.g., JSON).
Optional options: Fx2TabOptions<Columns> (column selection and formatting options)
TabularOps instance for further processing or writing
toTabular - Original method name
// Writing to JSON - "rows" is clearer than "tabular"
await seqops(sequences)
.asRows({ columns: ['id', 'sequence', 'length'] })
.writeJSON('output.json');
// Writing to JSONL
await seqops(sequences)
.asRows({ columns: ['id', 'seq', 'gc'] })
.writeJSONL('output.jsonl');
// Also works for tabular formats
await seqops(sequences)
.asRows({ columns: ['id', 'seq', 'length'] })
.writeTSV('output.tsv');
Write sequences as TSV (tab-separated values)
Terminal operation that writes sequences as tab-separated values.
Output file path
Conversion options (delimiter will be set to tab)
Write sequences as CSV (comma-separated values)
Terminal operation that writes sequences as comma-separated values. Excel protection is recommended for CSV files.
Output file path
Conversion options (delimiter will be set to comma)
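The quote escaping needed for Excel-compatible CSV follows the usual RFC 4180 convention: fields containing the delimiter, a quote, or a newline are wrapped in double quotes, and embedded quotes are doubled. A minimal sketch (these helpers are illustrative, not the library's API):

```typescript
// RFC 4180-style CSV field escaping (illustrative sketch).
function escapeCsvField(field: string, delimiter = ','): string {
  if (
    field.includes(delimiter) ||
    field.includes('"') ||
    field.includes('\n')
  ) {
    return `"${field.replace(/"/g, '""')}"`;
  }
  return field;
}

function toCsvRow(fields: string[], delimiter = ','): string {
  return fields.map((f) => escapeCsvField(f, delimiter)).join(delimiter);
}
```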
Write sequences as DSV with custom delimiter
Terminal operation for any delimiter-separated format.
Output file path
Custom delimiter character(s)
Conversion options
Collect all sequences into an array
Terminal operation that materializes all sequences in memory. Use with caution on large datasets.
Promise resolving to array of sequences
Collect k-mer sequences into a KmerSet with K preservation
Terminal operation: when the stream contains KmerSequence objects, collects them into a KmerSet that preserves the k-mer size K.
Promise resolving to a KmerSet
Transform sequences with a mapping function
Transforms each sequence in the stream using the provided function. Type parameter U is inferred from the return type of the mapping function, allowing type transformations while preserving specific sequence types when the mapping function returns the same type.
After calling .enumerate(), the index parameter becomes available in
the mapping function signature.
Output sequence type (defaults to T for type preservation)
New SeqOps with transformed sequences
// Transform without index
seqops<FastqSequence>(reads)
.map((seq) => ({ ...seq, id: `sample1_${seq.id}` }));
// Type preserved: SeqOps<FastqSequence>
// Transform with index (after enumerate)
seqops(sequences)
.enumerate()
.map((seq, idx) => ({
...seq,
description: `position=${idx} ${seq.description || ""}`,
}));
// Async transformation
seqops(sequences)
.map(async (seq) => {
const annotation = await fetchAnnotation(seq.id);
return { ...seq, description: annotation };
});
Attach index to each sequence
Adds a zero-based index property to each sequence in the stream.
After calling this method, downstream operations like .map() and .filter()
can access the index parameter in their callback functions.
The index represents the position of the sequence in the stream (0-based).
New SeqOps with sequences that have an index property
// Enable index parameter in downstream operations
const results = await seqops<FastqSequence>(reads)
.enumerate()
.filter((seq, idx) => idx < 10000) // Index available
.map((seq, idx) => ({
...seq,
description: `${seq.description} pos=${idx}`,
}))
.collect();
// Type: Array<FastqSequence & { index: number }> ✅
results[0].quality; // ✅ Exists (FastqSequence preserved)
results[0].index; // ✅ Exists (from enumerate)
Apply side effects without consuming the stream
Executes a function for each sequence but yields the original sequence unchanged. Useful for logging, progress tracking, or other side effects that shouldn't modify the sequence data.
After calling .enumerate(), the index parameter becomes available.
Side effect function (with index after enumerate)
Same SeqOps for continued chaining
// Progress logging without index
let count = 0;
seqops(sequences)
.tap((seq) => {
count++;
if (count % 1000 === 0) console.log(`Processed ${count}`);
})
.filter({ minLength: 100 })
.writeFasta('output.fasta');
// Progress tracking with index
seqops(sequences)
.enumerate()
.tap((seq, idx) => {
if (idx % 1000 === 0) console.log(`Processed ${idx}`);
})
.filter({ minLength: 100 });
// Collect statistics without modifying stream
const stats = { totalLength: 0, count: 0 };
seqops(sequences)
.tap((seq) => {
stats.totalLength += seq.length;
stats.count++;
})
.filter({ minLength: 100 })
.writeFasta('filtered.fasta');
console.log(`Average length: ${stats.totalLength / stats.count}`);
Map each sequence to multiple sequences and flatten the result
Transforms each sequence into zero or more sequences, then flattens all results into a single stream. The mapping function can return an array or an async iterable.
After calling .enumerate(), the index parameter becomes available.
Output sequence type (defaults to T for type preservation)
New SeqOps with flattened results
// Expand each sequence to multiple variants
seqops(sequences)
.flatMap((seq) => [
{ ...seq, id: `${seq.id}_variant1`, sequence: variant1(seq.sequence) },
{ ...seq, id: `${seq.id}_variant2`, sequence: variant2(seq.sequence) },
])
.writeFasta('variants.fasta');
// Generate k-mers from each sequence
seqops(sequences)
.flatMap((seq) => generateKmers(seq, 21))
.unique({ by: 'sequence' })
.writeFasta('unique_kmers.fasta');
Process each sequence with a callback (terminal operation)
Applies a function to each sequence in the stream. This is a terminal operation that consumes the stream and returns when all sequences have been processed.
After calling .enumerate(), the index parameter becomes available in the callback.
Callback function to execute for each sequence
Promise that resolves when all sequences have been processed
// Type-safe with FastqSequence
await seqops<FastqSequence>(reads)
.forEach((seq) => {
console.log(seq.quality); // ✅ TypeScript knows quality exists
});
Reduce sequences to a single value using first element as accumulator
Terminal operation that reduces the stream to a single value by applying a function that combines the accumulator with each sequence. The first sequence in the stream becomes the initial accumulator value.
Returns undefined if the stream is empty.
After calling .enumerate(), the index parameter becomes available.
Promise resolving to the final accumulated value, or undefined if empty
// Find longest sequence
const longest = await seqops<FastqSequence>(reads)
.reduce((acc, seq) => seq.length > acc.length ? seq : acc);
// Type: FastqSequence | undefined ✅
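The first-element-as-accumulator semantics can be sketched over a plain iterable (synchronous for brevity; the library operates on async streams, and this helper is illustrative):

```typescript
// Sketch of reduce() semantics: the first element seeds the accumulator,
// and an empty input yields undefined.
function reduceSeqs<T>(
  items: Iterable<T>,
  fn: (acc: T, item: T) => T,
): T | undefined {
  let acc: T | undefined;
  let first = true;
  for (const item of items) {
    if (first) {
      acc = item;
      first = false;
    } else {
      acc = fn(acc as T, item);
    }
  }
  return acc;
}
```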
Fold sequences to a single value with explicit initial value
Terminal operation that reduces the stream to a single value by applying a function that combines the accumulator with each sequence. Unlike reduce(), fold() requires an explicit initial value and can transform to any type.
Never returns undefined - always returns at least the initial value.
After calling .enumerate(), the index parameter becomes available.
Promise resolving to the final accumulated value
// Calculate total length
const totalLength = await seqops(sequences)
.fold((sum, seq) => sum + seq.length, 0);
// Type: number ✅
// Build index mapping
const index = await seqops<FastqSequence>(reads)
.fold(
(map, seq) => map.set(seq.id, seq),
new Map<string, FastqSequence>(),
);
// Type: Map<string, FastqSequence> ✅
// Collect statistics with position tracking
const stats = await seqops(sequences)
.enumerate()
.fold(
(acc, seq, idx) => {
const gc = calculateGC(seq.sequence);
return {
min: Math.min(acc.min, gc),
max: Math.max(acc.max, gc),
sum: acc.sum + gc,
count: acc.count + 1,
positions: [...acc.positions, { idx, gc }],
};
},
{ min: Infinity, max: -Infinity, sum: 0, count: 0, positions: [] },
);
Combine two streams element-by-element with a combining function
Zips two streams together, applying a function to each pair of elements. Index parameters appear in the signature only when the corresponding stream has been enumerated. Stops when either stream ends (shortest-wins behavior).
New SeqOps with combined elements
// Neither enumerated
const forward = seqops<FastqSequence>("reads_R1.fastq");
const reverse = seqops<FastqSequence>("reads_R2.fastq");
forward.zipWith(reverse, (fwd, rev) => ({
id: `${fwd.id}_merged`,
sequence: fwd.sequence + "NNNN" + reverseComplement(rev.sequence),
}));
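The shortest-wins behavior can be sketched with synchronous generators (the library operates on async streams; this helper is illustrative): both iterators are advanced in lockstep, and the zip stops as soon as either is exhausted.

```typescript
// Sketch of shortest-wins zipping over two iterables.
function* zipWith<A, B, C>(
  left: Iterable<A>,
  right: Iterable<B>,
  fn: (a: A, b: B) => C,
): Generator<C> {
  const li = left[Symbol.iterator]();
  const ri = right[Symbol.iterator]();
  while (true) {
    const l = li.next();
    const r = ri.next();
    if (l.done || r.done) return; // stop at the shorter stream
    yield fn(l.value, r.value);
  }
}
```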
Interleave with another stream in alternating order
Combines two streams by alternating elements: left, right, left, right, etc. Both streams must contain sequences of the same type for type safety. Commonly used for Illumina paired-end reads.
Stops when either stream ends (shortest-wins behavior).
Interleaved SeqOps stream
// Basic interleaving
const forward = seqops<FastqSequence>('reads_R1.fastq');
const reverse = seqops<FastqSequence>('reads_R2.fastq');
forward
.interleave(reverse)
.writeFastq('interleaved.fastq');
// Output: F1, R1, F2, R2, F3, R3, ...
// With ID validation for paired-end reads
forward
.interleave(reverse, { validateIds: true })
.writeFastq('interleaved.fastq');
// Throws error if IDs don't match
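The alternating order with shortest-wins termination can be sketched with synchronous generators (illustrative only; the library streams asynchronously): a pair is pulled from both sides before either element is yielded, so a trailing unmatched read is dropped when the other stream ends.

```typescript
// Sketch of alternating interleave: left, right, left, right, ...
function* interleave<T>(left: Iterable<T>, right: Iterable<T>): Generator<T> {
  const li = left[Symbol.iterator]();
  const ri = right[Symbol.iterator]();
  while (true) {
    const l = li.next();
    const r = ri.next();
    if (l.done || r.done) return; // stop when either stream ends
    yield l.value;
    yield r.value;
  }
}
```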
Repair paired-end read ordering through buffered ID matching
Matches paired-end reads (R1 and R2) from shuffled or out-of-order streams, then outputs them in correctly interleaved order. Supports two modes: dual-stream mode (separate R1 and R2 inputs) and single-stream mode (repairing pairing within one mixed stream).
Uses hash-based buffering to handle out-of-order data, making it suitable for sequences that have been sorted, filtered, or otherwise reordered after initial sequencing.
Output Order: Always yields R1, R2, R1, R2, R1, R2... (interleaved)
Memory Management: Unpaired reads are buffered until their mates arrive; maxBufferSize caps the buffer to prevent unbounded memory growth.
Second stream for dual-stream mode (R2 reads)
Optional options: PairOptions (pairing options: ID extraction, buffer limits, unpaired handling)
Paired SeqOps stream in interleaved order
// Dual-stream mode: Match reads from separate R1 and R2 files
const r1 = seqops<FastqSequence>('sample_R1.fastq.gz');
const r2 = seqops<FastqSequence>('sample_R2.fastq.gz');
r1.pair(r2).writeFastq('paired.fastq');
// Output: R1_001, R2_001, R1_002, R2_002, ...
// Single-stream mode: Repair pairing within mixed stream
seqops<FastqSequence>('shuffled.fastq')
.pair()
.writeFastq('repaired.fastq');
// Reads with /1 suffix → R1, /2 suffix → R2
// Custom ID extraction for non-standard naming
r1.pair(r2, {
extractPairId: (id) => id.split('_')[0] // Custom base ID
}).writeFastq('paired.fastq');
// Strict mode: error on unpaired reads
r1.pair(r2, {
onUnpaired: 'error', // Throw on unpaired (default: 'warn')
maxBufferSize: 50000 // Smaller buffer limit
}).writeFastq('paired.fastq');
// Skip unpaired reads silently
seqops<FastqSequence>('mixed.fastq')
.pair({ onUnpaired: 'skip' })
.writeFastq('paired_only.fastq');
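The hash-based buffering idea can be sketched as follows: reads are keyed by a base ID (here, the ID with a `/1` or `/2` suffix stripped, one of the conventions the examples above use), buffered until their mate arrives, and then emitted in R1, R2 order. Everything below is illustrative, not the library's implementation:

```typescript
// Sketch of buffered ID matching for paired-end repair.
interface Read {
  id: string; // e.g. "read7/1" or "read7/2"
  sequence: string;
}

function pairReads(reads: Read[]): Read[] {
  const pending = new Map<string, Read>();
  const out: Read[] = [];
  for (const read of reads) {
    const base = read.id.replace(/\/[12]$/, '');
    const mate = pending.get(base);
    if (mate === undefined) {
      pending.set(base, read); // buffer until the mate arrives
    } else {
      pending.delete(base);
      // Emit in R1, R2 order regardless of arrival order.
      const [r1, r2] = read.id.endsWith('/1') ? [read, mate] : [mate, read];
      out.push(r1, r2);
    }
  }
  // `pending` now holds unpaired reads (handled per onUnpaired in the library).
  return out;
}
```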
Find pattern locations in sequences
Terminal operation that finds all occurrences of patterns within sequences
with support for fuzzy matching, strand searching, and various output formats.
Mirrors seqkit locate functionality.
// Simple cases (most common)
const exact = seqops(sequences).locate('ATCG'); // Exact string match
const regex = seqops(sequences).locate(/ATG...TAA/); // Regex pattern
const fuzzy = seqops(sequences).locate('ATCG', 2); // Allow 2 mismatches
// Advanced options for complex scenarios
const locations = seqops(sequences).locate({
pattern: 'ATCG',
allowMismatches: 1,
searchBothStrands: true,
outputFormat: 'bed'
});
for await (const location of locations) {
console.log(`Found at ${location.start}-${location.end} on ${location.strand}`);
}
Enable direct iteration over the pipeline
Async iterator for sequences
Main SeqOps class providing fluent interface for sequence operations
Enables Unix pipeline-style method chaining for processing genomic sequences. All operations are lazy-evaluated and maintain streaming behavior for memory efficiency with large datasets.
Example