The number of characters (or bytes) in the content.
For ASCII content this is the same regardless of internal representation.
Static fromString
Creates a GenotypeString backed by a JavaScript string.
If the argument is already a GenotypeString, it is returned as-is with no copying or wrapping. This makes the method idempotent and safe to call in contexts where the input may be either a plain string or an existing GenotypeString.
When a plain string is passed, it is stored as-is with no copying or validation. Conversion to bytes happens lazily if and when a byte-native operation is called.
Static fromBytes
Creates a GenotypeString backed by a copy of the provided byte array.
A defensive copy is made so that subsequent mutations to the original array do not affect the GenotypeString instance.
Static concat
Creates a new GenotypeString by concatenating multiple parts.
When all parts are bytes-backed GenotypeStrings, the concatenation is performed entirely in byte-land — one Uint8Array is allocated at the total length and each part is copied in. No string conversion occurs.
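The single-allocation strategy described above can be sketched on raw byte arrays. This is only an illustration: `concatBytes` is a hypothetical helper, not part of the GenotypeString API, and the real implementation may differ.

```typescript
// Hypothetical helper (not the library's code): concatenates byte arrays
// using exactly one output allocation.
function concatBytes(parts: Uint8Array[]): Uint8Array {
  // First pass: compute the total length so the output is allocated once.
  let total = 0;
  for (const part of parts) total += part.length;

  // Second pass: copy each part in at its running offset.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

const joined = concatBytes([
  Uint8Array.from([0x41, 0x54]), // "AT"
  Uint8Array.from([0x43, 0x47]), // "CG"
]);
// joined now holds the bytes 0x41, 0x54, 0x43, 0x47 ("ATCG")
```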
Accepts any mix of GenotypeString and plain string arguments.
Returns whether the content contains the given substring.
When the data is in byte form, this performs a byte scan without converting to a JS string.
Alias for includes. Returns whether the content contains the given substring.
This name follows the convention used by Rust's str::contains and
Python's in operator. Functionally identical to includes.
Returns the index of the first occurrence of the pattern, or -1 if not found. Follows the same semantics as String.prototype.indexOf, including fromIndex clamping.
When the data is in byte form, this performs a byte scan without converting to a JS string.
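As a rough illustration of a byte scan with `fromIndex` clamping, the sketch below searches one `Uint8Array` for another. `indexOfBytes` is a hypothetical name, and the naive scan is an assumption; the library may use a different search strategy.

```typescript
// Hypothetical helper: naive byte scan mirroring String.prototype.indexOf.
function indexOfBytes(
  haystack: Uint8Array,
  needle: Uint8Array,
  fromIndex = 0,
): number {
  // Clamp the start position into [0, haystack.length]; NaN becomes 0.
  const start = Math.min(
    Math.max(Math.trunc(fromIndex) || 0, 0),
    haystack.length,
  );
  // An empty needle matches at the clamped start, as with strings.
  if (needle.length === 0) return start;

  const last = haystack.length - needle.length;
  for (let i = start; i <= last; i++) {
    let j = 0;
    while (j < needle.length && haystack[i + j] === needle[j]) j++;
    if (j === needle.length) return i;
  }
  return -1;
}
```

For ASCII input the result agrees with the string method: searching the bytes of `"ATCGATCG"` for the bytes of `"CG"` from index 3 finds the same position as `"ATCGATCG".indexOf("CG", 3)`.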
Returns a new GenotypeString containing a portion of the content.
Follows the same semantics as String.prototype.slice. The returned instance is independent — mutating one does not affect the other.
Optional end: number
Returns a new GenotypeString containing a portion of the content.
Follows the same semantics as String.prototype.substring: negative or NaN arguments are clamped to 0, values beyond length are clamped to length, and if start is greater than end the two are swapped. For new code, prefer slice which has more predictable behavior with negative indices.
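The difference between the two sets of semantics is easiest to see on plain JavaScript strings. The sketch below uses only the built-in `String.prototype.substring` and `String.prototype.slice`, whose behavior this class is documented to mirror.

```typescript
const seq = "ATCGGCTA";

// substring: start > end, so the two arguments are swapped.
const swapped = seq.substring(5, 2); // "CGG"

// substring: a negative start is clamped to 0.
const clamped = seq.substring(-3, 4); // "ATCG"

// slice: a negative index counts back from the end.
const fromEnd = seq.slice(-3); // "CTA"

// slice: start > end yields an empty result instead of swapping.
const emptySlice = seq.slice(5, 2); // ""
```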
Optional end: number
Returns a new GenotypeString with the content repeated the given number of times.
Follows the same semantics as String.prototype.repeat. When the data is in byte form, the repetition is performed by allocating a single Uint8Array and copying the source bytes into each segment.
The number of times to repeat (must be non-negative and finite)
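A minimal sketch of byte-level repetition, assuming a `RangeError` for invalid counts as `String.prototype.repeat` throws. `repeatBytes` is a hypothetical helper, not the library's method.

```typescript
// Hypothetical helper: repeats a byte array with one output allocation.
function repeatBytes(src: Uint8Array, count: number): Uint8Array {
  if (!Number.isFinite(count) || count < 0) {
    throw new RangeError("count must be non-negative and finite");
  }
  const n = Math.floor(count); // fractional counts are truncated
  const out = new Uint8Array(src.length * n);
  // Copy the source bytes into each consecutive segment.
  for (let i = 0; i < n; i++) {
    out.set(src, i * src.length);
  }
  return out;
}

const repeated = repeatBytes(Uint8Array.from([0x41, 0x54]), 3); // bytes of "ATATAT"
```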
Returns a new GenotypeString with all ASCII lowercase letters converted to uppercase.
When the data is in byte form, this uses bit manipulation (byte & 0xDF)
rather than JS string casing. Only ASCII letters a-z are affected; all
other byte values are preserved unchanged.
Returns a new GenotypeString with all ASCII uppercase letters converted to lowercase.
When the data is in byte form, this uses bit manipulation (byte | 0x20)
rather than JS string casing. Only ASCII letters A-Z are affected; all
other byte values are preserved unchanged.
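The two bit masks can be illustrated on raw bytes. The helper names below are hypothetical. The range check is what keeps non-letters unchanged: applying `& 0xDF` blindly would, for example, turn `{` (0x7B) into `[` (0x5B).

```typescript
// Hypothetical helpers illustrating ASCII case conversion by bit manipulation.
function toUpperAscii(bytes: Uint8Array): Uint8Array {
  // Clearing bit 0x20 maps 'a'..'z' (0x61..0x7A) to 'A'..'Z' (0x41..0x5A).
  return bytes.map((b) => (b >= 0x61 && b <= 0x7a ? b & 0xdf : b));
}

function toLowerAscii(bytes: Uint8Array): Uint8Array {
  // Setting bit 0x20 maps 'A'..'Z' (0x41..0x5A) to 'a'..'z' (0x61..0x7A).
  return bytes.map((b) => (b >= 0x41 && b <= 0x5a ? b | 0x20 : b));
}

// Small ASCII-only conversion helpers for the demonstration.
const asciiEncode = (s: string) => Uint8Array.from(s, (c) => c.charCodeAt(0));
const asciiDecode = (b: Uint8Array) =>
  Array.from(b, (x) => String.fromCharCode(x)).join("");

const upper = asciiDecode(toUpperAscii(asciiEncode("acgtN-1{"))); // "ACGTN-1{"
const lower = asciiDecode(toLowerAscii(asciiEncode("ACGTn*["))); // "acgtn*["
```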
Returns a new GenotypeString with leading and trailing ASCII whitespace removed.
When the data is in byte form, bytes 0x09 (tab), 0x0A (LF), 0x0B (VT), 0x0C (FF), 0x0D (CR), and 0x20 (space) are trimmed. This matches the characters that String.prototype.trim removes in the ASCII range.
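A sketch of trimming on bytes, using the six whitespace values listed above. `trimAscii` and `isAsciiWhitespace` are hypothetical names chosen for illustration.

```typescript
// The six ASCII whitespace bytes: 0x09..0x0D (tab, LF, VT, FF, CR) and 0x20.
function isAsciiWhitespace(b: number): boolean {
  return (b >= 0x09 && b <= 0x0d) || b === 0x20;
}

// Hypothetical helper: returns a trimmed copy of the input bytes.
function trimAscii(bytes: Uint8Array): Uint8Array {
  let start = 0;
  let end = bytes.length;
  while (start < end && isAsciiWhitespace(bytes[start])) start++;
  while (end > start && isAsciiWhitespace(bytes[end - 1])) end--;
  return bytes.slice(start, end); // slice copies, so the result is independent
}
```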
Returns the character at the given index, or an empty string if the index is out of range.
Returns the ASCII/Unicode code point of the character at the given index, or NaN if the index is out of range.
Returns whether the character at the given index matches a single character.
This is the representation-agnostic way to test a character at a position without forcing string conversion or dealing with raw byte values. When the data is in byte form, the comparison is done on byte values directly.
Returns false for out-of-range indices.
The position to check
A single-character string to compare against
Returns whether the character at the given index is a member of the given character set.
Accepts either a CharSet (preferred for hot loops — O(1) lookup with no allocation) or a plain string of characters (convenient for one-off checks — converted to code point comparisons internally).
Returns false for out-of-range indices.
The position to check
A CharSet or a string of characters to test membership in
import { Bases, GenotypeString } from "./genotype-string";
const gs = GenotypeString.fromString("ATCGRYN");
gs.isAnyOf(0, Bases.Purine); // true ('A' is a purine)
gs.isAnyOf(2, Bases.Purine); // false ('C' is not a purine)
gs.isAnyOf(0, "AC"); // true ('A' is in "AC")
gs.isAnyOf(4, Bases.Ambiguous); // true ('R' is ambiguous)
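The internals of CharSet are not shown here. One common way to get O(1) membership tests without allocation is a 256-entry lookup table indexed by byte value. The `AsciiSet` class below is a hypothetical sketch of that idea, not the library's CharSet.

```typescript
// Hypothetical stand-in for a character set: one table slot per byte value.
class AsciiSet {
  private readonly table = new Uint8Array(256);

  constructor(chars: string) {
    for (let i = 0; i < chars.length; i++) {
      const code = chars.charCodeAt(i);
      if (code < 256) this.table[code] = 1;
    }
  }

  // One array read per lookup; out-of-range values are simply not members.
  hasByte(b: number): boolean {
    return this.table[b] === 1;
  }
}

const purineSet = new AsciiSet("AG"); // assumed membership, for illustration only
const isPurineA = purineSet.hasByte(0x41); // 'A' -> true
const isPurineC = purineSet.hasByte(0x43); // 'C' -> false
```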
Returns whether the content starts with the given prefix.
When the data is in byte form, this compares bytes directly without converting to a JS string.
Returns whether the content ends with the given suffix.
When the data is in byte form, this compares bytes directly without converting to a JS string.
Compares this instance for equality against another GenotypeString, a plain string, or a Uint8Array.
When both sides are in byte form, comparison is done directly on bytes. Otherwise, both sides are converted to strings for comparison.
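The byte-to-byte path can be sketched as a length check followed by an element-wise comparison. `bytesEqual` is a hypothetical helper shown for illustration.

```typescript
// Hypothetical helper: content equality for two byte arrays.
function bytesEqual(a: Uint8Array, b: Uint8Array): boolean {
  // Different lengths can never be equal, so skip the scan entirely.
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}
```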
Compares this instance with another for sort ordering, following the same contract as String.prototype.localeCompare.
Returns a negative number if this instance sorts before the other, a positive number if it sorts after, or 0 if they are equal. Accepts a GenotypeString or a plain string as the comparison target.
Optional locales: string | string[]
Optional options: CollatorOptions
Returns the content as a JavaScript string.
If the data is currently in byte form, this triggers a UTF-8 decode and the byte representation is dropped. Subsequent calls return the cached string without re-decoding.
Returns the string representation for JSON serialization.
Without this method, JSON.stringify() would serialize the object's
(empty) public shape rather than its string content. With it,
JSON.stringify({ sequence: gs }) produces {"sequence":"ATCG"}
as expected.
Enables transparent coercion in template literals, string concatenation, and other JS primitive contexts.
Returns the string representation for "string" and "default" hints, and NaN for the "number" hint.
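The serialization and coercion hooks described above are standard JavaScript mechanisms. The sketch below shows them on a minimal hypothetical wrapper, `SeqWrapper`, which stands in for GenotypeString and is not its actual implementation.

```typescript
// Hypothetical minimal wrapper demonstrating toJSON and Symbol.toPrimitive.
class SeqWrapper {
  private readonly value: string;

  constructor(value: string) {
    this.value = value;
  }

  toString(): string {
    return this.value;
  }

  // JSON.stringify calls toJSON and serializes its return value.
  toJSON(): string {
    return this.value;
  }

  // Called with "string", "default", or "number" depending on the context.
  [Symbol.toPrimitive](hint: string): string | number {
    return hint === "number" ? NaN : this.value;
  }
}

const wrapped = new SeqWrapper("ATCG");
const json = JSON.stringify({ sequence: wrapped }); // '{"sequence":"ATCG"}'
const templated = `seq=${wrapped}`; // "seq=ATCG" ("string" hint)
const concatenated = "seq=" + wrapped; // "seq=ATCG" ("default" hint)
const numeric = Number(wrapped); // NaN ("number" hint)
```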
Yields single-character strings, enabling for...of iteration, spread
syntax ([...gs]), Array.from(gs), and constructors like new Set(gs).
When the data is in byte form, characters are produced directly from bytes without converting the entire content to a JS string first.
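Producing characters straight from bytes can be sketched with a generator-based iterator. `ByteChars` is a hypothetical class, and the one-byte-per-character mapping assumes ASCII content as stated elsewhere in this documentation.

```typescript
// Hypothetical iterable that yields one single-character string per byte.
class ByteChars implements Iterable<string> {
  constructor(private readonly bytes: Uint8Array) {}

  *[Symbol.iterator](): IterableIterator<string> {
    for (let i = 0; i < this.bytes.length; i++) {
      // Each byte is converted on its own; no full-length string is built.
      yield String.fromCharCode(this.bytes[i]);
    }
  }
}

const chars = new ByteChars(Uint8Array.from([0x41, 0x54, 0x43, 0x47]));
const spread = [...chars]; // ["A", "T", "C", "G"]
const unique = new Set(new ByteChars(Uint8Array.from([0x41, 0x41, 0x54])));
// unique contains "A" and "T"
```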
Returns the content as a new Uint8Array.
The returned array is a copy — mutating it does not affect this instance. If the data is currently in string form, this triggers a UTF-8 encode and the string representation is dropped.
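The representation switch described for the string and byte accessors can be sketched as follows. `LazyAscii` is a hypothetical, heavily simplified stand-in: it converts with ASCII char codes instead of a real UTF-8 encoder and decoder, and it exposes an `isByteBacked` getter purely so the switch is observable.

```typescript
// Hypothetical sketch of a lazy dual-representation holder (ASCII only).
class LazyAscii {
  private constructor(
    private str: string | null,
    private bytes: Uint8Array | null,
  ) {}

  static fromString(s: string): LazyAscii {
    return new LazyAscii(s, null);
  }

  static fromBytes(b: Uint8Array): LazyAscii {
    return new LazyAscii(null, b.slice()); // defensive copy of the input
  }

  get isByteBacked(): boolean {
    return this.bytes !== null;
  }

  toString(): string {
    if (this.str !== null) return this.str; // already in string form
    const bytes = this.bytes ?? new Uint8Array(0);
    const decoded = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
    this.str = decoded;
    this.bytes = null; // drop the byte representation
    return decoded;
  }

  toBytes(): Uint8Array {
    if (this.bytes === null) {
      this.bytes = Uint8Array.from(this.str ?? "", (c) => c.charCodeAt(0));
      this.str = null; // drop the string representation
    }
    return this.bytes.slice(); // callers always receive a copy
  }
}
```

Because both the factory and the accessor copy, mutating either the array passed to `fromBytes` or the array returned by `toBytes` leaves the instance unchanged.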
Matches the content against a regular expression.
Converts to a JS string if not already in string form, since the JS regex engine operates on strings.
Returns a new GenotypeString with occurrences of the pattern replaced.
Converts to a JS string for the replacement operation, then wraps the result in a new GenotypeString.
Returns the index of the first match of the regular expression, or -1 if no match is found.
Converts to a JS string if not already in string form.
Splits the content on the given separator and returns an array of plain strings.
Returns plain strings rather than GenotypeString instances because split results are typically short fragments used for parsing rather than sequences that would benefit from byte representation.
Optional limit: number
A lazy dual-representation string type for genomic sequence and quality data.
GenotypeString holds either a JavaScript string or a Uint8Array internally, converting between them on demand. It presents a string-like interface so that call sites don't need to know or care which representation is active.
When the data is in byte form, common operations like substring search, case conversion, and slicing are performed directly on bytes without converting to a JS string first. This avoids redundant encoding/decoding when chaining multiple operations that cross the Rust FFI boundary.
Instances are created through the static factory methods fromString and fromBytes. The constructor is private.
This type assumes ASCII content. Byte-native operations use ASCII semantics (e.g., case conversion via bit manipulation). For genomic data — nucleotide sequences, quality scores, IUPAC codes — this is always correct.
Most string contexts work transparently: template literals, string concatenation with +, RegExp.test(), RegExp.exec(), String(), JSON.stringify(), and default array sorting all coerce automatically.
A few JavaScript mechanisms do not work with wrapper types and cannot be overridden. These are inherent limitations of wrapping a non-primitive:
Strict equality (===) and switch compare by reference, not value. Use the equals method for content comparison.
Set and Map use identity semantics for object keys. To use sequence content as a key, call .toString() first.
Bracket indexing (gs[0]) does not return a character. Use charAt instead.
Example