This file contains release notes for up to the three most recent releases in reverse chronological order. For the complete release history, see the CHANGELOG or the docs/releases/ directory.
Transfer Reflow with Content-Aware Routing
This release delivers the complete transfer reflow pipeline, enabling content-aware data reorganization across cloud storage providers and local filesystems.
Copy objects while rewriting keys based on templates:
# Path-based reflow
gonimbus transfer reflow 's3://source/prefix/' \
--dest 's3://dest/base/' \
--rewrite-from '{program}/{site}/{date}/{file}' \
--rewrite-to '{date}/{program}/{site}/{file}'
# Content-aware reflow (with probe-derived variables)
gonimbus transfer reflow --stdin \
--dest 's3://dest/base/' \
--rewrite-from '{_}/{store}/{device}/{date}/{file}' \
--rewrite-to '{business_date}/{store}/{file}' < probe.jsonl
# Bucket to local filesystem
gonimbus transfer reflow --stdin \
--dest 'file:///tmp/output/' \
--rewrite-from '...' \
--rewrite-to '...' < probe.jsonl

- Template variables from path segments or probe-derived fields
- Parallel copy with configurable workers (`--parallel`)
- Checkpoint/resume for large jobs (`--checkpoint`, `--resume`)
- Collision handling (`--on-collision log|fail|overwrite`)
- Dry-run mode (`--dry-run`)
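To make the template semantics concrete, here is a minimal Python sketch of how a `--rewrite-from` / `--rewrite-to` pair could map a source key to a destination key. It is not Gonimbus code: `compile_template` and `rewrite_key` are hypothetical helpers, and it assumes each `{name}` matches exactly one path segment, that `{_}` discards a segment, and that probe-derived fields are merged in as extra variables.

```python
import re


def compile_template(template: str) -> re.Pattern:
    """Turn '{a}/{b}/{file}' into a regex with one named group per variable."""
    parts = []
    for segment in template.split("/"):
        m = re.fullmatch(r"\{([A-Za-z_]\w*)\}", segment)
        if m is None:
            parts.append(re.escape(segment))  # literal path segment
        elif m.group(1) == "_":
            parts.append(r"[^/]+")  # assumption: '{_}' matches and discards one segment
        else:
            parts.append(rf"(?P<{m.group(1)}>[^/]+)")
    return re.compile("/".join(parts))


def rewrite_key(key, rewrite_from, rewrite_to, extra=None):
    """Return the rewritten key, or None when the key does not match the template."""
    match = compile_template(rewrite_from).fullmatch(key)
    if match is None:
        return None
    # Path variables first, then probe-derived fields (assumed to be passed in as a dict).
    variables = {**match.groupdict(), **(extra or {})}
    return rewrite_to.format(**variables)


# Path-based reflow: reorder the segments.
print(rewrite_key(
    "alpha/site1/2026-01-25/a.xml",
    "{program}/{site}/{date}/{file}",
    "{date}/{program}/{site}/{file}",
))

# Content-aware reflow: 'business_date' comes from a probe record, not from the path.
print(rewrite_key(
    "raw/store7/dev3/2026-01-25/a.xml",
    "{_}/{store}/{device}/{date}/{file}",
    "{business_date}/{store}/{file}",
    extra={"business_date": "2026-01-24"},
))
```

Keys that do not match the source template return `None` in this sketch; how the real command treats such keys is not specified here.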
Extract derived fields from object content:
# Probe single object
gonimbus content probe 's3://bucket/file.xml' --config probe.yaml
# Bulk probe via stdin
gonimbus content probe --stdin --config probe.yaml < uris.txt

extract:
  - name: business_date
    type: xml_xpath
    xpath: //BusinessDate
  - name: schema_version
    type: json_path
    path: $.metadata.version

| Type | Use Case | Example |
|---|---|---|
| `xml_xpath` | XML element extraction | `//BusinessDate` |
| `regex` | Pattern matching | `date=(\d{4}-\d{2}-\d{2})` |
| `json_path` | JSON field extraction | `$.data.timestamp` |
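The three extractor types can be illustrated with a short Python sketch. This is an approximation for readers, not the probe implementation: `extract_field` is a hypothetical helper, the `pattern` key for `regex` rules is an assumption (the config above only shows `xpath` and `path`), the JSON path support covers only simple `$.a.b` forms, and `//X` is rewritten to `.//X` because Python's ElementTree supports only an XPath subset.

```python
import json
import re
import xml.etree.ElementTree as ET


def extract_field(rule: dict, content: str):
    """Apply one extraction rule to object content; return None when nothing matches."""
    kind = rule["type"]
    if kind == "xml_xpath":
        xpath = rule["xpath"]
        if xpath.startswith("//"):
            xpath = "." + xpath  # ElementTree needs a relative path
        return ET.fromstring(content).findtext(xpath)
    if kind == "regex":
        m = re.search(rule["pattern"], content)  # 'pattern' key is an assumption
        if m is None:
            return None
        # Prefer the first capture group when the pattern defines one.
        return m.group(1) if m.re.groups else m.group(0)
    if kind == "json_path":
        path = rule["path"]
        if path.startswith("$."):
            path = path[2:]
        node = json.loads(content)
        for part in path.split("."):  # only dotted '$.a.b' paths are handled
            if not isinstance(node, dict) or part not in node:
                return None
            node = node[part]
        return node
    raise ValueError(f"unsupported extractor type: {kind}")


xml_doc = "<Doc><Header><BusinessDate>2026-01-24</BusinessDate></Header></Doc>"
rule = {"name": "business_date", "type": "xml_xpath", "xpath": "//BusinessDate"}
print(rule["name"], "=", extract_field(rule, xml_doc))
```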
Transfer reflow now supports local filesystem destinations:
gonimbus transfer reflow --stdin \
--dest 'file:///tmp/reflow-out/' \
--src-profile my-aws-profile \
--rewrite-from '...' \
--rewrite-to '...' < probe.jsonl

| Mode | Behavior |
|---|---|
| `--on-collision log` (default) | Log conflict, fail operation |
| `--on-collision fail` | Fail immediately on first conflict |
| `--on-collision overwrite` | Replace existing (requires `--overwrite`) |
- See docs/releases/v0.1.7.md for complete release notes
Content Inspection with Range Requests
This release introduces content inspection commands that read the leading bytes of an object (file headers, magic bytes) without downloading the entire file, using HTTP Range requests for efficiency.
New content subcommands provide JSONL-only inspection operations:
# Read the first 4KB of an object (default)
gonimbus content head s3://bucket/path/to/file.xml --profile my-profile
# Read the first 256 bytes (magic bytes, file headers)
gonimbus content head s3://bucket/path/to/file.xml --bytes 256 --profile my-profile

Output is a single `gonimbus.content.head.v1` JSONL record:
{
  "type": "gonimbus.content.head.v1",
  "ts": "2026-01-25T12:00:00Z",
  "job_id": "...",
  "provider": "s3",
  "data": {
    "uri": "s3://bucket/path/to/file.xml",
    "key": "path/to/file.xml",
    "bytes_requested": 4096,
    "bytes_returned": 4096,
    "content_b64": "PD94bWwgdmVyc2lvbj0iMS4wIi4uLg==",
    "etag": "60eda68512f8238bd2ba9abac0de63d7",
    "size": 3729736,
    "last_modified": "2025-12-15T20:53:44Z",
    "content_type": "application/xml"
  }
}

| Command | Output | Use Case |
|---|---|---|
| `stream head` | JSONL (metadata only) | Routing decisions, size checks |
| `stream get` | Mixed framing (JSONL + raw bytes) | Full content download |
| `content head` | JSONL (base64 content) | Header inspection, magic bytes |
Key difference: content commands are JSONL-only with no mixed framing, making them easier to integrate with tools like jq.
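Because the output is plain JSONL, a consumer needs only a JSON parser and a base64 decoder. The Python sketch below decodes the `content_b64` field of a record shaped like the sample above; `decode_head_record` is a hypothetical helper name, and the record is trimmed to the fields the sketch uses.

```python
import base64
import json

# One JSONL line, reduced from the sample record to the fields used here.
SAMPLE_LINE = (
    '{"type": "gonimbus.content.head.v1", "provider": "s3", '
    '"data": {"uri": "s3://bucket/path/to/file.xml", '
    '"content_b64": "PD94bWwgdmVyc2lvbj0iMS4wIi4uLg=="}}'
)


def decode_head_record(line: str) -> bytes:
    """Parse one content.head record and return the decoded leading bytes."""
    record = json.loads(line)
    if record.get("type") != "gonimbus.content.head.v1":
        raise ValueError(f"unexpected record type: {record.get('type')!r}")
    return base64.b64decode(record["data"]["content_b64"])


head = decode_head_record(SAMPLE_LINE)
print(head)
print("looks like XML:", head.startswith(b"<?xml"))
```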
The S3 provider now supports HTTP Range requests via the ObjectRanger interface:
- Efficient partial reads: Only downloads requested bytes
- Automatic fallback: Falls back to GetObject if the provider does not support ranges
- Standard semantics: Uses HTTP Range header with inclusive byte offsets
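Inclusive offsets mean that a request for the first N bytes ends at offset N-1, not N. A small Python sketch of the header arithmetic follows; `range_header` is an illustrative helper, not part of the `ObjectRanger` interface.

```python
def range_header(offset: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Both ends are inclusive, so the last byte is offset + length - 1.
    return f"bytes={offset}-{offset + length - 1}"


print(range_header(0, 4096))  # the default 4KB head read
print(range_header(0, 16))    # a 16-byte magic-number read
```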
Inspect magic bytes without downloading entire files:
# Read first 16 bytes for magic number detection
gonimbus content head s3://bucket/data/file --bytes 16 --profile prod | \
  jq -r '.data.content_b64' | base64 -d | xxd

Extract XML version and encoding from document headers:
# Read first 256 bytes for XML declaration
gonimbus content head s3://bucket/data/doc.xml --bytes 256 --profile prod | \
  jq -r '.data.content_b64' | base64 -d | head -1

Make routing decisions based on file headers without full download:
# Check if file starts with expected header
header=$(gonimbus content head s3://bucket/file --bytes 64 --profile prod | \
jq -r '.data.content_b64' | base64 -d)
if [[ "$header" == *"expected-pattern"* ]]; then
  # Route to processor A
  :  # no-op placeholder so the branch parses; replace with the real routing command
fi

- See docs/releases/v0.1.6.md for complete release notes
Content Streaming + validate=size for Consumer Integration
This release introduces content streaming commands and validation, enabling Gonimbus to serve as a data plane for downstream consumers (Go, Python, Node) that need to process object content without managing provider SDKs directly.
New stream subcommands provide structured access to object metadata and content:
# Get object metadata (JSONL output)
gonimbus stream head s3://bucket/key --profile my-profile
# Stream object content (mixed JSONL + raw bytes)
gonimbus stream get s3://bucket/key --profile my-profile

The streaming output uses a mixed-framing format (ADR-0004):

| Record Type | Purpose |
|---|---|
| `gonimbus.stream.open.v1` | Stream metadata (uri, size, etag, last_modified) |
| `gonimbus.stream.chunk.v1` | Chunk header (seq, nbytes) + raw bytes |
| `gonimbus.stream.close.v1` | Completion status (success/error, total chunks/bytes) |
Errors are emitted to stdout as gonimbus.error.v1 records (streaming mode contract), enabling consumers to rely on structured output without scraping stderr.
The pkg/stream package provides Go helpers for producing and consuming streams:
- `Writer`: Produces mixed-framing output
- `Decoder`: Parses streams with truncation detection (`io.ErrUnexpectedEOF`)
- Byte-exact reconstruction verified via MD5/SHA256 round-trip
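For consumers outside Go, the general idea of mixed framing and truncation detection can be sketched in Python. The wire layout below is an assumption made for illustration only: each record header is taken to be one newline-terminated JSON object, with `seq` and `nbytes` under a `data` key and the raw bytes following the chunk header directly. The authoritative layout is defined by ADR-0004 and the streaming contract spec; `read_stream` is a hypothetical helper.

```python
import io
import json


def read_stream(stream) -> bytes:
    """Reassemble the payload from a mixed-framing stream (assumed layout, see lead-in)."""
    payload = bytearray()
    closed = False
    while True:
        line = stream.readline()
        if not line:
            break  # EOF before a close record is treated as truncation below
        header = json.loads(line)
        kind = header["type"]
        if kind == "gonimbus.stream.chunk.v1":
            nbytes = header["data"]["nbytes"]
            chunk = stream.read(nbytes)
            if len(chunk) != nbytes:
                raise EOFError("stream truncated inside a chunk")
            payload += chunk
        elif kind == "gonimbus.stream.close.v1":
            closed = True
            break
        elif kind == "gonimbus.error.v1":
            raise RuntimeError(f"stream reported an error: {header}")
        # Other record types (e.g. the open record) carry metadata only.
    if not closed:
        raise EOFError("stream ended without a close record")
    return bytes(payload)


frames = (
    b'{"type":"gonimbus.stream.open.v1","data":{}}\n'
    b'{"type":"gonimbus.stream.chunk.v1","data":{"seq":0,"nbytes":5}}\n' b"hello"
    b'{"type":"gonimbus.stream.chunk.v1","data":{"seq":1,"nbytes":6}}\n' b" world"
    b'{"type":"gonimbus.stream.close.v1","data":{}}\n'
)
print(read_stream(io.BytesIO(frames)))
```

Raising on a short read or a missing close record plays the same role here as `io.ErrUnexpectedEOF` does in the Go decoder: a truncated stream is never mistaken for a complete object.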
Both stream get and transfer operations now validate that the enumerated size matches the content length returned by GetObject:
- Catches stale index/list metadata before deep pipeline processing
- Size mismatch mapped to `NOT_FOUND` error code (stale key semantics)
- Fails early, avoiding wasted buffering and retries
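The check itself is a single comparison made before any content is buffered. A minimal Python sketch, using hypothetical names (`StaleObjectError`, `validate_size`) rather than Gonimbus identifiers:

```python
class StaleObjectError(Exception):
    """Raised when list/index metadata no longer matches the stored object."""

    code = "NOT_FOUND"  # stale key semantics: the enumerated object no longer exists as listed


def validate_size(uri: str, enumerated_size: int, content_length: int) -> None:
    """Fail before buffering when the enumerated size disagrees with the fetched object."""
    if enumerated_size != content_length:
        raise StaleObjectError(
            f"{uri}: enumerated size {enumerated_size} != content length {content_length}"
        )


validate_size("s3://bucket/key", 3729736, 3729736)  # sizes agree, so nothing is raised
```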
- ADR-0004: Language-neutral content stream contract
- Streaming contract spec and helper guidance (`docs/development/streaming/`)
- See docs/releases/v0.1.5.md for complete release notes