Commit cbd6ae2

Merge pull request #524 from ipfs/feat/ipip-remove-ipld-translation
IPIP-0524: Remove cross-codec conversion from HTTP Gateways
2 parents 0084d7c + a0429c9 commit cbd6ae2

2 files changed

+230
-7
lines changed

src/http-gateways/path-gateway.md

Lines changed: 31 additions & 7 deletions
@@ -36,6 +36,12 @@ thanks:
3636
affiliation:
3737
name: Protocol Labs
3838
url: https://protocol.ai/
39+
- name: Alex Potsides
40+
github: achingbrain
41+
url: https://achingbrain.net
42+
affiliation:
43+
name: Shipyard
44+
url: https://ipshipyard.com
3945
xref:
4046
- url
4147
- trustless-gateway
@@ -158,12 +164,13 @@ For example:
158164

159165
- [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – disables [IPLD codec deserialization](https://ipld.io/docs/codecs/), requests a verifiable raw [block](https://docs.ipfs.io/concepts/glossary/#block) to be returned
160166
- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables [IPLD codec deserialization](https://ipld.io/docs/codecs/), requests a verifiable [CAR](https://docs.ipfs.io/concepts/glossary/#car) stream to be returned with implicit or explicit [`dag-scope`](https://specs.ipfs.tech/http-gateways/trustless-gateway/#dag-scope-request-query-parameter) for blocks at the terminus of the specified path and the blocks required to traverse path segments from root CID to the terminus.
161-
- [application/x-tar](https://en.wikipedia.org/wiki/Tar_(computing)) – returns UnixFS tree (files and directories) as a [TAR](https://en.wikipedia.org/wiki/Tar_(computing)) stream. Returned tree starts at a DAG which name is the same as the terminus segment. Produces 400 Bad Request for content that is not UnixFS.
162-
- [application/vnd.ipld.dag-json](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-json) – requests [IPLD Data Model](https://ipld.io/docs/data-model/) representation serialized into [DAG-JSON format](https://ipld.io/docs/codecs/known/dag-json/). If the requested CID already has `dag-json` (0x0129) codec, data is validated as DAG-JSON before being returned as-is. Invalid DAG-JSON produces HTTP Error 500.
163-
- [application/vnd.ipld.dag-cbor](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-cbor) – requests [IPLD Data Model](https://ipld.io/docs/data-model/) representation serialized into [DAG-CBOR format](https://ipld.io/docs/codecs/known/dag-cbor/). If the requested CID already has `dag-cbor` (0x71) codec, data is validated as DAG-CBOR before being returned as-is. Invalid DAG-CBOR produces HTTP Error 500.
164-
- [application/json](https://www.iana.org/assignments/media-types/application/json) – same as `application/vnd.ipld.dag-json`, unless the CID's codec already is `json` (0x0200). Then, the raw JSON block can be returned as-is without any conversion.
165-
- [application/cbor](https://www.iana.org/assignments/media-types/application/cbor) – same as `application/vnd.ipld.dag-cbor`, unless the CID's codec already is `cbor` (0x51). Then, the raw CBOR block can be returned as-is without any conversion.
167+
- [application/x-tar](https://en.wikipedia.org/wiki/Tar_(computing)) – returns a UnixFS tree (files and directories) as a [TAR](https://en.wikipedia.org/wiki/Tar_(computing)) stream. Returned tree starts at a DAG whose name is the same as the terminus segment. Produces 406 Not Acceptable for content that is not UnixFS.
168+
- [application/vnd.ipld.dag-json](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-json) – Returns the block when CID codec is `dag-json`. Implementations MAY validate block data before returning. SHOULD produce 406 Not Acceptable when the CID codec does not match.
169+
- [application/vnd.ipld.dag-cbor](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-cbor) – Returns the block when CID codec is `dag-cbor`. Implementations MAY validate block data before returning. SHOULD produce 406 Not Acceptable when the CID codec does not match.
170+
- [application/json](https://www.iana.org/assignments/media-types/application/json) – For blocks with CID codec `json`, returns block data as `application/json`. Implementations MAY validate block data before returning. For deserialized UnixFS files that represent text files with valid JSON, implementations SHOULD allow serving the file content as `application/json` regardless of the CID codec being `dag-pb` or `raw`. SHOULD produce 406 Not Acceptable in all other cases.
171+
- [application/cbor](https://www.iana.org/assignments/media-types/application/cbor) – Returns the block when CID codec is `cbor`. Implementations MAY validate block data before returning. SHOULD produce 406 Not Acceptable when the CID codec does not match.
166172
- [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) – requests a verifiable :cite[ipns-record] to be returned. Produces 400 Bad Request if the content is not under the IPNS namespace, or contains a path.
173+
- [text/html](https://html.spec.whatwg.org/) – returns a human-readable representation of the requested data which may include a link to download the raw data.
167174

168175
:::note
169176

@@ -339,6 +346,23 @@ responses (such as CAR), once HTTP 200 OK status is sent, gateways cannot
339346
change it. If a child block is missing during streaming, the gateway SHOULD
340347
terminate the stream. Clients MUST verify response completeness.
341348

349+
### `406` Not Acceptable
350+
351+
Returned when the requested response format does not match the CID's codec
352+
and the gateway does not perform cross-codec conversion.
353+
354+
For example, requesting `?format=dag-json` on a `dag-cbor` block, or
355+
`?format=dag-cbor` on a `dag-pb` block, SHOULD return a 406 response.
356+
357+
Similarly, requesting `?format=tar` for content that is not UnixFS SHOULD
358+
return 406.
359+
360+
Implementations MAY include an actionable hint in the response body (e.g.,
361+
suggesting the client fetch the raw block with `?format=raw` and convert
362+
client-side).
363+
364+
See :cite[ipip-0524] for details.
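
As a rough illustration of the optional hint, a gateway might assemble the 406 response body along the following lines. The function name and the message wording are hypothetical; the spec leaves the body content to implementations.

```python
def not_acceptable_hint(requested_format: str, cid_codec: str) -> str:
    """Build a plain-text body for a 406 response (hypothetical wording)."""
    return (
        f"406 Not Acceptable: requested format '{requested_format}' does not "
        f"match the CID codec '{cid_codec}'. "
        "Fetch the raw block with ?format=raw and convert client-side."
    )
```

For instance, `not_acceptable_hint("dag-json", "dag-cbor")` yields a message that names both the requested format and the actual codec, and points the client at `?format=raw`.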
365+
342366
### `410` Gone
343367

344368
Error to indicate that request was formally correct, but this specific Gateway
@@ -753,10 +777,10 @@ By default, implicit deserialized response type is based on `Accept` header and
753777
- Bytes representing a CBOR file, see [application/cbor](https://www.iana.org/assignments/media-types/application/cbor)
754778
- Works exactly the same as `raw`, but returned `Content-Type` is `application/cbor`
755779
- DAG-JSON (0x0129)
756-
- If the `Accept` header includes `text/html`, implementation should return a generated HTML with options to download DAG-JSON as-is, or converted to DAG-CBOR.
780+
- If the `Accept` header includes `text/html`, implementation should return a generated HTML with an option to download DAG-JSON as-is.
757781
- Otherwise, response works exactly the same as `raw` block, but returned `Content-Type` is [application/vnd.ipld.dag-json](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-json)
758782
- DAG-CBOR (0x71)
759-
- If the `Accept` header includes `text/html`: implementation should return a generated HTML with options to download DAG-CBOR as-is, or converted to DAG-JSON.
783+
- If the `Accept` header includes `text/html`: implementation should return a generated HTML with an option to download DAG-CBOR as-is.
760784
- Otherwise, response works exactly the same as `raw` block, but returned `Content-Type` is [application/vnd.ipld.dag-cbor](https://www.iana.org/assignments/media-types/application/vnd.ipld.dag-cbor)
761785

762786
The following response types require an explicit opt-in, can only be requested with [`format`](#format-request-query-parameter) query parameter or [`Accept`](#accept-request-header) header:

src/ipips/ipip-0524.md

Lines changed: 199 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,199 @@
1+
---
2+
title: "IPIP-0524: Remove cross-codec conversion from HTTP Gateways"
3+
date: 2026-03-05
4+
ipip: ratified
5+
editors:
6+
- name: Alex Potsides
7+
github: achingbrain
8+
url: https://achingbrain.net
9+
affiliation:
10+
name: Shipyard
11+
url: https://ipshipyard.com
12+
- name: Marcin Rataj
13+
github: lidel
14+
url: https://lidel.org
15+
affiliation:
16+
name: Shipyard
17+
url: https://ipshipyard.com
18+
relatedIssues:
19+
- https://github.com/ipfs/gateway-conformance/issues/200
20+
order: 524
21+
tags: ['ipips']
22+
---
23+
24+
## Summary
25+
26+
Make IPFS HTTP Gateway responses easier to reason about by not requiring IPLD
27+
Data Model translations.
28+
29+
## Motivation
30+
31+
When sending an [Accept](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Accept)
32+
header or [format](https://specs.ipfs.tech/http-gateways/path-gateway/#format-request-query-parameter)
33+
query parameter to specify the response format of a request, the IPFS HTTP
34+
Gateway specs [allow translation](https://specs.ipfs.tech/http-gateways/path-gateway/#accept-request-header)
35+
of the requested content into the [IPLD Data Model](https://ipld.io/docs/data-model/).
36+
37+
This adds significant complexity to HTTP Gateway implementations, since they
38+
need to be able to translate between arbitrary data types and handle all the
39+
various failure states.
40+
41+
The conversions are also lossy, because the formats support different data
42+
types, so they lack general-purpose utility. A client that needs a conversion
43+
can perform it itself.
44+
45+
## Detailed design
46+
47+
When the block's CID codec matches the requested response format,
48+
implementations MAY return the block as-is without parsing or validating it.
49+
This is effectively equivalent to requesting `?format=raw` but with a
50+
codec-specific `Content-Type` header.
51+
52+
When the CID codec does not match the requested format, the gateway SHOULD
53+
return a [406 Not Acceptable](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/406)
54+
unless the server provides cross-codec conversion as an extra feature outside
55+
of this specification.
56+
57+
For example, requesting a DAG-JSON block with the `application/cbor` format
58+
would result in a 406 response.
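
The codec-match rule above can be sketched as a small lookup. This is a minimal illustration, not a reference implementation: the function and table names are hypothetical, the codec codes are the ones quoted in the Path Gateway spec, and `json` is left out on purpose because it carries an additional UnixFS exception described later in this IPIP.

```python
# Codec codes as quoted in the Path Gateway spec text.
FORMAT_TO_CODEC = {
    "dag-json": 0x0129,
    "dag-cbor": 0x71,
    "cbor": 0x51,
}


def status_for_block(cid_codec: int, requested_format: str) -> int:
    """Return the HTTP status a non-converting gateway would likely send."""
    if requested_format == "raw":
        return 200  # raw works for any codec
    if requested_format not in FORMAT_TO_CODEC:
        # car, tar, json, html, etc. are not modelled by this sketch.
        raise ValueError(f"format not modelled here: {requested_format}")
    # Matching codec: serve the block as-is; otherwise refuse to convert.
    return 200 if FORMAT_TO_CODEC[requested_format] == cid_codec else 406
```

With this table, a DAG-JSON block (0x0129) requested as `cbor` maps to 406, while the same block requested as `dag-json` or `raw` maps to 200.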
59+
60+
Where a human-readable rendering of the data is desired, the `text/html` format
61+
can be requested. This would allow browsing DAG-PB data, for example.
62+
63+
A [400](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/400)
64+
may be returned if the request was invalid (for example an unsupported format
65+
was requested).
66+
67+
A [500](https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Status/500)
68+
may be returned in other circumstances.
69+
70+
## Design rationale
71+
72+
Simplifying the HTTP Gateway spec to remove these format translations and the
73+
additional logic required makes it more straightforward to create new
74+
implementations, and makes the returned data more transparent and so easier to
75+
understand since the data is not modified to fit the output format.
76+
77+
Clients that wish to translate between different data formats may request raw
78+
blocks and do the translation themselves.
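
A client-side version of that pattern might look like the sketch below. It only builds the request and leaves sending it to the caller. The helper names are hypothetical, and `decode` and `encode` stand in for whichever codec library the application chooses; no particular library is implied.

```python
from urllib.parse import quote
from urllib.request import Request


def raw_block_request(gateway: str, cid: str) -> Request:
    """Build (but do not send) a request for the raw block behind a CID."""
    url = f"{gateway.rstrip('/')}/ipfs/{quote(cid)}?format=raw"
    return Request(url, headers={"Accept": "application/vnd.ipld.raw"})


def convert(raw: bytes, decode, encode) -> bytes:
    """Decode a raw block with one codec and re-encode it with another.

    `decode` and `encode` are caller-supplied callables, so the application
    controls exactly which codec implementation produces the output.
    """
    return encode(decode(raw))
```

Because the conversion runs in the client, the result depends only on the libraries the application ships, not on which gateway served the block.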
79+
80+
### User benefit
81+
82+
For gateway operators and implementers, removing the requirement to perform
83+
codec conversions server-side significantly reduces implementation complexity.
84+
85+
For end users and application developers, the change makes gateway behavior
86+
easier to reason about: a request either returns data deserialized according
87+
to the rules of the CID's original codec, or fails with 406. This moves
88+
conversion to userland, encouraging users to fetch raw blocks with
89+
`?format=raw` and convert client-side, putting the application in full control
90+
and producing deterministic results regardless of which gateway is used.
91+
92+
This matters in practice because codec libraries do not behave identically.
93+
[Cross-library dag-cbor tests (2026)](https://hyphacoop.github.io/dasl-testing/?group=tests-by-file&tag=dag-cbor)
94+
show each implementation differs on edge cases like float handling, map key
95+
ordering, and encoding strictness. Relying on server-side conversion means
96+
the output depends on whichever library the gateway happens to use, which is
97+
not a foundation for robust software.
98+
99+
### Compatibility
100+
101+
Formally this is a breaking change: server-side IPLD Data Model translations
102+
between codecs are removed.
103+
104+
In practice, nobody could build reliable software
105+
on top of conversion logic that behaved non-deterministically across gateways
106+
written in different languages. Clients that needed data in a different
107+
format often chose to fetch `?format=raw` and convert client-side already.
108+
109+
This IPIP standardizes that robust real-world pattern and removes an
110+
unreliable niche feature that has seen limited use.
111+
112+
#### Real-world `?format=` usage on `ipfs.io` and `dweb.link`
113+
114+
A 24-hour sample of traffic on the `ipfs.io` and `dweb.link` public gateways
115+
(Feb 2026) shows that only 4.5% of all requests use the `?format=` query
116+
parameter, and the vast majority ask for `json`:
117+
118+
| `?format=` value | % of requests with `format=` |
119+
|------------------|------------------------------|
120+
| `json` | 99.11% |
121+
| `raw` | 0.86% |
122+
| `dag-json` | 0.02% |
123+
| `car` | 0.01% |
124+
| other | <0.01% |
125+
126+
Note: `ipfs.io` and `dweb.link` serve deserialized responses. Trustless
127+
verifiable requests (`?format=raw`, `?format=car`) are redirected to
128+
`trustless-gateway.link`, which is why those formats appear so rarely here.
129+
130+
Looking at what those `?format=json` requests actually point at tells the
131+
real story. The CID codec of the requested blocks breaks down as follows:
132+
133+
| CID codec of requested block | % of `?format=json` |
134+
|------------------------------|---------------------|
135+
| `dag-pb` (CIDv0 `Qm...`) | 60.0% |
136+
| `dag-pb` (CIDv1 `bafy...`) | 21.4% |
137+
| `raw` (`bafk...`) | 18.6% |
138+
139+
100% of `?format=json` requests are for blocks with `dag-pb` or `raw` codec.
140+
None target the `json` codec (0x0200). In other words, these clients are
141+
reading regular JSON files stored as UnixFS, not asking the gateway to convert
142+
between IPLD codecs. The gateway serves them as plain HTTP file responses,
143+
which is covered by the UnixFS interop exception described later in this IPIP.
144+
145+
The remaining formats (`dag-json` and `car`) together account for less than
146+
0.04% of `?format=` requests and do not depend on cross-codec conversion
147+
either, since they request data in the block's native codec.
148+
149+
#### `json` and `dag-json` independence
150+
151+
`application/json` and `application/vnd.ipld.dag-json` are now treated as
152+
independent formats, each matching only their respective CID codec (`json`
153+
0x0200 and `dag-json` 0x0129). The old behavior where `application/json` was
154+
an alias for `application/vnd.ipld.dag-json` (falling back to dag-json
155+
conversion) no longer applies.
156+
157+
#### UnixFS interop exception for `Accept: application/json`
158+
159+
Note: the codec match requirement and 406 behavior described above do not
160+
apply to deserialized UnixFS file responses. Users commonly store valid JSON
161+
as UnixFS files (with `dag-pb` or `raw` codec), and serving those files with
162+
`Accept: application/json` is regular HTTP content serving, not codec
163+
conversion. See the `application/json` entry in the
164+
[Accept request header](https://specs.ipfs.tech/http-gateways/path-gateway/#accept-request-header)
165+
section of the Path Gateway spec for normative requirements.
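
One way to picture the exception is the sketch below. The function name is hypothetical, `file_bytes` is assumed to be the already-deserialized UnixFS file content, and the decision to validate the bytes as JSON is an assumption of this example, since validation is left to implementations. The `dag-pb` (0x70) and `raw` (0x55) codes come from the multicodec table.

```python
import json

JSON_CODEC = 0x0200           # `json` codec, as quoted in this IPIP
UNIXFS_CODECS = {0x70, 0x55}  # dag-pb and raw, per the multicodec table


def json_status(cid_codec: int, file_bytes: bytes) -> int:
    """Status for `Accept: application/json` on a non-converting gateway."""
    if cid_codec == JSON_CODEC:
        return 200  # native codec match; validation is optional
    if cid_codec in UNIXFS_CODECS:
        try:
            json.loads(file_bytes)  # assumption: this gateway validates
        except ValueError:
            return 406  # not valid JSON, so not served as application/json
        return 200  # regular HTTP file serving, not codec conversion
    return 406  # e.g. dag-cbor or dag-json: no cross-codec conversion
```

A `dag-pb` file containing `{"a": 1}` is served with 200, while a `dag-cbor` block requested as `application/json` gets 406.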
166+
167+
#### Opt-in backward compatibility
168+
169+
Implementations MAY offer an opt-in configuration flag to restore the old
170+
codec conversion behavior for backward compatibility.
171+
172+
#### Implementation-defined behavior
173+
174+
- The content of the 406 error response body (e.g. actionable hints).
175+
- Handling of `?format=json` / `Accept: application/json` on non-json-codec
176+
content (like `dag-pb` UnixFS files).
177+
- Whether to offer an opt-in flag for restoring codec conversion.
178+
- Validation of block data when the CID codec matches the requested format.
179+
180+
### Security
181+
182+
No security implications. This change restricts gateway behavior (returning
183+
406 instead of converting), which reduces attack surface.
184+
185+
## Test fixtures
186+
187+
Implementers can run the [gateway-conformance](https://github.com/ipfs/gateway-conformance/)
188+
test suite v0.10 or later. The following behaviors are verified by the test suite:
189+
190+
- Requesting a block in a format that differs from its CID codec (e.g.
191+
`dag-pb` block with `?format=dag-json`) returns HTTP 406.
192+
- Requesting a block in its native codec returns HTTP 200.
193+
- `?format=raw` works for any codec.
194+
- HTML rendering (`Accept: text/html`) of DAG-JSON/DAG-CBOR blocks is not
195+
codec conversion and remains part of the spec.
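
For a quick local smoke check before running the full suite, the first three behaviors can be approximated as shown below. This is not part of gateway-conformance: the function name is hypothetical, and `fetch` is a caller-supplied callable that takes a request path and returns the HTTP status code.

```python
from typing import Callable, List


def check_gateway(fetch: Callable[[str], int], cid: str,
                  native_format: str, mismatched_format: str) -> List[str]:
    """Return a list of failed expectations (empty when all pass)."""
    expectations = [
        (f"/ipfs/{cid}?format={native_format}", 200),      # native codec
        (f"/ipfs/{cid}?format={mismatched_format}", 406),  # no conversion
        (f"/ipfs/{cid}?format=raw", 200),                  # raw always works
    ]
    failures = []
    for path, want in expectations:
        got = fetch(path)
        if got != want:
            failures.append(f"{path}: expected {want}, got {got}")
    return failures
```

Passing a `fetch` that wraps a real HTTP client turns this into a live check; passing a stub makes it usable in unit tests.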
196+
197+
### Copyright
198+
199+
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
