 [University of Reading](http://www.reading.ac.uk/).

-Documentation is available on [docs.rs](https://docs.rs/reductionist/latest/reductionist/).
+Documentation for the Reductionist application is hosted on [GitHub](https://stackhpc.github.io/reductionist-rs).
+Documentation for the source code is available on [docs.rs](https://docs.rs/reductionist/latest/reductionist/).

 This is a performant implementation of the active storage server.
 The original Python functional prototype is available [here](https://github.com/stackhpc/s3-active-storage-prototype).

-## Concepts
+Note: The original S3 Active Storage project was renamed to Reductionist, to avoid confusion due to overuse of the term Active Storage.

-The Reductionist server supports the application of reductions to S3 objects that contain numeric binary data. These reductions are specified by making an HTTP POST request to the active storage proxy service.
+## Features

-The Reductionist server does not attempt to infer the datatype - it must be told the datatype to use based on knowledge that the client already has about the S3 object.
+Reductionist provides the following features:

-For example, if the original object has the following URL:
-    // - exactly one of the keys below should be specified
-    // - the values should match the data type (dtype)
-    "missing": {
-        "missing_value": 42,
-        "missing_values": [42, -42],
-        "valid_min": 42,
-        "valid_max": 42,
-        "valid_range": [-42, 42],
-    }
-}
-```
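
The two rules stated in the comments above (exactly one key, values matching the dtype) could be checked on the client before a request is sent. The following is a minimal sketch only: `make_missing_spec` is a hypothetical helper, not part of Reductionist or its client script, and the dtype check is a simplified assumption about what "match the data type" means.

```python
# Hypothetical client-side helper; not part of Reductionist or scripts/client.py.
ALLOWED_KEYS = {"missing_value", "missing_values", "valid_min", "valid_max", "valid_range"}
INTEGER_DTYPES = {"int32", "int64", "uint32", "uint64"}


def make_missing_spec(dtype, **kwargs):
    """Return the value for the "missing" field, holding exactly one allowed key."""
    if len(kwargs) != 1:
        raise ValueError("exactly one missing-data key must be specified")
    ((key, value),) = kwargs.items()
    if key not in ALLOWED_KEYS:
        raise ValueError(f"unknown missing-data key: {key}")
    # Treat scalars and lists uniformly when checking against the dtype.
    values = value if isinstance(value, (list, tuple)) else [value]
    if dtype in INTEGER_DTYPES:
        # Assumption: integer dtypes require integer values (bool is excluded).
        if not all(isinstance(v, int) and not isinstance(v, bool) for v in values):
            raise ValueError(f"{key} must hold integers for dtype {dtype}")
    return {key: value}


# Example: the valid_range entry shown in the request body above.
spec = make_missing_spec("int32", valid_range=[-42, 42])
```

The returned dictionary can be placed under the `"missing"` key of the request body; whether the server applies the same checks is not stated here.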
-
-The currently supported reducers are `max`, `min`, `sum`, `select` and `count`. All reducers return the result using the same datatype as specified in the request, except for `count`, which always returns the result as `int64`.
-
-The proxy adds the following headers to the HTTP response:
-
-* `x-activestorage-dtype`: The data type of the data in the response payload. One of `int32`, `int64`, `uint32`, `uint64`, `float32` or `float64`.
-* `x-activestorage-byte-order`: The byte order of the data in the response payload. Either `big` or `little`.
-* `x-activestorage-shape`: A JSON-encoded list of numbers describing the shape of the data in the response payload. May be an empty list for a scalar result.
-* `x-activestorage-count`: The number of non-missing array elements operated on while performing the requested reduction. This header is useful, for example, to calculate the mean over multiple requests where the number of items operated on may differ between chunks.
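
As an illustration of how a client might use these headers, the sketch below decodes a response payload with the Python standard library and combines per-chunk `sum` results into a mean. It is not the project's client code: `decode_response` and `mean_over_chunks` are hypothetical names, the headers are assumed to arrive as a plain dictionary with lower-case keys, and `x-activestorage-count` is assumed to hold a single integer.

```python
import json
import struct

# struct format characters for the dtypes listed for x-activestorage-dtype.
_FORMATS = {
    "int32": "i", "int64": "q", "uint32": "I",
    "uint64": "Q", "float32": "f", "float64": "d",
}
_BYTE_ORDERS = {"big": ">", "little": "<"}


def decode_response(headers, payload):
    """Decode a response body into (flat list of values, shape, count)."""
    order = _BYTE_ORDERS[headers["x-activestorage-byte-order"]]
    fmt = _FORMATS[headers["x-activestorage-dtype"]]
    shape = json.loads(headers["x-activestorage-shape"])
    count = int(headers["x-activestorage-count"])  # assumed to be one integer
    n_items = 1
    for dim in shape:
        n_items *= dim  # an empty shape (scalar result) leaves n_items == 1
    values = list(struct.unpack(f"{order}{n_items}{fmt}", payload))
    return values, shape, count


def mean_over_chunks(chunk_results):
    """Combine per-chunk (sum, count) pairs into a single mean."""
    total = sum(chunk_sum for chunk_sum, _ in chunk_results)
    n = sum(chunk_count for _, chunk_count in chunk_results)
    return total / n if n else float("nan")
```

Weighting by the count rather than averaging per-chunk means matters when chunks contain different numbers of non-missing elements: two chunks with sums 6.0 and 10.0 over 3 and 1 elements have an overall mean of 16.0 / 4 = 4.0.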
-
-[//]: <> (TODO: No OpenAPI support yet).
-[//]: <>(For a running instance of the proxy server, the full OpenAPI specification is browsable as a web page at the `{proxy-address}/redoc/` endpoint or in raw JSON form at `{proxy-address}/openapi.json`.)
-
-## Caveats
-
-This is a very early-stage project, and as such supports limited functionality.
-
-In particular, the following are known limitations which we intend to address:
-
-* Error handling and reporting is minimal
-* No support for missing data
-* No support for encrypted objects
-
-## Running
-
-There are various ways to run the Reductionist server.
-
-### Production deployment
-
-Reductionist provides an Ansible playbook to easily deploy it and supporting
-services to one or more hosts. See the [deployment
-README](deployment/README.md) for details.
-
-### Running in a container
-
-The simplest method is to run it in a container using a pre-built image:
-
-```sh
-docker run -it --detach --rm --net=host --name reductionist ghcr.io/stackhpc/reductionist-rs:latest
-```
-
-Images are published to [GitHub Container Registry](https://github.com/stackhpc/reductionist-rs/pkgs/container/reductionist-rs) when the project is released.
-The `latest` tag corresponds to the most recent release, or you can use a specific release e.g. `0.1.0`.
-
-This method does not require access to the source code.
-
-### Building a container image
-
-If you need to use unreleased changes, but still want to run in a container, it is possible to build an image.
-For simple testing purposes Minio is a convenient object storage server.
-
-### Deploy Minio object storage
-
-Start a local [Minio](https://min.io/) server which serves the test data:
-
-```sh
-./scripts/minio-start
-```
-
-The Minio server will run in a detached container and may be stopped with:
-
-```sh
-./scripts/minio-stop
-```
-
-Note that object data is not preserved when the container is stopped.
-
-### Upload some test data
-
-A script is provided to upload some test data to Minio.
-In a separate terminal, set up the Python virtualenv then upload some sample data:
-
-```sh
-# Create a virtualenv
-python3 -m venv ./venv
-# Activate the virtualenv
-source ./venv/bin/activate
-# Install dependencies
-pip install -r scripts/requirements.txt
-# Upload some sample data to the running Minio server
-python ./scripts/upload_sample_data.py
-```
-
-### Compliance test suite
-
-Proxy functionality can be tested using the [S3 active storage compliance suite](https://github.com/stackhpc/s3-active-storage-compliance-suite).
-
-### Making requests to active storage endpoints
-
-Request authentication is implemented using [Basic Auth](https://en.wikipedia.org/wiki/Basic_access_authentication) with the username and password consisting of your S3 Access Key ID and Secret Access Key, respectively. These credentials are then used internally to authenticate with the upstream S3 source using [standard AWS authentication methods](https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html).
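
For reference, a Basic Auth header of the kind described above can be built with the Python standard library alone. This is a minimal sketch rather than the project's client code: `basic_auth_header` is a hypothetical helper and the credentials are placeholders.

```python
import base64


def basic_auth_header(access_key_id, secret_access_key):
    """Build the Authorization header value for HTTP Basic Auth."""
    # Basic Auth joins the two parts with a colon, so the access key ID
    # itself must not contain one.
    if ":" in access_key_id:
        raise ValueError("access key ID must not contain a colon")
    token = f"{access_key_id}:{secret_access_key}".encode("utf-8")
    return "Basic " + base64.b64encode(token).decode("ascii")


# Placeholder credentials for illustration only.
request_headers = {"Authorization": basic_auth_header("my-access-key", "my-secret-key")}
```

Most HTTP client libraries can build this header themselves when given a username and password, so a helper like this is mainly useful for seeing what is sent over the wire. Because Basic Auth only encodes the credentials and does not encrypt them, it should be combined with TLS outside of local testing.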
-
-A basic Python client is provided in `scripts/client.py`.
-First install dependencies in a Python virtual environment:
-The source code is documented using [rustdoc](https://doc.rust-lang.org/rustdoc/what-is-rustdoc.html).
-Documentation is available on [docs.rs](https://docs.rs/reductionist/latest/reductionist/).
-It is also possible to build the documentation locally:
-
-```sh
-cargo doc --no-deps
-```
-
-The resulting documentation is available under `target/doc`, and may be viewed in a web browser using `file:///path/to/reductionist/target/doc/reductionist/index.html`.
+* [PyActiveStorage](https://github.com/valeriupredoi/PyActiveStorage) is a Python library which performs reductions on numerical data in data sources such as netCDF4. It has support for delegating computation to Reductionist when the data is stored in an S3-compatible object store.

 ## Contributing

-See [CONTRIBUTING.md](CONTRIBUTING.md) for information about contributing to Reductionist.
+See the [contributor guide](https://stackhpc.github.io/reductionist-rs/contributing.html) for information about contributing to Reductionist.