Describe the issue
Running the quantised bge‑micro‑v2 encoder (model_quantized.onnx) with the official linux‑aarch64 ONNX Runtime 1.22.0 aborts on a real Raspberry Pi Zero 2 W:
```
/opt/rh/gcc-toolset-14/root/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = unsigned int; _Alloc = std::allocator<unsigned int>; reference = unsigned int&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Aborted
```
I previously filed this via Semantic Kernel #12712, which points back to ORT #25290, but this repro removes SK completely.
System information
| | Value |
|---|---|
| Hardware | Raspberry Pi Zero 2 W – quad‑core Cortex‑A53 @ 1 GHz, 512 MB RAM |
| OS | Raspberry Pi OS Lite 64‑bit (Debian 12 / bookworm) |
| ONNX Runtime | 1.22.0, `onnxruntime-linux-aarch64-1.22.0.tgz` |
| Build flags in sample | `-std=c++17 -O3 -latomic -lonnxruntime -lpthread` |
| Model | `TaylorAI/bge-micro-v2/onnx/model_quantized.onnx` |
| Works / fails | Works under `docker run --platform linux/arm64/v8 satmandu/raspios:lite` (1 vCPU, qemu-user-static); fails on a real Pi Zero 2 W |
Expected behavior
Program prints something like:
```
Embedding dim: 384
First element: -0.410086
```
Actual behavior on Pi Zero 2 W
```
/opt/rh/gcc-toolset-14/root/usr/include/c++/14/bits/stl_vector.h:1130: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = unsigned int; _Alloc = std::allocator<unsigned int>; reference = unsigned int&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
Aborted
```
Observations
- Compiling and running the same source under Debian bookworm in Docker does not trigger the abort
- It also works via Python; script here: https://gist.github.com/poissoncorp/84268304c71394cdb49b59d513057a4c
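
Since the Docker run only sees a single vCPU while the real board exposes four Cortex‑A53 cores, a session configured like the untested sketch below might help narrow down whether the thread pool or a graph-optimization pass is involved. The config key string is taken from onnxruntime_session_options_config_keys.h; everything else uses the released 1.22.0 C++ API.

```cpp
// Untested diagnostic sketch: pin both thread pools to one thread, turn off
// intra-op spinning, raise log verbosity, and disable graph optimizations.
// If any of these makes the abort disappear, that points at the thread pool
// or a specific optimizer pass rather than the model itself.
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_VERBOSE, "bge-diag"};  // more log detail before the abort

  Ort::SessionOptions so;
  so.SetIntraOpNumThreads(1);                                  // already 1 in the repro
  so.SetInterOpNumThreads(1);                                  // also pin inter-op to 1
  so.AddConfigEntry("session.intra_op.allow_spinning", "0");   // no busy-waiting worker threads
  so.SetGraphOptimizationLevel(ORT_DISABLE_ALL);               // rule out an optimizer pass

  Ort::Session session{env, "model_quantized.onnx", so};
  // ... same tensor setup and Run() call as in main.cpp from the repro below ...
  return 0;
}
```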
To reproduce
repro.sh
```bash
cd /tmp
wget -q https://github.com/microsoft/onnxruntime/releases/download/v1.22.0/onnxruntime-linux-aarch64-1.22.0.tgz
tar -xzf onnxruntime-linux-aarch64-1.22.0.tgz
sudo cp -a /tmp/onnxruntime-linux-aarch64-1.22.0/include/* /usr/local/include/
sudo cp -a /tmp/onnxruntime-linux-aarch64-1.22.0/lib/* /usr/local/lib/
sudo ldconfig
mkdir -p ~/OnnxC
cd ~/OnnxC
sudo wget -O ~/OnnxC/model_quantized.onnx https://huggingface.co/TaylorAI/bge-micro-v2/resolve/main/onnx/model_quantized.onnx
cat > main.cpp <<'CPP'
#include <onnxruntime_cxx_api.h>
#include <array>
#include <vector>
#include <iostream>
int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "bge"};
  Ort::SessionOptions so;
  so.SetIntraOpNumThreads(1);
  Ort::Session session{env, "model_quantized.onnx", so};

  // Hard-coded BERT-style token ids ([CLS] ... [SEP]); mask is all 1s, segment ids all 0s
  std::vector<int64_t> ids = {101, 7592, 8776, 999, 102};
  size_t L = ids.size();
  std::vector<int64_t> mask(L, 1), seg(L, 0);
  const std::array<int64_t, 2> shape = {1, static_cast<int64_t>(L)};

  auto mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  auto make = [&](std::vector<int64_t>& v) {
    return Ort::Value::CreateTensor<int64_t>(mem, v.data(), v.size(),
                                             shape.data(), shape.size());
  };

  std::array<Ort::Value, 3> inputs{make(ids), make(mask), make(seg)};
  const char* in[]  = {"input_ids", "attention_mask", "token_type_ids"};
  const char* out[] = {"last_hidden_state"};

  auto res = session.Run(Ort::RunOptions{nullptr},
                         in, inputs.data(), inputs.size(),
                         out, 1);

  const float* v = res[0].GetTensorData<float>();
  size_t dim = res[0].GetTensorTypeAndShapeInfo().GetShape().back();
  std::cout << "Embedding dim: " << dim
            << "\nFirst element: " << v[0] << '\n';
}
CPP
g++ main.cpp -std=c++17 -I/usr/local/include -L/usr/local/lib \
-lonnxruntime -lpthread -latomic -O3 -o bge_cpp
./bge_cpp
```
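
The input and output names in main.cpp are hard-coded from the model card. The untested sketch below (written against the same 1.22.0 C++ API) prints the names and shapes that model_quantized.onnx actually exposes, which rules out a name or shape mismatch as the source of the out-of-range index:

```cpp
// Untested sketch: dump the model's actual input/output names and input shapes
// so the hard-coded strings in the repro can be checked against the file.
#include <onnxruntime_cxx_api.h>
#include <iostream>

int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "inspect"};
  Ort::Session session{env, "model_quantized.onnx", Ort::SessionOptions{}};
  Ort::AllocatorWithDefaultOptions alloc;

  for (size_t i = 0; i < session.GetInputCount(); ++i) {
    auto name  = session.GetInputNameAllocated(i, alloc);
    auto shape = session.GetInputTypeInfo(i).GetTensorTypeAndShapeInfo().GetShape();
    std::cout << "input  " << i << ": " << name.get() << " [";
    for (auto d : shape) std::cout << d << ' ';   // -1 marks a dynamic dimension
    std::cout << "]\n";
  }
  for (size_t i = 0; i < session.GetOutputCount(); ++i) {
    auto name = session.GetOutputNameAllocated(i, alloc);
    std::cout << "output " << i << ": " << name.get() << '\n';
  }
}
```

It builds with the same g++ line as above, just pointing at this source file instead of main.cpp.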
Urgency
We can't run RavenDB 7.0+ on RPi
Platform
Linux
OS Version
Raspberry Pi OS Lite 64‑bit (Debian 12 / bookworm)
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.22.0
ONNX Runtime API
C++
Architecture
ARM64
Execution Provider
Default CPU
Execution Provider Library Version
No response