gRPC Interface Architecture

Overview

The cuOpt remote execution system uses gRPC for client-server communication. The interface supports arbitrarily large optimization problems (multi-GB) through a chunked array transfer protocol that uses only unary (request-response) RPCs — no bidirectional streaming.

All client-server serialization uses protocol buffers generated by protoc and grpc_cpp_plugin. The internal server-to-worker pipe uses protobuf for metadata headers and raw byte transfer for bulk array data (see Security Notes).

Directory Layout

All gRPC-related C++ source lives under a single tree:

cpp/src/grpc/
├── cuopt_remote.proto              # Base protobuf messages (job status, settings, etc.)
├── cuopt_remote_service.proto      # Service definition + messages (SubmitJob, ChunkedUpload, Incumbent, etc.)
├── grpc_problem_mapper.{hpp,cpp}   # CPU problem ↔ proto (incl. chunked header)
├── grpc_solution_mapper.{hpp,cpp}  # LP/MIP solution ↔ proto (unary + chunked)
├── grpc_settings_mapper.{hpp,cpp}  # PDLP/MIP settings ↔ proto
├── grpc_service_mapper.{hpp,cpp}   # Request/response builders (status, cancel, stream logs, etc.)
├── client/
│   ├── grpc_client.{hpp,cpp}       # High-level client: connect, submit, poll, get result
│   └── solve_remote.cpp            # solve_lp_remote / solve_mip_remote (uses grpc_client)
└── server/
    ├── grpc_server_main.cpp        # main(), argument parsing, gRPC server setup
    ├── grpc_service_impl.cpp       # CuOptRemoteServiceImpl — all RPC handlers
    ├── grpc_server_types.hpp       # Shared types, globals, forward declarations
    ├── grpc_field_element_size.hpp # ArrayFieldId → element byte size (codegen target)
    ├── grpc_pipe_serialization.hpp # Pipe I/O: protobuf headers + raw byte arrays (request/result)
    ├── grpc_incumbent_proto.hpp    # Incumbent proto build/parse (codegen target)
    ├── grpc_worker.cpp             # worker_process(), incumbent callback, store_simple_result
    ├── grpc_worker_infra.cpp       # Pipes, spawn, wait_for_workers, mark_worker_jobs_failed
    ├── grpc_server_threads.cpp     # result_retrieval, incumbent_retrieval, session_reaper
    └── grpc_job_management.cpp     # Pipe I/O, submit_job_async, check_status, cancel, etc.

  • Protos: Live in cpp/src/grpc/. CMake generates C++ in the build dir (cuopt_remote.pb.h, cuopt_remote_service.pb.h, cuopt_remote_service.grpc.pb.h).
  • Mappers: Shared by client and server; convert between host C++ types and protobuf. Used for unary and chunked paths.
  • Client: Solver-level utility (not public API). Used by solve_lp_remote/solve_mip_remote and tests.
  • Server: Standalone executable cuopt_grpc_server. See GRPC_SERVER_ARCHITECTURE.md for process model and file roles.

Protocol Files

File                                      Purpose
cpp/src/grpc/cuopt_remote.proto           Message definitions (problems, settings, solutions, field IDs)
cpp/src/grpc/cuopt_remote_service.proto   gRPC service definition (RPCs)

Generated code is placed in the CMake build directory (not checked into source).

Service Interface

service CuOptRemoteService {
  // Job submission (small problems, single message)
  rpc SubmitJob(SubmitJobRequest) returns (SubmitJobResponse);

  // Chunked upload (large problems, multiple unary RPCs)
  rpc StartChunkedUpload(StartChunkedUploadRequest) returns (StartChunkedUploadResponse);
  rpc SendArrayChunk(SendArrayChunkRequest) returns (SendArrayChunkResponse);
  rpc FinishChunkedUpload(FinishChunkedUploadRequest) returns (SubmitJobResponse);

  // Job management
  rpc CheckStatus(StatusRequest) returns (StatusResponse);
  rpc CancelJob(CancelRequest) returns (CancelResponse);
  rpc DeleteResult(DeleteRequest) returns (DeleteResponse);

  // Result retrieval (small results, single message)
  rpc GetResult(GetResultRequest) returns (ResultResponse);

  // Chunked download (large results, multiple unary RPCs)
  rpc StartChunkedDownload(StartChunkedDownloadRequest) returns (StartChunkedDownloadResponse);
  rpc GetResultChunk(GetResultChunkRequest) returns (GetResultChunkResponse);
  rpc FinishChunkedDownload(FinishChunkedDownloadRequest) returns (FinishChunkedDownloadResponse);

  // Blocking wait (returns status only, use GetResult afterward)
  rpc WaitForCompletion(WaitRequest) returns (WaitResponse);

  // Real-time streaming
  rpc StreamLogs(StreamLogsRequest) returns (stream LogMessage);
  rpc GetIncumbents(IncumbentRequest) returns (IncumbentResponse);
}

Chunked Array Transfer Protocol

Why Chunking?

gRPC has per-message size limits (configurable, default set to 256 MiB in cuOpt), and protobuf has a hard 2 GB serialization limit. Optimization problems and their solutions can exceed several gigabytes, so a chunked transfer mechanism is needed.

The protocol uses only unary RPCs (no bidirectional streaming), which simplifies error handling, load balancing, and proxy compatibility.

Upload Protocol (Large Problems)

When the estimated serialized problem size exceeds 75% of max_message_bytes, the client splits large arrays into chunks and sends them via multiple unary RPCs:

Client                                          Server
  |                                               |
  |-- StartChunkedUpload(header, settings) -----> |
  |<-- upload_id, max_message_bytes -------------- |
  |                                               |
  |-- SendArrayChunk(upload_id, field, data) ----> |
  |<-- ok ---------------------------------------- |
  |                                               |
  |-- SendArrayChunk(upload_id, field, data) ----> |
  |<-- ok ---------------------------------------- |
  |           ...                                 |
  |                                               |
  |-- FinishChunkedUpload(upload_id) ------------> |
  |<-- job_id ------------------------------------ |

Key features:

  • StartChunkedUpload sends a ChunkedProblemHeader with all scalar fields and array metadata (ArrayDescriptor for each large array: field ID, total elements, element size)
  • Each SendArrayChunk carries one chunk of one array, identified by ArrayFieldId and element_offset
  • The server reports max_message_bytes so the client can adapt chunk sizing
  • FinishChunkedUpload triggers server-side reassembly and job submission
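
The per-array chunk loop can be sketched as follows. This is an illustrative stand-in, not the actual client code: `ArrayChunk` and `plan_chunks` are hypothetical names, with a plain `int` standing in for `ArrayFieldId`, and each planned chunk corresponding to one unary SendArrayChunk call.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One SendArrayChunk payload: which array, where it starts, how much it carries.
struct ArrayChunk {
  int field_id;            // stands in for ArrayFieldId
  int64_t element_offset;  // offset into the full array, in elements
  int64_t num_elements;    // elements carried by this chunk
};

// Split one large array into chunks whose payload fits chunk_size_bytes.
// Each entry would be sent as one unary SendArrayChunk RPC.
std::vector<ArrayChunk> plan_chunks(int field_id, int64_t total_elements,
                                    int64_t element_size_bytes,
                                    int64_t chunk_size_bytes) {
  std::vector<ArrayChunk> chunks;
  const int64_t elements_per_chunk = chunk_size_bytes / element_size_bytes;
  for (int64_t off = 0; off < total_elements; off += elements_per_chunk) {
    const int64_t n = std::min(elements_per_chunk, total_elements - off);
    chunks.push_back({field_id, off, n});
  }
  return chunks;
}
```

For example, a 100-element array of 8-byte values with a 64-byte chunk budget yields 13 chunks, the last carrying the 4 remaining elements.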

Download Protocol (Large Results)

When the result exceeds the gRPC max message size, the client fetches it via chunked unary RPCs (mirrors the upload pattern):

Client                                           Server
  |                                                |
  |-- StartChunkedDownload(job_id) --------------> |
  |<-- download_id, ChunkedResultHeader ---------- |
  |                                                |
  |-- GetResultChunk(download_id, field, off) ----> |
  |<-- data bytes --------------------------------- |
  |                                                |
  |-- GetResultChunk(download_id, field, off) ----> |
  |<-- data bytes --------------------------------- |
  |           ...                                  |
  |                                                |
  |-- FinishChunkedDownload(download_id) ---------> |
  |<-- ok ----------------------------------------- |

Key features:

  • ChunkedResultHeader carries all scalar fields (termination status, objectives, residuals, solve time, warm start scalars) plus ResultArrayDescriptor entries for each array (solution vectors, warm start arrays)
  • Each GetResultChunk fetches a slice of one array, identified by ResultFieldId and element_offset
  • FinishChunkedDownload releases the server-side download session state
  • LP results include PDLP warm start data (9 arrays + 8 scalars) for subsequent warm-started solves
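
Reassembly on the receiving side can be sketched like this: the destination buffer is sized from the ResultArrayDescriptor in the header, and each chunk lands at its element_offset. `place_chunk` is a hypothetical helper for illustration; the real mapper writes into typed solution/warm-start vectors.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Copy one received GetResultChunk payload into its slot in the full array.
// Chunks may arrive in any order; offsets make placement order-independent.
void place_chunk(std::vector<double>& dest, int64_t element_offset,
                 const std::vector<double>& chunk_data) {
  std::memcpy(dest.data() + element_offset, chunk_data.data(),
              chunk_data.size() * sizeof(double));
}
```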

Automatic Routing

The client handles size-based routing transparently:

  1. Upload: Estimate serialized problem size
    • Below 75% of max_message_bytes → unary SubmitJob
    • Above threshold → StartChunkedUpload + SendArrayChunk + FinishChunkedUpload
  2. Download: Check result_size_bytes from CheckStatus
    • Below max_message_bytes → unary GetResult
    • Above limit (or RESOURCE_EXHAUSTED) → chunked download RPCs
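
The two routing decisions above reduce to simple threshold checks. A minimal sketch (function and enum names are illustrative, not the client's actual API; the exact comparison at the boundary may differ):

```cpp
#include <cstdint>

enum class UploadPath { Unary, Chunked };
enum class DownloadPath { Unary, Chunked };

// Upload: chunk once the estimate crosses 75% of the per-message limit.
UploadPath choose_upload_path(int64_t estimated_problem_bytes,
                              int64_t max_message_bytes) {
  return estimated_problem_bytes > (max_message_bytes * 3) / 4
             ? UploadPath::Chunked : UploadPath::Unary;
}

// Download: chunk once the reported result no longer fits one message.
DownloadPath choose_download_path(int64_t result_size_bytes,
                                  int64_t max_message_bytes) {
  return result_size_bytes >= max_message_bytes
             ? DownloadPath::Chunked : DownloadPath::Unary;
}
```

With the default 256 MiB limit, a 100 MiB problem goes through unary SubmitJob while a 200 MiB problem (above the 192 MiB threshold) takes the chunked path.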

Error Handling

gRPC Status Codes

Code Meaning Client Action
OK Success Process result
NOT_FOUND Job ID not found Check job ID
RESOURCE_EXHAUSTED Message too large Use chunked transfer
CANCELLED Job was cancelled Handle gracefully
DEADLINE_EXCEEDED Timeout Retry or increase timeout
UNAVAILABLE Server not reachable Retry with backoff
INTERNAL Server error Report to user
INVALID_ARGUMENT Bad request Fix request

Connection Handling

  • Client detects context->IsCancelled() for graceful disconnect
  • Server cleans up job state on client disconnect during upload
  • Automatic reconnection is NOT built-in (caller should retry)

Completion Strategy

The solve_lp and solve_mip methods poll CheckStatus every poll_interval_ms until the job reaches a terminal state (COMPLETED/FAILED/CANCELLED) or timeout_seconds is exceeded. During polling, MIP incumbent callbacks are invoked on the main thread.

The WaitForCompletion RPC is available as a public async API primitive for callers managing jobs directly, but it is not used by the convenience solve_* methods because polling provides timeout protection and enables incumbent callbacks.
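
The polling strategy can be sketched as below. This is a simplified stand-alone illustration, not the client implementation: `check_status` stands in for the CheckStatus RPC, statuses are modeled as strings, and incumbent-callback dispatch is omitted.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <thread>

// Terminal states per the completion strategy above.
bool is_terminal(const std::string& status) {
  return status == "COMPLETED" || status == "FAILED" || status == "CANCELLED";
}

// Poll every poll_interval_ms until the job is terminal or the deadline
// passes. Returns the final status, or "TIMEOUT" if the deadline wins.
std::string poll_until_done(const std::function<std::string()>& check_status,
                            int poll_interval_ms, int timeout_seconds) {
  const auto deadline = std::chrono::steady_clock::now() +
                        std::chrono::seconds(timeout_seconds);
  for (;;) {
    std::string status = check_status();
    if (is_terminal(status)) return status;
    if (std::chrono::steady_clock::now() >= deadline) return "TIMEOUT";
    std::this_thread::sleep_for(std::chrono::milliseconds(poll_interval_ms));
  }
}
```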

Client API (grpc_client_t)

Configuration

struct grpc_client_config_t {
  std::string server_address = "localhost:8765";
  int poll_interval_ms       = 1000;
  int timeout_seconds        = 3600;  // Max wait for job completion (1 hour)
  bool stream_logs           = false; // Stream solver logs from server

  // Callbacks
  std::function<void(const std::string&)> log_callback;
  std::function<void(const std::string&)> debug_log_callback;  // Internal client debug messages
  std::function<bool(int64_t, double, const std::vector<double>&)> incumbent_callback;
  int incumbent_poll_interval_ms = 1000;

  // TLS configuration
  bool enable_tls = false;
  std::string tls_root_certs;   // CA certificate (PEM)
  std::string tls_client_cert;  // Client certificate (mTLS)
  std::string tls_client_key;   // Client private key (mTLS)

  // Transfer configuration
  int64_t max_message_bytes = 256 * 1024 * 1024;  // 256 MiB
  int64_t chunk_size_bytes  = 16 * 1024 * 1024;   // 16 MiB per chunk
  // Chunked upload threshold is computed as 75% of max_message_bytes.
  bool enable_transfer_hash = false;               // FNV-1a hash logging
};
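
For reference, the FNV-1a hash named by enable_transfer_hash can be sketched as the standard 64-bit variant below. This is only an illustration of the algorithm; the width and usage in cuOpt's transfer logging are not specified here.

```cpp
#include <cstddef>
#include <cstdint>

// Standard FNV-1a, 64-bit: xor each byte into the state, then multiply
// by the FNV prime. Cheap enough to run over multi-GB transfers.
uint64_t fnv1a_64(const void* data, size_t len) {
  const auto* p = static_cast<const unsigned char*>(data);
  uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (size_t i = 0; i < len; ++i) {
    h ^= p[i];
    h *= 1099511628211ULL;               // FNV prime
  }
  return h;
}
```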

Synchronous Operations

// Blocking solve — handles chunked transfer automatically
auto lp_result  = client.solve_lp(problem, settings);
auto mip_result = client.solve_mip(problem, settings, enable_incumbents);

Asynchronous Operations

// Submit and get job ID
auto submit = client.submit_lp(problem, settings);
std::string job_id = submit.job_id;

// Poll for status
auto status = client.check_status(job_id);

// Get result when ready
auto result = client.get_lp_result<int, double>(job_id);

// Cancel or delete
client.cancel_job(job_id);
client.delete_job(job_id);

Real-Time Streaming

// Log streaming (callback-based)
client.stream_logs(job_id, 0, [](const std::string& line, bool done) {
  std::cout << line;
  return true;  // continue streaming
});

// Incumbent polling (during MIP solve)
config.incumbent_callback = [](int64_t idx, double obj, const auto& sol) {
  std::cout << "Incumbent " << idx << ": " << obj << "\n";
  return true;  // return false to cancel solve
};

Environment Variables

Variable                  Default     Description
CUOPT_REMOTE_HOST         localhost   Server hostname for remote solves
CUOPT_REMOTE_PORT         8765        Server port for remote solves
CUOPT_CHUNK_SIZE          16 MiB      Override chunk_size_bytes
CUOPT_MAX_MESSAGE_BYTES   256 MiB     Override max_message_bytes
CUOPT_GRPC_DEBUG          0           Enable client debug/throughput logging (0 or 1)
CUOPT_TLS_ENABLED         0           Enable TLS for client connections (0 or 1)
CUOPT_TLS_ROOT_CERT       (none)      Path to PEM root CA file (server verification)
CUOPT_TLS_CLIENT_CERT     (none)      Path to PEM client certificate file (for mTLS)
CUOPT_TLS_CLIENT_KEY      (none)      Path to PEM client private key file (for mTLS)
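
Reading a numeric override like CUOPT_CHUNK_SIZE can be sketched as below. This is a hedged illustration: `env_bytes_or` is a hypothetical helper, it assumes the value is a plain byte count in decimal, and the real client may accept a different format.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>

// Return the environment override as a byte count, or the built-in
// default when the variable is unset, empty, or not a positive integer.
int64_t env_bytes_or(const char* name, int64_t fallback) {
  const char* v = std::getenv(name);
  if (!v || *v == '\0') return fallback;
  char* end = nullptr;
  const long long parsed = std::strtoll(v, &end, 10);
  return (end && *end == '\0' && parsed > 0) ? parsed : fallback;
}
```

Typical use would be `env_bytes_or("CUOPT_CHUNK_SIZE", 16 * 1024 * 1024)` to fall back to the 16 MiB default.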

TLS Configuration

Server-Side TLS

./cuopt_grpc_server --port 8765 \
  --tls \
  --tls-cert server.crt \
  --tls-key server.key

Mutual TLS (mTLS)

Server requires client certificate:

./cuopt_grpc_server --port 8765 \
  --tls \
  --tls-cert server.crt \
  --tls-key server.key \
  --tls-root ca.crt \
  --require-client-cert

Client provides certificate via environment variables (applies to Python, cuopt_cli, and C API):

export CUOPT_TLS_ENABLED=1
export CUOPT_TLS_ROOT_CERT=ca.crt
export CUOPT_TLS_CLIENT_CERT=client.crt
export CUOPT_TLS_CLIENT_KEY=client.key

Or programmatically via grpc_client_config_t:

config.enable_tls = true;
config.tls_root_certs = read_file("ca.crt");
config.tls_client_cert = read_file("client.crt");
config.tls_client_key = read_file("client.key");
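
Note that the TLS fields hold certificate contents (PEM text), not file paths; read_file in the snippet above is not part of the client API. A minimal stand-in:

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Slurp a PEM file into a string, since grpc_client_config_t expects
// the certificate bytes themselves rather than a path.
std::string read_file(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  if (!in) throw std::runtime_error("cannot open " + path);
  std::ostringstream ss;
  ss << in.rdbuf();
  return ss.str();
}
```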

Message Size Limits

Configuration                  Default                    Notes
Server --max-message-mb        256 MiB                    Per-message limit (also --max-message-bytes for exact byte values)
Server clamping                [4 KiB, ~2 GiB]            Enforced at startup to stay within protobuf's serialization limit
Client max_message_bytes       256 MiB                    Clamped to [4 MiB, ~2 GiB] at construction
Chunk size                     16 MiB                     Payload per SendArrayChunk/GetResultChunk
Chunked threshold              75% of max_message_bytes   Problems above this use chunked upload (e.g. 192 MiB when max is 256 MiB)

Chunked transfer allows unlimited total payload size; only individual chunks must fit within the per-message limit. Neither client nor server allows "unlimited" message size — both clamp to the protobuf 2 GiB ceiling.
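
The client-side clamp described above can be sketched as follows (an illustration of the documented [4 MiB, ~2 GiB] range; the function name and the exact upper bound are assumptions, and the server applies the same idea with its 4 KiB floor):

```cpp
#include <algorithm>
#include <cstdint>

// Keep a configured message size inside the documented client range:
// at least 4 MiB, and strictly under protobuf's 2 GiB serialization cap.
int64_t clamp_client_max_message_bytes(int64_t requested) {
  const int64_t lo = 4LL * 1024 * 1024;             // 4 MiB floor
  const int64_t hi = 2LL * 1024 * 1024 * 1024 - 1;  // just under 2 GiB
  return std::clamp(requested, lo, hi);
}
```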

Security Notes

  1. gRPC Layer: All client-server message parsing uses protobuf-generated code
  2. Internal Pipe: The server-to-worker pipe uses protobuf for metadata headers and length-prefixed raw read()/write() for bulk array data. This pipe is internal to the server process (main → forked worker) and not exposed to clients.
  3. Standard gRPC Security: HTTP/2 framing, flow control, standard status codes
  4. TLS Support: Optional encryption with mutual authentication
  5. Input Validation: Server validates all incoming gRPC messages before processing

Data Flow Summary

┌─────────┐                                    ┌─────────────┐
│ Client  │                                    │   Server    │
│         │  SubmitJob (small)                 │             │
│ problem ├───────────────────────────────────►│ deserialize │
│         │  -or- Chunked Upload (large)       │      ↓      │
│         │                                    │   worker    │
│         │                                    │   process   │
│         │  GetResult (small)                 │      ↓      │
│ solution│◄───────────────────────────────────┤  serialize  │
│         │  -or- Chunked Download (large)     │             │
└─────────┘                                    └─────────────┘

See GRPC_SERVER_ARCHITECTURE.md for details on internal server architecture.

Code Generation

The cpp/codegen directory (optional) generates conversion snippets from field_registry.yaml. Targets include:

  • Settings: PDLP/MIP settings ↔ proto (replacing hand-written blocks in the settings mapper).
  • Result header/scalars/arrays: ChunkedResultHeader and array field handling.
  • Field element size: grpc_field_element_size.hpp (ArrayFieldId → byte size).
  • Incumbent: grpc_incumbent_proto.hpp (build/parse Incumbent messages).

Adding or changing a proto field can thus be done by editing the YAML and regenerating, rather than editing mapper code by hand.

Build

  • libcuopt: Includes the mapper .cpp files, grpc_client.cpp, and solve_remote.cpp. Requires CUOPT_ENABLE_GRPC, gRPC, and protobuf. Proto generation is done by CMake custom commands that depend on the .proto files in cpp/src/grpc/.
  • cuopt_grpc_server: Executable built from cpp/src/grpc/server/*.cpp; links libcuopt, gRPC, protobuf.

Tests that use the client (e.g. grpc_client_test.cpp, grpc_integration_test.cpp) get cpp/src/grpc and cpp/src/grpc/client in their include path.