Build a Production-Ready Pipeline

Field	Value
Difficulty	Advanced
Estimated Read Time	20-25 minutes
Labels	`production`, `reliability`, `deployment`

This is the capstone chapter. Earlier chapters each covered one concept at a time; here those concepts come together into a single blueprint you can lift into real deployment code. The template's whole purpose is to make explicit the three things that defaults leave implicit: the model's input bounds (so contract violations fail at build time, not mid-stream), the stage naming (so diagnostics stay readable when several models share a process), and the queue policy (so behavior under load is observable rather than mysterious).

The shape is: configure run options, configure and load the model, build a runner, then drive it with a bounded async loop. By the end you will have a Runner executing an async pipeline with production defaults and a push/pull loop that counts successful outputs — the runtime skeleton you would standardize across multiple models in the same application.

Walkthrough

Configure the run options

These are the production runtime defaults. queue_depth = 8 gives a small bounded buffer; overflow_policy = Block makes the producer wait rather than silently drop frames (the safe choice when you care about loss); output_memory = Owned ensures returned tensors survive past the pull. Setting these explicitly — instead of relying on defaults — is what makes behavior under load predictable.

tutorials/017_build_production_pipeline/build_production_pipeline.cpp
simaai::neat::RunOptions run_opt;
run_opt.queue_depth = 8;
run_opt.overflow_policy = simaai::neat::OverflowPolicy::Block;
run_opt.output_memory = simaai::neat::OutputMemory::Owned;
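
To make the difference between blocking and dropping concrete, here is a minimal, self-contained sketch of a bounded queue with two overflow policies. It is an illustration only and assumes nothing about how Neat implements its queue: BoundedQueue, Overflow, and DropNewest are hypothetical names introduced here, and the sketch needs C++17 for std::optional.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical illustration only; these names are not part of the Neat API.
enum class Overflow { Block, DropNewest };

template <typename T>
class BoundedQueue {
public:
  BoundedQueue(std::size_t depth, Overflow policy) : depth_(depth), policy_(policy) {
    assert(depth_ > 0);
  }

  // Returns false only when the item was dropped (DropNewest on a full queue).
  // Under Block, the caller waits until a consumer frees a slot.
  bool push(T item) {
    std::unique_lock<std::mutex> lock(mu_);
    if (q_.size() >= depth_) {
      if (policy_ == Overflow::DropNewest) {
        ++dropped_;
        return false;
      }
      not_full_.wait(lock, [this] { return q_.size() < depth_; });
    }
    q_.push_back(std::move(item));
    not_empty_.notify_one();
    return true;
  }

  // Waits up to `timeout` for an item; std::nullopt means the wait timed out.
  std::optional<T> pull(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mu_);
    if (!not_empty_.wait_for(lock, timeout, [this] { return !q_.empty(); }))
      return std::nullopt;
    T item = std::move(q_.front());
    q_.pop_front();
    not_full_.notify_one();
    return item;
  }

  // Number of items rejected because the queue was full.
  std::size_t dropped() const {
    std::lock_guard<std::mutex> lock(mu_);
    return dropped_;
  }

private:
  const std::size_t depth_;
  const Overflow policy_;
  mutable std::mutex mu_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
  std::deque<T> q_;
  std::size_t dropped_ = 0;
};
```

In this sketch, a queue of depth 8 under Block makes the ninth push wait until a pull frees a slot, so no frame is lost and the cost shows up as producer latency. Under the hypothetical DropNewest policy the same push returns immediately and the loss shows up only in the dropped counter.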

Configure and load the model

Here we make the input contract explicit on the model. Setting preprocess.input_max_width/height/depth to the frame's dimensions means a mismatched input fails at build time with a clear contract error, rather than producing a confusing runtime failure later. name_suffix = "_prod" tags this model's stages so they're identifiable in diagnostics across a multi-model app. We then construct the Model from the archive path and these options.

Model::Options also spells out the preprocessing the model expects — InputKind::Image, RGB color convert, and ImageNet normalization with has_explicit_stats = true — because the C++ path declares preprocessing up front rather than relying on archive defaults.

tutorials/017_build_production_pipeline/build_production_pipeline.cpp
simaai::neat::Model::Options model_opt;
model_opt.preprocess.kind = simaai::neat::InputKind::Image;
model_opt.preprocess.enable = simaai::neat::AutoFlag::On;
model_opt.preprocess.color_convert.input_format = simaai::neat::PreprocessColorFormat::RGB;
model_opt.preprocess.input_max_width = rgb.cols;
model_opt.preprocess.input_max_height = rgb.rows;
model_opt.preprocess.input_max_depth = rgb.channels();
model_opt.preprocess.normalize.enable = simaai::neat::AutoFlag::On;
model_opt.preprocess.normalize.mean = {0.485f, 0.456f, 0.406f};
model_opt.preprocess.normalize.stddev = {0.229f, 0.224f, 0.225f};
model_opt.preprocess.normalize.has_explicit_stats = true;
model_opt.name_suffix = "_prod";

simaai::neat::Model model(model_path, model_opt);
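
The two ideas behind these options, an upper bound on input dimensions and per-channel normalization, can be sketched without the library. The code below is a hedged stand-in, not Neat's implementation: InputContract, check_frame, and normalize_pixel are hypothetical names. It assumes the bounds are maxima (a frame fails only if it exceeds them) and that 8-bit pixels are scaled to [0, 1] before the mean and standard deviation are applied, which is the usual convention for these ImageNet statistics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the declared input contract; not a Neat type.
struct InputContract {
  int max_width;
  int max_height;
  int max_depth;
  std::array<float, 3> mean;
  std::array<float, 3> stddev;
};

// Throws if a frame exceeds the declared bounds, so the mismatch surfaces
// before any frame is pushed. (Assumption: bounds are maxima, not exact sizes.)
void check_frame(const InputContract& c, int width, int height, int depth) {
  if (width > c.max_width || height > c.max_height || depth > c.max_depth) {
    throw std::invalid_argument(
        "frame " + std::to_string(width) + "x" + std::to_string(height) + "x" +
        std::to_string(depth) + " exceeds declared bounds " +
        std::to_string(c.max_width) + "x" + std::to_string(c.max_height) + "x" +
        std::to_string(c.max_depth));
  }
}

// Per-channel normalization: (pixel / 255 - mean) / stddev.
// (Assumption: pixels are scaled to [0, 1] first.)
std::array<float, 3> normalize_pixel(const InputContract& c,
                                     const std::array<std::uint8_t, 3>& px) {
  std::array<float, 3> out{};
  for (std::size_t ch = 0; ch < 3; ++ch) {
    assert(c.stddev[ch] != 0.0f);
    out[ch] = (px[ch] / 255.0f - c.mean[ch]) / c.stddev[ch];
  }
  return out;
}
```

Under those assumptions, a 224x224x3 contract accepts the tutorial's frame and rejects a 640x480 one with a message naming both sizes. The tutorial's fill color (16, 96, 196) normalizes to roughly (-1.844, -0.355, 1.612).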

Build the runner

ModelRouteOptions (C++ Model::RouteOptions) selects which boundaries the route includes — include_input and include_output both true here — and carries the same _prod suffix so the route's elements match the model's naming. We then call model.build(sample, route_options, run_options): the one-call path that takes a Model straight to a runnable Runner, forwarding both the route and run options into the underlying pipeline. The representative sample lets the build lock in negotiated shapes.

The sample is a TensorList built with Tensor::from_cv_mat(rgb, ..., TensorMemory::EV74), which places the input in device-appropriate memory.

tutorials/017_build_production_pipeline/build_production_pipeline.cpp
simaai::neat::Model::RouteOptions sess_opt;
sess_opt.include_input = true;
sess_opt.include_output = true;
sess_opt.name_suffix = "_prod";

auto runner = model.build(
    simaai::neat::TensorList{simaai::neat::Tensor::from_cv_mat(
        rgb, simaai::neat::ImageSpec::PixelFormat::RGB, simaai::neat::TensorMemory::EV74)},
    sess_opt, run_opt);

Drive the production loop

This is the loop a real service runs. For each iteration we push(...) an input — checking the boolean return so a rejected push (under Block, a transient condition) is handled rather than miscounted — then pull(...) with a finite timeout and count the successful outputs. After the loop, close() tears the runner down cleanly. This push-bool / pull-with-timeout / explicit-close pattern is the reliable async skeleton; swap in your real inputs and output handling and the structure stays the same.

tutorials/017_build_production_pipeline/build_production_pipeline.cpp
int ok = 0;
for (int i = 0; i < iters; ++i) {
  if (!runner.push(simaai::neat::TensorList{simaai::neat::Tensor::from_cv_mat(
          rgb, simaai::neat::ImageSpec::PixelFormat::RGB, simaai::neat::TensorMemory::EV74)}))
    continue;
  auto out = runner.pull(/*timeout_ms=*/2000);
  if (!out.empty())
    ++ok;
}
runner.close();
if (ok <= 0)
  throw std::runtime_error("runner produced no outputs");
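
The loop above tracks a single counter. If you want rejected pushes and pull timeouts to be visible separately, one possible extension is sketched below. It is written against a generic runner-like type so that it compiles on its own: LoopStats, drive, and FakeRunner are hypothetical names, and the only thing assumed about the runner is the call pattern the tutorial loop already uses (push returning bool, pull taking a timeout in milliseconds and returning something with empty(), and close()).

```cpp
#include <cassert>
#include <vector>

// Hypothetical counters; the tutorial loop tracks only `ok`.
struct LoopStats {
  int pushed = 0;
  int rejected = 0;
  int timed_out = 0;
  int ok = 0;
};

// Drives the same push / pull-with-timeout / close sequence as the tutorial
// loop, but records why an iteration did not produce an output.
template <typename RunnerLike, typename MakeInput>
LoopStats drive(RunnerLike& runner, MakeInput make_input, int iters, int timeout_ms) {
  assert(iters >= 0);
  LoopStats stats;
  for (int i = 0; i < iters; ++i) {
    if (!runner.push(make_input())) {
      ++stats.rejected;
      continue;
    }
    ++stats.pushed;
    auto out = runner.pull(timeout_ms);
    if (out.empty()) {
      ++stats.timed_out;
      continue;
    }
    ++stats.ok;
  }
  runner.close();
  return stats;
}

// Fake runner for demonstration: rejects every third push, echoes the rest.
struct FakeRunner {
  int calls = 0;
  bool closed = false;
  std::vector<int> pending;

  bool push(int value) {
    if (++calls % 3 == 0)
      return false;
    pending.push_back(value);
    return true;
  }

  std::vector<int> pull(int /*timeout_ms*/) {
    std::vector<int> out;
    out.swap(pending);
    return out;
  }

  void close() { closed = true; }
};
```

With the fake runner and six iterations, the third and sixth pushes are rejected, so the stats come out as pushed=4, rejected=2, timed_out=0, ok=4. Logging these counters per model is one way to keep behavior under load observable, which is the stated goal of the explicit queue policy.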

Run

This chapter needs a model archive (resnet_50). Run the Python and C++ (prebuilt) commands from the Neat install root (the directory that contains share/ and lib/); run the build-from-source commands from the repo root.

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_017_build_production_pipeline \
  --model /tmp/resnet_50.tar.gz --iters 4

C++ (build from source):

./build.sh --target tutorial_017_build_production_pipeline
./build/tutorials-standalone/tutorial_017_build_production_pipeline \
  --model /tmp/resnet_50.tar.gz --iters 4

Expected output:

outputs=4
[OK] 017_build_production_pipeline

(The Python build prints iters=4 ok=4.)

To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.

Full source

tutorials/017_build_production_pipeline/build_production_pipeline.cpp
// Production blueprint: wrap a Model in a Runner with production-grade RunOptions.
//
// Usage:
//   tutorial_017_build_production_pipeline --model /path/to/resnet_50.tar.gz [--iters 4]

#include "neat.h"

#include <opencv2/core.hpp>

#include <iostream>
#include <stdexcept>
#include <string>

namespace {

bool get_arg(int argc, char** argv, const std::string& key, std::string& out) {
  for (int i = 1; i + 1 < argc; ++i) {
    if (key == argv[i]) {
      out = argv[i + 1];
      return true;
    }
  }
  return false;
}

int parse_int_arg(int argc, char** argv, const std::string& key, int def) {
  std::string value;
  if (!get_arg(argc, argv, key, value))
    return def;
  return std::stoi(value);
}

} // namespace

int main(int argc, char** argv) {
  try {
    std::string model_path;
    if (!get_arg(argc, argv, "--model", model_path)) {
      std::cerr << "Usage: tutorial_017_build_production_pipeline --model <path> [--iters <n>]\n";
      return 1;
    }
    const int iters = parse_int_arg(argc, argv, "--iters", 4);

    cv::Mat rgb(224, 224, CV_8UC3, cv::Scalar(16, 96, 196));
    if (!rgb.isContinuous())
      rgb = rgb.clone();

    // CORE LOGIC
    // Production defaults: bounded queue, blocking overflow, owned output memory.
    // Model::build returns a Runner that owns the async pipeline; measure the
    // workload explicitly when you need performance data.
    simaai::neat::RunOptions run_opt;
    run_opt.queue_depth = 8;
    run_opt.overflow_policy = simaai::neat::OverflowPolicy::Block;
    run_opt.output_memory = simaai::neat::OutputMemory::Owned;

    simaai::neat::Model::Options model_opt;
    model_opt.preprocess.kind = simaai::neat::InputKind::Image;
    model_opt.preprocess.enable = simaai::neat::AutoFlag::On;
    model_opt.preprocess.color_convert.input_format = simaai::neat::PreprocessColorFormat::RGB;
    model_opt.preprocess.input_max_width = rgb.cols;
    model_opt.preprocess.input_max_height = rgb.rows;
    model_opt.preprocess.input_max_depth = rgb.channels();
    model_opt.preprocess.normalize.enable = simaai::neat::AutoFlag::On;
    model_opt.preprocess.normalize.mean = {0.485f, 0.456f, 0.406f};
    model_opt.preprocess.normalize.stddev = {0.229f, 0.224f, 0.225f};
    model_opt.preprocess.normalize.has_explicit_stats = true;
    model_opt.name_suffix = "_prod";

    simaai::neat::Model model(model_path, model_opt);

    simaai::neat::Model::RouteOptions sess_opt;
    sess_opt.include_input = true;
    sess_opt.include_output = true;
    sess_opt.name_suffix = "_prod";

    auto runner = model.build(
        simaai::neat::TensorList{simaai::neat::Tensor::from_cv_mat(
            rgb, simaai::neat::ImageSpec::PixelFormat::RGB, simaai::neat::TensorMemory::EV74)},
        sess_opt, run_opt);

    int ok = 0;
    for (int i = 0; i < iters; ++i) {
      if (!runner.push(simaai::neat::TensorList{simaai::neat::Tensor::from_cv_mat(
              rgb, simaai::neat::ImageSpec::PixelFormat::RGB, simaai::neat::TensorMemory::EV74)}))
        continue;
      auto out = runner.pull(/*timeout_ms=*/2000);
      if (!out.empty())
        ++ok;
    }
    runner.close();
    if (ok <= 0)
      throw std::runtime_error("runner produced no outputs");

    std::cout << "outputs=" << ok << "\n";
    std::cout << "[OK] 017_build_production_pipeline\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}
