Build an Inference Pipeline
| Field | Value |
|---|---|
| Difficulty | Beginner |
| Estimated Read Time | 5 minutes |
| Labels | graph, build, run, pipeline |
Chapter 001 ran a model in three lines. That convenience hides a two-part lifecycle that every non-trivial Neat program uses directly: you first describe a pipeline as a Graph, then build that description into a runnable Run. This chapter makes that lifecycle visible by composing the smallest possible pipeline — one input node wired to one output node, no model in between — and pushing a single frame through it.
The payoff is conceptual: a Graph is a reusable definition you build once and execute many times, not a one-off call. By the end you will have created a graph, turned it into a runnable pipeline, and read back the rank of the output tensor — proving the frame made it through.
Walkthrough
Describe the input
Before wiring up nodes, declare what a frame looks like. InputOptions is that contract: pixel format, width/height, channel depth, and whether the runtime timestamps each buffer. The input node built from these options validates incoming frames against the shape the pipeline expects.
C++ additionally sets is_live = false to mark this as a non-live (file/tensor) source.
simaai::neat::InputOptions in;
in.format = "RGB";
in.width = width;
in.height = height;
in.depth = 3;
in.is_live = false;
in.do_timestamp = true;
Compose the graph
Now build the structure. A fresh Graph is an empty composition surface, and add() appends nodes in order. We add exactly two — an input node (configured above) and a bare output node. That is the entire topology: frames enter at the input and leave at the output, with nothing in between. This is the seam where, in later chapters, a model or preprocessing stage slots in.
Nodes come from simaai::neat::nodes::Input(...) and nodes::Output().
simaai::neat::Graph graph;
graph.add(simaai::neat::nodes::Input(in));
graph.add(simaai::neat::nodes::Output());
Build the pipeline
build() is the transition from description to executable. It resolves the added nodes into a concrete pipeline, validates the input/output contracts against a real sample, and creates a reusable Run handle. We pass a representative frame so build() can lock in the negotiated tensor shapes; the next step uses Run::run(...) for a deterministic one-at-a-time call.
The sample frame is a cv::Mat, and run_opt.output_memory = Owned asks the runtime to return owned output buffers.
auto run = graph.build(std::vector<cv::Mat>{input}, run_opt);
Run a frame and read the result
With a Run in hand, run() pushes one frame and pulls one result synchronously. Because there's no model, the output mirrors the input contract — so reading the tensor's rank is enough to confirm a frame completed the round trip. In real pipelines this same run()/push/pull surface is how you drive inference.
run() returns a TensorList; read sample.front().shape.size().
simaai::neat::TensorList sample = run.run(std::vector<cv::Mat>{input}, /*timeout_ms=*/1000);
Run
Run it and you should see the rank of the output tensor printed to stdout. Run the Python and C++ (prebuilt) commands from the Neat install root (the directory that contains share/ and lib/); run the build from source commands from the repo root. This chapter needs no model archive.
C++ (prebuilt):
./lib/sima-neat/tutorials/tutorial_004_build_inference_pipeline \
--width 320 --height 240
C++ (build from source):
./build.sh --target tutorial_004_build_inference_pipeline
./build/tutorials-standalone/tutorial_004_build_inference_pipeline \
--width 320 --height 240
Expected output:
tensor_rank=3
[OK] 004_build_inference_pipeline
(The Python build prints output_rank=....) To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.
In Practice
How build/run, execution modes, the push/pull surface, and RunOptions fit together once you move past a single sync call.
Build vs run
Graph::build(...)constructs the pipeline and returns aRunhandle for push/pull control.Graph::run(...)is the synchronous convenience path: it builds (if needed), pushes one input, and pulls one output.
Sync vs async
- Use
Graph::run(...)for a simple one-shot call. - Use
Graph::build(...)when you want a reusable runner and explicitpush(...)/pull(...)control — see Run Inference Asynchronously.
Push/pull API
Run exposes:
push(...)/try_push(...)for inputs (cv::Mat,Tensor, orSample).pull(...),pull_tensor(...),pull_tensor_or_throw(...)for outputs.
If you need output metadata (timestamps, stream ids), use pull() to get a Sample. If you only need the tensor payload, use pull_tensor().
RunOptions (simple API)
Common knobs:
preset: latency/safety profile (Realtime,Balanced,Reliable).queue_depth: runtime queue depth.overflow_policy: queue overflow behavior (Block,KeepLatest,DropIncoming).output_memory: output ownership policy (Auto,ZeroCopy,Owned).on_input_drop: callback hook for dropped input events.
For queue-depth, overflow, and measurement under load, see Tune Throughput and Queue Depth.