
Diagnose and Profile a Pipeline


Field | Value
Difficulty | Intermediate
Estimated Read Time | <10 minutes
Labels | diagnostics, debugging, observability

When a pipeline misbehaves, the temptation is to jump straight into element-level debugging. This chapter teaches the cheaper first move: a repeatable triage pass that answers three questions in order — Is the graph contract valid? Does one run succeed? What do the runtime diagnostics say? It catches most misconfiguration in seconds, before it becomes a multi-hour session, and it works on the same minimal Input → Output graph you already know from chapter 004.

By the end you will have validated a graph's contract, run a single measured frame, and printed the measurement report that tells you whether the pipeline is healthy.

Walkthrough

Validate the contract

validate() is a contract-level check that runs before build(). It exercises the node order, caps, and backend parse path without streaming any data, and returns a report carrying a canonical error_code. An empty/ok code means the graph is structurally sound; anything else buckets the failure (see the error taxonomy below) so you know where to look. Running this first means you never waste time debugging runtime behavior on a graph that was never going to build.

tutorials/012_diagnose_a_pipeline/diagnose_a_pipeline.cpp
// validate() checks the Graph before build() and prints any caps problems.
auto report = graph.validate();
std::cout << "validate.error_code=" << report.error_code << "\n";
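
One way to act on the returned code is to split it at the first dot and branch on the family. The sketch below is a stand-alone illustration, not NEAT code: `error_family` and `should_build` are hypothetical helpers, and they assume the code is a plain string whose family prefix ends at the first `.`, with an empty string or the literal `ok` meaning success.

```cpp
#include <string>

// Hypothetical helper (not a NEAT API): returns the family prefix of a
// dotted error code, e.g. "misconfig" for "misconfig.caps".
// Assumption: an empty string or the literal "ok" means "no error".
std::string error_family(const std::string& code) {
  if (code.empty() || code == "ok") return "ok";
  const std::string::size_type dot = code.find('.');
  return dot == std::string::npos ? code : code.substr(0, dot);
}

// Only proceed to build() when the contract check came back clean.
bool should_build(const std::string& code) {
  return error_family(code) == "ok";
}
```

With a helper like this, `should_build(report.error_code)` could gate the build step, and the family string tells you which group of rows in the error taxonomy to read first.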

Run one measured frame

Next, build and run a single deterministic frame inside a start_measurement() window. output_memory = Owned asks for owned output buffers so the result stays valid after the call. One frame is enough: if it succeeds, the pipeline is live; if it throws, the exception carries a structured report you can bucket the same way as validate().

tutorials/012_diagnose_a_pipeline/diagnose_a_pipeline.cpp
// Build a reusable runner and measure the caller-owned workload.
simaai::neat::RunOptions run_opt;
run_opt.output_memory = simaai::neat::OutputMemory::Owned;
auto run = graph.build(std::vector<cv::Mat>{rgb}, run_opt);
simaai::neat::MeasureOptions measure_opt;
measure_opt.title = "tutorial 011 diagnosis";
auto scope = run.start_measurement(measure_opt);
simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
if (out.empty())
  throw std::runtime_error("missing output tensor");
const simaai::neat::MeasureReport measured = scope.stop();
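
The "bucket it the same way" idea can be sketched without the library. Everything below is a stand-in: `StubReport`, `StubError`, and `triage_one_frame` are hypothetical names that only mimic the shape of an exception carrying a report with an `error_code`, and the `"unknown"` label for other exceptions is an assumption of this sketch.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in types for illustration only; the real NeatError and GraphReport
// live in the NEAT headers and carry more fields than this.
struct StubReport {
  std::string error_code;
};

class StubError : public std::runtime_error {
public:
  StubError(const std::string& what, StubReport report)
      : std::runtime_error(what), report_(std::move(report)) {}
  const StubReport& report() const { return report_; }

private:
  StubReport report_;
};

// Runs one frame via `run_once` and returns the error code to bucket:
// empty on success, the structured code when the stand-in error is thrown,
// and "unknown" (a label chosen for this sketch) for any other exception.
template <typename Fn>
std::string triage_one_frame(Fn&& run_once) {
  try {
    run_once();
    return "";
  } catch (const StubError& e) {
    return e.report().error_code;
  } catch (const std::exception&) {
    return "unknown";
  }
}
```

The design point is that success, a structured failure, and an unstructured failure all collapse into one string, so the same triage code can consume results from both the validate step and the run step.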

Read the runtime diagnostics

With one run on record, the MeasureReport summarizes the pipeline's health: counters (inputs_enqueued, outputs_pulled, drops), end-to-end latency, node metrics, plugin/kernel timing, edge timing, and optional power. The output of MeasureReport::to_text() is the baseline to capture before escalating to the probes and DOT graphs described under In Practice.

tutorials/012_diagnose_a_pipeline/diagnose_a_pipeline.cpp
// Post-run diagnostics come from the measurement report.
std::cout << "measure.inputs_enqueued=" << measured.counters.inputs_enqueued
          << " outputs_pulled=" << measured.counters.outputs_pulled << "\n";
std::cout << "measure.text_size=" << measured.to_text().size() << "\n";
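
For the one-frame Input → Output case, a first reading of the counters can be reduced to a small rule. The sketch below uses a stand-in struct rather than the real report type: the field names follow the text above, while the integer types, the `counter_verdict` helper, and the rule itself are assumptions made for illustration.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the counters named above. The field names follow the text;
// the integer types are an assumption.
struct StubCounters {
  std::uint64_t inputs_enqueued = 0;
  std::uint64_t outputs_pulled = 0;
  std::uint64_t drops = 0;
};

// Hypothetical health rule for a one-frame run: something went in, nothing
// was dropped, and at least as many outputs came out as inputs went in.
std::string counter_verdict(const StubCounters& c) {
  if (c.inputs_enqueued == 0) return "no input was enqueued";
  if (c.drops > 0) return "frames were dropped";
  if (c.outputs_pulled < c.inputs_enqueued) return "output is missing";
  return "healthy";
}
```

Pipelines that legitimately emit fewer outputs than inputs would need a different rule; this one is only meant for the minimal graph used in this chapter.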

Run

Run it and you should see the validate code and measurement report printed to stdout. Run the Python and C++ (prebuilt) commands from the Neat install root (the directory that contains share/ and lib/); run the build from source commands from the repo root. This chapter needs no model archive.

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_012_diagnose_a_pipeline

C++ (build from source):

./build.sh --target tutorial_012_diagnose_a_pipeline
./build/tutorials-standalone/tutorial_012_diagnose_a_pipeline

Expected output (counter values and the summary string vary by run):

validate.error_code=
measure.inputs_enqueued=1 outputs_pulled=1
measure.text_size=...
[OK] 012_diagnose_a_pipeline

(The Python build prints validate_error_code=, inputs_enqueued=... outputs_pulled=..., and measure_text_size=....) To integrate this chapter's C++ source into your own project with a custom CMakeLists.txt (no extras folder required), see How to Run Tutorials on the landing page.

In Practice

This section covers the structured diagnostics, error taxonomy, debug knobs, and plugin-failure workflow you reach for when validate(), start_measurement(), or MeasureReport points at a problem.

GraphReport

GraphReport captures structured diagnostics:

  • pipeline string (for reproduction)
  • canonical error_code (machine triage)
  • repro_note (human summary + hint)
  • node reports and owned element names
  • bus messages and error details
  • optional flow/timing counters

When an error occurs, NeatError carries a GraphReport you can log or serialize.

Error taxonomy

Framework errors use stable code families:

Error code | Meaning | Typical fix
misconfig.pipeline_shape | Node order/shape contract violation | Ensure Input() first for push pipelines and Output() last for pull pipelines
misconfig.caps | Caps negotiation/override mismatch | Align caps_override, format, and downstream caps
misconfig.input_shape | Input tensor/frame/sample shape/layout mismatch | Validate width/height/depth, layout, dtype, storage
build.parse_launch | gst_parse_launch failed | Validate fragment syntax and plugin availability
runtime.pull | Runtime pull/timeout/closed-output failure | Check sink output production, queue pressure, and upstream errors
io.parse | Saved-graph JSON parse/schema failure | Validate JSON and required node fields
io.open | Graph save/load file open/read/write failure | Check path existence, permissions, and storage health

PullError.code uses the same taxonomy, so the codes apply to status-returning pull calls as well as to exception paths.
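
Because the codes are stable strings, the table above can be carried into tooling as a lookup. This is a minimal sketch under that assumption: `typical_fix` is a hypothetical helper, not part of NEAT, and the code and fix strings are copied from the table.

```cpp
#include <map>
#include <string>

// Lookup built from the taxonomy table. The function is a hypothetical
// helper, not a NEAT API. Returns an empty string for codes that are not
// in the table.
std::string typical_fix(const std::string& error_code) {
  static const std::map<std::string, std::string> fixes = {
      {"misconfig.pipeline_shape",
       "Ensure Input() first for push pipelines and Output() last for pull pipelines"},
      {"misconfig.caps", "Align caps_override, format, and downstream caps"},
      {"misconfig.input_shape", "Validate width/height/depth, layout, dtype, storage"},
      {"build.parse_launch", "Validate fragment syntax and plugin availability"},
      {"runtime.pull",
       "Check sink output production, queue pressure, and upstream errors"},
      {"io.parse", "Validate JSON and required node fields"},
      {"io.open", "Check path existence, permissions, and storage health"},
  };
  const auto it = fixes.find(error_code);
  return it == fixes.end() ? std::string() : it->second;
}
```

A log handler could append `typical_fix(code)` to each error line so the first hint travels with the failure.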

Programmatic handling

#include "pipeline/ErrorCodes.h"
#include "pipeline/NeatError.h"

try {
  auto run = graph.build(input);
  simaai::neat::Sample out;
  simaai::neat::PullError perr;
  const auto st = run.pull(500, out, &perr);
  if (st == simaai::neat::PullStatus::Error &&
      perr.code == simaai::neat::error_codes::kRuntimePull) {
    // runtime pull triage path
  }
} catch (const simaai::neat::NeatError& e) {
  if (e.report().error_code == simaai::neat::error_codes::kParseLaunch) {
    // build/parse-launch triage path
  }
}

Debug knobs (environment)

Key environment variables (see Architecture for detail):

  • SIMA_GST_DOT_DIR: write DOT graphs for failures
  • SIMA_GST_BOUNDARY_PROBES: boundary flow counters
  • SIMA_GST_ELEMENT_TIMINGS: per-element timings
  • SIMA_GST_FLOW_DEBUG: per-element flow counters
  • SIMA_GST_ENFORCE_NAMES: enforce naming contract
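
When you report a problem it helps to record which of these variables were set for the run. The sketch below only reads the environment with `std::getenv`; `active_debug_knobs` is a hypothetical helper, and nothing here assumes how NEAT interprets the values.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Variable names copied from the list above.
const char* const kDebugKnobs[] = {
    "SIMA_GST_DOT_DIR",         "SIMA_GST_BOUNDARY_PROBES",
    "SIMA_GST_ELEMENT_TIMINGS", "SIMA_GST_FLOW_DEBUG",
    "SIMA_GST_ENFORCE_NAMES",
};

// Hypothetical helper: returns "NAME=value" for every knob that is set to
// a non-empty value in the current process environment.
std::vector<std::string> active_debug_knobs() {
  std::vector<std::string> active;
  for (const char* name : kDebugKnobs) {
    const char* value = std::getenv(name);
    if (value != nullptr && *value != '\0')
      active.push_back(std::string(name) + "=" + value);
  }
  return active;
}
```

Logging the returned list at startup gives you a record of the environment overrides without having to reconstruct the shell session afterwards.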

Debug workflow

  1. Capture GraphReport.error_code and bucket the failure by taxonomy first.
  2. Capture GraphReport.repro_note for concrete context and the built-in hint.
  3. Capture the pipeline text: Graph::describe_backend() or last_pipeline().
  4. Capture structured diagnostics: MeasureReport::to_text() or NeatError::report().
  5. Inspect GraphReport.bus for the first terminal ERROR source and its detail.
  6. If the run stalls or times out, enable boundary/element probes to localize where flow stops.

Recommended support bundle:

  • error_code
  • repro_note
  • full pipeline_string
  • first 3-5 terminal bus errors (GraphReport.bus)
  • environment overrides used in run/validate
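
The bundle items above can be gathered into one plain-text blob before filing a report. This is a sketch with stand-in types: `SupportBundle` and `format_bundle` are hypothetical, the field names mirror the list, and the output layout is an arbitrary choice for illustration.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Stand-in container for the bundle items listed above; not a NEAT type.
struct SupportBundle {
  std::string error_code;
  std::string repro_note;
  std::string pipeline_string;
  std::vector<std::string> bus_errors;     // terminal bus errors, oldest first
  std::vector<std::string> env_overrides;  // "NAME=value" entries
};

// Renders the bundle as plain text, keeping at most `max_bus_errors`
// bus errors (the list above suggests the first 3-5).
std::string format_bundle(const SupportBundle& b, std::size_t max_bus_errors = 5) {
  std::ostringstream os;
  os << "error_code=" << b.error_code << "\n";
  os << "repro_note=" << b.repro_note << "\n";
  os << "pipeline_string=" << b.pipeline_string << "\n";
  for (std::size_t i = 0; i < b.bus_errors.size() && i < max_bus_errors; ++i)
    os << "bus_error[" << i << "]=" << b.bus_errors[i] << "\n";
  for (const std::string& env : b.env_overrides)
    os << "env " << env << "\n";
  return os.str();
}
```

Capping the bus errors keeps the bundle readable; the earliest terminal error is usually the one worth reading first, as step 5 of the workflow suggests.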

Common failures → fixes

Symptom | Likely cause | Fix
missing ... plugin | GStreamer plugin not found | Check GST_PLUGIN_PATH, run gst-inspect-1.0 <plugin>
appsink 'mysink' not found | Missing terminal Output() | Ensure Output is the last node in run/build pipelines
caps_override is set; renegotiation disabled | caps pinned | Remove caps_override or keep input caps fixed
tensor caps change not supported | Tensor shape/dtype change at runtime | Keep tensor shape/dtype stable (no renegotiation)
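
If these symptoms show up often in logs, a substring matcher can attach the fix automatically. The sketch below is hypothetical (`suggest_fix` is not a NEAT function) and assumes the symptom text appears verbatim somewhere in the message; real messages may be worded differently, in which case it returns an empty string.

```cpp
#include <string>

// Hypothetical matcher over the symptom strings in the table above.
// Matching is by substring; the "missing ... plugin" row is treated as
// "contains both words" because the middle of that symptom is elided.
std::string suggest_fix(const std::string& message) {
  const auto has = [&message](const char* needle) {
    return message.find(needle) != std::string::npos;
  };
  if (has("missing") && has("plugin"))
    return "Check GST_PLUGIN_PATH, run gst-inspect-1.0 <plugin>";
  if (has("appsink") && has("not found"))
    return "Ensure Output is the last node in run/build pipelines";
  if (has("caps_override is set"))
    return "Remove caps_override or keep input caps fixed";
  if (has("tensor caps change not supported"))
    return "Keep tensor shape/dtype stable (no renegotiation)";
  return "";
}
```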

Debugging plugin failures

When a plugin fails, NEAT raises a NeatError whose message contains the GStreamer error and a structured debug string. Use the fields to locate the root cause quickly.

  1. Read the structured fields. Look for the debug key/value fields in the error text:

    • node: the failing element name in the pipeline
    • config_path: JSON config file (if applicable)
    • model_path: model/pack path (if applicable)
    • hint: actionable fix guidance
    • detail: extra context such as missing keys or allocator state

    See the Error Format Reference for the full list.

  2. Confirm the pipeline context. Use the pipeline string from Graph::last_pipeline() or from the error report:

    • Verify the node name appears in the pipeline.
    • Confirm the config_path exists and is readable.
    • For caps errors, check upstream elements that negotiate into the failing node.
  3. Apply common fixes.

    • Config errors: verify JSON syntax, required keys, and any model paths.
    • Caps errors: add or fix parser elements (e.g., h264parse), ensure caps include required fields like parsed=true, stream-format=byte-stream, alignment=au.
    • Allocator errors: ensure upstream elements use the required allocator type (system vs. simaai memory/segment).
  4. Capture more diagnostics with the debug knobs above (SIMA_GST_DOT_DIR, SIMA_GST_FLOW_DEBUG, SIMA_GST_ELEMENT_TIMINGS).
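
The first two checks in step 2 are mechanical and can be scripted. The helpers below are hypothetical, not part of NEAT: `node_in_pipeline` is a plain substring test, and `config_readable` only checks that the file can be opened, not that its JSON is valid.

```cpp
#include <fstream>
#include <string>

// True when the failing node's name occurs in the pipeline string.
// Caveat: this is a substring test, so a short name can match inside a
// longer one (for example "sink" inside "mysink").
bool node_in_pipeline(const std::string& pipeline, const std::string& node) {
  return !node.empty() && pipeline.find(node) != std::string::npos;
}

// True when the config file can be opened for reading.
bool config_readable(const std::string& config_path) {
  std::ifstream in(config_path);
  return in.good();
}
```

If `node_in_pipeline` returns false, the error and the pipeline string probably come from different runs, which is worth ruling out before digging into caps or allocator details.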

Full source

tutorials/012_diagnose_a_pipeline/diagnose_a_pipeline.cpp
// Two diagnostic commands: Graph::validate and Run::start_measurement.
//
// Usage:
// tutorial_012_diagnose_a_pipeline

#include "neat.h"

#include <opencv2/core.hpp>

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
  try {
    cv::Mat rgb(96, 128, CV_8UC3, cv::Scalar(22, 44, 66));
    if (!rgb.isContinuous())
      rgb = rgb.clone();

    simaai::neat::Graph graph;
    simaai::neat::InputOptions in;
    in.format = "RGB";
    in.width = rgb.cols;
    in.height = rgb.rows;
    in.depth = rgb.channels();
    graph.add(simaai::neat::nodes::Input(in));
    graph.add(simaai::neat::nodes::Output());

    // CORE LOGIC
    // validate() checks the Graph before build() and prints any caps problems.
    auto report = graph.validate();
    std::cout << "validate.error_code=" << report.error_code << "\n";

    // Build a reusable runner and measure the caller-owned workload.
    simaai::neat::RunOptions run_opt;
    run_opt.output_memory = simaai::neat::OutputMemory::Owned;
    auto run = graph.build(std::vector<cv::Mat>{rgb}, run_opt);
    simaai::neat::MeasureOptions measure_opt;
    measure_opt.title = "tutorial 011 diagnosis";
    auto scope = run.start_measurement(measure_opt);
    simaai::neat::TensorList out = run.run(std::vector<cv::Mat>{rgb}, /*timeout_ms=*/1000);
    if (out.empty())
      throw std::runtime_error("missing output tensor");
    const simaai::neat::MeasureReport measured = scope.stop();

    // Post-run diagnostics come from the measurement report.
    std::cout << "measure.inputs_enqueued=" << measured.counters.inputs_enqueued
              << " outputs_pulled=" << measured.counters.outputs_pulled << "\n";
    std::cout << "measure.text_size=" << measured.to_text().size() << "\n";

    std::cout << "[OK] 012_diagnose_a_pipeline\n";
    return 0;
  } catch (const std::exception& e) {
    std::cerr << "[FAIL] " << e.what() << "\n";
    return 1;
  }
}
