Run a Graph

A Graph is the plan. A Run is the live execution handle.

Use this page after you have authored a graph. If you still need to decide which nodes or fragments belong in the graph, start with Graph. If the graph already looks right, this page is where you make it run, drain, measure, and survive real input.

Choose one-shot or reusable execution

Use the smallest runtime path that fits the job:

| Need | Use | Why |
| --- | --- | --- |
| Run one input and get one output | Graph.run(...) | Shortest one-shot path. |
| Push many inputs over time | Graph.build(...) and Run | Reuses the runtime and exposes push/pull control. |
| Use named inputs or outputs | Graph.build(...) and named run.push(...) / run.pull(...) | Keeps multi-input and multi-output apps explicit. |
| Let source nodes drive the graph | Graph.build() or Graph.run() with no app input | Use when the graph owns a camera, file, RTSP, or other source node. |
| Measure, export, drain, or stop deliberately | Run | Gives you lifecycle and diagnostics control. |

No magic. Build the graph, run it, inspect the result.

Choose how input enters the graph

Before you tune queues, decide who owns input.

| Graph style | How input enters | How you run it |
| --- | --- | --- |
| App-pushed graph | Your application calls Graph.run(input, ...), run.run(input, ...), run.push(...), or run.try_push(...) | Build or run with input. Inspect endpoint names before pushing into multi-input graphs. |
| Source-owned graph | The graph contains a source node or fragment, such as file, camera, RTSP, or stream input | Build or run without app input: graph.build() or graph.run(). Pull outputs, use output nodes, or use callbacks depending on the graph. |

If the graph owns the source, do not push into it. Inspect what it emits instead.

Run a source-owned graph

If the graph contains its own source node, build or run it without app input. Pull named outputs when the graph exposes them; let sink nodes handle output when the graph ends in a sink.

Use graph.run() for a source-to-sink job where the graph owns both input and output. Use graph.build() when your app needs to pull results, measure the run, or stop it deliberately.

auto run = graph.build();

while (running && run.can_pull()) {
  auto sample = run.pull("detections", /*timeout_ms=*/1000);
  if (!sample) {
    continue;
  }
  handle(*sample);
}

run.close();

For long-running sources, make your application decide when to exit the loop and call close(). A timeout means no output arrived in that window; it does not always mean the source is done.

Run once

Use Graph.run(...) when you want one synchronous push/pull operation.

simaai::neat::Graph graph("classifier");
graph.add(simaai::neat::nodes::Input("image"));
graph.add(model);
graph.add(simaai::neat::nodes::Output("classes"));

simaai::neat::TensorList outputs = graph.run(std::vector<cv::Mat>{frame});

In Python, pass a list or tuple. graph.run([tensor]) means “one graph input,” not “add a batch dimension.”

Build a reusable Run

Use Graph.build(...) when your application owns the loop.

auto run = graph.build();

run.push("image", std::vector<cv::Mat>{frame});
simaai::neat::TensorList outputs = run.pull_tensors("classes", /*timeout_ms=*/2000);

run.close_input();
while (auto sample = run.pull(/*timeout_ms=*/100)) {
  // Drain remaining output after end-of-input.
}
run.close();

Use close_input() when you are done pushing and want in-flight work to finish. Use close() when you want to tear the run down; C++ also exposes stop() as the immediate-stop spelling.

Use a reusable Run for request/response

Graph.run(...) is the shortest one-shot path. If you want the same request/response shape without rebuilding the graph each time, build a reusable Run once and call run.run(...).

Use this when:

  • the graph stays alive for many requests;
  • each request should still wait for its own output;
  • you do not need a separate producer thread and consumer thread yet.

auto run = graph.build();

for (const auto& frame : frames) {
  simaai::neat::TensorList outputs = run.run(
      std::vector<cv::Mat>{frame},
      /*timeout_ms=*/2000);
  handle(outputs);
}

run.close();

Move from run.run(...) to explicit push(...) / pull(...) when you need in-flight work, producer/consumer threads, non-blocking push, named output polling, or drain control.

Inspect runtime endpoints

Before you push into a multi-input graph, ask the Run what names it accepts.

auto run = graph.build();

for (const auto& name : run.input_names()) {
  std::cout << "input: " << name << "\n";
}
for (const auto& name : run.output_names()) {
  std::cout << "output: " << name << "\n";
}

If a graph has more than one public input or output, use named push(...) and pull(...). Neat should not have to guess which wire you meant.

Run multi-input and multi-output graphs

For multi-input graphs, push one named endpoint at a time, or push an unnamed list only when the graph has one unambiguous input route.

run.push("left", simaai::neat::TensorList{left_tensor});
run.push("right", simaai::neat::TensorList{right_tensor});

auto boxes = run.pull_tensors("detections", /*timeout_ms=*/2000);
auto preview = run.pull("preview", /*timeout_ms=*/2000);

When you combine streams, preserve the matching key that the graph expects. CombinePolicy::ByFrame needs frame_id; CombinePolicy::ByPts needs pts_ns. Missing keys should fail loudly. Silent joins are how bugs get promoted to architecture.
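
The matching rule can be sketched outside Neat. The C++17 model below is not Neat's CombinePolicy implementation: `Item` and `join_by_frame` are hypothetical names used only for illustration. It pairs two streams by frame_id and throws when an item lacks the key, rather than quietly pairing by arrival order.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sample; not a Neat type.
struct Item {
  std::optional<std::int64_t> frame_id;  // empty when the key is missing
  std::string payload;
};

// Pair items from two streams that share a frame_id.
// Throws std::invalid_argument when any item lacks the key, instead of
// silently pairing by arrival order. Items with no partner stay unpaired.
inline std::vector<std::pair<Item, Item>> join_by_frame(
    const std::vector<Item>& left, const std::vector<Item>& right) {
  std::map<std::int64_t, Item> right_by_frame;
  for (const Item& item : right) {
    if (!item.frame_id) {
      throw std::invalid_argument("right item is missing frame_id");
    }
    right_by_frame.emplace(*item.frame_id, item);
  }

  std::vector<std::pair<Item, Item>> joined;
  for (const Item& item : left) {
    if (!item.frame_id) {
      throw std::invalid_argument("left item is missing frame_id");
    }
    const auto match = right_by_frame.find(*item.frame_id);
    if (match != right_by_frame.end()) {
      joined.emplace_back(item, match->second);
    }
  }
  assert(joined.size() <= left.size());  // each left item pairs at most once
  return joined;
}
```

The same shape applies to a pts_ns key. The point of the sketch is the failure mode: a missing key is an error at the join, not a guess.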

Choose run options

RunOptions controls runtime behavior. Start with defaults. Change options when the source, output lifetime, or throughput target needs a different policy.

| Workload | Start with | Why |
| --- | --- | --- |
| First working app | default RunOptions | Prove correctness before tuning. |
| Live camera or RTSP input | RunPreset::Realtime; OutputOptions::Latest() where output freshness matters | Fresh frames beat complete history. The realtime preset resolves to latest-frame overflow unless you override it. |
| File or batch processing | RunPreset::Reliable; OutputOptions::EveryFrame(...) | Preserve every input and expose backpressure. The reliable preset resolves to blocking overflow unless you override it. |
| Normal app serving | RunPreset::Balanced | Good default once the graph works. |
| Jittery source needs bounded buffering | queue_depth | Increase only enough to absorb jitter. A deep queue can hide stale frames and delayed backpressure. |
| App stores outputs after pull | OutputMemory::Owned | Keeps output lifetime independent of runtime buffers. |
| App consumes outputs immediately | OutputMemory::Auto | Let Neat choose the right ownership path first. |
| Default wait time should be explicit | input_timeout_ms | Sets the default timeout for build/run input-mode paths. Per-call timeouts still win. |
| Seeded build should catch first-sample errors early | startup_preflight = true | Keeps seeded build honest. Disable only when first-sample failures can surface later through pull(...) or last_error(). |
| Source buffer lifetime is short | advanced.copy_input = true | Protects input memory that may disappear after push(...). |
| Input size needs a guardrail | advanced.max_input_bytes | Rejects oversized input before it enters the graph. |
| You need drop telemetry | on_input_drop | Counts overload and size-guard drops by stream and reason. |
| You need build-time evidence | run_export | Writes a run snapshot when the run is built. |

simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Realtime;
options.on_input_drop = [](const simaai::neat::InputDropInfo& drop) {
  std::cerr << "dropped input from stream " << drop.stream_id
            << ": " << drop.reason << "\n";
};

auto run = graph.build(options);

Do not set every knob because it exists. The fastest way to get lost is to tune before you have a baseline.

Runtime option recipes

Copy the shape of these recipes, not the numbers. Queue sizes and output limits depend on the model, the source rate, and how fast your app pulls results.

Low-latency live output

Use this when the next frame matters more than the complete frame history. Set the output queue policy when you add the output node; set the input/drop policy when you build the Run.

graph.add(simaai::neat::nodes::Output(
    "detections",
    simaai::neat::OutputOptions::Latest()));

simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Realtime;

auto run = graph.build(options);

This recipe keeps the newest useful result instead of building a museum of stale frames. Pull continuously and count drops by stream.

Lossless batch output

Use this when every input should produce its corresponding output and backpressure is better than loss.

graph.add(simaai::neat::nodes::Output(
    "result",
    simaai::neat::OutputOptions::EveryFrame(/*max_buffers=*/64)));

simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Reliable;

auto run = graph.build(options);

Close input when the producer is done, then drain the output. If input count and output count diverge, inspect the model contract before blaming the runtime.

Owned output lifetime

Use owned output when your app stores tensors after pull(...) returns or hands them to another thread. Keep Auto for first-run code and change this only when lifetime requires it.

simaai::neat::RunOptions options;
options.output_memory = simaai::neat::OutputMemory::Owned;

auto run = graph.build(options);

Seed build when shape or format must be proven early

Most reusable runs can build without input:

auto run = graph.build();

Use seeded build(input, ...) when the first real input should prove shape, format, caps, or byte-guard behavior before the app enters the streaming loop.

auto run = graph.build(std::vector<cv::Mat>{frame});

startup_preflight is on by default for seeded builds, so the seed catches payload-level failures while building. If build fails, the structured report can include build_adaptation: the seed shape, dynamic limits, byte guard, and adaptation actions Neat tried. Use it to debug from evidence, not vibes.

Handle backpressure

Backpressure means the graph cannot accept or emit data as fast as the app wants.

Use these controls deliberately:

  • queue_depth controls how much work can wait in runtime queues.
  • overflow_policy = Block applies backpressure to the producer.
  • overflow_policy = KeepLatest drops older queued input so live streams stay fresh.
  • overflow_policy = DropIncoming rejects new input when the queue is full.
  • try_push(...) returns false instead of blocking.
  • on_input_drop reports dropped input with InputDropInfo fields such as stream_id, frame_id, port_name, and reason.

For threading, use one push thread and one pull thread for a Run. Do not push to the same Run concurrently from multiple threads unless your app serializes those calls.
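
The three overflow policies differ only in what happens when the queue is already full. The sketch below is a single-threaded model, not Neat's queue: `Overflow`, `Offer`, and `BoundedQueue` are hypothetical names. Where a real Block policy would make the producer wait, the model reports `WouldBlock` so the behavior is visible without threads.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

enum class Overflow { Block, KeepLatest, DropIncoming };
enum class Offer { Accepted, EvictedOldest, Rejected, WouldBlock };

// Hypothetical single-threaded model of a bounded input queue.
template <typename T>
class BoundedQueue {
 public:
  BoundedQueue(std::size_t depth, Overflow policy)
      : depth_(depth), policy_(policy) {
    assert(depth > 0);
  }

  // Try to enqueue one item and report what the policy did with it.
  Offer offer(const T& item) {
    if (items_.size() < depth_) {
      items_.push_back(item);
      return Offer::Accepted;
    }
    switch (policy_) {
      case Overflow::KeepLatest:
        items_.pop_front();  // drop the oldest queued item
        items_.push_back(item);
        ++dropped_;
        return Offer::EvictedOldest;
      case Overflow::DropIncoming:
        ++dropped_;  // the new item never enters the queue
        return Offer::Rejected;
      case Overflow::Block:
        break;
    }
    return Offer::WouldBlock;  // nothing dropped; the caller must wait and retry
  }

  // Remove the oldest item, if any.
  bool take(T& out) {
    if (items_.empty()) {
      return false;
    }
    out = items_.front();
    items_.pop_front();
    return true;
  }

  std::size_t size() const { return items_.size(); }
  std::size_t dropped() const { return dropped_; }

 private:
  std::size_t depth_;
  Overflow policy_;
  std::deque<T> items_;
  std::size_t dropped_ = 0;
};
```

With depth 2 and KeepLatest, offering 1, 2, 3 leaves 2 and 3 queued and counts one drop. With DropIncoming the queue would still hold 1 and 2. Only Block loses nothing, and it pays for that by stalling the producer.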

Use a simple threading pattern

For live or high-throughput app-pushed graphs, start with two application threads:

  1. A producer thread stamps metadata and calls push(...) or try_push(...).
  2. A consumer thread pulls continuously and releases or copies outputs quickly.

Add more threads around your own queues, not around the same Run. The hot loop should be boring. Boring is fast.

auto run = graph.build(options);

std::thread producer([&] {
  while (auto sample = next_sample()) {
    sample->stream_id = current_stream_id();
    sample->frame_id = next_frame_id();

    if (!run.try_push("image", *sample)) {
      count_local_drop(sample->stream_id);
    }
  }

  run.close_input();
});

std::thread consumer([&] {
  simaai::neat::Sample output;
  simaai::neat::PullError error;

  while (true) {
    switch (run.pull("detections", /*timeout_ms=*/100, output, &error)) {
      case simaai::neat::PullStatus::Ok:
        handle_output(output);
        break;
      case simaai::neat::PullStatus::Timeout:
        continue;
      case simaai::neat::PullStatus::Closed:
        return;
      case simaai::neat::PullStatus::Error:
        record_runtime_error(error);
        return;
    }
  }
});

producer.join();
consumer.join();
run.close();

In C++, use the status-aware pull(...) overload when timeout, end-of-stream, and errors must be handled differently. In Python, pull(...) returns None when no sample is returned for that call, so pair it with your own producer/shutdown state.

Close, drain, or stop deliberately

Pick the shutdown path that matches your intent. Do not keep pushing into a run that is closing.

| Intent | Use | What to do next |
| --- | --- | --- |
| Finish queued work after the last input | close_input() | Keep pulling until the output is drained. In C++, status-aware pull returns PullStatus::Closed at end-of-stream. |
| Cancel now | stop() | Stop producers and let waiting pulls unblock. Use this for shutdown or failure paths, not normal batch drain. |
| Release runtime resources | close() | Call after drain or cancellation, or let the Run object leave scope. |

For batch work, close input, drain output, then close the run. For live work, stop producers first, then stop or close the run. No zombie producers, no haunted queues.

Choose output ownership

OutputMemory controls how pulled tensors relate to runtime buffers:

  • Auto: let Neat choose. Use this first.
  • Owned: copy output into framework-owned memory. Use this when another thread or object stores tensors after pull.
  • ZeroCopy: share runtime storage. Use this only when the page or example explains the lifetime rules.

If throughput falls off a cliff, check whether the app is holding output samples too long. Zero-copy can be fast, but pinned buffers are still pinned buffers.
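
Why held outputs stall a run can be shown with a toy pool. The sketch below is not how Neat manages memory: `BufferPool` and its fixed buffer size are hypothetical, chosen only to illustrate the lifetime rule. Each handle pins one slot until the last copy of it is destroyed, and copying the data out before releasing the handle is the "owned" pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical fixed-size pool standing in for runtime-owned output buffers.
class BufferPool {
  struct State {
    std::size_t free_count = 0;
  };

 public:
  using Handle = std::shared_ptr<std::vector<float>>;

  explicit BufferPool(std::size_t count) : state_(std::make_shared<State>()) {
    assert(count > 0);
    state_->free_count = count;
  }

  // Returns a handle that gives its slot back when the last copy is
  // destroyed, or an empty handle when every slot is still held.
  Handle acquire() {
    if (state_->free_count == 0) {
      return Handle();
    }
    --state_->free_count;
    std::shared_ptr<State> state = state_;  // keep the counter alive for the deleter
    return Handle(new std::vector<float>(16, 0.0f),
                  [state](std::vector<float>* buffer) {
                    delete buffer;
                    ++state->free_count;
                  });
  }

  std::size_t available() const { return state_->free_count; }

 private:
  std::shared_ptr<State> state_;
};
```

With two slots and two live handles, a third `acquire()` returns an empty handle: that is the stall. Copying `*handle` into an application-owned vector and then calling `handle.reset()` frees the slot while the data stays usable.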

Preserve stream identity

Multistream graphs need identity before they need tuning. Preserve stream_id and frame_id so you can prove fairness, detect starvation, and count drops.

auto sample = simaai::neat::Sample::from_image(
    frame,
    simaai::neat::ImageSpec::PixelFormat::BGR,
    simaai::neat::TensorMemory::CPU);
sample.stream_id = camera_id;
sample.frame_id = frame_number++;

if (!run.try_push("image", sample)) {
  // Count local backpressure here. Runtime drops also flow through on_input_drop.
}

For source-owned graphs, pick source nodes that preserve or stamp stream metadata. For app-pushed graphs, your app owns that metadata.
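
Once outputs carry stream_id and frame_id, fairness checks become simple bookkeeping. The following is a minimal application-side sketch, not a Neat facility: `StreamTally` and `StreamStats` are hypothetical names, and it assumes frame_id is non-negative and increases by one per frame within a stream.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical per-stream bookkeeping kept by the application.
struct StreamStats {
  std::uint64_t outputs = 0;
  std::uint64_t missing_frames = 0;  // gaps between consecutive frame_ids
  std::int64_t last_frame_id = -1;   // -1 means no output seen yet
};

class StreamTally {
 public:
  // Register a stream so it is reported even if it never produces output.
  void expect(std::int64_t stream_id) { stats_[stream_id]; }

  // Record one pulled output and count any frame_ids skipped since the last one.
  void on_output(std::int64_t stream_id, std::int64_t frame_id) {
    assert(frame_id >= 0);
    StreamStats& s = stats_[stream_id];
    if (s.last_frame_id >= 0 && frame_id > s.last_frame_id + 1) {
      s.missing_frames +=
          static_cast<std::uint64_t>(frame_id - s.last_frame_id - 1);
    }
    s.last_frame_id = frame_id;
    ++s.outputs;
  }

  // Streams that produced fewer than min_outputs so far, in stream_id order.
  std::vector<std::int64_t> starved(std::uint64_t min_outputs) const {
    std::vector<std::int64_t> result;
    for (const auto& entry : stats_) {
      if (entry.second.outputs < min_outputs) {
        result.push_back(entry.first);
      }
    }
    return result;
  }

  // Throws std::out_of_range for a stream that was never registered or seen.
  const StreamStats& stats(std::int64_t stream_id) const {
    return stats_.at(stream_id);
  }

 private:
  std::map<std::int64_t, StreamStats> stats_;
};
```

Call `expect(...)` for every camera before the measured window, `on_output(...)` from the pull loop, and `starved(1)` at the end. A stream that never appears in the output is then reported instead of vanishing from the aggregate.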

Scale from one stream to many

Start with one stream. Then scale the topology and runtime policy on purpose.

| Pattern | Use it when | Watch |
| --- | --- | --- |
| One stream -> one model -> one output | Building the first correct path | Output shape, dtype, and latency. |
| Many streams -> one model lane | Aggregate input rate fits one model path | Per-stream fairness and stale streams. |
| Many streams -> multiple model lanes | One lane cannot keep up | Stream partitioning, route naming, and output accounting. |
| One stream -> several models | Different decisions need the same input | Branch-level latency and target-normalized FPS. |
| Many streams -> model + metadata/video outputs | Production app emits several artifacts | Count target outputs separately from preview or telemetry outputs. |

When connecting live graph fragments, GraphLinkOptions can select realtime latest-by-stream behavior. Use it when freshness matters more than preserving every frame across a live fan-in.

Run source-owned multistream graphs

For camera-heavy apps, the graph often owns the streams. In that shape, source groups feed the model path and your app pulls results. You still need the same throughput discipline:

  • give each source a stable stream_id;
  • use realtime latest-by-stream behavior on live fan-in links when freshness matters;
  • pull outputs continuously;
  • count outputs per stream, not only in aggregate;
  • export the run after the measured window if one stream starves or drops frames.

| Source-owned choice | Start with | Why |
| --- | --- | --- |
| One camera per graph | One source group, one model path, one output | Easiest way to prove the camera, model, and output contract. |
| Many cameras into one model lane | Source fragments connected to one model fragment with GraphLinkOptions for live fan-in | Keeps one model lane busy while preserving per-stream identity. |
| Many cameras across lanes | Partition source fragments across several graph lanes | Use when one model lane saturates. Measure each lane and each stream. |
| Video output handled by the graph | Sink groups such as VideoSender(...) or H.264/UDP output groups | Use when the app should not pull and transmit every frame itself. |

If the graph owns the sources, build with graph.build() and stop it deliberately. Do not push app input into a graph that already has its own source nodes.

Drive many streams through one model lane

Use one public input endpoint when several live streams share the same model lane. Stamp each sample with stream_id and frame_id, use the realtime preset, and pull continuously. The rebel move is boring but effective: never let the output queue become your hidden bottleneck.

simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Realtime;

auto run = graph.build(options);

while (running) {
  // Non-const reference: next_frame_id() advances the camera's counter.
  for (auto& camera : cameras) {
    auto sample = simaai::neat::Sample::from_image(
        camera.frame(),
        simaai::neat::ImageSpec::PixelFormat::BGR,
        simaai::neat::TensorMemory::CPU);
    sample.stream_id = camera.id();
    sample.frame_id = camera.next_frame_id();

    if (!run.try_push("image", sample)) {
      ++local_drop_count[camera.id()];
    }
  }

  // Zero timeout: take whatever is ready, then go back to pushing.
  while (auto output = run.pull("detections", /*timeout_ms=*/0)) {
    count_output_by_stream(output->stream_id);
  }
}

run.close_input();
while (auto output = run.pull("detections", /*timeout_ms=*/1000)) {
  count_output_by_stream(output->stream_id);
}
run.close();

This pattern maximizes useful throughput only when the model lane can keep up with the accepted input rate. If one lane saturates, split streams across more lanes or lower the offered rate. Do not bury stale frames under a mountain of queue depth.

Split streams across model lanes

When one model lane is saturated, add lanes instead of hiding overload behind deeper queues. A lane is usually one Graph plus one Run with its own model route names and graph element prefix. Partition streams by a stable key, then measure each lane and each stream.

auto build_lane = [&](int lane_index) {
  const std::string lane_name = "lane" + std::to_string(lane_index);

  simaai::neat::Model::Options model_options;
  model_options.name_suffix = "_" + lane_name;
  simaai::neat::Model lane_model(model_path, model_options);

  simaai::neat::GraphOptions graph_options;
  graph_options.element_name_prefix = lane_name + "_";

  simaai::neat::Graph graph("detector_" + lane_name, graph_options);
  graph.add(simaai::neat::nodes::Input("image"));
  graph.add(lane_model);
  graph.add(simaai::neat::nodes::Output(
      "detections",
      simaai::neat::OutputOptions::Latest()));

  simaai::neat::RunOptions run_options;
  run_options.preset = simaai::neat::RunPreset::Realtime;
  return graph.build(run_options);
};

std::vector<simaai::neat::Run> lanes;
lanes.emplace_back(build_lane(0));
lanes.emplace_back(build_lane(1));

while (running) {
  for (const auto& camera : cameras) {
    auto sample = make_sample_for_camera(camera);
    const std::size_t lane_index = camera.id() % lanes.size();

    if (!lanes[lane_index].try_push("image", sample)) {
      ++drop_count_by_lane[lane_index];
    }
  }

  for (std::size_t lane_index = 0; lane_index < lanes.size(); ++lane_index) {
    while (auto output = lanes[lane_index].pull("detections", /*timeout_ms=*/0)) {
      count_output(lane_index, output->stream_id);
    }
  }
}

Keep the partition stable so stream identity and cache behavior stay predictable. If lane 0 starves while lane 1 is idle, the partitioning policy is the bug.
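
When stream keys are strings such as camera names or RTSP URLs rather than integers, the modulo in the loop above needs a hash that does not change between runs. One option, sketched here as an assumption rather than anything Neat prescribes, is 64-bit FNV-1a; `fnv1a64` and `lane_for` are hypothetical helper names. `std::hash` is avoided because its values are implementation-defined and may differ between builds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 64-bit FNV-1a: the same key always hashes to the same value,
// regardless of process, compiler, or platform.
inline std::uint64_t fnv1a64(const std::string& key) {
  std::uint64_t hash = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char byte : key) {
    hash ^= byte;
    hash *= 1099511628211ULL;  // FNV prime
  }
  return hash;
}

// Map a stream key to a lane index in [0, lane_count).
inline std::size_t lane_for(const std::string& stream_key,
                            std::size_t lane_count) {
  assert(lane_count > 0);
  return static_cast<std::size_t>(fnv1a64(stream_key) % lane_count);
}
```

The mapping is stable only for a fixed lane count. Adding or removing a lane reassigns most streams, so change the lane count between measured runs, not during one.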

Tune the model lane deliberately

If a graph is correct but cannot meet the offered stream rate, first prove where the bottleneck lives. Do not start by making every queue bigger. That hides overload and gives stale frames a place to retire.

Use this triage:

| Symptom | First check | Then try |
| --- | --- | --- |
| Accepted input FPS is high, but output FPS is low | The model lane or postprocess lane is saturated | Split streams across lanes, reduce offered rate, or test advanced_execution.inference_async on the model route or graph options. |
| try_push(...) returns false often | The ingress queue is full | Pull continuously, reduce the offered rate, or choose an explicit OverflowPolicy. |
| One stream disappears in aggregate metrics | Missing or uneven stream_id accounting | Count outputs and drops per stream; use live latest-by-stream behavior for live fan-in. |
| Output stalls while input keeps moving | The app is not pulling fast enough, or it holds runtime-backed outputs | Pull in a dedicated loop and release/copy outputs before pushing more. |
| Latency grows over time | Queues are absorbing old work | Use a smaller queue, RunPreset::Realtime, or OutputOptions::Latest() where freshness wins. |

When you need to test model-route execution behavior, set one advanced execution field at a time and measure the same workload before and after:

simaai::neat::GraphOptions graph_options;
graph_options.advanced_execution.inference_async = true;

simaai::neat::Graph graph("detector", graph_options);

If the change does not improve the measured path, revert it. A knob that cannot prove its value does not belong in the app.

Pick a throughput recipe

Start from the workload, not from a random queue number.

| Workload | Runtime shape | Start with | Prove it with |
| --- | --- | --- | --- |
| Single live stream | One reusable Run, one producer, one puller | RunPreset::Realtime; OutputOptions::Latest() for preview-style outputs | Accepted FPS, output FPS, drop count, and latency. |
| File or batch processing | One reusable Run; close input and drain | RunPreset::Reliable; OutputOptions::EveryFrame(...) | Input count equals output count, unless the model contract says otherwise. |
| Many live streams into one model lane | App-pushed Sample inputs with stream_id / frame_id, or source-owned fragments that stamp identity | RunPreset::Realtime; GraphLinkPolicy::RealtimeLatestByStream through GraphLinkOptions for live fan-in | Per-stream FPS and per-stream drops, not only aggregate FPS. |
| Many live streams across model lanes | Partition streams across multiple model instances or graph lanes | Same as the live-stream recipe per lane | Per-lane utilization, per-stream starvation, and target-normalized FPS. |
| One input fans out to several models | Branch once, then run separate model paths | Branch/fan-out in the Graph; choose output behavior per branch | Branch latency and target-normalized FPS. |

If one model lane is saturated, do not hide the problem behind a deeper queue. Split the work across lanes, lower the offered input rate, or choose an explicit drop policy. Queue depth buys tolerance for jitter; it does not create accelerator capacity.
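
The cost of a deep queue is plain arithmetic. The sketch below assumes a FIFO queue drained at a steady rate, which is a simplification of any real lane; `queue_wait_ms` and `seconds_until_full` are hypothetical helper names, not Neat API.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Time a new frame waits behind queued_ahead frames in a FIFO queue that
// drains at a steady service_fps.
inline double queue_wait_ms(std::size_t queued_ahead, double service_fps) {
  assert(service_fps > 0.0);
  return 1000.0 * static_cast<double>(queued_ahead) / service_fps;
}

// Seconds until an empty queue of depth slots fills when input arrives
// faster than it is served. Returns infinity when the lane keeps up.
inline double seconds_until_full(std::size_t depth, double offered_fps,
                                 double service_fps) {
  assert(offered_fps >= 0.0 && service_fps > 0.0);
  if (offered_fps <= service_fps) {
    return std::numeric_limits<double>::infinity();
  }
  return static_cast<double>(depth) / (offered_fps - service_fps);
}
```

For a lane that serves 30 FPS, a full queue of 30 means each new frame is already one second old when it reaches the model. If the app offers 40 FPS, that queue fills in 3 seconds, after which the overflow policy decides what is lost. A deeper queue only moves that moment later and makes the frames staler.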

Tune throughput without lying to yourself

Throughput is a loop shape, not one magic option.

  1. Build the graph once.
  2. Warm up outside the measurement window.
  3. Keep a bounded number of inputs in flight.
  4. Pull continuously so output queues do not become the bottleneck.
  5. Release or copy outputs before pushing more when output buffers may be shared with the runtime.
  6. Pick one overload policy: block, keep latest, or drop incoming.
  7. Preserve stream_id and frame_id.
  8. Close input and drain before stopping the run.
  9. Measure the right numbers.
  10. Export run evidence after the measured workload.

Measure these separately:

| Metric | Meaning |
| --- | --- |
| Offered input FPS | Inputs attempted per second, often streams * source_fps. |
| Accepted input FPS | Inputs accepted by push(...) or try_push(...) per second. |
| Aggregate output FPS | All pulled outputs per second across all outputs. |
| Per-stream FPS | Output rate for each stream_id. |
| Target-normalized FPS | Outputs that count toward the app's target result per second. Useful when one input fans out to several outputs. |
| Drop rate | Dropped or rejected inputs by stream_id, source, and reason. |

Aggregate FPS can look great while one stream starves. Per-stream metrics catch the crime.
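
These metrics fall out of a handful of counters kept by the application. The sketch below is one possible way to turn them into rates and is not part of Neat: `WindowCounts`, `WindowRates`, and `summarize` are hypothetical names. It covers only inputs rejected at push time; drops reported by the runtime through on_input_drop would need their own counters.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Raw counts collected by the application over one measured window.
struct WindowCounts {
  double seconds = 0.0;
  std::uint64_t offered = 0;         // push/try_push attempts
  std::uint64_t accepted = 0;        // attempts that were accepted
  std::uint64_t outputs = 0;         // every pulled output, all endpoints
  std::uint64_t target_outputs = 0;  // outputs that count as the app's result
  std::map<std::int64_t, std::uint64_t> outputs_by_stream;
};

struct WindowRates {
  double offered_fps = 0.0;
  double accepted_fps = 0.0;
  double aggregate_output_fps = 0.0;
  double target_normalized_fps = 0.0;
  double rejected_fraction = 0.0;  // share of offered inputs not accepted
  std::map<std::int64_t, double> fps_by_stream;
};

// Convert window counts into per-second rates.
inline WindowRates summarize(const WindowCounts& counts) {
  assert(counts.seconds > 0.0);
  assert(counts.accepted <= counts.offered);

  WindowRates rates;
  rates.offered_fps = static_cast<double>(counts.offered) / counts.seconds;
  rates.accepted_fps = static_cast<double>(counts.accepted) / counts.seconds;
  rates.aggregate_output_fps =
      static_cast<double>(counts.outputs) / counts.seconds;
  rates.target_normalized_fps =
      static_cast<double>(counts.target_outputs) / counts.seconds;
  if (counts.offered > 0) {
    rates.rejected_fraction =
        static_cast<double>(counts.offered - counts.accepted) /
        static_cast<double>(counts.offered);
  }
  for (const auto& entry : counts.outputs_by_stream) {
    rates.fps_by_stream[entry.first] =
        static_cast<double>(entry.second) / counts.seconds;
  }
  return rates;
}
```

Take a 2-second window with 120 offered inputs, 100 accepted, and a graph that emits both a detection and a preview per input. Aggregate output is 100 FPS, but target-normalized output is 50 FPS, which is the number that matches what the app actually delivers.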

Throughput loop shape

Use this shape for an app-pushed graph. Replace next_inputs() with your input source. Keep the loop boring: bounded in-flight work, continuous pulls, and no report export inside the hot path.

auto run = graph.build(options);

// Warm up outside the measurement window.
for (int i = 0; i < warmup_frames; ++i) {
  run.push(next_inputs());
  (void)run.pull(/*timeout_ms=*/5000);
}

auto measurement = run.start_measurement();

// Prime the pipeline up to the in-flight bound.
int in_flight = 0;
while (in_flight < max_in_flight && has_input()) {
  if (run.push(next_inputs())) {
    ++inputs_sent;
    ++in_flight;
  }
}

// Steady state: one pull, then refill up to the bound.
while (has_input() || in_flight > 0) {
  auto output = run.pull(/*timeout_ms=*/1000);
  if (output) {
    ++outputs_seen;
    --in_flight;
    output.reset();  // Do not pin runtime-backed buffers longer than needed.
  }

  while (has_input() && in_flight < max_in_flight) {
    if (!run.try_push(next_inputs())) {
      // If next_inputs() consumes its source, the input fetched for this
      // failed attempt is not retried. Count it as a local drop if needed.
      break;
    }
    ++inputs_sent;
    ++in_flight;
  }
}

run.close_input();
while (auto output = run.pull(/*timeout_ms=*/1000)) {
  ++outputs_seen;
}

simaai::neat::MeasureReport report = measurement.stop();
simaai::neat::save_run_json(run, report, "run_after_measurement.json");
run.close();

Keep per-frame logging, output validation, file download, source setup, and report export out of the measured hot loop unless you are explicitly measuring end-to-end behavior.

Measure and export evidence

Use start_measurement(...) to observe an application-owned push/pull window.

Use run export for evidence:

  • RunOptions.run_export writes a build-time snapshot.
  • C++ run_to_json(...) and save_run_json(...) export a run after it has executed.
  • Python run.json(...) and run.save_json(...) export the same kind of evidence.

Enable power telemetry on the RunOptions that builds the run:

simaai::neat::RunOptions options;
options.enable_board_power(/*sample_interval_ms=*/100);

auto run = graph.build(options);

simaai::neat::MeasureOptions measure_options;
measure_options.include_power = true;
auto scope = run.start_measurement(measure_options);

Power data depends on board rail support and monitor configuration. Document the measurement setup with the numbers; do not make power numbers look portable when the rails are not.

Build-time export answers “what did Neat build?” After-run export answers “what happened while it ran?”

Export at build time and after execution

Use build-time export for CI artifacts and startup debugging:

simaai::neat::RunOptions options;
options.run_export.path = "run-build.json";
options.run_export.label = "classifier-startup";

auto run = graph.build(options);

Use after-run export after samples have moved through the graph:

auto scope = run.start_measurement();
// Push and pull the workload.
simaai::neat::MeasureReport report = scope.stop();

simaai::neat::save_run_json(run, report, "run-after.json");

Do not export inside the measured hot loop unless the benchmark is explicitly end-to-end.

Read a run export

A run export is useful because it ties topology, runtime options, and measurements together in one artifact. When you open the JSON, start with the customer-facing evidence:

| Section or field | What it answers |
| --- | --- |
| graph.named_inputs / graph.named_outputs | Which public endpoints did this run expose? |
| graph.public_view | What did the app graph look like before runtime lowering? |
| run.output_materialization | Were outputs owned, zero-copy, or selected automatically? |
| run.stats | High-level lifetime counters for inputs, outputs, drops, and latency. |
| run.graph_metrics.counters | Inputs, outputs, and drops for the exported run or measured window. |
| run.graph_metrics.window | The measured time window when the export includes a MeasureReport. |
| run.node_metrics / run.plugin_metrics_unattributed | Which stages dominated runtime when detailed timing was enabled. |
| run.path_timing | Edge/path timing when path timing data was collected. |
| run.graph_metrics.power | Whether power was collected, skipped, disabled, or unavailable. |

Attach the run export with the model contract and the smallest reproducer when you ask for help. It is the black box recorder, minus the mystery.

Debug a graph run

When a graph fails, inspect what you built before changing options.

  1. Validate the graph.
  2. Inspect public graph endpoints before build.
  3. Inspect runtime endpoints after build.
  4. Pull with a status-aware path when timeout, closed, and error must mean different things.
  5. Export the run after the workload has executed.

simaai::neat::GraphReport report = graph.validate();
std::cout << report.to_json() << "\n";

auto run = graph.build();

simaai::neat::Sample sample;
simaai::neat::PullError error;

switch (run.pull("classes", /*timeout_ms=*/1000, sample, &error)) {
  case simaai::neat::PullStatus::Ok:
    // Use sample.
    break;
  case simaai::neat::PullStatus::Timeout:
    // No output arrived before the timeout.
    break;
  case simaai::neat::PullStatus::Closed:
    // End of stream. Stop draining.
    break;
  case simaai::neat::PullStatus::Error:
    std::cerr << error.code << ": " << error.message << "\n";
    if (error.report) {
      std::cerr << error.report->repro_note << "\n";
    }
    break;
}

Collect evidence for support

When a graph fails in an application, capture the smallest evidence packet that explains the public behavior. Do this before changing options. Evidence beats folklore.

Include:

  • the model artifact name and how it was produced;
  • Neat version/build information;
  • input shape, dtype, layout, pixel format, and payload family;
  • graph.validate().to_json() when build or validation fails;
  • run.input_names() and run.output_names() for endpoint failures;
  • a run export JSON after at least one sample has moved when runtime behavior is the issue;
  • the MeasureReport JSON or text when the issue is throughput, latency, or power;
  • the smallest runnable snippet that reproduces the behavior.

Python can capture version and run evidence directly:

print(pyneat.build_info())

report = graph.validate()
with open("graph-report.json", "w", encoding="utf-8") as f:
    f.write(report.to_json())

# After samples have moved through the run:
run.save_json("run-after.json")

C++ can export the same evidence with GraphReport::to_json() and save_run_json(...):

std::cout << "neat_version=" << sima_neat_version() << "\n";
std::cout << graph.validate().to_json() << "\n";

// After samples have moved through the run:
simaai::neat::save_run_json(run, "run-after.json");

If the failure only appears under load, attach the measured run export instead of a build-time snapshot. Build-time export says what Neat built; after-run export says what happened when the graph fought real input.

For thrown failures, catch NeatError and read the structured report:

try {
  auto run = graph.build();
} catch (const simaai::neat::NeatError& error) {
  const auto& report = error.report();
  std::cerr << report.error_code << "\n";
  std::cerr << report.repro_note << "\n";
}

Troubleshoot slow or missing output

If throughput is low or outputs disappear, check these first:

  1. Are you building the graph inside the measured loop?
  2. Are you pushing one input, waiting for the whole graph to go idle, then pushing the next?
  3. Is the app pulling continuously?
  4. Is one output branch blocking the whole graph?
  5. Are you holding zero-copy or runtime-backed outputs too long?
  6. Are queues too shallow for jitter, or too deep to expose backpressure?
  7. Is the overload policy explicit?
  8. Are drops counted through on_input_drop or local try_push(...) failures?
  9. Does every expected stream_id produce output in the measured window?
  10. Are logs, decoding checks, file I/O, or report export inside the hot loop?

Fix correctness first. Then make it fast. Then prove which one you measured.

See also