Run a Graph
A Graph is the plan. A Run is the live execution handle.
Use this page after you have authored a graph. If you still need to decide which nodes or fragments belong in the graph, start with Graph. If the graph already looks right, this page is where you make it run, drain, measure, and survive real input.
Choose one-shot or reusable execution
Use the smallest runtime path that fits the job:
| Need | Use | Why |
|---|---|---|
| Run one input and get one output | Graph.run(...) | Shortest one-shot path. |
| Push many inputs over time | Graph.build(...) and Run | Reuses the runtime and exposes push/pull control. |
| Use named inputs or outputs | Graph.build(...) and named run.push(...) / run.pull(...) | Keeps multi-input and multi-output apps explicit. |
| Let source nodes drive the graph | Graph.build() or Graph.run() with no app input | Use when the graph owns a camera, file, RTSP, or other source node. |
| Measure, export, drain, or stop deliberately | Run | Gives you lifecycle and diagnostics control. |
No magic. Build the graph, run it, inspect the result.
Choose how input enters the graph
Before you tune queues, decide who owns input.
| Graph style | How input enters | How you run it |
|---|---|---|
| App-pushed graph | Your application calls Graph.run(input, ...), run.run(input, ...), run.push(...), or run.try_push(...) | Build or run with input. Inspect endpoint names before pushing into multi-input graphs. |
| Source-owned graph | The graph contains a source node or fragment, such as file, camera, RTSP, or stream input | Build or run without app input: graph.build() or graph.run(). Pull outputs, use output nodes, or use callbacks depending on the graph. |
If the graph owns the source, do not push into it. Inspect what it emits instead.
Run a source-owned graph
If the graph contains its own source node, build or run it without app input. Do not push into a graph that already owns the source. Pull named outputs when the graph exposes them; let sink nodes handle output when the graph ends in a sink.
Use graph.run() for a source-to-sink job where the graph owns both input and output. Use graph.build() when your app needs to pull results, measure the run, or stop it deliberately.
auto run = graph.build();
while (running && run.can_pull()) {
  auto sample = run.pull("detections", /*timeout_ms=*/1000);
  if (!sample) {
    continue;
  }
  handle(*sample);
}
run.close();
For long-running sources, make your application decide when to exit the loop and call close(). A timeout means no output arrived in that window; it does not always mean the source is done.
Run once
Use Graph.run(...) when you want one synchronous push/pull operation.
simaai::neat::Graph graph("classifier");
graph.add(simaai::neat::nodes::Input("image"));
graph.add(model);
graph.add(simaai::neat::nodes::Output("classes"));
simaai::neat::TensorList outputs = graph.run(std::vector<cv::Mat>{frame});
In Python, pass a list or tuple. graph.run([tensor]) means “one graph input,” not “add a batch dimension.”
Build a reusable Run
Use Graph.build(...) when your application owns the loop.
auto run = graph.build();
run.push("image", std::vector<cv::Mat>{frame});
simaai::neat::TensorList outputs = run.pull_tensors("classes", /*timeout_ms=*/2000);
run.close_input();
while (auto sample = run.pull(/*timeout_ms=*/100)) {
  // Drain remaining output after end-of-input.
}
run.close();
Use close_input() when you are done pushing and want in-flight work to finish. Use close() when you want to tear the run down; C++ also exposes stop() as the immediate-stop spelling.
Use a reusable Run for request/response
Graph.run(...) is the shortest one-shot path. If you want the same request/response shape without rebuilding the graph each time, build a reusable Run once and call run.run(...).
Use this when:
- the graph stays alive for many requests;
- each request should still wait for its own output;
- you do not need a separate producer thread and consumer thread yet.
auto run = graph.build();
for (const auto& frame : frames) {
  simaai::neat::TensorList outputs = run.run(
      std::vector<cv::Mat>{frame},
      /*timeout_ms=*/2000);
  handle(outputs);
}
run.close();
Move from run.run(...) to explicit push(...) / pull(...) when you need in-flight work, producer/consumer threads, non-blocking push, named output polling, or drain control.
Inspect runtime endpoints
Before you push into a multi-input graph, ask the Run what names it accepts.
auto run = graph.build();
for (const auto& name : run.input_names()) {
  std::cout << "input: " << name << "\n";
}
for (const auto& name : run.output_names()) {
  std::cout << "output: " << name << "\n";
}
If a graph has more than one public input or output, use named push(...) and pull(...). Neat should not have to guess which wire you meant.
Run multi-input and multi-output graphs
For multi-input graphs, push one named endpoint at a time, or push an unnamed list only when the graph has one unambiguous input route.
run.push("left", simaai::neat::TensorList{left_tensor});
run.push("right", simaai::neat::TensorList{right_tensor});
auto boxes = run.pull_tensors("detections", /*timeout_ms=*/2000);
auto preview = run.pull("preview", /*timeout_ms=*/2000);
When you combine streams, preserve the matching key that the graph expects. CombinePolicy::ByFrame needs frame_id; CombinePolicy::ByPts needs pts_ns. Missing keys should fail loudly. Silent joins are how bugs get promoted to architecture.
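To make the "fail loudly" point concrete, here is a minimal standalone sketch of a by-frame join. It does not use Neat types or reproduce how CombinePolicy::ByFrame is implemented; Item, join_by_frame, and the choice of std::invalid_argument are hypothetical and only show why a missing key should raise instead of pairing silently.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sample; not a Neat type.
struct Item {
  std::optional<std::int64_t> frame_id;  // the join key; may be missing
  std::string payload;
};

// Pair left and right items that share a frame_id.
// Throws when any item lacks the key instead of guessing a match.
std::vector<std::pair<Item, Item>> join_by_frame(const std::vector<Item>& left,
                                                 const std::vector<Item>& right) {
  std::map<std::int64_t, Item> right_by_frame;
  for (const auto& item : right) {
    if (!item.frame_id) {
      throw std::invalid_argument("right item is missing frame_id");
    }
    right_by_frame[*item.frame_id] = item;
  }

  std::vector<std::pair<Item, Item>> joined;
  for (const auto& item : left) {
    if (!item.frame_id) {
      throw std::invalid_argument("left item is missing frame_id");
    }
    const auto match = right_by_frame.find(*item.frame_id);
    if (match != right_by_frame.end()) {
      joined.emplace_back(item, match->second);
    }
  }
  return joined;
}
```

Items with a key but no partner are simply left unpaired in this sketch; only a missing key is treated as an error.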
Choose run options
RunOptions controls runtime behavior. Start with defaults. Change options when the source, output lifetime, or throughput target needs a different policy.
| Workload | Start with | Why |
|---|---|---|
| First working app | default RunOptions | Prove correctness before tuning. |
| Live camera or RTSP input | RunPreset::Realtime; OutputOptions::Latest() where output freshness matters | Fresh frames beat complete history. The realtime preset resolves to latest-frame overflow unless you override it. |
| File or batch processing | RunPreset::Reliable; OutputOptions::EveryFrame(...) | Preserve every input and expose backpressure. The reliable preset resolves to blocking overflow unless you override it. |
| Normal app serving | RunPreset::Balanced | Good default once the graph works. |
| Jittery source needs bounded buffering | queue_depth | Increase only enough to absorb jitter. A deep queue can hide stale frames and delayed backpressure. |
| App stores outputs after pull | OutputMemory::Owned | Keeps output lifetime independent of runtime buffers. |
| App consumes outputs immediately | OutputMemory::Auto | Let Neat choose the right ownership path first. |
| Default wait time should be explicit | input_timeout_ms | Sets the default timeout for build/run input-mode paths. Per-call timeouts still win. |
| Seeded build should catch first-sample errors early | startup_preflight = true | Keeps seeded build honest. Disable only when first-sample failures can surface later through pull(...) or last_error(). |
| Source buffer lifetime is short | advanced.copy_input = true | Protects input memory that may disappear after push(...). |
| Input size needs a guardrail | advanced.max_input_bytes | Rejects oversized input before it enters the graph. |
| You need drop telemetry | on_input_drop | Counts overload and size-guard drops by stream and reason. |
| You need build-time evidence | run_export | Writes a run snapshot when the run is built. |
simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Realtime;
options.on_input_drop = [](const simaai::neat::InputDropInfo& drop) {
  std::cerr << "dropped input from stream " << drop.stream_id
            << ": " << drop.reason << "\n";
};
auto run = graph.build(options);
Do not set every knob because it exists. The fastest way to get lost is to tune before you have a baseline.
Runtime option recipes
Copy the shape of these recipes, not the numbers. Queue sizes and output limits depend on the model, the source rate, and how fast your app pulls results.
Low-latency live output
Use this when the next frame matters more than the complete frame history. Set the output queue policy when you add the output node; set the input/drop policy when you build the Run.
graph.add(simaai::neat::nodes::Output(
    "detections",
    simaai::neat::OutputOptions::Latest()));
simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Realtime;
auto run = graph.build(options);
This recipe keeps the newest useful result instead of building a museum of stale frames. Pull continuously and count drops by stream.
Lossless batch output
Use this when every input should produce its corresponding output and backpressure is better than loss.
graph.add(simaai::neat::nodes::Output(
    "result",
    simaai::neat::OutputOptions::EveryFrame(/*max_buffers=*/64)));
simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Reliable;
auto run = graph.build(options);
Close input when the producer is done, then drain the output. If input count and output count diverge, inspect the model contract before blaming the runtime.
Owned output lifetime
Use owned output when your app stores tensors after pull(...) returns or hands them to another thread. Keep Auto for first-run code and change this only when lifetime requires it.
simaai::neat::RunOptions options;
options.output_memory = simaai::neat::OutputMemory::Owned;
auto run = graph.build(options);
Seed build when shape or format must be proven early
Most reusable runs can build without input:
run = graph.build()
Use seeded build(input, ...) when the first real input should prove shape, format, caps, or byte-guard behavior before the app enters the streaming loop.
auto run = graph.build(std::vector<cv::Mat>{frame});
startup_preflight is on by default for seeded builds, so the seed catches payload-level failures while building. If build fails, the structured report can include build_adaptation: the seed shape, dynamic limits, byte guard, and adaptation actions Neat tried. Use it to debug with evidence, not vibes.
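When you pick a value for advanced.max_input_bytes, it helps to know how large your seed payload is. The sketch below is plain arithmetic, not a Neat call: packed_image_bytes and exceeds_guard are hypothetical helpers, the image is assumed to be tightly packed with no row padding, and the "greater than" comparison is an assumption about how a byte guard would be applied.

```cpp
#include <cstddef>

// Bytes in one tightly packed image (no row padding).
// Hypothetical helper for sizing a byte guard; not part of Neat.
constexpr std::size_t packed_image_bytes(std::size_t width,
                                         std::size_t height,
                                         std::size_t channels,
                                         std::size_t bytes_per_channel) {
  return width * height * channels * bytes_per_channel;
}

// True when a payload of payload_bytes is larger than the guard.
// Assumption: the guard rejects payloads strictly larger than the limit.
constexpr bool exceeds_guard(std::size_t payload_bytes,
                             std::size_t max_input_bytes) {
  return payload_bytes > max_input_bytes;
}
```

For example, a 1920x1080 BGR frame with one byte per channel is 6,220,800 bytes, so an 8 MiB guard accepts it while a 3840x2160 frame of the same format would be rejected.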
Handle backpressure
Backpressure means the graph cannot accept or emit data as fast as the app wants.
Use these controls deliberately:
- queue_depth controls how much work can wait in runtime queues.
- overflow_policy = Block applies backpressure to the producer.
- overflow_policy = KeepLatest drops older queued input so live streams stay fresh.
- overflow_policy = DropIncoming rejects new input when the queue is full.
- try_push(...) returns false instead of blocking.
- on_input_drop reports dropped input with InputDropInfo fields such as stream_id, frame_id, port_name, and reason.
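The three overflow policies are easier to compare on a toy queue. This is a standalone, single-threaded sketch, not Neat's queue: Overflow, PushResult, and BoundedQueue are hypothetical names, and Block is modeled as a "would block" result because the sketch has no consumer thread to wait on.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical names that mirror the policies described above; not Neat types.
enum class Overflow { Block, KeepLatest, DropIncoming };
enum class PushResult { Accepted, AcceptedAfterEvict, Rejected, WouldBlock };

class BoundedQueue {
 public:
  // A depth of zero is treated as one so KeepLatest always has a slot to reuse.
  BoundedQueue(std::size_t depth, Overflow policy)
      : depth_(depth == 0 ? 1 : depth), policy_(policy) {}

  // Non-blocking push. With Block, a full queue reports WouldBlock so the
  // caller can decide whether to wait and retry.
  PushResult try_push(int frame_id) {
    if (items_.size() < depth_) {
      items_.push_back(frame_id);
      return PushResult::Accepted;
    }
    switch (policy_) {
      case Overflow::KeepLatest:
        items_.pop_front();          // evict the oldest queued frame
        items_.push_back(frame_id);  // keep the newest one
        ++dropped_;
        return PushResult::AcceptedAfterEvict;
      case Overflow::DropIncoming:
        ++dropped_;                  // the new frame never enters the queue
        return PushResult::Rejected;
      case Overflow::Block:
        break;
    }
    return PushResult::WouldBlock;
  }

  const std::deque<int>& items() const { return items_; }
  std::size_t dropped() const { return dropped_; }

 private:
  std::size_t depth_;
  Overflow policy_;
  std::deque<int> items_;
  std::size_t dropped_ = 0;
};
```

With a depth of 2 and frames 1, 2, 3 pushed in order, KeepLatest ends with frames 2 and 3 queued, DropIncoming ends with frames 1 and 2, and Block loses nothing but makes the producer wait.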
For threading, use one push thread and one pull thread for a Run. Do not push to the same Run concurrently from multiple threads unless your app serializes those calls.
Use a simple threading pattern
For live or high-throughput app-pushed graphs, start with two application threads:
- A producer thread stamps metadata and calls push(...) or try_push(...).
- A consumer thread pulls continuously and releases or copies outputs quickly.
Add more threads around your own queues, not around the same Run. The hot loop should be boring. Boring is fast.
auto run = graph.build(options);
std::thread producer([&] {
  while (auto sample = next_sample()) {
    sample->stream_id = current_stream_id();
    sample->frame_id = next_frame_id();
    if (!run.try_push("image", *sample)) {
      count_local_drop(sample->stream_id);
    }
  }
  run.close_input();
});
std::thread consumer([&] {
  simaai::neat::Sample output;
  simaai::neat::PullError error;
  while (true) {
    switch (run.pull("detections", /*timeout_ms=*/100, output, &error)) {
      case simaai::neat::PullStatus::Ok:
        handle_output(output);
        break;
      case simaai::neat::PullStatus::Timeout:
        continue;
      case simaai::neat::PullStatus::Closed:
        return;
      case simaai::neat::PullStatus::Error:
        record_runtime_error(error);
        return;
    }
  }
});
producer.join();
consumer.join();
run.close();
In C++, use the status-aware pull(...) overload when timeout, end-of-stream, and errors must be handled differently. In Python, pull(...) returns None when no sample is returned for that call, so pair it with your own producer/shutdown state.
Close, drain, or stop deliberately
Pick the shutdown path that matches your intent. Do not keep pushing into a run that is closing.
| Intent | Use | What to do next |
|---|---|---|
| Finish queued work after the last input | close_input() | Keep pulling until the output is drained. In C++, status-aware pull returns PullStatus::Closed at end-of-stream. |
| Cancel now | stop() | Stop producers and let waiting pulls unblock. Use this for shutdown or failure paths, not normal batch drain. |
| Release runtime resources | close() | Call after drain or cancellation, or let the Run object leave scope. |
For batch work, close input, drain output, then close the run. For live work, stop producers first, then stop or close the run. No zombie producers, no haunted queues.
Choose output ownership
OutputMemory controls how pulled tensors relate to runtime buffers:
- Auto: let Neat choose. Use this first.
- Owned: copy output into framework-owned memory. Use this when another thread or object stores tensors after pull.
- ZeroCopy: share runtime storage. Use this only when the page or example explains the lifetime rules.
If throughput falls off a cliff, check whether the app is holding output samples too long. Zero-copy can be fast, but pinned buffers are still pinned buffers.
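The pinned-buffer effect can be shown with a toy pool. This sketch is an illustration only and says nothing about how Neat manages output memory: BufferPool is hypothetical, it is not thread-safe, and the four-element float buffer is an arbitrary placeholder.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical fixed-size pool standing in for runtime-owned output buffers.
class BufferPool {
 public:
  explicit BufferPool(std::size_t slots)
      : free_(std::make_shared<std::size_t>(slots)) {}

  // Returns a handle that gives its slot back when the last reference goes
  // away, or nullptr when every slot is still held by the caller.
  std::shared_ptr<std::vector<float>> acquire() {
    if (*free_ == 0) {
      return nullptr;
    }
    --*free_;
    auto free_slots = free_;  // keeps the counter alive if the pool dies first
    return std::shared_ptr<std::vector<float>>(
        new std::vector<float>(4, 0.0f),
        [free_slots](std::vector<float>* buffer) {
          delete buffer;
          ++*free_slots;  // the slot becomes available again
        });
  }

  std::size_t free_slots() const { return *free_; }

 private:
  std::shared_ptr<std::size_t> free_;
};
```

Holding both handles from a two-slot pool makes the third acquire() fail. Copying the data out and resetting the handle frees the slot, which is the same trade Owned output makes: one copy in exchange for a lifetime that no longer depends on runtime buffers.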
Preserve stream identity
Multistream graphs need identity before they need tuning. Preserve stream_id and frame_id so you can prove fairness, detect starvation, and count drops.
auto sample = simaai::neat::Sample::from_image(
    frame,
    simaai::neat::ImageSpec::PixelFormat::BGR,
    simaai::neat::TensorMemory::CPU);
sample.stream_id = camera_id;
sample.frame_id = frame_number++;
if (!run.try_push("image", sample)) {
  // Count local backpressure here. Runtime drops also flow through on_input_drop.
}
For source-owned graphs, pick source nodes that preserve or stamp stream metadata. For app-pushed graphs, your app owns that metadata.
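Once outputs carry stream_id and frame_id, gaps become countable. The following is a minimal sketch under stated assumptions: GapCounter is a hypothetical helper, stream_id is treated as an int, and frame_id is assumed to increase by one per frame within a stream. It needs C++17 for structured bindings.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical helper that counts missing frames per stream from pulled outputs.
class GapCounter {
 public:
  // Record one observed output. Returns how many frames were skipped since
  // the previous output of the same stream.
  std::uint64_t observe(int stream_id, std::uint64_t frame_id) {
    auto [it, first_seen] = last_frame_.try_emplace(stream_id, frame_id);
    if (first_seen) {
      return 0;  // nothing to compare against yet
    }
    if (frame_id <= it->second) {
      return 0;  // duplicate or out-of-order output; ignored in this sketch
    }
    const std::uint64_t gap = frame_id - it->second - 1;
    it->second = frame_id;
    missing_[stream_id] += gap;
    return gap;
  }

  // Total frames missing so far for one stream.
  std::uint64_t missing(int stream_id) const {
    const auto it = missing_.find(stream_id);
    return it == missing_.end() ? 0 : it->second;
  }

 private:
  std::unordered_map<int, std::uint64_t> last_frame_;
  std::unordered_map<int, std::uint64_t> missing_;
};
```

Feed it from the consumer loop and compare the per-stream totals with the drops reported through on_input_drop and your local try_push(...) failures.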
Scale from one stream to many
Start with one stream. Then scale the topology and runtime policy on purpose.
| Pattern | Use it when | Watch |
|---|---|---|
| One stream -> one model -> one output | Building the first correct path | Output shape, dtype, and latency. |
| Many streams -> one model lane | Aggregate input rate fits one model path | Per-stream fairness and stale streams. |
| Many streams -> multiple model lanes | One lane cannot keep up | Stream partitioning, route naming, and output accounting. |
| One stream -> several models | Different decisions need the same input | Branch-level latency and target-normalized FPS. |
| Many streams -> model + metadata/video outputs | Production app emits several artifacts | Count target outputs separately from preview or telemetry outputs. |
When connecting live graph fragments, GraphLinkOptions can select realtime latest-by-stream behavior. Use it when freshness matters more than preserving every frame across a live fan-in.
Run source-owned multistream graphs
For camera-heavy apps, the graph often owns the streams. In that shape, source groups feed the model path and your app pulls results. You still need the same throughput discipline:
- give each source a stable stream_id;
- use realtime latest-by-stream behavior on live fan-in links when freshness matters;
- pull outputs continuously;
- count outputs per stream, not only in aggregate;
- export the run after the measured window if one stream starves or drops frames.
| Source-owned choice | Start with | Why |
|---|---|---|
| One camera per graph | One source group, one model path, one output | Easiest way to prove the camera, model, and output contract. |
| Many cameras into one model lane | Source fragments connected to one model fragment with GraphLinkOptions for live fan-in | Keeps one model lane busy while preserving per-stream identity. |
| Many cameras across lanes | Partition source fragments across several graph lanes | Use when one model lane saturates. Measure each lane and each stream. |
| Video output handled by the graph | Sink groups such as VideoSender(...) or H.264/UDP output groups | Use when the app should not pull and transmit every frame itself. |
If the graph owns the sources, build with graph.build() and stop it deliberately. Do not push app input into a graph that already has its own source nodes.
Drive many streams through one model lane
Use one public input endpoint when several live streams share the same model lane. Stamp each sample with stream_id and frame_id, use the realtime preset, and pull continuously. The rebel move is boring but effective: never let the output queue become your hidden bottleneck.
simaai::neat::RunOptions options;
options.preset = simaai::neat::RunPreset::Realtime;
auto run = graph.build(options);
while (running) {
  for (const auto& camera : cameras) {
    auto sample = simaai::neat::Sample::from_image(
        camera.frame(),
        simaai::neat::ImageSpec::PixelFormat::BGR,
        simaai::neat::TensorMemory::CPU);
    sample.stream_id = camera.id();
    sample.frame_id = camera.next_frame_id();
    if (!run.try_push("image", sample)) {
      ++local_drop_count[camera.id()];
    }
  }
  while (auto output = run.pull("detections", /*timeout_ms=*/0)) {
    count_output_by_stream(output->stream_id);
  }
}
run.close_input();
while (auto output = run.pull("detections", /*timeout_ms=*/1000)) {
  count_output_by_stream(output->stream_id);
}
run.close();
This pattern maximizes useful throughput only when the model lane can keep up with the accepted input rate. If one lane saturates, split streams across more lanes or lower the offered rate. Do not bury stale frames under a mountain of queue depth.
Split streams across model lanes
When one model lane is saturated, add lanes instead of hiding overload behind deeper queues. A lane is usually one Graph plus one Run with its own model route names and graph element prefix. Partition streams by a stable key, then measure each lane and each stream.
auto build_lane = [&](int lane_index) {
  const std::string lane_name = "lane" + std::to_string(lane_index);
  simaai::neat::Model::Options model_options;
  model_options.name_suffix = "_" + lane_name;
  simaai::neat::Model lane_model(model_path, model_options);
  simaai::neat::GraphOptions graph_options;
  graph_options.element_name_prefix = lane_name + "_";
  simaai::neat::Graph graph("detector_" + lane_name, graph_options);
  graph.add(simaai::neat::nodes::Input("image"));
  graph.add(lane_model);
  graph.add(simaai::neat::nodes::Output(
      "detections",
      simaai::neat::OutputOptions::Latest()));
  simaai::neat::RunOptions run_options;
  run_options.preset = simaai::neat::RunPreset::Realtime;
  return graph.build(run_options);
};
std::vector<simaai::neat::Run> lanes;
lanes.emplace_back(build_lane(0));
lanes.emplace_back(build_lane(1));
while (running) {
  for (const auto& camera : cameras) {
    auto sample = make_sample_for_camera(camera);
    const std::size_t lane_index = camera.id() % lanes.size();
    if (!lanes[lane_index].try_push("image", sample)) {
      ++drop_count_by_lane[lane_index];
    }
  }
  for (std::size_t lane_index = 0; lane_index < lanes.size(); ++lane_index) {
    while (auto output = lanes[lane_index].pull("detections", /*timeout_ms=*/0)) {
      count_output(lane_index, output->stream_id);
    }
  }
}
Keep the partition stable so stream identity and cache behavior stay predictable. If lane 0 starves while lane 1 is idle, the partitioning policy is the bug.
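A quick way to check a partitioning policy before you run it is to count how many streams land on each lane. This sketch uses the same modulo key as the example above; lane_for_stream and streams_per_lane are hypothetical helpers, and stream IDs are assumed to be unsigned integers.

```cpp
#include <cstddef>
#include <vector>

// Map a stream to a lane by a stable key. lane_count must be greater than zero.
std::size_t lane_for_stream(std::size_t stream_id, std::size_t lane_count) {
  return stream_id % lane_count;
}

// Count how many streams each lane would receive, so skew is visible up front.
std::vector<std::size_t> streams_per_lane(const std::vector<std::size_t>& stream_ids,
                                          std::size_t lane_count) {
  std::vector<std::size_t> counts(lane_count, 0);
  for (const std::size_t stream_id : stream_ids) {
    ++counts[lane_for_stream(stream_id, lane_count)];
  }
  return counts;
}
```

Modulo is stable as long as the lane count does not change, but it is only as balanced as the IDs are. Camera IDs 0, 2, 4, and 6 across two lanes put all four streams on lane 0 and none on lane 1, which is exactly the skew the paragraph above warns about.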
Tune the model lane deliberately
If a graph is correct but cannot meet the offered stream rate, first prove where the bottleneck lives. Do not start by making every queue bigger. That hides overload and gives stale frames a place to retire.
Use this triage:
| Symptom | First check | Then try |
|---|---|---|
| Accepted input FPS is high, but output FPS is low | The model lane or postprocess lane is saturated | Split streams across lanes, reduce offered rate, or test advanced_execution.inference_async on the model route or graph options. |
| try_push(...) returns false often | The ingress queue is full | Pull continuously, reduce the offered rate, or choose an explicit OverflowPolicy. |
| One stream disappears in aggregate metrics | Missing or uneven stream_id accounting | Count outputs and drops per stream; use live latest-by-stream behavior for live fan-in. |
| Output stalls while input keeps moving | The app is not pulling fast enough, or it holds runtime-backed outputs | Pull in a dedicated loop and release/copy outputs before pushing more. |
| Latency grows over time | Queues are absorbing old work | Use a smaller queue, RunPreset::Realtime, or OutputOptions::Latest() where freshness wins. |
When you need to test model-route execution behavior, set one advanced execution field at a time and measure the same workload before and after:
simaai::neat::GraphOptions graph_options;
graph_options.advanced_execution.inference_async = true;
simaai::neat::Graph graph("detector", graph_options);
If the change does not improve the measured path, revert it. A knob that cannot prove its value does not belong in the app.
Pick a throughput recipe
Start from the workload, not from a random queue number.
| Workload | Runtime shape | Start with | Prove it with |
|---|---|---|---|
| Single live stream | One reusable Run, one producer, one puller | RunPreset::Realtime; OutputOptions::Latest() for preview-style outputs | Accepted FPS, output FPS, drop count, and latency. |
| File or batch processing | One reusable Run; close input and drain | RunPreset::Reliable; OutputOptions::EveryFrame(...) | Input count equals output count, unless the model contract says otherwise. |
| Many live streams into one model lane | App-pushed Sample inputs with stream_id / frame_id, or source-owned fragments that stamp identity | RunPreset::Realtime; GraphLinkPolicy::RealtimeLatestByStream through GraphLinkOptions for live fan-in | Per-stream FPS and per-stream drops, not only aggregate FPS. |
| Many live streams across model lanes | Partition streams across multiple model instances or graph lanes | Same as the live-stream recipe per lane | Per-lane utilization, per-stream starvation, and target-normalized FPS. |
| One input fans out to several models | Branch once, then run separate model paths | Branch/fan-out in the Graph; choose output behavior per branch | Branch latency and target-normalized FPS. |
If one model lane is saturated, do not hide the problem behind a deeper queue. Split the work across lanes, lower the offered input rate, or choose an explicit drop policy. Queue depth buys tolerance for jitter; it does not create accelerator capacity.
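The arithmetic behind that last sentence is short. The sketch below is a back-of-the-envelope model, not a Neat measurement: it assumes a single FIFO queue in front of a lane with a fixed service rate, and both function names are hypothetical.

```cpp
#include <cstddef>

// Approximate wait, in milliseconds, for a frame that enters a full FIFO
// queue: it sits behind queue_depth frames served at service_fps.
double full_queue_wait_ms(std::size_t queue_depth, double service_fps) {
  return 1000.0 * static_cast<double>(queue_depth) / service_fps;
}

// Input rate that must be dropped or blocked once the queue is full.
// Note that queue depth does not appear in this formula at all.
double overload_fps(double offered_fps, double service_fps) {
  return offered_fps > service_fps ? offered_fps - service_fps : 0.0;
}
```

With a lane that serves 30 FPS, a queue of 30 frames adds roughly one second of staleness once it fills, while a queue of 4 adds about 133 ms. If the offered rate is 120 FPS against a 90 FPS lane, 30 FPS has to go somewhere regardless of depth.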
Tune throughput without lying to yourself
Throughput is a loop shape, not one magic option.
- Build the graph once.
- Warm up outside the measurement window.
- Keep a bounded number of inputs in flight.
- Pull continuously so output queues do not become the bottleneck.
- Release or copy outputs before pushing more when output buffers may be shared with the runtime.
- Pick one overload policy: block, keep latest, or drop incoming.
- Preserve stream_id and frame_id.
- Close input and drain before stopping the run.
- Measure the right numbers.
- Export run evidence after the measured workload.
Measure these separately:
| Metric | Meaning |
|---|---|
| Offered input FPS | Inputs attempted per second, often streams * source_fps. |
| Accepted input FPS | Inputs accepted by push(...) or try_push(...) per second. |
| Aggregate output FPS | All pulled outputs per second across all outputs. |
| Per-stream FPS | Output rate for each stream_id. |
| Target-normalized FPS | Outputs that count toward the app's target result per second. Useful when one input fans out to several outputs. |
| Drop rate | Dropped or rejected inputs by stream_id, source, and reason. |
Aggregate FPS can look great while one stream starves. Per-stream metrics catch the crime.
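Here is a small sketch of that check, assuming your consumer loop already counts pulled outputs per stream_id for one measured window. StreamCounts, aggregate_fps, and starved_streams are hypothetical names, and stream_id is treated as an int.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Outputs pulled per stream_id during one measured window.
// Hypothetical bookkeeping kept by the application; not a Neat type.
using StreamCounts = std::map<int, std::uint64_t>;

// All pulled outputs per second across every stream.
double aggregate_fps(const StreamCounts& counts, double window_seconds) {
  std::uint64_t total = 0;
  for (const auto& entry : counts) {
    total += entry.second;
  }
  return static_cast<double>(total) / window_seconds;
}

// Expected streams whose output rate is below min_fps, including streams
// that produced no output at all in the window.
std::vector<int> starved_streams(const std::vector<int>& expected_streams,
                                 const StreamCounts& counts,
                                 double window_seconds,
                                 double min_fps) {
  std::vector<int> starved;
  for (const int stream_id : expected_streams) {
    const auto found = counts.find(stream_id);
    const double outputs =
        found == counts.end() ? 0.0 : static_cast<double>(found->second);
    if (outputs / window_seconds < min_fps) {
      starved.push_back(stream_id);
    }
  }
  return starved;
}
```

Pass the list of streams you expected, not only the ones that showed up. A stream that never produced output has no entry in the counts and would otherwise vanish from the report.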
Throughput loop shape
Use this shape for an app-pushed graph. Replace next_inputs() with your input source. Keep the loop boring: bounded in-flight work, continuous pulls, and no report export inside the hot path.
auto run = graph.build(options);
for (int i = 0; i < warmup_frames; ++i) {
  run.push(next_inputs());
  (void)run.pull(/*timeout_ms=*/5000);
}
auto measurement = run.start_measurement();
int in_flight = 0;
while (in_flight < max_in_flight && has_input()) {
  if (run.push(next_inputs())) {
    ++inputs_sent;
    ++in_flight;
  }
}
while (has_input() || in_flight > 0) {
  auto output = run.pull(/*timeout_ms=*/1000);
  if (output) {
    ++outputs_seen;
    --in_flight;
    output.reset();  // Do not pin runtime-backed buffers longer than needed.
  }
  while (has_input() && in_flight < max_in_flight) {
    if (!run.try_push(next_inputs())) {
      break;
    }
    ++inputs_sent;
    ++in_flight;
  }
}
run.close_input();
while (auto output = run.pull(/*timeout_ms=*/1000)) {
  ++outputs_seen;
}
simaai::neat::MeasureReport report = measurement.stop();
simaai::neat::save_run_json(run, report, "run_after_measurement.json");
run.close();
Keep per-frame logging, output validation, file download, source setup, and report export out of the measured hot loop unless you are explicitly measuring end-to-end behavior.
Measure and export evidence
Use start_measurement(...) to observe an application-owned push/pull window.
Use run export for evidence:
- RunOptions.run_export writes a build-time snapshot.
- C++ run_to_json(...) and save_run_json(...) export a run after it has executed.
- Python run.json(...) and run.save_json(...) export the same kind of evidence.
Enable power telemetry on the RunOptions that builds the run:
simaai::neat::RunOptions options;
options.enable_board_power(/*sample_interval_ms=*/100);
auto run = graph.build(options);
simaai::neat::MeasureOptions measure_options;
measure_options.include_power = true;
auto scope = run.start_measurement(measure_options);
Power data depends on board rail support and monitor configuration. Document the measurement setup with the numbers; do not make power numbers look portable when the rails are not.
Build-time export answers “what did Neat build?” After-run export answers “what happened while it ran?”
Export at build time and after execution
Use build-time export for CI artifacts and startup debugging:
simaai::neat::RunOptions options;
options.run_export.path = "run-build.json";
options.run_export.label = "classifier-startup";
auto run = graph.build(options);
Use after-run export after samples have moved through the graph:
auto scope = run.start_measurement();
// Push and pull the workload.
simaai::neat::MeasureReport report = scope.stop();
simaai::neat::save_run_json(run, report, "run-after.json");
Do not export inside the measured hot loop unless the benchmark is explicitly end-to-end.
Read a run export
A run export is useful because it ties topology, runtime options, and measurements together in one artifact. When you open the JSON, start with the customer-facing evidence:
| Section or field | What it answers |
|---|---|
| graph.named_inputs / graph.named_outputs | Which public endpoints did this run expose? |
| graph.public_view | What did the app graph look like before runtime lowering? |
| run.output_materialization | Were outputs owned, zero-copy, or selected automatically? |
| run.stats | High-level lifetime counters for inputs, outputs, drops, and latency. |
| run.graph_metrics.counters | Inputs, outputs, and drops for the exported run or measured window. |
| run.graph_metrics.window | The measured time window when the export includes a MeasureReport. |
| run.node_metrics / run.plugin_metrics_unattributed | Which stages dominated runtime when detailed timing was enabled. |
| run.path_timing | Edge/path timing when path timing data was collected. |
| run.graph_metrics.power | Whether power was collected, skipped, disabled, or unavailable. |
Attach the run export with the model contract and the smallest reproducer when you ask for help. It is the black box recorder, minus the mystery.
Debug a graph run
When a graph fails, inspect what you built before changing options.
- Validate the graph.
- Inspect public graph endpoints before build.
- Inspect runtime endpoints after build.
- Pull with a status-aware path when timeout, closed, and error must mean different things.
- Export the run after the workload has executed.
simaai::neat::GraphReport report = graph.validate();
std::cout << report.to_json() << "\n";
auto run = graph.build();
simaai::neat::Sample sample;
simaai::neat::PullError error;
switch (run.pull("classes", /*timeout_ms=*/1000, sample, &error)) {
  case simaai::neat::PullStatus::Ok:
    // Use sample.
    break;
  case simaai::neat::PullStatus::Timeout:
    // No output arrived before the timeout.
    break;
  case simaai::neat::PullStatus::Closed:
    // End of stream. Stop draining.
    break;
  case simaai::neat::PullStatus::Error:
    std::cerr << error.code << ": " << error.message << "\n";
    if (error.report) {
      std::cerr << error.report->repro_note << "\n";
    }
    break;
}
Collect evidence for support
When a graph fails in an application, capture the smallest evidence packet that explains the public behavior. Do this before changing options. Evidence beats folklore.
Include:
- the model artifact name and how it was produced;
- Neat version/build information;
- input shape, dtype, layout, pixel format, and payload family;
- graph.validate().to_json() when build or validation fails;
- run.input_names() and run.output_names() for endpoint failures;
- a run export JSON after at least one sample has moved when runtime behavior is the issue;
- the MeasureReport JSON or text when the issue is throughput, latency, or power;
- the smallest runnable snippet that reproduces the behavior.
Python can capture version and run evidence directly:
print(pyneat.build_info())
report = graph.validate()
with open("graph-report.json", "w", encoding="utf-8") as f:
    f.write(report.to_json())
# After samples have moved through the run:
run.save_json("run-after.json")
C++ can export the same evidence with GraphReport::to_json() and save_run_json(...):
std::cout << "neat_version=" << sima_neat_version() << "\n";
std::cout << graph.validate().to_json() << "\n";
// After samples have moved through the run:
simaai::neat::save_run_json(run, "run-after.json");
If the failure only appears under load, attach the measured run export instead of a build-time snapshot. Build-time export says what Neat built; after-run export says what happened when the graph fought real input.
For thrown failures, catch NeatError and read the structured report:
try {
  auto run = graph.build();
} catch (const simaai::neat::NeatError& error) {
  const auto& report = error.report();
  std::cerr << report.error_code << "\n";
  std::cerr << report.repro_note << "\n";
}
Troubleshoot slow or missing output
If throughput is low or outputs disappear, check these first:
- Are you building the graph inside the measured loop?
- Are you pushing one input, waiting for the whole graph to go idle, then pushing the next?
- Is the app pulling continuously?
- Is one output branch blocking the whole graph?
- Are you holding zero-copy or runtime-backed outputs too long?
- Are queues too shallow for jitter, or too deep to expose backpressure?
- Is the overload policy explicit?
- Are drops counted through on_input_drop or local try_push(...) failures?
- Does every expected stream_id produce output in the measured window?
- Are logs, decoding checks, file I/O, or report export inside the hot loop?
Fix correctness first. Then make it fast. Then prove which one you measured.