Compose GenAI into a Graph

Field	Value
Difficulty	Advanced
Estimated Read Time	20-25 minutes
Labels	`genai`, `graph`, `composition`, `streaming`, `advanced`

Most GenAI applications should start with direct model APIs. Graph composition becomes useful when GenAI needs to sit beside other Neat stages, named inputs, named outputs, routing, or application-level orchestration.

Walkthrough

Create a GenAI graph fragment

Create a task-specific model handle, configure graph-fragment options, and build a public Graph fragment.

The vision-language fragment exposes prompt, image, and use_cached_image inputs plus tokens, done, encoded, and error outputs. The speech transcriber fragment exposes audio and audio_path inputs plus tokens, done, and error outputs.
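
As a quick reference, the endpoint names listed above can be written down as plain data and checked before a push or pull. This is only a sketch: the `FragmentEndpoints` struct, the two constants, and the `is_input` / `is_output` helpers are hypothetical and do not come from the Neat API. Only the port names themselves are taken from the paragraph above.

```cpp
#include <algorithm>
#include <string_view>
#include <vector>

// Hypothetical lookup table; not part of the Neat API. Only the port names
// are taken from the tutorial text.
struct FragmentEndpoints {
  std::vector<std::string_view> inputs;
  std::vector<std::string_view> outputs;
};

const FragmentEndpoints kVisionLanguageEndpoints{
    {"prompt", "image", "use_cached_image"},
    {"tokens", "done", "encoded", "error"}};

const FragmentEndpoints kSpeechTranscriberEndpoints{
    {"audio", "audio_path"},
    {"tokens", "done", "error"}};

// Returns true when `name` appears in `names`.
inline bool contains(const std::vector<std::string_view>& names, std::string_view name) {
  return std::find(names.begin(), names.end(), name) != names.end();
}

// Returns true when `name` is one of the fragment's listed inputs.
inline bool is_input(const FragmentEndpoints& endpoints, std::string_view name) {
  return contains(endpoints.inputs, name);
}

// Returns true when `name` is one of the fragment's listed outputs.
inline bool is_output(const FragmentEndpoints& endpoints, std::string_view name) {
  return contains(endpoints.outputs, name);
}
```

A check such as `is_input(kVisionLanguageEndpoints, "image")` can catch a misspelled port name in application code before it reaches the graph.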

tutorials/022_compose_genai_into_graph/compose_genai_into_graph.cpp
auto model = std::make_shared<genai::VisionLanguageModel>(args.model);

genai::VisionLanguageOptions options;
options.system_prompt = "You are concise.";
options.max_new_tokens = 96;
options.streaming = true;
options.encode_images_on_input = false;

simaai::neat::Graph genai_fragment =
    genai::graphs::VisionLanguage(model, options, "genai_stage");

Add the fragment to an app graph

Add the fragment to a larger application graph. The fragment keeps its public endpoint names, so application code can push and pull by name.

tutorials/022_compose_genai_into_graph/compose_genai_into_graph.cpp
simaai::neat::Graph app("genai_app");
app.add(genai_fragment);
std::cout << app.describe() << "\n";

Build and push graph inputs

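Build the graph into a Run, push an image sample to the image input, then push a text sample to the prompt input so the GenAI stage can start producing tokens. The tutorial treats a false result from push as a failure and includes run.last_error() in the exception message.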

tutorials/022_compose_genai_into_graph/compose_genai_into_graph.cpp
simaai::neat::Run run = app.build();
if (!run.push("image", make_image_sample(args.image))) {
  throw std::runtime_error("push(image) failed: " + run.last_error());
}
if (!run.push("prompt", make_text_sample("prompt", "Describe this image in one sentence."))) {
  throw std::runtime_error("push(prompt) failed: " + run.last_error());
}
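
The two push calls above repeat the same check-and-throw shape, so an application with more inputs might factor it out. The helper below is a hypothetical sketch, not part of the Neat API. It assumes only what the tutorial code shows: `run.push(port, sample)` yields a value usable in a boolean test, and `run.last_error()` returns something that can be appended to a `std::string`.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper; not part of the Neat API. It is templated on the run
// and sample types so it relies only on the push() and last_error() calls
// that appear in the tutorial code.
template <typename RunT, typename SampleT>
void push_or_throw(RunT& run, const std::string& port, SampleT&& sample) {
  if (!run.push(port, std::forward<SampleT>(sample))) {
    throw std::runtime_error("push(" + port + ") failed: " + run.last_error());
  }
}

// Intended use with the tutorial's own helpers:
//   push_or_throw(run, "image", make_image_sample(args.image));
//   push_or_throw(run, "prompt", make_text_sample("prompt", "Describe this image."));
```

Taking the port name as a parameter keeps the error message in step with the port that actually failed.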

Pull tokens and completion metadata

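Poll the tokens output until a done sample arrives. Each iteration tries tokens first with a timeout argument of 250; only when no token is ready does it check done and then error, each with a timeout argument of 10. The loop is capped at 256 iterations, so it cannot spin forever if done never arrives. The done sample is a bundle with fields such as generated token count and finish reason; the tutorial discards it and uses its arrival only as the stop signal. An error sample is rethrown as an exception.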

tutorials/022_compose_genai_into_graph/compose_genai_into_graph.cpp
std::cout << "assistant: ";
for (int i = 0; i < 256; ++i) {
  if (auto token = run.pull("tokens", 250)) {
    std::cout << sample_text(*token) << std::flush;
    continue;
  }
  if (auto done = run.pull("done", 10)) {
    (void)done;
    break;
  }
  if (auto error = run.pull("error", 10)) {
    throw std::runtime_error(sample_text(*error));
  }
}
std::cout << "\n";
run.close();
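
The loop above ends quietly when its iteration cap is reached, so a caller cannot tell a finished answer from a truncated one. One possible variation, sketched below, bounds the loop by wall-clock time and reports how it ended. Everything here is hypothetical and independent of the Neat API: `StreamPulls`, `StreamEnd`, and `drain_stream` are illustration names, and each callback stands in for one pull on a named output that returns `std::nullopt` when nothing arrived.

```cpp
#include <chrono>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for pulls on the tokens, done, and error outputs.
// Each callback returns std::nullopt when no sample was available.
struct StreamPulls {
  std::function<std::optional<std::string>()> token;
  std::function<std::optional<std::string>()> done;
  std::function<std::optional<std::string>()> error;
};

enum class StreamEnd { Done, TimedOut };

// Appends token text to `out` until a done sample arrives or `budget`
// elapses. Throws std::runtime_error if an error sample arrives first.
inline StreamEnd drain_stream(const StreamPulls& pulls,
                              std::chrono::milliseconds budget,
                              std::string& out) {
  const auto deadline = std::chrono::steady_clock::now() + budget;
  while (std::chrono::steady_clock::now() < deadline) {
    if (auto token = pulls.token()) {
      out += *token;
      continue;  // keep draining tokens before checking the other outputs
    }
    if (pulls.done()) {
      return StreamEnd::Done;
    }
    if (auto error = pulls.error()) {
      throw std::runtime_error(*error);
    }
  }
  return StreamEnd::TimedOut;
}
```

In an application, the three callbacks would wrap the corresponding pulls and text extraction from the tutorial, and a `StreamEnd::TimedOut` result could be logged or surfaced to the user instead of being silently ignored.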

Run

On the Modalix DevKit, download the LFM2-VL 1.6B VLM from Hugging Face using the LLiMa CLI:

llima pull LFM2-VL-1.6B-a16w4

Run the tutorial on Modalix with the DevKit-local model directory and a local image:

C++ (prebuilt):

./lib/sima-neat/tutorials/tutorial_022_compose_genai_into_graph \
  --model /media/nvme/llima/models/LFM2-VL-1.6B-a16w4 \
  --image share/sima-neat/tutorials/assets/fronalpstock_1330.jpg

C++ (build from source):

./build.sh --target tutorial_022_compose_genai_into_graph
./build/tutorials-standalone/tutorial_022_compose_genai_into_graph \
  --model /media/nvme/llima/models/LFM2-VL-1.6B-a16w4 \
  --image share/sima-neat/tutorials/assets/fronalpstock_1330.jpg

The program prints the graph description first, then the "assistant: " prefix followed by the answer as it streams from the tokens output.

In Practice

Use this pattern when GenAI is part of a larger application graph. Keep direct GenAIModel, VisionLanguageModel, and ASRModel calls for simple request/response application code.

Full source

tutorials/022_compose_genai_into_graph/compose_genai_into_graph.cpp
#include "neat.h"

#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

#include <filesystem>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>

namespace genai = simaai::neat::genai;

struct Args {
  std::filesystem::path model;
  std::filesystem::path image;
};

Args parse_args(int argc, char** argv) {
  Args args;
  for (int i = 1; i < argc; ++i) {
    const std::string arg = argv[i];
    if (arg == "--model" && i + 1 < argc) {
      args.model = argv[++i];
    } else if (arg == "--image" && i + 1 < argc) {
      args.image = argv[++i];
    } else {
      throw std::runtime_error(
          "usage: compose_genai_into_graph --model <vlm_model_dir> --image <image>");
    }
  }
  if (args.model.empty() || args.image.empty()) {
    throw std::runtime_error("missing required --model <vlm_model_dir> or --image <image>");
  }
  return args;
}

simaai::neat::Sample make_text_sample(const std::string& port, const std::string& text) {
  return simaai::neat::make_tensor_sample(port, simaai::neat::Tensor::from_text(text));
}

simaai::neat::Sample make_image_sample(const std::filesystem::path& image_path) {
  cv::Mat bgr = cv::imread(image_path.string(), cv::IMREAD_COLOR);
  if (bgr.empty()) {
    throw std::runtime_error("failed to read image: " + image_path.string());
  }

  cv::Mat rgb;
  cv::cvtColor(bgr, rgb, cv::COLOR_BGR2RGB);
  return simaai::neat::make_tensor_sample(
      "image", simaai::neat::Tensor::from_cv_mat(rgb, simaai::neat::ImageSpec::PixelFormat::RGB,
                                                 simaai::neat::TensorMemory::CPU));
}

std::string sample_text(const simaai::neat::Sample& sample) {
  if (sample.kind == simaai::neat::SampleKind::Tensor && sample.tensor.has_value()) {
    return sample.tensor->to_text();
  }
  if (sample.kind == simaai::neat::SampleKind::TensorSet && sample.tensors.size() == 1U) {
    return sample.tensors.front().to_text();
  }
  return {};
}

int main(int argc, char** argv) {
  try {
    const Args args = parse_args(argc, argv);

    auto model = std::make_shared<genai::VisionLanguageModel>(args.model);

    genai::VisionLanguageOptions options;
    options.system_prompt = "You are concise.";
    options.max_new_tokens = 96;
    options.streaming = true;
    options.encode_images_on_input = false;

    simaai::neat::Graph genai_fragment =
        genai::graphs::VisionLanguage(model, options, "genai_stage");

    simaai::neat::Graph app("genai_app");
    app.add(genai_fragment);
    std::cout << app.describe() << "\n";

    simaai::neat::Run run = app.build();
    if (!run.push("image", make_image_sample(args.image))) {
      throw std::runtime_error("push(image) failed: " + run.last_error());
    }
    if (!run.push("prompt", make_text_sample("prompt", "Describe this image in one sentence."))) {
      throw std::runtime_error("push(prompt) failed: " + run.last_error());
    }

    std::cout << "assistant: ";
    for (int i = 0; i < 256; ++i) {
      if (auto token = run.pull("tokens", 250)) {
        std::cout << sample_text(*token) << std::flush;
        continue;
      }
      if (auto done = run.pull("done", 10)) {
        (void)done;
        break;
      }
      if (auto error = run.pull("error", 10)) {
        throw std::runtime_error(sample_text(*error));
      }
    }
    std::cout << "\n";
    run.close();

    return 0;
  } catch (const std::exception& e) {
    std::cerr << "error: " << e.what() << "\n";
    return 1;
  }
}
