
Logging

Visibility into what's happening during loading, inference, and other operations.

Overview

QVAC provides two complementary logging primitives:

  • loggingStream(): stream real-time logs emitted by the SDK server and native addons (llamacpp, whispercpp, etc.). You decide what to do with each log line (print, persist, filter).
  • getLogger(): create a logger for your own application code (namespaced, configurable level, optional transports).
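Because loggingStream() hands you plain log objects, filtering and persistence are ordinary application code. A minimal sketch of that pattern (the records here are hard-coded stand-ins; in real use they arrive asynchronously via `for await (const log of loggingStream({ id }))`, with the same level/namespace/message shape used in the example below):

```javascript
// Stand-in records; a real stream yields objects of the same shape,
// consumed with: for await (const log of loggingStream({ id }))
const logs = [
  { level: "debug", namespace: "llamacpp:llm", message: "Loading model weights..." },
  { level: "info", namespace: "llamacpp:llm", message: "Model loaded successfully" },
  { level: "debug", namespace: "llamacpp:llm", message: "Starting inference..." },
];

const KEEP = new Set(["info", "warn", "error"]); // filter: drop debug noise
const persisted = [];

for (const log of logs) {
  if (!KEEP.has(log.level)) continue;
  // persist: an in-memory array here; could equally append to a file or database
  persisted.push(`[${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
}

console.log(persisted);
```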

Functions

  1. getLogger() — create a logger
  2. loadModel() — pass logger via logger option
  3. loggingStream() — stream real-time logs from models or SDK server

For how to use each function, see SDK — API reference.

Flow

  1. Pass a logger when loading your model.
  2. When logging is enabled, you'll see real-time logs from the underlying model libraries:
[DEBUG] llamacpp:llm: Loading model weights...
[INFO] llamacpp:llm: Model loaded successfully, vocab_size=32000
[DEBUG] llamacpp:llm: Starting inference...
[DEBUG] llamacpp:llm: Inference completed, tokens=12

Features

  • Streaming API (loggingStream) — Consume real-time logs programmatically. Stream either:

    • SDK server logs using SDK_LOG_ID, or
    • per-model addon logs using the model ID returned by loadModel().
  • Logger API (getLogger) — Create loggers for your application code with custom transports. Console output enabled by default; set enableConsole: false to use only custom transports.

Logging works for all model types (LLM, Whisper, NMT, Embeddings) and provides valuable insight into model performance and behavior.
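The exact getLogger() options are documented in the API reference; conceptually it follows the familiar transport pattern, where a transport is just a function that receives finished log records. A self-contained, illustrative sketch of that pattern (names here are our own, not the SDK's):

```javascript
// Illustrative transport-style logger — NOT the SDK implementation.
const LEVELS = { error: 0, warn: 1, info: 2, debug: 3 };

function makeLogger({ namespace, level = "info", transports = [] }) {
  const threshold = LEVELS[level];
  const log = (lvl, message) => {
    if (LEVELS[lvl] > threshold) return; // drop records below the configured level
    const record = { namespace, level: lvl, message, timestamp: Date.now() };
    for (const t of transports) t(record); // fan out to every transport
  };
  return {
    error: (m) => log("error", m),
    warn: (m) => log("warn", m),
    info: (m) => log("info", m),
    debug: (m) => log("debug", m),
  };
}

// Example transport: collect records in memory (could equally write to a file).
const records = [];
const logger = makeLogger({
  namespace: "my-app",
  level: "info",
  transports: [(r) => records.push(r)],
});

logger.info("model ready");
logger.debug("this is filtered out"); // below the "info" threshold
```

This is the shape to keep in mind when wiring custom transports into getLogger(): with enableConsole: false, only your transports see the records.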

Configuration

To configure global logging (level and console output), use a config file (qvac.config.json, qvac.config.js, or qvac.config.ts) and set:

  • loggerLevel: "error" | "warn" | "info" | "debug"
  • loggerConsoleOutput: boolean
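For example, a qvac.config.json that enables debug-level logging but disables console output, so logs are consumed only through loggingStream():

```json
{
  "loggerLevel": "debug",
  "loggerConsoleOutput": false
}
```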

Example

The following script streams logs from the SDK server and from two loaded models:

logging.js
import {
    loadModel,
    completion,
    unloadModel,
    loggingStream,
    SDK_LOG_ID,
    LLAMA_3_2_1B_INST_Q4_0,
    GTE_LARGE_FP16,
    VERBOSITY,
    embed,
} from "@qvac/sdk";
try {
    console.log("🚀 Starting log streaming demo...\n");
    // Note: To configure logging (level and console output), use config file:
    // { "loggerLevel": "debug", "loggerConsoleOutput": false } in qvac.config.json/js/ts
    // Subscribe to SDK server logs in background
    console.log("📡 Starting SDK server log stream...\n");
    (async () => {
        for await (const log of loggingStream({ id: SDK_LOG_ID })) {
            console.log(`[SDK] [${log.level.toUpperCase()}] [${log.namespace}] ${log.message}`);
        }
    })().catch(() => {
        // Stream terminated - normal on shutdown
    });
    // Load models
    console.log("📥 Loading models (watch SDK logs above)...\n");
    const llmModelId = await loadModel({
        modelSrc: LLAMA_3_2_1B_INST_Q4_0,
        modelType: "llm",
        modelConfig: {
            ctx_size: 2048,
            temp: 0.7,
            verbosity: VERBOSITY.ERROR, // Only log errors, remaining logs are captured by loggingStream
        },
    });
    const embedModelId = await loadModel({
        modelSrc: GTE_LARGE_FP16,
        modelType: "embeddings",
    });
    console.log("📡 Starting model-specific log streams...\n");
    (async () => {
        for await (const log of loggingStream({ id: llmModelId })) {
            const timestamp = new Date(log.timestamp).toISOString();
            console.log(`[LLM] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
        }
    })().catch(() => {
        // Stream terminated - this is normal when model unloads
    });
    (async () => {
        for await (const log of loggingStream({ id: embedModelId })) {
            const timestamp = new Date(log.timestamp).toISOString();
            console.log(`[EMBED] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`);
        }
    })().catch(() => {
        // Stream terminated - this is normal when model unloads
    });
    const messages = [
        { role: "user", content: "Count from 1 to 5 and explain each number." },
    ];
    const result = completion({
        modelId: llmModelId,
        history: messages,
        stream: true,
    });
    const embedding = await embed({
        modelId: embedModelId,
        text: messages[0]?.content ?? "Hello, world!",
    });
    console.log("📝 Response:\n");
    for await (const token of result.tokenStream) {
        process.stdout.write(token);
    }
    console.log("Embedding (first 20 elements)", embedding.slice(0, 20));
    console.log("Embeddings length", embedding.length);
    console.log("\n💡 Notice three log streams running:\n" +
        "   - [SDK] = SDK server operations\n" +
        "   - [LLM] = LLM model inference logs\n" +
        "   - [EMBED] = Embedding model logs\n");
    await unloadModel({ modelId: llmModelId, clearStorage: false });
    await unloadModel({ modelId: embedModelId, clearStorage: false });
}
catch (error) {
    console.error("❌ Error:", error);
    process.exit(1);
}

Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.
