Logging
The SDK provides built-in log propagation that streams model logs from the worker process to your client application in real time. This gives you visibility into what's happening inside your models during loading, inference, and other operations.
Flow
- Pass a logger when loading your model.
- When logging is enabled, you'll see real-time logs from the underlying model libraries:
[DEBUG] llamacpp:llm: Loading model weights...
[INFO] llamacpp:llm: Model loaded successfully, vocab_size=32000
[DEBUG] llamacpp:llm: Starting inference...
[DEBUG] llamacpp:llm: Inference completed, tokens=12
Features
- Streaming API (loggingStream) — Consume real-time logs from your models programmatically. Console output is disabled by default; you control formatting, storage (file/database), analytics, etc.
- Logger API (getLogger) — Create loggers for your application code with custom transports. Console output is enabled by default; set enableConsole: false to use only custom transports (see the sketch below).
It works for all model types (LLM, Whisper, NMT, Embeddings) and provides valuable insight into model performance and behavior.
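The Logger API is not shown in the usage example below, so here is a minimal sketch. Only getLogger and the enableConsole option come from this page; the name argument, the transports array, the transport shape (a function that receives each log entry), and the logger.info call are assumptions for illustration, so check the SDK reference for the exact signatures.
import { getLogger } from "@qvac/sdk";

// Collect entries in memory instead of printing them (illustrative only).
const entries = [];

const logger = getLogger({
  name: "my-app", // assumed: a name/namespace for this logger
  enableConsole: false, // use only the custom transports below
  transports: [(entry) => entries.push(entry)], // assumed transport shape
});

logger.info("Application started"); // assumed level method
console.log(`Captured ${entries.length} log entries`);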
Usage
import {
loadModel,
completion,
unloadModel,
loggingStream,
LLAMA_3_2_1B_INST_Q4_0,
GTE_LARGE_FP16,
VERBOSITY,
embed,
} from "@qvac/sdk";
try {
console.log("🚀 Starting addon log streaming demo...\n");
  // Load the LLM and embeddings models
const llmModelId = await loadModel({
modelSrc: LLAMA_3_2_1B_INST_Q4_0,
modelType: "llm",
modelConfig: {
ctx_size: 2048,
temp: 0.7,
      verbosity: VERBOSITY.ERROR, // only log errors; the remaining logs are captured by loggingStream
},
});
const embedModelId = await loadModel({
modelSrc: GTE_LARGE_FP16,
modelType: "embeddings",
});
console.log("Starting log stream in background...\n");
  // Consume LLM logs in the background; the loop ends when the model is unloaded
  (async () => {
for await (const log of loggingStream({ modelId: llmModelId })) {
const timestamp = new Date(log.timestamp).toISOString();
console.log(
`[LLM] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`,
);
}
})().catch(() => {
// Stream terminated - this is normal when model unloads
});
  // Consume embeddings logs in the background
  (async () => {
for await (const log of loggingStream({ modelId: embedModelId })) {
const timestamp = new Date(log.timestamp).toISOString();
console.log(
`[EMBED] [${timestamp}] [${log.level.toUpperCase()}] ${log.namespace}: ${log.message}`,
);
}
})().catch(() => {
// Stream terminated - this is normal when model unloads
});
  // Run a streaming completion and an embedding; their logs appear on the streams above
  const messages = [
{ role: "user", content: "Count from 1 to 5 and explain each number." },
];
const result = completion({
modelId: llmModelId,
history: messages,
stream: true,
});
const embedding = await embed({
modelId: embedModelId,
text: messages[0]?.content ?? "Hello, world!",
});
console.log("📝 Response:\n");
for await (const token of result.tokenStream) {
process.stdout.write(token);
}
console.log("Embedding (first 20 elements)", embedding.slice(0, 20));
console.log("Embeddings length", embedding.length);
await unloadModel({ modelId: llmModelId, clearStorage: false });
await unloadModel({ modelId: embedModelId, clearStorage: false });
} catch (error) {
console.error("❌ Error:", error);
process.exit(1);
}
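To get the file storage mentioned under Features, one option is to append each entry from loggingStream to a JSONL file. The sketch below reuses the stream and entry fields from the example above (timestamp, level, namespace, message) together with Node's fs/promises; the file path and JSON layout are illustrative choices, not part of the SDK.
import { appendFile } from "node:fs/promises";
import {
  loadModel,
  unloadModel,
  loggingStream,
  LLAMA_3_2_1B_INST_Q4_0,
} from "@qvac/sdk";

const modelId = await loadModel({
  modelSrc: LLAMA_3_2_1B_INST_Q4_0,
  modelType: "llm",
});

// Append each log entry as one JSON line (illustrative path and format)
(async () => {
  for await (const log of loggingStream({ modelId })) {
    const line = JSON.stringify({
      time: new Date(log.timestamp).toISOString(),
      level: log.level,
      namespace: log.namespace,
      message: log.message,
    });
    await appendFile("model-logs.jsonl", line + "\n");
  }
})().catch(() => {
  // Stream terminated - this is normal when the model unloads
});

// ...run completions or other work here...

await unloadModel({ modelId, clearStorage: false });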