# OCR
Optical character recognition (OCR) for extracting text from images.
## Overview
OCR uses ONNX Runtime as the inference engine. It runs a two-stage pipeline and requires a compatible model for each stage:
- Text detection: locate text regions in an image
- Text recognition: decode characters in detected regions
Load supported models using `modelType: "ocr"`. Then, provide an image as either a file path (string) or an in-memory buffer. Each OCR block contains the extracted text and may include `bbox` (bounding box coordinates) and `confidence` (recognition score).
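The block shape can be sketched as a TypeScript type, based on the fields described above (`text`, optional `bbox`, optional `confidence`). The `OCRBlock` name and the `joinBlocks` helper below are illustrative, not part of the SDK:

```typescript
// Hypothetical shape of a single OCR block, matching the fields above.
type OCRBlock = {
  text: string;        // extracted text
  bbox?: number[];     // bounding box coordinates, when available
  confidence?: number; // recognition score, when available
};

// Join recognized text from all blocks, optionally dropping
// blocks below a confidence threshold. Blocks without a
// confidence score are always kept.
function joinBlocks(blocks: OCRBlock[], minConfidence = 0): string {
  return blocks
    .filter((b) => (b.confidence ?? 1) >= minConfidence)
    .map((b) => b.text)
    .join(" ");
}
```

A filter like this is useful when downstream code should ignore noisy low-confidence detections.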
## Functions
Use the following sequence of function calls:
1. `loadModel` — load an OCR model (`modelType: "ocr"`)
2. `ocr` — run detection and recognition on an image
3. `unloadModel` — release the model when you are done
4. `close` — shut down the runtime
For how to use each function, see SDK — API reference.
## Models
You can load any ONNX Runtime-compatible OCR pipeline. Required files: `detector_craft.onnx` and `recognizer_<lang>.onnx` (both in the `*.onnx` format).
For models available as constants, see SDK — Models.
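As a sketch, a custom model directory for an English pipeline might look like this. The file names follow the "Required files" convention above; the directory name `my-ocr-model` is arbitrary, and in practice the `.onnx` files would be real exported models rather than empty placeholders:

```shell
# Hypothetical layout for a custom English OCR pipeline.
mkdir -p my-ocr-model
touch my-ocr-model/detector_craft.onnx my-ocr-model/recognizer_en.onnx
ls my-ocr-model
```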
## Example
The following script loads an OCR model, runs it on an image, and prints each extracted block:
```typescript
import {
  close,
  loadModel,
  ocr,
  OCR_LATIN_RECOGNIZER_1,
  unloadModel,
} from "@qvac/sdk";
import path from "path";
import { fileURLToPath } from "url";

const __dirname = path.dirname(fileURLToPath(import.meta.url));
// Use the image path passed on the command line, or fall back to a sample.
const imagePath =
  process.argv[2] || path.join(__dirname, "image/basic_test.bmp");

try {
  console.log("🚀 Loading OCR model...");
  const modelId = await loadModel({
    modelSrc: OCR_LATIN_RECOGNIZER_1,
    modelType: "ocr",
    modelConfig: {
      langList: ["en"],
      useGPU: true,
      timeout: 30000,
      magRatio: 1.5,
      defaultRotationAngles: [90, 180, 270],
      contrastRetry: false,
      lowConfidenceThreshold: 0.5,
      recognizerBatchSize: 1,
    },
  });
  console.log(`✅ Model loaded successfully! Model ID: ${modelId}`);

  console.log(`\n🔍 Running OCR on: ${imagePath}`);
  const { blocks } = ocr({
    modelId,
    image: imagePath,
    options: {
      paragraph: false,
    },
  });
  const result = await blocks;

  console.log("\n📝 OCR Results:");
  console.log("================================");
  for (const block of result) {
    console.log(`\n📄 Text: ${block.text}`);
    if (block.bbox) {
      console.log(`  📍 BBox: [${block.bbox.join(", ")}]`);
    }
    if (block.confidence !== undefined) {
      console.log(`  ✓ Confidence: ${block.confidence}`);
    }
  }
  console.log("\n================================");

  console.log("\n🔄 Unloading model...");
  await unloadModel({ modelId, clearStorage: false });
  console.log("✅ Model unloaded successfully.");
  process.exit(0);
} catch (error) {
  console.error("❌ Error during OCR processing:", error);
  await close();
  process.exit(1);
}
```

Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.