
Install and quickstart

Supported platforms, and how to install and run a simple example using the JS/TS SDK.

Supported platforms

JS runtimes

  • Node.js >= v22.17
  • Bare >= v1.24
  • Hermes >= v0.13

OS

  • macOS
  • Linux with GLIBC >= v2.38
  • Windows
  • iOS
  • Android

Installation

npm install @qvac/sdk

Linux

System dependency:

apt install vulkan-sdk
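
As an optional check (not part of the SDK itself), you can verify that Vulkan is available by running vulkaninfo from the vulkan-tools package:

vulkaninfo --summary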

Expo

  1. Install the peer dependencies:
npm i expo-file-system react-native-bare-kit
  2. On Android, bump minSdkVersion to 29 by adding ext { minSdkVersion = 29 } to android/build.gradle, or by using expo-build-properties as sketched below.
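
A minimal sketch of the expo-build-properties approach (assumes you have installed the plugin, e.g. with npx expo install expo-build-properties); add it to the plugins array in app.json:

{
  "expo": {
    "plugins": [
      [
        "expo-build-properties",
        { "android": { "minSdkVersion": 29 } }
      ]
    ]
  }
}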

  3. Add the QVAC Expo plugin to your Expo config (the app.config.js form is shown; use the equivalent JSON in app.json):

export default {
  expo: {
    plugins: ["@qvac/sdk/expo-plugin"],
  },
};
  4. Prebuild your project to generate the native files:
npx expo prebuild
  5. Build and run it on a physical device:
npx expo run:ios --device
# or
npx expo run:android --device

Due to limitations in llama.cpp, QVAC currently does not run on emulators; you must use a physical device.

Flow

Before you can use a model, you must load it from its source location into memory. The flow for performing AI inference is:

  1. Load a model by calling loadModel().
  2. Perform inference and other tasks by calling the appropriate functions from the SDK API — e.g., completion().
  3. When you are done with the model, call unloadModel() to release system resources.
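
To guarantee that resources are released even when inference throws, steps 2 and 3 are often wrapped in try/finally. A minimal sketch using only the functions above (the same API as the quickstart below):

// Lifecycle sketch: load, infer, always unload
import {
  loadModel,
  completion,
  unloadModel,
  LLAMA_3_2_1B_INST_Q4_0,
} from "@qvac/sdk";

// 1. Load the model into memory
const modelId = await loadModel({
  modelSrc: LLAMA_3_2_1B_INST_Q4_0,
  modelType: "llm",
});

try {
  // 2. Perform inference (repeat as needed while the model is loaded)
  const result = completion({
    modelId,
    history: [{ role: "user", content: "Say hello" }],
    stream: true,
  });
  for await (const token of result.tokenStream) {
    process.stdout.write(token);
  }
} finally {
  // 3. Always release system resources, even if inference throws
  await unloadModel({ modelId });
}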

Quickstart

Run the self-contained example below to verify that the SDK is working:

quickstart.ts
import {
  loadModel,
  LLAMA_3_2_1B_INST_Q4_0,
  completion,
  unloadModel,
} from "@qvac/sdk";

try {
  // Load a model into memory
  const modelId = await loadModel({
    modelSrc: LLAMA_3_2_1B_INST_Q4_0,
    modelType: "llm",
    onProgress: (progress) => {
      console.log(progress);
    },
  });

  // You can use the loaded model multiple times
  const history = [
    {
      role: "user",
      content: "Explain quantum computing in one sentence",
    },
  ];
  const result = completion({ modelId, history, stream: true });
  for await (const token of result.tokenStream) {
    process.stdout.write(token);
  }

  // Unload model to free up system resources
  await unloadModel({ modelId });
} catch (error) {
  console.error("❌ Error:", error);
  process.exit(1);
}
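
One way to run the file, assuming a TypeScript runner such as tsx (the SDK does not prescribe one; any Node.js >= v22.17 setup that executes TypeScript works):

npx tsx quickstart.ts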
