Install and quickstart
Supported platforms, and how to install and run a simple example using the JS/TS SDK.
Supported platforms
JS runtimes
- Node.js v22.17
- Bare v1.24
- Hermes v0.13
OS
- macOS
- Linux with GLIBC >= v2.38
- Windows
- iOS
- Android
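A preflight check can make an unsupported runtime fail early with a clear message. The sketch below is only illustrative: `isAtLeast` is a hypothetical helper, not part of `@qvac/sdk`, and it assumes the versions listed above are minimums, which this page does not state explicitly.

```typescript
// Hypothetical helper (not part of @qvac/sdk): compares dotted version strings.
// Assumption: the runtime versions listed above are treated as minimums.
function isAtLeast(actual: string, minimum: string): boolean {
  // Strip a leading "v" and split into numeric parts, e.g. "v22.17.1" -> [22, 17, 1]
  const parse = (v: string): number[] =>
    v.replace(/^v/, "").split(".").map((part) => Number.parseInt(part, 10) || 0);

  const a = parse(actual);
  const m = parse(minimum);
  const length = Math.max(a.length, m.length);

  // Compare part by part, numerically rather than as text
  for (let i = 0; i < length; i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true; // versions are equal
}

// Example inputs; these are made-up version strings, not detected values.
const nodeOk = isAtLeast("22.17.0", "22.17"); // true
const tooOld = isAtLeast("v20.11.1", "22.17"); // false
```

In Node.js, the running version is available as `process.versions.node`, so a script could call `isAtLeast(process.versions.node, "22.17")` before importing the SDK.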
Installation
npm install @qvac/sdk
Linux
Peer dependency:
apt install vulkan-sdk
Expo
- Peer dependencies:
npm i expo-file-system react-native-bare-kit
- On Android, bump minSdkVersion to 29, either by adding ext { minSdkVersion=29 } to android/build.gradle or by using expo-build-properties.
- Add the QVAC Expo plugin to app.json:
export default {
  expo: {
    plugins: ["@qvac/sdk/expo-plugin"],
  },
};
- Prebuild your project to generate the native files:
npx expo prebuild
- Build and run it on a physical device:
npx expo run:ios --device
# or
npx expo run:android --device
Due to limitations with llama.cpp, QVAC currently does not run on emulators. You must use a physical device.
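For the Expo setup, the two config-related steps (raising the Android minSdkVersion and registering the QVAC plugin) might be combined in a single config file. The sketch below assumes a TypeScript config (`app.config.ts`) and that `expo-build-properties` has been installed separately. The `config` constant is an illustrative name, and a real project would also carry its usual fields such as `name` and `slug`.

```typescript
// Sketch of an app.config.ts; assumes expo-build-properties is installed.
const config = {
  expo: {
    plugins: [
      // QVAC plugin, as named in the Expo steps
      "@qvac/sdk/expo-plugin",
      // Raises the Android minSdkVersion to 29 without editing android/build.gradle
      ["expo-build-properties", { android: { minSdkVersion: 29 } }],
    ],
  },
};

export default config;
```

After changing the config, run the prebuild step again so the native projects pick up the new values.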
Flow
Before you can use a model, you need to load it into memory. The flow for performing AI inference is:
- Load a model by calling loadModel().
- Perform inference and other tasks by calling the appropriate SDK API functions, e.g., completion().
- When you are done with the model, call unloadModel() to release system resources.
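One way to make sure the unload step runs even when inference throws is to wrap the three steps in a small helper built on `try`/`finally`. The sketch below is a minimal illustration: `withModel`, `LoadFn`, and `UnloadFn` are hypothetical names, and the model ID is assumed to be a string. It uses stand-in functions so that it runs without the SDK; in real use you would pass the SDK's `loadModel` and `unloadModel` instead.

```typescript
// Hypothetical shapes mirroring the calls named above; the real SDK types may differ.
// Assumption: the model ID returned by loading is a string.
type LoadFn<O> = (options: O) => Promise<string>;
type UnloadFn = (args: { modelId: string }) => Promise<unknown>;

// Hypothetical helper (not part of @qvac/sdk): load, use, and always unload.
async function withModel<O, T>(
  load: LoadFn<O>,
  unload: UnloadFn,
  options: O,
  use: (modelId: string) => Promise<T>,
): Promise<T> {
  const modelId = await load(options); // step 1: load into memory
  try {
    return await use(modelId); // step 2: inference or other work
  } finally {
    await unload({ modelId }); // step 3: release resources, even on error
  }
}

// Stand-ins so the sketch runs without the SDK or a model download.
const calls: string[] = [];
const fakeLoad: LoadFn<{ modelSrc: string }> = async ({ modelSrc }) => {
  calls.push(`load:${modelSrc}`);
  return "model-1";
};
const fakeUnload: UnloadFn = async ({ modelId }) => {
  calls.push(`unload:${modelId}`);
};

// Records the order load -> use -> unload in `calls`.
const done = withModel(fakeLoad, fakeUnload, { modelSrc: "demo" }, async (id) => {
  calls.push(`use:${id}`);
  return id.length;
});
```

The `finally` block is the design choice that matters here: a failed completion does not leave the model in memory.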
Quickstart
Run the self-contained example below to verify that the SDK is working:
import {
  loadModel,
  LLAMA_3_2_1B_INST_Q4_0,
  completion,
  unloadModel,
} from "@qvac/sdk";

try {
  // Load a model into memory
  const modelId = await loadModel({
    modelSrc: LLAMA_3_2_1B_INST_Q4_0,
    modelType: "llm",
    onProgress: (progress) => {
      console.log(progress);
    },
  });

  // You can use the loaded model multiple times
  const history = [
    {
      role: "user",
      content: "Explain quantum computing in one sentence",
    },
  ];
  const result = completion({ modelId, history, stream: true });
  for await (const token of result.tokenStream) {
    process.stdout.write(token);
  }

  // Unload model to free up system resources
  await unloadModel({ modelId });
} catch (error) {
  console.error("❌ Error:", error);
  process.exit(1);
}
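If you need the full response as a single string instead of printing tokens as they arrive, the stream can be accumulated. The sketch below is a hedged illustration: `collectTokens` is a hypothetical helper, not an SDK function, and it assumes the token stream is an async iterable of strings, as the `for await` loop in the quickstart suggests. A fake stream stands in for `result.tokenStream` so that the snippet runs without a model.

```typescript
// Hypothetical helper (not part of @qvac/sdk): gather a token stream into one string.
async function collectTokens(
  tokens: AsyncIterable<string>,
  onToken?: (token: string) => void,
): Promise<string> {
  let text = "";
  for await (const token of tokens) {
    onToken?.(token); // optional hook, e.g. to print each token as it arrives
    text += token;
  }
  return text;
}

// Stand-in for result.tokenStream so the sketch runs without a model.
async function* fakeTokenStream(): AsyncGenerator<string> {
  yield "Hello";
  yield ", ";
  yield "world";
}

// `seen` records each token in arrival order; `full` resolves to the joined text.
const seen: string[] = [];
const full = collectTokens(fakeTokenStream(), (t) => seen.push(t));
```

With the quickstart's variables, the call would look like `await collectTokens(result.tokenStream, (t) => process.stdout.write(t))`, which keeps the live output while also returning the complete text.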