loadModel()
Loads a machine learning model from a local path, remote URL, or Hyperdrive key.
```ts
function loadModel(options): Promise<string>;
```
Description
This function supports multiple model types: LLM (Large Language Model), Whisper (speech recognition), embeddings, NMT (translation), and TTS. It can handle local file paths, remote HTTP/HTTPS URLs, and Hyperdrive URLs (pear://).
When onProgress is provided, the function uses streaming to provide real-time download progress. Otherwise, it uses a simple request-response pattern for faster execution.
Accepted forms
Load new model
```ts
function loadModel(options): Promise<string>;
```
Hot-reload existing model config
```ts
function loadModel(options): Promise<string>;
```
Parameters
| Name | Type | Required? | Description |
|---|---|---|---|
| options | object | ✓ | Model loading configuration |
options (new model)
| Field | Type | Required? | Description |
|---|---|---|---|
| modelSrc | string | ✓ | The location from which the model weights are fetched (local path, remote URL, or Hyperdrive URL) |
| modelType | "llm" \| "whisper" \| "embeddings" \| "nmt" \| "tts" | ✓ | The type of model |
| modelConfig | object | ✗ | Model-specific configuration options |
| projectionModelSrc | string | ✗ | (LLM only) Projection model source for multimodal models |
| vadModelSrc | string | ✗ | (Whisper only) VAD model source for voice activity detection |
| configSrc | string | ✗ | (TTS only) Config file source |
| eSpeakDataPath | string | ✗ | (TTS only) Path to eSpeak data |
| delegate | object | ✗ | Delegation configuration for P2P inference |
| onProgress | function | ✗ | Callback for download progress updates |
| logger | Logger | ✗ | Logger instance for model operation logs |
options (hot-reload)
| Field | Type | Required? | Description |
|---|---|---|---|
| modelId | string | ✓ | The ID of an existing loaded model |
| modelType | "llm" \| "whisper" \| "embeddings" \| "nmt" \| "tts" | ✓ | The type of model (must match the loaded model's type) |
| modelConfig | object | ✓ | New configuration to apply |
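As a minimal sketch of the hot-reload form (the modelId value below is a placeholder for an ID returned by an earlier loadModel call):

```ts
// Hot-reload: apply a new config to an already-loaded model.
// "my-llama-model" is a hypothetical ID from a prior loadModel call.
const reloadedId = await loadModel({
  modelId: "my-llama-model",
  modelType: "llm", // must match the loaded model's type
  modelConfig: { temp: 0.2, top_p: 0.9 }
});
```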
modelConfig (varies by model type)
Configuration options depend on the modelType. Common options include:
LLM models:
- ctx_size (number): Context window size
- device ("cpu" | "gpu"): Device to use
- temp (number): Temperature
- top_p (number): Top-p sampling
- top_k (number): Top-k sampling
- seed (number): Random seed
- n_predict (number): Max tokens to predict
Whisper models:
- language (string): Language code (e.g., "en")
- translate (boolean): Whether to translate to English
- strategy ("greedy" | "beam_search"): Sampling strategy
- temperature (number): Temperature
- vad_params (object): VAD parameters
Embeddings models:
device("cpu" | "gpu"): Device to usectx_size(number): Context size
NMT models:
- from (string): Source language
- to (string): Target language
- temperature (number): Temperature
TTS models:
- language (string): Language code
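The Example section below covers LLM and Whisper loads; for the remaining model types, a sketch along these lines should work (all paths are placeholders, and the language codes are illustrative):

```ts
// Embeddings model on GPU (placeholder path)
const embeddingsId = await loadModel({
  modelSrc: "/path/to/embeddings-model.gguf",
  modelType: "embeddings",
  modelConfig: { device: "gpu", ctx_size: 512 }
});

// NMT model translating from English to Spanish (illustrative codes)
const nmtId = await loadModel({
  modelSrc: "/path/to/nmt-model.gguf",
  modelType: "nmt",
  modelConfig: { from: "en", to: "es" }
});

// TTS model with its config file and eSpeak data (placeholder paths)
const ttsId = await loadModel({
  modelSrc: "/path/to/tts-model.bin",
  modelType: "tts",
  configSrc: "/path/to/tts-config.json",
  eSpeakDataPath: "/path/to/espeak-ng-data",
  modelConfig: { language: "en" }
});
```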
delegate
| Field | Type | Required? | Description |
|---|---|---|---|
| topic | string | ✓ | P2P topic for delegation |
| providerPublicKey | string | ✓ | Provider's public key |
| timeout | number | ✗ | Timeout in milliseconds |
| fallbackToLocal | boolean | ✗ | Whether to fall back to local inference if delegation fails |
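As a sketch, a delegated load might look like this; the topic and providerPublicKey values are placeholders rather than real keys:

```ts
// Delegate inference to a remote P2P provider (placeholder values)
const delegatedId = await loadModel({
  modelSrc: "pear://<hyperdrive-key>/llama-7b.gguf",
  modelType: "llm",
  delegate: {
    topic: "<p2p-topic>",
    providerPublicKey: "<provider-public-key>",
    timeout: 30000,        // give up after 30 seconds
    fallbackToLocal: true  // fall back to local inference on failure
  }
});
```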
onProgress callback
```ts
(progress: ModelProgressUpdate) => void
```
ModelProgressUpdate object:
| Field | Type | Description |
|---|---|---|
| percentage | number | Download percentage (0-100) |
| downloaded | number | Bytes downloaded |
| total | number | Total bytes |
| downloadKey | string | Key for canceling download |
| shardInfo | object | Shard information (if sharded model) |
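For illustration, a callback that reports the numeric fields above (the byte-formatting helper is ad hoc, and the cancellation API that consumes downloadKey is not shown here):

```ts
// Progress callback built on the ModelProgressUpdate fields
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(1);

const onProgress = (progress) => {
  console.log(
    `${progress.percentage.toFixed(1)}% ` +
    `(${toMB(progress.downloaded)} / ${toMB(progress.total)} MB)`
  );
  // progress.downloadKey can be stored to cancel the download later;
  // progress.shardInfo is only present for sharded models.
};
```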
Returns
Promise<string> that resolves to the model ID (either the provided modelSrc or a generated ID)
Throws
- When model loading fails, with details in the error message
- When streaming ends unexpectedly (only when using onProgress)
- When receiving an invalid response type from the server
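Since loadModel rejects with the failure details described above, callers will typically wrap it in try/catch; a minimal sketch:

```ts
try {
  const modelId = await loadModel({
    modelSrc: "/path/to/model.gguf",
    modelType: "llm"
  });
} catch (err) {
  // The error carries the failure details in its message
  console.error("Failed to load model:", err);
}
```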
Example
```ts
// Local file path - absolute path
const localModelId = await loadModel({
  modelSrc: "/home/user/models/llama-7b.gguf",
  modelType: "llm",
  modelConfig: { ctx_size: 2048 }
});

// Local file path - relative path
const relativeModelId = await loadModel({
  modelSrc: "./models/whisper-base.gguf",
  modelType: "whisper"
});

// Hyperdrive URL with key and path
const hyperdriveId = await loadModel({
  modelSrc: "pear://<hyperdrive-key>/llama-7b.gguf",
  modelType: "llm",
  modelConfig: { ctx_size: 2048 }
});

// Remote HTTP/HTTPS URL with progress tracking
const remoteId = await loadModel({
  modelSrc: "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf",
  modelType: "llm",
  onProgress: (progress) => {
    console.log(`Downloaded: ${progress.percentage}%`);
  }
});

// Multimodal model with projection
const multimodalId = await loadModel({
  modelSrc: "https://huggingface.co/.../main-model.gguf",
  modelType: "llm",
  projectionModelSrc: "https://huggingface.co/.../projection-model.gguf",
  modelConfig: { ctx_size: 512 },
  onProgress: (progress) => {
    console.log(`Loading: ${progress.percentage}%`);
  }
});

// Whisper with VAD model
const whisperId = await loadModel({
  modelSrc: "https://huggingface.co/.../whisper-model.gguf",
  modelType: "whisper",
  vadModelSrc: "https://huggingface.co/.../vad-model.bin",
  modelConfig: {
    language: "en",
    strategy: "greedy",
    vad_params: {
      threshold: 0.35
    }
  }
});

// Load with automatic logging
import { getLogger } from "@qvac/sdk";

const logger = getLogger("my-app");

const modelId = await loadModel({
  modelSrc: "/path/to/model.gguf",
  modelType: "llm",
  logger
});
```