
loadModel()

Loads a machine learning model from a local path, remote URL, or Hyperdrive key.

function loadModel(options): Promise<string>;

Description

This function supports multiple model types: LLM (Large Language Model), Whisper (speech recognition), embeddings, NMT (neural machine translation), and TTS (text-to-speech). It can load models from local file paths, remote HTTP/HTTPS URLs, and Hyperdrive URLs (pear://).

When onProgress is provided, the function uses streaming to provide real-time download progress. Otherwise, it uses a simple request-response pattern for faster execution.
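
For example, the same model can be loaded either way (the path below is a placeholder):

// Without onProgress: resolves once the model is available (request-response)
const quickId = await loadModel({
  modelSrc: "/path/to/model.gguf",
  modelType: "llm"
});

// With onProgress: the download is streamed and progress callbacks fire
const streamedId = await loadModel({
  modelSrc: "/path/to/model.gguf",
  modelType: "llm",
  onProgress: ({ percentage }) => console.log(`${percentage}%`)
});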

Accepted forms

Load new model

function loadModel(options): Promise<string>;

Pass an options object that includes modelSrc (see "options (new model)" below).

Hot-reload existing model config

function loadModel(options): Promise<string>;

Pass an options object that includes the modelId of an already-loaded model (see "options (hot-reload)" below).

Parameters

  • options (object): Model loading configuration

options (new model)

  • modelSrc (string): The location from which the model weights are fetched (local path, remote URL, or Hyperdrive URL)
  • modelType ("llm" | "whisper" | "embeddings" | "nmt" | "tts"): The type of model
  • modelConfig (object): Model-specific configuration options
  • projectionModelSrc (string): (LLM only) Projection model source for multimodal models
  • vadModelSrc (string): (Whisper only) VAD model source for voice activity detection
  • configSrc (string): (TTS only) Config file source
  • eSpeakDataPath (string): (TTS only) Path to eSpeak data
  • delegate (object): Delegation configuration for P2P inference
  • onProgress (function): Callback for download progress updates
  • logger (Logger): Logger instance for model operation logs

options (hot-reload)

  • modelId (string): The ID of an existing loaded model
  • modelType ("llm" | "whisper" | "embeddings" | "nmt" | "tts"): The type of model (must match the loaded model's type)
  • modelConfig (object): New configuration to apply
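
A minimal hot-reload sketch, assuming an LLM was loaded earlier and its returned model ID was kept (paths and values are placeholders):

// Load once and keep the returned model ID
const existingId = await loadModel({
  modelSrc: "/path/to/model.gguf",
  modelType: "llm",
  modelConfig: { ctx_size: 2048 }
});

// Later: apply a new configuration to the already-loaded model
await loadModel({
  modelId: existingId,
  modelType: "llm", // must match the type of the loaded model
  modelConfig: { ctx_size: 4096, temp: 0.7 }
});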

modelConfig (varies by model type)

Configuration options depend on the modelType. Common options include:

LLM models:

  • ctx_size (number): Context window size
  • device ("cpu" | "gpu"): Device to use
  • temp (number): Temperature
  • top_p (number): Top-p sampling
  • top_k (number): Top-k sampling
  • seed (number): Random seed
  • n_predict (number): Max tokens to predict

Whisper models:

  • language (string): Language code (e.g., "en")
  • translate (boolean): Whether to translate to English
  • strategy ("greedy" | "beam_search"): Sampling strategy
  • temperature (number): Temperature
  • vad_params (object): VAD parameters

Embeddings models:

  • device ("cpu" | "gpu"): Device to use
  • ctx_size (number): Context size
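
For example (the path below is a placeholder), an embeddings model pinned to the GPU:

const embeddingsId = await loadModel({
  modelSrc: "/path/to/embedding-model.gguf",
  modelType: "embeddings",
  modelConfig: { device: "gpu", ctx_size: 512 }
});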

NMT models:

  • from (string): Source language
  • to (string): Target language
  • temperature (number): Temperature
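
For example (the path is a placeholder, and the language-code format shown is an assumption):

const nmtId = await loadModel({
  modelSrc: "/path/to/nmt-model.bin",
  modelType: "nmt",
  modelConfig: { from: "en", to: "es" }
});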

TTS models:

  • language (string): Language code
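
For example (all paths and file extensions below are placeholders), a TTS model loaded together with its config file and eSpeak data:

const ttsId = await loadModel({
  modelSrc: "/path/to/tts-model.onnx",
  modelType: "tts",
  configSrc: "/path/to/tts-config.json",
  eSpeakDataPath: "/path/to/espeak-ng-data",
  modelConfig: { language: "en" }
});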

delegate

  • topic (string): P2P topic for delegation
  • providerPublicKey (string): Provider's public key
  • timeout (number): Timeout in milliseconds
  • fallbackToLocal (boolean): Whether to fall back to local inference if delegation fails
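
A sketch of delegated (P2P) inference, assuming a known topic and provider public key; every value below is a placeholder:

const delegatedId = await loadModel({
  modelSrc: "/path/to/model.gguf",
  modelType: "llm",
  delegate: {
    topic: "my-inference-topic",
    providerPublicKey: "<provider-public-key>",
    timeout: 30000,
    fallbackToLocal: true // fall back to local inference if delegation fails
  }
});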

onProgress callback

(progress: ModelProgressUpdate) => void

ModelProgressUpdate object:

  • percentage (number): Download percentage (0-100)
  • downloaded (number): Bytes downloaded
  • total (number): Total bytes
  • downloadKey (string): Key for canceling the download
  • shardInfo (object): Shard information (present if the model is sharded)
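
The callback fields can be combined into a richer progress report, as in this sketch (the URL is a placeholder):

const id = await loadModel({
  modelSrc: "https://example.com/model.gguf",
  modelType: "llm",
  onProgress: (progress) => {
    const mb = (bytes) => (bytes / (1024 * 1024)).toFixed(1);
    console.log(
      `${progress.percentage}% (${mb(progress.downloaded)} / ${mb(progress.total)} MB)`
    );
    if (progress.shardInfo) {
      console.log("Shard info:", progress.shardInfo);
    }
  }
});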

Returns

Promise<string> — Promise that resolves to the model ID (either the provided modelSrc or a generated ID)

Throws

  • When model loading fails, with details in the error message
  • When streaming ends unexpectedly (only when using onProgress)
  • When receiving an invalid response type from the server
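
Since failures surface as a rejected promise, wrap the call in try/catch (the path is a placeholder):

try {
  const id = await loadModel({
    modelSrc: "/path/to/model.gguf",
    modelType: "llm"
  });
} catch (err) {
  // The error message carries the details of why loading failed
  console.error("Model loading failed:", err.message);
}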

Example

// Local file path - absolute path
const localModelId = await loadModel({
  modelSrc: "/home/user/models/llama-7b.gguf",
  modelType: "llm",
  modelConfig: { ctx_size: 2048 }
});

// Local file path - relative path
const relativeModelId = await loadModel({
  modelSrc: "./models/whisper-base.gguf",
  modelType: "whisper"
});

// Hyperdrive URL with key and path
const hyperdriveId = await loadModel({
  modelSrc: "pear://<hyperdrive-key>/llama-7b.gguf",
  modelType: "llm",
  modelConfig: { ctx_size: 2048 }
});

// Remote HTTP/HTTPS URL with progress tracking
const remoteId = await loadModel({
  modelSrc: "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf",
  modelType: "llm",
  onProgress: (progress) => {
    console.log(`Downloaded: ${progress.percentage}%`);
  }
});

// Multimodal model with projection
const multimodalId = await loadModel({
  modelSrc: "https://huggingface.co/.../main-model.gguf",
  modelType: "llm",
  projectionModelSrc: "https://huggingface.co/.../projection-model.gguf",
  modelConfig: { ctx_size: 512 },
  onProgress: (progress) => {
    console.log(`Loading: ${progress.percentage}%`);
  }
});

// Whisper with VAD model
const whisperId = await loadModel({
  modelSrc: "https://huggingface.co/.../whisper-model.gguf",
  modelType: "whisper",
  vadModelSrc: "https://huggingface.co/.../vad-model.bin",
  modelConfig: {
    language: "en",
    strategy: "greedy",
    vad_params: {
      threshold: 0.35
    }
  }
});

// Load with automatic logging
import { getLogger } from "@qvac/sdk";
const logger = getLogger("my-app");

const modelId = await loadModel({
  modelSrc: "/path/to/model.gguf",
  modelType: "llm",
  logger
});
