Model Configuration

Details about the deployed LLM and its configuration settings.

Model Information

Details about the deployed model

Model Name: Mistral Nemo Instruct
Architecture: Managed foundation model via DigitalOcean Agent
Parameters: Managed by provider
Context Length: 32,768 tokens
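The 32,768-token window is the main model-side limit callers need to respect. Below is a minimal pre-flight check, using a rough 4-characters-per-token heuristic rather than Mistral's actual tokenizer (an assumption; swap in an exact tokenizer where precise counts matter):

```python
# Rough pre-flight check that a prompt fits the 32,768-token context window.
# The 4-characters-per-token ratio is a heuristic for English text, not
# Mistral's real tokenizer.
CONTEXT_LENGTH = 32_768
MAX_NEW_TOKENS = 2_048  # reserve room for the generation budget (see below)

def estimate_tokens(text: str) -> int:
    """Approximate token count (~4 characters per token)."""
    return max(1, len(text) // 4)

def fits_context(prompt: str) -> bool:
    """True if the prompt plus the generation budget fits the window."""
    return estimate_tokens(prompt) + MAX_NEW_TOKENS <= CONTEXT_LENGTH

if __name__ == "__main__":
    print(fits_context("Summarize the attached report."))  # True
```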

Deployment Configuration

Server and optimization settings

Server Type: DigitalOcean Agent Endpoint
Acceleration: Managed by provider
Quantization: Managed by provider
Batch Size: Managed by provider
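Because acceleration, quantization, and batching are all handled by DigitalOcean, client-side setup reduces to pointing an HTTP client at the agent endpoint. A minimal sketch, assuming the endpoint exposes an OpenAI-compatible API under `/api/v1` (verify the path against your agent's settings; the `DO_AGENT_ENDPOINT` and `DO_AGENT_ACCESS_KEY` variable names are illustrative):

```python
import os

from openai import OpenAI  # pip install openai

# Hypothetical environment variables -- substitute your agent's endpoint URL
# and endpoint access key from the DigitalOcean control panel.
AGENT_ENDPOINT = os.environ["DO_AGENT_ENDPOINT"]  # e.g. "https://<agent-id>.agents.do-ai.run"
AGENT_ACCESS_KEY = os.environ["DO_AGENT_ACCESS_KEY"]

# The agent endpoint is assumed to speak the OpenAI chat completions protocol.
client = OpenAI(
    base_url=f"{AGENT_ENDPOINT}/api/v1",
    api_key=AGENT_ACCESS_KEY,
)
```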

Generation Settings

Parameters for text generation

Temperature: 0.7 (default)
Top-P: 0.9
Max New Tokens: 2048
Repetition Penalty: 1.1
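These settings map onto OpenAI-style request parameters. A sketch of a request that applies them, assuming the OpenAI-compatible client configured above; note that repetition penalty is not a standard OpenAI parameter, so routing it through `extra_body` is an assumption about the backend, which may ignore or reject it:

```python
import os

from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url=f"{os.environ['DO_AGENT_ENDPOINT']}/api/v1",
    api_key=os.environ["DO_AGENT_ACCESS_KEY"],
)

response = client.chat.completions.create(
    model="n/a",  # agent endpoints route to the attached model; this field is typically ignored
    messages=[{"role": "user", "content": "Explain quantization in two sentences."}],
    temperature=0.7,  # default sampling temperature from the table above
    top_p=0.9,        # nucleus sampling cutoff
    max_tokens=2048,  # "Max New Tokens" in OpenAI-style APIs
    # Repetition penalty (1.1) has no standard OpenAI field; if the backend
    # accepts it at all, it usually has to be passed out-of-band:
    extra_body={"repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```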

Performance Metrics

Runtime performance statistics

Average Latency: ~100 ms per token
Throughput: ~30 tokens per second
GPU Memory Usage: ~24 GB VRAM
Request Queue: Max 128 concurrent requests
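The latency and throughput figures can be spot-checked from the client side by streaming a completion and timing the chunks. A rough sketch follows; chunk count only approximates token count, and the measurement includes network round-trip time, so the numbers will differ from server-side metrics:

```python
import os
import time

from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url=f"{os.environ['DO_AGENT_ENDPOINT']}/api/v1",
    api_key=os.environ["DO_AGENT_ACCESS_KEY"],
)

start = time.perf_counter()
first_token_at = None
chunks = 0

# Stream a response and time the deltas; each content chunk roughly equals one token.
stream = client.chat.completions.create(
    model="n/a",
    messages=[{"role": "user", "content": "Write a haiku about latency."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        chunks += 1
elapsed = time.perf_counter() - start

if first_token_at is not None:
    print(f"time to first token: {(first_token_at - start) * 1000:.0f} ms")
print(f"throughput: {chunks / max(elapsed, 1e-9):.1f} tokens/s (chunk-approximated)")
```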