Model Configuration
Details about the LLM model and its configuration settings.
Model Information
Details about the deployed model.

Model Name: Mistral NeMo Instruct
Architecture: Managed foundation model via DigitalOcean Agent
Parameters: Managed by provider
Context Length: 32,768 tokens
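
Prompt and completion share the 32,768-token window, so it can help to budget tokens on the client before sending a request. Below is a minimal sketch that estimates prompt length with the tokenizer from the open-weights Mistral-Nemo-Instruct-2407 checkpoint; the managed endpoint may tokenize slightly differently, so treat the result as an estimate rather than a guarantee.

```python
from transformers import AutoTokenizer

CONTEXT_LENGTH = 32_768  # from Model Information above
MAX_NEW_TOKENS = 2_048   # from Generation Settings below

# Assumption: the hosted model tokenizes like the open-weights checkpoint.
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

def fits_in_context(prompt: str) -> bool:
    """Return True if the prompt leaves room for a full-length completion."""
    prompt_tokens = len(tokenizer.encode(prompt))
    return prompt_tokens + MAX_NEW_TOKENS <= CONTEXT_LENGTH

print(fits_in_context("What is our current deployment configuration?"))
```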
Deployment Configuration
Server and optimization settings.

Server Type: DigitalOcean Agent Endpoint
Acceleration: Managed by provider
Quantization: Managed by provider
Batch Size: Managed by provider
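
Because acceleration, quantization, and batching are all handled by the provider, client-side setup reduces to pointing a client at the agent endpoint. The sketch below assumes the agent exposes an OpenAI-compatible API; the environment variable names and the URL comment are placeholders, so substitute the values from your agent's settings page.

```python
import os

from openai import OpenAI

# Placeholders: take both values from the agent's settings page.
client = OpenAI(
    base_url=os.environ["DO_AGENT_ENDPOINT"],   # the agent's endpoint URL
    api_key=os.environ["DO_AGENT_ACCESS_KEY"],  # agent access key, not a DO API token
)
```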
Generation Settings
Parameters for text generation.

Temperature: 0.7 (default)
Top-P: 0.9
Max New Tokens: 2048
Repetition Penalty: 1.1
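
The sketch below applies these settings to a single chat completion, reusing the hypothetical `client` from the previous example. Note that `repetition_penalty` is not part of the OpenAI request schema, so it is passed through `extra_body` on the assumption that the backend accepts it; drop that field if the endpoint rejects unknown parameters.

```python
# `client` is the OpenAI-compatible client constructed in the previous example.
response = client.chat.completions.create(
    model="n/a",  # managed agents route to the configured model; value is a placeholder
    messages=[{"role": "user", "content": "Summarize the deployment settings."}],
    temperature=0.7,  # default per Generation Settings
    top_p=0.9,
    max_tokens=2048,  # "Max New Tokens"
    extra_body={"repetition_penalty": 1.1},  # assumption: backend accepts this field
)
print(response.choices[0].message.content)
```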
Performance Metrics
Runtime performance statistics.

Average Latency: ~100 ms per token
Throughput: ~30 tokens per second
GPU Memory Usage: ~24 GB VRAM
Request Queue: max 128 concurrent requests
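
To sanity-check the latency and throughput figures above against a live deployment, a streamed completion can be timed end to end. The sketch below counts stream chunks as a rough proxy for tokens and reuses the hypothetical `client` from the earlier examples.

```python
import time

# `client` is the OpenAI-compatible client constructed earlier.
start = time.monotonic()
tokens = 0
stream = client.chat.completions.create(
    model="n/a",  # placeholder, as above
    messages=[{"role": "user", "content": "Write a short paragraph about latency."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        tokens += 1  # one content chunk ~ one token; an approximation
elapsed = time.monotonic() - start

print(f"~{elapsed / max(tokens, 1) * 1000:.0f} ms per token")
print(f"~{tokens / elapsed:.1f} tokens per second")
```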