GET /api/v1/ollama/show-model
Example Request

curl --request GET \
  --url 'http://localhost:3000/api/v1/ollama/show-model?modelName=llama2&verbose=true' \
  --header 'Authorization: Bearer <token>'

Example Response

{
  "success": true,
  "capabilities": ["chat", "completion", "generate"],
  "details": {
    "parent_model": "",
    "format": "gguf",
    "family": "llama",
    "families": ["llama"],
    "parameter_size": "7B",
    "quantization_level": "Q4_0"
  }
}

Query Parameters

modelName (string, required)
  Name of the model to inspect (e.g., "llama2", "mistral:7b")

verbose (boolean, default: false)
  Include verbose model information

baseUrl (string, optional)
  Ollama server URL (defaults to http://localhost:11434)
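
Query values that are themselves URLs, such as baseUrl, are easiest to pass with curl's --get and --data-urlencode flags, which percent-encode them automatically. A minimal sketch (the remote host below is a placeholder):

curl --get \
  --url 'http://localhost:3000/api/v1/ollama/show-model' \
  --header 'Authorization: Bearer <token>' \
  --data-urlencode 'modelName=mistral:7b' \
  --data-urlencode 'baseUrl=http://192.168.1.50:11434'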

Response

success (boolean)
  Indicates if the request was successful

capabilities (array)
  Model capabilities and features (e.g., ["chat", "completion", "embedding"])

details (object)
  Detailed model information (format, family, parameter_size, quantization_level, etc.)
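
As a quick sketch, individual fields can be pulled out of the response with jq (assuming jq is installed; field names as documented above):

# Print just the capabilities array
curl --silent \
  --url 'http://localhost:3000/api/v1/ollama/show-model?modelName=llama2' \
  --header 'Authorization: Bearer <token>' \
  | jq '.capabilities'
# ["chat", "completion", "generate"]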

Notes

  • This endpoint only works for installed models (use List Models to see what’s installed)
  • The capabilities array indicates what the model can do (chat, embeddings, etc.); see the sketch after these notes
  • quantization_level affects model size and performance
  • Use verbose=true for additional technical details
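
For example, a small shell sketch that gates a workflow on a reported capability; jq's --exit-status flag makes the pipeline's exit code reflect whether the value was found ("embedding" here matches the capability named above):

# Exit 0 only if the model reports the "embedding" capability
if curl --silent \
     --url 'http://localhost:3000/api/v1/ollama/show-model?modelName=llama2' \
     --header 'Authorization: Bearer <token>' \
   | jq --exit-status '.capabilities | index("embedding")' > /dev/null
then
  echo "llama2 supports embeddings"
else
  echo "llama2 does not support embeddings"
fi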

Common Quantization Levels

  • Q4_0: 4-bit quantization, smallest size, lower quality
  • Q5_K_M: 5-bit quantization, medium size and quality
  • Q8_0: 8-bit quantization, larger size, higher quality
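
Along the same lines, an installed model's quantization level can be read straight from the details object (a sketch; jq assumed available):

curl --silent \
  --url 'http://localhost:3000/api/v1/ollama/show-model?modelName=llama2' \
  --header 'Authorization: Bearer <token>' \
  | jq --raw-output '.details.quantization_level'
# Q4_0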