Gradio

Granite4 model family: chat with multi-LLM selection

llama.cpp-based quantized GGUF inference

This Space hosts the Granite4 model family, from 350M up to 32B parameters. Select a model in the additional inputs section below.

Model

System message

Max tokens: 1 to 4096

Temperature: 0.1 to 4

Top-p: 0.1 to 1

Top-k: 0 to 100

Repetition penalty: 0 to 2
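
To make the sampling controls above more concrete, here is a minimal pure-Python sketch of how temperature, top-k and top-p typically narrow a next-token distribution. It is not this Space's code and does not call llama.cpp. The names RANGES, clamp and filtered_distribution are hypothetical, the toy logits are made up for illustration, and treating top-k = 0 as "disabled" is an assumption about the backend. Repetition penalty is left out to keep the sketch short.

```python
import math

# Slider ranges as listed on the page; the dict itself is illustrative.
RANGES = {
    "max_tokens": (1, 4096),
    "temperature": (0.1, 4.0),
    "top_p": (0.1, 1.0),
    "top_k": (0, 100),
    "repetition_penalty": (0.0, 2.0),
}


def clamp(name, value):
    """Force a setting into the range its slider allows."""
    lo, hi = RANGES[name]
    return max(lo, min(hi, value))


def filtered_distribution(logits, temperature, top_k, top_p):
    """Return {token: probability} after temperature, top-k and top-p."""
    temperature = clamp("temperature", temperature)
    top_k = clamp("top_k", top_k)
    top_p = clamp("top_p", top_p)

    # Temperature: divide logits, then softmax (shifted by the max for stability).
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(
        ((tok, e / total) for tok, e in exps.items()),
        key=lambda kv: kv[1],
        reverse=True,
    )

    # Top-k: keep the k most likely tokens (assumption: 0 means "no limit").
    if top_k > 0:
        ranked = ranked[:top_k]
        norm = sum(p for _, p in ranked)
        ranked = [(tok, p / norm) for tok, p in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


if __name__ == "__main__":
    toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -1.0}
    print(filtered_distribution(toy_logits, temperature=1.0, top_k=2, top_p=1.0))
```

In this sketch a lower temperature sharpens the distribution before filtering, so fewer tokens survive a given top-p cutoff. A higher temperature flattens it and lets more candidates through.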
