Minueza-32M-Chat: A chat model with 32 million parameters

Base model: Felladrin/Minueza-32M-Base
Datasets used during SFT:
Datasets used during DPO:
License: Apache License 2.0
Availability in other ML formats:
- GGUF: Felladrin/gguf-Minueza-32M-Chat
- ONNX: Felladrin/onnx-Minueza-32M-Chat

Recommended Prompt Format

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
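
As a minimal sketch, the template above can also be assembled by hand with plain string formatting. The function name `build_chatml_prompt` is hypothetical, and the trailing newline after the assistant header is an assumption about how the tokenizer's chat template renders the generation prompt; calling `apply_chat_template` on the tokenizer remains the safer route.

```python
def build_chatml_prompt(system_message: str, user_message: str) -> str:
    """Assemble a single-turn prompt in the recommended format (illustrative helper)."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        # Assumption: the assistant header is followed by a newline,
        # and the model writes its reply from there.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_chatml_prompt("You are a helpful assistant.", "What is a qubit?"))
```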

Recommended Inference Parameters

do_sample: true
temperature: 0.65
top_p: 0.55
top_k: 35
repetition_penalty: 1.176
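
To give a rough feel for what these parameters do, here is a simplified pure-Python sketch of a sampling filter over a toy list of logits. It is an illustration under stated assumptions, not the code the transformers library runs: the function names are hypothetical, the order (repetition penalty, temperature, top-k, then top-p) is assumed, and edge cases such as ties are ignored.

```python
import math


def apply_repetition_penalty(logits, seen_token_ids, penalty=1.176):
    """Make already-generated tokens less likely (CTRL-style penalty, simplified)."""
    out = list(logits)
    for i in seen_token_ids:
        # Positive logits shrink, negative logits become more negative.
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out


def filter_distribution(logits, temperature=0.65, top_k=35, top_p=0.55):
    """Return {token_id: probability} after temperature, top-k and top-p filtering."""
    # A temperature below 1 sharpens the distribution before the softmax.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring tokens.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = scaled[ranked[0]]
    weights = {i: math.exp(scaled[i] - peak) for i in ranked}
    total = sum(weights.values())
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = {}, 0.0
    for i in ranked:
        p = weights[i] / total
        kept[i] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    mass = sum(kept.values())
    return {i: p / mass for i, p in kept.items()}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -1.0]  # made-up values for a 4-token vocabulary
    penalized = apply_repetition_penalty(toy_logits, seen_token_ids=[1])
    print(filter_distribution(penalized))
```

With a low top_p such as 0.55, a single confident token can already cover the required probability mass, in which case sampling becomes nearly deterministic for that step.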

Usage Example

from transformers import pipeline

# Load the model and its tokenizer into a text-generation pipeline.
generate = pipeline("text-generation", "Felladrin/Minueza-32M-Chat")

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant who answers the user's questions with details and curiosity.",
    },
    {
        "role": "user",
        "content": "What are some potential applications for quantum computing?",
    },
]

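# Render the messages with the model's chat template, ending with the assistant header.
prompt = generate.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate a reply using the recommended inference parameters.
output = generate(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.65,
    top_k=35,
    top_p=0.55,
    repetition_penalty=1.176,
)

# By default, the generated text contains the prompt followed by the model's reply.
print(output[0]["generated_text"])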
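
Because the generated text echoes the prompt by default, it can help to isolate the assistant's reply. The helper below is a hypothetical sketch using plain string handling; it assumes the decoded text starts with the exact prompt string and that an `<|im_end|>` marker, if present, ends the reply.

```python
def extract_reply(generated_text: str, prompt: str, end_token: str = "<|im_end|>") -> str:
    """Return only the assistant's reply from a full generated string (illustrative)."""
    # Drop the echoed prompt when the text starts with it; otherwise keep the text as-is.
    if generated_text.startswith(prompt):
        reply = generated_text[len(prompt):]
    else:
        reply = generated_text
    # Cut at the end-of-turn marker, in case it survives decoding.
    return reply.split(end_token, 1)[0].strip()


if __name__ == "__main__":
    demo_prompt = "<|im_start|>user\nHi<|im_end|>\n<|im_start|>assistant\n"
    demo_output = demo_prompt + "Hello! How can I help?<|im_end|>"  # made-up output
    print(extract_reply(demo_output, demo_prompt))
```

Passing `return_full_text=False` to the pipeline call should achieve a similar effect without post-processing.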

How it was trained

This model was trained in several sessions with the SFT Trainer and the DPO Trainer, using the following settings:

For Supervised Fine-Tuning:

Hyperparameter	Value
learning_rate	2e-5
total_train_batch_size	24
max_seq_length	2048
weight_decay	0
warmup_ratio	0.02
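
The sketch below shows how a total batch size and a warmup ratio typically translate into concrete numbers. The per-device batch sizes, accumulation steps, and the 1000-step run length are hypothetical values chosen for illustration; the source only states the total of 24 and the ratio of 0.02. Rounding the warmup up to a whole step is also an assumption.

```python
import math


def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Total examples consumed per optimizer step."""
    return per_device * grad_accum_steps * num_devices


def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of steps spent ramping the learning rate up (rounded up, by assumption)."""
    return math.ceil(total_steps * warmup_ratio)


if __name__ == "__main__":
    # Two hypothetical splits that both reach total_train_batch_size = 24.
    print(effective_batch_size(per_device=4, grad_accum_steps=6))
    print(effective_batch_size(per_device=8, grad_accum_steps=3))
    # Hypothetical 1000-step run with warmup_ratio = 0.02.
    print(warmup_steps(total_steps=1000, warmup_ratio=0.02))
```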

For Direct Preference Optimization:

Hyperparameter	Value
learning_rate	7.5e-7
total_train_batch_size	6
max_length	2048
max_prompt_length	1536
max_steps	200
weight_decay	0
warmup_ratio	0.02
beta	0.1
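
For context on the beta value, here is a minimal sketch of the standard (sigmoid) DPO loss for a single preference pair. It assumes the default loss variant was used, which the source does not state; the function name and the log-probability values in the example are made up for illustration.

```python
import math


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Sigmoid DPO loss for one (chosen, rejected) pair of sequence log-probabilities."""
    # How much more the policy likes each response than the frozen reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # beta scales how strongly the preference margin is rewarded.
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-12.0, -15.0, -12.0, -15.0))
    # Policy has shifted toward the chosen response: the loss drops below log(2).
    print(dpo_loss(-10.0, -17.0, -12.0, -15.0))
```

With max_steps of 200 and a total batch size of 6, such a run sees at most 200 × 6 = 1200 preference pairs, and a warmup_ratio of 0.02 corresponds to about 4 warmup steps.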

Open LLM Leaderboard Evaluation Results

Detailed results are available on the Open LLM Leaderboard.

Metric	Value
Avg.	28.49
AI2 Reasoning Challenge (25-Shot)	20.39
HellaSwag (10-Shot)	26.54
MMLU (5-Shot)	25.75
TruthfulQA (0-shot)	47.27
Winogrande (5-shot)	50.99
GSM8k (5-shot)	0.00
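
As a quick sanity check, the reported average can be reproduced from the six benchmark scores in the table above. This sketch assumes the average is an unweighted mean.

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 20.39,
    "HellaSwag (10-Shot)": 26.54,
    "MMLU (5-Shot)": 25.75,
    "TruthfulQA (0-shot)": 47.27,
    "Winogrande (5-shot)": 50.99,
    "GSM8k (5-shot)": 0.00,
}

# Unweighted mean across the six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 28.49
```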

Model size: 32.8M params (Safetensors)
Tensor type: F32

Evaluation results

Source: Open LLM Leaderboard

Benchmark	Metric	Split	Value
AI2 Reasoning Challenge (25-Shot)	normalized accuracy	test	20.390
HellaSwag (10-Shot)	normalized accuracy	validation	26.540
MMLU (5-Shot)	accuracy	test	25.750
TruthfulQA (0-shot)	mc2	validation	47.270
Winogrande (5-shot)	accuracy	validation	50.990
GSM8k (5-shot)	accuracy	test	0.000