pythia-1b-tulu-v2-mix-nos-rm

This model is a fine-tuned version of kykim0/pythia-1b-tulu-v2-mix-nos on the allenai/ultrafeedback_binarized_cleaned dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5227
  • Accuracy: 0.7458
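
The card does not state the training objective. As a minimal sketch, assuming the common pairwise reward-model setup (the loss is the negative log-sigmoid of the score margin between the chosen and rejected response, and accuracy is the fraction of pairs where the chosen response scores higher), the two metrics above could be computed as follows. The function names and the example scores are hypothetical and are not taken from the training code.

```python
import math

def pairwise_loss(chosen_scores, rejected_scores):
    """Mean of -log(sigmoid(chosen - rejected)) over preference pairs.

    Assumption: a Bradley-Terry style objective; the card does not confirm this.
    """
    losses = [
        math.log1p(math.exp(-(c - r)))  # equals -log(sigmoid(c - r))
        for c, r in zip(chosen_scores, rejected_scores)
    ]
    return sum(losses) / len(losses)

def pairwise_accuracy(chosen_scores, rejected_scores):
    """Fraction of pairs where the chosen response gets the higher score."""
    correct = sum(1 for c, r in zip(chosen_scores, rejected_scores) if c > r)
    return correct / len(chosen_scores)

# A model that cannot tell the responses apart (zero margin) has loss ln(2).
CHANCE_LOSS = pairwise_loss([0.0], [0.0])

# Hypothetical scores for three preference pairs.
example_accuracy = pairwise_accuracy([1.0, 0.2, 0.5], [0.0, 0.4, 0.1])
```

Under this assumption, the reported loss of 0.5227 sits below the zero-margin value of ln 2 (about 0.693), which is consistent with the reported accuracy of 0.7458 being above the 0.5 chance level.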

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.41e-05
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 1
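
The two total batch sizes in the list follow from the per-device values. The sketch below reproduces that arithmetic and also illustrates a linear learning-rate decay; the card lists no warmup setting, so `warmup_steps=0` and the helper `linear_lr` are assumptions made for illustration, not the exact scheduler configuration used in training.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1.41e-05
train_batch_size = 2             # per device
eval_batch_size = 8              # per device
num_devices = 4
gradient_accumulation_steps = 4

# Effective batch per optimizer step: per-device batch x devices x accumulation.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

def linear_lr(step, total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    Assumption: warmup_steps=0, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)

print(total_train_batch_size, total_eval_batch_size)  # 32 32
```

With these assumptions the learning rate starts at 1.41e-05 and reaches zero at the final optimizer step.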

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|---------------|--------|------|-----------------|----------|
| 0.5749        | 0.0527 | 100  | 0.5849          | 0.6900   |
| 0.5873        | 0.1055 | 200  | 0.5581          | 0.7100   |
| 0.5599        | 0.1582 | 300  | 0.5470          | 0.7212   |
| 0.5456        | 0.2109 | 400  | 0.5379          | 0.7258   |
| 0.521         | 0.2637 | 500  | 0.5358          | 0.7294   |
| 0.5361        | 0.3164 | 600  | 0.5363          | 0.7376   |
| 0.5662        | 0.3691 | 700  | 0.5270          | 0.7412   |
| 0.5301        | 0.4219 | 800  | 0.5268          | 0.7427   |
| 0.5661        | 0.4746 | 900  | 0.5301          | 0.7381   |
| 0.5608        | 0.5274 | 1000 | 0.5242          | 0.7437   |
| 0.5223        | 0.5801 | 1100 | 0.5242          | 0.7422   |
| 0.5322        | 0.6328 | 1200 | 0.5249          | 0.7448   |
| 0.4891        | 0.6856 | 1300 | 0.5241          | 0.7427   |
| 0.5111        | 0.7383 | 1400 | 0.5234          | 0.7437   |
| 0.5145        | 0.7910 | 1500 | 0.5225          | 0.7422   |
| 0.4746        | 0.8438 | 1600 | 0.5226          | 0.7458   |
| 0.5551        | 0.8965 | 1700 | 0.5223          | 0.7448   |
| 0.563         | 0.9492 | 1800 | 0.5222          | 0.7453   |
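
The table can also be used to estimate the length of the epoch and to find the best logged checkpoints. The sketch below is only an illustration: the rows are copied from the table, the epoch fractions are rounded in the source so the derived figures are approximate, and the variable names are hypothetical.

```python
# (training_loss, epoch, step, validation_loss, accuracy), copied from the table.
rows = [
    (0.5749, 0.0527, 100, 0.5849, 0.6900),
    (0.5873, 0.1055, 200, 0.5581, 0.7100),
    (0.5599, 0.1582, 300, 0.5470, 0.7212),
    (0.5456, 0.2109, 400, 0.5379, 0.7258),
    (0.5210, 0.2637, 500, 0.5358, 0.7294),
    (0.5361, 0.3164, 600, 0.5363, 0.7376),
    (0.5662, 0.3691, 700, 0.5270, 0.7412),
    (0.5301, 0.4219, 800, 0.5268, 0.7427),
    (0.5661, 0.4746, 900, 0.5301, 0.7381),
    (0.5608, 0.5274, 1000, 0.5242, 0.7437),
    (0.5223, 0.5801, 1100, 0.5242, 0.7422),
    (0.5322, 0.6328, 1200, 0.5249, 0.7448),
    (0.4891, 0.6856, 1300, 0.5241, 0.7427),
    (0.5111, 0.7383, 1400, 0.5234, 0.7437),
    (0.5145, 0.7910, 1500, 0.5225, 0.7422),
    (0.4746, 0.8438, 1600, 0.5226, 0.7458),
    (0.5551, 0.8965, 1700, 0.5223, 0.7448),
    (0.5630, 0.9492, 1800, 0.5222, 0.7453),
]

TOTAL_TRAIN_BATCH_SIZE = 32  # from the hyperparameter list

# Use the last row: it has the smallest relative rounding error in the epoch column.
_, last_epoch, last_step, _, _ = rows[-1]
steps_per_epoch = last_step / last_epoch
approx_train_pairs = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

# Logged checkpoints with the lowest validation loss and the highest accuracy.
best_loss_row = min(rows, key=lambda r: r[3])
best_acc_row = max(rows, key=lambda r: r[4])

print(f"steps per epoch ~ {steps_per_epoch:.0f}")
print(f"training pairs  ~ {approx_train_pairs:.0f}")
print(f"lowest validation loss {best_loss_row[3]} at step {best_loss_row[2]}")
print(f"highest accuracy {best_acc_row[4]} at step {best_acc_row[2]}")
```

Validation loss flattens after roughly step 1000, while the logged accuracy keeps fluctuating within about half a percentage point, so the differences between late checkpoints are small.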

Framework versions

  • Transformers 4.44.2
  • PyTorch 2.1.2+cu121
  • Datasets 2.14.6
  • Tokenizers 0.19.1

Model size: 909M params (Safetensors, tensor type BF16)