---
library_name: transformers
tags:
- trl
- sft
language:
- de
base_model:
- TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
pipeline_tag: text-generation
---

# Model Card for Model ID

We fine-tuned our base model for 21 epochs on the Ca dataset. Epoch 1 showed the best macro-average F1 score on the evaluation dataset.

## Context format

`"### Context\n\nText to analyse.\n\n###Answer"`

## Metric

| Metric | Value |
|---|---|
| eval_AVGf1 | 0.9102075019834961 |
| eval_DIAGNOSIS.f1 | 0.8808602150537634 |
| eval_DIAGNOSIS.precision | 0.8943231441048035 |
| eval_DIAGNOSIS.recall | 0.8677966101694915 |
| eval_DIAGNOSTIC.f1 | 0.9472166137871358 |
| eval_DIAGNOSTIC.precision | 0.9624853458382181 |
| eval_DIAGNOSTIC.recall | 0.9324247586598523 |
| eval_DRUG.f1 | 0.9440145653163405 |
| eval_DRUG.precision | 0.9792256846081209 |
| eval_DRUG.recall | 0.9112478031634447 |
| eval_MEDICAL_FINDING.f1 | 0.9092427259297321 |
| eval_MEDICAL_FINDING.precision | 0.9073195744135367 |
| eval_MEDICAL_FINDING.recall | 0.9111740473738414 |
| eval_THERAPY.f1 | 0.8697033898305084 |
| eval_THERAPY.precision | 0.8729399255715046 |
| eval_THERAPY.recall | 0.8664907651715039 |
| eval_accuracy | 0.9618960382191458 |
| eval_f1 | 0.7632318301785055 |
| eval_loss | 0.006697072647511959 |
| eval_model_preparation_time | 0 |
| eval_precision | 0.6619246861924686 |
| eval_recall | 0.9011526605012733 |
| eval_runtime | 341.5967 |
| eval_samples_per_second | 23.952 |
| eval_steps_per_second | 5.99 |
| test_AVGf1 | 0.8676664044743045 |
| test_DIAGNOSIS.f1 | 0.7754658946987515 |
| test_DIAGNOSIS.precision | 0.7846942511900403 |
| test_DIAGNOSIS.recall | 0.7664520743919886 |
| test_DIAGNOSTIC.f1 | 0.9211950129381322 |
| test_DIAGNOSTIC.precision | 0.9346062052505967 |
| test_DIAGNOSTIC.recall | 0.9081632653061225 |
| test_DRUG.f1 | 0.9448028673835126 |
| test_DRUG.precision | 0.9835820895522388 |
| test_DRUG.recall | 0.9089655172413793 |
| test_MEDICAL_FINDING.f1 | 0.879590997238056 |
| test_MEDICAL_FINDING.precision | 0.8656025907934305 |
| test_MEDICAL_FINDING.recall | 0.8940389439732409 |
| test_THERAPY.f1 | 0.8172772501130711 |
| test_THERAPY.precision | 0.8187584956955143 |
| test_THERAPY.recall | 0.8158013544018059 |
| test_accuracy | 0.9665184459433998 |
| test_f1 | 0.7391588362393848 |
| test_loss | 0.009836438111960888 |
| test_model_preparation_time | 0 |
| test_precision | 0.6447795213465416 |
| test_recall | 0.865905344949376 |
| test_runtime | 394.9961 |
| test_samples_per_second | 24.023 |
| test_steps_per_second | 6.008 |
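
The `AVGf1` values reported above appear to be the unweighted (macro) mean of the five per-class F1 scores. The sketch below checks that reading against the reported numbers. The helper names `macro_f1` and `f1_from` are illustrative and do not come from the training code, and the evaluation script itself is not part of this card.

```python
# Per-class F1 scores copied from the metric listing in this card.
EVAL_F1 = {
    "DIAGNOSIS": 0.8808602150537634,
    "DIAGNOSTIC": 0.9472166137871358,
    "DRUG": 0.9440145653163405,
    "MEDICAL_FINDING": 0.9092427259297321,
    "THERAPY": 0.8697033898305084,
}
TEST_F1 = {
    "DIAGNOSIS": 0.7754658946987515,
    "DIAGNOSTIC": 0.9211950129381322,
    "DRUG": 0.9448028673835126,
    "MEDICAL_FINDING": 0.879590997238056,
    "THERAPY": 0.8172772501130711,
}


def macro_f1(per_class_f1: dict[str, float]) -> float:
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    values = list(per_class_f1.values())
    return sum(values) / len(values)


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Compare with eval_AVGf1 and test_AVGf1 from the listing.
    print("eval macro F1:", macro_f1(EVAL_F1))
    print("test macro F1:", macro_f1(TEST_F1))
    # Compare with eval_f1, using eval_precision and eval_recall.
    print("eval overall F1:", f1_from(0.6619246861924686, 0.9011526605012733))
```

Because the macro average weights every entity class equally, it can sit well above the overall `f1` value, which is computed from the overall precision and recall.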
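
The context format given earlier in this card can be applied with a small helper. This is a minimal sketch that assumes "Text to analyse." is a placeholder for the input document. The function name `build_prompt` and the German example sentence are made up for illustration. The missing space in `###Answer` is kept exactly as the card states it.

```python
def build_prompt(text: str) -> str:
    """Wrap raw input text in the context format stated in this card.

    Assumption: "Text to analyse." in the card is a placeholder for `text`.
    """
    return f"### Context\n\n{text}\n\n###Answer"


if __name__ == "__main__":
    # Hypothetical German input, chosen only because the card lists language "de".
    example = "Der Patient erhielt Paracetamol gegen Kopfschmerzen."
    print(build_prompt(example))
```

The resulting string would be passed to the tokenizer as the prompt, and the model's continuation after `###Answer` read as the output.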