language:
  - en
  - de
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0


Hermes + Leo = Hermeo

Hermeo-7B

A German-English language model merged from DPOpenHermes-7B-v2 and leo-mistral-hessianai-7b-chat using mergekit. Both base models are fine-tuned versions of Mistral-7B-v0.1.

Model details

Acknowledgements

Evaluation

The evaluation follows the methodology of the Open LLM Leaderboard.

German benchmarks

| Model | MMLU-DE (5 shots) | Hellaswag-DE (10 shots) | ARC-DE (24 shots) |
|---|---|---|---|
| **7B parameters** | | | |
| llama-2-7b | 0.400 | 0.513 | 0.381 |
| leo-hessianai-7b | 0.400 | 0.609 | 0.429 |
| bloom-6b4-clp-german | 0.274 | 0.550 | 0.351 |
| mistral-7b | 0.524 | 0.588 | 0.473 |
| leo-mistral-hessianai-7b | 0.481 | 0.663 | 0.485 |
| leo-mistral-hessianai-7b-chat | 0.458 | 0.617 | 0.465 |
| DPOpenHermes-7B-v2 | TBA | 0.603 | 0.515 |
| hermeo-7b (this model) | 0.511 | 0.668 | 0.528 |
| **13B parameters** | | | |
| llama-2-13b | 0.469 | 0.581 | 0.468 |
| leo-hessianai-13b | 0.486 | 0.658 | 0.509 |
| **70B parameters** | | | |
| llama-2-70b | 0.597 | 0.674 | 0.561 |
| leo-hessianai-70b | 0.653 | 0.721 | 0.600 |

English benchmarks

TBA

Prompting / Prompt Template

Prompt dialogue template (ChatML format):

"""
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
"""

The model input can contain multiple conversation turns between user and assistant, for example:

<|im_start|>user
{prompt 1}<|im_end|>
<|im_start|>assistant
{reply 1}<|im_end|>
<|im_start|>user
{prompt 2}<|im_end|>
<|im_start|>assistant
(...)
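
The templates above can be assembled programmatically. The following is a minimal sketch, not part of the original model card: the helper name `build_chatml_prompt` and the `role`/`content` message dictionaries are assumptions chosen for illustration. It only builds the prompt string in the ChatML layout shown above and leaves the final assistant turn open for the model to complete.

```python
from typing import Dict, List, Optional

# Special tokens as they appear in the prompt template above.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(
    messages: List[Dict[str, str]],
    system_message: Optional[str] = None,
) -> str:
    """Format a conversation as a ChatML prompt string.

    `build_chatml_prompt` is a hypothetical helper (not from the model card).
    `messages` is a list of {"role": "user" | "assistant", "content": str}.
    The returned string ends with an open assistant turn.
    """
    parts = []
    if system_message is not None:
        parts.append(f"{IM_START}system\n{system_message}{IM_END}\n")
    for message in messages:
        role = message["role"]
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"{IM_START}{role}\n{message['content']}{IM_END}\n")
    # Open the assistant turn that the model is expected to continue.
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    conversation = [
        {"role": "user", "content": "Was ist die Hauptstadt von Deutschland?"},
        {"role": "assistant", "content": "Berlin."},
        {"role": "user", "content": "And in English, please?"},
    ]
    print(build_chatml_prompt(conversation, system_message="You are a helpful assistant."))
```

The resulting string can be tokenized and passed to the model as-is. If the tokenizer shipped with the model defines its own chat template, that template should take precedence over this hand-written formatting.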

License

Apache 2.0