eryk-mazus committed
Commit 4074583
Parent(s): a0d9543

Update README.md

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -13,16 +13,15 @@ pipeline_tag: text-generation
 
 # polka-1.1B-dpo
 
-`eryk-mazus/polka-1.1b-dpo` is the first Polish model trained to act as a helpful, conversational assistant that can be run locally.
+`eryk-mazus/polka-1.1b-dpo` is **the first Polish model trained to act as a helpful, conversational assistant that can be run locally.**
 
-This model is based on [TinyLlama-1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) with an extended tokenizer for more efficient Polish text generation, pretrained on an additional 6 billion Polish tokens. It was then fine-tuned using synthetically created and machine-translated multi-turn conversations with the [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) performed on top of it.
+This model is based on [TinyLlama-1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T) with an extended tokenizer for more efficient Polish text generation, pretrained on an additional 6 billion Polish tokens. It was then fine-tuned using synthetically generated and machine-translated multi-turn conversations, with [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290) performed on top of it.
 
 Context size: 4,096 tokens
 
-In addition, we've releasing:
+In addition, we're releasing:
 * [polka-1.1b](https://huggingface.co/eryk-mazus/polka-1.1b) - our base model with an extended tokenizer and additional pre-training on Polish corpus sampled using [DSIR](https://github.com/p-lambda/dsir)
 * [polka-pretrain-en-pl-v1](https://huggingface.co/datasets/eryk-mazus/polka-pretrain-en-pl-v1) - the pre-training dataset
-* [polka-1.1b-sft](https://huggingface.co/eryk-mazus/polka-1.1b-sft) - SFT version of the base model trained on Polish conversations
 * [polka-dpo-v1](https://huggingface.co/datasets/eryk-mazus/polka-dpo-v1) - dataset of DPO pairs
 
 ## Usage
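
The Usage section itself is not shown in this hunk. As a conversational model, it would typically be driven through the standard 🤗 `transformers` chat-template API; below is a minimal sketch along those lines. The assumption that the tokenizer ships a chat template, plus the example prompt and sampling settings, are illustrative and not taken from this commit.

```python
# Minimal sketch: chatting with polka-1.1b-dpo via transformers.
# Assumes the tokenizer provides a chat template; prompt and sampling
# parameters below are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "eryk-mazus/polka-1.1b-dpo"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 1.1B params fits comfortably on a consumer GPU
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Napisz krótki wiersz o wiośnie."},  # "Write a short poem about spring."
]

# Build the prompt with the tokenizer's chat template and generate a reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```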