nie3e committed
Commit 4bd5455
1 parent: 85cde26

Update README.md

Files changed (1): README.md (+24 −6)

README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # sentiment-polish-gpt2-large
 
-This model is a fine-tuned version of [sdadas/polish-gpt2-large](https://huggingface.co/sdadas/polish-gpt2-large) on the None dataset.
+This model is a fine-tuned version of [sdadas/polish-gpt2-large](https://huggingface.co/sdadas/polish-gpt2-large) on the [polemo2-official](https://huggingface.co/datasets/clarin-pl/polemo2-official) dataset.
 It achieves the following results on the evaluation set:
 - epoch: 10.0
 - eval_accuracy: 0.9634
@@ -30,24 +30,42 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+Trained from [polish-gpt2-large](https://huggingface.co/sdadas/polish-gpt2-large)
 
 ## Intended uses & limitations
 
-More information needed
+Sentiment analysis - neutral/negative/positive/ambiguous
 
 ## Training and evaluation data
 
-More information needed
+Merged all rows from the [polemo2-official](https://huggingface.co/datasets/clarin-pl/polemo2-official) dataset.
+
+Discarded rows with length > 512.
+
+Train/test split: 80%/20%
+
+Data collator:
+```py
+data_collator = DataCollatorWithPadding(
+    tokenizer=tokenizer,
+    padding="longest",
+    max_length=MAX_INPUT_LENGTH,
+    pad_to_multiple_of=8
+)
+```
 
 ## Training procedure
 
+GPU: 2x RTX 4060 Ti 16GB
+
+Training time: 29:16:50
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 2
-- eval_batch_size: 2
+- train_batch_size: 1
+- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 8
 - total_train_batch_size: 16
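The training-data notes added in this commit (discard rows with length > 512, then an 80%/20% train/test split) can be sketched as follows. This is a minimal illustration on invented toy rows, not the card author's actual preprocessing script; the real cut-off may be measured in tokens rather than characters, which this diff does not specify.

```python
import random

MAX_INPUT_LENGTH = 512  # matches the collator's max_length in the card

# Toy rows; texts and label ids here are invented for illustration.
rows = [
    {"text": "Bardzo dobry produkt, polecam!", "label": 2},
    {"text": "x" * 600, "label": 0},  # longer than 512, will be discarded
    {"text": "Przecietny hotel, nic specjalnego.", "label": 3},
    {"text": "Fatalna obsluga, nie polecam.", "label": 1},
]

# Discard rows with length > 512
kept = [r for r in rows if len(r["text"]) <= MAX_INPUT_LENGTH]

# Train/test split: 80%/20%
random.seed(42)
random.shuffle(kept)
cut = int(0.8 * len(kept))
train, test = kept[:cut], kept[cut:]
print(len(kept), len(train), len(test))  # 3 2 1
```

With real data the same split is more conveniently done with `datasets`' `train_test_split(test_size=0.2, seed=42)`, but the logic is the same.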