nie3e committed
Commit 4bd5455
1 parent: 85cde26

Update README.md

Files changed (1): README.md (+24 −6)

README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # sentiment-polish-gpt2-large
 
-This model is a fine-tuned version of [sdadas/polish-gpt2-large](https://huggingface.co/sdadas/polish-gpt2-large) on the None dataset.
+This model is a fine-tuned version of [sdadas/polish-gpt2-large](https://huggingface.co/sdadas/polish-gpt2-large) on the [polemo2-official](https://huggingface.co/datasets/clarin-pl/polemo2-official) dataset.
 It achieves the following results on the evaluation set:
 - epoch: 10.0
 - eval_accuracy: 0.9634
@@ -30,24 +30,42 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+Trained from [polish-gpt2-large](https://huggingface.co/sdadas/polish-gpt2-large)
 
 ## Intended uses & limitations
 
-More information needed
+Sentiment analysis - neutral/negative/positive/ambiguous
 
 ## Training and evaluation data
 
-More information needed
+Merged all rows from the [polemo2-official](https://huggingface.co/datasets/clarin-pl/polemo2-official) dataset.
+
+Discarded rows with length > 512.
+
+Train/test split: 80%/20%
+
+Data collator:
+```py
+data_collator = DataCollatorWithPadding(
+    tokenizer=tokenizer,
+    padding="longest",
+    max_length=MAX_INPUT_LENGTH,
+    pad_to_multiple_of=8
+)
+```
 
 ## Training procedure
 
+GPU: 2x RTX 4060 Ti 16GB
+
+Training time: 29:16:50
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 2
-- eval_batch_size: 2
+- train_batch_size: 1
+- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 8
 - total_train_batch_size: 16
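The training-data notes added in this commit (discard rows with length > 512, then an 80%/20% train/test split) can be sketched as follows. This is a minimal illustration on invented toy rows, not the card author's actual preprocessing script; the real cut-off may be measured in tokens rather than characters, which this diff does not specify.

```python
import random

MAX_INPUT_LENGTH = 512  # matches the collator's max_length in the card

# Toy rows; texts and label ids here are invented for illustration.
rows = [
    {"text": "Bardzo dobry produkt, polecam!", "label": 2},
    {"text": "x" * 600, "label": 0},  # longer than 512, will be discarded
    {"text": "Przecietny hotel, nic specjalnego.", "label": 3},
    {"text": "Fatalna obsluga, nie polecam.", "label": 1},
]

# Discard rows with length > 512
kept = [r for r in rows if len(r["text"]) <= MAX_INPUT_LENGTH]

# Train/test split: 80%/20%
random.seed(42)
random.shuffle(kept)
cut = int(0.8 * len(kept))
train, test = kept[:cut], kept[cut:]
print(len(kept), len(train), len(test))  # 3 2 1
```

With real data the same split is more conveniently done with `datasets`' `train_test_split(test_size=0.2, seed=42)`, but the logic is the same.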