hiyouga committed
Commit f6bb88b
1 Parent(s): ce8eb41

Update README.md

Files changed (1)
  1. README.md +33 -1
README.md CHANGED
@@ -45,7 +45,7 @@ generate_ids = model.generate(**inputs, max_new_tokens=256, streamer=streamer)
 Alternatively, you can launch a CLI demo using the script in https://github.com/hiyouga/LLaMA-Efficient-Tuning:

 ```bash
- python src/cli_demo.py --model_name_or_path hiyouga/baichuan-13b-sft
+ python src/cli_demo.py --template default --model_name_or_path hiyouga/baichuan-13b-sft
 ```

 ---
@@ -54,6 +54,38 @@ You can reproduce our results by visiting the following step-by-step (Chinese) g
 
 https://zhuanlan.zhihu.com/p/645010851
 
+ or by running the following script in [LLaMA-Efficient-Tuning](https://github.com/hiyouga/LLaMA-Efficient-Tuning):
+
+ ```bash
+ CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
+ --stage sft \
+ --model_name_or_path baichuan-inc/Baichuan-13B-Base \
+ --do_train \
+ --dataset alpaca_gpt4_en,alpaca_gpt4_zh,self_cognition,oaast_sft,sharegpt_zh,refgpt_zh_p1,lima,self_cognition \
+ --template default \
+ --finetuning_type lora \
+ --lora_rank 32 \
+ --lora_target W_pack,o_proj,down_proj,up_proj,gate_proj \
+ --output_dir baichuan_13b_lora \
+ --per_device_train_batch_size 4 \
+ --gradient_accumulation_steps 8 \
+ --preprocessing_num_workers 16 \
+ --max_source_length 1024 \
+ --max_target_length 512 \
+ --optim paged_adamw_32bit \
+ --lr_scheduler_type cosine \
+ --logging_steps 10 \
+ --save_steps 100 \
+ --eval_steps 100 \
+ --warmup_steps 100 \
+ --learning_rate 3e-5 \
+ --max_grad_norm 0.5 \
+ --num_train_epochs 2.0 \
+ --quantization_bit 4 \
+ --plot_loss \
+ --fp16
+ ```
+
 Loss curve:
 
 ![loss](loss.svg)
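
For reference, the LoRA flags in the training command above (`--finetuning_type lora`, `--lora_rank 32`, `--lora_target ...`) correspond roughly to a PEFT `LoraConfig`. The sketch below shows that mapping; only `r` and `target_modules` are taken from the command, while `lora_alpha` and `lora_dropout` are illustrative assumptions (the command does not set them), so this should not be read as the exact configuration LLaMA-Efficient-Tuning builds internally.

```python
# Rough PEFT equivalent of the LoRA flags in the training command above.
# Assumption: lora_alpha and lora_dropout are illustrative values; the command does not specify them.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=32,                                                                       # --lora_rank 32
    target_modules=["W_pack", "o_proj", "down_proj", "up_proj", "gate_proj"],   # --lora_target
    lora_alpha=64,                                                              # assumed value
    lora_dropout=0.1,                                                           # assumed value
)
# The adapter could then be attached with peft.get_peft_model(base_model, lora_config).
```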
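For readers who want to call the fine-tuned model directly rather than through `src/cli_demo.py`, the sketch below expands the `model.generate(**inputs, max_new_tokens=256, streamer=streamer)` call that already appears in this README into a self-contained snippet. It is a minimal sketch, not part of this commit: the prompt text, dtype, and `device_map` setting are assumptions, and the raw prompt shown here does not apply the chat template that `cli_demo.py` would add.

```python
# Minimal sketch of streaming generation with hiyouga/baichuan-13b-sft.
# Assumptions: fp16 weights, device_map="auto" (requires `accelerate`), and a plain prompt
# without the conversation template applied by cli_demo.py.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "hiyouga/baichuan-13b-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,  # the Baichuan model code lives in the repo, not in transformers itself
)

inputs = tokenizer("Hello, please introduce yourself.", return_tensors="pt").to(model.device)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
generate_ids = model.generate(**inputs, max_new_tokens=256, streamer=streamer)
```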