Commit ed20c5e by sagawa (1 parent: 7eb18f6)

Update README.md

Files changed (1): README.md (+8 -8)
@@ -52,11 +52,11 @@ output # 'CN1CCC=C(CO)C1'
  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
- We used the Open Reaction Database (ORD) dataset for model training.
- The command used for training is the following. For more information, please refer to the paper and GitHub repository.

  ```python
- python train_without_duplicates.py \
  --model='t5' \
  --epochs=100 \
  --lr=1e-3 \
@@ -67,12 +67,12 @@ python train_without_duplicates.py \
  --evaluation_strategy='epoch' \
  --save_strategy='epoch' \
  --logging_strategy='epoch' \
- --train_data_path='/home/acf15718oa/ReactionT5_neword/data/all_ord_reaction_uniq_with_attr20240506_v3_train.csv' \
- --valid_data_path='/home/acf15718oa/ReactionT5_neword/data/all_ord_reaction_uniq_with_attr20240506_v3_valid.csv' \
- --test_data_path='/home/acf15718oa/ReactionT5_neword/data/all_ord_reaction_uniq_with_attr20240506_v3_test.csv' \
- --USPTO_test_data_path='/home/acf15718oa/ReactionT5_neword/data/USPTO_MIT/MIT_separated/test.csv' \
  --disable_tqdm \
- --pretrained_model_name_or_path='sagawa/ZINC-t5'
  ```

  ### Results
 
  ### Training Procedure

  <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+ We used the [Open Reaction Database (ORD) dataset](https://drive.google.com/file/d/1fa2MyLdN1vcA7Rysk8kLQENE92YejS9B/view?usp=drive_link) for model training. In addition, we used [USPTO_MIT dataset](https://yzhang.hpc.nyu.edu/T5Chem/index.html)'s test split to prevent data leakage.
+ The command used for training is the following. For more information about data preprocessing and training, please refer to the paper and GitHub repository.

  ```python
+ python train.py \
  --model='t5' \
  --epochs=100 \
  --lr=1e-3 \
 
  --evaluation_strategy='epoch' \
  --save_strategy='epoch' \
  --logging_strategy='epoch' \
+ --train_data_path='../data/preprocessed_ord_train.csv' \
+ --valid_data_path='../data/preprocessed_ord_valid.csv' \
+ --test_data_path='../data/preprocessed_ord_test.csv' \
+ --USPTO_test_data_path='../data/USPTO_MIT/MIT_separated/test.csv' \
  --disable_tqdm \
+ --pretrained_model_name_or_path='sagawa/CompoundT5'
  ```
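
The updated text says the USPTO_MIT test split is used to prevent data leakage, and the command passes it through `--USPTO_test_data_path`. One plausible reading is that reactions found in that test split are excluded from the training data. The following is a minimal sketch of that idea, not the repository's actual implementation: the column names `REACTANT` and `PRODUCT`, the helper names, and the use of exact string matching are all assumptions made for illustration.

```python
import csv
import io


def reaction_key(row, columns=("REACTANT", "PRODUCT")):
    """Build a comparison key from the (assumed) reaction columns."""
    return tuple(row[c].strip() for c in columns)


def drop_leaked_rows(train_rows, test_rows, columns=("REACTANT", "PRODUCT")):
    """Return the training rows whose reaction key does not occur in the test rows."""
    test_keys = {reaction_key(r, columns) for r in test_rows}
    return [r for r in train_rows if reaction_key(r, columns) not in test_keys]


def read_rows(csv_text):
    """Parse CSV text into a list of dicts, one per reaction."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Toy data standing in for the ORD training split and the USPTO_MIT test split.
train_csv = "REACTANT,PRODUCT\nCCO.CC(=O)O,CCOC(C)=O\nC=C.Br,CCBr\n"
test_csv = "REACTANT,PRODUCT\nC=C.Br,CCBr\n"

filtered = drop_leaked_rows(read_rows(train_csv), read_rows(test_csv))
print(len(filtered))
```

Exact string comparison only catches reactions written identically in both files. If the two datasets write the same molecule with different SMILES strings, the keys would need to be canonicalized first; that step is left out of this sketch.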
 
  ### Results