I recently ran into the following problem while using the DuReader_retrieval Baseline System from RocketQA:

                   [--do_train DO_TRAIN] [--do_val DO_VAL] [--do_test DO_TEST]
[--output_item OUTPUT_ITEM]
[--output_file_name OUTPUT_FILE_NAME]
[--test_data_cnt TEST_DATA_CNT]
[--use_multi_gpu_test USE_MULTI_GPU_TEST]
[--metrics METRICS] [--shuffle SHUFFLE] [--for_cn FOR_CN]
train_de.py: error: argument --save_steps: invalid int value: '$[$[889580/128/4]*10/2]'
grep: warning: GREP_OPTIONS is deprecated; please use an alias or script

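My environment is an Ubuntu Docker container, while the baseline was written for CentOS. I believe this difference is the cause: the error shows that the $[...] arithmetic expression reached train_de.py as a literal string, so the shell never evaluated it. A likely explanation is that /bin/sh on Ubuntu is dash, which does not support the legacy $[...] form, whereas on CentOS /bin/sh is bash.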

Solution

Edit run_dual_encoder_train.sh so that the arithmetic uses $((...)) instead of $[...]:

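lr=3e-5
batch_size=128
# Count the training examples (one per line)
train_exampls=$(wc -l < "$TRAIN_SET")
# Use the POSIX $((...)) form instead of the legacy $[...] form
save_steps=$((train_exampls/batch_size/node))
data_size=$((save_steps*batch_size*node))
# Old line: new_save_steps=$[$save_steps\*$epoch/2]
new_save_steps=$((save_steps*epoch/2))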
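
As a quick sanity check of the arithmetic, here is a minimal sketch that plugs in the numbers visible in the error message above (889580 examples, batch size 128). The values node=4 and epoch=10 are assumptions inferred from the failed expression '$[$[889580/128/4]*10/2]', not read from the script itself. Shell arithmetic is integer-only, so each division truncates.

```shell
#!/bin/sh
# Values taken from the error message; node and epoch are inferred (assumptions).
train_exampls=889580
batch_size=128
node=4     # assumed from ".../128/4" in the error message
epoch=10   # assumed from "*10/2" in the error message

# POSIX arithmetic expansion: works in both bash and dash.
save_steps=$((train_exampls / batch_size / node))
new_save_steps=$((save_steps * epoch / 2))
data_size=$((save_steps * batch_size * node))

echo "save_steps=$save_steps"           # save_steps=1737
echo "new_save_steps=$new_save_steps"   # new_save_steps=8685
echo "data_size=$data_size"             # data_size=889344
```

Because $((...)) is part of POSIX sh, the snippet behaves the same whether it is run with bash or with sh, which is the point of the fix.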

References

[1] Bash multiplication and addition. https://unix.stackexchange.com/questions/299321/bash-multiplication-and-addition