In the VFT, each trial consisted of a 40-s pre-task control period, a 60-s task period, and a 70-s post-task control period. During each 60-s task period, the participants were asked to verbalize as many words as possible beginning with a Japanese character presented through headphones every 20 s (three characters per trial). The characters, presented in random order, were /a/, /to/, /na/, /i/, /ki/, /se/, /o/, /ta/, and /ha/. During each control period, the participants were asked to repeatedly verbalize the five Japanese vowels (/a/, /i/, /u/, /e/, and /o/).11 The sequence was repeated for three trials. Speech during fMRI scanning can cause movement artifacts in BOLD signals; therefore, we acquired all slices of each volume in the first part of a relatively long TR and left the remainder of the TR as a “no-sound” period.50,51 The acquisition time (TA) for 30 slices was set to 1205 ms, and the participants produced all speech (words and vowels) during the no-sound period of each TR, i.e., the remaining TR − TA. We confirmed that this duration was sufficient for all participants to complete their articulation. This is the point at which the present design differs from the conventional VFT sequence. Temporal differences among slices exist within the TA (1205 ms) and were not corrected in the present study, because the BOLD signal changes on a time scale several times longer than the present TA, so the benefit of slice-timing correction would be minimal.
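The trial timing described above can be laid out as a simple event schedule. The following is a minimal sketch, not the authors' stimulus code. The period durations, the cue interval, the character set, and the TA come from the text. Three things are assumptions made only for illustration: the trials run back to back, each of the nine characters is used exactly once across the three trials, and the TR passed to `silent_period_ms` is a hypothetical value, since the TR is not stated in this passage. The function names are likewise hypothetical.

```python
import random

# Durations taken from the text (seconds)
PRE_S, TASK_S, POST_S = 40, 60, 70
CUE_INTERVAL_S = 20          # one character presented every 20 s
N_TRIALS = 3
CHARACTERS = ["a", "to", "na", "i", "ki", "se", "o", "ta", "ha"]
TA_MS = 1205                 # acquisition time for 30 slices (milliseconds)


def build_schedule(seed=0):
    """Return a list of (onset_s, condition, character) events.

    Assumptions (not stated in the text): trials are contiguous, and the
    nine characters are shuffled and each used once across the three trials.
    """
    rng = random.Random(seed)
    chars = CHARACTERS[:]
    rng.shuffle(chars)
    cues_per_trial = TASK_S // CUE_INTERVAL_S   # 3 cues per trial
    trial_len = PRE_S + TASK_S + POST_S         # 170 s per trial
    events = []
    for t in range(N_TRIALS):
        start = t * trial_len
        # Pre-task control period: repeat the five vowels
        events.append((start, "control", None))
        # Task period: a new character cue every 20 s
        for k in range(cues_per_trial):
            onset = start + PRE_S + k * CUE_INTERVAL_S
            events.append((onset, "task", chars[t * cues_per_trial + k]))
        # Post-task control period
        events.append((start + PRE_S + TASK_S, "control", None))
    return events


def silent_period_ms(tr_ms, ta_ms=TA_MS):
    """Length of the no-sound window in each TR, i.e. TR minus TA."""
    if tr_ms <= ta_ms:
        raise ValueError("TR must be longer than TA to leave a silent period")
    return tr_ms - ta_ms


if __name__ == "__main__":
    for onset, condition, char in build_schedule():
        print(onset, condition, char)
```

Under these assumptions the session lasts 3 × 170 s = 510 s, with task cues at 40, 60, and 80 s after the start of each trial. Calling `silent_period_ms` with the actual TR of the acquisition gives the time available for speech in each volume.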