The assessment of speaking skills in foreign language testing has always been a multi-dimensional process with both advantages (testing learners’ speaking skills strengthens the validity of any language test) and drawbacks (many test-relevant and test-irrelevant variables interfere). Exploring rater behaviours during the scoring of learners’ speaking performances is therefore necessary, not only for estimating inter- and intra-rater reliability but also for identifying overly stringent or lenient graders within a rater group, so that the best rater pairings can be arranged when paired-rater scoring or cross-marking is preferred to increase objectivity. In this exploratory study, conducted in 2019, six expert speaking graders scored the video-recorded speaking interviews of 24 English language learners, each interview consisting of an individual task and a pair discussion task. Many-Facet Rasch Measurement (MFRM) was used to explore the expert graders’ scoring behaviours in terms of stringency and to determine whether their grading habits significantly affected the learners’ overall speaking performance scores. The results showed significant score differences among the graders, and some of them scored too leniently or too stringently, which might significantly affect learners’ speaking grades.
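The MFRM approach described above models each observed score as the outcome of several additive facets on a common logit scale, typically learner ability minus rater severity minus task difficulty. The following is a minimal, hypothetical sketch of that idea with synthetic data: the learner, rater, and task counts mirror the study's design (24 learners, 6 raters, 2 tasks), but the simulated scores, the dichotomous scoring, and the simple gradient-based joint maximum-likelihood fit are illustrative assumptions, not the study's actual procedure or software.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design mirroring the study: 24 learners, 6 raters,
# 2 tasks (an individual task and a pair discussion task).
n_learners, n_raters, n_tasks = 24, 6, 2

# "True" facet parameters (in logits) used only to simulate data.
true_ability = rng.normal(0.0, 1.0, n_learners)
true_severity = rng.normal(0.0, 0.7, n_raters)      # + = stringent, - = lenient
true_difficulty = rng.normal(0.0, 0.5, n_tasks)

# Many-facet Rasch logit for a dichotomous judgement:
#   logit P(x=1) = ability_n - severity_j - difficulty_i
logits = (true_ability[:, None, None]
          - true_severity[None, :, None]
          - true_difficulty[None, None, :])
X = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits)))  # simulated 0/1 scores

# Joint maximum-likelihood estimation by gradient ascent on the
# log-likelihood (a toy substitute for dedicated MFRM software).
a = np.zeros(n_learners)   # ability estimates
s = np.zeros(n_raters)     # severity estimates
d = np.zeros(n_tasks)      # difficulty estimates
lr = 0.01
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(a[:, None, None]
                              - s[None, :, None]
                              - d[None, None, :])))
    resid = X - p                        # observed minus expected score
    a += lr * resid.sum(axis=(1, 2))     # ability gradient
    s -= lr * resid.sum(axis=(0, 2))     # severity enters with a minus sign
    d -= lr * resid.sum(axis=(0, 1))     # so its gradient is negated too
    a = np.clip(a, -5.0, 5.0)            # guard against extreme score patterns
    s -= s.mean()                        # anchor: centre severity at zero
    d -= d.mean()                        # anchor: centre difficulty at zero

# Raters with positive estimated severity are stringent; negative, lenient.
print(np.round(s, 2))
```

In practice, the centred severity estimates (and their fit statistics, which this sketch omits) are what allow comparisons such as "rater 3 is significantly more stringent than rater 5", supporting decisions about rater pairings.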