Test Evaluation.doc

Uploaded by: 人*** | Document ID: 557298667 | Uploaded: 2023-06-10 | Format: DOC | Pages: 19 | Size: 111 KB

Introduction
1 Reliability
2 Validity
  2.1 Construct Validity
  2.2 Content Validity
    2.2.1 Test Syllabus
    2.2.2 What ability has been assessed?
    2.2.3 What topics have been covered?
3 Authenticity, Interactiveness and Practicality
  3.1 Authenticity
  3.2 Interactiveness
  3.3 Practicality
4 Impact
Conclusion
References
Appendix 1 Listening Subtest A (December, 2005)
Appendix 2 Listening Subtest B (June, 2009)

Introduction

College English Test Band 4 (CET-4) in China is administered by the National College English Testing Committee on behalf of the Higher Education Department of the Ministry of Education. The test has been held twice a year since 1987. Its test-takers are undergraduates in China majoring in any discipline except English. Usually they are second-year university students who have completed the college English course, Bands 1 to 4. The purpose of the test is to provide an objective evaluation of college students' overall English proficiency and of the English teaching quality of universities, so as to exert a positive impact on college English education in China.

However, CET has long been accused of mainly examining grammar and vocabulary instead of focusing on communicative ability. The biggest criticism against CET is that it produces students who are good only at paper-based tests and have an unsatisfactory command of practical English. Facing such criticism, from 1996 onwards the CET committee undertook a series of reforms to attach more importance to students' productive skills in the assessment, such as introducing the CET spoken English test, employing new response formats, reporting the average graded scores, and launching the web-based CET. The most recent reform took place in 2005, resulting in a new CET test across the country in 2006. The changes were made in various aspects: increasing the weighting of listening from 20% to 35%; removing the vocabulary and structure section; introducing new test content, such as skimming and scanning and translation; introducing new test formats, such as banked cloze, true or false, and short answer questions; introducing an online marking system for subjective items; and changing the scoring system.

This paper will mainly centre on the reform made in the listening section by comparing two listening tests administered in December 2005 (Test A) and June 2009 (Test B), that is, before and after the reform. The evaluation will be based on Bachman and Palmer's (1996) framework of test usefulness, which is defined as "a function of several different qualities, all of which contribute in unique but interrelated ways to the overall usefulness of a test" (1996: 18). The qualities contributing to test usefulness are reliability, construct validity, authenticity, interactiveness, impact and practicality. The two listening tests will be evaluated and compared on the basis of this framework.

1. Reliability

In this section, the instrument-related reliability of both tests will be examined. In Test A, all questions are objective multiple-choice items, while Test B also contains compound dictation, so assessor-related reliability will be discussed only for Test B.

In Test A, there are 10 short conversations and 3 short passages, which together comprise 20 multiple-choice questions. In Test B, there are 8 short conversations, 2 long conversations, 3 short passages and 1 passage for compound dictation, comprising 25 multiple-choice questions and 11 blanks altogether. The introduction of long conversations and compound dictation is a striking change in the listening section. The longer conversations can incorporate meaning negotiation and discourse features into the texts, thus engaging test-takers in a broader context of listening.

In the compound dictation section, 8 blanks must be filled with the exact words students have just heard, and the 3 others can be filled with either the exact words or the main points in the students' own words.

For the first 8 blanks the missing information is just one word, whereas the other 3 blanks omit longer clauses. Apparently, the introduction of dictation minimizes the chance of guessing and cheating, so Test B has higher instrument-related reliability than Test A in this respect. However, filling in exact words carries the potential danger of testing only learners' short-term memory and spelling, which is mitigated by the following 3 larger chunks of missing information. The examiners are trained and standardized to accept any semantically acceptable form, so any answer that demonstrates understanding of the dictated utterance is awarded a mark. We can say that assessor-related reliability can be assured in Test B. Moreover, the missing clauses are long enough and the time allowance is short enough to make it almost impossible for any test-taker to memorize every exact word of the clauses. The increased amount of missing information places more demand on working memory and linguistic knowledge of learn
