SCIENCE ENHANCEMENT PROGRAMME
 Teaching Ideas and Evidence
Institute of Education, University of London; University of York; King’s College London; Keele University; University of Cambridge
 
Activity L: Quality evaluations?
Developed by Rod Watson. Sc1, Evaluating
King’s College, University of London

Introduction
Pupils can learn how to evaluate simply by writing evaluations. A much more focused approach is to make explicit the qualities of a good evaluation, so that pupils acquire criteria for judging quality. Without such criteria, pupils may concentrate on their feelings about the enquiry or on how well they worked with others in their group, rather than judging an enquiry by the quality of the evidence collected and how that evidence has been used. In this activity, pupils observe a simple demonstration, then look at the evaluations written by ten different groups and highlight good and poor points of evaluation. They then take part in a whole-class discussion focused on the quality of evaluations.


Objectives
Pupils will learn about the features of a good evaluation, and what should be omitted from an evaluation.


Outcomes
By the end of the lesson, pupils will be able to:

• write evaluations that highlight whether variables have been adequately controlled in a fair-test enquiry;

• identify whether measurements are accurate and how accuracy can be improved;

• recognise that statements of opinion about how they felt about an enquiry have nothing to do with the quality of evidence collected.


Notes for Teachers
The following books provide a useful introduction to how to teach the concepts of evidence introduced in these materials:

• Watson, R. and Wood-Robinson, V. (2002) Investigations: Evaluating evidence. In Sang, D. and Wood-Robinson, V. (eds.) Teaching Secondary Scientific Enquiry, pp. 89-107. London, John Murray.

• Goldsworthy, A., Watson, J. R. and Wood-Robinson, V. (2000) Investigations: Developing Understanding Hatfield, Association for Science Education, pp. 50-55 and pp. 65-68.

Another example of an exercise to teach evaluation of fair tests can be found in:

• Goldsworthy, A., Watson, J. R. and Wood-Robinson, V. (2000) Investigations: Developing Understanding Unit 20b: Evaluation of a fair test, pp. 113-114. Hatfield, Association for Science Education.










Task 1 (10 minutes)
First demonstrate the candle experiment. The question being investigated is ‘Do different coloured candles burn for different lengths of time?’ The purpose of the demonstration is to set a context for looking at the quality of evaluations.

Fix two different coloured birthday candles to a heat-proof mat by standing each in a drop of melted wax. It is best to use half candles; otherwise the demonstration takes too long. Light the candles and time how long each takes to burn down completely. The times will probably be very similar. Discuss with the class how you would know whether a difference in burning times was significant or just due to a draught, slightly different sized candles, etc.
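One rough way to frame this discussion is sketched below in Python: compare the difference between two colours with the spread seen among candles of the same colour. The readings are those quoted in statements 3 and 10 of appendix 1. The function name `spread` and the rule of thumb used here (a difference smaller than the same-colour spread is not convincing) are illustrative assumptions rather than part of the original activity, and this is not a formal significance test.

```python
# Readings in seconds, copied from statement 3 in appendix 1.
yellow = [183.03, 242.11, 187.56]
white = [185.95, 181.73, 189.44]   # column headed "white candle"


def spread(readings):
    """Range of a set of repeat readings: largest minus smallest."""
    return max(readings) - min(readings)


# Statement 10: one yellow candle (240.45 s) and one blue candle (242.11 s).
difference = abs(242.11 - 240.45)

# Smallest spread seen among candles of the same colour.
same_colour_spread = min(spread(yellow), spread(white))

print(f"Difference between colours: {difference:.2f} s")
print(f"Spread within one colour:   {same_colour_spread:.2f} s")

# Illustrative rule of thumb (an assumption, not a statistical test):
# a difference smaller than the same-colour spread could easily be
# due to a draught or to slightly different candles.
convincing = difference > same_colour_spread
print("Difference is convincing:", convincing)
```

Here the difference between the two colours is much smaller than the variation among candles of a single colour, which supports the point that one pair of timings cannot settle the question.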

Emphasise that the class will look at the quality of evaluations, and that good evaluations focus on the quality of evidence and how it has been used. Discuss the sorts of things that could be done to improve the quality of evidence, e.g. better control of variables, taking repeat readings and making more accurate measurements. Ask pupils to identify not only what to improve, but also how to improve it. The purpose of this discussion is to focus on what goes into a good evaluation.


Task 2 (20 minutes)
Hand out pupil activity sheet 1, Quality Evaluations? and ask the pupils to complete the sheet in pairs. The answers can be found in appendix 1. Blue and yellow highlighters have been used on the answer sheet because these colours are distinguishable on a black and white printed copy.


Task 3 (10 minutes)
Discuss answers with the whole class. Make the points below.

Evaluation is not about:

• whether the enquiry is easy or difficult (2);
• getting the right answer and who is to blame for any errors (5, 6, 7);
• working well together (7);
• whether predictions were correct (8).


Evaluation is about quality of evidence:

• Experimental design, in this case ‘fair testing’. Fair testing requires relevant control variables to be identified and controlled (1, 8), and a good evaluation says how they can be controlled (3, 9).
• Accuracy of data collection (5), and how accuracy can be improved by collecting repeat readings (3, 7) or by making measurements more accurate (9).

Evaluations that identify how the quality of evidence can be improved are better than those that simply identify possible problems with evidence.

[NB Numbers in brackets refer to the statements on pupil activity sheet 2]









Pupil Activity Sheet 1: Quality evaluations?

Quality evaluations?



When you write an evaluation of an enquiry what should you write? What is the difference between a good quality evaluation and a poor one?

Groups of students were asked to find out whether different coloured birthday candles burn for different lengths of time. They were then asked to describe their enquiries and write an evaluation of their enquiries.

Your job in this assignment is to look at their evaluations and decide which ones are the best and explain why.




On sheet 2 there are statements from the evaluations that the students wrote. Some statements show that the student has thought carefully about the evidence. Other statements are not critical of the evidence, but give a personal opinion without any evidence to back them up.


What you have to do:

1. Read each statement on sheet 2.

2. Work in pairs to decide which parts are good points of evaluation. Underline these good points in red.

3. Work in pairs to decide which parts are poor points of evaluation. Underline these poor points in blue.










Pupil Activity Sheet 2: Quality evaluations?

Statements from Evaluations

1. We tested a blue and a red candle and measured the time it took for them to burn. We tried to keep everything the same, except for the colour of the candle, but I lit the red one and Jemma lit the blue one. We should have had the same person lighting both candles.


2. I think this experiment was really easy. My results and Sue's together gave us a good pattern. We didn't include Dan's because his were way out. He must have read the stop-clock wrong all the time.


3. We tested three yellow and three pink candles. We measured the length of each candle to make sure they were all the same length. The results with different candles were different.


                 Time to burn yellow candle (s)   Time to burn white candle (s)
First candle     183.03                           185.95
Second candle    242.11                           181.73
Third candle     187.56                           189.44
Average          204.23                           185.71


We think the candles were different thicknesses and we should have weighed them to make sure they were the same size.


4. We tested one blue and one white candle and timed them to see how long they burnt for. We repeated the measurements three times with new blue and white candles to make it fair.


5. It was the right answer, because we did it fairly. We made sure the candles were all the same size and our timing was accurate. It wasn't our fault that people kept walking by and making drafts, so you can't blame us and say we weren't fair.


6. Our experiment was good because it proved something. We now know that red candles burn faster than blue ones.


7. I feel very pleased with the success of the experiment. We worked together well as a group and did the investigation very well. I would have liked to come to a much more final conclusion before having to finish. This is not our fault. We did manage our time well and got a lot done in the time we had. We could have investigated further by timing more candles and taking an average, but I think we have the correct results and carried out a well planned investigation, which was worthwhile and interesting.


8. I think I have done well in predicting the results. I said the red candles would burn longer and they did. We made it as fair but we couldn’t tell if the wicks were exactly the same.


9. I used a stopwatch to make the experiment accurate. I also measured the length of each candle with a ruler and they were the same length (4.3cm). I trimmed the wicks of the candle before I started to make sure they were all the same length. The flames of the candles kept on blowing around because Joel kept opening the window. This mucked up our experiment and made the wax run. If I repeated the experiment I would shield the candle with books.


10. I don’t think this tells me much. The yellow candle burnt for 240.45secs and the blue one for 242.11secs. This might have been because the candles were just a bit different and be nothing to do with the colour of the candle. I need to do it again with a lot more candles to find out if there is really a difference.








Appendix 1: Chemical Reactions & Measurement

Answers to pupil activity

Answers are coded as follows:
Yellow - good points of evaluation.
Blue - poor points of evaluation.


1. We tested a blue and a red candle and measured the time it took for them to burn. We tried to keep everything the same, except for the colour of the candle, but I lit the red one and Jemma lit the blue one. We should have had the same person lighting both candles.

The evaluation comments relate to fair testing, but who lights the candle is not a relevant variable.


2. I think this experiment was really easy. My results and Sue's together gave us a good pattern. We didn't include Dan's because his were way out. He must have read the stop-clock wrong all the time.

Opinions about whether the experiment is easy have nothing to do with the quality of the evidence. No evidence is given that would allow us to judge whether different students had made measurement errors. Dan's results may have been ‘way out’, but outvoting another student is not a good way of evaluating.


3. We thought that the candles might not be exactly the same so we tested three yellow and three pink candles and took the average to get a more accurate result. We measured the length of each candle to make sure they were all the same length. The results with different candles were different.


                 Time to burn yellow candle (s)   Time to burn white candle (s)
First candle     183.03                           185.95
Second candle    242.11                           181.73
Third candle     187.56                           189.44
Average          204.23                           185.71


We think the candles were different thicknesses and we should have weighed them to make sure they were the same size.

Procedures for increasing accuracy by averaging are mentioned, as are two factors affecting whether relevant variables were controlled.
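
The averages in the table can be checked with a short Python sketch, which also shows how much the yellow average depends on the one unusually long reading. The variable names are illustrative, and leaving out a reading is done here only to show its influence; it is an assumption made for the sake of the example, not a recommended procedure.

```python
from statistics import mean

# Burning times in seconds, copied from the table in statement 3.
yellow = [183.03, 242.11, 187.56]
white = [185.95, 181.73, 189.44]   # column headed "white candle"

mean_yellow = mean(yellow)
mean_white = mean(white)
print(f"Yellow average: {mean_yellow:.2f} s")
print(f"White average:  {mean_white:.2f} s")

# For illustration only: recompute the yellow average without the
# longest reading, to see how much the result depends on that one candle.
longest = max(yellow)
yellow_without_longest = [t for t in yellow if t != longest]
mean_yellow_without = mean(yellow_without_longest)
print(f"Yellow average without {longest} s: {mean_yellow_without:.1f} s")
```

With all three readings the yellow candles appear to burn about 18.5 s longer on average; without the 242.11 s reading the two averages differ by less than half a second. This is not a reason to discard the reading (see the comment on statement 2), only a way of showing why more candles would need to be tested before drawing a conclusion.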


4. We tested one blue and one white candle and timed them to see how long they burnt for. We repeated the measurements three times with new blue and white candles to make it fair.

The writer has identified the need to make repeat readings, but justifies this in terms of fair testing (i.e. controlling variables) rather than in terms of accuracy.

5. It was the right answer, because we did it fairly. We made sure the candles were all the same size and our timing was accurate. It wasn't our fault that people kept walking by and making drafts, so you can't blame us and say we weren't fair.

This statement contains comments on how well the group had worked, which has nothing to do with quality of evidence. Fair testing and accuracy are considered.


6. Our experiment was good because it proved something. We now know that red candles burn faster than blue ones.

This statement has nothing to do with quality of evidence.


7. I feel very pleased with the success of the experiment. We worked together well as a group and did the investigation very well. I would have liked to come to a much more final conclusion before having to finish. This is not our fault. We did manage our time well and got a lot done in the time we had. We could have investigated further by timing more candles and taking an average but I think we have the correct results and carried out a well planned investigation, which was worthwhile and interesting.

These students are trying to justify how they worked, but do make a suggestion for improving the investigation.


8. I think I have done well in predicting the results. I said the red candles would burn longer and they did. We made it as fair but we couldn’t tell if the wicks were exactly the same.

This student judges the enquiry in terms of whether her prediction was correct rather than in terms of the quality of the evidence. One uncontrolled variable is mentioned in relation to fair testing.


9. I used a stopwatch to make the experiment accurate. I also measured the length of each candle with a ruler and they were the same length (4.3cm). I trimmed the wicks of the candle before I started to make sure they were all the same length. The flames of the candles kept on blowing around because Joel kept opening the window. This mucked up our experiment and made the wax run. If I repeated the experiment I would shield the candle with books.

The quality of the evidence is considered in terms of accuracy of measurement.


10. I don’t think this tells me much. The yellow candle burnt for 240.45secs and the blue one for 242.11secs. This might have been because the candles were just a bit different and be nothing to do with the colour of the candle. I need to do it again with a lot more candles to find out if there is really a difference.

Uncontrolled variables are mentioned, although no specific ones are identified.




 