Peer-Assessment: Economic Convenience or Pedagogical Preference?

The new economy is driving change in higher education. Fiscal pressures coupled with unprecedented growth are reducing staff-student contact time, and summative assessment processes that include self- and peer-assessment offer convenient ways to reduce staff workloads. Despite convincing pedagogical arguments, student beliefs and perceived capabilities regarding these alternative assessment practices undermine their validity. An epistemological basis for the range of student beliefs is proposed. Pre-service primary teachers' levels of support for self- and peer-assessment were measured using scales developed from multi-dimensional indicators. Moderate support was evident, and support for self-assessment was positively and strongly correlated with support for peer-assessment, though one third of students opposed their use. A relationship between students' epistemological perspectives and their support for peer-assessment is tentatively established, particularly for men. The implication for practice is that students' valuing of peer-assessment, and therefore its validity, is likely to be enhanced by providing students with greater opportunities to experience valuable peer interactions in the classroom. While the men were more supportive of peer-assessment than the women, supporting men towards developing a more connected knowing orientation would also likely enhance their support for the practice. Gender-related patterns evident in this study of students' epistemological perspectives are consistent with earlier work on adult intellectual development. Future studies are recommended to explore gender-related validity issues with respect to student-derived summative assessment. Qualitative data revealed a tension for students between maintaining peer relationships and providing accurate assessment of their peers' work. Tensions between fiscal pressures and pedagogical preference are discussed.


In response to unprecedented growth in student enrolments in the higher education sector, coupled with current political imperatives towards cost cutting and efficiency drives (Ballantyne, Hughes & Mylonas, 2002), teaching staff are seeking creative ways to deal with the associated increase in student assessment workload (Pope, 2005). Within this context, self- and peer-assessment are seemingly attractive options (Hanrahan & Isaacs, 2001). As Pope (2005, p. 51) bluntly comments, "It is in the interests of universities to consider the introduction of peer and self-assessment if only on a purely financial basis". Nonetheless, Ballantyne et al. (2002) suggest that, at least in the short term, these practices may be just as time consuming as more typical kinds of assessment because of the additional procedural guidelines, criteria sheets and student training required. Others propose that, when these practices are used, teaching staff should also assess in order to enhance validity (e.g., Zevenbergen, 2001).
Certainly the use of diverse assessment methods to ascertain students' knowledge is generally accepted as best practice (Fry, 1990; Orsmond, Merry & Reiling, 2000), and the diversification of assessment practices to include some form of self- and peer-assessment in higher education appears to be a growing practice, given the extent of coverage in the literature (see Falchikov & Goldfinch, 2000; Pope, 2005). In this paper, an instrument to systematically measure student beliefs about self- and peer-assessment, based on issues raised in this literature, is presented in the context of pre-service education. There is an ethical imperative for faculties of education to be deeply involved in research on alternative assessment practices, particularly within the current economic climate. Teacher educators have a particular responsibility to model excellent teaching practice, and there is an onus on faculties of education to come to terms with the complexities of self- and peer-assessment.

Theoretical Framework
The use of peer assessment is grounded in philosophies such as active learning (e.g., Piaget, 1971), adult learning theory (Cross, 1981) and social constructivism (Vygotsky, 1962). According to a number of authors, peer-assessment can assist students to become more independent learners through more active engagement with new knowledge and a deepened understanding of content (Hanrahan & Isaacs, 2001; McDowell, 1995; Topping, 1998). Peer-assessment can also foster better inter-personal relationships and project management skills (McDowell, 1995) and improve student verbal communication and negotiation skills (Ellis, 2001; Topping, Smith, Swanson & Elliot, 2000). These qualities are particularly desirable to develop in teachers. That these practices can take the mystery out of the assessment process (Brindley & Scoffield, 1998), thereby giving students a better understanding of the inherent subjectivities in the process (Hanrahan & Isaacs, 2001), suggests that students' epistemological beliefs need to be considered when exploring this area of practice.
Perry (1970), in his foundational longitudinal study, provided a theory of stages in adult epistemological development that pertained to how undergraduate students make meaning of their educational experience in association with their assumptions about the nature, limits and certainty of knowledge. Baxter Magolda (1992), in her comparable longitudinal study, identified epistemological shifts in student views with respect to five domains in learning: role of self, role of peers in learning, role of the teacher, role of assessment, and the nature of knowledge. At one end of the scale, absolute knowing is characterised by a belief that knowledge is certain; the perception that the role of peers is mainly social; a reliance on external knowledge authority; and the view that assessment is the sole domain of the teacher. At the other end of the scale, Baxter Magolda (1992) described a perspective called 'contextual knowing', characterised by a belief that knowledge is inherently uncertain; a view that the role of peers is critical for learning to assist in understanding alternate perspectives; clear indications of a strong internal knowledge authority; and, in relation to assessment, a perception that the teacher and students need to work together as a team in setting goals to measure progress. As students developed an appreciation of knowledge being theory driven, rather than factual, their appreciation of peers' views on focus topics was similarly enhanced. Baxter Magolda's (1992) research findings echoed those of Perry (1970), but she also identified gender-related patterns. As the men and women in her study moved towards a more 'contextual' view of knowledge creation, women were more likely to come to this perspective from an interpersonal reasoning pattern, and men from an impersonal reasoning pattern.
These descriptions have striking parallels with Belenky, Clinchy, Goldberger and Tarule (1986) in Women's Ways of Knowing (WWK) and have been discussed elsewhere (see Brew, 2001). Belenky et al. (1986) proposed that their participants tended to emphasise either a separate or a connected knowing orientation in their cognitive framework. Separate knowing (SK) is linked to the traditional notion of critical thinking, epitomised by taking an oppositional orientation when someone expresses their viewpoint (Belenky et al., 1986). Connected knowing (CK), on the other hand, is an empathic orientation towards understanding others: "When I have an idea about something, and it differs from the way another person's thinking about it, I'll usually try to look at it from that person's point of view, see how they could say that, why they think they're right, why it makes sense" (Clinchy, 2002, p. 73). In this way both separate and connected knowers are described as exhibiting objectivity, but as developing this objectivity from different perspectives. Clinchy (2002, p. 73), in retrospect, reflects on how she and her colleagues in WWK were "audacious" in proposing CK as a genuine procedure, as they had little empirical data for constructing the concept. Knight, Elfenbein and Messina (1995) did establish a reliable quantitative measure for these two constructs, identifying them as unipolar or orthogonal dimensions rather than opposites. They suggested that future research examine the preference of connected and separate knowers for particular teaching approaches. Clinchy's more recent writings also present a more coherent and systematic comparison of the two concepts, drawing on past and more recent research. The relevance of the two concepts to the current study on assessment practices is more evident from this work.
Connected Knowers, like Subjectivists, are reluctant to make judgements; they are in this sense accepting but theirs is not the passive acceptance … of the Subjectivist. Fully developed Connected Knowing requires that one "affirm" or confirm the subjective reality of the other, and affirmation is not merely the absence of negative evaluation; it is a positive effortful act. (Clinchy, 2002, p. 77)
With SK, the purpose is primarily to examine the validity of the other's perspective, with an emphasis on "looking for what is wrong … How good is this ….? What are its strengths and weaknesses?" (Clinchy, 2002, p. 74). Furthermore, separate knowers are concerned about bias, and would be supportive of the use of "blind grading in assessing students' work" (Clinchy, 2002, p. 75).
With respect to CK and SK being gender-related, undergraduate women have consistently rated themselves as more connected than separate knowers, whereas men have not rated themselves differently on the two scales (Galotti, Clinchy, Ainsworth, Lavin & Mansfield, 1999; Galotti, Drebus & Reimer, 1999). In later years of study in higher education, evidence suggests that epistemic gender-related differences tend to diminish (Baxter Magolda, 2002; Wood & Kardash, 2002). It seems appropriate, therefore, to investigate student views of the validity of self- and peer-assessment practices in connection with their orientation towards connected and separate knowing, with additional consideration given to gender.

Issues Relating to Self and Peer Assessment
The pedagogical arguments for the use of self- and peer-assessment pertain primarily to a formative assessment structure. That is, the benefits are described with respect to enhancing students' intellectual and inter-personal skills and capabilities. The university education system, however, is biased towards summative assessment practices (Pope, 2005) and this results in some severe limitations.
Over the past decade, descriptive studies have reported beliefs among students that demonstrate these limitations. Students commonly believe it is the role of teaching staff to assess and not their responsibility (Davies, 2000; Brindley & Scoffield, 1998; Searby & Ewers, 1997). Students request that, if peer-assessment is used, teaching staff should also provide feedback to them (McDowell, 1995). Orsmond and Merry (1996) state that this belief is linked to students' perceived lack of expertise to mark others' work and an inherent scepticism about the worthiness of peers' comments. Similar views have also been reported by other authors (Falchikov, 1995; Mowl & Pain, 1995). Corrupt practice, such as the awarding of higher marks to friends, occurs (Cheng & Warren, 1997), and there is a general reluctance to award low marks to peers for poorly presented assignments (Brindley & Scoffield, 1998; Falchikov, 1995). Peer assessment is also associated with considerable stress and anxiety (Pope, 2005).
The implicit assumption behind using self- and peer-assessment for summative purposes is that it will be at least as accurate and fair as staff assessment (Ballantyne et al., 2002). Studies that have explored the validity of summative peer-assessment more systematically commonly compare peer-assessed scores with faculty-based scores. A meta-analysis of 48 such studies by Falchikov and Goldfinch (2000) is cited by several authors as strong evidence in support of the validity of peer-assessment (e.g., Hanrahan & Isaacs, 2001; McConnell, 2002; Laughlin & Simpson, 2004), though Langan et al. (2005) do provide a more even-handed consideration of the evidence provided by Falchikov and Goldfinch (2000). After removing outlier studies and statistically adjusting for sample size, Falchikov and Goldfinch (2000, p. 314) report a standardised correlation of 0.69 (range 0.14-0.99), which they propose is indicative of "definite evidence of agreement between peer and teacher marks on average". This is the evidence most commonly cited by the above-mentioned studies, with the omission of the following rider:
An r value of 0.69 indicates that less than half of the variation in peer marks is associated with variation in teacher marks … [and that] … Some caution should be exercised in the interpretation of the results of the present study due to the presence of some very small sample sizes. In addition, there may be some liberal readings of the data due to the combination of variables in several ways. (Falchikov & Goldfinch, 2000, pp. 314, 317)
It is important to recognise that in the hypothetical situation of student-derived grades being either consistently the same as, or consistently one grade more (or less) than, a faculty score, a perfect correlation would be obtained in both instances. Hence an r-value alone is not necessarily a representation of peer-assessment 'validity' in this context. Falchikov and Goldfinch (2000) did provide a mean effect size of 0.02 (range -0.75 to 1.25) from 24 of the studies, which does support their overall conclusion, as this is a measure of the difference in means between peer and faculty scores. Nonetheless, the range in values indicates that peers in some of the reviewed studies provided scores well above or well below those of staff. Interestingly, "the correlations were significantly smaller as the number of peers increased" (Falchikov & Goldfinch, 2000, p. 312), and there was no consistent pattern to support the proposition that the correlations between staff and peer scores increase from first year to more advanced levels in the higher education context.
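The point about correlation versus agreement can be illustrated with a minimal sketch. The marks below are hypothetical and are not data from any of the studies cited; `pearson_r` is an illustrative helper written for this example only. The sketch shows that peer marks inflated by exactly one grade still correlate perfectly with faculty marks, and that an r of 0.69 corresponds to less than half of the variance being shared.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical faculty marks (illustrative only).
faculty = [12, 14, 15, 17, 18]
peer_same = list(faculty)                       # peers agree exactly
peer_inflated = [mark + 1 for mark in faculty]  # peers always one mark higher

r_same = pearson_r(faculty, peer_same)
r_inflated = pearson_r(faculty, peer_inflated)

# The correlation hides the systematic gap; the mean difference exposes it.
mean_gap = sum(p - f for p, f in zip(peer_inflated, faculty)) / len(faculty)

# Proportion of variance shared when r = 0.69 (the value quoted above).
shared_variance = 0.69 ** 2

print(r_same, r_inflated, mean_gap, round(shared_variance, 2))
```

Both correlations come out at 1 (up to floating-point rounding) even though the inflated marks differ from the faculty marks by a full mark on average, which is why a difference-in-means measure is a useful companion to r.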
An alternative review of the available 'validity' literature by Pope (2005) reports largely low correlations between teacher and peer scores, with notable exceptions, particularly when the process is well designed with respect to clear criteria. Falchikov and Goldfinch (2000) also list a number of recommendations to enhance validity when peer assessment is used for grades in higher education. Particular recommendations include avoiding having students numerically rate many separate dimensions, and using global marks with well understood, integrated criteria. Assessment criteria developed by students also apparently provide better correlations between student- and faculty-derived scores.
With respect to self-assessment, evidence does suggest better agreement with faculty-derived scores for students in advanced compared to introductory courses (Falchikov & Boud, 1989). More recently, Rudy et al. (2001) noted that medical students tend to be overly critical of their own performance, and Lejk and Wyvill (2001) found that lower achieving computing students tend to favour themselves while brighter students do not. Subject area also appears to be an important variable (Falchikov & Boud, 1989).

Context for the Study
Working within an Australian teacher education program, my colleagues and I are regularly reminded by our superordinates to seek out more cost-effective ways to conduct our teaching and administrative responsibilities. 'Over-teaching' and 'over-assessing' are now common expressions at faculty meetings. This pressure is directly linked to declining base federal government funding for the university sector. At the same time, we are required to strive for improved results on feedback about the quality of our teaching in order to obtain a slice of the Australian Federal Government's new Learning and Teaching Performance Fund (LTPF). This initiative is designed to reward financially the top 14 universities, assessed on findings from the CEQ (Course Experience Questionnaire), the GDS (Graduate Destination Survey) and completion rates.
Over the past three years our one-year full-time primary pre-service teacher enrolment has grown from around 100 to 280 students. Despite additional staff, tutorial class sizes have increased from 30 to 45 students. Workload allocation is increasingly becoming atomised and more time is now consumed with administration duties related to accountability requirements. Within this climate, practices that reduce time associated with formal assessment of students are highly desirable.

Aims of the Study
There were four broad aims for this study. The first aim was to develop an instrument to systematically explore the perceived validity of self- and peer-assessment among students. The second aim was to explore a possible relationship between perceived validity of these assessment practices and students' epistemological perspectives. The third aim was to correlate self- and faculty-derived scores to investigate the extent of overlap. The fourth aim was to explore gender differences across these investigations. A broad range of predictions was proposed:
1. Perceived validity of self- and peer-assessment would be positively correlated, given that they both require a perception of competence in self with respect to the ability to assess student work.
2. Perceived validity of peer-assessment would be positively correlated with support for a discussion-oriented classroom environment, as support for peer assessment should be indicative of a strong role for peers in learning.
3. Perceived validity of peer- and self-assessment would be positively correlated with a lack of acceptance for knowledge ambiguity as this would reflect an absolute orientation towards knowledge and authority.
In relation to a possible relationship between level of support for peer assessment and the connected and separate knowing orientations:
1. Given that a connected knowing orientation would likely reflect the embracing of the opportunity to give and receive constructive feedback to enhance learning, and that there is a high level of acceptance of subjectivity, it is proposed that there would be a positive correlation with the perceived validity of self- and peer-assessment. However, connected knowers are reluctant to judge and this may create conflict within the learner when awarding grades.
2. Given that a separate knowing orientation would likely reflect a concern for subjective bias, this may create concern for validity and hence a negative correlation with perceived validity of self- and peer-assessment is proposed. However, separate knowers are natural judgers and so there may be some appeal among them for the peer-assessment process.
With respect to gendered patterns, it was proposed that:
1. The women in the study would report a greater orientation for connected versus separate knowing compared to men, based on the literature, though within a primary pre-service program this general trend may be negligible.
With respect to self-assessment, it was proposed that:
1. Higher-achievers would be more likely to assess themselves at a lower grade compared to faculty, and lower-achievers to assess themselves at a higher grade than faculty.

Method
Students enrolled in the primary mathematics method of a metropolitan-based one-year Graduate Diploma of Education in 2006 participated. The students were predominantly local Australian students (85%) and women (70%), with a small international cohort of students (15%) drawn mainly from Canada.

Assessment Data
There were two mathematics method assignments in the first semester. The first assignment (value 20%) involved a group presentation based on a research article in mathematics education. In groups of four, students were required to discuss their article and deliver a 15 minute activity-based presentation to peers. The focus of the presentation was to illustrate the practical implications of the research for the school classroom. Student groups were provided with six criteria based on a four-point scale for marking: 1) activity engaging and meaningful with respect to classroom practice; 2) research findings clear; 3) activity relevant to the findings; 4) explicit links to the curriculum; 5) clarity of expression and directions to group; and 6) responsiveness to questions from class members. Each assessing group was required to arrive at a score for the presenting group by consensus, and each member of the presenting group received the same score, which was the average of the scores awarded by the assessing groups. Three or four groups assessed at any one time. The pedagogical basis for the activity was two-fold. First, it was designed for students to discover the research literature, share and discuss their interpretations and consider implications for practice. Second, the peer-assessment aspect was designed to provide trainee teachers with assessment experience, to encourage them to listen attentively and to develop their skills in providing constructive feedback. To allow all groups to present in a two hour session, the classes of 45 students were divided across two rooms. The role of the teacher in this session was as an observer who moved between rooms regularly and collected the assessment sheets for later collation. It is important to add that the original assessment format included faculty-assessed scores and presentations which occurred over two weeks. Due to a reduction in coursework hours and increased class sizes, the faculty-assessed component was abandoned. A comparison of teacher- and peer-assessed scores was therefore not possible.
The major assignment required students to conduct an early numeracy diagnostic assessment with a child. The focus of the assignment was on the child's number skills: counting, number recognition and the four operations. Students were required to demonstrate how they had ascertained what the child was capable of in these areas and how they had extended the child to determine the limits of their knowledge, which would provide insight into the level of sophistication of the child's number strategies. Students were provided with a list of criteria with an allocated score for each. A completed self-assessment sheet was requested. The reasons given to students for this process included ensuring that students did not overlook criteria, as is common, and giving assessors an avenue for providing specific feedback where there were discrepancies with the self-assessed score. The self-assessment score did not contribute to student grades. The means, and the correlation between self-assessed and faculty scores, are reported.

Survey Data
Twenty-nine survey items were used to measure multi-dimensional scales concerning peer assessment, self assessment, an interactive classroom, tolerance for ambiguity in learning and the role of faculty in student assessment. Items to measure beliefs about peer assessment included the following themes: perceived validity and accuracy; preparedness to engage; comfort with the process; perceived capability; and curiosity regarding peer feedback. Items to measure beliefs about self assessment included the following themes: perceived validity, preparedness to engage and perceived capability. Two additional assessment items were included to measure students' beliefs about the responsibility of faculty for assessment. Four items were included to measure support for an interactive classroom environment with respect to the valuing of students' voices, ideas, discussion and debate. Three items were included to measure the students' acceptance of ambiguity of knowledge in learning. To measure students' orientation for connected and separate knowing, four items for each were taken from the scales developed by Knight, Elfenbein and Messina (1995). Additional items were included on peer assessment but not designed for scale development per se. These explored themes regarding the extent to which students believed peer-assessment should contribute to their grade and the level of bias with respect to faculty grades. Further items exploring past and present experiences with learning mathematics were also included. A six-point scale was adopted: 1 = strongly disagree; 2 = disagree; 3 = mostly disagree; 4 = mostly agree; 5 = agree; 6 = strongly agree.
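As a minimal sketch of how responses on this six-point scale might be prepared for scale scoring, the snippet below reverse scores negatively phrased items and averages the items of one scale. The function names, the example responses and the choice of which item is negatively phrased are all hypothetical; the paper reports only that reverse scoring occurred and that scale means were aggregated from item scores.

```python
SCALE_MIN, SCALE_MAX = 1, 6  # 1 = strongly disagree ... 6 = strongly agree

def reverse_score(response):
    """Mirror a response around the scale midpoint (1 <-> 6, 2 <-> 5, 3 <-> 4)."""
    if not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError(f"response {response} is outside the six-point scale")
    return SCALE_MIN + SCALE_MAX - response

def scale_mean(responses, reversed_items=()):
    """Mean of one respondent's items on a scale.

    `reversed_items` holds the zero-based positions of negatively phrased
    items (a hypothetical convention used only in this sketch).
    """
    adjusted = [
        reverse_score(value) if index in reversed_items else value
        for index, value in enumerate(responses)
    ]
    return sum(adjusted) / len(adjusted)

# Hypothetical respondent: three items, the second phrased in the negative.
example_mean = scale_mean([5, 2, 4], reversed_items={1})
print(example_mean)
```

With this coding, a scale mean above the midpoint of 3.5 falls on the "agree" side of the scale.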
Survey distribution occurred at the conclusion of the peer-assessed presentations. In all, 76 of the 192 students enrolled in the mathematics method responded (40% response rate). Of these, 66% were women and 15% international students, reflecting the make-up of the larger cohort. For the 53 students who reported their age, the mean was 28.4 years. The mean age for men was higher than for women (32 and 27 years respectively). Fifteen students also participated in three focus group discussions, and written notes were taken at this time. Students were also invited to provide written open-ended feedback on their survey.

Data Analysis
The survey items were subjected to an Exploratory Factor Analysis using maximum likelihood with oblique rotation (Direct Oblimin) to establish the presence of latent factors. Three methods were adopted to justify the number of latent factors: a scree test; a parallel analysis (Thompson & Daniel, 1996); and theoretical interpretability. Factor analytic techniques generally assume a sample size of at least three hundred (Tabachnick & Fidell, 1989) and hence the small sample size creates limitations. However, if the factor structure is unambiguous there is a reasonable premise for exploration. Alpha values, means, standard deviations and inter-correlations of the resultant scales are reported, along with a gender comparison. The qualitative data collected were used to provide further insights into the findings of the quantitative data.
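For readers unfamiliar with the internal consistency measure, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). It assumes that this standard formula is what "Alpha values" refers to; the response data are hypothetical and the function name is illustrative.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical six-point responses from four respondents on a three-item scale.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 6],
    [3, 3, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Alpha approaches 1 when respondents answer the items of a scale consistently; scales with few items tend to yield lower values, which is relevant when judging short scales.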

Exploratory Factor Analysis
Initial measures for the EFA indicated a factor structure worthy of exploring (Kaiser-Meyer-Olkin measure of sampling adequacy = 0.71; Bartlett's test of sphericity = 997.66, df = 378, p < .001). The scree plot and the parallel analysis (Table 1) suggested a five-factor structure, accounting for 56 percent of the variance.
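As a consistency check on the reported statistic, the degrees of freedom for Bartlett's test of sphericity on p variables are p(p − 1)/2. The short sketch below applies that standard formula; the function names are illustrative, and the number of items entered into the analysis is inferred from the reported df rather than stated in this section.

```python
from math import isqrt

def bartlett_df(n_variables):
    """Degrees of freedom for Bartlett's test of sphericity: p(p - 1) / 2."""
    return n_variables * (n_variables - 1) // 2

def variables_from_df(df):
    """Invert df = p(p - 1) / 2 for p; raise if df matches no whole p."""
    discriminant = 1 + 8 * df
    root = isqrt(discriminant)
    if root * root != discriminant:
        raise ValueError(f"df = {df} does not correspond to a whole number of variables")
    return (1 + root) // 2

n_items = variables_from_df(378)
print(n_items, bartlett_df(n_items))
```

On this formula, df = 378 corresponds to a correlation matrix of 28 variables.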
The first factor included the seven items measuring beliefs about peer assessment and the two items concerning the role of faculty in assessment. From the structure matrix, this dimension is overall measuring students' beliefs about the validity of peer-assessment for summative purposes. The second factor included four items measuring support for an interactive or discussion-oriented classroom. The third factor included three of the four items measuring a separate knowing orientation. The fourth factor included five items concerning self assessment, and from the structure matrix this dimension is measuring students' beliefs about the validity of self-assessment for summative purposes. The fifth factor included the four items measuring connected knowing and two of the four items designed to measure acceptance of knowledge ambiguity in learning. Theoretically, it seems sensible that tolerance for contradictory ideas and acceptance of more than one right answer would be correlated with the connected knowing scale, given its association with a subjective epistemological orientation (Clinchy, 2002). Despite the small sample size, the pattern matrix was essentially unambiguous, and only two items were omitted from scale development. One separate knowing item did not have a correlation coefficient with an absolute value greater than 0.30 (Tabachnick & Fidell, 1989) on any of the five factors. The other omitted item, designed to measure students' perspective on the nature of knowledge ("I like classes that focus on factual information"), loaded negatively with the self assessment factor and was omitted due to lack of clear theoretical interpretability. Otherwise, item correlation coefficients were above 0.45 across all five factors (Table 2). Reverse scoring for items phrased in the negative occurred before further analysis. The internal consistency measure for the resultant scales yielded adequate Alpha scores, particularly given the small number of items for some scales (Table 3). The mean for each construct was calculated from an aggregation of scores for all items.
A gender comparison established that the men perceived peer assessment to have greater validity than the women did, with no gender difference for the comparable self-assessment dimension (Table 4). Overall, these students conveyed that they mostly agreed that self- and peer-assessment have validity for summative purposes, but they remained cautious. Comments made by students either on the survey or during the focus group discussions were consistent with this interpretation, and their reasons for caution echo those reported in the literature elsewhere. Of the eighteen students who provided verbal or written feedback regarding peer-assessment, ten qualified their support for peer-assessment because of friendship bias. Representative quotes follow:

Peer assessment can be very unfair especially when friends give each other full marks. If done, it should be only qualitative and anonymous; it is awkward and difficult to be honest and having to give marks in person.
While peer-assessment is a valuable learning and teaching experience, I think it can be unfair. Working in small groups means there is no way to ensure balanced marking standards across the classes. I am also aware that some grades have been awarded on the basis of friendship rather than merit which clearly undermines those that deserve good marks in the first place.
Other students suggested strategies to overcome this bias, for example:
Mix it up with different groups of people who we don't know, make it more objective. All being really nice to each other, so letting things slip through.
When using peer assessment as the final mark for a task please have as many people as possible mark it and average it so as to have standardisation across the class.
Being capable of, and comfortable with, providing graded feedback to peers on the quality of their work is critical for the peer-assessment process, and does require trusting relationships. When people trust each other there is a sense of friendship, and it is important not to undermine this. Hence one can readily empathise with the following perspective:
Early in the year the focus is getting along with people, no skin off my nose if everyone gets an A. I am happy to give constructive feedback but I want to get along with the group, these are people who I am going to rely on for the rest of the year.
Scores awarded by peers for the group presentations ranged from 16 to a maximum score of 20 (mean = 18.3; sd = 1.1), resulting in all groups being awarded an A grade. As one student put it, "Peer assessment results in high scores for all." These findings reflect the view of Langan et al. (2005), who concluded that the use of peer-assessment was extremely valuable for formative purposes, but who warned strongly against its use for summative purposes.
The benefits of learner inclusion and active learning dimensions of this scheme (e.g. learner empowerment, assessment experience, better understanding of assessment criteria), merits its inclusion in future courses. Evidence remains equivocal for the use of student marks from this study being used for anything other than formative assessment, due to the possibility that peer assessment of presentations is open to bias. (Langan et al., 2005, pp. 31-32)
The competing pressures in higher education between economic and pedagogical issues were also not lost on some students.
Regarding a mark given by a peer: I know that some groups felt they were given an appropriate mark. Not only can it be awkward, but also inaccurate. I don't mind so much working in groups, assessing each other. I understand the number of students is disproportionate to the number of instructors and that it makes for good practice, I think that it's difficult to dispute.
But again, this perception was not widely apparent, as only 26% of students agreed that, "Peer assessment is just a way for teachers to reduce their assessment responsibilities." Still, 92% thought that, "If used, peer assessment should contribute minimally to my grade (no more than 20%)". And as one student said: "You have to have a lecturer who is looking across at the overall".

Students' Epistemological Perspectives
Support for an interactive classroom was high, and overall students emphasised a connected knowing (CK) orientation over a separate knowing (SK) orientation (Table 3). Interestingly, a comparison by gender established that the men were more in favour of an interactive classroom than the women, though the women reported a greater emphasis on a CK compared to a SK orientation than the men. Moderate effect sizes by gender for the statistically significant comparisons indicate that these differences have practical significance (Table 4).
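The text does not state which effect-size measure was used. As a minimal sketch, assuming a standardised mean difference (Cohen's d with a pooled standard deviation), a comparison of two groups on a scale score could be computed as follows. The scores and the function name are hypothetical and are not data from this study.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardised mean difference using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical scale scores for illustration only (not from the study).
men_scores = [4.2, 3.8, 4.5, 4.0, 3.9]
women_scores = [3.6, 3.9, 3.4, 4.1, 3.5, 3.7]

d = cohens_d(men_scores, women_scores)
print(f"Cohen's d = {d:.2f}")
```

By Cohen's widely used conventions, values of d around 0.5 are described as medium, which is one common reading of a "moderate" effect size.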

Relationship between Epistemological Perspectives and Assessment Practices
Scores measuring the perceived validity of peer- and self-assessment were positively correlated, as anticipated, with the correlation higher for women than for men (Tables 5 & 6). The peer-assessment dimension and an interactive classroom were also positively correlated as proposed, but only moderately for both men and women. This suggests that the variation in scores for the peer-assessment dimension can be partly explained by the extent to which peers are valued for intellectual exchange and discussion of ideas. Interestingly, neither the CK nor the SK scale was correlated with any of the epistemological dimensions for the women, with the pattern distinctly different for the men. First, SK correlated negatively with the peer-assessment dimension, indicating that the less men use a SK orientation the more likely they are to accept the validity of peer-assessment. This is consistent with the literature that proposes separate knowers to be concerned about validity and subjective bias (Clinchy, 2002). Second, men were less likely to perceive a role for peer interaction in their learning the more separate their orientation (r = -0.37), and third, the men reported a stronger role for peer interaction in the classroom the more connected they reported themselves to be (r = .36) (Table 6). With respect to the work of Perry (1970) and Baxter Magolda (1992), this result suggests that the separate knowing orientation represents a less developed epistemological perspective, if one accepts their notions of cognitive development, which emphasise an increasing role for peers and their views through the stages of development, further epitomised by an increasing acceptance that knowledge is inherently uncertain.
Given the gender differences in the level of support for an interactive classroom and the validity of peer-assessment, lack of confidence with the nature of the course was explored as an extraneous factor. In a primary pre-service mathematics program, anxiety associated with discussing issues of a mathematical nature is likely to be a factor (Carrol, 1994; Burton, 1995). The men in this cohort did report greater ability and comfort with mathematics (Table 7), but there were no significant correlations between the scores for these items and the two dimensions measuring support for an interactive classroom and perceived validity of peer-assessment (Table 8). Hence there is no evidence that the reason for the gender differences is contextual with respect to subject area, though the differences may be course specific with respect to primary teacher training. In response to the item, "My final mark would likely be higher if peer-assessment formed part of my grade", both men and women mostly agreed. It is proposed, therefore, that the men's greater faith in the validity of peer-assessment compared to the women's is not about getting a higher grade but reflects their genuine interest in being involved in the assessment process. At least the men who emphasise a CK orientation are more embracing of the opportunity to give and receive constructive feedback to enhance learning. Women, on the other hand, may experience conflict, perceiving that the giving of scores to peers may undermine friendships. Certainly the qualitative data collected support this interpretation. As the men were on average five years older than the women in the sample, this gender difference may well reflect a maturity issue.
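The gender-specific correlations discussed above (Tables 5 and 6) amount to computing a Pearson correlation separately within each subgroup. The sketch below illustrates that procedure only. The records, field order and function names are hypothetical, and the numbers are invented for illustration rather than taken from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_by_group(records):
    """Compute r separately for each group label in (group, x, y) records."""
    groups = {}
    for group, x, y in records:
        xs, ys = groups.setdefault(group, ([], []))
        xs.append(x)
        ys.append(y)
    return {group: pearson_r(xs, ys) for group, (xs, ys) in groups.items()}


# Hypothetical (gender, SK score, peer-assessment score) records.
records = [
    ("M", 2.0, 4.5), ("M", 3.0, 4.0), ("M", 4.0, 3.2), ("M", 3.5, 3.9),
    ("F", 2.0, 3.5), ("F", 3.0, 3.6), ("F", 4.0, 3.4), ("F", 3.5, 3.7),
]

for group, r in r_by_group(records).items():
    print(f"{group}: r = {r:.2f}")
```

Splitting before correlating matters because a pooled correlation can mask a relationship that is present in one subgroup and absent in the other, which is the pattern the text reports for men and women.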

Self Assessment Data
Self-assessment data were available for 72% of the cohort. When compared with the faculty scores derived from three assessor scores, overall the students assessed themselves at a higher level (mean faculty score = 75.2%, sd = 12.4; mean self-assessed score = 84.8%, sd = 7.1; t-value = 8.84, df = 139, p<.001). Overall, the large majority of students awarded themselves an A grade (79%) and the remainder a B. As anticipated from the literature, higher achievers were more self-critical, as 97% of students who provided a self-assessed score less than faculty received an A grade, compared to 24% of those students who gave themselves a score that was higher than faculty (Table 9). The correlation between faculty and self-assessed scores was positive but only small (r = 0.23, p<.01). Just under half of the students (46%) provided a self-assessed score that was within 10% of the faculty score. A cross-tab analysis by gender for those who 'over' or 'under' assessed their assignment found no difference.
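As a rough consistency check on the comparison above, the t-value can be approximately reproduced from the reported summary statistics alone. The sketch assumes a paired design with n = 140 (inferred from df = 139, not stated in the text) and that the reported correlation of r = 0.23 applies to the same paired scores. The function name is illustrative.

```python
import math


def paired_t_from_summary(mean_a, sd_a, mean_b, sd_b, r, n):
    """Paired-samples t statistic reconstructed from summary statistics.

    The variance of the paired differences is
    sd_a^2 + sd_b^2 - 2 * r * sd_a * sd_b.
    """
    var_diff = sd_a ** 2 + sd_b ** 2 - 2 * r * sd_a * sd_b
    standard_error = math.sqrt(var_diff / n)
    return (mean_b - mean_a) / standard_error


# Reported values: faculty mean 75.2 (sd 12.4), self-assessed mean 84.8 (sd 7.1).
# n = 140 is an assumption inferred from the reported df = 139.
t_value = paired_t_from_summary(75.2, 12.4, 84.8, 7.1, r=0.23, n=140)
print(f"reconstructed t = {t_value:.2f}")
```

Under these assumptions the reconstructed value comes out near 8.9, close to the reported t = 8.84. A small gap of this size is plausible given rounding in the published means, standard deviations and correlation.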

Conclusions
As predicted, there was an overall positive relationship between the perceived validity of peer-assessment and both the perceived validity of self-assessment and support for an interactive classroom. A relationship between pre-service primary male teachers' epistemological perspectives and their belief in the validity of peer-assessment is tentatively established, but not so with self-assessment. Despite the men overall having more faith in the validity of peer-assessment than the women, those men who did not have this faith were more likely to emphasise a separate knowing orientation and were less supportive of an interactive classroom. Whether these views have implications for the validity of peer-assessment per se is unclear, but they do demonstrate that there is a range of views among students with respect to these alternative assessment practices and that, for men at least (or perhaps for older students), these views are related to their epistemological perspectives. The reason that comparable epistemological relationships were not evident for women may be a perceived conflict between judging others and maintaining friendships. Baxter Magolda (1992, p.214) proposed that women, "conceptualise autonomy in the context of connection, whereas the two concepts are separate for men". Brew (2002, p.387) also noted that "although dealing with feelings and being open to new ideas are highly related preferences in women's learning (i.e. they are highly … "connected"), they are apparently more fragmented or "separate" elements for men".
Future studies that explore validity issues with respect to student-derived summative assessment are advised to explore gender and age as interacting factors. A further related area worthy of follow up is whether students who emphasise a separate knowing orientation are harder markers of their peers than others.
In reviewing the literature there are extensive pedagogical arguments put forward for the use of peer- and self-assessment, but economic realities in higher education would seem to be a driving force for its increasing use for summative purposes. Hanrahan and Isaacs (2001) is a report of particular note. Despite citing some of the recommendations for enhancing the validity of peer-assessment proposed by Falchikov and Goldfinch (2000), the actual model for peer-assessment that Hanrahan and Isaacs (2001) provide for academics to adapt is not based on the optimum model (student-derived criteria with global grades) proposed by Falchikov and Goldfinch (2000). Instead it is actually based on the model that Falchikov and Goldfinch (2000) found to be the least valid with respect to reflecting faculty-derived scores (a scaled rating of criteria using pre-determined teacher criteria). Universities generally require assessment criteria to be established before the beginning of a course, and hence this precludes the use of negotiated criteria. Hanrahan and Isaacs (2001) are also silent on whether faculty scores should also be used when student-derived scores are obtained for summative purposes to ensure some kind of validity check. At the very least, this is advisable to allow students to dispute their mark. Of course, if faculty also have to mark, then where is the economic saving? Economic realities were the reason for abandoning the original format for the peer-assessed presentation in this course assignment that included a faculty score, though, I might add, unwisely.
In all, the issue of peer assessment must be resolved in terms of both validity and pedagogical preference. In light of the dubious validity obtained from other studies, how do we respond to the impetus in higher education to move toward including these kinds of assessment practices in final grades? The power of the study of epistemological perspectives, especially when considered in the context of student intellectual development, lies in an understanding of the role of peers as a key catalyst to a shift in perspective from external to internal knowledge authority. Relational knowing and the role of peers are therefore critical to learning that all knowledge is negotiable. This is an important issue for teachers, teacher educators and pre-service teachers. Inquiry with peers engenders and maintains the freshness of sincere curiosity about diverse understandings and ways of coming to know. The fascination with how one learns is the essence of a good teacher. Contextual connected knowing is part and parcel of the gifted mathematics teacher who helps students appreciate the multiple modes of mathematical reasoning. The comfort with oneself as a work in progress is something a teacher can convey to a student through modelling. Vygotsky (1939/1987) called it the gift of confidence (Mahn & John-Steiner, 2002). Too soon teachers leap to the role of judge and juror, and hence it is important to enrich and extend pre-service teachers' collaborative consideration of the complexities of assessment. Including them in the assessment process is very valuable for their learning. Yet for considerations about assessment to be authentic, reflective, collaborative and encouraging of self-critique, there must be safety. When peers become graders, if they were to grade according to their actual judgements of each other rather than in an ethic of charity, the relationship would change. Perhaps some of the findings suggest that they cannot afford to give 'valid' assessments of each other unless they are prepared to lose the most valuable asset they have in the program: each other. How then, in conscience, does a faculty of education respond to the current pressures? Even though involving students extensively in assessment processes remains of paramount pedagogical importance, to introduce peer-assessment for the express purpose of reducing workload and decreasing costs is fraught, as the pedagogical basis is likely to be thwarted as adjustments are made to processes. Let us not fall into the trap of bait and switch, whereby the pedagogical value of being involved in assessment processes for formative reasons is used to rationalise requiring students to determine their own or each other's grades for economic reasons.