Who Test Whom

By Sharon McDonald and Helen M. Edwards

Who Should Test Whom? Examining the use and abuse of personality tests in software engineering.

Illustration by Robert Neubecker

The construction of software engineering teams, the interaction between members, and how individual personalities influence these, have been a concern from the 1960s to the present day [5]. Nevertheless, despite claims from leading figures in the field that it is fundamentally people that make the difference between software success and failure, a corpus of knowledge and good practice has failed to emerge. While there have been some attempts to investigate these issues through the application of psychometric tests, the issue of what personality analysis can or cannot offer software engineering is still open for debate [6, 9]. In this article we argue that the lack of progress in this field is due in part to the inappropriate use of psychological tests,

COMMUNICATIONS OF THE ACM January 2007/Vol. 50, No. 1



frequently coupled with basic misunderstandings of personality theory by those who use them. To support this case we will present our analysis of papers that focus on the empirical use of personality tests in a software engineering context. Our analysis is supported by the expertise of the first author, who is both a chartered psychologist and a trained administrator qualified in the use of MBTI and 16PF psychometric tests. We conclude with a set of recommendations for test application and use for researchers, participants, and readers.

ANALYZING PERSONALITY IN SOFTWARE ENGINEERING RESEARCH

We surveyed papers published in the software engineering field relevant to the topic of personality testing, using digital libraries. This process generated 40 papers published between 1984 and 2004. From this pool 13 distinct papers were identified that focused on the empirical use of personality tests in a software engineering context: this subset is used to illustrate our arguments (osiris.sunderland.ac.uk/~cs0hed/CACMdata provides access to the full data set). Our analysis of these papers concentrates on examining test selection to identify whether reliable and valid instruments have been used, whether the test chosen is appropriate for the purpose, and the extent to which the personality testing process used is explicitly reported and discussed. It is our contention that as a minimum a paper must account for these issues if there is to be any confidence in the resultant data analysis.

The majority of the papers surveyed (25 out of 40) focused on the Myers-Briggs Type Indicator (MBTI); we will therefore confine our discussion to this tool. The MBTI classifies personality in terms of people’s preferred ways of operating in the world. It categorizes individuals into one of 16 personality types. These types are derived from people’s expressed preferences on four functions: (E)xtroversion vs. (I)ntroversion, (S)ensing vs. i(N)tuition, (T)hinking vs. (F)eeling, and (J)udging vs. (P)erceiving [2]. Each type has a number of positive features: there is no ideal type, they are all equally valued. The eight functions and their focus are summarized in the table here.
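As a small illustration of how four two-way preferences yield 16 types, the sketch below enumerates the type codes in Python. The function names and the dictionary format used for the preferences are assumptions made for this example; they are not part of the MBTI or its scoring, and the questionnaire and feedback process cannot be reduced to code like this.

```python
from itertools import product

# The four preference pairs named in the text; each contributes one letter.
DICHOTOMIES = [("E", "I"), ("S", "N"), ("T", "F"), ("J", "P")]


def all_types():
    """Return every four-letter code, taking one letter from each pair."""
    return ["".join(letters) for letters in product(*DICHOTOMIES)]


def type_code(preferences):
    """Build a type code from expressed preferences.

    `preferences` is a hypothetical input format for this sketch:
    a dict keyed by "E-I", "S-N", "T-F", "J-P", each mapped to one letter.
    """
    code = []
    for pole_a, pole_b in DICHOTOMIES:
        letter = preferences[f"{pole_a}-{pole_b}"]
        if letter not in (pole_a, pole_b):
            raise ValueError(f"{letter!r} is not a pole of {pole_a}-{pole_b}")
        code.append(letter)
    return "".join(code)


if __name__ == "__main__":
    print(len(all_types()))
    print(type_code({"E-I": "I", "S-N": "S", "T-F": "T", "J-P": "J"}))
```

Two choices per pair across four pairs gives 2 × 2 × 2 × 2 = 16 codes, which is where the 16 types come from.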

WHAT CAN PERSONALITY TESTING OFFER SOFTWARE ENGINEERING? DISPELLING THE MYTHS

In general, the MBTI has been used within software engineering research in one of two ways: to discover the personality type(s) that most typify good software engineers, or to identify the makeup of software development teams that are likely to work well together, or exhibit tensions. Capretz [3] studied the MBTI types of 100 software engineers and found the largest type was ISTJ. While Capretz acknowledges there is no link between type and performance, and that other factors have a bearing on career choice, he goes on to state that these findings are important for employers looking for software engineering professionals. More recently, the MBTI was used to investigate the link between personality type and a code review task with a sample of 64 students [4]. In this study those with an NT (Intuition-Thinking) preference were seen to perform the task better than other types: the largest single type was ENTP. The authors expressed surprise as their results conflicted with those of Capretz.

However, these findings do not tell us a lot about the ideal or even adequate software engineer, given the fact that type is not normally distributed in the population. As Kerth et al. [9] correctly point out: personality tests cannot identify good software engineers over bad, nor can their results predict “on the job” performance; there is some evidence of the importance of other factors, such as work experience [11]. Where researchers wish to identify the personality factors related to software engineering, or those factors that typify a group of exceptional software engineers, a more appropriate approach would be to use a trait-based instrument (such as the 16PF) where comparisons to a normative sample can be made. The main barrier to this approach would be choosing, or most probably creating, a representative normative sample for comparison.

The relevance of the MBTI to identify the makeup of software development teams has been limited, in that observable behavior is not always related to the underlying type. People can, and do, choose to operate in the non-preferred mode as situations dictate. The MBTI is a tool for the development of self-awareness and, when results are shared, awareness of others. Knowledge of personality type within a team allows people to expect others to react differently from themselves and equips them to cope more constructively with those differences. As such, the MBTI can be used to improve teamwork with the hoped-for byproduct of improved productivity and quality, as long as the test is used properly.

WILL THE REAL MBTI PLEASE STAND UP?

The value of any psychometric instrument is directly related to the techniques used during construction; not all psychometric tests are created equal. Test publishers describe the precise methods of test development, in particular, statistical data relating to test reliability and validity. They do this because to


ignore such issues would render a test worthless: a poor test will yield poor results. However, the importance of this appears to be lost on many of those who use such tests. Unfortunately, the casual reader of many of the articles discussed here would not see the significance of this point, or its likely bearing on the validity of the research, because in the majority of cases details of the specific tests used are glossed over, and in some cases, misrepresented. Even when researchers have used the real MBTI, for example [1, 11], details of the administration process are missing.

Table title: The MBTI functions and their focus.

E–I Focus: The way in which we focus our attention and draw energy
    Extroversion (E): Focus on the external world of people and activity.
    Introversion (I): Focus on own inner worlds of thoughts, ideas, and experiences.

S–N Focus: The way in which we take in information
    Sensing (S): Observant of detail, focus on the real and tangible, the here and now.
    Intuition (N): See the bigger picture, drawing relationships between concepts, imagining new possibilities.

T–F Focus: The way in which we make decisions
    Thinking (T): Base decisions on logic and objectivity.
    Feeling (F): Make decisions through empathy, guided by personal values.

J–P Focus: The way in which we deal with the outer world
    Judging (J): Enjoy making plans, methodical and strive to accomplish tasks by self-imposed deadlines.
    Perceiving (P): Like to be flexible and open to change, feel constrained by plans.

Karn and Cowling’s [8] study of the interactions of personality types during software development claims to have used the MBTI to identify the individual personality types of two teams of student software developers. In fact, the MBTI was not used, a later technical report [7] reveals that a freely available test was used (www.humanmetrics.com). While it is claimed that there are “no significant statistical differences between this test and the MBTI” [7], the argument is not convincing. On inspection of www.humanmetrics.com, no data is provided on the methods of test construction, no reliability or validity data, and no MBTI correlation data. Moreover, in our opinion, the content and style of the site itself is hardly indicative of a professional organization: no surface contact details are provided and there is no firm evidence of credentials. The site offers an interesting range of other free tests including, “find your perfect partner”—perhaps the basis for a new slant on the concept of “pair programming”? Although this site might offer some amusement, the potential effects on the subject group are not so lighthearted. A critical part of the administration process is gaining client acceptance and willingness to answer honestly. The testing environment in this case can in no way have guaranteed that the subjects will have taken the process seriously.

Miller and Yin [10] discuss the use of the MBTI in the construction of software inspection teams. They claim to use the MBTI within this study, but then comment that they use the “standard approach of online specialized questionnaires.” We were interested to discover precisely which version of the MBTI had been used within this study and through personal communication with the authors it was established that rather than using the full MBTI, they had in fact constructed their own test: no details of test construction and validity were provided.

You might ask: So what? Well, the development of a robust personality measure is a time-consuming, iterative process that can take many years, not least because personality measures present particular problems during construction. For a test to be of any value it must be both reliable and valid. Test reliability is the extent to which a test is consistent within itself, and over time. That is, the degree to which a test will give the same score or personality type for an individual on retesting. Test validity is the extent to which a test measures what it is intended to measure. Reputable tests such as the MBTI provide statistical data on these factors and details of the methods and samples used to gather this data. To ignore these factors when choosing a test will increase the possibility of acquiring misleading data.

In addition, respondents may attempt to distort their profiles; for example, by responding to items in ways they believe will create a favorable impression. Care must be taken therefore during item development to limit the insight a respondent may have, and to ensure that one pole of the preference does not appear more appealing or acceptable than the other. Standardized tests such as the MBTI are developed in a way that will limit the effects of such response sets.

However, this peace of mind comes at a price: tests such as the MBTI can be relatively expensive to purchase. Freely available tests generally do not provide data on reliability and validity. Nor do they offer an insight into the test construction process, nor comment on the possible effects of response sets and how the test design limits these effects. Taken together, the two issues of the lack of detail provided on test construction, and the absence of validity and reliability data, severely limit our ability to trust the results of such tests. A test is worthless if we cannot be sure that it measures what it is supposed to measure, and its results are consistent over time.

TEST ADMINISTRATION AND FEEDBACK

All tests, including the MBTI, have a degree of error in their accuracy, and this error may be amplified by


external factors, the most potent being the administration process. Therefore, all standardized tests provide administration procedures and it is important that these procedures are followed. Administration is not a simple process of issuing instructions and asking people to complete question booklets—it involves a degree of skill to ensure that the need for standardization is met and that clients are at ease and have a good understanding of personality theory and the underlying assumptions of the instrument in question. Test publishers are fully aware of this and consequently the purchase and use of standardized tests is restricted to qualified users who have undergone specific training.

In the case of personality assessment an important aspect of the process is providing feedback to clients. With the MBTI feedback is an absolute necessity, as it involves type verification and the process of “best fit.” The MBTI questionnaire provides an indication or estimate of an individual’s personality type (the “reported type”). Type verification occurs during a client feedback session; in some instances type verification cannot be resolved in a single session.

The feedback process involves the test administrator explaining the history and aims of the MBTI, along with a description of the four functions, the client then self-assesses their type. This is done through a process of open discussion during which the test administrator asks questions that will facilitate reflection. It is essential that this process takes place as there may be discrepancies between an individual’s reported type and their self-assessment of their type.

A number of studies have investigated discrepancies between reported and self-assessed type. Generally speaking, disagreements with reported type occur more frequently in dichotomies where the expressed clarity index is weak. For example, Walck [12] found that out of a sample of 256 people 75% agreed on all four letters and 97% agreed on three out of the four. The clarity index is a score that represents how sure the respondent is that he or she prefers one dimension over the other. It also provides an indication of those preferences where the client may not agree with their reported type. However, it does not, as Hardiman (in response to [9]) suggests, provide an indication that the respondent has a degree of preference ranging on a continuum. The process of helping a client reach “best fit” can only be done by a qualified test administrator. Untrained individuals might bias this process through inappropriate or leading questions, their own misunderstandings of the MBTI and Jungian theory, or through the influences of their own personality type.

If this process is not undertaken we cannot be sure the respondent would agree with the reported type. This is an important point since the clarity index data reported in some papers, for example [8], suggests the reported type for some respondents on some dimensions is extremely weak, whereas Bradley and Hebert [1] have examples of equal clarity indices for a pair of functions, but no discussion of how the choice for one over the other was made (for example, N rather than S).

Figure caption: Guidelines for the testing process. (Flowchart not reproduced. Labels recoverable from the extraction include: test purpose (understanding, selection, development); possible tests (e.g., MBTI; e.g., 16PF); qualified to use, get training, employ qualified tester, or quit; individual or group administration process; team gives consent to share results; score test; interpret results; feedback to individuals; go through the best fit process with individuals (type verification); compare score to representative normative sample; make decision; secure storage and disposal of test data. Figure notes: the 16PF has normative data for populations; personality types are NOT normally distributed in populations, which means an individual cannot be compared to a group.)
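Agreement figures of the kind reported by Walck [12] can be computed by comparing reported and self-assessed types letter by letter. The following Python sketch is illustrative only: the function names and the sample pairs are invented for this example, and reading “three out of the four” as “at least three” is an assumption rather than something the text states.

```python
def letters_agreed(reported, verified):
    """Count positions where the reported and self-assessed codes match."""
    if len(reported) != 4 or len(verified) != 4:
        raise ValueError("type codes must have exactly four letters")
    return sum(r == v for r, v in zip(reported, verified))


def agreement_summary(pairs):
    """Return the share of respondents agreeing on all four letters
    and on at least three letters (assumed reading of the text)."""
    counts = [letters_agreed(reported, verified) for reported, verified in pairs]
    if not counts:
        raise ValueError("no respondents supplied")
    n = len(counts)
    return {
        "all_four": sum(c == 4 for c in counts) / n,
        "at_least_three": sum(c >= 3 for c in counts) / n,
    }


if __name__ == "__main__":
    # Invented (reported, self-assessed) pairs, not data from any study.
    sample = [("ISTJ", "ISTJ"), ("ENTP", "ENTP"), ("INTJ", "ISTJ"), ("ESFP", "INFP")]
    print(agreement_summary(sample))
```

A calculation like this only becomes possible when a study records both the reported type and the type verified in feedback, which is the reporting practice the article asks for.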


None of the papers reviewed discussed the administration and feedback process in any detail. Therefore, even if we discount the problems of using inappropriate tests, the basic concept of the individual being involved in the process of “best fit” has been ignored, therefore we cannot be certain about the accuracy of the types identified in the research papers. These aspects may not have been neglected but, if they were carried out, they were not seen as worthy of discussion (despite being central to the effective and acceptable use of the MBTI).

CONCLUSION

Our analysis of what is required for effective use of psychometric tests leads us to the following recommendations aimed at potential participants, researchers, and readers.

We recommend to a potential participant (whether work or research related) that you ask the following of your tester:

1. What test is to be used? Press for the specific test and its version, a qualified tester will be able to be precise.
2. Is it a recognized and validated test? If it is not either politely refuse or, if you agree to be involved, be aware of the limitations of the test and its results.
3. Are the testers fully trained and qualified to administer the process? If this question causes bemusement then the answer will be “no.”
4. How will the process be administered? A valid approach will be relatively time consuming since for each individual there will be pre-test and post-test discussions in addition to the test time.

To a researcher whose aim is to investigate the impact of personality in a software team we suggest the following:

1. Become qualified, or team up with a qualified tester, and use standardized tests: use the flowchart provided in the figure here to help identify the appropriate test, and process, for your work.
2. Ensure in publications that you are clear about the tests and the process used and the following information is provided: the test used, the administration process, who the qualified testers were, how feedback was given, whether the types in the paper are reported types or verified types (derived after feedback).

To a reader of such articles we suggest the following:

1. Look for explicit details of types of test used, administration process, the qualifications of the testers.
2. Don’t assume because a paper has been published in a prestigious journal it is flawless.

Finally, people are entitled to develop questionnaires of their own to test out their hypotheses, gather data, and report results. These approaches are not invalid or necessarily suspect and therefore we are not warning against such work. However, those who claim to be using personality testing are making claims of authenticity and validity that are often not warranted. Software engineers often complain about those who, in the course of their work, do some programming in support of their professional activities: the claim being that such individuals are not professionals and do not understand the discipline. The same can be said of those who adopt psychological approaches without the relevant qualifications and background.

References

1. Bradley, J.H. and Hebert, F.J. The effect of personality type on team performance. Journal of Management Development 16, 5 (1997), 337–353.
2. Briggs Myers, I., McCaulley, M.H., Quenk, N.L., and Hammer, A.L. MBTI® Manual: A Guide to the Development and Use of the Myers-Briggs Type Indicator®. Third Edition, CPP Publications, Palo Alto, CA, 1998.
3. Capretz, L.F. Personality types in software engineering. International Journal of Human Computer Studies 58, 2 (2003), 207–214.
4. Devito Da Cunha, A. and Greathead, D. Code review and personality: Is performance linked to MBTI type? Technical Report Series, CS TR 837, School of Computing Science, University of Newcastle upon Tyne, UK, April 2004.
5. Gorla, N. and Lam, W.Y. Who should work with whom? Building effective software project teams. Commun. ACM 47, 6 (June 2004).
6. Hardiman, L.T. Personality types and software engineers. IEEE Computer 30, 10 (Oct. 1997), 10.
7. Karn, J.S. and Cowling, A.J. A study into the effect of disruptions on the performance of software engineering teams. Research Memorandum CS-04-07, Department of Computer Science, Sheffield University, 2004.
8. Karn, J.S. and Cowling, A.J. An initial observational study of the effects of personality type on software engineering teams. In Proceedings of the Eighth International Conference on Empirical Assessment in Software Engineering (EASE 2004), IEE, Edinburgh, UK, 2004.
9. Kerth, N.L., Coplien, J., and Weinberg, J. Call for the rational use of personality indicators. IEEE Computer 31, 1 (Jan. 1998), 146–147.
10. Miller, J. and Yin, Z. A cognitive-based mechanism for constructing software inspection teams. IEEE Transactions on Software Engineering (Nov. 2004).
11. Turley, R.T. and Bieman, J.M. Competencies of exceptional and nonexceptional software engineers. Journal of Systems and Software 28, 1 (1995), 19–38.
12. Walck, C.L. The relationship between indicator type and true type: Slight preferences and the verification process. Journal of Psychological Type 23 (1992), 17–21.

Sharon McDonald ([email protected]) is a senior research lecturer in the School of Computing and Technology at the University of Sunderland, U.K., and a chartered psychologist trained in the administration and interpretation of the MBTI and 16PF personality scale. Helen M. Edwards ([email protected]) is a professor of software engineering in the School of Computing and Technology at the University of Sunderland, U.K., a fellow of the British Computer Society, and a chartered engineer. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. © 2007 ACM 0001-0782/07/0100 $5.00

COMMUNICATIONS OF THE ACM January 2007/Vol. 50, No. 1

71

external factors, the most potent being the administration process. Therefore, all standardized tests provide administration procedures and it is important that these procedures are followed. Administration is not a simple process of issuing instructions and asking people to complete question booklets; it involves a degree of skill to ensure that the need for standardization is met and that clients are at ease and have a good understanding of personality theory and the underlying assumptions of the instrument in question. Test publishers are fully aware of this and consequently the purchase and use of standardized tests is restricted to qualified users who have undergone specific training.

In the case of personality assessment an important aspect of the process is providing feedback to clients. With the MBTI, feedback is an absolute necessity, as it involves type verification and the process of “best fit.” The MBTI questionnaire provides an indication or estimate of an individual’s personality type (the “reported type”). Type verification occurs during a client feedback session; in some instances type verification cannot be resolved in a single session.

[Figure: Guidelines for the testing process. Flowchart for personality testing. Test purpose: understanding, selection, or development; possible tests: e.g., MBTI or 16PF. Qualified to use the test? If not, get training, employ a qualified tester, or quit. How are the results to be used: team development or individual? For team development, the team gives consent to share individual results; if not, quit. Which administration process: individual (be aware of gaining rapport but increasing test time) or group (be aware of saving test time but losing rapport)? Subsequent steps: score test; interpret results; feedback to individuals; verify questionnaire through feedback; go through the best-fit process with individuals (type verification); trait v. state; compare score to representative normative sample; make decision; team or individual development activity; secure storage and disposal of test data; end. Notes in the figure: 16PF has normative data for populations. Personality types are NOT normally distributed in populations, which means an individual cannot be compared to a group.]

The feedback process involves the test administrator explaining the history and aims of the MBTI, along with a description of the four functions; the client then self-assesses their type. This is done through a process of open discussion during which the test administrator asks questions that will facilitate reflection. It is essential that this process takes place as there may be discrepancies between an individual’s reported type and their self-assessment of their type.

A number of studies have investigated discrepancies between reported and self-assessed type. Generally speaking, disagreements with reported type occur more frequently in dichotomies where the expressed clarity index is weak. For example, Walck [12] found that out of a sample of 256 people 75% agreed on all four letters and 97% agreed on three out of the four. The clarity index is a score that represents how sure the respondent is that he or she prefers one dimension over the other. It also provides an indication of those preferences where the client may not agree with their reported type. However, it does not, as Hardiman (in response to [9]) suggests, provide an indication that the respondent has a degree of preference ranging on a continuum.

The process of helping a client reach “best fit” can only be done by a qualified test administrator. Untrained individuals might bias this process through inappropriate or leading questions, their own misunderstandings of the MBTI and Jungian theory, or through the influences of their own personality type. If this process is not undertaken we cannot be sure the respondent would agree with the reported type. This is an important point since the clarity index data reported in some papers, for example [8], suggests the reported type for some respondents on some dimensions is extremely weak, whereas Bradley and Hebert [1] have examples of equal clarity indices for a pair of functions, but no discussion of how the choice for one over the other was made (for example, N rather than S).
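The comparison between reported and verified type described above can be made concrete with a small sketch. It is illustrative only: the function names and the sample respondents are hypothetical, the data are not taken from Walck [12] or any other cited study, and reading “three out of the four” as “at least three letters agree” is an assumption.

```python
from collections import Counter


def letters_agreeing(reported: str, verified: str) -> int:
    """Count the positions at which two four-letter types match."""
    if len(reported) != 4 or len(verified) != 4:
        raise ValueError("expected four-letter types such as 'INTJ'")
    return sum(r == v for r, v in zip(reported.upper(), verified.upper()))


def agreement_summary(pairs):
    """Share of respondents whose reported and verified types agree.

    `pairs` is an iterable of (reported_type, verified_type) tuples.
    'at_least_three' is an assumed reading of 'three out of the four'.
    """
    counts = Counter(letters_agreeing(r, v) for r, v in pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no respondents supplied")
    return {
        "all_four": counts[4] / total,
        "at_least_three": (counts[4] + counts[3]) / total,
    }


# Hypothetical respondents: (reported type, type verified after feedback).
sample = [("INTJ", "INTJ"), ("ENFP", "ESFP"), ("ISTJ", "ISTJ"), ("ENTP", "INFP")]
print(agreement_summary(sample))
```

A study that reports only the first element of each pair (the reported type) cannot produce such a summary at all, which is why the distinction between reported and verified types matters when reading published results.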


None of the papers reviewed discussed the administration and feedback process in any detail. Therefore, even if we discount the problems of using inappropriate tests, the basic concept of the individual being involved in the process of “best fit” has been ignored, so we cannot be certain about the accuracy of the types identified in the research papers. These aspects may not have been neglected but, if they were carried out, they were not seen as worthy of discussion (despite being central to the effective and acceptable use of the MBTI).

CONCLUSION

Our analysis of what is required for effective use of psychometric tests leads us to the following recommendations aimed at potential participants, researchers, and readers.

We recommend to a potential participant (whether work or research related) that you ask the following of your tester:
1. What test is to be used? Press for the specific test and its version; a qualified tester will be able to be precise.
2. Is it a recognized and validated test? If it is not, either politely refuse or, if you agree to be involved, be aware of the limitations of the test and its results.
3. Are the testers fully trained and qualified to administer the process? If this question causes bemusement then the answer will be “no.”
4. How will the process be administered? A valid approach will be relatively time consuming since for each individual there will be pre-test and post-test discussions in addition to the test time.

To a researcher whose aim is to investigate the impact of personality in a software team we suggest the following:
1. Become qualified, or team up with a qualified tester, and use standardized tests: use the flowchart provided in the figure here to help identify the appropriate test, and process, for your work.
2. Ensure in publications that you are clear about the tests and the process used and that the following information is provided: the test used, the administration process, who the qualified testers were, how feedback was given, and whether the types in the paper are reported types or verified types (derived after feedback).

To a reader of such articles we suggest the following:
1. Look for explicit details of the type of test used, the administration process, and the qualifications of the testers.
2. Don’t assume that because a paper has been published in a prestigious journal it is flawless.

Finally, people are entitled to develop questionnaires of their own to test out their hypotheses, gather data, and report results. These approaches are not invalid or necessarily suspect and therefore we are not warning against such work. However, those who claim to be using personality testing are making claims of authenticity and validity that are often not warranted. Software engineers often complain about those who, in the course of their work, do some programming in support of their professional activities: the claim being that such individuals are not professionals and do not understand the discipline. The same can be said of those who adopt psychological approaches without the relevant qualifications and background.

References
1. Bradley, J.H. and Hebert, F.J. The effect of personality type on team performance. Journal of Management Development 16, 5 (1997), 337–353.
2. Briggs Myers, I., McCaulley, M.H., Quenk, N.L., and Hammer, A.L. MBTI® Manual: A Guide to the Development and Use of the Myers-Briggs Type Indicator®. Third Edition, CPP Publications, Palo Alto, CA, 1998.
3. Capretz, L.F. Personality types in software engineering. International Journal of Human Computer Studies 58, 2 (2003), 207–214.
4. Devito Da Cunha, A. and Greathead, D. Code review and personality: Is performance linked to MBTI type? Technical Report Series, CS TR 837, School of Computing Science, University of Newcastle upon Tyne, UK, April 2004.
5. Gorla, N. and Lam, W.Y. Who should work with whom? Building effective software project teams. Commun. ACM 47, 6 (June 2004).
6. Hardiman, L.T. Personality types and software engineers. IEEE Computer 30, 10 (Oct. 1997), 10.
7. Karn, J.S. and Cowling, A.J. A study into the effect of disruptions on the performance of software engineering teams. Research Memorandum CS-04-07, Department of Computer Science, Sheffield University, 2004.
8. Karn, J.S. and Cowling, A.J. An initial observational study of the effects of personality type on software engineering teams. In Proceedings of the Eighth International Conference on Empirical Assessment in Software Engineering (EASE 2004), IEE, Edinburgh, UK, 2004.
9. Kerth, N.L., Coplien, J., and Weinberg, J. Call for the rational use of personality indicators. IEEE Computer 31, 1 (Jan. 1998), 146–147.
10. Miller, J. and Yin, Z. A cognitive-based mechanism for constructing software inspection teams. IEEE Transactions on Software Engineering (Nov. 2004).
11. Turley, R.T. and Bieman, J.M. Competencies of exceptional and nonexceptional software engineers. Journal of Systems and Software 28, 1 (1995), 19–38.
12. Walck, C.L. The relationship between indicator type and true type: Slight preferences and the verification process. Journal of Psychological Type 23 (1992), 17–21.

Sharon McDonald ([email protected]) is a senior research lecturer in the School of Computing and Technology at the University of Sunderland, U.K., and a chartered psychologist trained in the administration and interpretation of the MBTI and 16PF personality scale.

Helen M. Edwards ([email protected]) is a professor of software engineering in the School of Computing and Technology at the University of Sunderland, U.K., a fellow of the British Computer Society, and a chartered engineer.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. © 2007 ACM 0001-0782/07/0100 $5.00

