The hierarchical terminology technique: a method to address terminology inconsistency

A review of prior research in the computer-supported cooperative work (CSCW) field found that the terminology and definitions used by researchers were inconsistent, with some terms being defined differently but then used interchangeably. Inconsistency in the use of terminology makes it difficult to determine what technology is being used in the research and where a research study fits into the research field. Authors in many fields describe the problem of inconsistent terminology use. A suitable method for identifying inconsistencies and structuring terminology for greater clarity of terminology use in a given research field was not identified. This paper presents a new method, the hierarchical terminology technique (HTT), which is a form of qualitative content analysis process that extends the idea of concept mapping. HTT was developed for this research problem to structure a hierarchy of terms to expose the relationship between the terms. This technique includes terminology identification, analysis and presentation to show the scope of the research field, and to present terminology and definitions to improve consistency. This technique could be used in other fields of study.


Introduction
Knowledge of any research field is mainly gained through reading published research documents. Inconsistent terminology has a negative impact on effective communication, making sense of research findings, integrating studies, and building cumulative theory. Terminology inconsistency may be "a significant impediment to effective communication and to our ability to make sense of research findings" (Alter 2000, p. 2).
Taxonomy is a particular science that deals with classification (Macquarie Dictionary Online 2003). The classification of concepts from a particular knowledge domain (in this case CSCW) can be considered a preliminary to producing new knowledge about the domain. Classification schemes, typologies and taxonomies are tools for conferring organisation and stability on our thoughts about reality and, like tools, they may be judged or found more or less useful for a particular purpose (Kemeny 1959). Kwasnik (1999) argued that classification is a form of knowledge representation and that there is a relationship between classification and the process of knowledge discovery and creation. Knowledge was described as being hierarchical (Landauer and Rowlands 2001), in that every higher level concept is based on lower level information. Without knowledge of the hierarchy, one is left with only a vague notion of the meaning of a term.
Other authors argued that a taxonomy of terminology should also have a dictionary of definitions (Ishida and Ohta 2002). When describing the philosophy of research, Neuman (1994) argued that "People must share the terms for concepts and their definitions if they are to be of value" (Neuman 1994, p. 36). Cooper and Emory (1995, p. 35) stated that "If words have different meanings to the parties involved, then they are not communicating on the same wavelength. Definitions are one way to reduce this danger". These two statements specifically highlight the need for consistency in the use of terminology in research disciplines.

The research problem
A review of literature highlighted issues relating to inconsistent terminology and provided suggestions for addressing this problem. This led to the research problem to be investigated: To develop a taxonomy of CSCW terms, and to develop a dictionary of definitions of those terms in order to provide a consistent foundation of terminology for research and practice in the field.
Three research tasks were developed to address the research problem. Specific Research Task (SRT) 1: Identify the terms and definitions used in the CSCW literature. SRT 2: Use the CSCW terms to develop a taxonomy of CSCW terminology. SRT 3: Compile definitions to construct a dictionary of CSCW terminology.
In order to address the specific research tasks, a process of data collection, analysis and presentation was required. Some researchers have considered the process of qualitative data collection and analysis and have suggested steps to follow. For example, Sarantakos (1996) described four steps of documentary research, and Miles and Huberman (1994), when discussing qualitative analysis, described three steps.

Hierarchical terminology technique
Stage 1: Identification of the terms through content analysis.
Step 1.1: Identification of the relevant documents
Step 1.2: Condensation of the documents
Step 1.3: Coding of the data
Stage 2: Analysis and construction of CSCW taxonomy of terms.
Step 2.1: Development of simple hierarchical tables
Step 2.2: Development of hierarchy diagrams
Stage 3: Development of dictionary of definitions.
Step 3.1: Collating of terms and definitions
Step 3.2: Formatting of the dictionary
Step 3.3: Presentation of dictionary
Final stage
Stage 4: Identification of terminology inconsistencies and implications of research.
The steps suggested by Sarantakos (1996) and Miles and Huberman (1994) help with the process of undertaking qualitative research activities, but none of the processes found in prior literature provided a suitable method for identifying data in order to develop a taxonomy and dictionary.

Hierarchical terminology technique
This study has combined and modified the steps discussed by Miles and Huberman (1994) and Sarantakos (1996) to address the research problem. The research process illustrated in Table 1 shows four stages used for this research. Each stage in the research process comprises one or more steps.
The HTT is made up of Stages 1-3. Stage 1 has used three steps for data collection and analysis. The development of the hierarchical taxonomy in Stage 2 has two steps and the dictionary development in Stage 3 has three steps. The approach used in this technique was iterative, in that the steps were not mutually exclusive and were not necessarily consecutive.
Finally, Stage 4 presents the terminology inconsistencies and describes the implications of the hierarchy and dictionary for future research and practice. These four stages provided clarity in describing the processes for this research and in addressing the research problem.
During each stage of the research process, audit activities were undertaken to improve rigour. The following sections of the paper (4.1-4.3) describe the stages of the research in detail. For each step, a description of the audit trail and computer usage has been included to show how these support quality in this research.

Step 1.1: Identification of the relevant documents
The literature review did not reveal a previous study of this kind in the CSCW field, so the boundaries of this research relating to data source were not originally apparent. In order to qualify for inclusion into this research the source documents required three elements: groups of people, who use computer systems, to undertake group work (Grudin 1991). The relevant documents for analysis in this research were those articles published between 1978 and 2007, which related to the 'use of CSCW systems'.
The choice of articles for this study depended on availability, accessibility, and relevance. Sampling was considered necessary for this interpretive study as there was a requirement to have a range of published articles from across a number of years from a number of different authors.
The issue of sampling is discussed, followed by an explanation of where the articles were found. Sampling in qualitative research is not based on probability theory, and the size of samples is usually too small to reflect the attributes of the population concerned. However, for this research the sample size used was quite large to provide an overall representation of the research field. The unit of analysis in this study was either the written articles published in this field or the CSCW authors who research in this field. In either case, the size of the population is large and difficult to quantify.
There are a number of different types of sampling procedures used for qualitative studies. Three non-probability sampling techniques, snowball sampling (Cavana et al. 2001), judgement sampling (Oppenheim 1992) and serendipity sampling, were considered the most feasible methods, and were used in this research for obtaining articles across the relevant years.
Snowball sampling is often used when specific characteristics or knowledge are required in the population, but they are difficult to locate (Cavana et al. 2001). In this research, sampling commenced with finding a few articles through a general keyword search. A snowball effect was then used to identify other articles from the reference lists of the original documents found. It is understood that this sample could be biased (Davidson and Layder 1994). This bias is offset by the use of other sampling techniques such as judgement sampling.
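The snowball step described above can be pictured as a breadth-first walk over reference lists. The following is a minimal sketch under stated assumptions, not the procedure used in this study: the function name `snowball_sample`, the `max_rounds` cut-off and the citation data are all hypothetical and serve only to illustrate the idea.

```python
from collections import deque


def snowball_sample(seed_articles, reference_lists, max_rounds=2):
    """Collect articles by following reference lists outward from a seed set.

    reference_lists maps an article to the articles it cites (hypothetical data).
    max_rounds limits how many times reference lists are followed (an assumption).
    """
    found = list(dict.fromkeys(seed_articles))  # keep order, drop duplicates
    seen = set(found)
    queue = deque((article, 0) for article in found)
    while queue:
        article, depth = queue.popleft()
        if depth >= max_rounds:
            continue  # do not follow references beyond the chosen number of rounds
        for cited in reference_lists.get(article, []):
            if cited not in seen:
                seen.add(cited)
                found.append(cited)
                queue.append((cited, depth + 1))
    return found


# Hypothetical citation data, for illustration only.
reference_lists = {
    "seed_A": ["ref_1", "ref_2"],
    "ref_1": ["ref_3"],
    "ref_3": ["ref_4"],
}
sample = snowball_sample(["seed_A"], reference_lists, max_rounds=2)
```

With two rounds, the sketch reaches the articles cited by the seed and the articles cited by those, but stops before `ref_4`. Because such a sample inherits the citing habits of the seed articles, it illustrates why the bias noted above needs to be offset by other sampling techniques.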
A sampling method that relies on the researcher to obtain a wide representation of articles is called judgement sampling (Oppenheim 1992). Judgement sampling takes account of likely sources of difference in the views and experiences of the articles' authors. "This may be the only feasible method in certain circumstances, when access to the population is difficult or restricted" (Oppenheim 1992, p. 43).
Some reviews of previous literature in the field helped to identify further articles to find. Some web sites, such as CSCW conferences, the European CSCW conferences and the CSCW Journal, were targeted directly to obtain articles that were relevant for this study. Some articles were found 'by chance', referred to as serendipity. Some articles were found by colleagues, others were found when searching for other reference materials. Some articles were identified through discussions with colleagues.
A quantitative literature analysis was undertaken to determine the breadth and depth of articles retrieved. Each article was analysed to determine whether it was a qualitative or quantitative study, or both, and whether it was a review, a conceptual paper or both. The quantitative analysis of the documents showed that there were 510 articles with over 700 different authors, from 24 different countries, 47 different organisations and 167 different universities. Further, 257 articles were from 56 different conferences, and 221 articles from 97 different journals. The remaining articles were from web sites.

Audit trail and computer usage for Step 1.1
During the search for documents, notes were kept of the progress of the document retrieval. Coding forms were used as a means to determine whether the articles were appropriate for this study and to collect relevant data, for sampling purposes, to be recorded in the EndNote database. A random sample of these coding forms was checked by an independent researcher to determine the validity of the data collection procedure. The data collected on the coding forms included the author, date, research type, theory or research framework, and variables discussed in the research such as time/place, team environment, and technology used. The coding forms were used as a means of checking for reliability of the data to ensure stability, reproducibility, uniformity and accuracy, and to transfer details of research type, time/place dimensions, and system type into the EndNote database. As articles were found they were subjected to Step 1.2 of the process.

Step 1.2: Condensation of the documents
Condensation of documents depends on several factors, primarily related to the method of analysis and the purpose of the study. When methods such as content analysis are employed, organisation of the data, as well as their analysis, becomes more sophisticated (Sarantakos 1996).
For this research, condensation refers to the extraction of relevant data from the published articles and the preparation of the relevant data into electronic format. Condensation of each document was time consuming, but necessary to reduce the quantity of text down to relevant data. Types of text from the articles that were not considered relevant consisted of reference lists, methodology sections, and quantitative analysis sections. Parts of the articles that were considered relevant were literature reviews, definitions, concepts used, systems or applications specified and discussed, and variables used.

Audit trail and computer usage for Step 1.2
Electronic document files were created to store the extracted data using MSWord. As additional articles were found they were recorded in the EndNote database and all articles (hard copy and soft copy) were filed. Backup copies of all electronic files were stored on compact disc (CD) and a second hard drive, to prevent potential loss of data due to breakdown of computer equipment.
As the condensed electronic documents were completed they were subjected to Step 1.3 of the research process.

Step 1.3: Coding of the data
As stated in Step 1.1, hand coding of the data from the condensed documents was undertaken to check that the documents were relevant for this study and to provide an overall picture of the breadth and depth of the sample of articles. Some keywords needed for searching and for coding the data from the documents were identified during the literature search and during the development of the research problem. Other codes were identified by reading through a sample of the documents. The identified codes were then used to automatically search for other instances within the documents. The electronic documents were coded using the AtlasTi computer software. The types of data coded were: definitions, classifications, systems and terminology structures.
The terminology structures were identified by using Spradley's semantic relationships and the structures were then used to develop the hierarchy of terminology. Spradley (1979, in LeCompte 2000) used semantic relationships to assist with this process as shown in Table 2. These structures were then used in Stage 2 for the development of the terminology hierarchies. The coded sections were scanned to create a list of terms. The list of terms, definitions and descriptions of terms and concepts were then used in Stage 3 for the development of the dictionary.

Audit trail and computer usage for Step 1.3
The electronic documents were searched to identify all the terms and definitions used in the CSCW research field. This search was achieved by developing hermeneutic units in the AtlasTi software program, which provided support for computer assisted coding. Weber (1990) argued that while many research projects can benefit from computer-assisted coding, fully computerised coding systems are unlikely to be useful in anything other than the simplest texts because of a variety of issues (e.g. semantic variability and contextual information). 'Auto coding' was used for some of the terms and definitions where words such as 'define', 'defined' and 'definition' were specified in the documents. Manual coding was used where other descriptions of terms were used, by reading the documents and attaching codes.
All sentences and paragraphs showing structured text were copied into an MSExcel spreadsheet along with the citation source, including author, date, and page number. A sample from the spreadsheet is included in Table 3.
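The split between 'auto coding' and manual coding can be illustrated with a short keyword search. This is a sketch only and does not reproduce how AtlasTi performs auto coding: the function name `auto_code` and the code labels are assumptions, while the three trigger words and the two sample sentences are taken from this paper.

```python
import re

# Trigger words named in the text; matching is case-insensitive and whole-word.
DEFINITION_CUE = re.compile(r"\b(define|defined|definition)\b", re.IGNORECASE)


def auto_code(sentences):
    """Return (sentence, code) pairs.

    Sentences containing a trigger word receive the code 'definition';
    all others are flagged for manual coding by a reader.
    """
    coded = []
    for sentence in sentences:
        code = "definition" if DEFINITION_CUE.search(sentence) else "manual review"
        coded.append((sentence, code))
    return coded


sentences = [
    "Affective reward is defined as the positive emotional response "
    "sometimes associated with goal attainment.",
    "Adoption is a process that may or may not lead to continued use.",
]
coded = auto_code(sentences)
```

The second sentence describes a term without using any trigger word, so a keyword search alone would miss it. This is the kind of case that Weber's (1990) caution about fully computerised coding anticipates and that manual coding was used to capture.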

Output of Stage 1
The material collated during the data reduction process was stored in the spreadsheet files and word files and has been used as data for Stages 2 and 3.

Step 2.1: Development of simple hierarchical tables
The terminology hierarchy was then developed in Stage 2 to display the data visually. Data that are displayed in diagrammatic, pictorial and visual forms should be viewed as an organised compressed assembly of information that permits conclusion drawing and/or action taking (Miles and Huberman 1994).

EIES: The subjective reactions of users of EIES to this form of communication and to specific features of the system have been reported elsewhere (Hiltz, 1978a, 1978b).
The procedure used to develop the hierarchy had two steps. In the first step, a table of terms was developed that showed the hierarchical relationship of terms from the spreadsheet file. This table format was acceptable for a small number of terms, but as the number of terms increased it became very cumbersome and difficult to manipulate the terminology and add more terms. This led to the need for the second step of the process. In the second step a display of the terms was developed in diagrammatic form using mind mapping software.
The two step procedure was used for every row in the data spreadsheet and an example of this procedure has been described in this section. This example uses the terms from seven rows of the data spreadsheet (Table 3). The seven rows used in the example are only a small portion of the complete data spreadsheet, which contains 775 rows of data. Table 3 has four columns showing the author and date of the quote, the upper level term, lower level terms, and the quotes from which the terms were drawn.
The terms in columns 3 and 4 of Table 3 were used to structure Tables 4 and 5. Table 4 shows the communication system terminology from the quotes to 1980 (from Table 3) and shows four levels of the communication systems (CMC) terminology.
Each set of terms was analysed from the data spreadsheet to determine if it related to other terminology already in the table. For instance, communication modes related to the usage of systems, so this terminology was linked to usage at level 2. Communication activities were considered to relate to the group, so this terminology was linked to group at level 2.
Table 5 shows user behaviour phases from the quote in row 7 of Table 3 by Hiltz and Turoff (1981), which, as shown, was published after 1980.
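The linking of upper and lower level terms described in this step can be sketched as building a tree from pairs of terms. The sketch below is illustrative and assumes a simplified row format of (upper level term, lower level term); the function names are hypothetical, and only the four links mentioned in the text above are used, not the full content of Table 4.

```python
from collections import defaultdict


def build_hierarchy(rows):
    """Map each upper level term to its lower level terms, in first-seen order."""
    children = defaultdict(list)
    for upper, lower in rows:
        if lower not in children[upper]:
            children[upper].append(lower)
    return children


def levels(children, root):
    """Return {term: level}, with the root term at level 1."""
    level_of = {root: 1}
    stack = [root]
    while stack:
        term = stack.pop()
        for child in children.get(term, []):
            if child not in level_of:
                level_of[child] = level_of[term] + 1
                stack.append(child)
    return level_of


def outline(children, root, depth=0):
    """Return the hierarchy as indented lines, one term per line."""
    lines = ["  " * depth + root]
    for child in children.get(root, []):
        lines.extend(outline(children, child, depth + 1))
    return lines


# Links taken from the worked example in the text; the row format is assumed.
rows = [
    ("communication systems", "usage"),
    ("communication systems", "group"),
    ("usage", "communication modes"),
    ("group", "communication activities"),
]
hierarchy = build_hierarchy(rows)
level_of = levels(hierarchy, "communication systems")
diagram = outline(hierarchy, "communication systems")
```

Here usage and group sit at level 2, and communication modes and communication activities at level 3. As noted for the tables and figures, the level numbers express only position in the structure and do not signify any ranking or rating.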

Audit trail and computer usage for Step 2.1
The printouts from the spreadsheet file of quotes describing structure were used to identify the terms to add to the hierarchy. As each section of text was used to develop the simple hierarchies it was marked in the MSExcel spreadsheet so that it could be tracked to avoid repeated use.

Step 2.2: Development of the hierarchical diagram
The procedure in the second step transferred the terminology and relationships from the tables into diagrammatic form. The second step was undertaken using mind mapping software (MindManager) to simplify diagram development and to better display the hierarchical structure format. For example, all levels of Table 4 were used to develop the hierarchical diagram shown in Fig. 1.
As the number of levels in the hierarchy grew, not all terminology could be displayed at the same time on a single A4 page. The MindManager application provided a means to overcome this problem. The diagrams in MindManager were manipulated to display different branches and different levels along the branches. Figure 2 shows the additional terminology used in Table 5. When terminology in the lower levels is hidden, an arrow indicates that further terminology is linked at the lower levels. For example, an arrow on the line below a term, such as task and communication activities (in Fig. 2), indicates that further terminology is linked at the lower levels.
As terminology was added to the hierarchy it was necessary to display this as new terminology in the diagram without necessarily repeating terminology already described. The boundary and highlight around terms is used to show new terminology that has been added since the previous section. For instance, the boundary and highlight around user behaviour phases, in Fig. 2, shows that this terminology has been added from a quote referenced after 1980. The levels of the terminology in these tables and figures do not signify any ranking or rating.

Audit trail and computer usage for Step 2.2
The simple hierarchical tables were used to develop the terminology hierarchy in the MindManager software.

Output of Stage 2
The output from Stage 2 was a hierarchical structure (the taxonomy of terms). The completed hierarchical structure for six eras from 1978 to 2007 is used to show the scope of terminology use in this research field and the relationship of the terms to one another. In summary, these two steps described the procedure that was followed to identify and structure all the terminology stored in the data spreadsheet. Figures 1 and 2 provided examples of the format of the hierarchical diagrams used. The procedures in Steps 2.1 and 2.2 described in this section were followed throughout the analysis to develop the final taxonomy of terms. The final taxonomy is too large to present in this article.

Step 3.1: Collating of terms and definitions
The text that was coded in Step 1.3 was searched for terms and definitions. All CSCW field related terms and associated definitions found in the coded text were recorded in a spreadsheet of citations. It was found that a number of terms being used were not defined in the analysed articles. Although these terms and how they are used are included in the dictionary, the definitions have not been searched for elsewhere as this is beyond the scope of this research. Table 6 is an example from the dictionary spreadsheet showing six terms with definitions or descriptions and the sources of the quotes.

Audit trail and computer usage for Step 3.1
During the coding step (Step 1.3), the terms and concepts and definitions were listed in a spreadsheet. As definitions were found they were recorded in the spreadsheet, together with the source information: author, date and page number. Some particular instances of descriptive citations of terms and concepts were also recorded in this spreadsheet.

Step 3.2: Formatting of dictionary
The first task of editing the dictionary was to review the spreadsheet of citations to identify the use of words likely to be included. A selection was made of the citations which most fully represented a word's life and most definitively and vividly illustrated its use and meaning. The terminology and citations in the dictionary were displayed from an historical perspective of how this terminology has been used in the CSCW discipline since 1978. The citations were chosen as being the most representative of the definitions and descriptions of the terms and concepts found during the content analysis. The spelling of words in the dictionary is English (Australian).
Each entry was designed to present the information in the most illuminating form. Entries range from simple one word entries to complex multiple word entries. Where terms had more than one definition the terms were placed in the dictionary more than once. These terms were placed with a number after the term to show the number of different definitions for a particular word. An example of the dictionary format is shown below:
The formatting used in the introduction, style and arrangement of entries in the dictionary were developed from the Australian National University (1988). The elements of an entry (not all of which may be required) appear in the following order.
Headword: The headword, the word which is the subject of the entry, appears at its head in bold roman. Subordinate items (combinations, collocations, and phrases of which the headword is the main element, as well as derivatives) appear in their place in the entry in bold roman. Words which normally have an initial capital, such as proprietary names, retain the capital, all other initial letters being in lower case.
Definition: The definition may include cross-references to words which have main entries or are subordinate items. Definitions worded by the author from other sources are in normal font.
Cross reference: There are two main forms of cross reference: if a word is defined by another word in the dictionary or listed within the qualification 'see …' the synonymy is exact. If the cross-reference is introduced by 'see also …' the synonymy is not exact but the information provided under the referred-to word is complementary or in some other way useful.
Citations: Sets of citations provide substantiation for the definition and illustrate the history of the word's use. Some words are more copiously exemplified than others, which may be a reflection of their amount of use. A citation is preceded by a date (of publication) and the name of the author of an article is given (full references of sources are provided in the reference list). Page numbers are presented in brackets following the authors' names.
Every effort was made to record the earliest use of a word in this research field, and to provide a reasonably spaced sequence of citations to the year 2007. Citations for each entry are in date order from the oldest to the more recent. Citations are given directly from the source except that, in the interests of economy, ellipses have been used to show the removal of extraneous material. Care has been taken not to distort the authors' intent.
Acronyms used in CSCW research have been included in the dictionary and also listed at the front of the dictionary.
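The entry layout described in this step (headword, optional sense number, definition, then citations in date order with the page number in brackets) can be sketched as a small data structure. This is an illustrative sketch and not the tooling used to produce the dictionary: the class and function names are hypothetical, the headword 'activity awareness' is inferred, and the sample values are adapted from the dictionary excerpt in this paper, entered in reverse date order to show the sorting.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    year: int
    author: str
    page: int
    text: str


@dataclass
class Entry:
    headword: str
    definition: str
    citations: list = field(default_factory=list)


def format_entry(entry, sense=None):
    """Render one entry: headword (with optional sense number), definition,
    then citations from the oldest to the most recent."""
    head = entry.headword if sense is None else f"{entry.headword} ({sense})"
    lines = [f"{head}: {entry.definition}"]
    for cite in sorted(entry.citations, key=lambda c: c.year):
        lines.append(f"{cite.year} {cite.author} ({cite.page}) {cite.text}")
    return "\n".join(lines)


entry = Entry(
    headword="activity awareness",  # inferred headword, for illustration
    definition="Activity awareness gives workers indications of what is "
    "happening and what has happened recently in collaborative activities.",
    citations=[
        Citation(2000, "Jang et al.", 28, "Activity awareness represents ..."),
        Citation(1999, "Hayashi et al.", 99, "Activity awareness gives workers ..."),
    ],
)
formatted = format_entry(entry)

# A term with more than one definition carries a number after the headword.
numbered = format_entry(
    Entry("adoption", "Adoption means the decision to purchase."), sense=1
)
```

Sorting at render time, rather than relying on the order of data entry, keeps the oldest-first rule intact when later citations are added to an existing entry.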

Audit trail and computer usage for Step 3.2
The development of the dictionary was undertaken at the same time as the development of the terminology hierarchy. The terms used in the hierarchy were cross checked against the terms and definitions in the dictionary to make sure that all terms used in the hierarchy were included in the dictionary.
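The cross check between hierarchy and dictionary amounts to a set difference. The following is a minimal sketch, assuming that terms are compared after lower-casing and collapsing whitespace; that normalisation rule, the function names and the sample term lists are assumptions made for illustration.

```python
def normalise(term):
    """Lower-case a term and collapse internal whitespace (an assumed rule)."""
    return " ".join(term.lower().split())


def missing_from_dictionary(hierarchy_terms, dictionary_headwords):
    """Return hierarchy terms that have no matching dictionary headword."""
    known = {normalise(headword) for headword in dictionary_headwords}
    return sorted({normalise(term) for term in hierarchy_terms} - known)


# Hypothetical term lists, for illustration only.
hierarchy_terms = ["usage", "group", "communication modes", "User behaviour  phases"]
dictionary_headwords = ["Usage", "group", "communication modes"]
missing = missing_from_dictionary(hierarchy_terms, dictionary_headwords)
```

An empty result would indicate that every term in the hierarchy has a dictionary entry. In this sketch one term is reported as missing and would need an entry added.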

Step 3.3: Presentation of dictionary
The aim of the dictionary is to provide an historical record of the use of terms used in the CSCW field. The purpose of the dictionary is to show the scope of the field and provide a consistent set of terms and definitions for use in the CSCW field of research. The dictionary is intended to cover the specialist vocabulary and document the history of words, with some terms established as being in common use. The essence of an entry in an historical dictionary is its citations, which help to establish the chronology of a word's use, to substantiate the definition or definitions, and illustrate the range of circumstances within which a word has been used.

Output of Stage 3
The output of Stage 3 was a dictionary of terms and definitions that consists of approximately 1,200 main entries.
In summary, three stages form the HTT. The concepts of the CSCW literature were identified from the articles in Stage 1. Terminology and structures of terms that were identified in the articles through Stage 1 were used in Stage 2 to develop the hierarchical taxonomy of terms. The output from Stage 2 shows the breadth of terminology to display the scope of the terminology across the CSCW field (to address SRT 1). The output from Stage 2 also shows the relationships of the concepts of the CSCW field (to address SRT 2).
The outputs from SRT 1 and SRT 2 were used in Stage 3 to develop a dictionary of CSCW terminology (to address SRT 3). Both the taxonomy from SRT 2 and the dictionary from SRT 3 were considered in Stage 4 to determine some of the inconsistencies and the implications of the developed structures for research and practice.

Advantages and limitations
There are a number of advantages and limitations of the HTT relating to the data collection technique and the analysis. The analysis technique uses an unobtrusive or indirect method of data collection from documents. Advantages of this method are retrospectivity, quick and easy accessibility, spontaneity, low cost, sole source, high quality of information, possibility of retesting, and non-reactivity (Sarantakos 1996). Limitations of the unobtrusive method are that the data are not necessarily representative of their kind, documents may not be easily accessible, documents may not be complete or up to date, reliability of some documents may be questionable, and some documents may be biased (Sarantakos 1996).
The HTT is subjective and different relationships could be considered appropriate by other researchers. It has only been applied to this study and would need to be used in other situations to determine its value. Also, the HTT is very time consuming, which may impact the opportunities for its use.

Conclusion
This was an interpretive/descriptive study, using an unobtrusive data collection technique of analysing prior research literature. A computer-aided qualitative form of content analysis has been used to extract data about terminology, definitions, and other groupwork issues from the published CSCW literature. The data were used to form interconnected groups or concept clusters, which holistically form a web of meaning. These groupings were then used to develop a hierarchical taxonomy of terms and associated dictionary of definitions to address the research problem.
The review of the literature did not reveal a previously researched theoretical model or method for developing a taxonomy and dictionary of definitions for a particular research field. Thus it was necessary to develop a method to undertake these activities. A study of taxonomic theory provided guidance in the development of a practical method for identifying terminology and structuring the terminology to show diagrammatically the scope of the CSCW research field. Spradley's (1979, in LeCompte 2000) semantic relationships were used to identify the relationships of terms within the text. The development of the HTT (which extends the idea of concept mapping) and its descriptions is an important outcome of this research.
The HTT, which can be used to show the structure of terminology use and the scope of terms and definitions, will be of benefit to many fields of study, in particular in the areas of information sciences, ontology development in IS and health informatics. The HTT could also be used for dynamic application in organizational contexts when similar problem characteristics arise.

Steps of documentary research described by Sarantakos (1996):
• Step 1: Identification of relevant documents
• Step 2: Organisation and analysis of the documents
• Step 3: Evaluation of the information
• Step 4: Interpretation of the data.
Steps of qualitative analysis described by Miles and Huberman (1994):
• Step 1: Data reduction, which includes selection and condensation
• Step 2: Data display in diagrammatic, pictorial and visual forms
• Step 3: Conclusion drawing and verification, where displayed data are interpreted and meanings drawn.

Fig. 1 Example hierarchical diagram developed from Table 4 displaying CMC terms to end 1980
Fig. 2
Activity awareness gives workers indications of what is happening and what has happened recently in collaborative activities.
1999 Hayashi et al. (99) Activity awareness gives workers indications of what is happening and what has happened recently in collaborative activities.
2000 Jang et al. (28) Activity awareness represents a lack of awareness about other's activities (what are they doing).
adoption (1): Adoption means the decision to purchase. Adoption is a process that may or may not lead to continued use.
2002 Turner and Turner (4) Adoption can mean the decision to purchase, or the routine use of technology by end-users.
2003 Pollard (172)

Table 1
Plan of the research process

Articles were found by searching electronic libraries and electronic databases such as Proquest, Emerald and Infotrac, web search engines including Google Scholar, individual researchers' web sites and research group web sites. Web sites searched included Institute of Electrical and Electronics Engineers Computer Society (IEEE), Association for Computing Machinery (ACM), Australian Computer Society (ACS), and University web sites such as University of Strathclyde, University of Calgary and University of Arizona.

Table 3
Portion of data matrix showing quotes

Table 4
Communication systems terminology from Table 3

Table 6
Portion of spreadsheet showing definitions
Affective reward is the positive emotional response sometimes associated with goal attainment.
2003 Siao (19) Affective reward is defined as the positive emotional response sometimes associated with goal attainment (Reinig & Briggs 1995).