Towards Automatic Conceptual Personalization Tools

Faisal Ahmad, Sebastian de la Chica, Kirsten Butcher², Tamara Sumner, James H. Martin
Institute of Cognitive Science and Department of Computer Science, University of Colorado, Boulder, CO 80309
² Learning Research and Development Center, University of Pittsburgh, Pittsburgh, PA 15260
faisal.ahmad, sebastian.delachica, tamara.sumner, james.martin@colorado.edu; kbutcher@pitt.edu

ABSTRACT
This paper describes the results of a study designed to validate the use of domain competency models to diagnose student scientific misconceptions and to generate personalized instruction plans using digital libraries. Digital library resources provided the content base for human experts to construct a domain competency model for earthquakes and plate tectonics encoded as a knowledge map. The experts then assessed student essays using comparisons against the constructed domain competency model and prepared personalized instruction plans using the competency model and digital library resources. The results from this study indicate that domain competency models generated from select digital library resources may provide the desired degree of content coverage to support both automated diagnosis and personalized instruction in the context of nationally-recognized science learning goals. These findings serve to inform the design of personalized instruction tools for digital libraries.

Categories and Subject Descriptors
H.1.2 [Models and Principles]: User/Machine Systems; H.3.7 [Information Storage and Retrieval]: Digital Libraries – User issues; K.3.1 [Computers and Education]: Computer Uses in Education – Computer-aided instruction.

General Terms
Design, Experimentation, Human Factors

Keywords
Personalization, digital libraries, competency models, knowledge maps, student misconceptions, instructional plans

1. INTRODUCTION
Over the past two decades, cognitive research has examined the role of background knowledge, individual differences, and preferred learning styles in influencing learning outcomes [4]. A key finding is that every student brings preconceptions about how the world works to every learning situation, and that these initial understandings need to be explicitly targeted as part of the instructional process [ibid]. Simultaneously, there have been major demographic shifts taking place in learning populations, with many classrooms containing learners from diverse cultural backgrounds and prior experiences [28]. Educators increasingly need support to customize educational content and activities to meet the needs of a heterogeneous student population [11].

To address these needs, we are investigating how personalization tools capable of assessing and responding to current student conceptions and misconceptions can be embedded in educational digital libraries. The personalization tools we are creating for digital libraries are similar in intent to prior work in adaptive learning environments, such as AutoTutor [8] and the Practical Algebra Tutor [12], but differ in two key ways. First, rather than relying on human-intensive knowledge engineering efforts to create models of student competencies within a given domain, we are examining how natural language processing techniques may be used to automatically construct learner competency models by summarizing existing digital library resources.
Our hypothesis is that a carefully selected set of high-quality educational resources features the necessary breadth and depth of coverage to serve as the basis for the automatic development of pedagogically sound and age-appropriate domain competency models. Second, rather than relying on predefined content and curriculum specifically authored for a particular adaptive learning environment, we are developing personalization tools to dynamically select digital library resources that encompass a variety of instructional strategies, such as background readings, simulations, and other learning activities.

To inform the design and development of the envisioned conceptual personalization tools, and the underlying natural language processing algorithms, we have conducted a multi-part, 10-month study to examine the processes used by human experts to: (1) construct domain competency models from digital library resources and (2) develop personalized instructional strategies based on the competency models. This study examined in detail how experts identify and represent key concepts that students should know by analyzing and summarizing digital library resources. We specifically asked experts to develop an age-appropriate domain competency model for high school students studying earthquakes and plate tectonics using materials from the Digital Library for Earth System Education (DLESE.org). DLESE provides access to high-quality collections of educational resources for the Earth sciences and services to help educators and learners effectively find, create, use, and share such resources. The topics chosen are recognized as important content areas in national science educational standards [2, 17]. In this article, we describe the study methodology, the results, and implications for the design of automatic conceptual personalization tools. In particular, we focus on discussing four key questions:

• Can domain concepts important for the development of student competencies be reliably and consistently identified in educational digital library resources? If human experts can do so consistently, it is likely that computational approaches can be developed to automate such processes.
• Do the domain concepts embodied in a set of digital library resources provide sufficient coverage of important learning goals for the target student population?
• Is the domain competency model generated from digital library resources useful for diagnosing student misconceptions and understandings?
• How well does the domain competency model support developing personalized instructional strategies using digital library resources?

In the remainder of this paper we first review related work in the areas of adaptive learning environments, knowledge maps, and digital library information extraction. We then describe the methodology and results from our study. Finally, we discuss the implications for the design of automatic algorithms to construct domain competency models that support personalized instruction.

2. RELATED WORK
Developing approaches for tailoring instruction to students' current understanding, in both face-to-face classroom settings and computer-based learning environments, has been a long-term focus of learning science research [20]. For instance, conversational learning theory describes how personalization takes place in a face-to-face classroom setting. According to this theory, understanding of a topic occurs as a result of a structured and iterative conversation between an instructor and a learner within a conversational domain [20, 25]. Learner knowledge is constructed through an iterative process where students communicate their current understanding – through discussion, writings, or other scholarly artifacts – and the instructor develops an instructional strategy appropriate to both the learner's current understanding and the instructor's knowledge of what the learner should know about the topic. Together, these two forms of knowledge about the topic under study comprise the conversational domain. Pask developed this theory as part of one of the first efforts to create an adaptive learning environment in the 1970s; he provided an early illustration of how a computer-based knowledge representation of the conversational domain could be used to support personalized instruction [20].

More recently, adaptive learning environments have tried to emulate these types of student-teacher interactions by developing rich and detailed symbolic computer models to drive interactions with the student. These models typically depict the key ideas or concepts that learners should understand, how these ideas are interconnected, how these ideas change over time, and specific curriculum components to support learning of selected concepts [1, 2, 26]. Two prominent adaptive learning environments – the Practical Algebra Tutor and AutoTutor – have both reported impressive learning outcomes in controlled studies [8, 12].

The Practical Algebra Tutor supports students studying high-school algebra [23]. It uses a detailed cognitive model of desirable student competencies in algebra, represented as production rules, and a detailed model of the curriculum, represented as algebra problems that students should be able to solve, to automatically select problems to present to students. The Practical Algebra Tutor has been widely deployed in real classroom settings. To deploy this system in new settings and school districts, the researchers report that the content, i.e., the choices of problems presented to students, needed to be continually expanded and updated [ibid]. The costs and human effort associated with developing and updating the necessary instructional content are so high that the researchers have developed authoring tools to assist with these human-intensive knowledge engineering efforts [ibid].

The AutoTutor system developed by Graesser et al. uses an underlying model of what students should know about qualitative physics, represented as curriculum scripts, to personalize tutorial conversations for the topic of undergraduate introductory physics [8]. These curriculum scripts, created by subject matter experts, explicitly model expected student domain competencies, ideal answers, corresponding physics problems, questions to elicit further student knowledge, and common student misconceptions.
Once again, the researchers found that the intensive human effort involved in creating these scripts raises significant barriers to applying this approach to other domains; similarly to the Practical Algebra Tutor work, the researchers are also developing tools to facilitate the construction of these scripts [27]. These two examples illustrate both the potential successes (positive learning gains) and the challenges (human-intensive labor needed for model and curriculum development) to be faced when developing systems to support conceptual personalization in learning interactions.

Prior research has demonstrated both the utility and the production costs of using symbolic knowledge models for diagnosing student understanding and for generating personalized instruction strategies. Advances in statistical methods have prompted other researchers to explore the use of fully automatic techniques such as Latent Semantic Analysis [13] in adaptive learning environments. For instance, the Summary Street application assesses student writing by comparing the 'bag of words' derived from a student essay, represented as a vector, with those from a prepared corpus of materials about the subject domain to produce a cosine value characterizing the degree of alignment [29]. While adept at detecting the existence of discrepancies and problems in student essays, this vector-based knowledge representation does not readily support identifying the contents of learners' specific misunderstandings, nor does it support generating a specific instructional strategy.

Our approach to supporting conceptual personalization is aimed at balancing 'the best of both worlds': using statistical natural language processing approaches to automatically generate domain competency models, in the form of a semi-structured knowledge representation called a "knowledge map", and in turn, using these knowledge maps to underpin personalized instructional strategies. We believe that knowledge maps provide a rich enough representation to support our computational needs, and are also useful representations for learners to see and use.
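To make this representation concrete, the following is a minimal sketch of how a knowledge map might be encoded as nodes holding descriptive statements connected by a small set of typed links. The class names, the example statements, and the "Causes" relation label are illustrative assumptions for this sketch, not the actual data model used by our tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Node:
    """A knowledge map node: a richly descriptive statement rather than a one-word label."""
    node_id: int
    statement: str

@dataclass
class Link:
    """A typed relationship between two nodes, e.g. 'Causes' or 'Supports'."""
    source: int
    target: int
    relation: str

@dataclass
class KnowledgeMap:
    """A network of nodes connected by a limited number of link types."""
    nodes: Dict[int, Node] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_node(self, node_id: int, statement: str) -> None:
        self.nodes[node_id] = Node(node_id, statement)

    def add_link(self, source: int, target: int, relation: str) -> None:
        self.links.append(Link(source, target, relation))

# Hypothetical fragment of a domain competency model for earthquakes and plate tectonics.
competency_model = KnowledgeMap()
competency_model.add_node(1, "Earthquakes are concentrated along plate boundaries.")
competency_model.add_node(2, "Plates move relative to one another, building up stress at their boundaries.")
competency_model.add_link(2, 1, "Causes")
```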
To illustrate the utility of knowledge maps, consider a scenario where Heather, a 12th grade science student, has been assigned the task of writing an online essay on the causes of earthquakes using the envisioned personalization tools. The personalization tools have previously processed digital library resources from DLESE to construct a domain competency model depicting desired high-school level understandings of earthquakes and plate tectonics (see Figure 1). Heather writes that earthquakes can occur all over the world and requests feedback from the personalization tools. The tools analyze and detect critical differences between Heather's essay and nodes 1 and 2 in the domain competency model. To address this misconception, the personalization tools select age-appropriate resources from DLESE about the distribution of earthquakes and their prevalence along plate boundaries. These resources are presented to Heather along with the relevant portion of the knowledge map. The knowledge map provides Heather with an overarching conceptual guide, highlighting the core concepts she needs to work on and how they are related. This map also helps her understand why these resources were selected and how they can help with her specific learning needs. This personalized response prompts Heather to reflect on the inaccuracy of her current conception. Heather remembers that there are more earthquakes in California (where her grandmother lives) than in Colorado (where she lives). Heather explores this difference using a DLESE resource – a simulation illustrating the relationship between plate boundaries and earthquakes – suggested by the personalization tools.

[Figure 1 - Partial domain knowledge map [7]]

As shown in Figure 1, knowledge maps are a specialized type of concept map. Concept maps have been shown to be reliable representations of learner understanding and flexible models to track and assess cognitive development [18]. Concept maps are hierarchical node-link diagrams that depict concepts, usually as nodes with one or two keywords, and their interrelationships, either as labeled or unlabeled links. Knowledge maps differ from concept maps by depicting knowledge using a network layout, by using richly descriptive statements in the nodes to capture robust concepts and ideas related to a domain, and by focusing on a limited number of link types. Because of these differences, knowledge maps tend to be more concise and more useful as sharable, human-readable representations than concept maps. Prior research indicates that knowledge maps are useful cognitive scaffolds, helping users lacking domain expertise – such as learners, new teachers, or educators teaching out of area – to understand the macro-level structure of an information space [9, 19].

Supporting the creation and automatic analysis of concept maps is an active research area in the digital library community [14, 15]. Recent research shows promise for the development of algorithms that automatically perform node and link element matching in order to assess student-produced concept maps computationally [14]. Marshall et al. developed algorithms to compare student-produced maps to a 'gold-standard' expert-produced map and characterize the degree of alignment with a numerical score. We are extending this research to consider whether automatic comparisons of knowledge maps can provide learners with more specific feedback about conceptual differences, such as the interactions between Heather and the personalization tools in the above scenario.
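As a rough illustration of the kind of comparison this extension calls for, the sketch below matches student statements against expert knowledge map nodes using simple word overlap, reporting both an overall alignment score and the specific expert nodes left uncovered. It is a minimal sketch under assumed data shapes and an assumed word-overlap similarity; it does not reproduce the matching algorithms of Marshall et al. [14] or the approach used by our tools.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "are", "is", "at", "along"}

def content_words(text: str) -> set:
    """Lowercase, tokenize, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def similarity(a: str, b: str) -> float:
    """Jaccard overlap between the content words of two statements."""
    wa, wb = content_words(a), content_words(b)
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0

def compare(student_statements, expert_nodes, threshold=0.3):
    """Return an alignment score and the expert nodes with no matching student statement."""
    uncovered = []
    for node_id, node_text in expert_nodes.items():
        best = max((similarity(s, node_text) for s in student_statements), default=0.0)
        if best < threshold:
            uncovered.append(node_id)
    score = 1.0 - len(uncovered) / len(expert_nodes)
    return score, uncovered

# Hypothetical expert nodes (cf. nodes 1 and 2 in the scenario) and a student essay sentence.
expert = {
    1: "Earthquakes are concentrated along plate boundaries.",
    2: "Plates move relative to one another, building up stress at their boundaries.",
}
student = ["Earthquakes can occur all over the world."]
score, missing = compare(student, expert)
print(score, missing)  # low score; nodes 1 and 2 are reported as uncovered feedback targets
```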
The knowledge maps we are trying to create are, in effect, a specialized type of multi-document summarization. As such, generating them will require addressing issues in information extraction and library resource summarization. Within the digital library community, there are several research efforts that inform this work. For instance, Liddy [30] uses an assortment of natural language processing techniques to extract metadata elements by analyzing educational digital library resources, including the generation of very short resource summaries to populate the brief description field in the metadata. Fox et al. [10] also demonstrated the potential value of these techniques for information extraction and metadata generation. While these approaches have met with promising results, they do not directly address how to identify and extract key domain concepts from digital library resources, nor how to produce summaries of more than one resource. McKeown et al. [16] directly investigated how multi-document summarization techniques could support personalized digital library interactions. This research considered how user models (in the form of patient medical records) could be used to select and re-rank the presentation of search results. Additionally, they developed algorithms for selecting and organizing important passages from retrieved documents to create a personalized summary of the search results for doctors. As part of a related but more generic research effort in multi-document summarization, Radev et al. [22] have developed the MEAD toolkit to support the development of summarization applications. We are directly building upon this toolkit in our research and extending it to support the summarization of educational resources, which differ in structure and content from the news articles that the MEAD team focused on supporting.
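For readers unfamiliar with extractive multi-document summarization, the sketch below scores sentences drawn from several resources against a word-frequency centroid and keeps the highest-scoring ones. This is a deliberately simplified, centroid-style illustration under assumed inputs; it does not reproduce MEAD's actual features, interfaces, or our extensions to it.

```python
from collections import Counter
import re

def tokenize(text: str):
    return re.findall(r"[a-z]+", text.lower())

def split_sentences(text: str):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def summarize(documents, num_sentences=3):
    """Extract the sentences closest to the word-frequency centroid of all documents."""
    sentences = [s for doc in documents for s in split_sentences(doc)]
    centroid = Counter(w for s in sentences for w in tokenize(s))
    def score(sentence):
        words = tokenize(sentence)
        return sum(centroid[w] for w in words) / (len(words) or 1)
    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:num_sentences]

# Hypothetical snippets standing in for expository text from selected DLESE resources.
docs = [
    "Most earthquakes occur along plate boundaries. Plate motion builds up stress in rocks.",
    "Earthquakes release stress stored along faults. Plate boundaries are where most earthquakes occur.",
]
for sentence in summarize(docs, num_sentences=2):
    print(sentence)
```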
3. METHODOLOGY
Our research efforts in the area of personalization in digital libraries aim to determine how domain competency models computationally constructed from digital library resources may support the automatic diagnosis of student misconceptions and the generation of personalized instruction plans. We have conducted a study to elicit human expertise for constructing and utilizing a domain competency model for personalized instruction purposes and to identify design requirements for the automation of these pedagogical processes. This human-centered approach follows in the tradition of prior research efforts where data collected in human studies serves as the basis for formulating design requirements [3, 6]. To foster the construction of a scientifically accurate and pedagogically useful domain competency model, we recruited two geology experts and two instructional design experts. Our geology experts had a Master's level education in geology and at least 5 years of field experience. Our instructional designers had at least 10 years of experience in curriculum and learning materials design. These four experts collaborated with the research team on this study over a period of 10 months for a total of approximately 80 hours per expert. The study consisted of two phases: competency model construction and competency model assessment.

3.1 Competency Model Construction
The first phase of the study involved the experts creating a domain competency model. This phase included resource selection and knowledge map construction processes.

3.1.1 Resource Selection Process
The purpose of the resource selection process was to identify a suite of digital library resources suitable for constructing a comprehensive domain competency model featuring the desired level of science content accuracy and coverage. We instructed the experts to select appropriate learning resources from DLESE using three guiding principles. (1) To encourage the selection of resources that provided adequate breadth of coverage on the topic and to avoid excessive specialization, we instructed the experts to focus on resources providing comprehensive coverage for high school age learners in earthquakes and plate tectonics. (2) Given that the National Science Education Standards highlight the importance of learners understanding the necessary scientific terminology [17], we instructed the experts to focus on resources using age-appropriate domain terminology. (3) To generate data suitable for informing the design of natural language processing tools, we instructed the experts to focus on resources consisting mainly of expository text. The experts independently selected and ranked the 10 optimal DLESE resources given the resource selection guidelines and the targeted topic and age group. To achieve adequate balance between domain and pedagogical coverage, we paired each domain expert with an instructional designer to jointly rank their respective resource choices. All four experts then collaboratively reviewed and ranked all the individually selected resources and agreed on 20 resources in a discussion facilitated by research team members. At the end of this process, the research team collected the list of the 20 resources selected by the experts.

3.1.2 Knowledge Map Construction Process
The purpose of this process was to construct a comprehensive domain competency model encoded as a knowledge map using the 20 digital library resources selected by the experts.

3.1.2.1 Individual Knowledge Map Construction
To establish the consistency of the construction process, and hence the feasibility of the planned natural language processing approaches, the experts first created knowledge maps for each of the learning resources. Each expert used CmapTools [5], a knowledge modeling software package, to individually create knowledge maps for 11 of the 20 resources chosen in the resource selection process. To ensure that the experts would faithfully and reliably reflect the contents of the digital library resource in their knowledge maps, we instructed the experts to use the nodes in the knowledge map to capture the concepts as presented in each digital library resource as a paragraph, a sentence, a clause, or one or more words. In addition, the experts were also instructed to use the knowledge map links to reflect the relationships between the concepts as expressed in the digital library resource. To facilitate relationship tagging, we provided our experts with a preliminary list of relationship categories and examples, presented in Table 1. We also encouraged our experts to introduce any relationship terms they deemed necessary to best capture the contents of the resource. Each digital library resource was mapped by at least two experts to enable the research team to analyze the degree of knowledge mapping consistency across experts. At the completion of this activity, the research team collected the knowledge maps individually created by the experts.

Table 1 - Relationship types adapted from GetSmart [15] (relationship type: examples)
  Causes: cause-effect, cause, result, consequence
  Compares: comparison, analogy, is similar to, contrasts to
  Elaborates: elaboration-additional, elaboration-general-specific, elaboration-part-whole/consists-of, elaboration-process-step, elaboration-object-attribute, elaboration-set-member, example, definition
  Evaluates: evaluation, interpretation, conclusion, comment
  Followed by: sequence
  Is a: is-a-kind-of
  Supports / Is evidence for: supports, is-evidence-for

3.1.2.2 Knowledge Map Integration
To ensure the desired breadth and depth of coverage, the experts integrated their knowledge maps into a single domain competency model encoded as a knowledge map in two steps. First, to facilitate the creation of the domain competency model, each expert used CmapTools to merge the 11 knowledge maps s/he created into a single, merged knowledge map. This process resulted in four individually merged knowledge maps (one per expert). To better inform the automation of this process, we instructed the experts to constrain their knowledge map merging activities to the actual contents of the resources using three guiding principles. (1) The merged knowledge map should be representative of the digital library resources from which it was synthesized. (2) The merged knowledge map should contain key domain concepts and relationships from the resources. (3) If the underlying resources contained conflicting concepts or propositions, such inconsistencies needed to be captured accurately.
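Because our longer-term goal is to automate this merging step, the sketch below illustrates one naive way a program could merge node statements from several maps by collapsing near-duplicates. The similarity measure, threshold, and data shapes are assumptions for illustration only; this is not the experts' manual procedure nor our planned merging algorithm.

```python
import re

def content_words(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a: str, b: str) -> float:
    wa, wb = content_words(a), content_words(b)
    return len(wa & wb) / len(wa | wb) if (wa | wb) else 0.0

def merge_maps(maps, threshold=0.5):
    """Merge lists of node statements, collapsing statements that are near-duplicates.

    Each input map is modeled simply as a list of node statements; links are omitted
    here, although a fuller merge would also re-point links onto the merged nodes.
    """
    merged = []
    for statements in maps:
        for statement in statements:
            if not any(jaccard(statement, kept) >= threshold for kept in merged):
                merged.append(statement)
    return merged

# Hypothetical node statements from two individually created knowledge maps.
map_a = ["Earthquakes are concentrated along plate boundaries.",
         "Stress builds up where plates move past each other."]
map_b = ["Most earthquakes occur along plate boundaries.",
         "Volcanoes also form near convergent boundaries."]
print(merge_maps([map_a, map_b]))  # the two plate-boundary statements collapse into one
```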
Second, we conducted a one-day collaborative workshop where the experts integrated the four individually merged knowledge maps into a single knowledge map. During this workshop, the experts also validated the contents of the final domain competency model for accuracy, completeness, and fidelity to the original resources. At the workshop, the research team provided access to large printouts of the four individually merged knowledge maps as well as to the electronic versions of the individually merged knowledge maps and the emerging domain competency model. During the workshop, research team members facilitated the discussion and construction of the domain knowledge map, using CmapTools to capture the final outcome. At the completion of the workshop, the experts had produced a draft of the domain competency model encoded as a knowledge map. This draft was circulated via email for a final round of individual reviews and to reach final consensus on its contents. The research team collected the final version of the domain knowledge map at the completion of this offline review. Figure 2 shows a portion of the final domain competency model, including an inset showing the full extent of the model.

[Figure 2 - Domain competency model for earthquakes and plate tectonics]

3.2 Competency Model Assessment
The second phase of the study involved the experts assessing aspects of the domain competency model crucial to our guiding research questions. This phase included construction process reliability, content coverage, and pedagogical utility assessments.

3.2.1 Reliability
The purpose of the reliability assessment was to establish whether the experts represented the same content from the original digital library resources in their knowledge maps. Establishing consistency in the knowledge mapping process would provide an indication of the feasibility of the proposed automated domain competency model construction approach. Two members of the research team served as annotators and independently created a hierarchical outline of the contents of four randomly selected digital library resources. The annotators then aligned their outlines to the concepts used by the experts in the corresponding knowledge maps. Table 2 shows the hierarchical outline created by one of the annotators for a DLESE resource and its alignment to the knowledge map for that same resource. All the hierarchical content outlines created by the annotators depicted the resources using two or more levels. We leveraged this commonality to drive our reliability analysis. To this end, we created clusters consisting of knowledge map nodes aligned to the first and second levels in the outline. In addition, we rolled up nodes aligned to third and lower levels of the outline to the clusters at the second level of the outline. For example, Table 2 shows the outline levels below 4.3 Convergent aligned to …