diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..5b6a065
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,4 @@
+.Rproj.user
+.Rhistory
+.RData
+.Ruserdata
diff --git a/Code of Conduct.pdf b/Code of Conduct.pdf
new file mode 100644
index 0000000..b987daf
Binary files /dev/null and b/Code of Conduct.pdf differ
diff --git a/HUDK2017.rdf b/HUDK2017.rdf
new file mode 100644
index 0000000..15c0041
--- /dev/null
+++ b/HUDK2017.rdf
@@ -0,0 +1,1368 @@
+[Zotero RDF/XML bibliography export, 1,368 lines; the XML markup did not survive extraction. Recoverable entries (abstracts, tags, and attachment records omitted):]
+Bowers, A. J. (2010). Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping out and Hierarchical Cluster Analysis. Practical Assessment, Research & Evaluation, 15(7). http://eric.ed.gov/?id=EJ933686
+Grunspan, D. Z., Wiggins, B. L., & Goodreau, S. M. (2014). Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research. CBE-Life Sciences Education, 13(2), 167–178. doi:10.1187/cbe.13-08-0162
+Young, J. R. (2014, August 21). Why Students Should Own Their Educational Data. The Chronicle of Higher Education Blogs: Wired Campus. http://chronicle.com/blogs/wiredcampus/why-students-should-own-their-educational-data/54329
+Corbett, A. T., & Anderson, J. R. (1994). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4), 253–278. doi:10.1007/BF01099821
+Siemens, G., & Baker, R. S. J. d. (2012). Learning Analytics and Educational Data Mining: Towards Communication and Collaboration. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (LAK '12), 252–254. New York, NY: ACM. doi:10.1145/2330601.2330661
+Zheng, A. (2015, September). Evaluating Machine Learning Models. Sebastopol, CA: O'Reilly Media. http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp?intcmp=il-data-free-lp-lgen_free_reports_page
+Educause (2015, August 17). Why Is Measuring Learning So Difficult? [Video, with A. Collier, D. Hickey, J. Reich, E. Wagner, & G. Campbell]. https://www.youtube.com/watch?v=_iv8A1pHNYA
+Weinersmith, Z. (2016, January 5). Saturday Morning Breakfast Cereal [Webcomic]. http://www.smbc-comics.com/index.php?id=3978
+RStudio (2015, January). The Data Wrangling Cheatsheet. http://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
+Clow, D. (2014). Data wranglers: human interpreters to help close the feedback loop. Proceedings of the Fourth International Conference on Learning Analytics And Knowledge, 49–53. ACM.
+Kucirkova, N., & FitzGerald, E. (2015, December 9). Zuckerberg is ploughing billions into 'personalised learning' – why? The Conversation. http://theconversation.com/zuckerberg-is-ploughing-billions-into-personalised-learning-why-51940
+Udacity / Georgia Tech (2015, February 23). Feature Selection [Video]. YouTube. https://www.youtube.com/watch?v=8CpRLplmdqE
+Grolemund, G. (2014, August 1). RStudio Cheat Sheets. RStudio. https://www.rstudio.com/resources/cheatsheets/
+Greller, W., & Drachsler, H. (2012). Translating Learning into Numbers: A Generic Framework for Learning Analytics. Journal of Educational Technology & Society, 15(3), 42–57. http://www.jstor.org/stable/jeductechsoci.15.3.42
+Konstan, J. A., Walker, J. D., Brooks, D. C., Brown, K., & Ekstrand, M. D. (2015, April). Teaching Recommender Systems at Large Scale: Evaluation and Lessons Learned from a Hybrid MOOC. ACM Transactions on Computer-Human Interaction, 22(2), 10:1–10:23. doi:10.1145/2728171
+Matsuda, N., Furukawa, T., Bier, N., & Faloutsos, C. (2015, June). Machine Beats Experts: Automatic Discovery of Skill Models for Data-Driven Online Course Refinement. International Educational Data Mining Society. http://eric.ed.gov/?id=ED560513
+Hanneman, R. A., & Riddle, M. (n.d.). Chapter 1: Social Network Data. In Introduction to Social Network Methods. http://faculty.ucr.edu/~hanneman/nettext/C1_Social_Network_Data.html
+Klerkx, J., Verbert, K., & Duval, E. (2017). Learning Analytics Dashboards. In The Handbook of Learning Analytics (1st ed.). Vancouver, BC: Society for Learning Analytics Research. www.solaresearch.org
+Bergner, Y. (2017). Measurement and its Uses in Learning Analytics. In C. Lang, G. Siemens, A. F. Wise, & D. Gašević (Eds.), The Handbook of Learning Analytics (1st ed., pp. 34–48). Vancouver, BC: Society for Learning Analytics Research. http://solaresearch.org/hla-17/hla17-chapter1
+Brooks, C., & Thompson, C. (2017). Predictive Modelling in Teaching and Learning. In C. Lang, G. Siemens, A. F. Wise, & D. Gašević (Eds.), The Handbook of Learning Analytics (1st ed., pp. 61–68). Vancouver, BC: Society for Learning Analytics Research.
+Prinsloo, P., & Slade, S. (2017, March). Ethics and Learning Analytics: Charting the (Un)Charted. In The Handbook of Learning Analytics (1st ed., pp. 49–57). Vancouver, BC: Society for Learning Analytics Research. https://solaresearch.org/hla-17/hla17-chapter4/
+Liu, R., & Koedinger, K. (2017, March). Going Beyond Better Data Prediction to Create Explanatory Models of Educational Data. In The Handbook of Learning Analytics (1st ed., pp. 69–76). Vancouver, BC: Society for Learning Analytics Research. https://solaresearch.org/hla-17/hla17-chapter6/
+Gelman, A., & Niemi, J. (2011, September). Statistical graphics: making information clear – and beautiful. Significance, 134–136.
+Wainer, H. (1984). How to display data badly. The American Statistician, 38(2), 137–147.
+Gelman, A., & Unwin, A. (2012). Infovis and Statistical Graphics: Different Goals, Different Looks (with discussion).
+Fung, K. (2014). Junkcharts Trifecta Checkup: The Definitive Guide. Junkcharts [Blog]. http://junkcharts.typepad.com/junk_charts/junk-charts-trifecta-checkup-the-definitive-guide.html
diff --git a/README.html b/README.html
deleted file mode 100644
index 3a22094..0000000
--- a/README.html
+++ /dev/null
@@ -1,507 +0,0 @@
-[README.html: rendered HTML copy of the previous "Data Science in Education" syllabus, removed in favour of README.md. Only tag-stripped residue survived extraction; its text duplicates the Markdown deleted in the README.md diff below and is omitted here.]
- - - - -
- - - - - - - - diff --git a/README.md b/README.md index 2b9fd45..922cfaa 100644 --- a/README.md +++ b/README.md @@ -1,270 +1,390 @@ -# Data Science in Education: Syllabus +# Core Methods in Educational Data Mining: Syllabus +# Jie Chen, Sept. 10th, 2019 -* **Course:** [EDCT-GE2550, NYU Steinhardt](http://steinhardt.nyu.edu/alt/ect/courses) -* **Instructor:** Charles Lang, [charles.lang@nyu.edu](mailto:charles.lang@nyu.edu), @learng00d -* **Location:** 2 MetroTech, Room 845 +Introduction class -## Course Description +* **Course:** [HUDK 4050, Teachers College, Columbia](http://www.columbia.edu/~rsb2162/EDM2015/index.html) +* **Instructor:** Charles Lang, [charles.lang@tc.columbia.edu](lang2@tc.columbia.edu), Twitter: @learng00d +* **Course Assistants:** Anna Lizarov, [al38684@tc.columbia.edu](al3868@tc.columbia.edu), Aidi Bian, [ab4499@tc.columbia.edu](ab4499@tc.columbia.edu) +* **Day/Time:** Tuesdays/Thursdays, 5:10pm - 6:50pm +* **Location:** TH 136 +* **Instructor Office Hours:** Thursdays, 3:00pm - 5:00pm in GDH 454 - **[Please make an appointment to attend office hours here](https://calendar.google.com/calendar/selfsched?sstoken=UUNxY1RIY01kNmJZfGRlZmF1bHR8M2U5ODgxZmNiOWQ0NDc2N2VmNWQ0NThiM2JmMGRmZmQ)** +**(If no appointments are available or you cannot attend those that are please send an email to charles.lang@tc.columbia.edu and CC amy@x.ai)** -New class motto: "If its not messing up, its not technology" +* **Prerequisite:** HUDM 5122 *or* HUDM 5126 *or* approved statistics/computer science data mining course. +* **Credits:** 3 +* **Required Technology:** Laptop with RStudio installed, Phone with the Sensor Kinetics Pro app installed -The Internet and mobile computing are changing our relationship to data. Data can be collected from more people, across longer periods of time, and a greater number of variables, at a lower cost and with less effort than ever before. This has brought opportunities and challenges to many domains, but the full impact on education is only beginning to be felt. On the one hand there is a critical mass of educators, technologists and investors who believe that there is great promise in the analysis of this data. On the other, there are concerns about what the utilization of this data may mean for education and society more broadly. Data Science in Education provides an overview of the use of new data cources in education with the aim of developing students’ ability to perform analyses and critically evaluate the technologies and consequences of this emerging field. It covers methods and technologies associated with Data Science, Educational Data Mining and Learning Analytics, as well as discusses the opportunities for education that these methods present and the problems that they may create. +## Course Description -No previous experience in statistics, computer science or data manipulation will be expected. However, students will be encouraged to get hands-on experience, applying methods or technologies to educational problems. Students will be assessed on their understanding of technological or analytical innovations and how they critique the consequences of these innovations within the broader educational context. +The Internet and mobile computing are changing our relationship to data. Data can be collected from more people, across longer periods of time, and a greater number of variables, at a lower cost and with less effort than ever before. 
This has brought opportunities and challenges to many domains, but the full impact on education is only beginning to be felt. Core Methods in Educational Data Mining provides an overview of the use of new data sources in education with the aim of developing students’ ability to perform analyses and critically evaluate their application in this emerging field. It covers methods and technologies associated with Data Science, Educational Data Mining and Learning Analytics, as well as discusses the opportunities for education that these methods present and the problems that they may create. ## Course Goals -The overarching goal of this course is for students to acquire the knowledge and skills to be intelligent producers and consumers of data science in education. By the end of the course students should: +The overarching goal of this course is for students to acquire the knowledge and skills to be intelligent producers and consumers of data mining in education. By the end of the course students should: + * Systematically develop a line of inquiry utilizing data to make an argument about learning * Be able to evaluate the implications of data science for educational research, policy, and practice -This necessarily means that students become comfortable with the educational applications of three domain areas: computer science, statistics and the context surrounding data use. There is no expectation for students to become experts in any one of these areas but rather the course will aim to: enhance student competency in identifying issues at the level of data acquisition, data analysis and application of analysis in education. +This necessarily means that students become comfortable with the educational applications of three domain areas: computer science, statistics and the context surrounding data use. There is no expectation for students to become experts in any one of these areas but rather the course will aim to: enhance student competency in identifying issues at the level of data acquisition, data analysis and application of analyses to the educational enterprise. ## Assessment -In EDCT-GE 2550 students will be attempting several data science projects, however, unlike most courses, students will not be asssessed based on how successful they are in completing these projects. Rather students will be assessed on two key components for future sucess: contribution and organization. **Contribution** reflects the extent to which students participate in the course, how often they tweet, whether or not they complete assignments and quizzes, attend class, etc. **Organization** reflects how well students document their process and maintain data and software resources. For example, maintaining a well organized Zotero library with notes, maintaining a well organized Github account and maintaining organized data sets that are labelled appropriately. To do well in EDCT-GE 2550 requires that students finish the course with the resources to sucessfully use data science in education *in the future*. Do the work and stay organized and all will be well! +In HUDK4050 students will be attempting several data science projects, however, unlike most courses, students will not be asssessed based on how successful they are in completing these projects. Rather students will be assessed on two key components that will contribute to their future sucess in the field: contribution and organization. 
**Contribution** reflects the extent to which students participate in the course, whether or not students complete assignments and quizzes, attend class, etc. **Organization** reflects how well students document their process and maintain data and software resources. For example, maintaining a well organized bibliographic library with notes, maintaining a well organized Github account and maintaining organized data sets that are labelled appropriately. To do well in HUDK 4050 requires that students finish the course with the resources to sucessfully use data science in education *in the future*. Do the work and stay organized and all will be well! Tasks that need to be completed during the semester: +Weekly: * Attend class * Weekly readings - * Comment on readings on Twitter - * Weekly in class questionnaire - * Maintain documentation of work (Github, R Markdown, Zotero) + * Notes on weekly readings + * Complete Swirl course + * Maintain documentation of work (Github, R Markdown) + +One time only: * Ask one question on Stack Overflow - * In person meeting with instructor + * Attend office hours once * 8 short assignments (including one group assignment) * Group presentation of group assignment, 3-5 students each - * Produce one argument about learning using data from the class - + ## Week-by-week Unit 1: Introduction -Unit 2: Data Sources +Unit 2: Data Sources & Their Manipulation -Unit 3: Networks +Unit 3: Structure Discovery Unit 4: Prediction -Unit 5: Natural Language Processing - -Unit 6: Quantified Student - -Unit 7: Advanced Graphics - -## Unit 1: Introduction (1/28/16 - 2/4/16) +# Unit 1: Introduction +## Class 1 - Introduction (9/5/18) ### Learning Objectives * Be familiar with course philosophy, logic & structure * Install and be familiar with the software to be used in the course -* Consider informed consent and its complexity in education technology * Appreciate the importance of tightly defining educational goals +## Class 2 - LA, EDM and the Learning Sciences (9/10/18) + +### Learning Objectives + +* Be familiar with the kinds of work done in the fields of LA and EDM + ### Tasks to be completed: - - 1. Read and comment on by 1/30/16: - * [Leong, B. and Polonetsky, J. 2015. Why Opting Out of Student Data Collection Isn’t the Solution. EdSurge.](https://www.edsurge.com/news/2015-03-16-why-opting-out-of-student-data-collection-isn-t-the-solution) - * [Young, J.R. 2014. Why Students Should Own Their Educational Data. The Chronicle of Higher Education Blogs: Wired Campus.](http://chronicle.com/blogs/wiredcampus/why-students-should-own-their-educational-data/54329) - 2. Assignment 1: Set up +Read/watch: + * [Siemens, George. and Baker, Ryan S.J. d. 2012. Learning Analytics and Educational Data Mining: Towards Communication and Collaboration. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (New York, NY, USA, 2012), 252–254.](http://www.upenn.edu/learninganalytics/ryanbaker/LAKs%20reformatting%20v2.pdf) + +Read chapter 1-3: + * [Grolemund, Garrett. 2014. 
Hands-On Programming with R](https://d1b10bmlvqabco.cloudfront.net/attach/ighbo26t3ua52t/igp9099yy4v10/igz7vp4w5su9/OReilly_HandsOn_Programming_with_R_2014.pdf) -# Unit 2: Data Sources & their Manipulation -## Week 2 Data Sources (2/4/16 - 2/11/16) +#### Due: Assignment 1 - Set up -### Learning Objectives +## Class 3 - Data Sources (9/12/18) * Be familiar with a range of data sources, formats and extraction processes * Be familiar with R & Github & markdown -* Be familiar with the kinds of work done in the fields of LA and EDM ### Tasks to be completed: - 1. Read/watch and comment: - * [Siemens, G. and Baker, R.S.J. d. 2012. Learning Analytics and Educational Data Mining: Towards Communication and Collaboration. Proceedings of the 2nd International Conference on Learning Analytics and Knowledge (New York, NY, USA, 2012), 252–254.](http://users.wpi.edu/~rsbaker/LAKs%20reformatting%20v2.pdf) - * [Educause 2015. Why Is Measuring Learning So Difficult?](http://er.educause.edu/multimedia/2015/8/why-is-measuring-learning-so-difficult-v) - * [Saturday Morning Breakfast Cereal: 2016.](http://www.smbc-comics.com/index.php?id=3978) - * [The R Markdown Cheat sheet: 2014.](http://shiny.rstudio.com/articles/rm-cheatsheet.html) +Read: +* [Bergner, Yoav. (2017). Measurement and its Uses in Learning Analytics. In C. Lang, G. Siemens, A. F. Wise, & D. Gaševic (Eds.), The Handbook of Learning Analytics (1st ed., pp. 34–48). Vancouver, BC: Society for Learning Analytics Research.](http://solaresearch.org/hla-17/hla17-chapter1) +* [The R Markdown Cheat sheet: 2014.](http://shiny.rstudio.com/articles/rm-cheatsheet.html) - 2. Assignment 2: Github and RStudio +Swirl: +* Unit 1 - Introduction -## Week 3 Data Tidying (2/11/16 - 2/18/16) +# Unit 2: Data Sources & their Manipulation + +## Class 4 - Data Wrangling (9/17/18) ### Learning Objectives: - * Be able to perform a data tidying workflow - * Be able to do basic visualization * Understand the importance of workflow and recording workflow ### Tasks to be completed: -1. Read/watch: - * [Poulson, B. Up and Running with R. Lynda.com. Section 3 - 4](http://www.lynda.com/R-tutorials/Up-Running-R/120612-2.html?org=nyu.edu) - * [Data Wrangling Cheatsheet: 2015.](http://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf) +Read: +* [Prinsloo, Paul, & Slade, Sharon (2017). Ethics and Learning Analytics: Charting the (Un)Charted. In C. Lang, G. Siemens, A. F. Wise, & D. Gaševic (Eds.), The Handbook of Learning Analytics (1st ed., pp. 49–57). Vancouver, BC: Society for Learning Analytics Research.](https://solaresearch.org/hla-17/hla17-chapter4/) +* [Greller, Wendy, & Drachsler, Hendrik. (2012). Translating Learning into Numbers: A Generic Framework for Learning Analytics. Journal of Educational Technology & Society, 15(3), 42–57.](https://www.jstor.org/stable/jeductechsoci.15.3.42?seq=1#page_scan_tab_contents) + +## Class 5 - Data Wrangling (9/19/18) -2. Read/comment: - * [Clow, D. 2014. Data wranglers: human interpreters to help close the feedback loop. Proceedings of the Fourth International Conference on Learning Analytics And Knowledge (2014), 49–53.](http://oro.open.ac.uk/40608/2/Clow-DataWranglers-final.pdf) +### Learning Objectives: -3. 
Assignment 3 - -## Week 4: Personalization through Features (2/18/16 - 2/25/16) + * Be able to perform a data tidying workflow - * Understand why dimensionality reduction is necessary - * Be familiar with broad groups of dimensionality reduction (feature transformation, feature selection, feature extraction) - * Understand the complexity of personalization in education - ### Tasks to be completed: -1. Read/Comment: +Read: +* [Saturday Morning Breakfast Cereal: 2016.](http://www.smbc-comics.com/index.php?id=3978) - * [Kucirkova, N. and FitzGerald, E. 2015. Zuckerberg is Ploughing Billions into “Personalised Learning” – Why? The Conversation.](https://theconversation.com/zuckerberg-is-ploughing-billions-into-personalised-learning-why-51940) - -2. Read/Watch: +* [Data Wrangling Cheatsheet: 2015.](http://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf) - * [Georgia Tech 2015. Feature Selection. Youtube.](https://www.youtube.com/watch?v=8CpRLplmdqE) - * [Perez-Riverol, Y. 2013. Introduction to Feature Selection for Bioinformaticians Using R, Correlation Matrix Filters, PCA & Backward Selection. R-bloggers.](http://www.r-bloggers.com/introduction-to-feature-selection-for-bioinformaticians-using-r-correlation-matrix-filters-pca-backward-selection/) - -3. Assignment 4 - -## Week 5: Dimension Reduction (2/25/16 - 3/3/16) +## Class 6 - Data Wrangling (9/24/18) + +Read: +* [Clow, Doug. 2014. Data wranglers: human interpreters to help close the feedback loop. Proceedings of the Fourth International Conference on Learning Analytics And Knowledge (2014), 49–53.](http://oro.open.ac.uk/40608/2/Clow-DataWranglers-final.pdf) +* [Young, Jeffrey R. 2014. Why Students Should Own Their Educational Data. The Chronicle of Higher Education Blogs: Wired Campus.](http://chronicle.com/blogs/wiredcampus/why-students-should-own-their-educational-data/54329) + +## Class 7 - Data Wrangling (9/26/18) + +### Learning Objectives: - * Perform one method from each group of dimensionality reduction methods - * Be aware of the complexity of Open Data + * Be familiar with a range of data manipulation commands ### Tasks to be completed: -1. Read/Comment: +Read: +* [Data Wrangling Cheatsheet: 2015.](http://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf) - * [Ridgway, J. and Smith, A. 2013. Open data, official statistics and statistics education: threats, and opportunities for collaboration. Proceedings of the Joint IASEIAOS Satellite Conference “Statistics Education for Progress”, Macao, China (2013).](http://iase-web.org/documents/papers/sat2013/IASE_IAOS_2013_Paper_K3_Ridgway_Smith.pdf) +Swirl: +* Unit 2 - Data Sources & Manipulation -2. Assignment 5 +# Unit 3: Structure Discovery -# Unit 3: Networks -## Week 6 Introduction to Networks (3/3/16 - 3/10/16) +## Class 8 - Teachley Class Visit (10/1/18) -### Learning Objectives +## Class 9 - Start Social Networks (10/3/18) + +## Class 10 - Check-in Exam (10/8/18) + +### Learning Objectives: + + * Understand the place of data visualization in the data analysis cycle + * Be familiar with a range of data simulation commands + +## Class 11 - Visualization (10/10/18) + +### Learning Objectives: + + * Be able to generate basic visualizations during on-the-fly analysis + +Read: +* [Gelman, A., & Unwin, A. (2012). Infovis and Statistical Graphics: Different Goals, Different Looks (with discussion)](http://www.stat.columbia.edu/~gelman/research/published/vis14.pdf) +* [Fung, K. (2014). 
Junkcharts Trifecta Checkup: The Definitive Guide](http://junkcharts.typepad.com/junk_charts/junk-charts-trifecta-checkup-the-definitive-guide.html) + +## Class 12 - Networks (10/15/18) + +### Learning Objectives: + + * Understand the basic premise of graph theory applied to social networks + +### Tasks to be completed: + +Read: +* [Grunspan, D. Z., Wiggins, B. L., & Goodreau, S. M. (2014). Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research. CBE-Life Sciences Education, 13(2), 167–178.](http://www.lifescied.org/content/13/2/167.full.pdf) + +## Class 13 - Networks (10/17/18) + +### Learning Objectives: + + * Conceptualize a data structure suitable for network analysis, generate a network and produce basic summary metrics + +### Tasks to be completed: + +Read: + +* [Hanneman, R.A. and Riddle, M. Chapter 1: Social Network Data. Introduction to Social Network Methods.](http://faculty.ucr.edu/~hanneman/nettext/C1_Social_Network_Data.html) + +#### Due: Assignment 2 - Social Network + +## Class 14 - Clustering (10/22/17) +### Learning Objectives: + + * Understand the basic principle and algorithm behind cluster analysis - * Define social network analysis and its main analysis methods - * Perform social network analysis and visualize analysis results in R - * Develop a well defined opinion on how to approach student privacy and data use +### Tasks to be completed: + +Read: +* [Bowers, A.J. (2010) Analyzing the Longitudinal K-12 Grading Histories of Entire Cohorts of Students: Grades, Data Driven Decision Making, Dropping Out and Hierarchical Cluster Analysis. Practical Assessment, Research & Evaluation (PARE), 15(7), 1-18.](http://pareonline.net/pdf/v15n7.pdf) + +## Class 15 - Clustering (10/24/17) + +### Learning Objectives: + + * Create a suitable data structure and perform clustering on a sample + +### Tasks to be completed: + +Watch: +* Chapter 7 in Baker, R. (2014). Big Data in Education: [video 1](https://youtu.be/mgXm3AwLxP8), [video 2](https://youtu.be/B9dvJYwBfmk) + +## Class 16 - Principal Component Analysis (10/29/18) + +### Learning Objectives: + + * Be familiar with the basic ideas behind dimension reduction and the reasons for needing it + * Understand the basic principles behind Principal Component Analysis + +### Tasks to be completed: + +Read: +* [Visually Explained](http://setosa.io/ev/principal-component-analysis/) +* [Konstan, J. A., Walker, J. D., Brooks, D. C., Brown, K., & Ekstrand, M. D. (2015). Teaching Recommender Systems at Large Scale: Evaluation and Lessons Learned from a Hybrid MOOC. ACM Trans. Comput.-Hum. Interact., 22(2), 10:1–10:23.](https://dl.acm.org/citation.cfm?id=2728171) + +## Class 17 - Principal Component Analysis (10/31/18) + +### Learning Objectives: + + * Perform principal component analysis ### Tasks to be completed: -1. Read/Comment: - * [Hanneman, R.A. and Riddle, M. Chapter 1: Social Network Data. Introduction to Social Network Methods.](http://faculty.ucr.edu/~hanneman/nettext/C1_Social_Network_Data.html) - * [Krueger, K.R. and Moore, B. 2015. New Technology “Clouds” Student Data Privacy. Phi Delta Kappan. 96, 5 (Feb. 2015), 19–24.](http://www.greeleyschools.org/cms/lib2/CO01001723/Centricity/Domain/2387/New%20technology%20clouds%20student%20data%20privacy.pdf) - * [Leong, B. and Polonetsky, J. 2016. Passing the Privacy Test as Student Data Laws Take Effect. 
EdSurge.](https://www.edsurge.com/news/2016-01-12-passing-the-privacy-test-as-student-data-laws-take-effect?utm_content=bufferc0042&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer) +Watch: +* [Georgia Tech 2015. Feature Selection. Youtube.](https://www.youtube.com/watch?v=8CpRLplmdqE) -2. Assignment 6 +## Class 18 - Domain Structure Discovery (11/5/18) -## Week 7 Social Network Analysis (3/10/16 - 3/17/16) +### Learning Objectives: - * Describe and interpret the results of social network analysis for the study of learning - * Describe and critically reflect on approaches to the use of social network analysis for the study of learning + * Be familiar with the range of strategies for mapping domains and skills ### Tasks to be completed: -1. Read/Comment: +Read: +* [Matsuda, N., Furukawa, T., Bier, N., & Faloutsos, C. (2015). Machine Beats Experts: Automatic Discovery of Skill Models for Data-Driven Online Course Refinement. International Educational Data Mining Society.](http://eric.ed.gov/?id=ED560513) - * [Grunspan, D. Z., Wiggins, B. L., & Goodreau, S. M. (2014). Understanding Classrooms through Social Network Analysis: A Primer for Social Network Analysis in Education Research. CBE-Life Sciences Education, 13(2), 167–178. doi:10.1187/cbe.13-08-0162](http://www.lifescied.org/content/13/2/167.full.pdf) - * [Manai, J. 2015. The Learning Analytics Landscape: Tension Between Student Privacy and the Process of Data Mining. Carnegie Foundation for the Advancement of Teaching.](http://www.carnegiefoundation.org/blog/the-learning-analytics-landscape-tension-between-student-privacy-and-the-process-of-data-mining/) +#### Due: Assignment 3 - Clustering -2. Assignment 7 +## Class 19 - Domain Structure Discovery (11/7/18) + +### Learning Objectives: + + * Be familiar with the Q-matrix method + +### Tasks to be completed: + +Watch: +* Chapter 7 in Baker, R. (2014). Big Data in Education:[video 6](https://youtu.be/oFSV6-opnws) + +Swirl: +* Unit 3 - Structure Discovery # Unit 4: Prediction -## Week 8 Prediction Modelling (3/17/16 - 3/24/16) +## Class 20 - Prediction (11/12/18) + +##### Due: Assignment 4 - Principal Component Analysis + +### Learning Objectives: - * Conduct one form of prediction modeling effectively and appropriately - * Understand the basis of predictive inference - * Develop a well defined opinion of the complexity of adaption + * Understand why prediction is desireable goal, the various meanings of the word and general strategies employed across statistics, machine learning and experimental psychology ### Tasks to be completed: -1. Read/Comment: +Read: +* [Kucirkova, N. and FitzGerald, E. 2015. Zuckerberg is Ploughing Billions into “Personalised Learning” – Why? The Conversation.](https://theconversation.com/zuckerberg-is-ploughing-billions-into-personalised-learning-why-51940) +* [Brooks, C., & Thompson, C. (2017). Predictive Modelling in Teaching and Learning. In The Handbook of Learning Analytics (1st ed., pp. 61–68). Vancouver, BC: Society for Learning Analytics Research.](https://solaresearch.org/hla-17/hla17-chapter5/) + +## Class 21 - Prediction (11/14/18) + +### Learning Objectives: - * [Honan, M. (2014, August 11). I Liked Everything I Saw on Facebook for Two Days. Here’s What It Did to Me | Gadget Lab. WIRED. Retrieved August 12, 2014](http://www.wired.com/2014/08/i-liked-everything-i-saw-on-facebook-for-two-days-heres-what-it-did-to-me/) - * [Farr, C. 2014. Microsoft and Knewton partner up to bring adaptive learning to publishers & schools. 
VentureBeat.](http://venturebeat.com/2014/03/13/microsoft-and-knewton-partner-up-to-bring-adaptive-learning-to-publishers-schools/) + * Employ a linear prediction model -2. Read: +### Tasks to be completed: - * [Zheng, A. 2015. Evaluating Machine Learning Models. O’Reily Media. Chapter 2: Evaluation Metrics p.7-18](http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp?intcmp=il-data-free-lp-lgen_free_reports_page) +Watch: +* Chapter 1 in Baker, R. (2014). Big Data in Education: [video 1](https://youtu.be/dc5Nx3tyR8g) -3. Assignment 8 +## Class 22 - Classification (11/19/18) -## Week 9 Prediction Modelling (3/24/16 - 3/31/16) +### Learning Objectives: - * Understand core uses of prediction modeling in intelligent tutors - * Learn how to engineer both features and training labels - * Learn about key diagnostic metrics and their uses + * Understand the concept of classification and its relationship to modeling ### Tasks to be completed: - 1. Read/Comment: +Read: - * [San Pedro, M.O.Z., Baker, R.S.J.d., Bowers, A.J., Heffernan, N.T. (2013) Predicting College Enrollment from Student Interaction with a Intelligent Tutoring System in Middle School. Proceedings of the 6th International Conference on Educational Data Mining, 177-184.](http://www.columbia.edu/~rsb2162/EDM2013_SBBH.pdf) - * [Wu, X., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., Zhou, Z.-H., Steinbach, M., Hand, D.J. and Steinberg, D. 2007. Top 10 algorithms in data mining. Knowledge and Information Systems. 14, 1 (Dec. 2007), 1–6.](https://www.cs.umd.edu/~samir/498/10Algorithms-08.pdf) +* [Liu, R., & Koedinger, K. (2017). Going Beyond Better Data Prediction to Create Explanatory Models of Educational Data. In The Handbook of Learning Analytics (1st ed., pp. 69–76). Vancouver, BC: Society for Learning Analytics Research.](https://solaresearch.org/hla-17/hla17-chapter6/) -2. Assignment 9 +#### Due: Assignment 5 - Prediction -# Unit 5: Natural Language Processing -## Week 10 Natural Language Processing (3/31/16 - 4/7/16) +## Class 23 - Classification (11/21/18) - Thanksgiving No Class + +### Learning Objectives: - * Describe prominent areas of text mining - * Assemble a corpus of documents - * Describe applications of text mining to education + * Implement a CART model ### Tasks to be completed: -1. Read/Comment: - * [Nadkarni, P.M., Ohno-Machado, L. and Chapman, W.W. 2011. Natural language processing: an introduction. Journal of the American Medical Informatics Association : JAMIA. 18, 5 (2011), 544–551.](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3168328/) - * [Shermis, M. D. (2014). State-of-the-art automated essay scoring: Competition, results, and future directions from a United States demonstration. Assessing Writing, 20, 53–76.](https://s3.amazonaws.com/s3.documentcloud.org/documents/1094637/shermis-aw-final.pdf) +Watch: +* Chapter 1 in Baker, R. (2014). Big Data in Education: [video 3](https://youtu.be/k9Z4ibzH-1s) & [video 4](https://youtu.be/8X0UlMShss4) -2. Assignment 10 +## Class 24 - Diagnostic Metrics (11/26/18) -## Week 11 Natural Language Processing (4/7/16 - 4/14/16) +### Learning Objectives: - * Perform a basic NLP analysis - * Develop a well defined opinion on whether students should have a right to understand how they are judged + * Understand and apply the following diagnostic metrics to models: Kappa, A', correlation, RMSE, ROC ### Tasks to be completed: -1. Read/Comment: - * [Crawford, K. and Schultz, J. 2014. 
Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms - Boston College Law Review. Boston College Law Review. LV, 1 (2014).](http://bclawreview.org/files/2014/01/03_crawford_schultz.pdf) - * [Thompson, J. 2015. Text Mining, Big Data, Unstructured Data. Dell Computing.](http://documents.software.dell.com/Statistics/Textbook/Text-Mining) +Read: +* [Zheng, A. 2015. Evaluating Machine Learning Models. O’Reily Media. Chapter 2: Evaluation Metrics p.7-18](http://www.oreilly.com/data/free/evaluating-machine-learning-models.csp?intcmp=il-data-free-lp-lgen_free_reports_page) + +Watch: +* Chapter 2 in Baker, R. (2014). Big Data in Education: [video 2](https://youtu.be/fGMFYTHhcHg), [video 3](https://youtu.be/9PDwRdyb6Sw) and [video 4](https://youtu.be/7r3hfJW1gz0) +* Chapter 2 in Baker, R. (2014). Big Data in Education: [video 5](https://youtu.be/1P34cxpEdKA) +* [Georgia Tech 2015. Cross Validation. Youtube.](https://youtu.be/sFO2ff-gTh0) -2. Assignment 11 +## Class 25 - Knowledge Tracing (11/28/18) -# Unit 6: The Quantified Student +### Vectr Class Visit -## Week 12 The Quantified Student (4/14/16 - 4/21/16) +### Learning Objectives: - * Have a well defined opinion of the use of biometric data in education - * Extract orientation data from a mobile device + * Understand the concepts behind Bayesian Knowledge Tracing ### Tasks to be completed: - -1. Read/Comment - * [Lee, V. R., & Drake, J. (2013). Quantified Recess: Design of an Activity for Elementary Students Involving Analyses of Their Own Movement Data. In Proceedings of the 12th International Conference on Interaction Design and Children (pp. 273–276). New York, NY, USA: ACM. doi:10.1145/2485760.2485822](http://quantifiedself.com/wp-content/uploads/2014/11/Quantified-recess_-Design-of-an-activity-for-elementary-students.pdf) - * [Kamenetz, A. 2015. The Quantified Student: An App That Predicts GPA. NPR.](http://www.npr.org/sections/ed/2015/06/02/409780423/the-quantified-student-an-app-that-predicts-gpa) - * [Meyer, R. (2016, February 25). The Quantified Welp. The Atlantic.](http://www.theatlantic.com/technology/archive/2016/02/the-quantified-welp/470874/) -2. Assignment 12 +Read: +[Corbett, A.T., Anderson, J.R. (1995) Knowledge Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4, 253-278.](https://link.springer.com/article/10.1007/BF01099821) + +Swirl: +* Unit 4 - Prediction + + +## Class 26 - Knowledge Tracing (12/3/18) + +### Learning Objectives: + + * Understand Bayesian Knowledge Tracing + +### Tasks to be completed: + +Watch: +* Chapter 4 in Baker, R. (2014). Big Data in Education: [video 1](https://youtu.be/_7CtthPZJ70) + +#### Due: Assignment 6 - CART Models + +## Class 27 - Work Session: Assignment 8, Group Project (12/5/18) + +## Class 28 - Work Session: Assignment 8, Group Project (12/10/18) + +##### Due: Assignment 7 - Diagnostic Metrics + +## Class 29 - Rate video presentations (12/12/18) + +## Class 30 - Rate video presentations (12/17/18) + +## EVERYTHING DUE - 12/19/18 + +---------------------------------------------------- + +# Fine Print -# Unit 7: Advanced Graphics +1. All examinations, papers, and other graded work and assignments are to be completed in conformance with the [Academic Integrity Policy](http://www.tc.columbia.edu/administration/diversity/index.asp? Id=Civility+Resources+and+Policies&Info=Civility+Resources+and+Policies&Area=Studen t+Miscon duct+Policy). 
Students who intentionally submit work either not their own or without clear attribution to the original source, fabricate data or other information, engage in cheating, or misrepresentation of academic records may be subject to charges. Sanctions may include dismissal from the college for violation of the TC principles of academic and professional integrity fundamental to the purpose of the College. -## Week 13 Advanced Graphics (4/21/16 - 4/28/16) +2. The College will make reasonable accommodations for persons with documented disabilities. Students are encouraged to contact the Office of Access and Services for Individuals with Disabilities for information about registration (166 Thorndike Hall). Services are available only to students who are registered and submit appropriate documentation. As your instructor, I am happy to discuss specific needs with you as well. - * Understand basic principals of the grammar of graphics - * Understand the basic principals of effective data visualization - * Produce a range of graphical representations using ggplot & D3.js for R +3. The grade of Incomplete will be assigned only when the course attendance requirement has been met but, for reasons satisfactory to the instructor, the granting of a final grade has been postponed because certain course assignments are outstanding. If the outstanding assignments are completed within one calendar year from the date of the close of term in which the grade of Incomplete was received and a final grade submitted, the final grade will be recorded on the permanent transcript, replacing the grade of Incomplete, with a transcript notation indicating the date that the grade of Incomplete was replaced by a final grade. If the outstanding work is not completed within one calendar year from the date of the close of term in which the grade of Incomplete was received, the grade will remain as a permanent Incomplete on the transcript. In such instances, if the course is a required course or part of an approved program of study, students will be required to re-enroll in the course including repayment of all tuition and fee charges for the new registration and satisfactorily complete all course requirements. If the required course is not offered in subsequent terms, the student should speak with the faculty advisor or Program Coordinator about their options for fulfilling the degree requirement. Doctoral students with six or more credits with grades of Incomplete included on their program of study will not be allowed to sit for the certification exam. -### Tasks to be completed: IMPORTANT +4. Teachers College students have the responsibility for activating the Columbia University Network ID (UNI) and a free TC Gmail account. As official communications from the College – e.g., information on graduation, announcements of closing due to severe storm, flu epidemic, transportation disruption, etc. -- will be sent to the student’s TC Gmail account, students are responsible for either reading email there, or, for utilizing the mail forwarding option to forward mail from their account to an email address which they will monitor. -1. Read/Watch: - * [Datacamp 2015. The ggvis R package - How to Work With The Grammar of Graphics - YouTube. Youtube.](https://www.youtube.com/watch?v=rf55oB6xX3w) - * [Friendly, M. 2008. A Brief History of Data Visualization. Handbook of Data Visualization. Springer Berlin Heidelberg. 15–56.] 
(http://download.springer.com.ezp-prod1.hul.harvard.edu/static/pdf/797/chp%253A10.1007%252F978-3-540-33037-0_2.pdf?originUrl=http%3A%2F%2Flink.springer.com%2Fchapter%2F10.1007%2F978-3-540-33037-0_2&token2=exp=1453237938~acl=%2Fstatic%2Fpdf%2F797%2Fchp%25253A10.1007%25252F978-3-540-33037-0_2.pdf%3ForiginUrl%3Dhttp%253A%252F%252Flink.springer.com%252Fchapter%252F10.1007%252F978-3-540-33037-0_2*~hmac=f39b47d9779f7d2ef33b7e231c7385fb79662ec5cc43ff39d52e812fe9ca466c) +5. It is the policy of Teachers College to respect its members’ observance of their major religious holidays. Students should notify instructors at the beginning of the semester about their wishes to observe holidays on days when class sessions are scheduled. Where academic scheduling conflicts prove unavoidable, no student will be penalized for absence due to religious reasons, and alternative means will be sought for satisfying the academic requirements involved. If a suitable arrangement cannot be worked out between the student and the instructor, students and instructors should consult the appropriate department chair or director. If an additional appeal is needed, it may be taken to the Provost. -2. Assignment 13 diff --git a/Syllabus.Rproj b/Syllabus.Rproj new file mode 100644 index 0000000..8e3c2eb --- /dev/null +++ b/Syllabus.Rproj @@ -0,0 +1,13 @@ +Version: 1.0 + +RestoreWorkspace: Default +SaveWorkspace: Default +AlwaysSaveHistory: Default + +EnableCodeIndexing: Yes +UseSpacesForTab: Yes +NumSpacesForTab: 2 +Encoding: UTF-8 + +RnwWeave: Sweave +LaTeX: pdfLaTeX diff --git a/syllabus.Rproj b/syllabus.Rproj new file mode 100644 index 0000000..8e3c2eb --- /dev/null +++ b/syllabus.Rproj @@ -0,0 +1,13 @@ +Version: 1.0 + +RestoreWorkspace: Default +SaveWorkspace: Default +AlwaysSaveHistory: Default + +EnableCodeIndexing: Yes +UseSpacesForTab: Yes +NumSpacesForTab: 2 +Encoding: UTF-8 + +RnwWeave: Sweave +LaTeX: pdfLaTeX