Learning Analytics MOOC – Week 4 – SNA Case Studies

On a friend’s recommendation, I read Dave Eggers’ book „The Circle“ this week, and it had quite an effect on me, especially while taking a MOOC on SNA at the same time… Because of that, and because of a heavy workload at my job, I postponed engaging with DALMOOC from the evenings to this weekend. I was really interested in this week’s hangouts and watched the recordings (http://www.youtube.com/watch?v=GUUaP39VpLI and http://www.youtube.com/watch?v=ziM0EvN9n0o), which were fascinating and brought a further level to my attention: „that we as participants are object of study as well“. Personally, I find it a little bit scary that my social media and learning activity is under such scrutiny. Certainly, I knew before the course that this would be the case – learning analytics naturally means monitoring and drawing conclusions – but it’s different when it’s for real. However, at this moment I don’t yet know enough about what is possible via Learning Analytics and what is commonly in use. Maybe it’s inevitable, and in five years or so it will be quite normal, but I think we as educators really have to think about which tools we use and how we use them. We have to be very, very careful with our learner data at our universities.

SNA Case Studies
This week, we got an insight into some SNA case studies and the educational constructs which were used (e.g. learning design, sense of community, creative potential, social presence, academic performance, distributed MOOC pedagogy). Thankfully, the full texts were available, and I’ll keep these examples in mind for further reading later on. SNA can help to detect patterns of interaction in online learning environments, and instructors could start intervening depending on the intended learning design (e.g. from an instructor-centered network to an „equal distribution of student distributions“). Sense of community describes the extent to which learners feel that they belong to a community and, in turn, benefit from participating in it (getting information, getting feedback; this is also relevant for student retention at universities). The aspects creative potential („network brokers are associated with achievement and creativity“), social presence („teaching presence facilitates the development of network centrality by guiding students to establish social presence“) and academic performance („those students who were central in cross-class networks had best academic performance“) were also covered in studies, but I only watched the intro videos.
In order to learn more about cMOOCs (where social media is an integral part), this article about Twitter use in CCK11 (socio-technical approach: nodes are persons and hashtags) is very interesting: http://www.sfu.ca/~dgasevic/papers_shared/bjet2014_cmoocs.pdf

Unfortunately I don’t have the time for further experiments with Gephi (which I would really have liked to do because I value that we are doing something concrete with tools) and week 5 with different topics is near.

Learning Analytics MOOC – Week 3 – SNA

My goal with these blog posts is to summarize and reflect a little on the content I’ve learned – my blog seems to be a good way to keep this for later, after the MOOC.

Week 3 is an introduction to Social Network Analysis (SNA) and gives insights into how social processes unfold. „SNA aims to understand the determinants, structure, and consequences of relationships between actors“ (Source: http://www.lifescied.org/content/13/2/167.full.pdf+html). SNA is multidisciplinary (not only sociology and statistics), and the main analysis methods are density, centrality and modularity analyses. We’ll do some analysis with test data and again some visualization, this time with Gephi. The interesting question will be what use SNA has for learning (I’m not there yet).

Networks consist of actors (=nodes) and relationships/connections, which can represent friendship, advice, hindrance or communication. In a spreadsheet, nodes and relationships would be represented as rows (a heavier weight can be expressed by adding the same row as many times as needed). Data can be collected through self-reports, interviews, social networks (who is following whom on Twitter etc.) and special tools which collect data from an LMS (activity in online discussion boards, …), and can later be analyzed in tools like Gephi. As these networks are seldom static, you have to decide on a time frame when collecting data. Also important: anonymizing data, obtaining consent (which may lead to incomplete networks), and ethics.
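To make the spreadsheet idea concrete, here is a minimal Python sketch. It is not from the course material; the names and rows are made up, and it assumes that every row is one observed interaction, so that repeated rows add up to the weight of a tie.

```python
from collections import Counter

# Hypothetical edge list: each row is one observed interaction (source, target).
# A pair that interacts several times simply occupies several rows.
rows = [
    ("anna", "ben"),
    ("anna", "ben"),
    ("ben", "carla"),
    ("carla", "anna"),
    ("anna", "ben"),
]

# Collapse repeated rows into one weighted edge per (source, target) pair.
weights = Counter(rows)

# The node list is every actor that appears as source or target.
nodes = sorted({name for row in rows for name in row})

for (source, target), weight in sorted(weights.items()):
    print(f"{source} -> {target} (weight {weight})")
```

A node table and a weighted edge table of this kind are roughly what one would later import into an SNA tool.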

 

Network Measures

As I understand it, this very informative YouTube video by Dragan Gasevic (http://www.youtube.com/watch?v=Gq-4ErYLuLA) lists network measures as follows:

a) Measures which are measuring the entire network:

* Diameter = „a measure which is determining the longest distance between any pair of two nodes in the network“

* Density = „is determining the potential of the entire network to talk to each other“ (how many connections of all the possible connections are actually happening)
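As a rough illustration of these two whole-network measures, the following Python sketch computes them for a tiny, made-up undirected network. The graph and the function names are assumptions for this example only, and the diameter function assumes that the network is connected.

```python
from collections import deque

# Hypothetical undirected network, stored as an adjacency mapping.
graph = {
    "a": {"b"},
    "b": {"a", "c", "d"},
    "c": {"b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

def density(graph):
    """Share of all possible undirected ties that actually exist."""
    n = len(graph)
    if n < 2:
        return 0.0
    # Every tie is stored twice (once per endpoint), hence the division by 2.
    edges = sum(len(neighbours) for neighbours in graph.values()) / 2
    return edges / (n * (n - 1) / 2)

def distances_from(graph, start):
    """Breadth-first search: steps from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def diameter(graph):
    """Longest shortest path between any pair of nodes (connected graph assumed)."""
    return max(max(distances_from(graph, node).values()) for node in graph)

print("density:", density(graph))
print("diameter:", diameter(graph))
```

Here 5 of the 10 possible ties exist, and the longest shortest path runs from "a" to "e" in three steps.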

b) Measures which are measuring the potential of individual nodes in a network:

* Concept of Centrality: (What „central“ means depends on which metric is used)

** Degree centrality =  A very often used measure which „indicates the total number of connections for each actor in a network“

*** In-Degree centrality = Pointing to an actor / „how many other nodes are directly trying to establish communication or are talking  to a particular node“  (popularity, prestige)

*** Out-Degree centrality = Pointing away from an actor / „outgoing connections, may mean how many emails someone sent, generosity in conversations with others“  (gregariousness)

** Betweenness centrality = „measure which indicates the ease of connection with anybody else in the network but in particular to try to connect all these potentially small subclusters of the nodes“ (network broker)

** Closeness centrality = „used to measure the ease or the shortest distance of a node to anybody else in the network (indicates how quickly you can get to anybody else in the network, not useful for networks with many actors with no ties or groups with no connection to other groups)“
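The centrality measures above can be sketched in plain Python as follows. This is only an illustration under several assumptions: the reply data and the graph are made up, closeness and betweenness are computed on an undirected version of the network, the values are not normalized, and betweenness uses Brandes' algorithm for unweighted graphs. Tools such as Gephi may scale their results differently.

```python
from collections import Counter, deque

# Hypothetical directed ties: (sender, receiver) of a forum reply.
replies = [("a", "c"), ("b", "c"), ("c", "d"), ("e", "d"), ("f", "d"), ("d", "c")]
out_degree = Counter(sender for sender, _ in replies)      # gregariousness
in_degree = Counter(receiver for _, receiver in replies)   # popularity

# Hypothetical undirected network: two triangles joined by the tie c-d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}

def distances_from(graph, start):
    """Breadth-first search: steps from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness(graph, node):
    """Number of reachable other nodes divided by the sum of distances to them."""
    dist = distances_from(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs (not normalized)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # BFS that counts shortest paths (sigma) and remembers predecessors.
        order, dist = [], {source: 0}
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}
        sigma[source] = 1
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes and accumulate dependencies.
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each end.
    return {node: value / 2 for node, value in score.items()}

print("in-degree:", dict(in_degree))
print("out-degree:", dict(out_degree))
print("closeness of c:", closeness(graph, "c"))
print("betweenness:", betweenness(graph))
```

In this toy network, "c" and "d" are the network brokers: every shortest path between the two triangles has to pass through them, so they get the highest betweenness while all other nodes get zero.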

It can also be interesting to think about network modularity, i.e. smaller subgroups that are closely connected to each other (modules=communities). Before using modularity algorithms later on, it is relevant to identify the „giant component“ and use it as a filter.
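The filter step can be pictured with a short Python sketch. It assumes that „giant component“ means the largest group of nodes that are all reachable from each other; the graph is made up, and the sketch only shows the filtering, not a modularity algorithm.

```python
# Hypothetical undirected network with one large group, one pair and one isolate.
graph = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"},
    "e": {"f"}, "f": {"e"},
    "g": set(),
}

def connected_components(graph):
    """Return the sets of nodes that are mutually reachable."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        components.append(component)
    return components

# The giant component is the largest of these groups.
giant = max(connected_components(graph), key=len)

# Use it as a filter: keep only its nodes and the ties among them.
filtered = {node: graph[node] & giant for node in giant}

print(sorted(giant))
```

Small groups and isolated nodes are dropped, so a community detection step afterwards works only on the connected core of the network.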

 

About Gephi
Installation was easy, but when I played around with the test data we were given, I couldn’t even find the zoom function, so I watched this YouTube video, which was very helpful for getting an overview of how to use Gephi (17 min very well spent): http://youtu.be/L0C_D68E1Q0

As I haven’t done anything like this before, I limited my tests with the example blog dataset of week 6 to the „Average Degree“ and tried to find something useful. My results are in the attached PDF file: w3-gephi-2

I’m looking forward to week 4 – maybe the Hangout times of day will be a little bit more convenient for Middle Europe again. And I still have to try Bazaar (I really want to do that), but again, this week, I had no time for that.

Learning Analytics MOOC – Week 2

The topic of this week is the „Learning Analytics Cycle“ and conducting some basic analytics with our test access to the Tableau software. The YouTube video by George Siemens about the Data/Analytics Cycle was very helpful, as was the Google Hangout with Tony Hirst about „Data Wrangling“. I also attended the Google Hangout on Wednesday and I am very impressed by the commitment of the course facilitators – thank you!

My interpretation of the Learning Analytics Cycle consists of these steps:
1. Data collection, Data Acquisition and Storage (Data is generated by or about the learners: Sources can be LMS, Student information systems, Social Media… any interaction between Learner & Institution)
2. Dataset Cleaning (missing data, different spelling of names,…)
3. Analysis & Visualization
4. Action (Intervention, Optimization,… and back to the learners)
That means the process starts with the learners and ideally the cycle / loop closes with feedback of the intervention to the learners.
When we look at data, we can count and sort, and therefore get different sorts of charts from the same data. As to interpretation, these aspects are relevant: looking for outliers, looking for similarities and differences, looking for trends, looking for patterns & structure.
Certainly, you have to think about what you would like to know when you do the data collection and not only when you do the analytics & visualization.
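The cleaning and counting steps of the cycle can be sketched in a few lines of Python. The log entries below are invented, and the cleaning rule (trim spaces, unify upper and lower case) is only one assumption about what „different spelling of names“ might involve; real data would need more care.

```python
from collections import Counter

# Hypothetical raw activity log: one learner name per forum post.
# It contains missing entries and different spellings of the same name.
raw_log = [
    " Anna Meyer",
    "anna meyer",
    "Ben  Schulz",
    None,
    "ANNA MEYER",
    "Carla Weber",
    "ben schulz",
    "",
]

def clean_name(name):
    """Collapse repeated spaces and unify capitalization."""
    return " ".join(name.split()).title()

# Step 2 (cleaning): drop missing entries, normalize the rest.
cleaned = [clean_name(name) for name in raw_log if name and name.strip()]

# Step 3 (analysis): count posts per learner and sort by activity.
counts = Counter(cleaned)
ranking = counts.most_common()

for learner, posts in ranking:
    print(f"{learner}: {posts} posts")
```

The same counts could then feed a bar chart, and a learner with very many or very few posts would stand out as a possible outlier.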

My tests with the Tableau Software:
After having registered with Tableau a lot of times –  at first to get a test version of the software (thankfully in this MOOC we get an extended test period), then to actually start it after installation and then even to get the video tutorials – I watched the „Getting started“ video (20 minutes) and was really impressed with the variety of functions.

As I don’t have a set of educational data which I could use for testing, I had to be somewhat creative and use a different kind of dataset. At our University (and in Germany) we have very strict regulations for the use of user data and logfiles, so my example doesn’t have anything to do with educational data but with recreational data… But my goal at the moment was to play around with the Tableau software in order to get used to working with table cells, rows and visualizations, and I am satisfied with my results.

I spent more time with the MOOC this week than I planned because it was fun and creating artifacts is really very time-consuming. I am looking forward to week 3 and hopefully, I’ll find the time to try Bazaar meetings.

 

Learning Analytics MOOC – Week 1

The Horizon Report Higher Education 2014 places Learning Analytics in the Time-to-Adoption Horizon „One Year or Less“ (in the 2013 report it was „Two to Three Years“), so LA is quite an interesting topic for educators.

However, Learning Analytics (LA) is a term which needs definition – The Society for Learning Analytics Research defines it as „the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs“.

The slide at 3:00 min shows that it’s a good idea to know more about LA: „Data trails reveals our sentiments, our attitudes, our social connections, our intentions, what we know, how we learn and what we might do next“.

I really liked the link to the full-text article „Educational Data Mining and Learning Analytics“ (Baker & Siemens 2014), as it gives an introduction to and overview of the field (graduate programs, journals, conferences, methods & tools, differences between the Educational Data Mining, EDM, and LA research communities). The reasons for the growing use of LA are cited as „a substantial increase in data quantity, improved data formats, advances in computing, and increased sophistication of tools available for analytics“.

What about software (analytics / research tools)? I try to remember that for single functionality there are NodeXL and Gephi, whereas integrated suites would be SAS, IBM BI Analytics suite and Pentaho. Open Source tools are R and Weka, and in our course we will focus on Tableau, Gephi, RapidMiner and LightSide. I intentionally skipped doing a tool matrix because at the moment I don’t feel competent enough and other things were more important to me (I decided that it’s the kind of MOOC where I choose my learning goals).

In week 1, I spent about 5 hours with the MOOC: at first looking at the course / resources / activities in edX (plus joining one of the live hangouts until midnight local time on Tuesday) and then signing in to ProSolo. My first impression of ProSolo was that it wasn’t very intuitive, so in week 2 I’ll have a closer look at what the menus „plan, learn“, „goals, competences, activities“ mean.

I look forward to week 2 of DALMOOC and I’m curious what we will do with the Tableau software.

DALMOOC – My first steps

The official start of this MOOC on „Data, Analytics and Learning“ (DALMOOC) is on Monday 20th, but a lot of activity has already taken place, e.g. two Google hangouts – at not such perfect times for Middle Europe… Therefore I watched the archive of the Course Design Explanation session (October 17th) on YouTube http://youtu.be/b2gSd6oxEBM which was quite interesting.

Why I joined this course

I already looked at the course in edx.org (whose user interface I already know very well) and was surprised that the first thing I saw in the Courseware section was a special DALMOOC Course agreement about participation in a study – I am still not sure what difference it makes whether I say yes or just ignore it (which I did for now). Afterwards I watched the „How DALMOOC works“ video, which emphasized that „It’s about connections & creation not content“ and that it might be a little disorienting at the start for someone who has taken MOOCs that were more structured. For DALMOOC in edx.org, 4 points were mentioned to help orientation: the visual syllabus, a daily email, hangouts/recordings and edX forums.

Course Tools
In the course, many tools beyond edX will be used – for my orientation I add these URLs in the blog post:

I am looking forward to the official start of DALMOOC on Monday. I don’t know yet how much time I will spend, because the course runs parallel to the busiest time of year, the beginning of the winter semester at our University. Besides the edx.org course portal https://www.edx.org/course/utarlingtonx/utarlingtonx-link5-10x-data-analytics-2186 I will have a look at what happens on Twitter (#dalmooc https://twitter.com/dalmooc ) and maybe Google+ https://plus.google.com/106806974410074176435/posts