Data Mining Applications in Higher Education
Source: www.spss.com
Copyright SPSS, Inc. 2004
Part 1 of 3
Case study one: creating meaningful learning outcome typologies
Challenge
"What do institutions know about their students?" If the answer is a recital of
the percentages of enrollment by gender or some other counts, then institutions truly do
not know their students. This case study shows how unsupervised data mining enables
suburban community colleges to establish learning outcome typologies for their students.
In a typical suburban community college with an enrollment of 15,000, students are
traditionally identified as "transfer-oriented," "vocational education
directed," or "basic skill upgraders."
However, these identifications are all based on students' initial declaration of
educational goals at enrollment. These groupings are very inclusive classifications, but
they do little to explain the vast differences among students within each type.
Solution
To establish appropriate typologies for 15,000 students, researchers used both TwoStep and
K-means, two powerful clustering algorithms. The first attempt was made using the
aforementioned general groupings of "transfer," "vocational," and "basic skills." The
results were mixed. The boundaries among clusters were unclear and dispersed, reflecting a
poor fit between the feature vectors and their cluster centroids. Repeated testing on
holdout datasets and the removal of suspected outliers did not improve the results much.
It is possible that the students' declared goals did not dictate their academic
behavior. Therefore, an alternative approach was used that looked at two elements:
educational outcomes in combination with lengths of study.
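As a rough illustration of clustering on these two elements, the sketch below runs a
plain k-means (Lloyd's algorithm) over a handful of made-up student records. Everything
in it is an assumption for illustration: the two features (share of attempted units
completed, number of terms enrolled), the nine records, the min-max scaling, and the
farthest-point starting centroids, which are used here only so the toy run is repeatable.
It is not the TwoStep or K-means procedure the researchers actually ran.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid dividing by zero
    return [tuple((v - lo) / sp for v, lo, sp in zip(row, lows, spans)) for row in rows]


def farthest_point_init(points, k):
    """Deterministic start: first record, then repeatedly the record farthest
    from all centroids chosen so far (an assumption made for repeatability)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each record goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical records: (share of attempted units completed, terms enrolled).
records = [
    (0.95, 4), (0.90, 5), (0.92, 4),     # finished quickly
    (0.85, 14), (0.80, 16), (0.88, 15),  # enrolled for a long time
    (0.10, 1), (0.15, 2), (0.05, 1),     # left early
]
labels, centroids = kmeans(min_max_scale(records), k=3)
```

In this toy data the three groups are well separated, so the records in each commented
group end up sharing a cluster label. Real enrollment data would be far less tidy, which
is why the choice of features and of the number of clusters matters so much.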
Defining educational outcomes is easier said than done. First, enough time must pass to
conclude that a student has reached a certain milestone. Second, dropping out is itself
an outcome. Further work was then needed to determine length of study, which required
decisions on how to treat "stopouts," students who left for a while and came back later
for more.
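One way to make the stopout decision concrete is to summarize each student's enrollment
history with several measures rather than one. The sketch below is a minimal example
under stated assumptions: the function name study_length, the integer term indices, and
the four summary fields are all hypothetical and are not taken from the study.

```python
def study_length(enrolled_terms):
    """Summarize one student's enrollment history.

    enrolled_terms: integer term indices (0 = first term on record) in which
    the student took at least one course. This encoding is an assumption.
    """
    terms = sorted(set(enrolled_terms))
    if not terms:
        raise ValueError("student has no enrolled terms")
    # A stopout is any run of one or more skipped terms between two enrolled terms.
    gaps = [b - a - 1 for a, b in zip(terms, terms[1:]) if b - a > 1]
    return {
        "terms_enrolled": len(terms),               # terms actually attended
        "elapsed_terms": terms[-1] - terms[0] + 1,  # first to last term, gaps included
        "stopout_count": len(gaps),
        "longest_gap": max(gaps, default=0),
    }


# A student who attended three terms, skipped three, then returned for two more.
summary = study_length([0, 1, 2, 6, 7])
# summary == {"terms_enrolled": 5, "elapsed_terms": 8, "stopout_count": 1, "longest_gap": 3}
```

Whether "length of study" should mean terms attended or terms elapsed is exactly the kind
of judgment call described above; keeping both lets the analyst try each definition.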
All of these decisions test one's domain knowledge. There are no absolutely right or absolutely
wrong typologies. They are all relative, giving new meaning to unsupervised data mining. A
typology is a good one if it serves a particular business or scientific research
objective.
After dealing with outliers (cases that do not appear to belong to any group) by either
finding them a home (cluster) or removing them, TwoStep produced the following clusters:
"transfers," "vocational students," "basic skills students,"
"students with mixed outcomes," and "dropouts." K-means validated
these clusters. Introducing the element of length of study then gave new dimensions
to each of the clusters. Some transfers blazed through their studies in no time; some
vocational students took longer; and others appeared to be happy taking a course or two
for no particular purpose.
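The two options for outliers mentioned above, finding them a home or removing them, can
be sketched as a single distance rule. This is only an illustration: the function name
place_or_drop, the Euclidean distance, and the max_distance cutoff are assumptions, and
the source does not say how the researchers decided which cases to keep.

```python
import math


def place_or_drop(records, centroids, max_distance):
    """Give each record a home in its nearest cluster, or drop it.

    A record is dropped when even its nearest centroid is farther away than
    max_distance (a hypothetical cutoff the analyst would have to choose).
    Returns (assignments, dropped): a dict of record index -> cluster index,
    and a list of dropped record indices.
    """
    assignments = {}
    dropped = []
    for i, rec in enumerate(records):
        dists = [math.dist(rec, c) for c in centroids]
        nearest = min(range(len(centroids)), key=dists.__getitem__)
        if dists[nearest] <= max_distance:
            assignments[i] = nearest
        else:
            dropped.append(i)
    return assignments, dropped


# Two made-up centroids and three made-up records; the last one fits neither cluster.
example_centroids = [(0.0, 0.0), (1.0, 1.0)]
example_records = [(0.1, 0.0), (0.9, 1.0), (5.0, 5.0)]
assignments, dropped = place_or_drop(example_records, example_centroids, max_distance=0.5)
# assignments == {0: 0, 1: 1}; dropped == [2]
```

A looser cutoff homes more cases at the cost of blurrier clusters, while a tighter one
removes more students from the analysis, so the value is itself a domain decision.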
Results
Data mining, combined with student demographics and other information, helped colleges
better describe the clusters. For example, certain older students tended to take their
time, while younger students with better socioeconomic backgrounds often picked
high-credit courses to graduate quickly. The most interesting part of classification is
naming these typologies. For example, we used the term "transfer speeders" to describe those
who piled up their units quickly, and "college historians" to describe those who
have been taking classes forever. Others are "fence-sitters," "skill
upgraders," etc.
Typologies like the above are important because they reach beyond conventional student
profiling. They provide a way to identify homogeneous groups of students, thus increasing
the accuracy of predictive modeling algorithms. Even if data mining ceases after having
found sensible typologies, the knowledge of the newly discovered patterns and
relationships helps college teachers and administrators better meet the needs of various
student groups.