LSA

Marco Kalz, M.A., of the Educational Technology Expertise Centre at the Open University of the Netherlands, informed me months ago that the university was organizing the 1st European Workshop on LSA in Technology-Enhanced Learning. Marco is part of the Scientific Committee that organized the event and is co-author of the workshop proceedings.

It is my pleasure to inform our readers that the event was a complete success. I will ask Marco for additional inside information, which we may include in the next issue of the IRW Newsletter.

Meanwhile, the full metadata record and Mini-Proceedings publication of the event are available online for your perusal.

The manuscripts are a goldmine for educators interested in implementing LSA in controlled, small collections, such as those found in many scholarly environments. Since these users have access to the entire “collection” to be decomposed via SVD, they can test and use the technique at will. [Note that this is different from SEOs or users sitting in front of a search box, or with access to only a few records, trying to manipulate LSI/SVD.]
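For readers who want to experiment with this themselves, here is a minimal sketch of the idea, assuming Python with scikit-learn installed; the toy documents and the choice of two latent dimensions are my own illustrative assumptions, not taken from any of the workshop papers:

    # Minimal LSA sketch: build a term-document matrix from a small,
    # fully accessible "collection" and decompose it with truncated SVD.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Toy collection (illustrative assumption)
    docs = [
        "students summarize texts and receive feedback",
        "feedback helps students understand the reading",
        "search engines index documents by keywords",
        "keywords and phrases drive document retrieval",
    ]

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)  # rows = documents, columns = terms

    # Truncated SVD projects each document into a low-dimensional latent space
    svd = TruncatedSVD(n_components=2, random_state=0)
    doc_vectors = svd.fit_transform(X)

    # Documents on the same topic land close together in that space,
    # even when they share few literal terms
    print(cosine_similarity(doc_vectors))

Because the whole collection is small and under one's control, the decomposition can be rerun after every change, which is precisely the kind of at-will experimentation described above.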

Indeed, more and more educators, curriculum developers, and administrators are now learning of the advantages of using LSA as an assessment or management resource.

The work of two LSA pioneers, Professors Tom K. Landauer and Peter Foltz, says it all. On page 13 of the Mini-Proceedings they describe “New LSA Education Applications at University of Colorado and Pearson Knowledge Technologies”. At first one might think theirs was a sales pitch presentation, but it was not. Indeed, they presented cutting-edge technologies developed and tested at UofC and Pearson. Here is a sample:

“LSA has been combined with other statistical analysis and machine learning methods to create a suite of complementary educational tools. These include:

• Summary Street presents texts, students summarize them in many fewer words and receive immediate feedback on how well they have understood and expressed the important aspects of the content of the reading.

• Summary Street has recently been combined with IEA in an integrated reading and writing literacy tutorial and assessment tool called WriteToLearn.

• SuperManual is an automatically produced digital instruction tool in which LSA makes learning objects easier to locate and understand by providing meaning-based search, summaries and optimum learning paths

• Standard Seeker automatically aligns instructional texts and test items with compendia of learning standards

• Career Map automatically matches educational and work experience with job and training programs.

• PKT automatic metadata tagger, annotates the content of learning object repositories in keywords that best express the central content of paragraphs or longer text using words that are not necessarily from the text itself, and with semantically most representative sentences and classifications into pre-specified categories

• Open-Cloze and Meaningful Sentences are new web-delivered and automatically scored constructed response reading and writing exercises with immediate feedback

• Team Communications and Knowledge Post are both LSA-based systems that “listen in” to communication generated during group training or learning activities and provide immediate and aggregated automatic mentoring, assessment and real-time moderator intervention.”

“In the current R&D pipeline are technologies to select the most important words for vocabulary instruction, ones to modify the reading difficulty of texts to suit individual students and techniques for choosing readings that will maximize growth of useful vocabulary, reading comprehension and writing ability. Fuller descriptions of a selection of these tools and evidence supporting their educational utility follows.”

Of course, I cannot reproduce the entire manuscript here, but there is no doubt that LSA goes beyond mere LSI (as LSA is known in information retrieval), wherein collections are just preindexed documents and the matrices to be decomposed are term-document arrays. As a matter of fact, the scope of LSA is not limited to search engines. In this sense it is a generalized, multidisciplinary technique.
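To make the distinction concrete, here is a rough sketch of LSA used for assessment rather than retrieval, in the spirit of the Summary Street tool described above; the background corpus, the sample texts, and the scoring-by-cosine approach are my own illustrative assumptions, not Pearson's actual algorithm:

    # Illustrative sketch only: scoring a student summary against a source
    # text by cosine similarity in an LSA space. This is NOT the real
    # Summary Street system, just the underlying idea.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    # Small background corpus that defines the semantic space (toy data)
    corpus = [
        "the water cycle moves water between the ocean and the sky",
        "evaporation turns liquid water into vapor",
        "clouds form when water vapor condenses",
        "rain returns water from clouds to the ground",
        "rivers carry rain water back to the ocean",
    ]

    source_text = "water evaporates, condenses into clouds, and falls as rain"
    student_summary = "the sun turns water into vapor that later comes down as rain"

    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(corpus + [source_text, student_summary])

    svd = TruncatedSVD(n_components=2, random_state=0)
    Z = svd.fit_transform(X)

    # The last two rows of Z are the source text and the summary
    score = cosine_similarity(Z[-2].reshape(1, -1), Z[-1].reshape(1, -1))[0, 0]
    print(f"summary-to-source similarity: {score:.2f}")

The point is that once texts live in the same latent space, similarity in that space can drive feedback, alignment with standards, or metadata tagging, not just ranked search results.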

The authors mention at the very end a recent publication. I predict this will quickly become a must-have/must-read textbook and reference on the bookshelf of educators, researchers, and, yes, search engine marketers/spammers: Handbook of Latent Semantic Analysis; Landauer, T. K., McNamara, D. S., Dennis, S., & Kintsch, W. (2007).

The authors conclude:

“There have been other applications as well, for example an experimental LSA-based method for automatic indexing of books by the central meaning of pages rather than by single words and phrases, used for the index of the new Handbook of Latent Semantic Analysis [5]. Overall, our experience has been that LSA offers an enormous spectrum of valuable opportunities for educational tools.”

Review at Amazon

You can buy the Handbook for $145. Here is an editorial review published on Amazon:

“The Handbook of Latent Semantic Analysis is the authoritative reference for the theory behind Latent Semantic Analysis (LSA), a burgeoning mathematical method used to analyze how words make meaning, with the desired outcome to program machines to understand human commands via natural language rather than strict programming protocols. The first book of its kind to deliver such a comprehensive analysis, this volume explores every area of the method and combines theoretical implications as well as practical matters of LSA.

Readers will be introduced to a powerful new way of understanding language phenomena, as well as innovative ways to perform tasks that depend on language or other complex systems. The Handbook clarifies misunderstandings and pre-formed objections to LSA, and provides examples of exciting new educational technologies made possible by LSA and similar techniques. It raises issues in philosophy, artificial intelligence, and linguistics, while describing how LSA has underwritten a range of educational technologies and information systems. Alternate approaches to language understanding are addressed and compared to LSA.

This work is essential reading for anyone—newcomers to this area and experts alike—interested in how human language works or interested in computational analysis and uses of text. Educational technologists, cognitive scientists, philosophers, and information technologists in particular will consider this volume especially useful.”

References

[1] Landauer, T. K., Laham, D., & Foltz, P. W. Automated scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis, M. D., & Burstein, J. (Eds.), Automated essay scoring: A cross-disciplinary perspective. Mahwah, NJ: Lawrence Erlbaum.

[2] Landauer, T. K., Laham, D., & Derr, M. (2004). From paragraph to graph: Latent semantic analysis for information visualization. Proceedings of the National Academy of Sciences, 101, 5214-5219.

[3] Foltz, P. W., Martin, M. J., Abdelali, A., Rosenstein, M. B., & Oberbreckling, R. J. (2006). Automated Team Discourse Modeling: Test of Performance and Generalization. In Proceedings of the 28th Annual Cognitive Science Conference.

[4] LaVoie, N., Streeter, L., Lochbaum, K., Boyce, L., Krupnick, C., & Psotka, J. Automating Expertise in Collaborative Learning Environments. International Journal of Computer-Supported Collaborative Learning. Submitted.

[5] Landauer, T. K., McNamara, D. S., Dennis, S., & Kintsch, W. (2007). Handbook of Latent Semantic Analysis. Mahwah, NJ: Lawrence Erlbaum.

Note: This is a legacy post originally published on 2007/04/03.
