LexTagAnalysis:index

From CommerceNet Wiki

Lexical Analysis of Tag Collections for the Improvement of Keyword Auto-Generation

By Kevin Hughes, Webmaster, CommerceNet
Originally presented at TagCamp, Palo Alto, CA, October 30, 2005
Later revised for the TagCamp follow-up at CommerceNet, November 3, 2005

Abstract: This presentation explores the results of a lexical analysis of various tag collections and of normal text. What can we learn from human-generated metadata to help make automatically generated metadata more usable, correct, efficient, and, most importantly, humane?

The source files and code related to this presentation can be found here:

This file contains maketags.php, which generates tags from a given body of text; wordinfo.php, which produces a number of statistics for a body of text; tagstats.xls, the numerical results from the experiments; and a directory of the sources I used (text and tag collections from around the Web). These programs require WordNet to run.

This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

Index

  1. Title Page
  2. Tags: Some Questions
  3. Lexical Analysis
  4. Everyday Text
  5. Tag Text
  6. "All Tags" vs "Popular Tags"
  7. Normal Text vs Tags
  8. Keyword Auto-Generation
  9. Results: Political Article
  10. Results: Alice In Wonderland
  11. Results: GCC Manual
  12. Issues
  13. Conclusions
  14. Calls For Action
  15. End Page