One of the workshops at the 16th International World Wide Web Conference (WWW2007), to be held in Banff, Alberta, Canada on May 8, is a workshop on Tagging and Metadata for Social Information Organization.
As content on the web grows, generated from many different sources, annotation and tagging of that content are also growing at a tremendous rate. Those tags can help people find and filter information on the web, and collaborative filtering can tell us something about the content found on the web and the ways that people use it.
I’ve included links to the papers here, as well as abstracts for the papers, and links to as many of the pages for the authors of those papers as I could find.
If you use digg, del.icio.us, Flickr, or other tagging and social networking sites, you may find a number of these papers interesting.
Due to the high popularity of social bookmarking systems, a large amount of metadata is available. Aggregating the metadata belonging to one user results in a user profile similar to those often used in Information Filtering. This paper shows how to create user profiles from tagging data. We present the Add-A-Tag algorithm for profile construction which takes account of the structural and temporal nature of tagging data. In addition, we explore ways of leveraging these user profiles. There are two main insights gained. Firstly, as we experienced in a small-scale user study, simply being able to view aggregated information about past tagging behavior was considered useful. Secondly, the user profile can be used to guide the user’s navigation, that is, to provide the user with personalized access to information resources.
See: The Add-A-Tag algorithm: Learning adaptive user profiles from tagging data for more, including a demo.
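To make the idea of a profile with "structural and temporal" aspects a little more concrete, here is a rough Python sketch. It is not the Add-A-Tag algorithm from the paper; it only assumes that a profile can be pictured as a weighted graph of tags that co-occur in a user's bookmarks, with older co-occurrences fading over time. The function name `build_profile` and the `decay` parameter are my own inventions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_profile(bookmarks, decay=0.5):
    """Build a toy tag profile from one user's bookmarks.

    bookmarks: list of tag lists, ordered oldest first.
    decay: hypothetical factor applied to all existing weights
           each time a new bookmark arrives (temporal aspect).
    Returns a dict mapping a sorted tag pair to its weight
    (structural aspect: which tags are used together).
    """
    weights = defaultdict(float)
    for tags in bookmarks:
        # Fade everything learned so far before adding new evidence.
        for pair in weights:
            weights[pair] *= decay
        # Strengthen the link between every pair of tags on this bookmark.
        for a, b in combinations(sorted(set(tags)), 2):
            weights[(a, b)] += 1.0
    return dict(weights)


profile = build_profile([["python", "web"], ["python", "web", "css"]])
```

In this toy version, pairs that keep showing up together stay heavy, while pairs the user has stopped using shrink toward zero.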
Social resource sharing systems like YouTube and del.icio.us have acquired a large number of users within the last few years. They provide rich resources for data analysis, information retrieval, and knowledge discovery applications. A first step towards this end is to gain better insights into content and structure of these systems. In this paper, we will analyse the main network characteristics of two of the systems. We consider their underlying data structures – so-called folksonomies – as tri-partite hypergraphs, and adapt classical network measures like characteristic path length and clustering coefficient to them.
Subsequently, we introduce a network of tag co-occurrence and investigate some of its statistical properties, focusing on correlations in node connectivity and pointing out features that reflect emergent semantics within the folksonomy. We show that simple statistical indicators unambiguously spot non-social behavior such as spam.
One of the sites mentioned in the paper is BibSonomy, which allows people to share bookmarks and lists of literature.
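A folksonomy can be pictured as a set of (user, tag, resource) triples, and a tag co-occurrence network can be derived from it. The Python sketch below is only an illustration under my own assumptions: two tags are counted as co-occurring when the same user attaches both to the same resource, and the function names are made up. The paper's actual definitions and measures may differ.

```python
from collections import Counter, defaultdict
from itertools import combinations


def tag_cooccurrence(assignments):
    """Count tag co-occurrences in (user, tag, resource) triples.

    Assumption: two tags co-occur when one user puts both on one resource.
    Returns a Counter mapping a sorted tag pair to its co-occurrence count.
    """
    posts = defaultdict(set)
    for user, tag, resource in assignments:
        posts[(user, resource)].add(tag)
    edges = Counter()
    for tags in posts.values():
        for a, b in combinations(sorted(tags), 2):
            edges[(a, b)] += 1
    return edges


def degrees(edges):
    """Number of distinct neighbours each tag has in the co-occurrence network."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


triples = [
    ("u1", "web", "r1"), ("u1", "tagging", "r1"),
    ("u2", "web", "r1"), ("u2", "tagging", "r1"),
    ("u2", "spam", "r2"),
]
edges = tag_cooccurrence(triples)
```

Statistics over such a network, for example how a tag's degree relates to the degrees of its neighbours, are the kind of simple indicators the abstract refers to.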
Emerging Motivations for Tagging: Expression, Performance, and Activism (pdf)
Alla Zollers (having problems connecting to her blog at http://blog.ayre.org/ – hopefully they are only temporary)
Social tagging systems have generally been designed and used for personal information organization and retrieval. People use a variety of sites to tag photos, websites, blogs, and videos. Recently, commercial websites such as Amazon.com have also implemented tagging on their websites. This type of tagging is not only social, where users can view others’ tags and resources, but collective or collaborative, where any user can tag any resource. By analyzing the tags of two sites that implement free-for-all tagging – Amazon.com and Last.fm – this paper describes emergent social motivations for tagging. The motivations that were found in the systems include expression, performance, and activism.
This paper outlines our experiences with applying collaborative tagging in e-learning systems to supplement more traditional metadata gathering approaches. Over the last 10 years, the learning object paradigm has emerged in e-learning and has caused standards bodies to focus on creating metadata repositories based upon strict domain-free taxonomies. We argue that the social collection phenomena and flexible metadata standards are key in collecting the kinds of metadata required for adaptable online learning. This paper takes a broad look at tagging within e-learning. It first looks at the implications for tagging within the domain through an analysis of tags students provided when classifying learning objects. Next, it looks at two case studies based on novel interfaces for applying tagging. These two systems emphasize tags being applied within learning content through the use of a highlighting metaphor.
Tag clouds provide an aggregate of tag-usage statistics. They are typically sent as in-line HTML to browsers. However, display mechanisms suited for ordinary text are not ideal for tags, because font sizes may vary widely on a line. As well, the typical layout does not account for relationships that may be known between tags. This paper presents models and algorithms to improve the display of tag clouds that consist of in-line HTML, as well as algorithms that use nested tables to achieve a more general 2-dimensional layout in which tag relationships are considered. The first algorithms leverage prior work in typesetting and rectangle packing, whereas the second group of algorithms leverage prior work in Electronic Design Automation. Experiments show our algorithms can be efficiently implemented and perform well.
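For readers who have not built one, this is roughly what the "typical" in-line HTML tag cloud looks like. The Python sketch below is the plain baseline that the paper sets out to improve, not one of its layout algorithms; the log scaling of font sizes and the pixel range are my own assumptions, and it expects every count to be at least 1.

```python
import math
from html import escape


def tag_cloud_html(counts, min_px=12, max_px=32):
    """Render tag counts as in-line HTML spans with log-scaled font sizes.

    counts: dict mapping tag -> positive usage count.
    min_px, max_px: hypothetical font size range in pixels.
    """
    if not counts:
        return ""
    lo = math.log(min(counts.values()))
    hi = math.log(max(counts.values()))
    span = hi - lo
    parts = []
    for tag in sorted(counts):  # alphabetical order, as most clouds use
        # Position of this tag between the rarest and the most used tag.
        scale = (math.log(counts[tag]) - lo) / span if span else 0.0
        size = round(min_px + scale * (max_px - min_px))
        parts.append(f'<span style="font-size:{size}px">{escape(tag)}</span>')
    return " ".join(parts)


html_cloud = tag_cloud_html({"flickr": 1, "tagging": 100})
```

Because each span sits on an ordinary text line, a single very large tag forces extra height onto its whole line, which is the kind of wasted space the abstract describes.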
SemKey: A Semantic Collaborative Tagging System (pdf)
Andrea Marchetti, Maurizio Tesconi, Francesco Ronzano, Marco Rosella, and Salvatore Minutoli
By analysing the current structure and the usage patterns of collaborative tagging systems, we can find out many important aspects which still need to be improved. Problems related to synonymy, polysemy, different lexical forms, misspelling errors or alternate spellings, different levels of precision and different kinds of tag-to-resource association cause inconsistencies and reduce the efficiency of content search and the effectiveness of the tag space structuring and organization. They are mainly caused by the lack of semantic information inclusion in the tagging process. We propose a new way to describe resources: semantic tagging. It allows users to state semantic assertions: each of them expresses a defined characteristic of a resource, associating it with a concept. We present SemKey, a semantic collaborative tagging system, describing its global architecture and functioning along with the most relevant organizational issues faced. We explore the adequacy of the support offered by the entries of Wikipedia and WordNet in order to access and reference concepts.
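One way to picture a semantic assertion is as a small record that ties a resource to a concept rather than to a bare word. The Python sketch below is purely illustrative and is not SemKey's data model: the `Assertion` class, the `has_topic` relation name, and the use of a Wikipedia URL as the concept identifier are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    resource: str  # URL of the resource being described
    relation: str  # hypothetical relation name, e.g. "has_topic"
    concept: str   # concept identifier, here assumed to be a Wikipedia URL
    label: str     # the word the user actually typed


def resources_about(assertions, concept):
    """Find resources linked to a concept, whatever word each user typed."""
    return sorted({a.resource for a in assertions if a.concept == concept})


NYC = "https://en.wikipedia.org/wiki/New_York_City"
assertions = [
    Assertion("http://example.com/a", "has_topic", NYC, "NYC"),
    Assertion("http://example.com/b", "has_topic", NYC, "New York"),
]
found = resources_about(assertions, NYC)
```

Since both labels point at the same concept, a search by concept finds both resources, which is how this approach sidesteps the synonymy problem mentioned in the abstract.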
Towards Federated Web 2.0 Sites: The TAGMAS Approach (pdf)
Jon Iturrioz, Oscar Diaz, and Cristobal Arellano
The success of Web2.0 is draining users’ resources from the desktop to the Web. An increasing number of users are keeping their pictures at Flickr, their bookmarks at del.icio.us, their documents at googleDocs and so on. There are important advantages to be gained, but this dissemination of users’ resources should go hand in hand with tooling that permits users to keep a global view of their resources regardless of where they are kept. Unfortunately, heterogeneity in APIs, tag conventions and message protocols hinders interoperability. Consequently, this work promotes a loosely coupled federated view of Web2.0 sites which powers traditional desktops with tagging and searching capabilities that expand over the desktop folders to transparently account for Web2.0 sites. This federation is achieved on a per-user basis: the Web2.0 sites to be integrated are those that keep resources of the user at hand. The paper introduces the current status of TAGMAS, a TAG MAnagement System that provides an interface to deal with multiple, autonomous Web2.0 sites from the desktop.