Please use this identifier to cite or link to this item: https://doi.org/10.25540/941F-64FU
Title: About.Me Dataset for User-centric OSN Analysis
Creators: Bang Hui Lim
Dongyuan Lu
Tao Chen
Min-Yen Kan
NUS Contact: Min-Yen Kan
Subject: Twitter
Media
YouTube
Couplings
Joining processes
User interfaces
Social networking (online)
User behaviour
Online Social Networks
Cross-sharing
Instagram
#mytweet
Multiple online social networks
OSN
Flickr
Google+
Tumblr
DOI: doi:10.25540/941F-64FU
Description: 

A user-centric study of behavior across Online Social Networks (OSNs) requires a collective study of many individual users, each of whom uses multiple OSNs. We purposefully sidestep the issue of user linkage to focus our attention on user behavior.

We leverage an OSN aggregation service called About.me, which enables people to easily create a public online identity that unifies a self-described short biography with prominent links to the person’s other OSN accounts and personal websites.

Using the About.me API, we collected a set of 180,000 registered user profiles and then kept only the users who link to at least 4 of 6 OSNs (i.e., Flickr, Google+, Instagram, Tumblr, Twitter and YouTube). These six OSNs were chosen because they represent the breadth of media and functionality common in today’s Web 2.0 ecology, and because they expose most user information publicly through an API. These selection criteria resulted in 15,595 users in our dataset.
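
The selection rule described above can be illustrated with a minimal Python sketch. It is not the authors' collection code: the profile dictionaries, their keys, and the `links_enough_osns` helper are hypothetical and only show the "at least 4 of 6 OSNs" filter on toy data.

```python
# Hypothetical keys naming the six OSNs; the real About.me API fields may differ.
OSNS = ("flickr", "googleplus", "instagram", "tumblr", "twitter", "youtube")


def links_enough_osns(profile, minimum=4):
    """Return True if the profile links to at least `minimum` of the six OSNs."""
    linked = sum(1 for osn in OSNS if profile.get(osn))
    return linked >= minimum


# Toy profiles (invented for illustration, not taken from the dataset).
profiles = [
    {"name": "alice", "twitter": "alice_t", "flickr": "alice_f",
     "instagram": "alice_i", "tumblr": "alice_tb"},
    {"name": "bob", "twitter": "bob_t", "youtube": "bob_y"},
]

# Keep only the profiles that satisfy the selection criterion.
selected = [p for p in profiles if links_enough_osns(p)]
```

In this toy example only the first profile is kept, because it links to four of the six OSNs while the second links to two.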

With these 15,595 users and their user identities in the six OSNs, we crawled each user’s publicly accessible activities in the six OSNs via the respective APIs on 15 August 2013. Since all of the data we have obtained is public, and since we believe that the compiled dataset is a valuable resource for studying multiple OSN behavior, we have released our dataset for others to conduct further study. As required by most OSNs, we are not able to distribute the actual content of a post. Following other public social datasets, we release the post ID and its user identity instead.

If you use our dataset, please cite our ASONAM 2015 paper, listed under Citation below.

"about_me.sql" (727K) is a MySQL dump file that contains each user's profile name in About.me and their user identities in the six OSNs.

The following six dump files contain the post IDs and their user identities in the corresponding OSN. You can use the user identities in about_me.sql to link a user's activities across multiple social networks. With the post IDs, you can further pull the actual posts via the respective APIs. Read our tutorial on using these APIs.

  • flickr.sql (53MB)
  • googleplus.sql (34MB)
  • instagram.sql (19MB)
  • tumblr.sql (49MB)
  • twitter.sql (479MB)
  • youtube.sql (19MB)
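
As a rough illustration of linking activities through the user identities, the sketch below joins a profile table to two per-OSN post tables. Everything about the schema is an assumption: the table and column names (`about_me.profile_name`, `twitter_id`, `instagram_id`, `post_id`, `user_id`) are hypothetical and should be replaced with those found in the actual dumps. It also uses Python's built-in `sqlite3` with toy rows so that it runs without a MySQL server, whereas the released files are MySQL dumps.

```python
import sqlite3

# In-memory database with a HYPOTHETICAL schema and invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE about_me (profile_name TEXT, twitter_id TEXT, instagram_id TEXT);
CREATE TABLE twitter (post_id TEXT, user_id TEXT);
CREATE TABLE instagram (post_id TEXT, user_id TEXT);
INSERT INTO about_me VALUES ('alice', 'tw_1', 'ig_1'), ('bob', 'tw_2', NULL);
INSERT INTO twitter VALUES ('t100', 'tw_1'), ('t101', 'tw_1'), ('t200', 'tw_2');
INSERT INTO instagram VALUES ('i100', 'ig_1');
""")


def post_ids_for(profile_name):
    """Return sorted (osn, post_id) pairs for one About.me profile."""
    query = """
        SELECT 'twitter', t.post_id
        FROM about_me a JOIN twitter t ON t.user_id = a.twitter_id
        WHERE a.profile_name = ?
        UNION ALL
        SELECT 'instagram', i.post_id
        FROM about_me a JOIN instagram i ON i.user_id = a.instagram_id
        WHERE a.profile_name = ?
    """
    rows = conn.execute(query, (profile_name, profile_name)).fetchall()
    return sorted(rows)


alice_posts = post_ids_for("alice")
bob_posts = post_ids_for("bob")
```

The resulting post IDs are what one would then pass to the respective OSN APIs; that retrieval step is not shown here because it depends on each API's current terms and endpoints.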

For more details, please visit https://github.com/WING-NUS/aboutme

Related Publications: https://doi.org/10.1145/2808797.2808820
10635/137541
Citation: When using this data, please cite the original publication and also the dataset.
  • B. H. Lim, D. Lu, T. Chen, and M.-Y. Kan, “#mytweet via Instagram,” in Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 - ASONAM ’15, 2015, pp. 113–120.
  • Bang Hui Lim, Dongyuan Lu, Tao Chen, Min-Yen Kan (2017-11-13). About.Me Dataset for User-centric OSN Analysis. ScholarBank@NUS Repository. [Dataset]. https://doi.org/10.25540/941F-64FU
License: Attribution-NonCommercial 4.0 International
http://creativecommons.org/licenses/by-nc/4.0/
Appears in Collections: Staff Dataset

Files in This Item:

  • about_me.sql (1.47 MB; format: Unknown; access: Open): MySQL dump file that contains each user's profile name in About.me and their user identities in the six OSNs.
  • googleplus.sql (229.41 MB; format: Unknown; access: Open)
  • flickr.sql (338.76 MB; format: Unknown; access: Open)
  • instagram.sql (79.9 MB; format: Unknown; access: Open)
  • tumblr.sql (250.09 MB; format: Unknown; access: Open)
  • youtube.sql (95.58 MB; format: Unknown; access: Open)
  • twitter.sql (2.45 GB; format: Unknown; access: Open)

This item is licensed under a Creative Commons License.