Please use this identifier to cite or link to this item:
Title: About.Me Dataset for User-centric OSN Analysis
Creators: Bang Hui Lim
Dongyuan Lu
Tao Chen
Subject: Twitter
Joining processes
User interfaces
Social networking (online)
User behaviour
Online Social Networks
Multiple online social networks
DOI: 10.25540/941F-64FU

A user-centric study of cross-Online Social Network (OSN) behavior requires a collective study of many individual users, each of whom uses multiple OSNs. We purposefully sidestep the issue of user linkage to focus our attention on user behavior.

We leverage an OSN aggregation service called About.me, which enables people to easily create a public online identity that unifies a self-described short biography with prominent links to the person’s other OSN accounts and personal websites.

Using the About.me API, we collected a set of 180,000 registered user profiles, and further limited the set to users who link to at least 4 of 6 OSNs (i.e., Flickr, Google+, Instagram, Tumblr, Twitter and YouTube). These six OSNs were chosen because they represent the breadth of media and functionality common in today’s Web 2.0 ecology, and because they expose most user information publicly through an API. These selection criteria resulted in 15,595 users in our dataset.
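The selection criterion above (keep a profile only if it links to at least 4 of the 6 OSNs) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' collection code: the dictionary layout of a profile, the key names, and the example records are all assumptions made for this example.

```python
# The six OSNs named in the text; the lowercase key names are hypothetical.
TARGET_OSNS = ("flickr", "googleplus", "instagram", "tumblr", "twitter", "youtube")
MIN_LINKED = 4  # threshold stated in the text


def linked_osn_count(profile):
    """Count how many of the six target OSNs a profile links to.

    `profile` is a hypothetical dict mapping an OSN name to the user's
    identity on that OSN (missing or None when no link exists).
    """
    return sum(1 for osn in TARGET_OSNS if profile.get(osn))


def select_profiles(profiles):
    """Keep profiles that link to at least MIN_LINKED of the target OSNs."""
    return [p for p in profiles if linked_osn_count(p) >= MIN_LINKED]


# Made-up example records, not taken from the dataset.
example_profiles = [
    {"name": "alice", "flickr": "a1", "instagram": "a2", "tumblr": "a3", "twitter": "a4"},
    {"name": "bob", "twitter": "b1", "youtube": "b2"},
]
selected = select_profiles(example_profiles)
print([p["name"] for p in selected])
```

Here only the first record passes the filter, since the second links to just two of the six OSNs.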

With these 15,595 users and their user identities in the six OSNs, we crawled each user’s publicly accessible activities in the six OSNs via the respective APIs on 15 August 2013. Since all of the data we have obtained is public, and since we believe that the compiled dataset is a valuable resource for studying multiple-OSN behavior, we have released our dataset for others to conduct further study. As required by most OSNs, we are not able to distribute the actual content of a post. Following other public social datasets, we release the post ID and its user identity instead.

If you use our dataset, please cite our ASONAM 2015 paper, listed under Related Publications.

"about_me.sql" (727K) is a MySQL dump file that contains each user's profile name in About.me and their user identities in the six OSNs.

The following six dump files contain the post IDs and their user identities in the corresponding OSN. You can use the user identities in about_me.sql to link a user's activities across multiple social networks. With the post IDs, you can further pull the actual posts via the respective APIs. Read our tutorial on using these APIs.

  • flickr.sql (53MB)
  • googleplus.sql (34MB)
  • instagram.sql (19MB)
  • tumblr.sql (49MB)
  • twitter.sql (479MB)
  • youtube.sql (19MB)
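The linking step described above, using the identities in about_me.sql to connect one user's posts across OSNs, can be sketched as follows. The real table and column names are defined in the dump files and are not documented on this page, so every table name, column name, and row in this sketch is hypothetical. It also uses an in-memory SQLite database as a stand-in for a MySQL server loaded with the dumps, purely to keep the example self-contained.

```python
import sqlite3

# In-memory stand-in for a database loaded from the dump files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schemas; the real ones are defined in the .sql dumps.
    CREATE TABLE about_me (profile_name TEXT, twitter_id TEXT, instagram_id TEXT);
    CREATE TABLE twitter (user_id TEXT, post_id TEXT);
    CREATE TABLE instagram (user_id TEXT, post_id TEXT);
    -- Made-up rows, not taken from the dataset.
    INSERT INTO about_me VALUES ('alice', 'tw_alice', 'ig_alice');
    INSERT INTO twitter VALUES ('tw_alice', 't1'), ('tw_alice', 't2');
    INSERT INTO instagram VALUES ('ig_alice', 'i1');
""")


def posts_per_user(connection):
    """Return (profile_name, twitter_posts, instagram_posts) for each user.

    The per-OSN identities stored in about_me act as the join keys
    into the per-OSN post tables.
    """
    query = """
        SELECT a.profile_name,
               (SELECT COUNT(*) FROM twitter t WHERE t.user_id = a.twitter_id),
               (SELECT COUNT(*) FROM instagram i WHERE i.user_id = a.instagram_id)
        FROM about_me a
        ORDER BY a.profile_name
    """
    return connection.execute(query).fetchall()


rows = posts_per_user(conn)
print(rows)
```

Against the actual dumps, the same idea would apply after importing the files into MySQL and substituting the real table and column names. The post IDs returned by such queries are what you would then pass to each OSN's API to retrieve the post content.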

For more details, please visit

Related Publications:
Citation: When using this data, please cite the original publication and also the dataset.
  • B. H. Lim, D. Lu, T. Chen, and M.-Y. Kan, “#mytweet via Instagram,” in Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 - ASONAM ’15, 2015, pp. 113–120.
  • Bang Hui Lim, Dongyuan Lu, Tao Chen, Min-Yen Kan (2017-11-13). About.Me Dataset for User-centric OSN Analysis. ScholarBank@NUS Repository. [Dataset].
License: Attribution-NonCommercial 4.0 International
Appears in Collections: Staff Dataset

Files in This Item (format listed as Unknown for every file):
  • about_me.sql (1.47 MB): MySQL dump file that contains each user's profile name in About.me and their user identities in the six OSNs.
  • googleplus.sql (229.41 MB)
  • flickr.sql (338.76 MB)
  • instagram.sql (79.9 MB)
  • tumblr.sql (250.09 MB)
  • youtube.sql (95.58 MB)
  • twitter.sql (2.45 GB)

This item is licensed under a Creative Commons License.