Please use this identifier to cite or link to this item: https://doi.org/10.1186/s42400-018-0011-x
Title: Automated identification of sensitive data from implicit user specification
Authors: Yang, Z. 
Liang, Z. 
Keywords: Conceptual Space
Noun Phrase
Parse Tree
Sensitive Data
Sensitive Keywords
Issue Date: 2018
Publisher: Springer
Citation: Yang, Z., Liang, Z. (2018). Automated identification of sensitive data from implicit user specification. Cybersecurity 1 (1) : 13. ScholarBank@NUS Repository. https://doi.org/10.1186/s42400-018-0011-x
Rights: Attribution 4.0 International
Abstract: The sensitivity of information depends on the application context and user preference. Protecting sensitive data in the cloud era requires identifying it in the first place, which typically needs intensive manual effort. More importantly, users may specify sensitive information only in an implicit manner. Existing research efforts on identifying sensitive data from its descriptive texts focus on keyword/phrase searching. These approaches can have high false positive and false negative rates because they do not consider the semantics of the descriptions. In this paper, we propose S3, an automated approach to identify sensitive data based on users’ implicit specifications. Our approach considers semantic, syntactic and lexical information comprehensively, aiming to identify sensitive data by the semantics of its descriptive texts. We introduce the notion of a concept space to represent the user’s notion of privacy, by which our approach can support flexible user requirements in defining sensitive data. Our approach is able to learn users’ preferences from readable concepts initially provided by users, and automatically identify related sensitive data. We evaluate our approach on over 18,000 top popular applications from Google Play Store. S3 achieves an average precision of 89.2% and an average recall of 95.8% in identifying sensitive data. © 2018, The Author(s).
Source Title: Cybersecurity
URI: https://scholarbank.nus.edu.sg/handle/10635/213280
ISSN: 2096-4862
DOI: 10.1186/s42400-018-0011-x
Appears in Collections: Staff Publications
Files in This Item:
File: 10_1186_s42400-018-0011-x.pdf
Size: 1.22 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

This item is licensed under a Creative Commons License.