Abstract | We develop WallGuard to help users of online social networks (OSNs) avoid regrettable posts and the disclosure of sensitive information. With WallGuard, users can control their posts and can (i) detect inappropriate, regrettable messages before they are posted and (ii) identify already-posted messages that could negatively affect their reputation and life. WallGuard is based on deep learning architectures and NLP-based methods. To evaluate the effectiveness of WallGuard, we developed a semi-supervised self-training methodology, which we use to create a new, large-scale corpus for regret detection containing 4.7 million OSN messages. The corpus is generated by incrementally labelling messages from large OSN platforms, relying on both human-labelled and machine-labelled messages. By training Facebook's FastText and Word2vec word embeddings on our corpus, we created domain-specific word embeddings, which we refer to as regret embeddings. Our approach allows us to extract features that are discriminative for, and intrinsic to, regrettable disclosures. Leveraging both the regret embeddings and the new corpus, we successfully train and evaluate five new multi-label deep-learning-based models for automatically classifying regrettable posts. Our evaluation of the proposed models demonstrates that we can detect messages with regrettable topics, achieving up to 0.975 weighted AUC, 82.2% precision and 74.6% recall. WallGuard is free and open-source. |
---|
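
The incremental labelling idea described in the abstract (start from human-labelled messages, then let a model label further messages it is confident about) can be illustrated with a minimal self-training sketch. This is not WallGuard's actual pipeline: the tiny Naive Bayes classifier stands in for the deep-learning models, and the function names, the two labels, the example messages, and the 0.8 confidence threshold are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Fit a tiny multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of messages
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict_proba(model, text):
    """Return {label: probability} for one message, using Laplace smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, count in label_counts.items():
        n_tokens = sum(word_counts[label].values())
        score = math.log(count / total)
        for token in text.lower().split():
            score += math.log((word_counts[label][token] + 1) / (n_tokens + len(vocab)))
        log_scores[label] = score
    # Normalise in a numerically stable way (subtract the largest log score).
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    return {label: w / norm for label, w in weights.items()}


def self_train(seed, unlabelled, threshold=0.8, max_rounds=5):
    """Grow the labelled set with confident machine labels, round by round."""
    labelled, pool = list(seed), list(unlabelled)
    for _ in range(max_rounds):
        model = train_nb(labelled)
        confident, uncertain = [], []
        for text in pool:
            probs = predict_proba(model, text)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                confident.append((text, label))  # accept as a pseudo-label
            else:
                uncertain.append(text)           # keep for a later round
        if not confident:
            break  # nothing new was labelled, so further rounds cannot help
        labelled.extend(confident)
        pool = uncertain
    return labelled, pool


# Hypothetical human-labelled seed set and unlabelled pool.
seed = [
    ("i hate my boss so much", "regrettable"),
    ("got drunk again last night", "regrettable"),
    ("lovely walk in the park", "neutral"),
    ("great coffee this morning", "neutral"),
]
unlabelled = ["hate my boss", "lovely morning walk", "see you at the station"]

labelled, remaining = self_train(seed, unlabelled)
```

In this toy run, messages that share vocabulary with the seed set receive pseudo-labels, while a message the model is unsure about stays in the unlabelled pool. A real corpus-building loop would presumably also need safeguards against reinforcing early labelling mistakes, such as periodic human review.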