Over the past decade, social media platforms have played a central role in spreading rumors, causing significant negative impacts. To counter this, the community has developed various Rumor Detection (RD) algorithms that automatically identify rumors using user comments as evidence. However, these RD methods often fail in the early stages of rumor propagation, when only limited user comments are available, motivating the community to focus on a more challenging task named Rumor Early Detection (RED).
Typically, existing RED methods learn from the limited semantics in early comments. However, our preliminary experiment reveals that RED models perform best when the number of comments is consistent and extensive across training and testing. This inspires us to address the RED issue by generating more human-like comments to satisfy this condition. To implement this idea, we tune a comment generator by simulating expert collaboration and controversy, and propose a new RED framework named CAMERED. Specifically, we integrate a mixture-of-experts structure into a generative language model and present a novel routing network for expert collaboration. Additionally, we synthesize a knowledgeable dataset and design an adversarial learning strategy to align the style of generated comments with real-world comments. We further integrate the generated and original comments with a mutual controversy fusion module. Experimental results show that CAMERED outperforms state-of-the-art RED baseline models and generation methods, demonstrating its effectiveness.
Despite the effectiveness of existing RED methods, the limited availability of comments consistently hampers the performance of detection models. We conduct a preliminary experiment with a SOTA RED method to examine this issue, as described below. On one hand, when validated in practical applications, the scarcity of test comments on unknown posts prevents the model from extracting meaningful features. On the other hand, the limited number of comments in the training data also leads to insufficient model training. Based on these observations, we hypothesize that the model performs best when the number of training and test comments is consistently extensive.
We draw inspiration from the empirical observation that the RED model achieves optimal performance when the number of comments during training and testing is consistent and extensive. However, in RED, the number of comments available during testing is typically insufficient to meet this condition. This naturally leads us to the idea of tuning a comment generator to fill the gap. Building on this idea, and inspired by the processes of collaboration and controversy among human experts, we design a comment generator based on the mixture-of-experts structure, which produces diverse, knowledgeable, and human-like comments through expert collaboration. Additionally, we develop a new comment integration model that simulates the controversies among experts to derive the final judgment. Accordingly, we propose a new RED framework, CAMERED, which consists of three basic modules: collaborating generator tuning, mutual controversy fusion, and a veracity classifier. The overall framework is depicted as follows.
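To make the mixture-of-experts idea concrete, the following is a minimal, self-contained sketch of a routed expert layer in which a routing network scores experts per token and combines the top-k expert outputs by their normalized routing weights. This is a toy illustration under assumed settings, not the paper's actual generator: the hidden size, expert count, top-k value, and linear experts are all hypothetical placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class MoELayer:
    """Toy mixture-of-experts layer: a linear router scores every expert
    for each token, and the top-k expert outputs are mixed by their
    renormalized routing weights (illustrative dimensions only)."""
    def __init__(self, d_model=8, n_experts=4, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.router = rng.standard_normal((d_model, n_experts)) * 0.1
        self.experts = [rng.standard_normal((d_model, d_model)) * 0.1
                        for _ in range(n_experts)]
        self.top_k = top_k

    def forward(self, h):
        # h: (seq_len, d_model) hidden states from the language model
        logits = h @ self.router                       # (seq_len, n_experts)
        top_idx = np.argsort(-logits, axis=-1)[:, :self.top_k]
        out = np.zeros_like(h)
        for t in range(h.shape[0]):
            # renormalize the selected experts' scores to sum to 1
            weights = softmax(logits[t, top_idx[t]])
            for w, e in zip(weights, top_idx[t]):
                out[t] += w * (h[t] @ self.experts[e])
        return out

layer = MoELayer()
hidden = np.ones((3, 8))        # 3 tokens, d_model = 8
mixed = layer.forward(hidden)   # same shape as the input: (3, 8)
```

In this sketch each token activates only `top_k` of the experts, which is what lets a set of specialized experts collaborate on a single generated comment without every expert processing every token.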
To evaluate the quality of comments generated by our CGT module and the performance of our overall CAMERED framework, we compare them with SOTA RED baselines and comment generation methods.
Experimental results under the RED scenario (we fix M = 16 and M' = 2).
Experimental results under the RED scenario (we fix M = 2 and M' = 2).