Cyberbullying victimization identification and large language model-assisted assessment: a study of cyberbullying victimization lexicon construction and validation - Summary - MDSpire

By Xingyun Liu, Yuehan Liao, Fan Feng, Yiming Tu, Xin Kang, Miao Liu, Nuo Han

June 24, 2026

Objective:

To construct and validate a Chinese cyberbullying victimization lexicon with the assistance of large language models, in order to improve identification and facilitate intervention.

Approach:
  • Lexicon Construction: Developed from social media data, focusing on cyberbullying methods, perceived harm, and coping strategies, with the assistance of large language models.
  • Validation Methodology: Evaluated validity through correlations between word-frequency statistics from Weibo posts and expert ratings, utilizing large language models for analysis.
  • Model Comparison: Compared outputs of DeepSeek-R1 and GPT-4o with human evaluations in lexicon development tasks, assessing their effectiveness in vocabulary selection and weight assignment.
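The validation step described above (correlating lexicon word frequencies with expert ratings) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `LEXICON` entries are English placeholders, the function names are hypothetical, posts are assumed to be tokenized already (Chinese Weibo text would need a word segmenter first), and Pearson's r is assumed as the correlation measure because the summary reports r values.

```python
from math import sqrt

# Hypothetical mini-lexicon (placeholder words, NOT taken from the study).
# Keys mirror the three dimensions named in the summary.
LEXICON = {
    "methods": {"insult", "rumor"},
    "harm": {"hurt", "afraid"},
    "coping": {"block", "report"},
}


def dimension_frequencies(tokens):
    """Return, per dimension, the share of tokens found in that dimension's word set."""
    total = len(tokens)
    if total == 0:
        return {dim: 0.0 for dim in LEXICON}
    return {
        dim: sum(token in words for token in tokens) / total
        for dim, words in LEXICON.items()
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Usage: score each post on one dimension, then correlate with expert ratings.
posts = [["insult", "me", "hurt", "block"], ["nice", "day"], ["rumor", "rumor", "afraid"]]
expert_ratings = [3, 0, 5]  # made-up ratings, for illustration only
method_scores = [dimension_frequencies(p)["methods"] for p in posts]
r_methods = pearson_r(method_scores, expert_ratings)
```

In this sketch each post gets a per-dimension frequency, and the per-post frequencies are then correlated with the ratings. Any weighting of lexicon words, which the summary mentions but does not detail, is left out.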
Key Findings:
  • The lexicon includes 442 words across three dimensions: cyberbullying methods, perceived harm, and coping strategies.
  • Word-frequency scores from the lexicon correlated significantly with expert ratings in each dimension (cyberbullying methods: r = 0.500, p < 0.001; perceived harm: r = 0.408, p < 0.001; coping strategies: r = 0.509, p < 0.001) and overall (r = 0.870, p < 0.001), supporting its validity for identifying cyberbullying victimization expressions.
  • DeepSeek-R1 showed good performance in small-scale text classification (Kappa = 0.775–0.781) but limitations in large-scale processing.
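For readers unfamiliar with the Kappa statistic cited above, the sketch below shows one common variant, Cohen's kappa, which measures agreement between two raters after correcting for chance. The summary does not say which kappa variant the study used, so Cohen's is an assumption here, and the labels and the function name `cohen_kappa` are illustrative only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Usage with made-up labels (1 = victimization expression, 0 = not).
human_labels = [1, 1, 0, 0]
model_labels = [1, 1, 0, 1]
kappa = cohen_kappa(human_labels, model_labels)  # 0.5 for these labels
```

Values near 1 indicate agreement well above chance and values near 0 indicate agreement no better than chance, which is the scale on which the reported 0.775–0.781 range should be read.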
Interpretation:

The study indicates that large language models can assist in structured tasks but require human oversight for complex research phases.

Limitations:
  • Significant discrepancies between model outputs and human evaluations in vocabulary selection and weight assignment tasks.
  • Limited effectiveness of models in large-scale processing, with Kappa values indicating substantial inconsistencies.
Conclusion:

The research highlights the potential of a human-machine collaborative approach for optimal outcomes in cyberbullying victimization identification.
