‘Opinion tree’: a method for mapping online discussions based on neural-network topic modeling and abstractive summarization

Svetlana S. Bodrunova

Doctor of Political Sciences, Professor, Department of Media Management and Mass Communications, Institute “Higher School of Journalism and Mass Communications”, Saint Petersburg State University, Saint Petersburg, Russia; ORCID 0000-0003-0740-561X

e-mail: s.bodrunova@spbu.ru
Ivan S. Blekanov

Head of the Department of Programming Technologies, Faculty of Applied Mathematics – Control Processes, St. Petersburg State University, Saint Petersburg, Russia; ORCID 0000-0002-7305-1429

e-mail: I.blekanov@spbu.ru
Nikita A. Tarasov

Junior Research Fellow, Faculty of Applied Mathematics – Control Processes, St. Petersburg State University, Saint Petersburg, Russia; ORCID 0000-0002-9473-6179

e-mail: nkt.tarasov@yandex.ru

Section: Artificial Intelligence in Media and Communication Studies

To date, no neural-network-based method of online opinion detection represents user discussions on social networks in a form that simultaneously captures the cumulation, shift, and dissipation of consensus. Such a method would allow for detailed tracing of opinion dynamics (including polarization of views), shorten the time needed to evaluate them, and help address several theoretical assumptions about the nature of cumulative opinions. We propose a method for constructing ‘opinion trees’ within user discussions on social networks. The case dataset is a Reddit discussion of the 27th UN Climate Change Conference (COP27/UNFCCC2022). The method comprises three steps: defining the topicality bifurcation points, measuring the ‘thickness’ of ‘branches’, and summarizing the meaning of individual ‘branches’. Together, these steps allow for both assessment of topicality divergence and reasonably quick opinion tracing. The method integrates recursive BERTopic-based topic modeling with Pegasus-based abstractive summarization, so that opinions can be seen as ‘folded’, ‘unfolded’, or ‘polar’, as detected in summaries of varying length.
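
The three steps named in the abstract (bifurcation points, branch ‘thickness’, branch summaries) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Branch` structure, the function `build_opinion_tree`, the parameters `min_size` and `max_depth`, and the reading of ‘thickness’ as a branch's share of all comments are assumptions made for illustration only. The topic model and the summarizer are passed in as plain callables; in a real pipeline, a BERTopic model and a Pegasus summarizer would presumably take their place, while here two toy stand-ins keep the example runnable without any dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interfaces (assumptions, not taken from the paper):
# a topic model maps texts to one integer topic label per text,
# a summarizer maps a pool of texts to one summary string.
TopicModel = Callable[[List[str]], List[int]]
Summarizer = Callable[[List[str]], str]


@dataclass
class Branch:
    texts: List[str]
    thickness: float          # assumed: share of all comments in the discussion
    summary: str
    children: List["Branch"] = field(default_factory=list)


def build_opinion_tree(texts: List[str], topic_model: TopicModel,
                       summarizer: Summarizer, total: int = 0,
                       min_size: int = 4, max_depth: int = 3,
                       depth: int = 0) -> Branch:
    """Recursively split a comment pool into topical sub-branches."""
    if not texts:
        raise ValueError("cannot build a branch from an empty comment pool")
    total = total or len(texts)
    node = Branch(texts, len(texts) / total, summarizer(texts))
    if depth >= max_depth or len(texts) < min_size:
        return node                       # too deep or too small to split
    groups: Dict[int, List[str]] = {}
    for text, label in zip(texts, topic_model(texts)):
        groups.setdefault(label, []).append(text)
    if len(groups) < 2:
        return node                       # one topic only: no bifurcation here
    for label in sorted(groups):          # each topic becomes a child branch
        node.children.append(build_opinion_tree(
            groups[label], topic_model, summarizer,
            total, min_size, max_depth, depth + 1))
    return node


# Toy stand-ins so the sketch runs without BERTopic or Pegasus.
def keyword_model(texts: List[str]) -> List[int]:
    return [0 if "fund" in t.lower() else 1 for t in texts]


def first_comment(texts: List[str]) -> str:
    return texts[0]


comments = [
    "Loss and damage fund is a real step forward",
    "The fund will never be paid out",
    "Who finances the fund?",
    "COP27 ignored fossil fuel phase-out",
    "Too many lobbyists attended",
    "Nothing binding was agreed",
]
tree = build_opinion_tree(comments, keyword_model, first_comment)
for child in tree.children:
    print(f"{child.thickness:.2f}  {child.summary}")
```

In this toy run, the root pool bifurcates into two branches of equal thickness, and neither branch splits further because each falls below `min_size`. Passing the summarizer as a callable also leaves room for producing summaries of different lengths per branch, which is how the abstract describes distinguishing ‘folded’, ‘unfolded’, and ‘polar’ opinions.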

Keywords: cumulative deliberation, online discussions, topic modeling, abstractive summarization, BERT, Reddit, COP27
DOI: 10.55959/msu.vestnik.journ.5.2025.179208


To cite this article: Bodrunova S. S., Blekanov I. S., Tarasov N. A. (2025) “Derevo mneniy”: metod dinamicheskogo meppinga onlayn-diskussiy na osnove neyrosetevogo tematicheskogo modelirovaniya i abstraktivnoy summarizatsii [‘Opinion tree’: a method for mapping online discussions based on neural-network topic modeling and abstractive summarization]. Vestnik Moskovskogo Universiteta. Seriya 10. Zhurnalistika 5: 179–208. DOI: 10.55959/msu.vestnik.journ.5.2025.179208