Article
Mapping Cultural Sentiments in Indonesian Digital Literature: An Annotated and Validated Multicultural Dataset


This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright
The authors shall retain the copyright of their work but allow the Publisher to publish, copy, distribute, and convey the work.
License
Digital Technologies Research and Applications (DTRA) publishes accepted manuscripts under Creative Commons Attribution 4.0 International (CC BY 4.0). Authors who submit their papers for publication by DTRA agree to have the CC BY 4.0 license applied to their work, and that anyone is allowed to reuse the article or part of it free of charge for any purpose, including commercial use. As long as the author and original source are properly cited, anyone may copy, redistribute, reuse, and transform the content.
Received: 1 December 2025; Revised: 26 January 2026; Accepted: 27 January 2026; Published: 9 February 2026
This study develops an annotated and validated multicultural sentiment dataset derived from Indonesian digital literature. It integrates Cultural Sentiment Analysis (CSA) and Critical Discourse Analysis (CDA) to address a gap in existing research: no prior corpus has systematically combined cultural affective dimensions with linguistic and ethnographic validation, which makes this dataset important for mapping cultural value representations in a multicultural context. The corpus includes over 100 digital literary texts (short stories, online novels, and poems) sourced from platforms such as Wattpad, KBM App, and scholarly blogs, selected through purposive sampling to ensure diverse ethnic and thematic coverage. Trained annotators used a culturally grounded emotion lexicon to label three layers: sentiment polarity (positive, negative, and neutral), cultural values (social harmony, cooperation, spirituality, resistance, and adaptation), and linguistic indicators. Validation combined a linguistic review for semantic accuracy with ethnographic verification through Focus Group Discussions with cultural experts from various ethnic groups. The resulting multi-layered dataset provides authentic, contextually grounded, and bias-mitigated representations of cultural sentiment in Indonesian digital literature. Beyond enriching digital humanities scholarship, it offers a reusable open resource for future research, the development of automated sentiment analysis, and data-driven policy formulation, all aimed at enhancing digital cultural literacy and intercultural understanding in Indonesia.
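To make the annotation layers named in the abstract concrete, the following is a minimal Python sketch of how one annotated record might be represented. It is an illustration under stated assumptions, not the dataset's published schema: the class and field names (`AnnotatedSegment`, `text_id`, `linguistically_reviewed`, and so on) are hypothetical, and allowing several cultural values per segment is an assumption. Only the polarity labels, the five cultural values, and the two validation stages are taken from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Polarity(Enum):
    """Sentiment polarity labels listed in the abstract."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


class CulturalValue(Enum):
    """Cultural value categories listed in the abstract."""
    SOCIAL_HARMONY = "social harmony"
    COOPERATION = "cooperation"
    SPIRITUALITY = "spirituality"
    RESISTANCE = "resistance"
    ADAPTATION = "adaptation"


@dataclass
class AnnotatedSegment:
    """One annotated text segment (hypothetical field names)."""
    text_id: str                 # hypothetical identifier of the source text
    genre: str                   # e.g. "short story", "online novel", "poem"
    platform: str                # e.g. "Wattpad", "KBM App"
    segment: str                 # the annotated passage itself
    polarity: Polarity
    # Assumption: a segment may carry more than one cultural value.
    cultural_values: List[CulturalValue] = field(default_factory=list)
    # Words or phrases that signal the sentiment or value.
    linguistic_indicators: List[str] = field(default_factory=list)
    # The two validation stages described in the abstract.
    linguistically_reviewed: bool = False
    ethnographically_verified: bool = False

    def is_validated(self) -> bool:
        """True only when both validation stages have been passed."""
        return self.linguistically_reviewed and self.ethnographically_verified


# Placeholder record for illustration only; not taken from the dataset.
example = AnnotatedSegment(
    text_id="demo-001",
    genre="short story",
    platform="Wattpad",
    segment="<placeholder segment text>",
    polarity=Polarity.POSITIVE,
    cultural_values=[CulturalValue.COOPERATION],
    linguistic_indicators=["gotong royong"],
    linguistically_reviewed=True,
)
```

In this sketch, keeping the validation flags on each record lets a user filter for segments that have passed both the linguistic review and the ethnographic verification before training or evaluating a model.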
Keywords:
Dataset; Cultural Sentiment; Digital Literature; Linguistic Validation; Ethnographic Verification

