I am currently conducting research in the Internet Computing Lab (ICL), supervised by Prof. Kyong-Ho Lee.
My research focuses on advancing knowledge representation and reasoning techniques for AI systems.
Research Interests
Knowledge-Integrated AI Systems
Knowledge Acquisition and Representation / Retrieval-Augmented Language Models / Multimodal Information Retrieval
Trustworthy and Explainable AI
Knowledge Refinement / Factuality Verification / Evidence-Grounded Reasoning
Data-Efficient AI
Active Learning / Data Selection / Adaptive Supervision
Education
Yonsei University
Ph.D. Candidate in Computer Science Sep. 2019 – Aug. 2026 (Expected)
Pusan National University
B.S. in Computer Science and Engineering Mar. 2013 – Feb. 2017
Experience
NC AI
Research Intern Apr. 2026 – Present
Republic of Korea Army
Commissioned Officer (Signal Platoon Leader) Mar. 2017 – Jun. 2019
Publications
International
[17] Aligning Retrievers with Evidence-Grounded Reasoning for Retrieval-Augmented Knowledge Graph Validation
Donghyun Kim*, Hyeongjun Yang, Seokju Hwang, Kyong-Ho Lee, and Chanhee Lee Under Review
[16] Reliable Distillation of Recommendation Rationales from Large Language Models for Conversational Recommendation
Hyeongjun Yang, Donghyun Kim, Kyuhwan Yeom, Kyong-Ho Lee, Byungkook Oh, and Xiongnan Jin Under Review
[15] FineESC: Fine-Grained Emotional Support Conversation with Turn-level Importance Modeling via Subquestion Reasoning
Seokju Hwang, Midan Shim, Yeseul Gong, Donghyun Kim, and Kyong-Ho Lee Under Review
[14] FLOW: Complex Fact-Checking through Linking Implicit and Explicit Claims with Large Language Model Reasoning
Heeyeon Koo, Donghyun Kim, Hyojun Choi, and Kyong-Ho Lee Under Review
[13] LLMs as Knowledge Graph Refiners: Mitigating Factual Inconsistencies in Generative Knowledge Extraction
Donghyun Kim*, Hyeongjun Yang, Seokju Hwang, Kyong-Ho Lee, and Chanhee Lee ACL 2026 (Main)
Knowledge graphs (KGs) provide a structured representation of real-world facts as triples consisting of entities and their relationships. With the rapid progress of large language models (LLMs), recent studies increasingly explore LLMs for end-to-end KG construction from text. In particular, generative knowledge extraction (GKE) builds KGs by directly generating structured triples from documents. However, generation errors are inevitable, and the resulting KGs often contain triples that do not align with the facts expressed in the source text. To address these issues, we propose GraphRefine, a framework that performs triple-level refinement on KGs constructed via GKE. We first analyze factual inconsistencies that arise in GKE and categorize their types based on a human evaluation. We then construct training data reflecting these types and fine-tune an LLM as a KG refiner. Given a draft KG, the fine-tuned refiner selects a refinement operation for each triple and, if needed, deletes, edits, or rewrites it to reduce factual inconsistencies. Extensive experiments demonstrate that GraphRefine goes beyond deletion-only approaches and improves KG quality from diverse perspectives.
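The triple-level refinement described in the abstract can be sketched as a dispatch over per-triple operations. The sketch below is illustrative only: the operation names, helper functions, and the toy rule-based refiner are assumptions standing in for the fine-tuned LLM refiner, not the GraphRefine implementation.

```python
# Sketch of triple-level KG refinement: a refiner chooses an operation
# (keep / delete / edit / rewrite) per triple and applies it.
from typing import Callable, Optional

Triple = tuple[str, str, str]  # (head, relation, tail)

def refine_kg(
    triples: list[Triple],
    choose_op: Callable[[Triple], tuple[str, Optional[Triple]]],
) -> list[Triple]:
    """Apply a per-triple refinement operation to a draft KG.

    `choose_op` stands in for the fine-tuned LLM refiner; it returns an
    operation name and, for edits/rewrites, a replacement triple.
    """
    refined: list[Triple] = []
    for t in triples:
        op, replacement = choose_op(t)
        if op == "keep":
            refined.append(t)
        elif op == "delete":
            continue  # drop the factually inconsistent triple
        elif op in ("edit", "rewrite"):
            assert replacement is not None
            refined.append(replacement)
        else:
            raise ValueError(f"unknown operation: {op}")
    return refined

# Toy rule-based refiner standing in for the LLM.
def toy_refiner(t: Triple) -> tuple[str, Optional[Triple]]:
    head, rel, tail = t
    if rel == "born_in" and tail == "Mars":
        return "delete", None
    if rel == "capitol_of":  # fix a misspelled relation
        return "edit", (head, "capital_of", tail)
    return "keep", None

kg = [("Paris", "capitol_of", "France"), ("Alice", "born_in", "Mars")]
print(refine_kg(kg, toy_refiner))  # [('Paris', 'capital_of', 'France')]
```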
[12] Data-Efficient Adaptation to Contextual Shifts in LLM-based Conversational Recommendation
Hyeongjun Yang, Donghyun Kim, Seokju Hwang, Midan Shim, Kyuhwan Yeom, Kaehyun Um, and Kyong-Ho Lee ACL 2026 (Findings)
Large language model (LLM)-based conversational recommender systems (CRSs) have demonstrated strong capabilities in capturing user preferences and generating contextually relevant recommendations. Nevertheless, the recommendation quality of the models frozen after training inevitably degrades under contextual shifts, such as changes in language and social trends. While periodic model updates are essential to maintain alignment with real-world preferences, training on large-scale data incurs substantial costs. This motivates data-efficient adaptation. However, existing data selection methods struggle to distinguish learnable samples under contextual shifts. To address this, we propose Contextual Shift-Adaptive Data Pruning and Training (CAPT), a framework agnostic to underlying LLM-based CRSs. Specifically, we conceptualize a three-class data taxonomy comprising familiar, valuable, and outlier samples to formalize data behavior under contextual shifts. Based on this taxonomy, we design an importance score estimation scheme that quantifies a sample’s relative learnability for shift adaptation. Leveraging these importance scores, CAPT prioritizes highly learnable samples and further guides shift-adaptive training to actively steer the model toward evolving preferences. Experiments on three CRS benchmarks with real-world temporal splits demonstrate that CAPT outperforms baselines, matching or surpassing full-data fine-tuning performance using only 10–50% of the training data.
[11] LLM-Assisted Ontology Restriction Verification with Clustering-based Description Generation
Seungyeon Kim, Donghyun Kim, Seokju Hwang, Kyong-Ho Lee, and Kyunghwa Lee IEEE Access, 2025 (Q1/IF:3.6)
An ontology is a scheme for structuring relationships between concepts in a domain, promoting data interoperability and system integration. However, poorly designed ontologies can lead to errors and performance issues. While systems engineering has standardized evaluation guidelines (e.g., ISO/IEC), ontology engineering lacks such standards, leading to various independent evaluation methods. One frequent issue among novice developers is the misuse of ontology restrictions, particularly ‘allValuesFrom’ and ‘someValuesFrom’, which can significantly impact the correctness and reliability of ontologies. However, existing studies have not adequately addressed effective methods for detecting such errors. To address this gap, we propose a context-aware verification framework utilizing large language models to detect and correct misuse in ontology restrictions. Unlike conventional methods, our framework integrates contextual descriptions derived from ontological axioms, enabling more accurate verification. Additionally, we introduce a clustering-based description generation method that systematically organizes contextual information, further enhancing verification accuracy. Experimental evaluation conducted on diverse ontology datasets suggests that contextual integration improves verification performance. Moreover, the clustering-based description generation improves restriction misuse detection and correction compared to traditional approaches. By automating ontology restriction verification, this study contributes significantly to enhancing the reliability of ontology evaluation and provides a foundation for developing more scalable and standardized verification techniques.
[10] Active Learning Framework for Improving Knowledge Graph Accuracy
Donghyun Kim*, Hyeongjun Yang, Seokju Hwang, Kyuhwan Yeom, Midan Shim, and Kyong-Ho Lee IEEE Access, 2025 (Q1/IF:3.6)
Knowledge graphs are graph-structured data models that provide a robust scheme for representing real-world relational facts with structured triples. The structural and factual information in knowledge graphs is extensively leveraged in various downstream applications. Unfortunately, knowledge graphs often contain incorrect triples due to automated extraction processes. Therefore, to ensure the reliability and usability of knowledge graphs, it is crucial to identify and rectify these incorrect triples. However, this remains a challenging task, as knowledge graphs are intricate structures encompassing a vast number of triples formed by diverse entities, relations, and their complex interconnections. This paper proposes an effective method to enhance knowledge graph accuracy by introducing an active learning framework. The proposed framework integrates the advantages of machine-based models and human involvement to enable efficient and reliable improvement in knowledge graph accuracy. Additionally, the proposed method includes sampling strategies that consider the relation distribution in knowledge graphs to maximize the effectiveness of the active learning framework. Extensive experimental results demonstrate the effectiveness of the proposed active learning framework and sampling strategies in improving knowledge graph accuracy. Furthermore, this paper provides an exploration of the level of human involvement and a discussion of practical approaches to improve knowledge graph accuracy in real-world scenarios.
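One round of the pool-based selection described above can be sketched as follows. This is a generic illustration, not the paper's framework: the uncertainty measure, the round-robin stratification over relations, and all function names are assumptions standing in for the relation-distribution-aware sampling strategies.

```python
# Sketch of one active-learning round for KG accuracy: pick the triples
# the current model is least sure about, spreading the annotation
# budget across relations.
from collections import defaultdict

def uncertainty(p: float) -> float:
    """Uncertainty of a predicted correctness probability; peaks at 0.5."""
    return 1.0 - abs(p - 0.5) * 2.0

def relation_aware_sample(triples, score_fn, budget):
    """Select uncertain triples for human verification, round-robin
    over relations so no single relation exhausts the budget."""
    by_rel = defaultdict(list)
    for t in triples:
        by_rel[t[1]].append(t)  # group by relation (middle element)
    for rel in by_rel:  # most-uncertain first within each relation
        by_rel[rel].sort(key=lambda t: uncertainty(score_fn(t)), reverse=True)
    selected, rels, i = [], list(by_rel), 0
    while len(selected) < budget and any(by_rel[r] for r in rels):
        rel = rels[i % len(rels)]
        if by_rel[rel]:
            selected.append(by_rel[rel].pop(0))
        i += 1
    return selected

# Toy scores standing in for a trained error-detection model.
scores = {("a", "r1", "b"): 0.9, ("c", "r1", "d"): 0.5, ("e", "r2", "f"): 0.45}
picked = relation_aware_sample(list(scores), lambda t: scores[t], budget=2)
print(picked)  # [('c', 'r1', 'd'), ('e', 'r2', 'f')]
```

The selected triples would then go to human annotators, and their labels feed the next retraining iteration.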
[9] CoreSense: Social Commonsense Knowledge-Aware Context Refinement for Conversational Recommender System
Hyeongjun Yang, Donghyun Kim, Gayeon Park, Kyuhwan Yeom, and Kyong-Ho Lee IEEE Transactions on Knowledge and Data Engineering, 2025 (Q1/IF:10.4)
Unlike traditional recommender systems, which rely on historical data such as clicks or purchases, a conversational recommender system (CRS) aims to provide personalized recommendations through natural conversation. The conversational interaction facilitates capturing not only explicit preferences from mentioned items but also implicit states, such as a user’s current situation and emotional states, from a dialogue context. Nevertheless, existing CRSs fall short of fully exploiting a dialogue context since they primarily derive explicit user preferences from the items and item-attributes mentioned in a conversation. To address this limitation and attain a comprehensive understanding of a dialogue context, we propose CoreSense, a conversational recommender system enhanced with social commonsense knowledge. Specifically, CoreSense exploits the social commonsense knowledge graph ATOMIC to capture the user’s implicit states, such as the current situation and emotional states, from a dialogue context. Thus, the social commonsense knowledge-augmented CRS can provide a more appropriate recommendation from a given dialogue context. Furthermore, we enhance the collaborative filtering effect by utilizing the user’s states inferred from commonsense knowledge as an improved criterion for retrieving other dialogues of similar interests. Extensive experiments on CRS benchmark datasets show that CoreSense provides human-like recommendations and responses based on inferred user states, achieving significant performance improvements.
[8] Optimizing Training Data for Persona-Grounded Dialogue via Synthetic Label Augmentation
Chanhee Lee, Donghyun Kim, Wongyu Kim, Kyungchan Lee, Youbin Ahn, Kyong-Ho Lee, Donghoon Shin, and Yeonsoo Lee Expert Systems With Applications, 2025 (Q1/IF:7.5)
Persona-grounded dialogue systems aim to enhance the quality of AI agent responses by bolstering persona consistency and promoting response diversity. Although model tuning has seen significant advancements, there is an ongoing need to refine the training data itself. Expanding the scope of personas has been suggested as a means to bridge this gap. Nevertheless, the lack of gold labels that align with these expanded personas poses a challenge for training AI agents on real-world knowledge. To tackle these challenges, we propose the Synthetic Label Augmentation framework. This framework (1) creates a background skeleton from the original gold labels, masking persona-related elements, (2) infuses the background skeleton with expanded-persona features, generating synthetic gold labels, (3) identifies the most appropriate synthetic gold labels among the candidates, and (4) merges them into the persona-grounded dialogue dataset. Through extensive experiments on Persona-Chat, we demonstrate that the proposed framework effectively integrates the content of expanded personas to generate synthetic gold labels suitable for the dialogue context. Furthermore, response generation experiments using the Optimized Persona-Chat show that our framework significantly enhances AI agents’ performance in terms of persona consistency and response diversity.
[7] Multi-Domain Dialogue State Tracking via Dual Dynamic Graph with Hierarchical Slot Selector
Yeseul Gong, Heeseon Kim, Seokju Hwang, Donghyun Kim, and Kyong-Ho Lee Knowledge-Based Systems, 2025 (Q1/IF:7.6)
Dialogue state tracking aims to maintain user intent as a consistent state across multiple domains to enable natural dialogue systems. However, previous studies often fall short of capturing the differences among multiple slot types and fail to adequately select discerning information. An increase in unnecessary information correlates with a decrease in predictive performance; therefore, the careful selection of high-quality information is imperative. Moreover, since the types of essential and available information vary for each slot, the process of selecting appropriate information may also differ. To address these issues, we propose HS2DG-DST, a Hierarchical Slot Selector and Dual Dynamic Graph-based DST. Our model is designed to provide maximum information for optimal value prediction by explicitly exploiting the need for differentiated information for each slot. First, we hierarchically classify slot types based on their multiple properties. Then, two dynamic graphs provide highly relevant information to each slot. Experimental results on MultiWOZ datasets demonstrate that our model outperforms state-of-the-art models.
[6] Dialogue Act-based Partner Persona Extraction for Consistent Personalized Response Generation
Kyungchan Lee, Chanhee Lee, Donghyun Kim, and Kyong-Ho Lee Expert Systems With Applications, 2024 (Q1/IF:7.5)
The ability of a dialogue model to stay in context during a conversation, i.e., to maintain consistency, has long been a critical issue in generating more human-like personalized responses. However, most previous works have focused on the self persona to sustain self-consistency during a conversation. Since consistency is not limited to the self side, generated responses often contradict the utterances of the partner. This behavior discourages the user from responding and eventually causes the user to leave the conversation. To prevent this, our work focuses on recognizing the partner persona. We propose a new model, PEDA (Persona Extractor based on Dialogue Act), and construct an appropriate dataset for training it. Specifically, dialogue acts are utilized to identify the utterances that capture a user’s persona. The proposed model extracts the partner persona from the combination of utterances and their dialogue acts. We propose a dialogue act conductor to properly incorporate the dialogue act as an input alongside the pre-trained language model. A gating mechanism controls the probability distributions from the dialogue act conductor and the pre-trained language model. Finally, we train the model further in a reinforcement learning framework with our evaluation network.
[5] Active Learning for Cross-sentence N-ary Relation Extraction
Seungmin Seo, Byungkook Oh, Jeongbeom Jeoung, Donghyun Kim, Kyong-Ho Lee, Donghoon Shin, and Yeonsoo Lee Information Sciences, 2023 (Q1/IF:6.8)
N-ary relation extraction models must be trained on a large amount of high-quality data, but such data is difficult to obtain in practice; thus, models are forced to rely on a limited amount of low-quality labeled data. This paper proposes an active learning method for cross-sentence n-ary relation extraction, addressing the following question: “How can we train an n-ary relation extraction model incrementally without a large amount of high-quality data?” To answer this question, we introduce a schema-aware sampling strategy that selects informative samples from an unlabeled dataset for model training. This method exploits the structural relatedness between a relation and its entities to generate the context embeddings of inferred relations. Using the similarity between the clusters of context embeddings and target samples, we detect a set of informative samples in the unlabeled dataset. Moreover, the paper proposes a balanced incremental learning method that updates the extraction model without bias at only a small computational cost per training iteration. Experimental results on benchmark datasets confirm the validity and effectiveness of the proposed method.
[4] Persona Expansion with Commonsense Knowledge for Diverse and Consistent Response Generation
Donghyun Kim*, Youbin Ahn, Wongyu Kim, Chanhee Lee, Kyungchan Lee, Kyong-Ho Lee, Jeonguk Kim, Donghoon Shin, and Yeonsoo Lee EACL 2023 (Main, Acceptance Rate: 24%)
Generating diverse and consistent responses is the ultimate goal of a persona-based dialogue. Although many studies have been conducted, the generated responses tend to be generic and bland due to the personas’ limited descriptiveness. Therefore, it is necessary to expand the given personas for more attractive responses. However, indiscriminate expansion of personas threatens the consistency of responses and therefore reduces the interlocutor’s interest in the conversation. To alleviate this issue, we propose a consistent persona expansion framework that improves not only the diversity but also the consistency of persona-based responses. To do so, we define consistency criteria to avoid possible contradictions among personas: 1) Intra-Consistency and 2) Inter-Consistency. Then, we construct a silver profile dataset to instill the ability to conform to the consistency criteria into the expansion model. Finally, we propose a persona expansion model with an encoder-decoder structure, which considers the relatedness and consistency among personas. Our experiments on the Persona-Chat dataset demonstrate the superiority of the proposed framework.
[3] Concept-based Persona Expansion for Improving Diversity of Persona-Grounded Dialogue
Donghyun Kim*, Youbin Ahn, Chanhee Lee, Wongyu Kim, Kyong-Ho Lee, Donghoon Shin, and Yeonsoo Lee EACL 2023 (Main, Acceptance Rate: 24%)
A persona-grounded dialogue model aims to improve the quality of responses to promote user engagement. However, because the given personas are mostly short and limited to only a few informative words, it is challenging to utilize them to generate diverse responses. To tackle this problem, we propose a novel persona expansion framework, Concept-based Persona eXpansion (CPX). CPX takes the original persona as input and generates expanded personas that contain conceptually rich content. CPX consists of two task modules: 1) Concept Extractor and 2) Sentence Generator. To train these modules, we exploit the duality of the two tasks with a commonsense dataset consisting of concept sets and the corresponding sentences that contain the given concepts. Extensive experiments on persona expansion and response generation show that our work substantially improves the diversity and richness of responses.
[2] Emp-RFT: Empathetic Response Generation via Recognizing Feature Transitions between Utterances
Wongyu Kim, Youbin Ahn, Donghyun Kim, and Kyong-Ho Lee NAACL 2022 (Main, Acceptance Rate: 21%)
Each utterance in multi-turn empathetic dialogues has features such as emotion, keywords, and utterance-level meaning, and feature transitions between utterances occur naturally. However, existing approaches fail to perceive these transitions because they extract features for the context at a coarse-grained level. To solve this issue, we propose a novel approach that recognizes feature transitions between utterances, which helps understand the dialogue flow and better grasp the features of utterances that need attention. We also introduce a response generation strategy that helps focus on the emotion and keywords related to appropriate features when generating responses. Experimental results show that our approach outperforms baselines and, in particular, achieves significant improvements on multi-turn dialogues.
[1] Active Learning on Pre-trained Language Model with Task-Independent Triplet Loss
Seungmin Seo, Donghyun Kim, Youbin Ahn, and Kyong-Ho Lee AAAI 2022 (Main, Acceptance Rate: 15%)
Active learning attempts to maximize a task model’s performance gain by obtaining a set of informative samples from an unlabeled data pool. Previous active learning methods usually rely on specific network architectures or task-dependent sample acquisition algorithms. Moreover, when selecting a batch of samples, previous works suffer from insufficient batch diversity because they consider only the informativeness of each sample. This paper proposes a task-independent batch acquisition method using triplet loss to distinguish hard samples in an unlabeled data pool that have similar features but hard-to-identify labels. To assess the effectiveness of the proposed method, we compare it with state-of-the-art active learning methods on two tasks, relation extraction and sentence classification. Experimental results show that our method outperforms baselines on the benchmark datasets.
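The triplet (margin) loss at the core of the batch acquisition above can be illustrated on raw embedding vectors. This is a textbook sketch, not the paper's training setup: the Euclidean distance, the margin value, and the toy vectors are assumptions.

```python
# Minimal triplet margin loss: pull the anchor toward the positive,
# push it away from the negative by at least `margin`.
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(a, p) - d(a, n) + margin) with Euclidean distance d."""
    d = math.dist
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)

# Anchor close to the positive and far from the negative -> zero loss.
print(triplet_loss([0, 0], [0.1, 0], [5, 5]))  # 0.0
# Anchor closer to the negative than the positive -> positive loss.
print(triplet_loss([0, 0], [2, 0], [1, 0]))  # 2.0
```

In an acquisition setting, samples whose embeddings sit near a decision boundary (similar features, different labels) yield high loss and are therefore flagged as hard, diverse batch candidates.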
Domestic
[2] Ontology-based Fact Checking for Verification of Defense Domain Sentences
Seokju Hwang, Donghyun Kim, Kyong-Ho Lee, and Kyunghwa Lee Journal of Korea Multimedia Society, 2024 (KCI)
[1] Building a Korean Dataset for Knowledge Extraction in the Military Domain: Focusing on Navy Force and Weapon Systems
Kyuhwan Yeom, Donghyun Kim, Seungyeon Kim, Yongsu Bae, Kyong-Ho Lee, and Kyunghwa Lee Journal of Korea Multimedia Society, 2024 (KCI)