Notes on reading the paper --- Graph-model complementarity: an overview of the fusion of knowledge graphs and big models --- organized by reference
About
- First published: 2024-09-13
- Link to original paper: /zh/article/doi/10.14188/j.1671-8836.2024.0040/
- The references are organized to roughly follow the table-of-contents structure of the original article
Introduction and related work
-
Knowledge Graphs: Opportunities and Challenges
- 24 Mar 2023
- Traditional knowledge graph construction and completion techniques face many challenges, such as data acquisition, entity recognition, knowledge extraction, and entity disambiguation.
-
Knowledge Extraction with No Observable Data
- 6 September 2019
- /snudatalab/KegNet
- Knowledge extraction is the extraction of valuable information from various data sources, covering structured, semi-structured, and unstructured textual data. For (semi-)structured data, models extract knowledge from structured sources (e.g., relational databases) or semi-structured sources (e.g., Wikipedia) through simple rules and fixed patterns. Extraction from unstructured data is the difficult part of the task: it generally includes entity recognition, relation extraction, and event extraction, and requires extracting atomic information and the semantic relations between entities from text. An example is the generative network KegNet, which performs knowledge extraction without observable data.
-
A review: Knowledge reasoning over knowledge graph
- 1 March 2020
- Early knowledge graph reasoning was based on certain rules and restrictions and needed to rely on rules, assumptions, and other preconditions.
-
NeuInfer: Knowledge Inference on N-ary Facts
- January 2020
- /gsp2014/NeuInfer
- As machine learning research progresses, the neural network model NeuInfer reasons about unknown elements from facts composed of a primary triple and auxiliary descriptions.
-
KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases
- Submitted on 17 Aug 2023
- Knowledge graphs can provide a means of interpreting and reasoning about knowledge, useful for probing the complex working steps and reasoning processes inside large models. For example, KnowledGPT, a retrieval framework that integrates large models with personalized knowledge bases, improves the handling of complex searches and ambiguity.
-
Joint Knowledge Graph and Large Language Model for Fault Diagnosis and Its Application in Aviation Assembly
- Date of Publication: 08 March 2024
- Knowledge graph fusion with large models is a hot research area
-
Unifying Large Language Models and Knowledge Graphs: A Roadmap
- Presents a forward-looking roadmap for unifying large models and knowledge graphs, summarizes existing state-of-the-art techniques for both, and discusses the challenges and future directions of their convergence.
Large Model Enhanced Knowledge Graph
Enhanced Knowledge Graph Construction
-
ChatIE: Zero-Shot Information Extraction via Chatting with ChatGPT
- Submitted on 20 Feb 2023 (v1), last revised 27 May 2024 (this version, v2)
- /cocacola-lab/chatie
- ChatIE transforms entity-relation extraction, named entity recognition, and event extraction into multi-turn question answering, decomposing the complex information extraction task into multiple simple subtasks; designed prompts are fed into ChatGPT, and the approach achieves good results on multiple information extraction datasets.
- Does not involve large model fine-tuning tasks
- Extracts information via prompt engineering
- ChatExtract: Extracting Accurate Materials Data from Research Papers with Conversational Language Models and Prompt Engineering
- Submitted on 7 Mar 2023 (v1), last revised 21 Feb 2024 (this version, v3)
- ChatExtract designs a highly transferable data extraction method whose core is a series of engineered prompts for extracting structured data, enabling large models to automatically identify and extract data in a zero-shot setting; ChatExtract can be applied to a variety of conversational large models and produces high-quality extraction results.
- Does not involve large model fine-tuning tasks
- Extraction via engineered prompt question sets
- AutoKG: LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities
- Submitted on 22 May 2023 (v1), last revised 18 Aug 2024 (this version, v3)
- To address the inability of large models to capture complex relational dynamics in knowledge-intensive tasks, AutoKG proposes a lightweight and efficient automated knowledge graph construction method. Unlike traditional semantic-similarity search, AutoKG uses a pre-trained large model to construct a simplified knowledge graph whose structure differs from a conventional one: keywords serve as nodes, relevance weights between nodes are recorded, and edges are built according to the degree of relevance between keywords (a minimal sketch follows after this entry). However, AutoKG only exploits the information extraction capability of the large model when extracting keywords from the external knowledge base, and does not involve the large model in the subsequent stage of enriching the knowledge graph.
- Keyword Extraction by Pre-trained Large Models
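Below is a minimal, illustrative sketch of the keyword-graph idea described above: keywords become nodes and edge weights record relevance between keyword pairs. The `extract_keywords` stub and the co-occurrence relevance score are stand-ins for the pre-trained large model and for whatever scoring AutoKG actually uses; both are assumptions for illustration only.

```python
from itertools import combinations

def extract_keywords(passage: str) -> set[str]:
    """Stand-in for the keyword-extraction step AutoKG delegates to a
    pre-trained large model; here it is faked with a stop-word filter."""
    stop = {"the", "a", "of", "and", "is", "in", "to", "are"}
    return {w.strip(".,").lower() for w in passage.split() if w.lower() not in stop}

def relevance(kw_a: str, kw_b: str, corpus: list[set[str]]) -> float:
    """Toy relevance weight: co-occurrence count of two keywords across
    passages. The exact scoring function used by AutoKG is an assumption."""
    return sum(1.0 for kws in corpus if kw_a in kws and kw_b in kws)

def build_keyword_graph(passages: list[str]) -> dict[tuple[str, str], float]:
    """Nodes are keywords; edges carry relevance weights between keywords."""
    corpus = [extract_keywords(p) for p in passages]
    nodes = set().union(*corpus)
    graph = {}
    for a, b in combinations(sorted(nodes), 2):
        w = relevance(a, b, corpus)
        if w > 0:
            graph[(a, b)] = w
    return graph

if __name__ == "__main__":
    docs = ["Graphite is a form of carbon.", "Diamond is a form of carbon."]
    for edge, weight in build_keyword_graph(docs).items():
        print(edge, weight)
```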
-
Does Synthetic Data Generation of LLMs Help Clinical Text Mining?
- Submitted on 8 Mar 2023 (v1), last revised 10 Apr 2023 (this version, v2)
- Explores the effect of large models on entity extraction in healthcare by generating a large amount of high-quality labeled training data with a large model and using it as prompt information to locally fine-tune the model. Experimental results show that the fine-tuned model improves performance compared with the general-purpose large model.
- Generating annotated samples using ChatGPT
-
Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction
- Submitted on 5 Apr 2024
Enhanced Knowledge Graph Completion
Knowledge graph construction is often incomplete. Traditional knowledge graph completion tasks use the structural information of the graph to predict missing entities in triples, but handle long-tail entities at a high cost. Large models can be used as an additional knowledge base from which trustworthy knowledge is extracted to complete the graph.
-
Exploring Large Language Models for Knowledge Graph Completion
- Submitted on 26 Aug 2023 (v1), last revised 18 Feb 2024 (this version, v4)
- /yao8839836/kg-llm
- One approach applies large models directly to the knowledge graph completion task by converting triple classification, relation prediction, and entity (link) prediction into prompt text and generating predictions by prompting the model. For example, literature [67] runs experiments on the fine-tuned large models KG-ChatGLM-6B and KG-LLaMA (7B and 13B), using entity and relation descriptions of the triples as prompts, and achieves better performance on the completion task (a minimal triple-to-prompt sketch follows after this entry).
- Knowledge graph embedded into large models
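A minimal sketch of the triple-to-prompt idea used by this line of work, assuming hypothetical prompt templates and a generic `ask_llm` callable standing in for the (fine-tuned) large model; the actual prompt wording in [67] may differ.

```python
def triple_to_prompt(head, relation, tail=None, task="classification"):
    """Turn a knowledge-graph triple into a prompt. The templates are
    illustrative assumptions, not the paper's exact formulations."""
    if task == "classification":
        return f"Is the following statement true or false?\n{head} {relation} {tail}."
    if task == "link_prediction":
        return f"Complete the statement with a single entity:\n{head} {relation} ___."
    raise ValueError(f"unknown task: {task}")

def complete_triple(head, relation, ask_llm):
    """ask_llm is any callable that sends a prompt to a (possibly fine-tuned)
    large model and returns its text answer -- an assumption of this sketch."""
    return ask_llm(triple_to_prompt(head, relation, task="link_prediction"))

if __name__ == "__main__":
    mock_llm = lambda prompt: "Paris"          # stand-in for a real model call
    print(triple_to_prompt("Paris", "capital of", "France"))
    print(complete_triple("France", "has capital", mock_llm))
```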
-
KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion
- Submitted on 4 Feb 2024 (v1), last revised 23 Feb 2024 (this version, v2)
- /weiyanbin1999/kicgpt
- The other approach is to indirectly prompt the large model. For example, KICGPT integrates the large model with a traditional structure-aware model (a knowledge completion retriever): the retriever answers the missing-triple query in the link prediction task and produces a candidate entity list ranked by score, and the large model then re-ranks the candidate list using structural knowledge encoded as in-context learning demonstrations.
- Large models as an additional knowledge base
-
Contextualization Distillation from Large Language Model for Knowledge Graph Completion
- Submitted on 28 Jan 2024 (v1), last revised 24 Feb 2024 (this version, v3)
- /david-li0406/contextulization-distillation
- Similarly, literature [69] prompts a large model to generate high-quality contextual descriptions for triples, after which a traditional completion scheme is used to train the model, completing the knowledge graph indirectly through the large model.
- Large models as training data generators
-
Making Large Language Models Perform Better in Knowledge Graph Completion
- Submitted on 10 Oct 2023 (v1), last revised 14 Apr 2024 (this version, v2)
- /zjukg/kopa
- However, a simple question-and-answer approach cannot exploit the reasoning ability of the large model, and without descriptions of the entities and relations in the knowledge graph structure, the large model may fail to understand the graph's structural information. To achieve more effective and accurate completion, researchers have injected structural information such as entities and relations into the large model, giving it structure-aware reasoning capability. Literature [70] proposes a knowledge prefix adapter named KoPA to integrate knowledge graph structural information into the large model: KoPA embeds the structural knowledge and projects it into the text space, and the virtual knowledge tokens generated by KoPA augment the input prompt sequence used to fine-tune the model (a minimal adapter sketch follows after this entry), so the large model decodes answers to instructions in a structure-aware state, improving its performance on knowledge graph completion.
- Knowledge graph embedded into large models
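The following is a minimal PyTorch sketch of a KoPA-style knowledge prefix adapter, under the assumption that structural triple embeddings are projected by a single linear layer and prepended as virtual tokens; the dimensions and module names are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

class KnowledgePrefixAdapter(nn.Module):
    """Sketch of a prefix adapter: pre-trained structural embeddings of
    (head, relation, tail) are projected into the LLM's token embedding
    space and prepended to the text prompt as virtual knowledge tokens."""

    def __init__(self, kg_dim: int = 128, llm_dim: int = 4096):
        super().__init__()
        self.project = nn.Linear(kg_dim, llm_dim)   # assumed projection design

    def forward(self, triple_struct_emb: torch.Tensor,
                prompt_token_emb: torch.Tensor) -> torch.Tensor:
        # triple_struct_emb: (batch, 3, kg_dim)   -- head / relation / tail
        # prompt_token_emb:  (batch, seq, llm_dim) -- embedded text prompt
        virtual_tokens = self.project(triple_struct_emb)   # (batch, 3, llm_dim)
        return torch.cat([virtual_tokens, prompt_token_emb], dim=1)

if __name__ == "__main__":
    adapter = KnowledgePrefixAdapter()
    struct = torch.randn(2, 3, 128)       # stand-in structural embeddings
    prompt = torch.randn(2, 16, 4096)     # stand-in text-prompt embeddings
    print(adapter(struct, prompt).shape)  # torch.Size([2, 19, 4096])
```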
Enhanced Knowledge Graph Reasoning
-
Knowledge Graph and Big Model Fusion
- May 2022
- In earlier knowledge graph reasoning tasks, researchers performed explicit reasoning via knowledge computation, embedding entities and relations into high-dimensional vector spaces with the help of deep learning models [71]. However, this method relies only on the knowledge inside the graph itself and lacks the support of "world knowledge"; the arrival of large models has made joint reasoning over general knowledge and the knowledge graph possible.
-
Evaluating and Enhancing Large Language Models for Conversational Reasoning on Knowledge Graphs
- Submitted on 18 Dec 2023 (v1), last revised 4 Feb 2024 (this version, v2)
- LLM-ARK [72] uses the large model directly as an agent without accessing its parameters, casting the multi-hop reasoning task as a sequential decision problem in reinforcement learning; full-text contextual prompts aggregate multi-scale inputs so that the model obtains a rich embedded state representation. Concretely, LLM-ARK formulates knowledge graph reasoning as a Markov Decision Process (MDP) described by the six-tuple \((S, O, A, T, R, \gamma)\), where \(S\) is the set of environment states, \(O\) is the observation of the complete environment state, \(A\) is the set of possible next actions, \(T\) records state transitions, \(R\) is the reward signal, and \(\gamma\) is the reward discount factor; a proximal policy optimization (PPO) reinforcement learning algorithm lets the model keep learning during the reasoning task. Experiments show that LLaMA-2-7B-ARK performs strongly on a two-hop reasoning task.
- Large Models Guide Knowledge Graph Reasoning
-
Complex Logical Reasoning over Knowledge Graphs using Large Language Models
- Submitted on 2 May 2023 (v1), last revised 31 Mar 2024 (this version, v3)
- /akirato/llm-kg-reasoning
- LARK [73] also uses large models to guide complex reasoning over knowledge graphs. The difference is that it turns complex knowledge graph reasoning into a two-step combination of contextual knowledge graph search and logical query reasoning: multi-operation complex logical queries are decomposed into the single-operation basic queries that large models handle best, and the retrieved neighborhoods and decomposed queries are converted into prompts for the large model to obtain the outputs; the large model is also used to evaluate the final answer set. LARK thus combines the strengths of graph extraction algorithms and large models to achieve high-performance complex reasoning.
- Large Models Guide Knowledge Graph Reasoning
-
ChatRule: Mining Logical Rules with Large Language Models for Knowledge Graph Reasoning
- Submitted on 4 Sep 2023 (v1), last revised 22 Jan 2024 (this version, v3)
- /RManLuo/ChatRule
- In traditional deductive reasoning, inference rule mining suffers from high resource cost, poor scalability of logic rules, and the weak performance of intensive rule search, making it hard to extend to large-scale knowledge graphs. A large model, by contrast, can understand natural language text and use the semantic knowledge and structural information internalized in the deep learning model to generate meaningful rules, which, combined with the knowledge in the graph, enables assisted rule mining and evaluation for knowledge graph reasoning. An example is the rule generator ChatRule [74] (framework shown in Fig. 3): ChatRule uses rule instances from the knowledge graph as prompts to the large model to generate a rough set of candidate rules, designs a logic rule scorer to filter out unqualified rules, and finally employs Chain of Thought (CoT) [75] prompting to strengthen the model's reasoning, verify the correctness of the rules, and eliminate potentially spurious rules produced by the large model (a pipeline sketch follows after this entry).
- Large models as rule generators
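A toy sketch of two of the pipeline stages described above: candidate rule generation by prompting, and a simple confidence-based rule scorer. The prompt wording, rule syntax, and the `score_rule` heuristic are assumptions of this sketch, and the CoT verification step is omitted.

```python
def generate_candidate_rules(seed_rules, ask_llm, n=5):
    """Prompt the large model with rule instances sampled from the KG and ask
    for further candidate rules; prompt wording is an assumption."""
    prompt = ("Here are logical rules mined from a knowledge graph:\n"
              + "\n".join(seed_rules)
              + f"\nPropose {n} further plausible rules in the same syntax.")
    return ask_llm(prompt).splitlines()

def score_rule(rule, triples):
    """Toy stand-in for a logic-rule scorer: the fraction of groundings of the
    rule body that also satisfy the rule head (rule confidence)."""
    body_hits = [(h, t) for (h, r, t) in triples if r == rule["body"]]
    if not body_hits:
        return 0.0
    head_hits = {(h, t) for (h, r, t) in triples if r == rule["head"]}
    return sum((h, t) in head_hits for (h, t) in body_hits) / len(body_hits)

if __name__ == "__main__":
    kg = [("alice", "mother_of", "bob"), ("alice", "parent_of", "bob"),
          ("carol", "mother_of", "dan")]
    rule = {"body": "mother_of", "head": "parent_of"}
    print(score_rule(rule, kg))            # 0.5 -> kept or dropped by threshold
    mock_llm = lambda p: "mother_of(X,Y) -> parent_of(X,Y)"
    print(generate_candidate_rules(["mother_of(X,Y) -> parent_of(X,Y)"], mock_llm))
```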
Enhanced Knowledge Graph Question Answering
-
Hic-KGQA: Improving multi-hop question answering over knowledge graph via hypergraph and inference chain
- 9 October 2023
- retrieval-reasoning structure
- Traditional knowledge-base question answering (KBQA) systems, such as retrieval-reasoning architectures [76] and semantic-parsing frameworks [77], need large amounts of training data, depend heavily on human experts during construction, and, as specialized systems, lack generalization ability.
-
Knowledge Graph Question Answering with semantic oriented fusion model
- 7 June 2021
- Based on semantic parsing framework
- Traditional knowledge-base question answering (KBQA) systems, such as retrieval-reasoning architectures [76] and semantic-parsing frameworks [77], need large amounts of training data, depend heavily on human experts during construction, and, as specialized systems, lack generalization ability.
-
Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
- Submitted on 22 May 2023 (v1), last revised 21 Feb 2024 (this version, v4)
- /damo-nlp-sg/chain-of-knowledge
- Graph-model complementarity creates new opportunities for knowledge graph question answering systems; today's augmented QA models operate either through fine-tuning or by directly applying large models [78]
-
LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT
- Submitted on 13 Jul 2023
- Some scholars have even explored the potential of ChatGPT for SPARQL query tasks [79]
-
Leveraging LLMs in Scholarly Knowledge Graph Question Answering
- Submitted on 16 Nov 2023
- /huntila/scholarly-kgqa
- Literature [80] proposes a few-shot method for answering scholarly knowledge graph questions with a large model. The specific steps are: for a target question, the model finds similar questions in the training set; these similar questions are fed to the large model as prompts to generate a SPARQL query for the target question, and the answer is finally obtained by querying the knowledge graph (a minimal few-shot prompting sketch follows after this entry). This process involves no pre-training of the large model, which reduces computational and time cost. The study reports strong scores on the SciQA challenge benchmark.
- Generates SPARQL queries via prompt engineering
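A minimal sketch of the few-shot prompting flow described in [80], assuming a simple string-similarity retriever in place of whatever similarity measure the paper uses; the prompt template and example questions are illustrative.

```python
from difflib import SequenceMatcher

def most_similar(question, train_set, k=2):
    """Pick the k training questions most similar to the target question;
    string similarity is a stand-in for the paper's similarity measure."""
    return sorted(train_set, key=lambda ex: SequenceMatcher(
        None, question.lower(), ex["question"].lower()).ratio(), reverse=True)[:k]

def build_prompt(question, examples):
    """Few-shot prompt: similar (question, SPARQL) pairs, then the target."""
    shots = "\n\n".join(f"Question: {ex['question']}\nSPARQL: {ex['sparql']}"
                        for ex in examples)
    return f"{shots}\n\nQuestion: {question}\nSPARQL:"

if __name__ == "__main__":
    train = [{"question": "Who authored paper X?",
              "sparql": "SELECT ?a WHERE { :paperX :author ?a }"},
             {"question": "Which venue published paper Y?",
              "sparql": "SELECT ?v WHERE { :paperY :venue ?v }"}]
    target = "Who authored paper Z?"
    print(build_prompt(target, most_similar(target, train, k=1)))
    # The prompt is then sent to the large model; the generated SPARQL query is
    # executed against the scholarly knowledge graph to obtain the answer.
```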
-
ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models
- Submitted on 13 Oct 2023 (v1), last revised 30 May 2024 (this version, v2)
- /lhrlab/chatkbqa
- ChatKBQA [81], in contrast, uses instruction tuning to fine-tune an open-source large model to obtain satisfactory semantic parsing capability: the large model converts a new natural language question into a logical form based on the candidates, and the corresponding SPARQL query is then generated by retrieving entity information in the converted logical form. ChatKBQA combines the strong semantic parsing capability of the large model with the interpretability of the knowledge graph, and introduces a new graph-model complementary paradigm, Graph of Thought Query (GQoT), which makes better use of external knowledge to improve the interpretability of question answering while reducing the large model's hallucinations.
- Fine-tunes a large model to generate SPARQL queries
-
Knowledgeable Preference Alignment for LLMs in Domain-specific Question Answering
- Submitted on 11 Nov 2023 (v1), last revised 10 Jun 2024 (this version, v3)
- /zjukg/knowpat
- In contrast to large-model dialogue augmentation for path reasoning and retrieval, KnowPAT [82] proposes a preference alignment framework for domain-specific question answering, a novel way of enhancing knowledge-graph-aware QA through knowledge preferences. Holding that the large model should use domain knowledge to generate reliable answers, KnowPAT constructs a style preference set and a knowledge preference set, and designs a new alignment objective that aligns the large model's preferences with human preferences, in order to train a more reliable and friendly QA system for real-scenario domains. However, KnowPAT does not carry over to all domains; in medicine and computer science, for example, it shows a clear drop in capability compared with the original model.
- Fine-tunes the large model with knowledge preferences
-
Bring Your Own KG: Self-Supervised Program Synthesis for Zero-Shot KGQA
- Submitted on 14 Nov 2023 (v1), last revised 22 May 2024 (this version, v2)
- /amazon-science/BYOKG-NAACL24
- BYOKG [83] designs a general question answering system aimed at improving generality and convenience: it can run on any knowledge graph without manually labeled data. Inspired by human curiosity-driven learning, the system first uses unsupervised learning to explore the unknown knowledge graph through graph traversal, then generates natural language questions with a large model to supplement the explored corpus, and finally uses retrieval-augmented reasoning to produce answer predictions.
- Large models to supplement the corpus
Knowledge Graph Enhanced Large Models
Enhancing the performance of large models themselves
Knowledge graphs for building pre-training corpora
-
The Woman Worked as a Babysitter: On Biases in Language Generation
- Submitted on 3 Sep 2019 (v1), last revised 23 Oct 2019 (this version, v2)
- /ewsheng/nlg-bias
- Research has shown [85] that using structured data can improve large model performance, helping the model verify factual questions and avoid hallucinations and fabrications at the level of the model itself. Therefore, in the pre-training stage, structured information from the knowledge graph (entities, relations, link paths) can be used as training data to strengthen the large model's own emergent abilities.
- KELM corpus: Knowledge Graph Based Synthetic Corpus Generation for Knowledge-Enhanced Language Model Pre-training
- Submitted on 23 Oct 2020 (v1), last revised 13 Mar 2021 (this version, v2)
- /google-research-datasets/KELM-corpus
- For example, the KELM corpus and the generalization-oriented model KGPT provide large models with real textual information converted from knowledge graphs.
- For example, literature [17] designs the TEKGEN language pipeline to convert entity subgraphs into a high-quality corpus (KELM). The pipeline consists of four parts: a heuristic aligner, a triple-to-text generator, an entity subgraph creator, and a post-processing filter that removes low-quality output. High-quality natural language text is generated by fine-tuning a sequence-to-sequence model (T5 [86]) on the training corpus (a toy verbalization sketch follows after this entry).
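To make the data flow concrete, here is a toy sketch of the triple-to-text step: each entity subgraph is verbalized into a corpus sentence. The template verbalizer stands in for the fine-tuned T5 generator in the actual pipeline, and the heuristic aligner and quality filter are omitted.

```python
def verbalize_triples(entity: str, triples: list[tuple[str, str, str]]) -> str:
    """Template-based verbalizer: a stand-in for the fine-tuned triple-to-text
    generator, used only to show the flow from entity subgraph to sentence."""
    facts = [f"{h} {r.replace('_', ' ')} {t}" for (h, r, t) in triples if h == entity]
    return ". ".join(facts) + "."

def build_corpus(subgraphs: dict[str, list[tuple[str, str, str]]]) -> list[str]:
    """Each entity subgraph becomes one synthetic corpus sentence; a real
    pipeline would also filter low-quality generations afterwards."""
    return [verbalize_triples(entity, triples) for entity, triples in subgraphs.items()]

if __name__ == "__main__":
    subgraphs = {"Marie Curie": [("Marie Curie", "born_in", "Warsaw"),
                                 ("Marie Curie", "field", "physics")]}
    for sentence in build_corpus(subgraphs):
        print(sentence)
```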
-
Text-to-Text Pre-Training for Data-to-Text Tasks
- Submitted on 21 May 2020 (v1), last revised 9 Jul 2021 (this version, v3)
- /google-research-datasets/ToTTo
- The sequence-to-sequence T5 model [86] is fine-tuned on the training corpus to generate high-quality natural language text.
-
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
- Submitted on 5 Oct 2020 (v1), last revised 11 Oct 2020 (this version, v2)
- For example, the KELM corpus and the generalization-oriented model KGPT provide large models with real textual information converted from knowledge graphs.
- KGPT [18] is a knowledge-enhanced pre-trained language model that automatically aligns knowledge graphs with text to build the knowledge-grounded corpus KGTEXT, and selects a high-semantic-overlap alignment strategy for denoising. The model itself also generalizes very well, showing good performance in zero-shot and few-shot learning.
-
ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base
- Submitted on 10 May 2023 (v1), last revised 17 May 2024 (this version, v2)
- /siyuyuan/analogykb
- In addition to this, the pre-training corpus includes the ANALOGYKB corpus that enhances analogical reasoning for large models [87].
Knowledge graphs embedded into large models
-
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Submitted on 11 Oct 2018 (v1), last revised 24 May 2019 (this version, v2)
- /google-research/bert
- In fact, shortly after the release of pre-training models represented by BERT [88] and GPT, a number of scholars have started to study knowledge-enhanced pre-training models [89, 90].
-
K-BERT: Enabling Language Representation with Knowledge Graph
- Submitted on 17 Sep 2019
- /autoliuweijie/K-BERT
- /alibaba/EasyNLP
- Examples include K-BERT [89], which injects triples of knowledge graphs into sentences as domain knowledge, and KnowBERT [90], an entity linking model.
-
Knowledge Enhanced Contextual Word Representations
- Submitted on 9 Sep 2019 (v1), last revised 31 Oct 2019 (this version, v2)
- /allenai/kb
- KnowBERT designs an approach that retrieves relevant entity embeddings with an integrated entity linker and updates contextual word representations through word-to-entity attention, embedding multiple knowledge bases into large-scale models to achieve knowledge enhancement.
-
ERNIE: Enhanced Language Representation with Informative Entities
- Submitted on 17 May 2019 (v1), last revised 4 Jun 2019 (this version, v3)
- /thunlp/ERNIE
- Today's knowledge-embedded large models usually use alignment techniques to associate knowledge graphs with natural language. Examples include ERNIE [91] and ERNIE 3.0 [92]. ERNIE is an enhanced language representation model that significantly improves knowledge-driven performance by building structured knowledge encoding modules that incorporate knowledge into language understanding.
-
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
- Submitted on 5 Jul 2021
- /PaddlePaddle/PaddleNLP/tree/develop/legacy/model_zoo/ernie-3.0
- ERNIE 3.0 improves on this by fusing auto-regressive and auto-encoding networks, trained on large amounts of plain text together with a large-scale knowledge graph.
-
SKILL: Structured Knowledge Infusion for Large Language Models
- Submitted on 17 May 2022
- Unlike the models above, SKILL [93] devises a method to train T5 models directly on the factual triples of knowledge graphs, without converting them into natural-language text, so the models can readily learn the embedded factual knowledge and apply it to question answering across industry domains.
-
KLMo: Knowledge Graph Enhanced Pretrained Language Model with Fine-Grained Relationships
- November 7–11, 2021
- In addition, knowledge-enhanced large models include KLMo, a pre-trained language model that integrates fine-grained relationships between entities [94]
-
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
- Submitted on 13 Nov 2019 (v1), last revised 23 Nov 2020 (this version, v3)
- /THU-KEG/KEPLER
- Unified Knowledge Embedding and Pre-trained Language Representation Model KEPLER
-
Structured Knowledge Grounding for Question Answering
- Submitted on 17 Sep 2022 (v1), last revised 5 Jun 2023 (this version, v3)
- Embedding Models for Converting Knowledge Graph Data into Natural Language [96]
-
KnowGPT: Knowledge Graph based Prompting for Large Language Models
- Submitted on 11 Dec 2023 (v1), last revised 4 Jun 2024 (this version, v5)
- KnowGPT, a black-box knowledge injection method based on ChatGPT [97]
-
Knowledge Graph-Enhanced Molecular Contrastive Learning with Functional Prompt
- 2022/6/28
- Literature [98] proposes Knowledge graph-enhanced molecular contrastive learning with functional prompts (KANO), which uses an element-oriented knowledge graph to augment a contrastive learning framework and embed domain expertise from chemistry; the knowledge graph is also used to generate functional prompts that help the large model grasp task-relevant knowledge during fine-tuning and provide chemically reasonable explanations for the predictions.
- pdf: //AAAI/article/download/20313/20072
Enhanced Large Model Reasoning
-
Large Language Models' Understanding of Math: Source Criticism and Extrapolation
- Submitted on 12 Nov 2023
- Large models do not perform well in handling structured reasoning (e.g., solving mathematical problems [99])
-
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
- Submitted on 17 May 2023 (v1), last revised 3 Dec 2023 (this version, v2)
- /princeton-nlp/tree-of-thought-llm
- Tree of Thoughts (ToT)
-
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
- Submitted on 18 Aug 2023 (v1), last revised 6 Feb 2024 (this version, v4)
- /spcl/graph-of-thoughts
- Graph of Thoughts (GoT)
-
CohortGPT: An Enhanced GPT for Participant Recruitment in Clinical Study
- Submitted on 21 Jul 2023
- Chain-of-thought prompting augments large model reasoning with a few example prompts and can perform complex reasoning by generating intermediate reasoning steps; for example, CohortGPT [102] used a chain-of-thought sampling strategy supplemented with a domain knowledge graph to strengthen a large model's reasoning in the medical domain.
-
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
- Submitted on 7 May 2023 (v1), last revised 9 Dec 2023 (this version, v2)
- /milesaturpin/cot-unfaithfulness
- However, the chain-of-thought approach can produce compromised inference results when it picks up biased features [103], drifting away from its original reasoning direction.
-
JointLK: Joint Reasoning with Language Models and Knowledge Graphs for Commonsense Question Answering
- Submitted on 6 Dec 2021 (this version), latest version 2 May 2022 (v2)
- /yueqing-sun/jointlk
- The JointLK model builds on QA-GNN, which cannot achieve bidirectional interaction because it only attaches the QA context to the knowledge graph as an extra node. JointLK instead realizes multi-step joint reasoning between the language model and the knowledge graph through dense bidirectional attention modules. Specifically, JointLK combines two modalities of information, the task text and the external knowledge graph, and designs a joint reasoning module that generates fine-grained bidirectional attention maps between each question token and each knowledge graph node to fuse the two modalities (a minimal bidirectional-attention sketch follows after this entry). JointLK also designs a dynamic graph pruning module that denoises by removing irrelevant graph nodes, ensuring the model reasons with complete and appropriate evidence. Experimental results show that JointLK performs well on complex reasoning problems involving negation.
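A minimal PyTorch sketch of dense bidirectional attention between question-token and KG-node representations, in the spirit of the joint reasoning module described above; the single-head formulation, dimensions, and residual fusion are simplifying assumptions rather than JointLK's exact architecture.

```python
import torch
import torch.nn as nn

class BiAttentionFusion(nn.Module):
    """Tokens attend to KG nodes and KG nodes attend to tokens, so information
    flows in both directions between the two modalities."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.scale = dim ** -0.5

    def forward(self, tokens: torch.Tensor, nodes: torch.Tensor):
        # tokens: (batch, n_tok, dim); nodes: (batch, n_node, dim)
        scores = torch.bmm(tokens, nodes.transpose(1, 2)) * self.scale
        tok_to_node = torch.softmax(scores, dim=-1)   # each token over nodes
        node_to_tok = torch.softmax(scores, dim=1)    # each node over tokens
        fused_tokens = tokens + torch.bmm(tok_to_node, nodes)
        fused_nodes = nodes + torch.bmm(node_to_tok.transpose(1, 2), tokens)
        return fused_tokens, fused_nodes

if __name__ == "__main__":
    fuse = BiAttentionFusion()
    toks, nds = torch.randn(2, 10, 64), torch.randn(2, 6, 64)
    ft, fn = fuse(toks, nds)
    print(ft.shape, fn.shape)   # (2, 10, 64) (2, 6, 64)
```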
-
QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
- Submitted on 13 Apr 2021 (v1), last revised 13 Dec 2022 (this version, v5)
- JointLK [104] and QA-GNN [105], by contrast, use graph neural networks (GNNs) and knowledge graphs to improve model reasoning. Compared with earlier models in which text and knowledge are handled independently, QA-GNN connects the question context with the retrieved knowledge to form a joint graph.
-
RoBERTa: A Robustly Optimized BERT Pretraining Approach
- Submitted on 26 Jul 2019
- /facebookresearch/fairseq
- Experiments show that QA-GNN achieves better results than RoBERTa [106].
-
Deep Bidirectional Language-Knowledge Graph Pretraining
- Submitted on 17 Oct 2022 (v1), last revised 19 Oct 2022 (this version, v2)
- /michiyasunaga/dragon
- The DRAGON [107] model employs a self-supervised learning strategy based on a deep bidirectional fusion of text and knowledge graphs (building on QA-GNN). By unifying two self-supervised tasks, masked language modeling (MLM) and link prediction, DRAGON pre-trains comprehensively over both text and knowledge graphs (a combined-loss sketch follows after this entry). This self-supervised strategy lets the model better understand the relationship between text and knowledge graphs and thus reason more accurately.
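A sketch of how the two self-supervised objectives can be combined into a single training loss, assuming a cross-entropy MLM term and a binary link-prediction term weighted by alpha; DRAGON's actual loss formulation and negative-sampling scheme may differ.

```python
import torch
import torch.nn.functional as F

def joint_pretraining_loss(mlm_logits, mlm_labels, triple_scores, triple_labels,
                           alpha: float = 1.0):
    """Masked language modelling over text plus link prediction over the KG.
    The weighting alpha and the binary cross-entropy form are assumptions."""
    mlm_loss = F.cross_entropy(mlm_logits.view(-1, mlm_logits.size(-1)),
                               mlm_labels.view(-1), ignore_index=-100)
    link_loss = F.binary_cross_entropy_with_logits(triple_scores, triple_labels)
    return mlm_loss + alpha * link_loss

if __name__ == "__main__":
    vocab, seq, batch = 100, 8, 2
    mlm_logits = torch.randn(batch, seq, vocab)
    mlm_labels = torch.randint(0, vocab, (batch, seq))
    triple_scores = torch.randn(batch, 4)             # scores for 4 candidate triples
    triple_labels = torch.randint(0, 2, (batch, 4)).float()
    print(joint_pretraining_loss(mlm_logits, mlm_labels, triple_scores, triple_labels))
```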
Enhanced Large Model Retrieval
-
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- Submitted on 22 May 2020 (v1), last revised 12 Apr 2021 (this version, v4)
- Retrieval is one of the most-studied ways for knowledge graphs to augment large models in natural language processing. A representative approach is Retrieval-Augmented Generation (RAG) [108]. RAG aims to solve the problem that large models cannot update their own knowledge, by indexing external vectors of real knowledge. RAG is a generative model that combines a query encoder, a pre-trained retriever, and a pre-trained generator, trained end to end. The specific steps are: a document retriever fetches latent documents, the large model is executed with the retrieved text as additional context to the input, and the target text sequence is finally produced (a retrieve-then-generate sketch follows after this entry).
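A minimal retrieve-then-generate sketch of the flow described above; the word-overlap retriever and the generic `ask_llm` callable are stand-ins for RAG's dense query encoder/retriever and its generator, which in the actual system are trained end to end.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query. A real
    RAG system uses a dense query encoder and a vector index instead."""
    q = set(query.lower().split())
    ranked = sorted(documents, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def rag_answer(query: str, documents: list[str], ask_llm) -> str:
    """Retrieved passages are handed to the generator as extra context
    alongside the question; ask_llm stands in for the generator."""
    context = "\n".join(retrieve(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return ask_llm(prompt)

if __name__ == "__main__":
    docs = ["The Eiffel Tower is in Paris.", "Mount Fuji is in Japan."]
    mock_llm = lambda p: p.splitlines()[1]     # echoes the top retrieved passage
    print(rag_answer("Where is the Eiffel Tower?", docs, mock_llm))
```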
-
LaMDA: Language Models for Dialog Applications
- Submitted on 20 Jan 2022 (v1), last revised 10 Feb 2022 (this version, v3)
- Similarly, the LaMDA [39] model comprises LaMDA-Base and LaMDA-Research. During execution, LaMDA-Base is first called to generate an output, which may contain untrue information; the model then repeatedly calls LaMDA-Research to interact with an information retrieval system until LaMDA-Research responds to the user, replacing unverifiable results. Compared with RAG, LaMDA searches over a smaller range, but the accuracy of its retrieved results is relatively high.
-
Improving language models by retrieving from trillions of tokens
- Submitted on 8 Dec 2021 (v1), last revised 7 Feb 2022 (this version, v3)
- To address the excessive memory overhead of large models, RETRO [109] enhances the language model by retrieving similar document chunks from a large corpus, letting a small-parameter model match GPT-3-level performance on downstream knowledge-intensive tasks. Both RAG [108] and RETRO [109] must train the retrieval model separately, and computational cost may rise when handling ultra-large-scale, frequently updated external documents.
-
Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution
- Submitted on 9 Oct 2023 (v1), last revised 23 May 2024 (this version, v2)
- Building on retrieval, KaLMA [110] constructs a pipeline of retrieval, re-ranking, and generation, where retrieval strengthens the knowledge-awareness of the large model. It also proposes the "Conscious Incompetence" setting: when the language model realizes it cannot provide a certain piece of knowledge, it inserts the marker "[NA]" into the text to indicate that this knowledge cannot be verified. Experimental results show that KaLMA has some potential to improve the citation-generation ability and retrieval accuracy of large models.
-
Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples
- Submitted on 22 Nov 2021 (v1), last revised 19 Oct 2022 (this version, v4)
- /ntunlp/kmlm
- Unlike the above, KMLM [111] is a knowledge-enhanced pre-training method based directly on massive multilingual triples: the triple information is converted into multilingual text in a code-like form while the structural information of the text is attached during pre-training. Experimental results show that KMLM improves implicit-knowledge reasoning and delivers a significant performance improvement on cross-lingual knowledge-intensive tasks (factual knowledge retrieval).
Enhancing Large Model Interpretability
-
LMExplainer: Grounding Knowledge and Explaining Language Models
- Submitted on 29 Mar 2023 (v1), last revised 16 Jul 2024 (this version, v3)
- Traditional approaches to model interpretability focus on intrinsic and post-hoc explanations, such as interpretable model structures and post-hoc feature selection, but fall short of explaining the model's decision-making process. Recent studies provide textual explanations via multi-instance learning, attention matrices, and external knowledge structures, yet still do not fully expose the model's reasoning. LMExplainer [112] designs a knowledge-enhanced explanation module that combines knowledge graphs with large models, using knowledge graphs and graph attention networks (GATs) [113] to extract the key decision signals of the large model and provide comprehensive, clear, and understandable textual explanations. The process is shown in Fig. 4: the large model produces embeddings of the input language while relevant knowledge is retrieved from the knowledge graph to build a subgraph; the language embedding and subgraph are fed to a graph neural network, and the graph attention network yields attention scores that generate the final prediction and the explanation of the decision (an attention-as-explanation sketch follows after this entry). Experimental results show that LMExplainer not only improves model performance but also explains the model's reasoning process more accurately.
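A minimal sketch of the attention-as-explanation idea: attention scores over retrieved knowledge-graph nodes, conditioned on the language-model embedding of the input, are read off as decision signals for the explanation. The bilinear scorer, dimensions, and example node names are assumptions and not LMExplainer's GAT architecture.

```python
import torch
import torch.nn as nn

class ExplanationAttention(nn.Module):
    """Scores each retrieved KG node against the language-model embedding and
    normalizes the scores; the top-weighted nodes become decision signals."""

    def __init__(self, dim: int = 32):
        super().__init__()
        self.score = nn.Bilinear(dim, dim, 1)   # simplified scoring layer

    def forward(self, lm_embedding: torch.Tensor, node_embeddings: torch.Tensor):
        # lm_embedding: (dim,); node_embeddings: (n_nodes, dim)
        query = lm_embedding.expand(node_embeddings.size(0), -1)
        attn = torch.softmax(self.score(query, node_embeddings).squeeze(-1), dim=0)
        return attn                               # one weight per KG node

if __name__ == "__main__":
    torch.manual_seed(0)
    explain = ExplanationAttention()
    attn = explain(torch.randn(32), torch.randn(5, 32))
    nodes = ["aspirin", "headache", "treats", "dosage", "fever"]   # toy subgraph
    top = attn.argsort(descending=True)[:2]
    print("decision signals:", [nodes[int(i)] for i in top])       # most-attended nodes
```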
-
Graph Attention Networks
- Submitted on 30 Oct 2017 (v1), last revised 4 Feb 2018 (this version, v3)
- LMExplainer [112] designs a knowledge-enhanced explanation module that combines knowledge graphs with large models, employing knowledge graphs and graph attention networks (GATs) [113] to extract the key decision signals of large models and provide comprehensive, clear, and comprehensible textual explanations.
-
XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making
- Submitted on 15 Nov 2023
- Similarly, XplainLLM [114] is the first dataset that captures the elements of large model reasoning and presents the decision-making process through human-understandable explanations, meeting the need for transparency, interpretability, and comprehensibility in large-model decisions. By combining knowledge graphs and graph attention networks, it constructs question-answer-explanation (QAE) triples that connect the model's reasoning process with entities and relations in the knowledge graph. Evaluation shows a 2.4% performance improvement for the large model when using explanations, together with better interpretability and comprehensibility in question answering tasks.
Graph-model complementary applications and prospects
Graph-model complementary applications
-
Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes
- Submitted on 29 Jan 2023 (v1), last revised 9 Dec 2023 (this version, v2)
- Clinical diagnostic cases, electronic health records, and other patient medical records are an important knowledge source for computational medicine. Large models can exploit these records to automatically construct medical knowledge graphs [115]
-
Leveraging A Medical Knowledge Graph into Large Language Models for Diagnosis Prediction
- Submitted on 28 Aug 2023
- At the same time, graph-model complementary systems can reason over and predict clinical cases, advancing medical diagnostic decision-support systems [116]
-
GraphCare: Enhancing Healthcare Predictions with Personalized Knowledge Graphs
- Submitted on 22 May 2023 (v1), last revised 17 Jan 2024 (this version, v3)
- For example, GraphCare [117] constructs personalized healthcare knowledge graphs for patients by extracting knowledge from rich clinical sources via prompt engineering, and uses a bi-attention augmented (BAT) graph neural network (GNN) model for downstream task prediction.
-
GPT-RE: In-context Learning for Relation Extraction using Large Language Models
- Submitted on 3 May 2023 (v1), last revised 9 Dec 2023 (this version, v3)
- For more complex entity-relation extraction, GPT-RE [118] adopts task-aware retrieval and gold-label-induced reasoning to achieve in-context learning for relation extraction.
- GPT-RE uses two task-aware retrieval methods that emphasize entity and relation information in text through encoded representations, then injects reasoning logic via gold-label-induced reasoning (similar to chain of thought), obtaining a large number of demonstrations aligning inputs with labels, and finally prompts the large model to achieve high accuracy and interpretability in relation extraction.
-
REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models
- Submitted on 10 Feb 2024
- Furthermore, to avoid hallucination, the REALM model [119] proposes a retrieval-augmented generation (RAG)-driven framework that extracts entities from various unstructured data (clinical notes and EHRs) and matches them against external expert knowledge graphs to ensure the consistency and accuracy of the model's outputs.
-
Exploring the Feasibility of ChatGPT for Event Extraction
- Submitted on 7 Mar 2023 (v1), last revised 9 Mar 2023 (this version, v2)
- event extraction
-
ChatGPT Evaluation on Sentence Level Relations: A Focus on Temporal, Causal, and Discourse Relations
- Submitted on 28 Apr 2023 (v1), last revised 26 Jan 2024 (this version, v3)
- Temporal Relationship Exploration
-
Zero-shot Temporal Relation Extraction with ChatGPT
- Submitted on 11 Apr 2023
- Temporal Relationship Extraction
-
Temporal Knowledge Graph Forecasting Without Knowledge Using In-Context Learning
- Submitted on 17 May 2023 (v1), last revised 20 Oct 2023 (this version, v3)
- However, the temporal knowledge graph (TKG) is a complex temporal multi-relational graph structure, and large models cannot directly understand structured temporal relational data; even so, they show some potential for forecasting on temporal knowledge graphs [123].
-
GenTKG: Generative Forecasting on Temporal Knowledge Graph with Large Language Models
- Submitted on 11 Oct 2023 (v1), last revised 16 Apr 2024 (this version, v5)
- /mayhugotong/gentkg
- For example, GenTKG [124] introduces an instruction-tuned large model into temporal knowledge graphs, adopting a temporal-rule-based retrieval strategy to predict temporal relations and outperforming traditional rule-based and embedding-based schemes. Meanwhile, GenTKG greatly reduces training cost through a lightweight instruction tuning technique.
-
Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models
- Submitted on 15 Jun 2023 (v1), last revised 27 Jun 2023 (this version, v2)
- /damo-nlp-sg/tempreason
- Large models also face problems in temporal knowledge reasoning [125]: their temporal reasoning ability can be biased by changes in temporal information.
-
Back to the Future: Towards Explainable Temporal Reasoning with Large Language Models
- Submitted on 2 Oct 2023 (v1), last revised 8 Oct 2023 (this version, v2)
- To address this, literature [126] fuses a temporal knowledge graph with a large model to design a system for complex temporal reasoning, which achieves interpretable prediction of future events by learning the context of temporal reasoning. The same work proposes ExpTime, a multi-source instruction-tuning dataset for explainable temporal reasoning, constructed as shown in Fig. 5(b) with the help of a temporal knowledge graph dataset and its temporal reasoning paths. Experiments show that the dataset improves the event prediction and explanatory reasoning ability of the large model (LLaMA2).
Others
Large open knowledge base:
- YAGO
- DBpedia
- Freebase
- Wikidata
Open Source Repository:
-
/thunlp/OpenKE
- An Open-Source Package for Knowledge Embedding (KE)
- /tatsu-lab/stanford_alpaca
- /facebookresearch/fairseq
- /alibaba/EasyNLP
- /PaddlePaddle/PaddleNLP