LLM-Driven Ontology Construction for Enterprise Knowledge Graphs


Summary

OntoEKG is an LLM-driven pipeline by Oyewale & Soru (Liber AI Research) that generates domain-specific RDF/OWL ontologies for enterprise knowledge graphs (EKGs) directly from unstructured enterprise text. It decomposes ontology modelling into two phases: an extraction module that identifies core classes and properties, and an entailment module that logically structures those classes into a hierarchy before RDF serialisation. Evaluated on a new three-sector dataset (Data, Finance, Logistics), it reaches a fuzzy-match F1 of 0.724 in the Data domain but reveals clear limitations in scope definition and hierarchical reasoning. The paper doubles as a call to action for a comprehensive end-to-end ontology-construction benchmark.

Key Contributions

OntoEKG pipeline — a two-step LLM process (extraction then entailment) that turns unstructured enterprise text into a formal RDF ontology serialised to Turtle.
A call for a benchmark — argues existing benchmarks (OntoURL, Text2KGBench, OSKGC, LLMs4OL) do not support end-to-end ontology construction from unstructured text, and urges the community to build one.
A new evaluation dataset — three enterprise policy-text use cases in the Data, Finance, and Logistics sectors (released in the OntoEKG GitHub repo).

Methodology and Architecture

Formalisation: from text T, infer classes C^T and properties P^T. Each class has a label and description; each property has a label, a domain class, and a range class. Classes form a hierarchy (c1 ⊆ c2). In RDF, classes are owl:Class, properties are owl:ObjectProperty, and datatypes are reified into their own classes (Schema.org style).
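
The formalisation above can be sketched as plain data structures. This is a minimal illustration, not the authors' code: the names `OntClass`, `OntProperty`, `Ontology`, and `dangling_references`, as well as the example labels `appliesTo` and `DataAsset`, are hypothetical. It only shows that every property's domain and range, and every hierarchy edge, should refer to a declared class.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OntClass:
    label: str
    description: str


@dataclass(frozen=True)
class OntProperty:
    label: str
    domain: str  # label of the domain class
    range: str   # label of the range class


@dataclass
class Ontology:
    classes: dict = field(default_factory=dict)      # label -> OntClass (C^T)
    properties: list = field(default_factory=list)   # list of OntProperty (P^T)
    subclass_of: set = field(default_factory=set)    # (c1, c2) means c1 ⊆ c2

    def dangling_references(self):
        """Return class labels that are used but never declared."""
        used = set()
        for prop in self.properties:
            used.update((prop.domain, prop.range))
        for child, parent in self.subclass_of:
            used.update((child, parent))
        return sorted(used - set(self.classes))


# Hypothetical example content, for illustration only.
onto = Ontology()
onto.classes["Policy"] = OntClass("Policy", "A rule set adopted by the organisation.")
onto.classes["DataAsset"] = OntClass("DataAsset", "A dataset owned by the organisation.")
onto.properties.append(OntProperty("appliesTo", "Policy", "DataAsset"))
print(onto.dangling_references())
```

An empty result means the class and property sets are mutually consistent; any label that appears in the result would have to be declared as a class or removed before serialisation.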

Four-stage pipeline:

Data Ingestion — unstructured text in; Pydantic data models force valid JSON output (classes, properties, descriptions, domains, ranges).
Ontological Element Extraction — an extraction LLM with a specialised system prompt identifies Classes (entity types) and Properties (relationships), constrained to the provided schema.
Hierarchy Construction with Entailment — an entailment LLM iteratively reasons over class descriptions to infer subclass/inheritance relationships and build the taxonomy.
RDF Serialisation — properties + hierarchy merged and written to Turtle via rdflib.

Models: Extraction = Google Gemini 3 Flash (preview); Entailment = Anthropic Claude 4.5 Opus. Run on Google Colab. Other entailment candidates (Gemini 2.5 Flash/Pro, 3 Flash preview, Claude 4.5 Sonnet) were tried but underperformed; Gemini 2.5 Pro was dropped for efficiency.

Results

Two matching schemes on the three use cases (fuzzy match = embedding-based triple alignment, thresholds 0.94/0.94/0.95):

Use case	Exact F1	Fuzzy F1
Data	0.102	0.724
Finance	0.000	0.121
Logistics	0.048	0.431
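
To make the fuzzy-match scores concrete, the sketch below aligns predicted triples to gold triples by cosine similarity and derives precision, recall, and F1. It is an assumption-laden illustration: the page does not specify the embedding model or the alignment procedure, so `embed` is a toy character-trigram stand-in for a real embedding model, the greedy one-to-one alignment in `fuzzy_prf` is one plausible choice, and all function names and example triples are hypothetical.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy stand-in for an embedding model: character n-gram counts."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def f1(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def fuzzy_prf(predicted, gold, threshold):
    """Greedily align each predicted triple to its best unmatched gold triple."""
    unmatched = [embed(" ".join(t)) for t in gold]
    hits = 0
    for triple in predicted:
        vec = embed(" ".join(triple))
        scores = [cosine(vec, g) for g in unmatched]
        if scores and max(scores) >= threshold:
            unmatched.pop(scores.index(max(scores)))  # each gold triple matches once
            hits += 1
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall, f1(precision, recall)


# Hypothetical triples, for illustration only.
gold = [("Policy", "appliesTo", "DataAsset")]
predicted = [("Policy", "appliesTo", "DataAsset"), ("xyz", "qqq", "www")]
precision, recall, score = fuzzy_prf(predicted, gold, threshold=0.94)
print(precision, recall, score)

# Consistency check on the Data row: P=0.656 and R=0.807 give F1 of about 0.724.
print(round(f1(0.656, 0.807), 3))
```

Setting the threshold to 1.0 with an exact string comparison in place of `cosine` would reduce this to exact matching, which is why the exact-match column is so much lower than the fuzzy one when labels differ only in wording.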

Data was best (fuzzy P=0.656, R=0.807, F1=0.724); Finance worst (F1=0.121), due to disagreement over which terms belong in the ontology.
Qualitative failures: “Policy” and “GovernanceStandard” were each declared a subclass of the other (a spurious equivalence), and an “isTypeOf” property was generated whose meaning is ambiguous between rdfs:subClassOf and rdf:type.
Limitations: LLMs struggle to set ontology scope autonomously, sometimes propose individuals instead of classes (no declared abstraction level), and confuse the directionality of hierarchy relations with loose subsumption — hurting logical consistency.
Future work: end-to-end text→RDF translation, handling named individuals and provenance metadata, progressive ontology construction by feeding the existing model back into OntoEKG, and a community benchmark.
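
The mutual-subclass failure noted above is the kind of inconsistency that a simple post-hoc check could flag. The sketch below is a hypothetical helper, not part of OntoEKG: `ancestors` and `spurious_equivalences` are assumed names, and `RetentionPolicy` is an invented label. It reports pairs of classes that reach each other through subclass edges and are therefore logically equivalent.

```python
from collections import defaultdict


def ancestors(edges):
    """Map each class to every class reachable via (child, parent) edges."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    closure = {}
    for start in list(parents):
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in parents.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        closure[start] = seen
    return closure


def spurious_equivalences(edges):
    """Return sorted pairs of distinct classes that are subclasses of each other."""
    closure = ancestors(edges)
    pairs = set()
    for a, reachable in closure.items():
        for b in reachable:
            if a != b and a in closure.get(b, set()):
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


edges = [
    ("Policy", "GovernanceStandard"),
    ("GovernanceStandard", "Policy"),  # the reverse edge creates the cycle
    ("RetentionPolicy", "Policy"),     # hypothetical, well-formed edge
]
print(spurious_equivalences(edges))  # [('GovernanceStandard', 'Policy')]
```

Because the check works on the transitive closure, it also catches longer cycles (A under B, B under C, C under A), not only direct two-class loops.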

Related Papers

No closely related papers in the wiki yet. The current siblings (pedestrian-robot interaction, sidewalk delivery-robot evaluation, pedestrian capacity) are HRI/logistics topics that do not overlap with this paper’s LLM ontology-construction method.
