<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>LLMs | LARK: NLP &amp; AI Research Lab @ CU</title><link>https://www.larknlp.com/tag/llms/</link><atom:link href="https://www.larknlp.com/tag/llms/index.xml" rel="self" type="application/rss+xml"/><description>LLMs</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Tue, 09 Jun 2026 00:00:00 +0000</lastBuildDate><image><url>https://www.larknlp.com/media/icon_hu5ad624ac1c82e37640b1fcc57b0f97c5_291087_512x512_fill_lanczos_center_3.png</url><title>LLMs</title><link>https://www.larknlp.com/tag/llms/</link></image><item><title>LogosKG: Hardware-Optimized Scalable and Interpretable Knowledge Graph Retrieval</title><link>https://www.larknlp.com/projects/logoskg/</link><pubDate>Tue, 09 Jun 2026 00:00:00 +0000</pubDate><guid>https://www.larknlp.com/projects/logoskg/</guid><description>&lt;p>&lt;em>Project theme&lt;/em>: Enabling scalable, interpretable multi-hop retrieval over large biomedical knowledge graphs as a foundation for trustworthy KG-LLM integration.&lt;/p>
&lt;p>&lt;em>Lead&lt;/em>: This project was led by He Cheng, Postdoctoral Researcher at the LARK Lab.&lt;/p>
&lt;h2 id="project-motivation">Project Motivation&lt;/h2>
&lt;p>Integrating knowledge graphs (KGs) with large language models (LLMs) is one of the most promising paths toward structured, verifiable biomedical reasoning. The central operation in this integration is multi-hop retrieval: starting from a query entity and traversing chains of relations across genes, pathways, diseases, and therapies to surface relevant evidence. Yet existing systems cannot simultaneously satisfy three basic requirements (efficiency, scalability, and interpretability): they are either efficient but opaque, interpretable but slow, or scalable but approximate.&lt;/p>
&lt;p>This limitation is not incidental. It reflects a mismatch between how KG traversal is conventionally implemented (as iterative graph database queries) and how modern hardware actually executes computation. At billion-edge scale, conventional approaches become slow and memory-intensive, so interpretability is sacrificed for speed, or scale is sacrificed for faithfulness.&lt;/p>
&lt;p>This project asks a precise question:&lt;/p>
&lt;h4 id="can-multi-hop-kg-retrieval-be-made-simultaneously-efficient-scalable-and-interpretable--without-trading-one-off-against-the-others">Can multi-hop KG retrieval be made simultaneously efficient, scalable, and interpretable — without trading one off against the others?&lt;/h4>
&lt;h2 id="what-we-built">What We Built&lt;/h2>
&lt;p>We introduce &lt;strong>LogosKG&lt;/strong>, a hardware-aligned framework that reformulates k-hop KG traversal as a sequence of hardware-efficient matrix operations over decomposed subject, object, and relation representations. By grounding traversal in symbolic KG formulations rather than neural approximations, LogosKG preserves full interpretability: every retrieved path is an exact, traceable chain of relations.&lt;/p>
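&lt;p>To make the reformulation concrete, here is a minimal sketch of k-hop traversal written as matrix products. It is an illustration under stated assumptions, not the LogosKG implementation: the toy triples, the dense NumPy incidence matrices &lt;code>S&lt;/code> and &lt;code>O&lt;/code>, the per-triple relation mask, and the function name &lt;code>k_hop&lt;/code> are all hypothetical. A real billion-edge system would presumably use sparse or blocked storage rather than dense arrays.&lt;/p>

```python
import numpy as np

# Toy KG as (subject, relation, object) triples. All names are illustrative.
triples = [
    ("BRCA1", "participates_in", "DNA_repair"),
    ("DNA_repair", "associated_with", "breast_cancer"),
    ("breast_cancer", "treated_by", "olaparib"),
    ("TP53", "associated_with", "breast_cancer"),
]
entities = sorted({e for s, _, o in triples for e in (s, o)})
idx = {e: i for i, e in enumerate(entities)}
n_t, n_e = len(triples), len(entities)

# Assumed decomposition: one incidence matrix per triple role.
# S[t, e] = 1 if entity e is the subject of triple t; O likewise for objects.
S = np.zeros((n_t, n_e), dtype=np.int64)
O = np.zeros((n_t, n_e), dtype=np.int64)
for t, (s, _, o) in enumerate(triples):
    S[t, idx[s]] = 1
    O[t, idx[o]] = 1


def k_hop(start, k, allowed=None):
    """Return, for each hop, the exact triples traversed from `start`."""
    # Relation representation: a 0/1 mask over triples (all ones if no filter).
    rel_mask = np.array(
        [allowed is None or r in allowed for _, r, _ in triples], dtype=np.int64
    )
    frontier = np.zeros(n_e, dtype=np.int64)
    frontier[idx[start]] = 1
    hops = []
    for _ in range(k):
        # Triples whose subject lies in the current frontier, filtered by relation.
        active = np.minimum(S @ frontier, 1) * rel_mask
        # Each active row is a concrete triple, so every hop stays traceable.
        hops.append([triples[t] for t in np.flatnonzero(active)])
        # Objects of the active triples form the next frontier.
        frontier = np.minimum(O.T @ active, 1)
    return hops


for hop, used in enumerate(k_hop("BRCA1", 3), start=1):
    print(hop, used)
```

&lt;p>Because each hop records the triples that fired instead of only the reached entities, the retrieved paths can be read back as explicit chains of relations, which is the property the symbolic formulation is meant to preserve.&lt;/p>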
&lt;p>To scale to billion-edge graphs, LogosKG integrates three mechanisms:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Degree-aware partitioning&lt;/strong> — distributes graph structure across memory hierarchies in proportion to node connectivity, avoiding bottlenecks at high-degree hubs.&lt;/li>
&lt;li>&lt;strong>Cross-graph routing&lt;/strong> — enables traversal across heterogeneous KG sources without requiring a unified index, supporting multi-source biomedical KGs.&lt;/li>
&lt;li>&lt;strong>On-demand caching&lt;/strong> — loads subgraph neighborhoods lazily, reducing peak memory footprint for sparse traversal patterns.&lt;/li>
&lt;/ul>
&lt;p>Together, these allow LogosKG to perform chained inferences across genes, pathways, diseases, and therapies on a single workstation at scales previously requiring distributed infrastructure.&lt;/p>
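&lt;p>As a rough sketch of how two of these mechanisms could interact, the example below keeps high-degree hubs resident in a fast tier and loads every other neighborhood lazily through a small LRU cache. Everything here is an assumption made for illustration: the toy edge list, the &lt;code>HUB_THRESHOLD&lt;/code> cutoff, the in-memory &lt;code>cold_store&lt;/code> dictionary standing in for slower storage, and the helper names. The actual partitioning and caching policies in LogosKG may differ.&lt;/p>

```python
from collections import Counter, defaultdict
from functools import lru_cache

# Toy directed edges; "hub" is the only high-degree node.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "d"), ("b", "d")]

out_neighbors = defaultdict(list)
for s, o in edges:
    out_neighbors[s].append(o)
degree = Counter(s for s, _ in edges)

HUB_THRESHOLD = 3  # hypothetical cutoff for "high-degree"

# Degree-aware split: hubs stay resident in the fast tier.
resident = {
    n: tuple(nb) for n, nb in out_neighbors.items() if degree[n] >= HUB_THRESHOLD
}
# Everything else lives in a slower tier (a dict stands in for disk or host memory).
cold_store = {n: tuple(nb) for n, nb in out_neighbors.items() if n not in resident}

loads = []  # records each real fetch from the slow tier


@lru_cache(maxsize=2)
def load_cold(node):
    """Fetch a neighborhood from the slow tier on first use, then serve from cache."""
    loads.append(node)
    return cold_store.get(node, ())


def neighbors(node):
    """Return out-neighbors, preferring the resident tier for hubs."""
    if node in resident:
        return resident[node]
    return load_cold(node)


print(neighbors("hub"))
print(neighbors("a"), neighbors("a"))
print(loads)
```

&lt;p>In this sketch the second lookup of the same low-degree node is served from the cache, so only neighborhoods that a traversal actually touches are ever fetched from the slow tier.&lt;/p>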
&lt;h2 id="key-results">Key Results&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Substantial efficiency gains&lt;/strong> over CPU and GPU baselines across k-hop retrieval benchmarks, with no loss of retrieval fidelity relative to exact methods.&lt;/li>
&lt;li>&lt;strong>Billion-edge scalability&lt;/strong> demonstrated on large biomedical KGs, with latency and memory costs substantially lower than Neo4j and GPU-accelerated graph frameworks.&lt;/li>
&lt;li>&lt;strong>KG-LLM interaction analysis&lt;/strong>: a two-round pipeline using LogosKG reveals how KG topology shapes the alignment between structured biomedical knowledge and LLM diagnostic reasoning. Dense, well-studied regions of the KG produce high LLM-KG alignment; sparse regions expose the limits of LLM biomedical knowledge.&lt;/li>
&lt;/ul>
&lt;h2 id="why-this-matters">Why This Matters&lt;/h2>
&lt;p>Most current KG-LLM integration work treats the KG as a static retrieval index and the LLM as a fixed reasoner. LogosKG enables a different relationship: because retrieval is fast, interpretable, and scalable, it becomes feasible to study how the &lt;em>structure&lt;/em> of the KG shapes LLM reasoning, and to design integration strategies that account for that structure rather than ignoring it.&lt;/p>
&lt;p>This has direct implications for biomedical discovery. When LLM reasoning is anchored to KG regions with dense, well-curated evidence, it is more reliable. When it ventures into sparse, long-tail regions, it is more likely to hallucinate or confabulate. LogosKG makes this distinction visible and actionable.&lt;/p>
&lt;h2 id="connection-to-broader-research">Connection to Broader Research&lt;/h2>
&lt;p>LogosKG is the scalable knowledge substrate underlying our AI4Science research program. It directly addresses the alignment coverage and scalability gap, and provides the retrieval foundation for multi-step reasoning.&lt;/p>
&lt;h2 id="publication">Publication&lt;/h2>
&lt;p>&lt;strong>LogosKG: Hardware-Optimized Scalable and Interpretable Knowledge Graph Retrieval&lt;/strong>&lt;br>
LARK Lab, University of Colorado Anschutz&lt;br>
&lt;em>ACL 2026 (Main Conference)&lt;/em>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Paper:&lt;/strong> &lt;a href="https://arxiv.org/abs/2604.18913" target="_blank" rel="noopener">openreview.net/forum?id=AvpJrTtFKb&lt;/a>&lt;/li>
&lt;li>&lt;strong>Keywords:&lt;/strong> knowledge graphs, neurosymbolic approaches, biomedical knowledge graphs, clinical NLP, hardware-efficient inference&lt;/li>
&lt;/ul></description></item></channel></rss>