Is Your AI Target Defensible? How RAGRecon Solves the Trust Gap in Cybersecurity

RAGRecon is a system that improves Cyber Threat Intelligence by integrating Large Language Models and Retrieval-Augmented Generation.

As cyber threats grow in volume and complexity, traditional security measures are struggling to keep up. Rapid collection, analysis, and application of Cyber Threat Intelligence (CTI) is now essential for effective defense. Integrating Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) offers a new approach to CTI analysis. The combination grounds an LLM's advanced text processing in documents retrieved at query time, enabling automated cyber threat intelligence.

View RAGRecon Research Paper

RAGRecon Proof-of-Concept

RAGRecon is a novel proof-of-concept that integrates LLMs, RAG, and Knowledge Graphs (KGs) to deliver explainable cyber threat intelligence. Its core function is to provide clear, context-aware answers to complex cybersecurity questions. The system ingests domain-specific documents, such as technical threat reports. When a user poses a query, it retrieves the most relevant passages from those documents to form a context.
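The retrieval step can be illustrated with a minimal sketch. It assumes each document chunk has already been converted to an embedding vector and ranks chunks by cosine similarity to the query vector. The function names, the toy three-dimensional vectors, and the choice of cosine similarity are illustrative assumptions, not details confirmed by the paper.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_vec, indexed_chunks, k=2):
    """Return the k chunk texts most similar to the query vector.

    indexed_chunks is a list of (text, vector) pairs (hypothetical layout).
    """
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy index with made-up 3-dimensional "embeddings" (assumption for illustration).
TOY_INDEX = [
    ("Phishing campaign targets finance staff.", [0.9, 0.1, 0.0]),
    ("Smart contract reentrancy drains funds.", [0.0, 0.2, 0.9]),
    ("Credential theft via fake login page.", [0.8, 0.3, 0.1]),
]

# The retrieved texts together form the context handed to the LLM.
context = retrieve_top_k([1.0, 0.2, 0.0], TOY_INDEX, k=2)
```

In a real deployment the vectors would come from an embedding model and the ranking would typically be delegated to a vector store rather than a full sort.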

From this context, RAGRecon generates not only a conversational textual answer but also a visual Knowledge Graph that represents the key entities and relationships the model used in its reasoning. This dual-output approach provides a transparent, interpretable layer, enabling analysts to trust and verify the system’s conclusions in real time.


Foundational Work, Architecture, and Performance of the RAGRecon System

As shown in the study, RAGRecon employs an end-to-end data pipeline to process and retrieve unstructured cyber threat intelligence data efficiently. Source documents, such as PDF reports, are segmented into 1000-character chunks with a 100-character overlap so that context is preserved across chunk boundaries. Each chunk is then converted into an embedding using the sentence-transformers/all-MiniLM-L6-v2 model.
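The chunking described above can be sketched as a fixed-size sliding window. This is a simplified illustration: the chunk size and overlap come from the article, but the function name and the plain character-window strategy are assumptions, and the actual system may split on sentence or paragraph boundaries instead.

```python
def chunk_text(text, chunk_size=1000, overlap=100):
    """Split text into chunks of chunk_size characters, each sharing
    `overlap` characters with the previous chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window has reached the end of the text
    return chunks

# Stand-in for text extracted from a PDF report (2,500 characters).
report = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(report)
```

With a 900-character step, the 2,500-character sample produces three chunks, and the last 100 characters of each chunk reappear at the start of the next. Each chunk would then be passed to the embedding model.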

The study evaluated several LLMs within RAGRecon using two custom datasets, one on conventional CTI and another on blockchain-specific threats, each with 50 questions. The RAG system showed strong factuality and reliability scores, consistently above 0.8 on a 1.0 scale, indicating minimal hallucination.

The retrieval mechanism was efficient, as only 8% of retrieved contexts yielded a complete answer. Manual analysis of 2,050 automated decisions confirmed a correct decision rate of 90% to 97%. Minor performance variations are linked to occasional errors from both the generation and self-evaluation models.

RAGRecon Dual-Output Generation For Textual and Visual Insights

A key innovation of RAGRecon is its dual-output capability, which delivers both a direct textual answer and a visual explanation of data relationships. For the textual response, the system integrates the retrieved context and the user query into a model-agnostic prompt that instructs the LLM to generate a coherent answer based solely on the provided information. Simultaneously, a specialized prompt directs the LLM to extract the primary entities and relationships and output them in a structured JSON format, which is then parsed and visualized as an interactive graph.
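A minimal sketch of the two outputs follows. The prompt wording, the JSON schema (an "entities" list plus "relationships" objects with "source", "relation", and "target" keys), the function names, and the sample entity "APT-X" are all hypothetical; the article does not specify the prompts or schema RAGRecon actually uses.

```python
import json

# Hypothetical answer prompt: restricts the model to the retrieved context.
ANSWER_PROMPT = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def build_answer_prompt(context_chunks, question):
    """Combine retrieved chunks and the user query into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return ANSWER_PROMPT.format(context=context, question=question)

def parse_graph(llm_output):
    """Turn the LLM's JSON output into a node list and (source, relation, target) edges."""
    data = json.loads(llm_output)
    nodes = set(data.get("entities", []))
    edges = []
    for rel in data.get("relationships", []):
        src, label, dst = rel["source"], rel["relation"], rel["target"]
        nodes.update([src, dst])  # make sure every edge endpoint is a node
        edges.append((src, label, dst))
    return sorted(nodes), edges

# Example of what a model might return for the extraction prompt (made-up data).
sample_output = (
    '{"entities": ["APT-X", "Phishing"], '
    '"relationships": [{"source": "APT-X", "relation": "uses", "target": "Phishing"}]}'
)

prompt = build_answer_prompt(["APT-X uses phishing."], "What does APT-X use?")
nodes, edges = parse_graph(sample_output)
```

The resulting nodes and edges could be handed to any graph visualization library. In practice the parser would also need to handle malformed JSON, since models do not always follow the requested format.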

RAGRecon Enhanced Efficacy for Security Operations

The research offers cybersecurity professionals a valuable approach to automated cyber threat intelligence. Notably, a robust RAGRecon system could reduce the cognitive load and manual effort needed to analyze unstructured cyber threat reports. It could also help analysts quickly visualize and understand complex threat intelligence, identify cybersecurity vulnerabilities, and develop a threat response.

Conclusion

The field of Cyber Threat Intelligence requires AI systems that are both powerful and explainable. The research builds on the RAGRecon system, which already generates accurate, fact-based answers from complex security documents. It addresses the main limitations by developing methodologies for reliable Knowledge Graph generation, the key to the system's explainability. Resolving this bottleneck would unlock the full potential of LLM-driven cyber threat intelligence analysis and deliver a trustworthy, effective tool for cybersecurity professionals.

Sources

Large Language Models for Explainable Threat Intelligence (RAGRecon). arXiv:2511.05406, 7 Nov 2025. Abstract: https://arxiv.org/abs/2511.05406 PDF: https://arxiv.org/pdf/2511.05406
