Episode 2: Contextualizing Data with RAG | Solving AI Hallucinations in ERPNext
In this second episode of our AI series, we move beyond the initial server setup to explore the critical role of data preparation and contextualization.
We discuss why simply installing a model like Qwen is not enough to handle the complex requirements of an ERP system.
The Mechanics of Next-Word Prediction
Large Language Models function by predicting the most probable next token, a word or a piece of one, in a sequence. This process is based on the foundational concept of Attention from the paper Attention Is All You Need, which lets Transformer models weigh every token in the input when predicting the text that follows. Without specific guidance, a model will choose the most statistically likely continuation based on its general training, which leads to inaccurate results in a specialized business setting.
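To make this concrete, here is a minimal, purely illustrative sketch of greedy next-token selection. The vocabulary and scores are invented and do not come from any real model; an actual Transformer computes these probabilities with attention layers over billions of parameters.

```python
import math

# Toy vocabulary and made-up scores; nothing here comes from a real model.
vocab = ["customers", "tabCustomer", "orders", "SELECT"]
logits = [4.2, 1.1, 3.0, 0.2]  # hypothetical scores after the prompt "SELECT * FROM "

def softmax(scores):
    # Turn raw scores into probabilities that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]
print(next_token)  # "customers": statistically likely in general, wrong for ERPNext
```

The point is simply that, without ERPNext-specific context, the generic name scores higher and gets emitted.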
The ERP Challenge: Why General AI Hallucinates
A standard LLM often produces incorrect but polite guesses, or hallucinations, when asked to write SQL for ERP systems because it has never seen your specific database structure. For example, a general model might try to select data from a table named customers, but in ERPNext the correct table name is tabCustomer. Models also struggle with ambiguity; a user might ask about John or pens, but the system needs specific Customer Codes like T0034 or Item Codes like ITM-001 to execute a valid query. The problem is even greater with private apps and custom fields that are not published on GitHub, since the LLM has no way to guess these internal table and field names.
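To illustrate the gap, the first statement below is the kind of guess a generic model tends to produce (the table does not exist in ERPNext), while the second is grounded in the real schema and an unambiguous Customer Code. Both are illustrative examples, not output captured from a model.

```python
# A generic model's guess: a "friendly" table name and an ambiguous filter,
# which is invalid against an ERPNext database.
hallucinated_sql = "SELECT * FROM customers WHERE customer_name = 'John'"

# Grounded version: ERPNext prefixes DocType tables with `tab` and identifies
# customers by their Customer Code rather than a free-text name.
grounded_sql = "SELECT name, customer_name FROM `tabCustomer` WHERE name = 'T0034'"
```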
Reducing Hallucinations with RAG
To solve these issues and achieve accurate results, we implement Retrieval-Augmented Generation, or RAG. The architecture works in three stages, sketched in code after the list:
1. Retrieval: The system retrieves specific facts from your local server, such as master table data, table names, field names, and the joins needed to connect them.
2. Augmentation: The original user question is enhanced or augmented with this retrieved metadata to provide the necessary context.
3. Generation: The LLM generates a precise response or SQL query grounded in these actual system facts rather than general probability.
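Below is a minimal, self-contained sketch of the three stages for a text-to-SQL question. The schema snippets are invented, the retrieval step uses crude keyword overlap as a stand-in for real vector similarity, and the generation step just returns the prompt; in the actual pipeline the snippets would come from an embedding index and the prompt would be sent to the local LLM.

```python
# Toy "vector store": a handful of invented schema snippets.
SCHEMA_SNIPPETS = [
    "Table `tabCustomer`: fields name (Customer Code), customer_name, territory",
    "Table `tabItem`: fields name (Item Code), item_name, item_group",
    "Join: `tabSales Invoice`.customer = `tabCustomer`.name",
]

def retrieve(question: str, top_k: int = 2) -> list:
    # 1. Retrieval: keyword overlap stands in for embedding similarity.
    words = question.lower().split()
    scored = sorted(
        SCHEMA_SNIPPETS,
        key=lambda s: sum(1 for w in words if w in s.lower()),
        reverse=True,
    )
    return scored[:top_k]

def augment(question: str, snippets: list) -> str:
    # 2. Augmentation: wrap the user's question with the retrieved metadata.
    return (
        "Use ONLY these ERPNext tables and fields:\n"
        + "\n".join(snippets)
        + "\n\nQuestion: " + question + "\nSQL:"
    )

def generate(prompt: str) -> str:
    # 3. Generation: in the real pipeline this prompt goes to the local LLM;
    #    here we return it unchanged so the sketch stays self-contained.
    return prompt

question = "Total sales for customer T0034"
print(generate(augment(question, retrieve(question))))
```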
The Role of Embedding Models
We use embedding models like Nomic-AI to convert words and schema into vectors, which are long lists of numbers in a multi-dimensional space. These vectors let the system perform mathematical matching, finding the nearest neighbors to a user's query among the stored schema embeddings. By supplying the right context as vectors, we shift the mathematical weight so the model generates the correct business logic instead of a generic answer.
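Here is a small sketch of the nearest-neighbor idea using made-up 3-dimensional vectors; a real embedding model like the one from Nomic-AI produces vectors with hundreds of dimensions, but the cosine-similarity arithmetic is the same.

```python
import math

def cosine_similarity(a, b):
    # Angle-based similarity between two vectors: 1.0 means identical direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Invented 3-dimensional "embeddings" for schema snippets and for a question.
schema_vectors = {
    "tabCustomer: name, customer_name":        [0.9, 0.1, 0.2],
    "tabItem: name, item_name":                [0.1, 0.8, 0.3],
    "tabSales Invoice: customer, grand_total": [0.7, 0.2, 0.6],
}
question_vector = [0.8, 0.15, 0.5]  # e.g. "total sales for customer T0034"

nearest = max(schema_vectors,
              key=lambda k: cosine_similarity(schema_vectors[k], question_vector))
print(nearest)  # the Sales Invoice snippet scores highest for this question
```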
Why RAG Focuses on Schema over Transactions
While it might seem helpful to send millions of sales invoices as context, transaction data is far too massive for an LLM to process efficiently. Instead, we use RAG to provide the map (master tables, field names, and the joins between them) and rely on SQL or the ORM to calculate totals and sums from the actual records. The AI supplies the correct query, and the database handles the heavy data processing.
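For example, once the LLM knows the real table and field names, the query it returns can let the database do the aggregation. The sketch below uses standard ERPNext tables and fields (tabSales Invoice, tabCustomer, grand_total) and the frappe.db.sql call, and assumes it is run inside a Frappe/ERPNext context such as bench console.

```python
import frappe

# Grounded aggregation: the database sums the rows; the LLM only needed the
# schema to write this. docstatus = 1 restricts the sum to submitted invoices.
sql = """
    SELECT si.customer,
           c.customer_name,
           SUM(si.grand_total) AS total_sales
    FROM `tabSales Invoice` si
    JOIN `tabCustomer` c ON c.name = si.customer
    WHERE si.docstatus = 1
    GROUP BY si.customer, c.customer_name
    ORDER BY total_sales DESC
"""

rows = frappe.db.sql(sql, as_dict=True)
for row in rows[:5]:
    print(row.customer, row.customer_name, row.total_sales)
```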
Key Topics Covered:
The fundamental logic of Transformers and Attention.
Why generic LLMs fail with private ERPNext apps and custom fields.
Using RAG to ground AI in master table data and specific schema.
The difference between processing transactions and providing context.
How embedding models like Nomic-AI turn text and schema into mathematical vectors.