Watch this episode here https://youtu.be/_HDwoLMKr1g
This episode focuses on the rigorous pipeline required to achieve 90%+ accuracy, specifically detailing how the embedding model is fine-tuned and deployed.
1. Preparation of Training and Synthetic Data
The core of this episode is the preparation of a high-quality training dataset for the Nomic AI (modernbert-embed-base) embedding model.
• Anchors and Positives: The training data pairs an "anchor" (the user question) with "positives" (the metadata required to build the SQL, such as table names, field names, joins, and master data identifiers).
• Messy Variants: To make the model robust against real-world human phrasing, the team uses GPT-4o mini to generate synthetic "messy" variants of every question. For example, a question about "out of stock items" is expanded into versions like "products with zero stock" or "items not present in bin".
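The anchor/positive format above can be sketched in a few lines. This is an illustrative data shape only; the field names, the example metadata string, and the helper `build_training_pairs` are assumptions, not the team's actual dataset format.

```python
# Pair one canonical question (and each of its GPT-generated messy variants)
# with the same "positive" metadata chunk. Names and values are illustrative.

def build_training_pairs(anchor: str, positive: str, messy_variants: list[str]) -> list[dict]:
    """Every phrasing of the question shares the one metadata positive."""
    questions = [anchor] + messy_variants
    return [{"anchor": q, "positive": positive} for q in questions]

pairs = build_training_pairs(
    anchor="Which items are out of stock?",
    positive="Table: tabBin | Fields: item_code, actual_qty | Join: tabBin.item_code = tabItem.name",
    messy_variants=[
        "products with zero stock",
        "items not present in bin",
    ],
)
print(len(pairs))  # 3 pairs sharing one positive
```

Expanding each question into variants this way multiplies the dataset without changing which metadata the model must retrieve.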
2. The Enriched Schema and Master Data
Raw MySQL schemas lack the descriptive depth an AI model needs for high accuracy. Episode 3 details the creation of an Enriched Schema:
• Contextual Metadata: This involves adding synonyms, descriptions, and join chunks to standard ERPNext Doctypes.
• Automation: Enriching the schema is currently a time-consuming manual process assisted by AI tools such as ChatGPT, but the team is working toward automating enriched schema generation.
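An enriched-schema entry might look like the sketch below. The exact keys (`synonyms`, `join_chunks`, etc.) and the field descriptions are assumptions mirroring the metadata kinds named above, not the project's actual enrichment format.

```python
# Illustrative enriched-schema record for one ERPNext Doctype ("Bin").
# All keys and text are hypothetical examples of the enrichment idea.

enriched_schema = {
    "doctype": "Bin",
    "table": "tabBin",
    "description": "Per-warehouse stock level for each item.",
    "synonyms": ["stock", "inventory", "on-hand quantity"],
    "fields": {
        "item_code": "Link to the Item master record",
        "actual_qty": "Quantity currently available in the warehouse",
    },
    "join_chunks": [
        "tabBin.item_code = tabItem.name",
        "tabBin.warehouse = tabWarehouse.name",
    ],
}
print(enriched_schema["synonyms"])
```

Records like this become the "positives" the embedding model learns to retrieve for a user question.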
3. Fine-Tuning via Google Colab
The team uses Google Colab Pro+ for the GPU power that fine-tuning the model demands.
• The Loss Function: A key technical explanation in this episode is the loss function, which mathematically pulls "positive" metadata vectors closer to the user's question vector in the embedding space while pushing irrelevant "negative" vectors away.
• Evaluation Metrics: The model's performance is measured using specific metrics: MRR (Mean Reciprocal Rank) tracks how highly the first correct answer is ranked, and NDCG (Normalized Discounted Cumulative Gain) checks that relevant results are ranked near the top.
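Both evaluation metrics are simple to compute. A minimal sketch, using only the standard definitions of MRR and NDCG (the example rankings are made up):

```python
import math

def mrr(ranked_hits: list[list[bool]]) -> float:
    """Mean Reciprocal Rank: average 1/rank of the first relevant result per query."""
    total = 0.0
    for hits in ranked_hits:
        for rank, hit in enumerate(hits, start=1):
            if hit:
                total += 1.0 / rank
                break
    return total / len(ranked_hits)

def ndcg(relevances: list[float]) -> float:
    """NDCG for one query: DCG of the ranking divided by DCG of the ideal ordering."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Two queries: the correct chunk is ranked first, then second.
print(mrr([[True, False], [False, True]]))  # (1/1 + 1/2) / 2 = 0.75
print(ndcg([1, 0]))  # perfect ordering -> 1.0
```

An MRR near 1.0 means the right metadata chunk is almost always the top hit, which is what the 90%+ accuracy target depends on.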
4. Deployment on Hugging Face and Replicate
Once fine-tuned, the model is pushed to Hugging Face for storage and then deployed as a production-grade AI server.
• Replicate and Docker: The models are wrapped in Docker containers (using Cog) and deployed via Replicate.
• The Multi-Model Strategy: The architecture uses a Qwen 4B model for heavy tasks such as SQL generation and rewriting questions based on chat history. A lighter Qwen 1.5B model formats database results into a friendly, user-readable text response.
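The multi-model split above amounts to a routing rule: heavy reasoning goes to the 4B model, light formatting to the 1.5B model. A minimal sketch, where the task names and model identifiers are assumptions for illustration, not the production configuration:

```python
# Hypothetical router for the two-model architecture described above.

HEAVY_MODEL = "qwen-4b"    # SQL generation, chat-history question rewriting
LIGHT_MODEL = "qwen-1.5b"  # formatting query results into friendly text

def route(task: str) -> str:
    """Pick a model by task type: heavy reasoning vs. light formatting."""
    if task in {"generate_sql", "rewrite_question"}:
        return HEAVY_MODEL
    if task == "format_results":
        return LIGHT_MODEL
    raise ValueError(f"unknown task: {task}")

print(route("generate_sql"))    # qwen-4b
print(route("format_results"))  # qwen-1.5b
```

Routing the cheap formatting step to the smaller model keeps per-request cost and latency down without touching SQL quality.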
5. Open Source and Contribution
The episode concludes with an invitation for the community to contribute training datasets via the ERPGulf GitHub and Hugging Face repositories. By sharing training data and reporting bugs, developers can help refine the model's understanding of various ERP schema styles.




