Big tech company fine-tunes generative AI with 235 domain experts

How to differentiate a GenAI open-source LLM: have it fine-tuned by data annotators who are qualified experts in their field

Our client wanted to fine-tune its GenAI open-source large language model (LLM) to increase its accuracy, safety and robustness. Realizing that those goals would be hard to achieve with a conventional crowdsourcing approach to data annotation, the company reached out to RWS, whose TrainAI team quickly recruited, trained and managed a scalable team of qualified subject-matter experts to complete the work as data annotators.

TrainAI by RWS follows the principles of responsible AI to deliver dependable LLM training and fine-tuning data that is ethically sourced, fair, accurate, reliable, transparent, explainable, private and secure.

Challenges

  • Maximize LLM accuracy by training it on specific topic areas 
  • Improve safety and security by mitigating the risk of generating hallucinations or harmful content 
  • Achieve a standard that makes the LLM a resource for professionals

Solution

  • TrainAI from RWS
    • Generative AI data services
    • Domain expertise: recruiting, training and managing subject-matter experts as data annotators 
    • Content creation: prompt engineering 
    • Model fine-tuning: prompt-response QA, fact extraction and verification 
    • Risk mitigation: red teaming

Results

  • 4-week project ramp-up 
  • 235 domain experts recruited as part-time RWS employees 
  • 32,000 hours of work done in the first 3 months 
  • Supported training and rollout of the client's latest LLM version