ServiceNow used SuperAnnotate, the platform that helps enterprises build efficient human data and evaluation pipelines to ship better agentic, multimodal, and frontier AI, to develop a domain-specific vision-language model (DSvLM). Using the platform and SuperAnnotate’s AI Operations Services, the team built an 18k-image training corpus, which ServiceNow used to fine-tune an open-weight Llama-based model that now outperforms GPT-4o on this task. The result is StarFlow, an open-source vision-language model that lets any ServiceNow user transform hand-drawn ideas into executable automation in minutes, illustrating how fine-tuning on domain-specific data can raise model performance on specific tasks.

About ServiceNow
ServiceNow is the cloud software company that makes the world of work work better for everyone. Its low-code Flow Designer sits at the heart of a platform used by over 7,700 enterprises to automate IT requests, HR onboarding, security operations, finance approvals, and more.
Challenge
ServiceNow aimed to launch a new workflow design feature powered by a domain-specific vision-language model that could convert hand-drawn workflow sketches, the format in which brainstorms actually start, into fully functional ServiceNow workflows.
The main operational hurdle was obtaining high-quality domain-specific data: accuracy demands a large, diverse set of annotations covering the countless ways people draw boxes, arrows, and decision nodes.
Internally, ServiceNow scientists had labeled only a few hundred diagrams, as the task is time-consuming and falls outside their core responsibilities. As a result, outsourcing became a necessary part of the workflow. However, past outsourcing efforts posed challenges: some labeled samples were of low quality and had to be discarded. In one instance, the team had to invest significant time and effort to redo portions of the labeling themselves, which delayed model training and downstream results and considerably slowed progress on their research roadmap.
"Our previous experience outsourcing data labeling was challenging. Poor quality and lack of transparency made it difficult to trust external providers."
— Perouz Taslakian, Senior Research Scientist, ServiceNow
Why SuperAnnotate
ServiceNow selected SuperAnnotate for its integrated AI data platform and expert DataOps consulting. Unlike traditional data service providers that operate like black boxes, SuperAnnotate’s software let ServiceNow bring its own domain experts into the loop, monitor annotation and overall project progress, give real-time feedback, and maintain data quality throughout the end-to-end process, whether using SuperAnnotate-vetted annotators or any external vendor on the same platform.
“Previously, annotated data was stored in large monolithic files, making it hard to manage and reference. SuperAnnotate’s folder-wise structure and intuitive interface allowed us to streamline batch access, manage annotations more cleanly, and ensure better collaboration across our team.”
— Spandana Gella, Senior Research Manager, ServiceNow
Solution & Implementation
SuperAnnotate implemented a structured annotation approach tailored to ServiceNow’s complex vision-language needs. Working in weekly feedback cycles, SuperAnnotate adapted quickly to evolving annotation guidelines and refined the project setup iteratively, ensuring the resulting datasets accurately represented the diverse sketches users might produce.
Each image passed through a three-stage QC process (initial labeling, primary QA, and secondary QA), with the platform’s comment threads used to flag issues for discussion. The result was 18,881 high-fidelity images mapped to structured JSON files, covering more than fifty workflow patterns, triggers, and component combinations. (The dataset is available on Hugging Face.)
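To make the three-stage review concrete, here is a minimal Python sketch of how an image might move through initial labeling, primary QA, and secondary QA. It is an illustration only, not SuperAnnotate’s implementation: the names `Stage`, `SketchItem`, `approve`, and `reject` are hypothetical, and sending a rejected item back to the initial labeling stage is an assumption, since the case study does not say how rejections were routed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Stage(Enum):
    """The three review stages named in the text, plus a terminal state."""
    INITIAL_LABEL = 1
    PRIMARY_QA = 2
    SECONDARY_QA = 3
    APPROVED = 4


@dataclass
class SketchItem:
    """One sketch image and its review state (hypothetical structure)."""
    image_name: str
    stage: Stage = Stage.INITIAL_LABEL
    comments: List[str] = field(default_factory=list)

    def approve(self) -> None:
        """Advance to the next stage; APPROVED is terminal."""
        if self.stage is not Stage.APPROVED:
            self.stage = Stage(self.stage.value + 1)

    def reject(self, comment: str) -> None:
        """Record a reviewer comment and return the item for relabeling.

        Assumption: rejected items restart at INITIAL_LABEL.
        """
        self.comments.append(comment)
        self.stage = Stage.INITIAL_LABEL


item = SketchItem("sketch_0001.png")
item.approve()                           # labeler submits -> PRIMARY_QA
item.reject("arrow direction unclear")   # primary QA flags an issue
for _ in range(3):                       # relabel, then pass both QA stages
    item.approve()
print(item.stage.name, item.comments)
```

Keeping the reviewer comments on the item mirrors the role of the comment threads described above: the discussion travels with the image instead of living in a separate channel.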
The completed datasets were then transferred to ServiceNow’s internal clusters, where engineers fine-tuned a LLaMA-3.2-11B-Vision-Instruct model (available on Hugging Face).
“SuperAnnotate improved the overall manageability of the project. It enabled faster retrieval and better organization of data, which made day-to-day operations smoother. Additionally, the platform facilitated easier project management, particularly when working with multiple people who could process and evaluate data from our end, multiple batches and annotation types.”
— Spandana Gella, Senior Research Manager, ServiceNow
Results of training StarFlow with SuperAnnotate data
Substantial performance gains in both in-distribution and real-world use.
- 95.5% accuracy achieved by the fine-tuned LLaMA-3.2-11B-Vision-Instruct model on internal validation sets, more than doubling the performance of the base model (46.6%) and outperforming GPT‑4o (78.6%) on the same tasks.
- When evaluated on real, in-platform production workflows, including out-of-distribution inputs, the fine-tuned model delivered a 40–50% improvement in enterprise workflow interpretation compared to general-purpose models like GPT-4o and LLaMA base variants.
- Dramatically accelerated workflow creation, reducing manual effort and pushing the boundaries of what enterprise users can achieve with the model.
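The "more than doubling" claim in the first bullet can be checked directly from the reported accuracy figures. The short Python sketch below uses only the three percentages quoted above; the variable names are illustrative.

```python
# Accuracy figures reported in the results above (percent).
reported = {"fine_tuned": 95.5, "base": 46.6, "gpt4o": 78.6}

# Relative gain over the base model, and absolute gaps in percentage points.
ratio_vs_base = reported["fine_tuned"] / reported["base"]
gap_vs_base = reported["fine_tuned"] - reported["base"]
gap_vs_gpt4o = reported["fine_tuned"] - reported["gpt4o"]

print(f"{ratio_vs_base:.2f}x the base model")
print(f"+{gap_vs_base:.1f} points over the base model")
print(f"+{gap_vs_gpt4o:.1f} points over GPT-4o")
```

The fine-tuned model scores about 2.05 times the base model’s accuracy, a gain of 48.9 percentage points, and sits 16.9 points above GPT-4o on the same validation sets.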
“The ability to combine our domain experts with outsourced teams inside one platform significantly improved our iteration speed. Spotting and correcting errors early in the labeling process saved us considerable downstream effort and made it possible to build the high-quality datasets needed to achieve these results.”
— Patrice Béchard, Applied Scientist, ServiceNow
To explore the research details, dataset, and models, check out the following resources:
- ServiceNow Research Blog Post: StarFlow AI Turns Sketches into Workflows
- Hugging Face Dataset: BigDocs-Sketch2Flow
- Models on Hugging Face:
- Code Repository: ServiceNow StarFlow on GitHub
Ready to replicate these results? Get started with SuperAnnotate today.
About SuperAnnotate
SuperAnnotate is the enterprise AI data platform that brings expert knowledge into AI-ready datasets and repeatable evaluation workflows. Its human- and agent-in-the-loop engine blends evaluation, automated quality controls, and golden dataset creation, thus enabling enterprises to advance from pilots into production-ready models in record time. Backed by NVIDIA, Dell Technologies Capital, Databricks Ventures, and Cox Enterprises, SuperAnnotate empowers teams across industries to deliver trustworthy AI faster.