A vision-language model fine-tuned on your CAD library and a year of factory-floor footage is a digital-twin starter kit on rails. It catches the gap between as-designed and as-built — the gap that eats manufacturing margin — without instrumenting a single new sensor.