Two of the more credible voices in the field are publicly calling the demo cycle a fundraising tool, which is useful air cover for operators getting pressured by boards or customers to pilot whatever just trended on X. Hurst saying startups "prey on" anthropomorphization and Levine framing the real test as pouring wine from any bottle into any glass in any environment is the same generalization gap that has kept Agility's Digit on a narrow tote-moving task at GXO rather than a broader work envelope.
The operator translation: when evaluating a humanoid vendor, ignore the highlight reel and ask for quantitative large-scale evaluation data across SKU variance, lighting conditions, and operator handoffs. If the vendor can only produce a stage demo, the unit economics conversation is premature. This also reframes Physical Intelligence's own pitch, since Levine is effectively arguing that foundation-model generalization (his company's thesis) is the binding constraint, not hardware.