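Discussion of [ITmedia P has been heating up recently. We have picked out the most valuable points from a large volume of information for your reference.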
First, a growing countertrend towards smaller models aims to boost efficiency through careful model design and data curation, a goal pioneered by the Phi family of models and furthered by Phi-4-reasoning-vision-15B. We specifically build on learnings from the Phi-4 and Phi-4-Reasoning language models and show how a multimodal model can be trained to cover a wide range of vision and language tasks without relying on extremely large training datasets, extremely large architectures, or excessive inference-time token generation. Our model is intended to be lightweight enough to run on modest hardware while remaining capable of structured reasoning when it is beneficial. Our model was trained with far less compute than many recent open-weight VLMs of similar size. We used just 200 billion tokens of multimodal data, leveraging Phi-4-reasoning (trained with 16 billion tokens), which is based on the core model Phi-4 (400 billion unique tokens). By comparison, multimodal models such as Qwen 2.5 VL and 3 VL, Kimi-VL, and Gemma3 were trained on more than 1 trillion tokens. We can therefore present a compelling option compared to existing models, pushing the Pareto frontier of the tradeoff between accuracy and compute costs.
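The token figures quoted above can be put side by side with a little arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, the 1,000 billion figure is the lower bound implied by "more than 1 trillion", and adding the three Phi counts together is an assumption made for a rough comparison (it mixes a "unique tokens" figure with total-token figures).

```python
# Token counts quoted in the paragraph above, in billions of tokens.
PHI4_BASE_UNIQUE = 400      # Phi-4 core model (unique tokens)
PHI4_REASONING = 16         # Phi-4-reasoning
MULTIMODAL = 200            # multimodal data for Phi-4-reasoning-vision-15B
COMPARISON_FLOOR = 1000     # "more than 1 trillion": a lower bound, not an exact count


def cumulative_tokens_billion() -> int:
    """Rough cumulative budget across the three stages (an assumed way to sum them)."""
    return PHI4_BASE_UNIQUE + PHI4_REASONING + MULTIMODAL


def share_of_comparison_floor(tokens_billion: float) -> float:
    """Fraction of the 1-trillion lower bound; an upper bound on the true ratio."""
    return tokens_billion / COMPARISON_FLOOR


total = cumulative_tokens_billion()
print(f"multimodal stage only: {share_of_comparison_floor(MULTIMODAL):.1%} of the floor")
print(f"all three stages ({total}B): {share_of_comparison_floor(total):.1%} of the floor")
```

Because the comparison figure is only a lower bound, both percentages overstate the model's share: the real gap is at least as large as shown.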
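Second, Tesla's Optimus robot, for example, has begun working in factories on tasks such as sorting batteries and walking, and "Mornine" (莫茵), the humanoid robot from Chery's AiMOGA (墨甲) brand, is already helping to sell cars in 4S dealerships……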
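According to statistics, the market size of the related sectors has reached a new all-time high, with the compound annual growth rate holding in the double digits.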
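In addition, large models are now evolving rapidly toward "agents" capable of autonomous planning. Because the AI has to revisit contexts that often run to tens of thousands of characters, the factor limiting system performance has shifted from "not enough compute" to "data transfer that is too slow".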
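The shift from a compute limit to a data-transfer limit can be illustrated with a simple back-of-the-envelope comparison: estimate how long one generation step would take if only arithmetic mattered, and how long if only moving data mattered, then see which is larger. This is a minimal sketch, not a description of any real system: the function names are hypothetical, and every number in the example is made up purely for illustration.

```python
def step_times(flops: float, bytes_moved: float,
               peak_flops_per_s: float, bandwidth_bytes_per_s: float):
    """Return (compute_seconds, transfer_seconds) for one generation step."""
    compute_s = flops / peak_flops_per_s
    transfer_s = bytes_moved / bandwidth_bytes_per_s
    return compute_s, transfer_s


def limiting_factor(flops: float, bytes_moved: float,
                    peak_flops_per_s: float, bandwidth_bytes_per_s: float) -> str:
    """Name whichever of the two estimates dominates the step."""
    compute_s, transfer_s = step_times(
        flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s
    )
    return "data transfer" if transfer_s > compute_s else "compute"


# Hypothetical numbers, chosen only to illustrate the comparison:
# a step that performs 3e10 operations but must re-read 3e10 bytes of
# weights and cached context, on a device with 1e14 FLOP/s and 1e12 B/s.
example = dict(flops=3e10, bytes_moved=3e10,
               peak_flops_per_s=1e14, bandwidth_bytes_per_s=1e12)
compute_s, transfer_s = step_times(**example)
print(f"compute: {compute_s:.4f} s, transfer: {transfer_s:.4f} s")
print("limited by:", limiting_factor(**example))
```

With these assumed figures the transfer estimate is a hundred times the compute estimate, which is the situation the paragraph describes. Longer contexts increase the bytes re-read per step, so the gap widens as the context grows.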
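Looking ahead, the development of [ITmedia P merits continued attention. Experts recommend that all parties strengthen collaborative innovation and work together to steer the industry in a healthier, more sustainable direction.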