Thursday, May 14, 2026

DeepSeek的後續是將進一步擴大中國在開源人工智能領域的影響力(1/2)

Recently the New York Times reported the following:

DeepSeek’s Sequel Set to Extend China’s Reach in Open-Source A.I. (1/2)

Chinese companies have embraced making their most advanced artificial intelligence models available to all.

The NYT - By Meaghan Tobin and Cade Metz - Meaghan Tobin reported from Taipei, Taiwan, and Cade Metz from San Francisco.

April 24, 2026

Updated 11:47 a.m. ET

When the Chinese start-up DeepSeek published details about one of its artificial intelligence models last year, it sent shock waves through the tech industry.

The company said it had built its system by spending far less on computer chips than American rivals like OpenAI and Anthropic. It marked the start of what became known as China’s “DeepSeek moment,” shorthand for the belief that Chinese A.I. companies were ready to showcase their technical capabilities to the world.

The DeepSeek moment reflected a shift in the global A.I. landscape. The change was about not only lower costs but also openness in how the technology is shared.

DeepSeek released its models as open source, which means others can freely use and modify them. By contrast, OpenAI and Anthropic kept their leading models proprietary. The episode demonstrated that an open-source system could perform almost as well as closed versions. In the months that followed, Chinese firms released dozens of other open-source models. By the end of 2025, these models made up a significant share of global A.I. use.

On Friday, DeepSeek released a preview of V4, its long-awaited follow-up model, which it intends to open source. The new model excels at writing computer code, an increasingly important skill for leading A.I. systems. It significantly outperformed every other open-source system at generating code, according to tests from Vals AI, a company that tracks the performance of A.I. technologies.

China’s push into open-source A.I. has become a major economic advantage at home, according to a new study by a U.S. congressional advisory body. With few barriers to use, the systems have spread across industries such as robotics, logistics and manufacturing. The study found that these industrial applications generate real-world data that are used to improve A.I. systems. This approach has allowed Chinese tech firms to capture global influence, as programmers and engineers around the world adopt their systems to build new products.

DeepSeek released its new model just days after Moonshot AI, another Chinese start-up, introduced its latest open-source model, Kimi 2.6. While these systems trail the coding capabilities of the leading U.S. models from Anthropic and OpenAI, the gap is narrowing.

The implications are meaningful. Using A.I. to write code is faster and frees up human programmers to focus on bigger issues. It also means people can use DeepSeek’s latest release to power A.I. agents, which are personal digital assistants that can use other software applications on behalf of office workers, including spreadsheets, online calendars and email services.

As A.I. systems improve at writing computer code, they are also getting better at finding security vulnerabilities in software — a skill that is fundamentally changing cybersecurity. That means tools like DeepSeek’s can be used to both attack and defend computer networks.

Across tasks, DeepSeek V4 is on a par with Moonshot’s latest model. “They are basically neck and neck,” said Rayan Krishnan, the chief executive of Vals AI.

In the months leading up to DeepSeek’s latest release, foreign rivals moved to pre-empt another round of glowing headlines. Silicon Valley’s A.I. giants, Anthropic and OpenAI, said DeepSeek had unfairly piggybacked on their technology through distillation, a process in which engineers mimic a rival model by querying it millions of times and copying its behavior.

The competition to build the best-performing A.I. systems has transformed into a geopolitical power struggle. While Silicon Valley leaders at Anthropic and OpenAI warn that their technology would be dangerous in the hands of autocratic countries, China has invested billions to become an A.I. superpower, viewing the technology as a critical engine of economic growth.

(to be continued)

Translation

DeepSeek的後續是將進一步擴大中國在開源人工智能領域的影響力(1/2)

中國企業已開始積極向全世界開放其最先進的人工智能模型

去年,中國新創公司DeepSeek公佈了其人工智能模型之一的詳細信息,這在科技行業引起了巨大震動。

該公司表示,其系統在電腦晶片上的投入遠低於OpenAI和Anthropic等美國競爭對手。這標誌著中國「DeepSeek時刻」的開始,這一說法象徵著中國人工智能公司已準備好向世界展示其技術實力。

「DeepSeek時刻」反映了全球人工智能格局的轉變。這項轉變不僅關乎降低成本,更關乎技術共享方式的開放性。

DeepSeek 將其模型開源,這意味著其他人可以自由使用和修改它們。相較之下,OpenAI 和 Anthropic 則將其領先的模型保持為專有。這一事件表明,開源系統幾乎可以達到與封閉版本相同的效能。在接下來的幾個月裡,中國公司發布了數十個其他開源模型。到 2025 年底,這些模型在全球人工智能應用中佔據了相當大的份額。

上週五,DeepSeek 發佈了其期待已久的後續模型 V4 的預覽版,並計劃將其開源。新模型在編寫電腦程式碼方面表現出色,這對於領先的人工智能系統而言是一項日益重要的技能。根據追蹤人工智能技術性能的 Vals AI 公司的測試,V4 在程式碼生成方面顯著優於所有其他開源系統。

就在另一家中國新創公司Moonshot AI發佈其最新開源模型Kimi 2.6幾天後,DeepSeek也發佈了其新模型。雖然這些系統的編碼能力與Anthropic和OpenAI的美國領先模型相比仍有落後,但差距正在縮小。

其影響意義重大。使用人工智能編寫程式碼速度更快,能夠使程式設計師騰出精力專注於更重要的問題。這也意味著人們可以使用DeepSeek的最新版本來驅動人工智能代理,這些代理是個人電子助理,可以代表辦公室員工使用其他軟體應用程式,包括電子表格、線上日曆和電子郵件服務。

隨著人工智能系統編寫程式碼能力的提升,它們在發現軟體安全漏洞方面也越來越出色,這項技能正在從根本上改變網路安全。這意味著像DeepSeek這樣的工具既可以用於攻擊電腦網路,也可以用於防禦電腦網路。

在各項任務中,DeepSeek V4的表現與Moonshot的最新模型平起平坐。Vals AI 的執行長 Rayan Krishnan 表示:「它們基本上不分伯仲。」

DeepSeek 最新版本發佈前的幾個月裡,外國競爭對手試圖搶先一步,阻止另一輪引人注目的報道。矽谷的人工智能巨頭 Anthropic 和 OpenAI 聲稱,DeepSeek 透過「蒸餾」技術不公平地搭了他們的順風車。蒸餾是指工程師透過數百萬次查詢競爭對手的模型並複製其行為來模仿該模型的過程。

建構最佳人工智能系統的競爭已經演變成一場地緣政治權力鬥爭。儘管 Anthropic 和 OpenAI 的矽谷領導人警告說,他們的技術落入專制國家手中將十分危險,但中國已投入數十億美元力圖成為人工智能超級大國,並將這項技術視為經濟成長的關鍵引擎。

(待續)

Note:

1.  Distillation is a process in which engineers mimic a rival model by querying it millions of times and copying its behavior. In AI development, distillation (often called knowledge distillation) is a technique where a smaller AI model learns to imitate a larger, more capable model. The basic idea is that a large, expensive model (the “teacher”) generates outputs, and a smaller model (the “student”) trains on those outputs. The student learns patterns, reasoning styles, and behaviors from the teacher. The result is a model that is faster and cheaper to run while keeping much of the teacher’s performance. (ChatGPT)
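The teacher-and-student idea in the note can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not how DeepSeek or any real lab trains its models: the hypothetical "teacher" is an average of three fixed logistic units, the "student" is a single logistic unit, and the student only ever sees the teacher's answers to a set of queries.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical "teacher": an average of three fixed logistic units.
# It stands in for a larger model that can only be queried.
TEACHER_UNITS = [(2.0, -0.5), (3.0, 0.2), (1.5, -0.1)]


def teacher_predict(x: float) -> float:
    """Return the teacher's probability for input x."""
    return sum(sigmoid(a * x + b) for a, b in TEACHER_UNITS) / len(TEACHER_UNITS)


def distill(queries, epochs=200, lr=0.5):
    """Fit a one-unit student (w, b) to the teacher's soft outputs."""
    # Step 1: query the teacher and record its answers (soft labels).
    soft_labels = [teacher_predict(x) for x in queries]
    w, b = 0.0, 0.0
    n = len(queries)
    # Step 2: gradient descent on cross-entropy against the soft labels.
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in zip(queries, soft_labels):
            p = sigmoid(w * x + b)
            grad_w += (p - target) * x  # d(cross-entropy)/dw
            grad_b += (p - target)      # d(cross-entropy)/db
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def mean_abs_gap(w, b, queries):
    """Average absolute difference between student and teacher outputs."""
    return sum(abs(sigmoid(w * x + b) - teacher_predict(x)) for x in queries) / len(queries)


random.seed(0)
queries = [random.uniform(-3.0, 3.0) for _ in range(200)]

gap_before = mean_abs_gap(0.0, 0.0, queries)  # untrained student
w, b = distill(queries)
gap_after = mean_abs_gap(w, b, queries)       # distilled student

print(f"gap before distillation: {gap_before:.3f}")
print(f"gap after distillation:  {gap_after:.3f}")
```

The student never sees the teacher's internal weights, only its answers, which is why querying a rival model many times is enough to copy much of its behavior. Real distillation uses far larger models and text outputs rather than single probabilities.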
