Huawei Pangu Under Fire Over Alibaba Qwen “Copycat” Allegations: Official Response

Research suggests Huawei’s Pangu large model shares significant parameter similarity with Alibaba’s Qwen, sparking plagiarism allegations. Huawei’s Pangu team denies this, citing different training hardware and adherence to open-source licenses for shared components. While some code within Pangu carries Qwen’s copyright notice, the team attributes this to proper use of open-source licenses rather than plagiarism. The original research report has been withdrawn pending peer review.

On June 30, 2025, Huawei officially announced the open-sourcing of its Pangu 7B parameter dense model, the Pangu Pro MoE 72B Mixture-of-Experts model, and its Ascend-based model inference technology. Subsequently, research posted on GitHub by @HonestAGI regarding Huawei’s Pangu large models ignited considerable industry discussion. The author of this research asserted that Huawei’s Pangu Pro MoE exhibits a high degree of similarity in its parameter structure to Alibaba’s Qwen-2.5 14B model.

HonestAGI’s comparative tests indicated that the Pangu Pro MoE model and the Qwen-2.5 14B model share an average correlation in attention parameter distribution of 0.927, significantly exceeding the typical range for comparisons between similar industry models, which usually does not surpass 0.7.
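HonestAGI’s full methodology has not been released, but the general flavor of such a test can be sketched: reduce each layer’s attention weights to a summary statistic and correlate the two models’ per-layer vectors. Everything below (the toy models, the shapes, and the choice of standard deviation as the statistic) is an illustrative assumption, not the actual test:

```python
# Rough sketch of a parameter-distribution similarity check. The exact
# methodology behind the 0.927 figure is not public; here we take one summary
# statistic (standard deviation) of each layer's attention projection weights
# and compute the Pearson correlation of the two models' per-layer vectors.
import random
import statistics

def layer_sigmas(weights_per_layer):
    """One summary statistic (std dev) per layer's attention weights."""
    return [statistics.pstdev(w) for w in weights_per_layer]

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length vectors."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)

# Toy stand-ins for two models' attention weights (48 layers, 256 values each).
random.seed(0)
model_a = [[random.gauss(0, 0.02 * (1 + l / 48)) for _ in range(256)]
           for l in range(48)]
# model_b shares model_a's layer-wise scale pattern, so a high correlation
# is expected even though the individual weights are independent draws.
model_b = [[random.gauss(0, 0.02 * (1 + l / 48)) for _ in range(256)]
           for l in range(48)]

r = pearson(layer_sigmas(model_a), layer_sigmas(model_b))
print(f"correlation of per-layer attention weight std devs: {r:.3f}")
```

As the toy example shows, a high score on such a metric can also arise from shared architectural conventions rather than copied weights, which is exactly the point of contention between the two sides.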

Huawei's Pangu Large Model Accused of 'Copying' Alibaba's Qwen: Official Response

Training deep learning models is inherently random, involving complex data sampling, weight initialization, and optimization paths. Nearly identical distributions across numerous attention parameters in two independently trained models would therefore be extremely unlikely to arise by chance, and this pronounced similarity immediately fueled accusations of potential intellectual property infringement.

Later, individuals identifying as part of the Pangu large model team responded on GitHub, refuting the plagiarism allegations and dismissing the author’s evaluation methodology as unscientific.

The responder reported evaluations of several model pairs using the methodology described in HonestAGI’s paper:

pangu-72b-a16b vs. Qwen2.5-14b = 0.92
baichuan2-13b vs. Qwen1.5-14b = 0.87
baichuan2-13b vs. pangu-72b-a16b = 0.84
baichuan2-13b vs. Qwen2.5-14b = 0.86

This comparison, they noted, showed that other models of comparable parameter scale also score as highly similar to the Qwen-2.5 14B model under the same evaluation framework, implying that the paper and its metric lack practical significance. The Pangu team reiterated its stance against any form of plagiarism.

HonestAGI, however, expressed dissatisfaction with the Pangu team’s rebuttal. “Pangu still shows the highest similarity, correct? We are happy to see you reproduce our results! In fact, any classification problem has a threshold to identify decision boundaries (e.g., supposed value of 0.9 in this case). This is primarily a tool for preliminary comparison. Pangu ‘unfortunately’ triggered this warning signal before we could investigate further. We don’t make judgments solely based on attention parameters. This is just the motivation…”

HonestAGI further provided a comparison between Qwen and Hunyuan A13B, revealing starkly different internal patterns across various layers, indicative of distinct architectures and learned representations. This data was clearly intended to underscore the validity of its testing methodology.


However, HonestAGI appears to have since taken down its research report on the Pangu large models. In a subsequent update, HonestAGI stated, “We plan to submit it to a peer-reviewed conference (ACML? Or later?) once the paper is finalized and all code is released.”

Furthermore, a review of the Pangu Pro model code, officially released by the Pangu team on Gitcode (China’s equivalent of GitHub), reveals a copyright notice for the ‘transformers’ component: “Copyright notice: Copyright 2024 The Qwen team, Alibaba Group and the HuggingFace Team. All rights reserved.” This statement explicitly attributes the copyright ownership of the ‘transformers’ component to the Qwen team, Alibaba Group, and the HuggingFace team. This has led many netizens to interpret it as definitive proof of plagiarism.


It’s crucial to note that this code was officially published by “Ascend Tribe,” the Pangu large model team, not a third party. This has emboldened those who see it as solid evidence of intellectual property theft.

However, industry insiders suggest this is standard practice for open-source declarations. When a team utilizes open-source software developed by others, they are legally obligated to inform users according to the terms of the respective open-source licenses. The notice merely clarifies that the Pangu large model incorporates the ‘transformers’ component developed by Alibaba’s Qwen team and HuggingFace, operating under the “Apache License 2.0.” This license permits unrestricted use, modification, and distribution of the software, even for commercial purposes. Therefore, this declaration only signifies that the Pangu large model team has utilized certain open-source code and has complied with the licensing requirements, which does not inherently constitute plagiarism.
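A derived-file notice under the Apache License 2.0 conventionally looks like the following. This is an illustrative reconstruction of the standard header format, not the exact contents of the Pangu file:

```python
# Copyright 2024 The Qwen team, Alibaba Group and the HuggingFace Team.
# All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
```

Retaining such headers in files derived from upstream code is itself an obligation of the Apache License 2.0, which is why the notice’s presence signals compliance rather than copying.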

Pangu Large Model Team Issues Official Response

At 4:59 PM on July 5th, Noah’s Ark Lab, Huawei’s division responsible for the Pangu large model’s development, released a statement officially addressing the “plagiarism” accusations.

Noah’s Ark Lab stated that the Pangu Pro MoE open-source model is a foundational large model developed and trained on the Ascend hardware platform, not an incremental training of another vendor’s model. It features key innovations in architectural design and technical characteristics, positioning it as the world’s first MoE model specifically engineered for Ascend hardware. The lab highlighted its novel Mixture of Grouped Experts (MoGE) architecture, which effectively tackles load-balancing challenges in large-scale distributed training, thereby enhancing training efficiency.
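The load-balancing idea behind grouped expert routing can be sketched as follows: experts are partitioned into equal groups (e.g., one group per device), and each token selects the same number of experts from every group, so no single device can be overloaded. The group and expert counts below are illustrative assumptions, not Pangu’s actual configuration:

```python
# Minimal sketch of grouped top-k expert routing in the spirit of the
# MoGE idea described in Huawei's statement: pick k experts per group
# rather than a global top-k, so load is balanced across groups/devices.
import random

def grouped_topk(scores, n_groups, k_per_group):
    """Select the k highest-scoring experts within each group of experts."""
    group_size = len(scores) // n_groups
    chosen = []
    for g in range(n_groups):
        group = list(range(g * group_size, (g + 1) * group_size))
        group.sort(key=lambda e: scores[e], reverse=True)
        chosen.extend(group[:k_per_group])
    return sorted(chosen)

random.seed(1)
scores = [random.random() for _ in range(64)]  # router logits for 64 experts
# 8 groups of 8 experts; exactly one expert is activated in every group.
experts = grouped_topk(scores, n_groups=8, k_per_group=1)
print(experts)
```

With a global top-k router, all selected experts could land on one device; constraining the selection per group guarantees an even split by construction.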

Nevertheless, Noah’s Ark Lab did acknowledge that, “The code implementation of certain foundational components in the Pangu Pro MoE open-source model references industry open-source practices, involving some open-source code from other open-source large models. We strictly adhere to the requirements of open-source licenses, clearly marking the copyright statements of open-source code within the code files. This not only aligns with common practice in the open-source community but also embodies the industry’s advocated spirit of open collaboration. We remain committed to open innovation, respecting third-party intellectual property, while championing an inclusive, fair, open, united, and sustainable approach to open source.”


Additionally, Xinzhixun discovered online reports from Baidu Tieba suggesting that Wang Yunhe, the head of Noah’s Ark Lab responsible for Pangu’s development, also responded internally. His core points were twofold: 1) The Pangu large model was trained on Ascend chips, different training hardware from what Qwen uses; 2) The Llama and Qwen components used in the Pangu large model are already open-sourced, and using them does not constitute plagiarism.


In conclusion, there is currently no conclusive evidence to substantiate the claims that the Pangu large model plagiarized Alibaba’s Qwen. The reliability of the testing methodology employed by HonestAGI in its now-removed research paper remains subject to further validation, especially given the incomplete disclosure of its testing code. As for the inclusion of Qwen’s open-source code within the Pangu model, the Pangu team’s statements and actions comply with open-source licensing. This simply indicates that Pangu was not built entirely from scratch; utilizing some open-source components does not equate to plagiarism.

Original article, Author: Tobias. If you wish to reprint this article, please indicate the source: https://aicnbc.com/4093.html
