We Tested The New Qwen3.5 Open Weight, Qwen3.5-Plus AI Models in Real Hands-on Tests

Alibaba’s Qwen lineup has evolved quickly over the past few weeks. We recently saw Qwen3-Coder-Next targeting developers with an AI coding assistant. This was followed by Qwen Image 2.0, which pushed the platform’s image generation quality even further. Each release strengthened a specific capability within the ecosystem. Now, building on that evolution, comes the Qwen 3.5 family with two new AI models: its first open-weight model, the Qwen3.5-397B-A17B, and the Qwen3.5-Plus.

Of the two, the former, the Qwen3.5-397B-A17B, is the flagship model, while the Qwen3.5-Plus is the hosted model available via Alibaba Cloud Model Studio. Both models can now be accessed on Qwen Chat.

From what Alibaba tells us, the Qwen 3.5 family focuses on stronger reasoning, coding, agentic capabilities, multimodal understanding, and improved efficiency. More importantly, it reflects a broader push by Alibaba toward AI systems that can handle complex, multi-step tasks with greater autonomy. If you look at it carefully, the model is more than just an upgrade – it’s a signal of where the Qwen family is heading.

In this article, we cover what’s new in Qwen 3.5, where it stands competitively, and what our hands-on testing reveals about its real-world performance. Let’s jump right in.

What is Qwen 3.5?

Qwen 3.5 isn’t just “the next Qwen model.” Alibaba has officially kicked off the Qwen 3.5 series by open-sourcing the first model, and has formally named it ‘Qwen3.5-397B-A17B.’

Now here’s the most important part, as far as its functioning goes – the model has 397 billion total parameters, but it doesn’t use all of them every time. Thanks to a sparse Mixture-of-Experts (MoE) setup, it activates only 17B parameters per forward pass. This is a fancy way of saying: big brain, but it only “wakes up” the parts it needs, so inference stays fast and cost-efficient.
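
To make the “wakes up only what it needs” idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration of the general MoE technique, not Alibaba’s implementation: the expert count, the value of k, the toy experts, and the random router scores are all assumptions. Only the 397B and 17B figures come from the article.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by router weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy setup: 8 experts, each a simple scaling function, 2 active per token.
NUM_EXPERTS, TOP_K = 8, 2
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]
rng = random.Random(0)
router_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in for a learned router

selected = route_top_k(router_logits, TOP_K)
output = moe_forward(1.0, experts, router_logits, TOP_K)

# Figures from the article: 397B total parameters, 17B active per forward pass.
active_fraction = 17 / 397

print(f"experts used: {[i for i, _ in selected]} of {NUM_EXPERTS}")
print(f"active share of parameters: {active_fraction:.1%}")
```

With the article’s numbers, only about 4.3% of the parameters take part in any single forward pass, which is where the speed and cost benefits come from.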

Even more importantly, this is a native vision-language model. That means it’s built to handle text + images together, not as an afterthought. Alibaba claims it performs strongly across reasoning, coding, agent capabilities, and multimodal understanding in benchmark evaluations.

And there’s a very “real-world” upgrade too: language support jumps from 119 to 201 languages and dialects, which matters if you’re building any global-facing apps.

In parallel, Alibaba has also announced Qwen3.5-Plus, which is a hosted version available via Alibaba Cloud Model Studio. It offers a 1 million-token context window by default and includes built-in tools with adaptive tool use. This makes it suitable for long-context workflows and agent-style automation.
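
For a sense of what a 1 million-token window means in practice, the sketch below checks which documents would fit into a single request. It is a rough planning aid under stated assumptions: the 4-characters-per-token ratio is a common rule of thumb and not the model’s real tokenizer, and the file names, sizes, and output reserve are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # default window reported for Qwen3.5-Plus
CHARS_PER_TOKEN = 4                # assumption: rough heuristic, not the real tokenizer

def estimate_tokens(text: str) -> int:
    """Very rough token estimate; real counts depend on the model's tokenizer."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def pack_documents(docs, reserve_for_output=8_000):
    """Greedily keep documents, in order, while they fit in the context budget."""
    budget = CONTEXT_WINDOW_TOKENS - reserve_for_output
    kept, used = [], 0
    for name, text in docs:
        cost = estimate_tokens(text)
        if used + cost > budget:
            break  # stop at the first document that would overflow the window
        kept.append(name)
        used += cost
    return kept, used

# Hypothetical inputs, sized in characters.
docs = [
    ("report.txt", "a" * 2_000_000),
    ("logs.txt", "b" * 1_600_000),
    ("appendix.txt", "c" * 800_000),
]
kept, used = pack_documents(docs)
print(f"kept {kept}, using about {used:,} tokens")
```

For real workloads, the estimate should be replaced with counts from the provider’s own tokenizer.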

This brings us to the question – how does Qwen 3.5 do all this? Let’s take a look under its hood to understand this.

Under the Hood: How Qwen 3.5 Works

Qwen 3.5 is interesting not just because of its size, but because of how efficiently it uses that scale.

At the infrastructure level, the model separates how vision and language components are processed instead of forcing them into a one-size-fits-all pipeline. This heterogeneous setup allows text, image, and video inputs to be processed more efficiently, enabling near-100% training throughput even on mixed multimodal data.

Efficiency is further boosted by sparse activations. This allows different components to compute in parallel. Add to that a native FP8 pipeline – applying low precision where safe while preserving higher precision in sensitive layers – and the system cuts activation memory by roughly 50% while improving speed.
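
The “roughly 50%” figure is easy to sanity-check with back-of-the-envelope arithmetic: an FP8 value takes 1 byte, while a 16-bit value takes 2. The sketch below assumes a BF16 baseline (the article does not state one) and uses hypothetical tensor shapes, so treat the absolute numbers as illustrative only.

```python
BYTES_PER_ELEMENT = {"bf16": 2, "fp8": 1}

def activation_bytes(batch, seq_len, hidden, layers, dtype):
    """Bytes needed to keep one hidden-state tensor per layer."""
    return batch * seq_len * hidden * layers * BYTES_PER_ELEMENT[dtype]

# Hypothetical shapes chosen for illustration; not Qwen 3.5's real configuration.
shape = dict(batch=8, seq_len=4096, hidden=4096, layers=60)

bf16 = activation_bytes(dtype="bf16", **shape)
fp8 = activation_bytes(dtype="fp8", **shape)
saving = 1 - fp8 / bf16

print(f"BF16: {bf16 / 2**30:.1f} GiB, FP8: {fp8 / 2**30:.1f} GiB, saving: {saving:.0%}")
```

In a mixed-precision pipeline, the layers kept at higher precision would pull the real saving somewhat below this idealized ceiling, which is consistent with the article’s “roughly”.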

Alibaba also built a scalable asynchronous reinforcement learning framework to continuously refine the model. By separating training and inference workloads, the system improves hardware utilization, balances load dynamically, and recovers quickly from failures. Techniques like speculative decoding, rollout replay, and multi-turn rollout locking further improve throughput and stability, especially for agent-style workflows.
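
The core idea of separating rollout generation (inference) from training can be sketched with a simple producer-consumer queue. This is a toy illustration using Python’s asyncio, not Alibaba’s framework: the worker counts, the batch size, and the fake rewards are all hypothetical, and the “gradient step” is just a counter.

```python
import asyncio

async def rollout_worker(worker_id, queue, num_episodes):
    """Inference side: generate trajectories and hand them off via the queue."""
    for episode in range(num_episodes):
        await asyncio.sleep(0)  # stand-in for model generation
        await queue.put({"worker": worker_id, "episode": episode,
                         "reward": float(episode % 2)})

async def trainer(queue, batch_size, total):
    """Training side: consume trajectories in batches as they arrive."""
    updates, batch, seen = 0, [], 0
    while seen < total:
        batch.append(await queue.get())
        seen += 1
        if len(batch) == batch_size:
            updates += 1  # stand-in for a gradient step
            batch.clear()
    return updates

async def main(num_workers=4, episodes_per_worker=6, batch_size=8):
    queue = asyncio.Queue(maxsize=16)  # bounded queue gives simple backpressure
    total = num_workers * episodes_per_worker
    workers = [asyncio.create_task(rollout_worker(i, queue, episodes_per_worker))
               for i in range(num_workers)]
    updates = await trainer(queue, batch_size, total)
    await asyncio.gather(*workers)
    return updates

updates = asyncio.run(main())
print(f"policy updates: {updates}")
```

Because the workers and the trainer only meet at the queue, either side can be scaled or restarted without the other having to wait in lockstep, which is the property the article attributes to the asynchronous design.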

Pretraining: Power, Efficiency, and Versatility

Qwen 3.5 was pretrained with a clear focus on three things: power, efficiency, and versatility.

It was trained on a significantly larger mix of visual and text data than Qwen 3, with stronger multilingual, STEM, and reasoning coverage. Despite activating only 17B parameters at a time, the model reportedly matches the performance of much larger trillion-parameter systems.

Architecturally, it builds on the Qwen3-Next design, combining higher-sparsity MoE with hybrid attention mechanisms. This allows dramatically faster decoding speeds while maintaining comparable performance.

The model is also natively multimodal, fusing text and vision early in training. Language coverage expands from 119 to 201 languages and dialects, while a larger 250k vocabulary improves encoding and decoding efficiency across languages.
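
Why does a larger vocabulary help efficiency? With more multi-character pieces available, the same text breaks into fewer tokens, so there is less to encode and decode. The toy below shows the effect with a greedy longest-match tokenizer. It is a simplified stand-in and not Qwen’s actual tokenizer, and both vocabularies are made up for the example.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization; unknown text falls back to single characters."""
    max_len = max(len(piece) for piece in vocab)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens

# Hypothetical vocabularies: the larger one adds longer, more specific pieces.
small_vocab = {"multi", "modal"}
large_vocab = small_vocab | {"multimodal", " model", " under", "standing"}

text = "multimodal model understanding"
small_tokens = tokenize(text, small_vocab)
large_tokens = tokenize(text, large_vocab)

print(f"small vocabulary: {len(small_tokens)} tokens")
print(f"large vocabulary: {len(large_tokens)} tokens")
```

The same effect is most visible for languages that a small vocabulary covers poorly, which fits the article’s pairing of the 250k vocabulary with the expanded language coverage.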

Benchmark Performance: Where Qwen 3.5 Stands

Benchmarks show us where a model starts to separate itself from the herd of options out there. Based on Alibaba’s released evaluations, Qwen3.5-397B-A17B delivers competitive performance across reasoning, agentic workflows, coding, and multimodal understanding. Here is a look at its benchmarks and what they mean:

Instruction Following & Reasoning

  • IFBench (Instruction following): 76.5, among the top scores in its class
  • GPQA Diamond (Graduate-level reasoning): 88.4, competitive with frontier reasoning models

These results suggest strong comprehension and structured reasoning, both of which are crucial for real-world workflows.

Agentic & Tool Use Capabilities

  • BFCL v4 (Agentic tool use): 72.9
  • BrowseComp (Agentic search): 78.6
  • Terminal-Bench 2 (Agentic terminal coding): 52.5

Qwen 3.5 performs especially well in agent-driven tasks, reinforcing its positioning for workflow automation and tool orchestration.

Coding & Developer Workflows

This places it solidly in the range of models capable of handling real coding and debugging workflows.

Multilingual Knowledge

The score aligns with its expanded language coverage and improved knowledge retrieval.

Multimodal & Visible Reasoning

  • MMMU-Pro (Visual reasoning): 79.0
  • OmniDocBench v1.5 (Document understanding): 90.8
  • Video-MME (Video reasoning): 87.5
  • VITA-Bench (Agentic multimodal interaction): 49.7

These numbers highlight one of Qwen 3.5’s biggest strengths: multimodal comprehension across documents, visuals, and video.

Embodied & Spatial Reasoning

This reflects improving capabilities in real-world and embodied reasoning scenarios.

What These Benchmarks Really Mean

Instead of dominating a single category, Qwen 3.5 shows balanced strength across reasoning, agentic execution, coding, and multimodal understanding. That balance matters because modern AI workloads aren’t single-task problems. They involve tools, documents, images, code, and multi-step workflows, and Qwen 3.5 appears to be built for exactly that reality.

Hands-on With Qwen 3.5

We ran a few tests on both the Qwen3.5-397B-A17B and the Qwen3.5-Plus. Here are the tests and the results.

Task 1 – Coding with Qwen3.5-Plus

Prompt:

You are an expert frontend developer and UI/UX designer.

Build a modern, responsive promotional website (single-page landing site) for the following event. The site should be visually premium, conversion-focused, and optimized for registrations.

Event Details:
Title: iqigai AI Fellowship Challenge 2026
Tagline: India’s Largest AI and Data Tech Hunt
Presented by: Fractal
Partner: Analytics Vidhya
Registration Link:

Content to Include:
– Headline: India’s Largest AI and Data Tech Hunt is now live!
– Description:
The iqigai AI Fellowship Challenge 2026 is more than a hackathon – it’s a career-defining platform where participants compete, get nationally ranked, and gain visibility among top employers.
– Dates: 20th January – 8th March 2026
– Total Prize Pool: ₹20 Lacs
– Top Prizes:
Winner – ₹5 Lakhs
1st Runner-up – ₹3 Lakhs
2nd Runner-up – ₹2 Lakhs

Website Requirements:
1. Use HTML, CSS, and JavaScript (or React if preferred).
2. Fully responsive (desktop + mobile).
3. Modern gradient/AI-tech themed styling.
4. Smooth scrolling navigation.
5. Clear CTA buttons linking to registration page.
6. Sections:
– Hero section (big headline + CTA)
– About the Challenge
– Key Highlights / Why Participate
– Prize Section (cards or visual badges)
– Timeline / Dates
– Call-to-Action Banner
– Footer

Design Guidelines:
– Dark tech gradient background
– Subtle animations / hover effects
– Clean typography
– Cards with shadows and rounded corners
– Optional icons or illustrations
– Maintain professional event branding tone

Output Requirements:
– Provide complete runnable code
– Organize clearly into files
– Comment important sections
– Do NOT include placeholder lorem ipsum
– Ensure production-ready structure

Generate the complete website code now.

Output:

  

Task 2 – Text-to-image with Qwen3.5-Plus

Prompt:

Create a cinematic anime-style transformation scene featuring Vegeta from Dragon Ball Super unlocking Ultra Ego – depict a dark cosmic battlefield as his body radiates destructive god-like ki, muscles tightening and posture shifting into fierce confidence, hair turning deep purple and eyes glowing magenta, surrounded by a raging flame-like violet aura that crackles and distorts the environment; capture the essence of a God-of-Destruction mindset where power grows through battle intensity and damage, emphasizing savage delight, chaotic energy waves, shattered terrain, and dramatic lighting – ultra-detailed, high contrast, dynamic camera angles, motion blur, and explosive anime shading, conveying overwhelming destructive dominance and unstoppable escalation.

Output:

  

Task 3 – Image-to-video with Qwen3.5-Plus

Simply click the Create Video option on the generated image.

Output:

  

Task 4 – Text-to-image with Qwen3.5 Open Weight

Prompt:

“Slash and Burn” could be a spirit or force of nature, embodying the cycle of destruction and renewal. It might appear as a fiery, elemental being that consumes everything in its path, only for new life to emerge from the ashes. This entity could be worshipped or feared as a deity of transformation and rebirth. bottom left signature “sapope”

Output:

  

Task 5 – Image-to-video with Qwen3.5 Open Weight

Simply click the Create Video option on the generated image.

Output:

  

Final Video:

  

Conclusion

The Qwen 3.5 family, led by its open-weight model, is a step toward a more capable, unified AI system. With its hybrid MoE architecture, native multimodal design, expanded language coverage, and strong performance across reasoning, coding, and document understanding benchmarks, Alibaba is clearly optimizing for real-world workloads.

What stands out most is the balance. Instead of excelling in a single narrow task, Qwen 3.5 shows consistent strength across agentic workflows, multimodal reasoning, and efficiency at scale. As AI moves from chat interfaces to execution-driven systems, models built for versatility and throughput will matter more. With the benchmark performances and the results we see in our hands-on tests, Qwen 3.5 positions itself firmly in that future.

Technical content strategist and communicator with a decade of experience in content creation and distribution across national media, Government of India, and private platforms
