BREAKING NEWS
Jul 11, 2025 · 6 min read

Daily AI News Brief - July 11, 2025


AIToolery

Published Jul 11, 2025

The July 11, 2025 brief covers eight AI developments spanning presentation automation, image generation, 3D rendering, rapid prototyping, personalized avatars, workflow tooling, video creation, and specialized coding models.

1️⃣ Zhipu Launches AI Slides: Manus-Style PPT Generation Feature with Unlimited Free Access

Zhipu has quietly launched a PPT generation feature called AI Slides. It uses the latest GLM-Experimental model to generate presentations from user-provided research topics or documents. The feature is currently free to use, with no usage limits.

Key Features:

  • Topic-Based Generation: AI Slides can rapidly generate high-quality PPTs based on provided topics or documents
  • Clear Structure and Data Visualization: Generated PPTs feature clear structure with intuitive chart displays for easy understanding
  • Free Access: Users can experience AI Slides functionality for free at chat.z.ai without usage restrictions

To use the feature, log in at https://chat.z.ai, switch to the GLM-Experimental model, and click the AI Slides icon. The system first analyzes the content to generate a structural outline with a logical flow, then automatically generates individual pages from that outline, which removes much of the manual work in traditional PPT creation.
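The outline-then-pages flow described above can be sketched as a two-stage pipeline. This is only an illustration of the data flow, not Zhipu's implementation: the names `OutlineItem`, `Slide`, `build_outline`, and `build_slides` are hypothetical, and simple string heuristics stand in for the model calls.

```python
from dataclasses import dataclass


@dataclass
class OutlineItem:
    title: str
    points: list[str]


@dataclass
class Slide:
    number: int
    title: str
    bullets: list[str]


def build_outline(document: str) -> list[OutlineItem]:
    """Stage 1: derive a structural outline from the source text.

    Placeholder heuristic (an assumption, standing in for a model call):
    each blank-line-separated paragraph becomes a section, its first
    sentence becomes the title, and the remaining sentences the key points.
    """
    outline = []
    for paragraph in document.split("\n\n"):
        sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
        if sentences:
            outline.append(OutlineItem(title=sentences[0], points=sentences[1:]))
    return outline


def build_slides(outline: list[OutlineItem]) -> list[Slide]:
    """Stage 2: generate one page per outline item, in outline order."""
    return [
        Slide(number=i, title=item.title, bullets=item.points)
        for i, item in enumerate(outline, start=1)
    ]


if __name__ == "__main__":
    doc = (
        "Market size. Revenue grew last year. Asia led growth.\n\n"
        "Risks. Regulation is tightening."
    )
    for slide in build_slides(build_outline(doc)):
        print(slide.number, slide.title, slide.bullets)
```

Keeping the outline as an explicit intermediate value is what lets the page order follow the outline's logic: pages are generated from the outline rather than directly from the raw document.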

2️⃣ Kling AI Releases Ketu 2.1 Model: Massive Image Generation Upgrade Supporting 180+ Styles

Kling AI has released its next-generation image model, Ketu 2.1, with significant improvements in instruction following, portrait aesthetics, and cinematic texture. The new model also has strong text generation abilities and supports more than 180 styles, giving users a wider range of creative choices.

Enhanced Capabilities:

  • Complex Instruction Understanding: The new model handles complex instructions well and generates high-quality images that match them accurately
  • Enhanced Text Generation: Text generation has been strengthened, and support for 180+ styles expands the creative space
  • Free Trial Period: Kling AI is offering users a 7-day free trial of Ketu 2.1

The model demonstrates outstanding performance in understanding complex instructions, accurately capturing various elements and logical relationships in prompts. It achieves breakthroughs in image clarity, element richness, and detail authenticity, particularly excelling in portrait presentation with cinematic quality through advanced composition and lighting techniques.

3️⃣ NVIDIA Launches DiffusionRenderer: Revolutionary AI Model for Video-to-Editable 3D Scenes

NVIDIA and partners have introduced DiffusionRenderer, a technology that combines video generation with editing capabilities so that 3D scenes can be both understood and manipulated. The model pairs a neural inverse renderer with a neural forward renderer, which improves video realism and adaptability across multiple tasks.

Technical Innovation:

  • Generation and Editing Integration: DiffusionRenderer combines generation and editing functions, bringing new possibilities for 3D scene creation
  • Dual Renderer Collaboration: A neural inverse renderer and a neural forward renderer work together to enhance video realism and adaptability
  • Practical Applications: Real-world applications include dynamic lighting, material editing, and object insertion, helping creators easily perform video creation tasks

A demonstration is available at https://youtu.be/jvEdWKaPqkc. The framework enables the creation of editable, photorealistic 3D scenes from a single video, making sophisticated video editing practical and reliable for creative professionals.
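The inverse-then-forward data flow can be illustrated with a deliberately tiny stand-in. The sketch below is not NVIDIA's method: DiffusionRenderer uses neural renderers, whereas this toy assumes simple Lambertian shading, known surface normals, and a single known directional light. The names `SceneBuffers`, `inverse_render`, and `forward_render` are hypothetical.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


@dataclass
class SceneBuffers:
    """Editable intermediate scene description (toy: normals and albedo only)."""
    normals: list[Vec3]
    albedo: list[float]


def inverse_render(pixels: list[float], normals: list[Vec3], light: Vec3) -> SceneBuffers:
    """Toy inverse step: recover per-pixel albedo from observed intensities.

    Assumes intensity = albedo * max(normal . light, 0), with normals and
    the light direction already known.
    """
    albedo = []
    for value, normal in zip(pixels, normals):
        shading = max(dot(normal, light), 1e-6)  # avoid division by zero
        albedo.append(value / shading)
    return SceneBuffers(normals=list(normals), albedo=albedo)


def forward_render(buffers: SceneBuffers, light: Vec3) -> list[float]:
    """Toy forward step: re-render the buffers under a given light direction."""
    return [
        a * max(dot(n, light), 0.0)
        for a, n in zip(buffers.albedo, buffers.normals)
    ]


if __name__ == "__main__":
    normals = [(0.0, 0.0, 1.0), (0.0, 0.6, 0.8)]
    observed = [0.5, 0.4]  # pixels captured under a light pointing along +z
    buffers = inverse_render(observed, normals, light=(0.0, 0.0, 1.0))

    # Dynamic lighting: same scene, new light direction.
    print(forward_render(buffers, light=(0.0, 1.0, 0.0)))

    # Material editing: change albedo, then render again.
    buffers.albedo = [a * 0.5 for a in buffers.albedo]
    print(forward_render(buffers, light=(0.0, 0.0, 1.0)))
```

The point of the split is that edits happen on the intermediate buffers. Relighting and material changes become ordinary modifications to that representation, followed by another forward pass.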

4️⃣ ModAI Major Launch: Generate High-Fidelity Editable Prototypes from Ideas in 30 Seconds

ModAI has launched a revolutionary prototype generation feature enabling users to generate high-fidelity, editable prototypes from ideas in just 30 seconds. The system supports multi-round conversational optimization and local modifications, significantly improving product design and validation efficiency for rapid development cycles.

Revolutionary Features:

  • 30-Second Generation: Generates editable prototypes within 30 seconds, with multi-device adaptation and multi-round conversational refinement
  • Multi-Image Input Support: Accepts multiple image inputs and parses sketches, wireframes, and similar sources to generate interfaces
  • Dual-Mode Editing: Offers dual-mode editing with automatic document generation, plus design-to-code output covering multiple scenarios

The platform represents a significant advancement in rapid prototyping, enabling product managers and designers to quickly transform conceptual ideas into functional, editable prototypes through natural language interaction and intelligent interface generation capabilities.

5️⃣ Upload 10 Photos, AI Instantly Creates Fashion Blockbusters: Higgsfield Soul ID Goes Viral Globally

Soul ID is an AI tool from Higgsfield AI that generates highly personalized virtual avatars from 10 or more uploaded personal photos. Its core functionality combines realism with variety, a wide range of style presets, and automatic prompt optimization, giving content creators and fashion bloggers a powerful creative tool.

Core Capabilities:

  • Personalized Training: Users need only upload 10+ photos to generate their own AI character
  • Diverse Style Presets: More than 60 built-in style presets enable one-click switching between visual styles
  • Automatic Prompt Optimization: Users enter a simple description, and the AI automatically optimizes the generation settings to output high-quality images

Available at https://higgsfield.ai/, Soul ID eliminates generic avatars and provides complete creative control through a simple, powerful workflow. Users can train unique avatars, generate with curated style presets, achieve consistent results across different expressions and lighting, and create unlimited personas for various projects.

6️⃣ Google DeepMind Open Sources GenAI Processors: One-Click Real-Time AI Workflow Construction

Google DeepMind has open-sourced the GenAI Processors library, which gives developers lightweight tools for building asynchronous, composable generative AI workflows. The library aims to simplify complex AI workflow development and supports real-time processing of multimodal data, including audio, video, and text, making it faster to build applications on the Gemini API.

Library Highlights:

  • Modular Design: GenAI Processors simplifies complex AI workflow development through modular design
  • Multimodal Support: Supports asynchronous stream processing of audio, video, text, and other multimodal data, improving real-time application efficiency
  • Open Source Collaboration: Community contributions are expected to expand the library's functionality to cover more scenarios and programming languages

Available on GitHub at https://github.com/google-gemini/genai-processors, the library features a unified Processor interface that enables developers to decompose complex AI workflows into modular processing units, supporting full pipeline processing from input preprocessing to model calls to output generation with asyncio optimization.
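The idea of decomposing a workflow into modular, asynchronous processing units can be sketched with plain `asyncio` generators. This sketch does not use the GenAI Processors API, and the library's actual Processor interface may look quite different. Every name here (`source`, `preprocess`, `fake_model`, `postprocess`, `chain`, `run`) is hypothetical, and `fake_model` merely echoes its input in place of a real model call.

```python
import asyncio
from collections.abc import AsyncIterator, Callable, Iterable

Stream = AsyncIterator[str]
Stage = Callable[[Stream], Stream]


async def source(items: Iterable[str]) -> Stream:
    """Turn a plain iterable into an asynchronous stream of parts."""
    for item in items:
        yield item
        await asyncio.sleep(0)  # hand control back to the event loop


async def preprocess(parts: Stream) -> Stream:
    """Input preprocessing: trim whitespace and drop empty parts."""
    async for part in parts:
        text = part.strip()
        if text:
            yield text


async def fake_model(parts: Stream) -> Stream:
    """Stand-in for a model call: echoes each part after a simulated wait."""
    async for part in parts:
        await asyncio.sleep(0)
        yield f"echo: {part}"


async def postprocess(parts: Stream) -> Stream:
    """Output generation: normalize the model output."""
    async for part in parts:
        yield part.upper()


def chain(*stages: Stage) -> Stage:
    """Compose stages left to right into a single stage."""
    def pipeline(parts: Stream) -> Stream:
        for stage in stages:
            parts = stage(parts)
        return parts
    return pipeline


async def run(items: Iterable[str]) -> list[str]:
    pipeline = chain(preprocess, fake_model, postprocess)
    return [out async for out in pipeline(source(items))]


if __name__ == "__main__":
    print(asyncio.run(run(["  hello ", "", "world"])))
```

Because every stage shares the same stream-in, stream-out shape, stages can be reordered or swapped without touching the others, and each part flows through the whole pipeline as soon as it is available instead of waiting for the full input.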

7️⃣ Google Veo 3 Adds Image-to-Video Feature: 40+ Million Videos Created in Seven Weeks

Google continues advancing in AI video generation by launching image-to-video functionality and strengthening content recognition mechanisms, demonstrating strong market demand for AI creative tools. The platform has achieved remarkable adoption with over 40 million videos generated across platforms within seven weeks of launch.

Platform Expansion:

  • New Functionality: Google adds image-to-video generation through Gemini application, expanding AI creative tool capabilities
  • Enhanced Creation: Users can upload photos to generate video clips and add descriptive audio, supporting download or sharing of works
  • Content Traceability: All videos generated with Veo 3 models include visible and invisible digital watermarks to ensure content traceability

The image-to-video capability transforms still photographs into dynamic 8-second video clips with sound, rolling out to Google AI Pro and Ultra subscribers in select countries. The feature maintains character consistency across multiple shots while providing rich camera movement options including dolly shots for professional video creation.

8️⃣ Mistral AI Releases Devstral 2507: Built for Code-Centric Language Modeling

Mistral AI has partnered with All Hands AI to launch the Devstral 2507 model series, which includes the open-source Devstral Small 1.1 and the enterprise-grade Devstral Medium 2507. The models focus on code reasoning, program synthesis, and structured task execution, and are aimed at practical work in large software codebases.

Model Performance:

  • Benchmark Excellence: Devstral Small 1.1 scores 53.6% on the SWE-Bench Verified benchmark, while Devstral Medium 2507 scores 61.6%
  • Commercial Outperformance: Performance surpasses some commercial models in code-centric tasks and software engineering applications
  • Specialized Focus: Models specifically designed for code reasoning, program synthesis, and structured task execution in enterprise environments

The series advances specialized AI models for software development. Devstral Small 1.1 is available as open source for local deployment, while Devstral Medium 2507 targets enterprise applications through API access. Both are positioned as stronger at agentic coding tasks and tool use than traditional copilot-style systems.