type
status
date
slug
summary
tags
category
icon
password
Zhipu AI released its latest large base model GLM-4-Plus and demonstrated visual capabilities similar to the OpenAI GPT 4o model, capable of free voice calls and visual reasoning, and announced that it will be open on August 30!
Major Updates of GLM-4-Plus
- Language base model GLM-4-Plus: The performance in language understanding, instruction following, long text processing, etc. has been comprehensively improved, maintaining the international leading level.
- CogView-3-Plus, a Wensheng graph model, has performance close to that of the current best models such as MJ-V6 and FLUX.
- Image/Video Understanding Model GLM-4V-Plus: It has excellent image understanding capabilities and time-aware video understanding capabilities. This model will be launched on the open platform (bigmodel.cn) and become the first general video understanding model API in China.
- Video generation model CogVideoX: After the release and open source of version 2B, version 5B was also officially open sourced. Its performance was further enhanced, making it the best choice among current open source video generation models.
GLM-4-Plus has demonstrated excellent performance in many aspects, achieving significant improvements in language understanding, instruction following, long text processing, and many other aspects.
Functions and Features of GLM-4-Plus
Language comprehension and processing ability
- Enhanced language understanding: GLM-4-Plus has improved its performance in language understanding, instruction following, and long text processing, and can better understand and process complex text tasks.
- Long text processing: Through a more accurate long and short text data mixing strategy, the long text reasoning effect of GLM-4-Plus has been significantly improved, comparable to the international advanced level.
- GLM-4-Plus is comparable to GPT-4o and Llama3.1 with 405B parameters in terms of language and text capabilities.
Model construction and data synthesis
- High-quality synthetic data: GLM-4-Plus uses a large amount of model-assisted construction of high-quality synthetic data to improve model performance, especially in the performance of reasoning (such as mathematics and code algorithm questions), better reflecting human preferences.
Multimodal Capabilities:
- Image and video understanding: GLM-4V-Plus, as an extension of GLM-4-Plus, has excellent image understanding capabilities and adds time-aware video understanding capabilities, which can understand complex video content and perform temporal reasoning.
- Image and video generation: In conjunction with models such as CogView-3-Plus and CogVideoX, GLM-4-Plus can demonstrate superior performance in tasks such as image editing and video generation.
Video Understanding Example by GLM-4-Plus
User:
What does the player in green do in the entire video?
GLM-4-Plus:
Throughout the video, the player in green dribbles the ball down the court before jumping and throwing it into the basket.
User:
What is the highlight of this video? At which second does it happen?
GLM-4-Plus:
The highlight of this video happens at the 4th second, when the player in green jumps and puts the ball into the basket.
Zhipu's Qingyan APP will have a "video call" function, which is also the first video call function open to C-end users in China. Qingyan's video call capability spans text, audio and video modes, and has the ability to perform real-time reasoning.
It can make smooth calls and respond quickly even if it is frequently interrupted. As long as the camera is turned on, Qingyan can also see what we see, and can understand the instructions and execute them accurately.
The video call function will be launched on August 30, and will be first available to some Qingyan users, and will also be open to external applications.
- Author:KCGOD
- URL:https://kcgod.com/glm-4-plus-by-zhipu-ai
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!
Relate Posts
Google Launches Gemini-Powered Vids App for AI Video Creation
FLUX 1.1 Pro Ultra: Revolutionary AI Image Generator with 4MP Resolution
X-Portrait 2: ByteDance's Revolutionary AI Animation Tool for Cross-Style Expression Transfer
8 Best AI Video Generators Your YouTube Channel Needs
Meta AI’s Orion AR Glasses: Smart AI-Driven Tech to Replace Smartphones