The present disclosure provides an image segmentation method and apparatus, an electronic device, and a storage medium. The image segmentation method includes: obtaining an image to be segmented; determining a preliminarily segmented image and a target normal vector image that are corresponding to the image to be segmented; and performing image fusion on the preliminarily segmented image and the target normal vector image to obtain a target segmented image.
. An image segmentation method, comprising:
. The method according to, wherein the determining the preliminarily segmented image and the target normal vector image that are corresponding to the image to be segmented comprises:
. The method according to, before the inputting the image to be segmented to the image segmentation model that has been pre-trained, further comprising:
. The method according to, wherein the training the big model to obtain the teacher model comprises:
. The method according to, wherein the calculating the big model segmentation loss between the big model segmented image and the segmentation marked image corresponding to the sample segmented image comprises:
. The method according to, wherein the calculating the big model normal vector loss between the big model normal vector image and the sample normal vector image corresponding to the sample segmented image comprises:
. The method according to, wherein the training the small model comprises:
. The method according to, further comprising:
. The method according to, wherein the calculating the small model segmentation output loss based on the small model segmented image of the sample segmented image, the segmentation marked image, and the big model segmented image output by the teacher model comprises:
. The method according to, wherein the calculating the small model normal vector output loss based on the small model normal vector image of the sample segmented image, the sample normal vector image, and the big model normal vector image output by the teacher model comprises:
. The method according to, further comprising:
. The method according to, wherein the performing image fusion on the preliminarily segmented image and the target normal vector image to obtain the target segmented image comprises:
. The method according to, after the performing image fusion on the preliminarily segmented image and the target normal vector image, further comprising:
. (canceled)
. An electronic device, comprising:
. A storage medium comprising computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, cause implementing the image segmentation method according to.
. The electronic device according to, wherein the determining the preliminarily segmented image and the target normal vector image that are corresponding to the image to be segmented comprises:
. The electronic device according to, before the inputting the image to be segmented to the image segmentation model that has been pre-trained, further comprising:
. The electronic device according to, wherein the training the big model to obtain the teacher model comprises:
. The electronic device according to, wherein the training the small model comprises:
. The electronic device according to, wherein the performing image fusion on the preliminarily segmented image and the target normal vector image to obtain the target segmented image comprises:
The present disclosure claims the priority to Chinese patent application No. 202210475990.9, filed in the Chinese Patent Office on Apr. 29, 2022, which is incorporated herein by reference in its entirety.
Embodiments of the present disclosure relate to the image processing technology, e.g., to an image segmentation method and apparatus, an electronic device, and a storage medium.
At present, image segmentation may be realized by a deep learning algorithm based on a convolutional neural network, or may be realized by a traditional algorithm based on edge detection and plane estimation information.
However, the deep learning algorithm based on the convolutional neural network may have the problem of a poor segmentation effect due to partially missed segmentation. The traditional algorithm based on edge detection and plane estimation information imposes a high requirement on a segmented image (for example, a segmented portion of the segmented image is smooth, etc.), making it difficult to reasonably segment a segmented image with blurred edges or irregular edges.
The disclosure provides an image segmentation method and apparatus, an electronic device, and a storage medium to improve the accuracy and stability of image segmentation.
In a first aspect, the embodiments of the present disclosure provide an image segmentation method, which includes:
In a second aspect, the embodiments of the present disclosure provide an image segmentation apparatus, which includes:
In a third aspect, the embodiments of the present disclosure provide an electronic device, which includes:
In a fourth aspect, the embodiments of the present disclosure provide a storage medium including computer-executable instructions, where the computer-executable instructions, when executed by a computer processor, cause implementing the image segmentation method according to any one of embodiments of the present disclosure.
Embodiments of the present disclosure will be described below with reference to the drawings. While some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. It should be understood that the drawings and embodiments of the present disclosure are only for exemplary purposes.
It should be understood that the various steps described in the method embodiments of the present disclosure may be performed in different orders and/or in parallel. Furthermore, the method embodiments may include additional steps and/or omit performing the illustrated steps.
As used herein, the term “include,” “comprise,” and variations thereof are open-ended inclusions, i.e., “including but not limited to.” The term “based on” is “based, at least in part, on.” The term “an embodiment” represents “at least one embodiment,” the term “another embodiment” represents “at least one additional embodiment,” and the term “some embodiments” represents “at least some embodiments.” Relevant definitions of other terms will be given in the description below.
It should be noted that concepts such as the “first,” “second,” or the like mentioned in the present disclosure are only used to distinguish different devices, modules or units, and are not used to limit the interdependence relationship or the order of functions performed by these devices, modules or units.
It should be noted that the modifications of “a,” “an,” “a plurality of,” or the like mentioned in the present disclosure are illustrative rather than restrictive, and those skilled in the art should understand that unless the context clearly indicates otherwise, these modifications should be understood as “one or more.”
The names of the messages or information exchanged between a plurality of apparatuses in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of these messages or information.
It will be understood that before using the technical solutions disclosed in various embodiments of the present disclosure, a user should be notified of a type, a range of use, a usage scenario and the like of personal information involved in the present disclosure in an appropriate manner in accordance with relevant laws and regulations, and these should be authorized by the user.
For example, in response to receiving an active request from a user, a prompt message is sent to the user to explicitly prompt the user that the operation the user requests to perform will require to acquire and use the personal information of the user. Thus, the user can independently select, according to the prompt message, whether or not to provide the personal information to software or hardware such as an electronic device, an application, a server or a memory medium that performs the operations of the technical solutions of the present disclosure.
As an optional implementation, in response to receiving an active request from a user, a manner of sending a prompt message to the user may be, for example, using a pop-up window in which the prompt message may be presented in the form of text. Furthermore, the pop-up window may also carry option controls for a user to select to “agree” or “disagree” with providing personal information to an electronic device.
It will be understood that the processes of notifying of and authorizing by a user described above are merely exemplary and do not constitute a limitation on the implementations of the present disclosure, and other manners meeting relevant laws and regulations may also be applied to the implementations of the present disclosure.
It will be understood that data (including data itself, and the acquisition and use of data) involved in the present technical solutions should follow corresponding laws and regulations and requirements of relevant stipulations.
The accompanying drawing is a flowchart of an image segmentation method provided by an embodiment of the present disclosure. The embodiment of the present disclosure is applicable to a scenario of performing image segmentation on a predetermined portion to be segmented of an image. The image segmentation method may be performed by an image segmentation apparatus that may be implemented in the form of software and/or hardware and optionally implemented by an electronic device. The electronic device may be a mobile terminal, a personal computer (PC), a server, or the like.
As shown in the flowchart, the method includes:
S: obtaining an image to be segmented.
The image to be segmented may be an image having a portion to be segmented, where the portion to be segmented may be a portion needing to be segmented. For example, the portion to be segmented may be a floor, a wall, a ceiling, etc.
Exemplarily, the image to be segmented may be obtained from a shooting apparatus. The image to be segmented may also be obtained by uploading or downloading by a user and the like. A way of obtaining the image to be segmented may be set according to an actual situation.
Optionally, the image to be segmented may also be a video frame to be segmented in a video, e.g., each frame or part of frames in the video. Exemplarily, obtaining the image to be segmented may include obtaining a target video frame in a target video and taking the target video frame as the image to be segmented. For example, obtaining the target video frame in the target video may include: obtaining a video frame in the target video frame by frame as the target video frame, or obtaining a video frame in the target video at intervals of a preset number of video frames as the target video frame, or obtaining a video frame in the target video at intervals of a preset duration as the target video frame, or obtaining a video frame including a target object in the target video as the target video frame.
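The frame selection options above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `select_frame_indices` and its parameters are hypothetical, "at intervals of a preset number of video frames" is assumed to mean taking every `frame_step`-th frame, and the target object check is assumed to be supplied by the caller as a predicate. The disclosure lists the options as alternatives; the sketch lets them be combined only for brevity.

```python
from typing import Callable, List, Optional


def select_frame_indices(
    total_frames: int,
    frame_step: int = 1,
    fps: Optional[float] = None,
    interval_seconds: Optional[float] = None,
    contains_target: Optional[Callable[[int], bool]] = None,
) -> List[int]:
    """Return indices of the video frames to use as images to be segmented.

    All names here are hypothetical and chosen for illustration only.
    """
    if interval_seconds is not None:
        if fps is None:
            raise ValueError("fps is required when interval_seconds is given")
        # Convert the preset duration into a whole number of frames.
        frame_step = max(1, round(fps * interval_seconds))
    if frame_step < 1:
        raise ValueError("frame_step must be at least 1")
    # frame_step == 1 selects the video frame by frame.
    indices = list(range(0, total_frames, frame_step))
    if contains_target is not None:
        # Keep only the frames that the caller's predicate marks as
        # containing the target object.
        indices = [i for i in indices if contains_target(i)]
    return indices
```

With the defaults every frame is selected; passing `frame_step=3` keeps every third frame, and passing `fps` together with `interval_seconds` keeps one frame per preset duration.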
S: determining a preliminarily segmented image and a target normal vector image that are corresponding to the image to be segmented.
The preliminarily segmented image may be a segmented image obtained by preliminarily segmenting the image to be segmented. The preliminarily segmented image includes a portion to be segmented that is roughly segmented. The preliminarily segmented image may be an image obtained by performing segmentation processing on the image to be segmented based on a segmentation model, or may be an image obtained by performing calculation on the image to be segmented based on an image segmentation algorithm. The target normal vector image may be an image obtained by extracting a normal vector from the image to be segmented. The target normal vector image may be an image obtained based on a normal vector extraction model or a normal vector calculation method. The normal vector extraction model may be trained based on a sample segmented image and a sample normal vector image corresponding to the sample segmented image.
It needs to be noted that a pixel value of each pixel in the normal vector image characterizes a normal vector corresponding to the pixel in the image to be segmented corresponding to the normal vector image. The normal vector may be a value obtained based on normal assisted stereo depth estimation, or may be obtained based on other normal vector determination ways.
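One way a normal vector could be stored as a pixel value is sketched below. The encoding is an assumption, not something the disclosure specifies: each component of a unit normal in [-1, 1] is mapped linearly to [0, 255], and the second channel is assumed to hold the vertical (up) component. Under these assumptions an upward-facing floor normal has a second channel value of 255 and a downward-facing ceiling normal has 0, which matches the floor example given later in this description. The function name `encode_normal` is hypothetical.

```python
from typing import Tuple


def encode_normal(normal: Tuple[float, float, float]) -> Tuple[int, int, int]:
    """Map a unit normal with components in [-1, 1] to three 8-bit values.

    Assumed encoding: value = round((component + 1) / 2 * 255).
    """
    return tuple(
        # Clamp each component to [-1, 1] before scaling to [0, 255].
        round((min(1.0, max(-1.0, c)) + 1.0) * 0.5 * 255.0)
        for c in normal
    )


floor_pixel = encode_normal((0.0, 1.0, 0.0))     # normal pointing up
ceiling_pixel = encode_normal((0.0, -1.0, 0.0))  # normal pointing down
```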
Exemplarily, after the image to be segmented is obtained, preliminary segmentation processing may be performed on the image to be segmented to obtain the preliminarily segmented image, and normal vector extraction processing may be performed on the image to be segmented to obtain the target normal vector image. Alternatively, an overall image processing model is pre-trained to preliminarily extract a segmented image and a normal vector image, and the image to be segmented is processed by the overall image processing model to obtain the preliminarily segmented image and the target normal vector image.
S: performing image fusion on the preliminarily segmented image and the target normal vector image to obtain a target segmented image.
The target segmented image may be a finally obtained segmented image that is segmented to obtain the portion to be segmented.
Exemplarily, a corresponding weight of each pixel may be determined based on the obtained target normal vector image, and then a pixel value of each pixel in the preliminarily segmented image may be weighted based on the corresponding weight of each pixel so that a weighted segmented image can be obtained. The weighted segmented image is the target segmented image.
Optionally, image fusion may be performed on the preliminarily segmented image and the target normal vector image to obtain the target segmented image by the following steps.
Step 1: for each pixel in the preliminarily segmented image, determining a predicted weight of the pixel based on a predicted pixel value of the pixel in the target normal vector image, and a preset segmentation threshold.
The preset segmentation threshold may be a threshold used to identify a normal vector of the portion to be segmented. The predicted pixel value may be a value at a corresponding pixel in a preset channel in the target normal vector image. The predicted weight may be obtained by calculating with the predicted pixel value and the preset segmentation threshold, and the predicted weight is a weight for subsequently weighting the pixel value of each pixel in the preliminarily segmented image. For example, the predicted weight may be a quotient of the predicted pixel value and the preset segmentation threshold.
Exemplarily, for each pixel in the preliminarily segmented image, the predicted pixel value corresponding to the pixel in the target normal vector image is determined. The predicted weight of each pixel is calculated based on the predicted pixel value of the pixel and the preset segmentation threshold and is used for subsequently weighting the pixel value of the pixel in the preliminarily segmented image.
Step 2: weighting a pixel value of the pixel in the preliminarily segmented image based on the predicted weight to obtain a target pixel value of the pixel.
The target pixel value may be a product of the pixel value in the preliminarily segmented image and the predicted weight.
Exemplarily, for each pixel in the preliminarily segmented image, the product of the pixel value of the pixel in the preliminarily segmented image and the predicted weight is taken as the target pixel value of the pixel.
Step 3: determining the target segmented image based on the target pixel value of each pixel in the preliminarily segmented image.
Exemplarily, the target pixel value of each pixel in the preliminarily segmented image is integrated according to the position of each pixel so that the target segmented image can be obtained.
Exemplarily, the portion to be segmented is a floor portion, and the target normal vector image is a three-channel image. The second channel value of the target normal vector image is usually 255 for the floor and usually 0 for the ceiling. Therefore, the preliminarily segmented image of the floor may be processed based on this information to reduce the ceiling portion that is segmented as the floor portion. The preset segmentation threshold may be threshold=140.0. The target segmented image may be determined according to the following formula:
Thus, refined_mask may be standardized to define the value of refined_mask within [0, 255] such that the pixel value of each pixel in the target segmented image is assigned to be between 0 and 255.
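The formula itself is not reproduced in this text. A minimal sketch consistent with Steps 1 to 3 above is given here under stated assumptions: the predicted weight is the quotient of the second channel value and the threshold, the target pixel value is the product of the mask value and that weight, and the "standardization" to [0, 255] is assumed to be clipping. The function name `fuse_mask_with_normals` and the nested-list image layout are hypothetical; a practical implementation would more likely operate on arrays.

```python
def fuse_mask_with_normals(mask, normal_image, threshold=140.0, channel=1):
    """Weight each mask pixel by its normal vector channel value.

    mask:         rows of pixel values in [0, 255] (preliminarily segmented image)
    normal_image: rows of (c0, c1, c2) tuples (target normal vector image)
    channel:      index of the preset channel; 1 is the second channel
    """
    refined_mask = []
    for mask_row, normal_row in zip(mask, normal_image):
        refined_row = []
        for mask_value, normal_pixel in zip(mask_row, normal_row):
            # Step 1: predicted weight = predicted pixel value / threshold.
            weight = normal_pixel[channel] / threshold
            # Step 2: target pixel value = mask pixel value * predicted weight.
            target_value = mask_value * weight
            # Assumed standardization: clip the result to [0, 255].
            refined_row.append(min(255.0, max(0.0, target_value)))
        # Step 3: assemble the target pixel values by position.
        refined_mask.append(refined_row)
    return refined_mask
```

Under these assumptions a pixel marked as floor in the mask but carrying a ceiling-like normal (second channel near 0) is suppressed toward 0, while a pixel whose second channel value exceeds the threshold keeps or saturates its mask value.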
The portion to be segmented in the image to be segmented is related to shooting angle information of the image shooting apparatus. For example, in the case where the portion to be segmented is the floor portion, when the shooting angle information is an elevation of 90 degrees, the image to be segmented may be considered as having no floor portion. Therefore, after performing image fusion on the preliminarily segmented image and the target normal vector image, adjustment processing may further be performed on the target segmented image based on the shooting angle information.
The image shooting apparatus may be an apparatus for shooting the image to be segmented, such as a smart phone, a video camera, and a digital camera. The shooting angle information may be elevation information of the image shooting apparatus when shooting the image to be segmented. The shooting angle information may be measured by an inertial measurement unit (IMU).
Exemplarily, the shooting angle information when shooting the image to be segmented may be obtained based on the IMU in the image shooting apparatus. Whether the target segmented image includes the portion to be segmented is determined based on the shooting angle information, and the target segmented image is processed based on a determination result to obtain the final target segmented image. If it is determined that the target segmented image does not include the portion to be segmented based on the shooting angle information, each pixel in the target segmented image may be set to zero, and the image set to zero may be taken as the final target segmented image. If it is determined that the target segmented image includes the portion to be segmented based on the shooting angle information, the target segmented image may be taken as the final target segmented image.
Exemplarily, when the portion to be segmented in the image to be segmented is the floor, if the shooting angle information is 30 degrees to 90 degrees, each pixel in the target segmented image is set to zero, and if the shooting angle information is other angles, the pixel value of each pixel in the target segmented image is maintained.
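The floor example above can be sketched as a simple gate on the elevation angle. This is an illustration under assumptions: the function name `gate_mask_by_elevation` is hypothetical, the elevation is assumed to be already available in degrees (for example, derived from IMU readings), and both ends of the 30 to 90 degree range are assumed to be inclusive.

```python
def gate_mask_by_elevation(mask, elevation_deg, min_deg=30.0, max_deg=90.0):
    """Zero the target segmented image when the camera elevation implies
    that the floor cannot be in view; otherwise return it unchanged.

    mask: rows of pixel values (target segmented image)
    """
    if min_deg <= elevation_deg <= max_deg:
        # The shooting angle indicates no floor portion: set every pixel to zero.
        return [[0 for _ in row] for row in mask]
    # Other angles: keep the pixel value of each pixel.
    return [list(row) for row in mask]
```

For other portions to be segmented, such as a ceiling, the angle range would differ; the defaults here only mirror the floor example.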
According to the technical solution of this embodiment of the present disclosure, by obtaining the image to be segmented, determining the preliminarily segmented image and the target normal vector image that are corresponding to the image to be segmented, and performing image fusion on the preliminarily segmented image and the target normal vector image to obtain the target segmented image, the problems of poor accuracy and stability of image segmentation are solved, and the technical effect of improving the accuracy and stability of image segmentation is achieved.
October 23, 2025