Deep Image-to-Image Network Learning for Medical Image Analysis

Published: August 28, 2018

Assignee: not available in the USPTO data

Inventors: S. Kevin Zhou, Dorin Comaniciu, Bogdan Georgescu, Yefeng Zheng, David Liu, Daguang Xu

Patent Claims

22 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method for automatically performing a medical image analysis task on a medical image of a patient, comprising: receiving an input medical image of a patient; and automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN), wherein the DI2IN uses a conditional random field (CRF) energy function to estimate the output image based on the input medical image and uses a trained deep learning network to model unary and pairwise terms of the CRF energy function, wherein automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) comprises: generating an image pyramid with a plurality of reduced resolution images of the input medical image, and generating a respective output image that provides a result of the target medical image analysis task on each of the reduced resolution images of the input medical image using a sequence of trained DI2INs including a respective DI2IN trained at each of a plurality of resolution levels of the plurality of reduced resolution images, wherein the respective output image generated for each of the plurality of reduced resolution images of the input medical image defines a region of interest that is cropped from an image at a subsequent higher resolution of the input medical image to constrain the respective DI2IN trained at the subsequent higher resolution.
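
The coarse-to-fine scheme recited in claim 1 can be illustrated with a minimal sketch. It assumes a 2D image stored as nested lists, 2x2 average pooling to build the pyramid, and a plain intensity threshold (`toy_model`) standing in for each trained DI2IN. All function names and these choices are illustrative assumptions, not taken from the patent.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (assumes even dimensions)."""
    h, w = len(img), len(img[0])
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def build_pyramid(img, levels):
    """Return [full resolution, 1/2, 1/4, ...] with `levels` reduced images."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def toy_model(img, threshold=0.2):
    """Hypothetical stand-in for a trained DI2IN: a plain intensity threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in img]


def bounding_box(mask):
    """Half-open box (r0, r1, c0, c1) around non-zero pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return None
    return rows[0], rows[-1] + 1, cols[0], cols[-1] + 1


def coarse_to_fine(img, levels, models=None):
    """Run one model per level, coarsest first; each output crops the next level."""
    pyramid = build_pyramid(img, levels)
    models = models or [toy_model] * (levels + 1)
    roi = (0, len(pyramid[-1]), 0, len(pyramid[-1][0]))  # coarsest level: whole image
    mask = None
    for level in range(levels, -1, -1):
        r0, r1, c0, c1 = roi
        crop = [row[c0:c1] for row in pyramid[level][r0:r1]]
        mask = models[level](crop)
        box = bounding_box(mask)
        if box is None:
            return None, None
        # Express the box in this level's full-image coordinates.
        roi = (r0 + box[0], r0 + box[1], c0 + box[2], c0 + box[3])
        if level > 0:
            roi = tuple(2 * v for v in roi)  # map to the next higher resolution
    return roi, mask
```

In this sketch each level only sees the region selected by the level below it, which is the sense in which the lower-resolution output constrains the higher-resolution network.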

2. The method of claim 1, wherein the target medical image analysis task is detection of one or more anatomical landmarks in the input medical image, and the estimated output image is one of a mask image in which only locations of the one or more anatomical landmarks have non-zero pixel or voxel values or an image with a Gaussian-like circle defined surrounding locations of the one or more anatomical landmarks.
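
The two output-image types named in claim 2 can be sketched as follows. The code assumes 2D images as nested lists, an isotropic Gaussian with an arbitrary `sigma`, and a simple argmax to read a landmark back from the heatmap. The function names and parameter values are illustrative assumptions, not part of the claim.

```python
import math


def landmark_mask(shape, landmarks):
    """Mask image: only the landmark locations are non-zero."""
    h, w = shape
    mask = [[0] * w for _ in range(h)]
    for r, c in landmarks:
        mask[r][c] = 1
    return mask


def landmark_heatmap(shape, landmarks, sigma=1.5):
    """Gaussian-like blob around each landmark; overlapping blobs keep the maximum."""
    h, w = shape
    heat = [[0.0] * w for _ in range(h)]
    for lr, lc in landmarks:
        for r in range(h):
            for c in range(w):
                d2 = (r - lr) ** 2 + (c - lc) ** 2
                heat[r][c] = max(heat[r][c], math.exp(-d2 / (2 * sigma ** 2)))
    return heat


def decode_landmark(heat):
    """Read a single landmark back as the location of the largest response."""
    _, r, c = max((v, r, c) for r, row in enumerate(heat) for c, v in enumerate(row))
    return r, c
```

A soft target of this kind is often easier to regress than a single non-zero pixel, which is one plausible reason to offer both forms.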

3. The method of claim 1, wherein the target medical image analysis task is detection of an anatomy of interest in the input medical image, and the estimated output image is a mask image in which only pixels or voxels located within a bounding box of the anatomy of interest have non-zero values.

4. The method of claim 1, wherein the target medical image analysis task is segmentation of one or more anatomies of interest in the input medical image, and the estimated output image is one of a mask image in which only pixels or voxels located within boundaries of the one or more anatomies of interest have non-zero values or an image with a Gaussian-like band defined surrounding boundaries of the one or more anatomies of interest.

5. The method of claim 1, wherein the target medical image analysis task is lesion detection, segmentation, and characterization, and the estimated output image is a multi-label mask image in which only pixels or voxels within lesion boundaries of one or more lesions have non-zero values assigned to each of the one or more lesions corresponding to a lesion type for each lesion.

6. The method of claim 1, wherein the target medical image analysis task is an image denoising task, and the estimated output image is a reduced noise image of the input medical image.

7. The method of claim 1, wherein the input medical image is a medical image in a source domain, the target medical image analysis task is cross-domain image synthesis, and the estimated output image is a synthesized medical image in a target domain corresponding to the input medical image.

8. The method of claim 1, wherein receiving the input medical image includes receiving the input medical image in a pair of input medical images acquired using different imaging modalities, the target medical image analysis task is registration of the pair of input medical images, and the estimated output image is a deformation field that provides the registration between the pair of input medical images.

9. The method of claim 1, wherein receiving the input medical image includes receiving the input medical image in a set of input medical images, the target medical image analysis task is a quantitative parametric mapping task, and the estimated output image is a set of quantitative parameters that generate the set of input medical images given a pointwise generative model.
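
To make the idea of a "pointwise generative model" in claim 9 concrete, the sketch below assumes a mono-exponential decay, S(t) = S0 · exp(−t / T2), evaluated independently at each pixel. This model, the parameter names, and the classical log-linear fit are illustrative assumptions; the claim does not name a specific model, and a trained DI2IN would predict the parameter maps directly rather than fit them per pixel.

```python
import math


def generate(s0, t2, echo_times):
    """Pointwise generative model (assumed): one signal value per echo time."""
    return [s0 * math.exp(-t / t2) for t in echo_times]


def fit_pixel(signals, echo_times):
    """Least-squares line through log(signal) vs. time: log S = log S0 - t / T2."""
    n = len(echo_times)
    ys = [math.log(s) for s in signals]
    mean_t = sum(echo_times) / n
    mean_y = sum(ys) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in zip(echo_times, ys))
             / sum((t - mean_t) ** 2 for t in echo_times))
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


def parameter_maps(images, echo_times):
    """Turn a set of input images into per-pixel (S0, T2) parameter maps."""
    h, w = len(images[0]), len(images[0][0])
    s0_map = [[0.0] * w for _ in range(h)]
    t2_map = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            signals = [img[r][c] for img in images]
            s0_map[r][c], t2_map[r][c] = fit_pixel(signals, echo_times)
    return s0_map, t2_map
```

The returned maps are the "set of quantitative parameters" in this toy setting: feeding them back through `generate` reproduces the input image set.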

10. The method of claim 1, further comprising, in a training stage prior to receiving the input medical image of the patient: defining a type of output image that provides the result of the target medical image analysis task; receiving a plurality of input training images; receiving or generating corresponding output training images for the plurality of input training images, resulting in a training set of paired input and output training images; and training the DI2IN by learning weight parameters of a deep learning network that models the unary and pairwise terms of the CRF energy function that result in a maximum likelihood for paired input and output training images over the training set of paired input and output training images.
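
The training objective in claim 10 can be written out using the standard form of a CRF. The notation here is an assumption for illustration and is not quoted from the patent: I is the input image, J the output image, θ the network weights, U and V the unary and pairwise terms, 𝒩 the set of neighboring pixel pairs, and (I_n, J_n) the n-th of N training pairs.

```latex
\begin{align}
E(J \mid I; \theta) &= \sum_{i} U(J_i \mid I; \theta)
  + \sum_{(i,j) \in \mathcal{N}} V(J_i, J_j \mid I; \theta) \\
P(J \mid I; \theta) &= \frac{\exp\{-E(J \mid I; \theta)\}}{Z(I; \theta)} \\
\hat{\theta} &= \arg\max_{\theta} \sum_{n=1}^{N} \log P(J_n \mid I_n; \theta)
\end{align}
```

Under this form, lower energy means higher likelihood, so maximizing the likelihood over the training set pushes the energy of each ground-truth output down relative to all other candidate outputs.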

11. The method of claim 1, wherein automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) comprises: estimating an output image that maximizes a likelihood of the CRF energy function given the input medical image and a set of learned weight parameters of the trained deep learning network, wherein the trained deep learning network calculates the unary and pairwise terms of the CRF energy function based on the input medical image, the estimated output image, and the set of learned weight parameters of the trained deep learning network.
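
Estimating the output image that maximizes the likelihood is, for a CRF, the same as finding the output with the lowest energy. The sketch below shows that search on a toy binary labeling problem. It swaps in two simplifications that are assumptions, not the patented method: the unary and pairwise terms are hand-written functions rather than outputs of a trained network, and the minimizer is iterated conditional modes (ICM), a simple greedy scheme.

```python
def unary(label, pixel):
    """Hypothetical unary term: squared distance between label and intensity."""
    return (label - pixel) ** 2


def pairwise(a, b, weight=0.3):
    """Hypothetical pairwise (Potts) term: fixed penalty when neighbors disagree."""
    return weight if a != b else 0.0


def neighbors(r, c, h, w):
    """Yield the 4-connected neighbors of (r, c) that lie inside the image."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w:
            yield rr, cc


def energy(labels, image):
    """Total CRF energy: all unary terms plus each neighbor pair counted once."""
    h, w = len(image), len(image[0])
    total = 0.0
    for r in range(h):
        for c in range(w):
            total += unary(labels[r][c], image[r][c])
            if r + 1 < h:
                total += pairwise(labels[r][c], labels[r + 1][c])
            if c + 1 < w:
                total += pairwise(labels[r][c], labels[r][c + 1])
    return total


def icm(image, sweeps=5):
    """Greedy minimization: set each pixel to its locally cheapest label, repeat."""
    h, w = len(image), len(image[0])
    labels = [[1 if v > 0.5 else 0 for v in row] for row in image]
    for _ in range(sweeps):
        changed = False
        for r in range(h):
            for c in range(w):
                def local(label):
                    return unary(label, image[r][c]) + sum(
                        pairwise(label, labels[rr][cc])
                        for rr, cc in neighbors(r, c, h, w))
                best = min((0, 1), key=local)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:
            break
    return labels
```

ICM only finds a local minimum, so it is a teaching device here; the point is that the pairwise term lets neighboring pixels override an isolated, weakly supported label.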

12. The method of claim 1, wherein automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) further comprises: dividing the input medical image into a plurality of parts; automatically generating a respective output image that provides a result of the target medical image analysis task on each of the plurality of parts of the input medical image using a respective trained DI2IN for each of the plurality of parts; and aggregating the output images that provide the results of the target medical image analysis task on each of the plurality of parts to generate a final output image that provides the result of the target medical image analysis task on the input medical image.
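
The divide, process, and aggregate steps of claim 12 can be sketched as below. The sketch assumes the parts are horizontal bands with a one-row overlap, that each part has its own callable model, and that overlapping outputs are averaged. The claim does not specify how parts are chosen or how outputs are aggregated, so these are illustrative choices, as are the function names.

```python
def split_rows(img, n_parts, overlap=1):
    """Return (start, stop) row ranges for horizontal bands that overlap slightly."""
    h = len(img)
    step = h // n_parts
    spans = []
    for k in range(n_parts):
        start = max(0, k * step - overlap)
        stop = h if k == n_parts - 1 else min(h, (k + 1) * step + overlap)
        spans.append((start, stop))
    return spans


def run_by_parts(img, models, overlap=1):
    """Apply one model per part, then average the outputs where parts overlap."""
    h, w = len(img), len(img[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    spans = split_rows(img, len(models), overlap)
    for (start, stop), model in zip(spans, models):
        out = model(img[start:stop])  # each model sees only its own part
        for r in range(start, stop):
            for c in range(w):
                total[r][c] += out[r - start][c]
                count[r][c] += 1
    return [[total[r][c] / count[r][c] for c in range(w)] for r in range(h)]
```

For example, `run_by_parts(img, [model_top, model_bottom])` would process the upper and lower halves with separate (hypothetical) models and blend their predictions along the shared rows.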

13. An apparatus for automatically performing a medical image analysis task on a medical image of a patient, comprising: means for receiving an input medical image of a patient; and means for automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN), wherein the DI2IN uses a conditional random field (CRF) energy function to estimate the output image based on the input medical image and uses a trained deep learning network to model unary and pairwise terms of the CRF energy function, wherein the means for automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) comprises: means for generating an image pyramid with a plurality of reduced resolution images of the input medical image, and means for automatically generating a respective output image that provides a result of the target medical image analysis task on each of the reduced resolution images of the input medical image using a sequence of trained DI2INs including a respective DI2IN trained at each of a plurality of resolution levels of the plurality of reduced resolution images, wherein the respective output image generated for each of the plurality of reduced resolution images of the input medical image defines a region of interest that is cropped from an image at a subsequent higher resolution of the input medical image to constrain the respective DI2IN trained at the subsequent higher resolution.

14. The apparatus of claim 13, wherein the target medical image analysis task is one of anatomic landmark detection, anatomic structure detection, anatomic structure segmentation, lesion detection, segmentation or characterization, image denoising, cross-domain image synthesis, cross-modality image registration, or quantitative parameter mapping.

15. The apparatus of claim 13, further comprising means for training the DI2IN comprising: means for defining a type of output image that provides the result of the target medical image analysis task; means for receiving a plurality of input training images; means for generating corresponding output training images for the plurality of input training images, resulting in a training set of paired input and output training images; and means for training the DI2IN by learning weight parameters of a deep learning network that models the unary and pairwise terms of the CRF energy function that result in a maximum likelihood for paired input and output training images over the training set of paired input and output training images.

16. The apparatus of claim 13, wherein the means for automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) comprises: means for estimating an output image that maximizes a likelihood of the CRF energy function given the input medical image and a set of learned weight parameters of the trained deep learning network, wherein the trained deep learning network calculates the unary and pairwise terms of the CRF energy function based on the input medical image, the estimated output image, and the set of learned weight parameters of the trained deep learning network.

17. The apparatus of claim 13, wherein the means for automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) further comprises: means for dividing the input medical image into a plurality of parts; means for automatically generating a respective output image that provides a result of the target medical image analysis task on each of the plurality of parts of the input medical image using a respective trained DI2IN for each of the plurality of parts; and means for aggregating the output images that provide the results of the target medical image analysis task on each of the plurality of parts to generate a final output image that provides the result of the target medical image analysis task on the input medical image.

18. A non-transitory computer readable medium storing computer program instructions for automatically performing a medical image analysis task on a medical image of a patient, the computer program instructions when executed by a processor cause the processor to perform operations comprising: receiving an input medical image of a patient; and automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN), wherein the DI2IN uses a conditional random field (CRF) energy function to estimate the output image based on the input medical image and uses a trained deep learning network to model unary and pairwise terms of the CRF energy function, wherein automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) comprises: generating an image pyramid with a plurality of reduced resolution images of the input medical image, and automatically generating a respective output image that provides a result of the target medical image analysis task on each of the reduced resolution images of the input medical image using a sequence of trained DI2INs including a respective DI2IN trained at each of a plurality of resolution levels of the plurality of reduced resolution images, wherein the respective output image generated for each of the plurality of reduced resolution images of the input medical image defines a region of interest that is cropped from an image at a subsequent higher resolution of the input medical image to constrain the respective DI2IN trained at the subsequent higher resolution.

19. The non-transitory computer readable medium of claim 18, wherein the target medical image analysis task is one of anatomic landmark detection, anatomic structure detection, anatomic structure segmentation, lesion detection, segmentation or characterization, image denoising, cross-domain image synthesis, cross-modality image registration, or quantitative parameter mapping.

20. The non-transitory computer readable medium of claim 18, wherein the operations further comprise, in a training stage prior to receiving the input medical image of the patient: defining a type of output image that provides the result of the target medical image analysis task; receiving a plurality of input training images; receiving or generating corresponding output training images for the plurality of input training images, resulting in a training set of paired input and output training images; and training the DI2IN by learning weight parameters of a deep learning network that models the unary and pairwise terms of the CRF energy function that result in a maximum likelihood for paired input and output training images over the training set of paired input and output training images.

21. The non-transitory computer readable medium of claim 18, wherein automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) comprises: estimating an output image that maximizes a likelihood of the CRF energy function given the input medical image and a set of learned weight parameters of the trained deep learning network, wherein the trained deep learning network calculates the unary and pairwise terms of the CRF energy function based on the input medical image, the estimated output image, and the set of learned weight parameters of the trained deep learning network.

22. The non-transitory computer readable medium of claim 18, wherein automatically generating an output image that provides a result of a target medical image analysis task on the input medical image using a trained deep image-to-image network (DI2IN) further comprises: dividing the input medical image into a plurality of parts; automatically generating a respective output image that provides a result of the target medical image analysis task on each of the plurality of parts of the input medical image using a respective trained DI2IN for each of the plurality of parts; and aggregating the output images that provide the results of the target medical image analysis task on each of the plurality of parts to generate a final output image that provides the result of the target medical image analysis task on the input medical image.

Patent Metadata

Filing Date

Unknown

Publication Date

August 28, 2018

Inventors

S. Kevin Zhou

Dorin Comaniciu

Bogdan Georgescu

Yefeng Zheng

David Liu

Daguang Xu
