Batch Data Ingestion in Database Systems

Published: January 19, 2021

Assignee: not available in the USPTO data

Inventors: Benoit Dageville, Varun Ganesh, Jiansheng Huang, Jiaxing Liang, Haowei Yu, Scott Ziegler

Patent Claims

21 claims

Legal claims defining the scope of protection, as filed with the USPTO.

1. A method comprising: obtaining, at a database system, an ingest request to ingest one or more files into a table of a database; after obtaining the ingest request and prior to the ingesting of the one or more files, persisting the one or more files in a first file queue that corresponds to the table, the first file queue further corresponding to a client account, and the database system further comprising a second file queue that corresponds to both a second client account and a second table; assigning the one or more files to one or more execution nodes to be ingested into the table; ingesting, by the one or more execution nodes, the one or more files into one or more micro-partitions of the table, each of the one or more micro-partitions comprising contiguous units of storage of a storage device; and registering metadata after the one or more files are ingested into the one or more micro-partitions of the table, the metadata identifying the one or more files and the one or more micro-partitions.
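The steps of claim 1 can be pictured with a small in-memory model. This is only an illustrative sketch, not the patented implementation: the class name `IngestSystem`, the round-robin assignment policy, and the partition naming scheme are all assumptions introduced here, and the actual write to storage is omitted.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class IngestSystem:
    """Toy model of the claimed flow; every name here is hypothetical."""
    nodes: list
    # One file queue per (client account, table) pair.
    queues: dict = field(default_factory=lambda: defaultdict(deque))
    # Registered metadata: micro-partition id -> source files it holds.
    metadata: dict = field(default_factory=dict)
    next_partition: int = 0

    def obtain_request(self, account, table, files):
        # Persist the files in the queue for this account and table
        # before any ingestion happens.
        self.queues[(account, table)].extend(files)

    def assign(self, account, table):
        # Drain the queue and spread the files across execution nodes.
        # Round-robin is an assumed policy; the claim does not name one.
        queue = self.queues[(account, table)]
        assignment = {node: [] for node in self.nodes}
        i = 0
        while queue:
            node = self.nodes[i % len(self.nodes)]
            assignment[node].append(queue.popleft())
            i += 1
        return assignment

    def ingest(self, table, assignment):
        # Each node places its files in a new micro-partition; metadata is
        # registered only after that (omitted) write would have completed.
        for node, files in assignment.items():
            if not files:
                continue
            partition_id = f"{table}/mp-{self.next_partition}"
            self.next_partition += 1
            self.metadata[partition_id] = list(files)


system = IngestSystem(nodes=["node-1", "node-2"])
system.obtain_request("account_a", "orders", ["f1", "f2", "f3"])
system.obtain_request("account_b", "events", ["g1"])  # a second, separate queue
plan = system.assign("account_a", "orders")
system.ingest("orders", plan)
```

Keying the queues by account and table mirrors the claim's requirement that a second queue exists for a second client account and a second table; draining one queue leaves the other untouched.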

2. The method of claim 1, wherein the ingest request comprises a notification that includes a list of the one or more files.

3. The method of claim 2, wherein obtaining the ingest request comprises receiving the notification on behalf of a client account that is associated with the one or more files.

4. The method of claim 1, wherein obtaining the ingest request comprises polling a data lake for added files, the data lake being associated with a client account that is associated with the one or more files, the data lake comprising data storage containing a plurality of files, the plurality of files comprising the one or more files.
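Polling a data lake for added files, as in claim 4, amounts to comparing the current listing against what has already been seen. The sketch below assumes a hypothetical `list_files` callable that returns the lake's current file names; the claim does not specify how the listing is obtained.

```python
def poll_for_added_files(list_files, seen):
    """Return files now in the data lake that were not seen on earlier polls.

    list_files: hypothetical callable returning the current file names.
    seen: set of already-reported names, updated in place.
    """
    current = set(list_files())
    added = sorted(current - seen)
    seen.update(added)
    return added


lake = ["a.csv", "b.csv"]
seen = set()
first = poll_for_added_files(lambda: lake, seen)
lake.append("c.csv")
second = poll_for_added_files(lambda: lake, seen)
third = poll_for_added_files(lambda: lake, seen)
```

The files returned by each poll would then form the content of an ingest request.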

5. The method of claim 1, wherein ingesting, by the one or more execution nodes, the one or more files into one or more micro-partitions of the table comprises: operating an ingest poller to poll the first file queue; and ingesting the one or more files into one or more micro-partitions of the table via one or more pipes.
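One way to read claim 5 is as a poller that takes batches off the file queue and hands them to a pipe. In the sketch below the pipe is modeled as a plain callable and `batch_size` is an invented parameter; both are assumptions for illustration, since the claim does not define either.

```python
from collections import deque


def poll_queue(file_queue, pipe, batch_size=2):
    """Take up to batch_size files off the queue and pass them to pipe.

    pipe is a hypothetical callable standing in for the load step.
    Returns the number of files handed over on this poll.
    """
    batch = []
    while file_queue and len(batch) < batch_size:
        batch.append(file_queue.popleft())
    if batch:
        pipe(batch)
    return len(batch)


loaded = []
file_queue = deque(["f1", "f2", "f3"])
counts = []
# Keep polling until a poll finds the queue empty.
while True:
    n = poll_queue(file_queue, loaded.append)
    counts.append(n)
    if n == 0:
        break
```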

6. The method of claim 1, wherein assigning the one or more files to the one or more execution nodes to be ingested into the table comprises: generating an ingest task for each of the one or more execution nodes, each generated ingest task identifying the table and one or more of the one or more files; and assigning each generated ingest task to an execution node in the one or more execution nodes.

7. The method of claim 6, wherein assigning each generated ingest task to an execution node in the one or more execution nodes comprises assigning each generated ingest task to a different core of an execution node in the one or more execution nodes.
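Claims 6 and 7 describe generating ingest tasks and placing each one on a different core. A minimal sketch follows, assuming a hypothetical `IngestTask` record, a strided split of the file list, and a node description given as a mapping from node name to core count; none of these details come from the claims themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestTask:
    """Hypothetical task record: the target table plus the files to load."""
    table: str
    files: tuple


def generate_tasks(table, files, n_tasks):
    # Split the files into at most n_tasks non-empty groups (strided split).
    groups = [files[i::n_tasks] for i in range(n_tasks)]
    return [IngestTask(table, tuple(g)) for g in groups if g]


def assign_to_cores(tasks, nodes):
    """Place each task on a distinct (node, core) slot.

    nodes: mapping of node name -> number of cores (an assumed description).
    """
    slots = [(node, core) for node, cores in nodes.items() for core in range(cores)]
    if len(tasks) > len(slots):
        raise ValueError("more tasks than available cores")
    return dict(zip(slots, tasks))


tasks = generate_tasks("orders", ["f1", "f2", "f3", "f4", "f5"], 3)
placement = assign_to_cores(tasks, {"node-1": 2, "node-2": 2})
```

Because each slot is a unique (node, core) pair, no two tasks share a core, which is the property claim 7 adds on top of claim 6.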

8. A database system comprising: at least one processor; and one or more non-transitory computer readable storage media containing instructions executable by the at least one processor for causing the at least one processor to perform operations comprising: obtaining, at the database system, an ingest request to ingest one or more files into a table of a database; after obtaining the ingest request and prior to the ingesting of the one or more files, persisting the one or more files in a first file queue that corresponds to the table, the first file queue further corresponds to a client account, and the database system further comprising a second file queue that corresponds to both a second client account and a second table; assigning the one or more files to one or more execution nodes to be ingested into the table; ingesting, by the one or more execution nodes, the one or more files into one or more micro-partitions of the table, each of the one or more micro-partitions comprising contiguous units of storage of a storage device; and registering metadata after the one or more files are ingested into the one or more micro-partitions of the table, the metadata identifying the one or more files and the one or more micro-partitions.

9. The database system of claim 8, wherein the ingest request comprises a notification that includes a list of the one or more files.

10. The database system of claim 9, wherein obtaining the ingest request comprises receiving the notification on behalf of a client account that is associated with the one or more files.

11. The database system of claim 8, wherein obtaining the ingest request comprises polling a data lake for added files, the data lake being associated with a client account that is associated with the one or more files, the data lake comprising data storage containing a plurality of files, the plurality of files comprising the one or more files.

12. The database system of claim 8, wherein ingesting, by the one or more execution nodes, the one or more files into one or more micro-partitions of the table comprises: operating an ingest poller to poll the first file queue; and ingesting the one or more files into one or more micro-partitions of the table via one or more pipes.

13. The database system of claim 8, wherein assigning the one or more files to the one or more execution nodes to be ingested into the table comprises: generating an ingest task for each of the one or more execution nodes, each generated ingest task identifying the table and one or more of the one or more files; and assigning each generated ingest task to an execution node in the one or more execution nodes.

14. The database system of claim 13, wherein assigning each generated ingest task to an execution node in the one or more execution nodes comprises assigning each generated ingest task to a different core of an execution node in the one or more execution nodes.

15. One or more non-transitory computer readable storage media containing instructions executable by at least one processor for causing the at least one processor to perform operations comprising: obtaining, at a database system, an ingest request to ingest one or more files into a table of a database; after obtaining the ingest request and prior to the ingesting of the one or more files, persisting the one or more files in a first file queue that corresponds to the table, the first file queue further corresponds to a client account, and the database system further comprising a second file queue that corresponds to both a second client account and a second table; assigning the one or more files to one or more execution nodes to be ingested into the table; ingesting, by the one or more execution nodes, the one or more files into one or more micro-partitions of the table, each of the one or more micro-partitions comprising contiguous units of storage of a storage device; and registering metadata after the one or more files are ingested into the one or more micro-partitions of the table, the metadata identifying the one or more files and the one or more micro-partitions.

16. The non-transitory computer readable storage media of claim 15, wherein the ingest request comprises a notification that includes a list of the one or more files.

17. The non-transitory computer readable storage media of claim 16, wherein obtaining the ingest request comprises receiving the notification on behalf of a client account that is associated with the one or more files.

18. The non-transitory computer readable storage media of claim 15, wherein obtaining the ingest request comprises polling a data lake for added files, the data lake being associated with a client account that is associated with the one or more files, the data lake comprising data storage containing a plurality of files, the plurality of files comprising the one or more files.

19. The non-transitory computer readable storage media of claim 15, wherein ingesting, by the one or more execution nodes, the one or more files into one or more micro-partitions of the table comprises: operating an ingest poller to poll the first file queue; and ingesting the one or more files into one or more micro-partitions of the table via one or more pipes.

20. The non-transitory computer readable storage media of claim 15, wherein assigning the one or more files to the one or more execution nodes to be ingested into the table comprises: generating an ingest task for each of the one or more execution nodes, each generated ingest task identifying the table and one or more of the one or more files; and assigning each generated ingest task to an execution node in the one or more execution nodes.

21. The non-transitory computer readable storage media of claim 20, wherein assigning each generated ingest task to an execution node in the one or more execution nodes comprises assigning each generated ingest task to a different core of an execution node in the one or more execution nodes.

Patent Metadata

Filing Date

Unknown

Publication Date

January 19, 2021

Inventors

Benoit Dageville

Varun Ganesh

Jiansheng Huang

Jiaxing Liang

Haowei Yu

Scott Ziegler
