Want to dive deep into Apache Hudi 1.0, the origins of Hudi, and the future of #datalakehouses and #datacatalogs? We have a webinar for you! Join this fireside chat next week with Ananth P. of Data Engineering Weekly and Vinoth Chandar, founder and CEO of Onehouse and Apache Hudi PMC Chair.
Bridging the Gap: A Database Experience on the Data Lake
Time for the party to start
Can you talk a bit more about the metadata table and how it's going to evolve? Most of the new indexing features are tied to it, so I'd like to learn more about this.
True databases have a lot of running parts: "much more than just a format."
This seems to be Databricks' strategy: build another layer and accept diverse file formats.
A DebeziumSource implementation is available for MySQL and Postgres. What's the plan for other databases like MongoDB?
Maybe the community would benefit from a deep dive into the metadata table (MT).
Hi
Hi
Attending
A simple issue: let's say I have a streaming pipeline running in Spark that is writing to a Hudi table, and now I want to insert data into this table by selecting some columns from another table, or back-populate it from a historical table. I have to create another Spark/Flink pipeline for this rather than firing a simple Trino SQL query: insert into hudi_table select x, y, z from historical_table. Another point: table maintenance operations like compaction, clustering, and cleaning would be very easy to schedule as SQL queries automated via Trino plus a workflow tool like Airflow/Dagster.
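For illustration, a minimal sketch of the SQL experience this comment is asking for; the table and column names (hudi_table, historical_table, x, y, z) are hypothetical, and Trino's Hudi connector does not support this today, so treat it as the desired workflow rather than a working example:

    -- Back-populate the Hudi table from a historical table in one statement,
    -- instead of building a separate Spark/Flink pipeline (names are hypothetical).
    INSERT INTO hudi_table
    SELECT x, y, z
    FROM historical_table;

    -- Maintenance (compaction, clustering, cleaning) would similarly be fired as
    -- SQL statements or procedure calls from Airflow/Dagster (exact syntax undefined today).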