Onehouse’s Post

Want to dive deep into Apache Hudi 1.0, the origins of Hudi, and the future of #datalakehouses and #datacatalogs? We have a webinar for you! Join this fireside chat next week with Ananth P. of Data Engineering Weekly and Vinoth Chandar, founder and CEO of Onehouse and Apache Hudi PMC Chair.

Bridging the Gap: A Database Experience on the Data Lake


Megh Vidani

Specialist - Data Engineering at NPCI

12h

A simple issue: say I have a streaming pipeline running in Spark that writes to a Hudi table. If I want to insert data into this table by selecting some columns from another table, or backfill it from a historical table, I have to create another Spark/Flink pipeline instead of firing a simple Trino SQL query: insert into hudi_table select x,y,z from historical_table. Another point: table maintenance operations like compaction, clustering, and cleaning would be very easy to schedule as SQL queries automated via Trino plus a workflow tool like Airflow or Dagster.

Kyle Weller

Head of Product @ Onehouse.ai | ex Azure Databricks

13h

Time for the party to start

Sai Sri Harsha G.

Technical Staff at Secureworks

12h

Can you talk a bit more about the metadata table and how it is going to evolve? Most of the new indexing features are tied to it, and I would like to learn more about this.

Soumil S.

Sr. Software Engineer | Big Data & AWS Expert | Apache Hudi Specialist | Spark & AWS Glue| Data Lake Specialist | YouTuber

13h

True databases would have a lot of running parameters: "much more than just a format".


This seems to be Databricks' strategy: build another layer and accept diverse file formats.

Amit Kumar

Product Development Engineer III @ Phenom People | Scalable System Design

12h

A DebeziumSource implementation is available for MySQL and Postgres. What's the plan for other databases like MongoDB?

Sai Sri Harsha G.

Technical Staff at Secureworks

12h

Maybe the community would benefit from a deep dive into the metadata table (MT).

Aman Yadav

Data Engineer II @ Rakuten | NITK '21

13h

Hi

Ezechiel YINDOULA

Consultant BI | Talend | Oracle | PostgreSQL | Power BI & CGP - Consultant Patrimonial

13h

Hi

Harsh Raj Srivastav

Engineering @ClickPe (YC W23) | Ex-Backend Intern @Liveasy | Skilled in Java, Python & AWS | Specializing in Backend & Cloud Solutions.

13h

Attending
