Good evening everybody and once again welcome back to the channel. A few days back I wrote this blog on syncing all three table formats, Hudi, Delta and Iceberg, with Apache XTable running on an AWS Lambda function. In this architecture, the Apache XTable sync command runs on a Lambda function. Lambda, being serverless, can easily scale up and down, and it's also quite cheap and cost-effective, right? So the way this works is it takes the path to the config file. The config file resides on S3. The Lambda function downloads it locally into the temp directory and executes the sync command.
So I remember I said I'd write the infrastructure code over the weekend and then update you. I have done that, and I just want to show you that process. The infrastructure code is quite simple. I define a service, right, the provider is AWS, the architecture is arm64. The runtime is going to be 3.8, memory is going to be 3 gigabytes, and the timeout is set to 500. And then here is my IAM permission: I'm giving access to S3. And then here is my Lambda function. Now to deploy the stack, all you've got to do is say two words, sls deploy, and this will deploy the stack on AWS. I have actually already deployed it, so I want to show you that. If I show you the functions here, you can see xtable sync dev xtable, right? And now I can of course test it and show you.
So first you write your config YAML file, which points to the Hudi table on S3. So if I go to S3, yeah, this is where my Hudi table is, right inside data lake demo 1995, inside the folder hudi. I defined that I want to build the metadata for Iceberg. Once you've defined your config.yml file, push it to S3. Here I'm using the command aws s3 cp: this is the path to the config file locally, and this is where I want to ship it on S3. You see, I'm going to do that.
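The flow described above (take an S3 path to the config, download it into the temp directory, run the sync command) can be sketched roughly as follows. This is a minimal sketch, not the code from the video: the jar location `XTABLE_JAR`, the helper names, and the returned dictionary are assumptions made for illustration, and the `path` event key is inferred from the test payload mentioned in the talk. The `--datasetConfig` flag is the one the XTable utilities jar is documented to accept.

```python
import subprocess
from urllib.parse import urlparse

# Hypothetical location of the XTable utilities jar inside the container image.
XTABLE_JAR = "/opt/xtable-utilities-bundled.jar"


def parse_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key); raise ValueError otherwise."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def build_sync_command(local_config, jar=XTABLE_JAR):
    """Build the argument list for the XTable sync run against a local config."""
    return ["java", "-jar", jar, "--datasetConfig", local_config]


def handler(event, context):
    """Lambda entry point: download the config from S3, then run the sync."""
    import boto3  # imported here so the helpers above work without the AWS SDK

    # Assumed payload shape: {"path": "s3://<bucket>/<key-of-config.yml>"}
    bucket, key = parse_s3_uri(event["path"])

    # /tmp is the writable scratch directory in a Lambda execution environment.
    local_config = "/tmp/" + key.rsplit("/", 1)[-1]
    boto3.client("s3").download_file(bucket, key, local_config)

    result = subprocess.run(
        build_sync_command(local_config), capture_output=True, text=True
    )
    return {"returncode": result.returncode, "stderr_tail": result.stderr[-2000:]}
```

Keeping the URI parsing and command construction in small pure functions makes them easy to check locally without AWS credentials or a Java runtime.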
So the config has been uploaded, and now I can fire up the Lambda function. So now if I go to Lambda, let's see if it actually works, right? Because last time when I showed you, I ran the Lambda function locally in Docker containers, and now I'm actually going to run it on AWS Lambda. So let's refresh. All right, so I'm going to go to the test section and put the payload here. You can see path, and this is the path where the config is, right? It's on S3, so the Lambda function should download it. So if I click on test, you can see the Lambda function is executing, and the execution is successful. Now if I go to S3 and simply refresh, look at that: metadata, and here you can see the Iceberg-related metadata.
So coming back to the architecture: yes, the Lambda function works flawlessly. It builds the metadata for all three table formats. You can schedule this via EventBridge if you want, daily, hourly, however you like. Or you can hook up an API Gateway and invoke it through a REST API. You can also make it process-driven. For example, let's say your Glue job or Spark job is complete: you can just fire this API call and it will build the metadata.
So in this video, what I wanted to show you is that the solution is now deployed on AWS. I have tested it for Iceberg, so it works, and I've also updated the infrastructure code. So now you can deploy the solution with one command, which is serverless deploy. It will create a repo in ECR, push the Docker image, and create the Lambda function. Then all you've got to do to use it, or all your clients have to do, is upload the config to S3 and pass the payload to the Lambda function. It will download the config, execute the sync command, and do everything for you. So that's all I have for the video. I hope you guys have enjoyed it. If you have any other questions, let me know, and I'll also leave the blog links in the description.
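For the process-driven option mentioned above (for example, triggering the sync once a Glue or Spark job finishes), one possible approach is to invoke the function directly with boto3 instead of going through the console test tab. This is a sketch under assumptions: the function name below is a guess at what the deployed function is called, and the `path` key mirrors the test payload shown in the video.

```python
import json

# Hypothetical function name; use whatever name your deployment produced.
FUNCTION_NAME = "xtable-sync-dev-xtable"


def build_payload(config_uri):
    """Encode the event the Lambda function is assumed to expect."""
    return json.dumps({"path": config_uri}).encode("utf-8")


def invoke_sync(config_uri, function_name=FUNCTION_NAME):
    """Synchronously invoke the sync function and return its decoded response."""
    import boto3  # imported here so build_payload works without the AWS SDK

    response = boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the sync to finish
        Payload=build_payload(config_uri),
    )
    return json.loads(response["Payload"].read())
```

Switching `InvocationType` to `"Event"` would make the call fire-and-forget, which may suit a job that should not block on the metadata sync; in that case the response carries no function result to decode.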
So in case you want to try the solution out, feel free to try it out. OK, with that being said, keep smiling, keep programming, and I'll see you next time.