Datavolo’s Post

#NiFi + #Kafka together = Magic! Our team recently put Redpanda Data through its paces for some scale/performance testing of Datavolo, and both the product and the process were flawless. We loved working with the team at Redpanda and highly recommend reaching out to them for all of your Kafka needs!

Joe Witt

Helping gather all the data

A prospect recently asked Datavolo if we could help with the following requirements:
◦ Rates in excess of 1 million events per second at more than 1 GB/sec.
◦ Sub-second latency from Redpanda Data in a BYOC deployment to a SaaS data platform.

Only problem… While we’ve done a ton of work with Kafka from other providers, we’d never integrated with Redpanda. Well, clearly we were missing out, because Redpanda is awesome! After a quick discussion with their partner team, by the next day we had a Datavolo BYOC Redpanda cluster established, capable of the rates we needed and peered to our own BYOC deployment.

Datavolo’s pipeline has a few key steps and scales easily:
◦ Grab the data from Redpanda over Apache Kafka’s excellent wire protocol, using the SASL/SCRAM authentication that Redpanda supports.
◦ Decompress the data, so while Redpanda is handing us 750 MB/sec of data, what we’re actually seeing is more than 1.93 GB/sec at a rate of more than 1.25 million events per second.
◦ Transform each entry from line-oriented CSV into fully typed, schema-compliant JSON documents.
◦ Cluster records together by matching field types, which allows the downstream platform to optimize query performance.
◦ Stream fully typed records into the destination service.

Redpanda was stunningly simple to set up, including the VPC and security, with both user and service accounts. Deploying a BYOC cluster and managing the control plane were very impressive and well designed. We hit the claimed spec data rates flawlessly every time. And notably, the Redpanda team is super responsive and helpful! The price to performance is extremely easy to reason about, and the user interface is very intuitive. In fact, it helped us find and make substantial performance improvements in both Apache NiFi and Datavolo’s runtimes, thanks to how well it shows consumer groups and lag information along with message data and compression information.

Below is an image of a test run we did recently to recreate the successful engagement. In this run you see us running 10, then 20, then 40 consumers in a Datavolo Runtime, and the entire time Redpanda was flawless.

If you’re looking for an excellent Apache Kafka platform, Redpanda is a must to check out. Their service is very well designed by a highly skilled team, and price/performance is exactly as advertised! Their control plane and data plane design is one to emulate, and Datavolo itself has learned a lot from them. We’re grateful for their collaboration. Thanks Chris Larsen and the Redpanda Data team!

  [Image: test run with 10, then 20, then 40 consumers in a Datavolo Runtime; no description available]
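
For readers who want a feel for the transform and clustering steps in the pipeline described above, here is a rough sketch in plain Python. It is not Datavolo’s or NiFi’s implementation. The column names, the sample lines, and the helper names (`COLUMNS`, `infer_value`, `to_record`, `type_signature`, `cluster_by_types`) are all made up for illustration, and it infers types from each value rather than enforcing a real schema, which is a simplification.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical column names; the original post does not describe the data.
COLUMNS = ["id", "temperature", "site"]


def infer_value(raw):
    """Try int, then float; fall back to the original string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def to_record(line, columns=COLUMNS):
    """Parse one CSV line into a dict with typed values."""
    values = next(csv.reader(io.StringIO(line)))
    if len(values) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(values)}")
    return {name: infer_value(raw) for name, raw in zip(columns, values)}


def type_signature(record):
    """Describe a record by its (field name, type name) pairs."""
    return tuple((name, type(value).__name__) for name, value in record.items())


def cluster_by_types(records):
    """Group records whose fields have matching types."""
    groups = defaultdict(list)
    for record in records:
        groups[type_signature(record)].append(record)
    return dict(groups)


# Made-up sample input, one CSV entry per line.
lines = ["1,20.5,north", "2,21.0,south", "3,n/a,east"]
records = [to_record(line) for line in lines]
groups = cluster_by_types(records)

for signature, members in groups.items():
    print(signature, json.dumps(members))
```

In this sample, the third line carries a non-numeric temperature, so it lands in its own group while the first two share one. A downstream platform that receives type-uniform batches like these can pick a storage layout per batch, which is the kind of optimization the clustering step is meant to enable.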
