Syncsort's New Data Integration Solutions Provide a Smarter Approach to Hadoop ETL

Two New Hadoop Offerings and DMX Innovations Bring Benefits of Better ETL through Hadoop and Better Hadoop with Enhanced ETL

WOODCLIFF LAKE, N.J., May 20, 2013 /PRNewswire/ -- Syncsort, a global leader in Big Data integration solutions, today announced the availability of its Spring '13 release, including two brand new Hadoop products and breakthrough enhancements to DMX that turn Hadoop into a more robust, feature-rich and easy-to-use ETL solution.

(Logo: http://photos.prnewswire.com/prnh/20130520/NY16823LOGO )

Big Data is prompting organizations to look at Hadoop to process more data in less time and for less money, but Hadoop is not yet a complete ETL solution. Syncsort's two new offerings for Hadoop – DMX-h ETL Edition and DMX-h Sort Edition – are designed to strengthen Hadoop by providing the full functionality required to deliver enterprise ETL capabilities. They provide greater ease of use and maximize node performance compared to non-native, code-generating ETL tools. In addition, performance and connectivity enhancements to DMX expand usage by end users and partners.

"Analyzing Big Data is critical to our customers' ability to sustain competitiveness, but the avalanche of information is breaking traditional data integration architectures ─ many of the tools are too code and resource intensive and ultimately drive costs too high," said Josh Rogers, senior vice president, data integration business, Syncsort. "With our new DMX editions, we are strengthening Hadoop by providing seamless and powerful ETL and sort capabilities and at the same time, reinvigorating the value proposition of ETL by leveraging the power of Hadoop to scale core processing of Big Data."

The new DMX-h solutions take advantage of Syncsort's recent contribution to Apache Hadoop, which provides a unique level of native integration to deliver best-in-class data integration capabilities and sort acceleration for Apache Hadoop distributions.

Highlights of DMX-h ETL Edition include:

  • Smarter Architecture. DMX-h has the only ETL engine that runs natively within MapReduce, maximizing node performance.
  • Smarter Development. Hadoop ETL without coding. Developers can leverage an easy-to-use Windows GUI and deploy seamlessly into Hadoop.
  • Smarter Productivity. "Use case accelerators" – a library of pre-built templates – help developers fast-track Hadoop ETL implementations.
  • Smarter Connectivity. Extends access to and delivery of all data, including from the mainframe.
  • Smarter Economics. Smarter architecture, development, connectivity and productivity combine to help drive results in less time and at a fraction of the cost of other solutions.

Benchmark Results

Recent Syncsort benchmarks show significant Hadoop performance and resource efficiency improvements when using DMX-h. More importantly, the results show predictable and sustainable throughput even as data volumes grow. Using the TeraSort benchmark, DMX-h Sort Edition achieved a sustainable throughput of over 100 megabytes per second per node (MB/S/N), delivering upwards of 2x higher throughput per node than Hadoop's native sort at 45 MB/S/N. Similarly, DMX-h ETL Edition achieved sustainable throughput in excess of 255 MB/S/N, for up to 2.5x faster performance than Pig when aggregating 2TB of Web log data. In both cases, tests were run for data volumes ranging from 500GB to 2TB. While alternatives such as Hadoop's native sort and Pig reach a saturation point, where throughput starts to decline, at around 500GB of data, DMX-h delivered sustainable and predictable performance from 500GB to 2TB. The implications are huge for organizations, as they can more efficiently size their Hadoop infrastructure, minimize uncertainty and achieve a more predictable cost structure as Big Data becomes even bigger.
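
To make the per-node figures above concrete, the following is a minimal back-of-the-envelope sketch in Python, not part of Syncsort's benchmark. It takes the quoted throughputs at face value (rounding "over 100" down to 100 MB/S/N) and assumes throughput scales linearly with node count and that 1TB equals 1,000,000MB. The 10-node cluster size and the function names are hypothetical, chosen only for illustration.

```python
# Throughput figures quoted in the release, in MB per second per node.
DMX_H_SORT_MBPS = 100.0   # "over 100 MB/S/N", rounded down
NATIVE_SORT_MBPS = 45.0   # Hadoop's native sort


def speedup(new_mbps: float, baseline_mbps: float) -> float:
    """Ratio of per-node throughput between two sort implementations."""
    return new_mbps / baseline_mbps


def hours_to_process(data_tb: float, mbps_per_node: float, nodes: int) -> float:
    """Estimated wall-clock hours to process data_tb terabytes.

    Assumptions (not from the release): perfectly linear scaling across
    nodes and decimal units (1 TB = 1,000,000 MB).
    """
    total_mb = data_tb * 1_000_000
    seconds = total_mb / (mbps_per_node * nodes)
    return seconds / 3600


if __name__ == "__main__":
    nodes = 10  # hypothetical cluster size
    print(f"Sort speedup: {speedup(DMX_H_SORT_MBPS, NATIVE_SORT_MBPS):.2f}x")
    print(f"2TB at 100 MB/S/N: {hours_to_process(2, DMX_H_SORT_MBPS, nodes):.2f} h")
    print(f"2TB at 45 MB/S/N:  {hours_to_process(2, NATIVE_SORT_MBPS, nodes):.2f} h")
```

Under these assumptions the ratio of 100 to 45 MB/S/N is about 2.2x, consistent with the "upwards of 2x" claim. The estimate for native sort is optimistic, since the release states that its throughput declines beyond roughly 500GB.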

Supporting Quotes

"Hadoop is lowering the cost structure of processing data at scale, but deploying Hadoop at the enterprise level is not free, and significant hardware and IT productivity costs can damage ROI," said Evan Quinn, Senior Principal Analyst, Enterprise Strategy Group. "Syncsort's Spring '13 release provides unique capabilities in Hadoop to help maximize savings, delivering best-in-class ETL technology at a price point that is highly disruptive for the data integration market, and more consistent with the cost structure of open source solutions."

"In tag management, we facilitate a huge number of interactions between marketers and their vendors, and as a result, we are able to see the complex journey a consumer takes prior to making a purchase. This involves a huge amount of data processing.  To be competitive, we must convert the high volume of 'path-to-purchase' data captured by our platform into actionable intelligence that drives decisions by both marketers and their vendors," said Ave Wrigely, CTO of TagMan.  "What's compelling about Syncsort's latest DMX product deliveries is the unique approach to replacing older code-driven approaches with a streamlined, GUI-driven way to collect, cleanse and distribute information inside and outside of Hadoop, saving time and resources and giving us maximum flexibility in preparing Big Data for business analytics and data visualization."

"Cloudera sees ETL as one of the top use cases for Hadoop ─ it is essential to our mission of maximizing the value of big data," said Amr Awadallah, Chief Technology Officer, Cloudera. "We see Syncsort's new DMX-h offerings enabling our mutual customers with critical data integration and ETL capabilities which simplify ETL deployments while efficiently processing data natively on Hadoop.  The CDH 4.2 release includes Syncsort's contribution to Apache Hadoop making the sort phase pluggable, enabling DMX-h, and broadening use cases on Hadoop."

Fast Start DMX-h ETL Test Drive

Anyone looking to leverage DMX-h ETL can now download a free test drive that contains everything they require without the need to set up their own Hadoop cluster. It includes a Linux Virtual Machine with Cloudera CDH 4.2 and DMX-h ETL Edition pre-installed, along with use case accelerators and sample data.

About Syncsort's Data Integration Business

Syncsort provides data-intensive organizations across the big data continuum with a smarter way to collect and process the ever-expanding data avalanche.  With thousands of deployments across all major platforms, including mainframe, Syncsort helps customers around the world to overcome the architectural limits of today's ETL and Hadoop environments, empowering their organizations to drive better business outcomes in less time, with fewer resources and lower TCO.  For more information visit www.syncsort.com.

Media Contacts:

Michael Kornspan
Syncsort Incorporated
Director, Corporate Communications
Tel: 201-930-8216
[email protected]

Joanne Hogue
Smart Connections PR  
Tel: 410-658-8246
[email protected]

SOURCE Syncsort

Copyright © 2007 PR Newswire. All rights reserved. Republication or redistribution of PRNewswire content is expressly prohibited without the prior written consent of PRNewswire. PRNewswire shall not be liable for any errors or delays in the content, or for any actions taken in reliance thereon.
