Hortonworks DataFlow (HDF) 3.2 was made generally available to our customers on August 13, 2018.
HDF 3.2 includes the following major innovations and enhancements:
HDF and Hortonworks Data Platform (HDP) in the same ecosystem
HDF and HDP can now be run in the same ecosystem. This means that HDF and HDP can be managed using the same version of Ambari. All security policies can be shared from a single Ranger instance, components can share a common security gateway through a single Knox instance, and a single Apache Atlas instance can provide metadata and governance services for both HDP and HDF components. Managing both platforms from the same instance reduces the opportunity for errors and removes the operational burden of keeping two instances of Ambari in sync.
Improved platform stability on large clusters and performance boost on complex flows
HDF 3.2 now supports Apache NiFi 1.7. Core improvements to Apache NiFi help large clusters run more stably. This reduces the operational impact, since operations teams no longer have to manually re-attach nodes that have entered an errored state. At the same time, the enhancements allow NiFi flows to run as many as 100K processors without slowing down the core engine. This gives organizations plenty of headroom for developing more complex flows and for supporting more multi-tenant users in the same environment.
Kerberos keytab isolation
Kerberos keytabs can now be isolated at a per-principal level. This allows users in a multi-tenant environment to safely reference specific keytabs and principals. Having access to an HDFS keytab no longer means that a user has access to all of the HDFS principals. This provides more granular control, so that users are limited to only the principals they require.
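To make the idea of per-principal isolation concrete, here is a minimal Python sketch. It is not how NiFi or HDF implements the feature; the class names (KeytabCredential, CredentialRegistry), the user names, the principals and the keytab paths are all hypothetical and exist only for illustration. The sketch assumes each principal is bound to exactly one keytab and that access is granted per credential rather than per keytab directory.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class KeytabCredential:
    """One principal bound to one keytab file (hypothetical model)."""
    principal: str
    keytab_path: str


@dataclass
class CredentialRegistry:
    """Toy registry in which every credential carries its own set of permitted users."""
    _grants: dict = field(default_factory=dict)  # credential -> set of user names

    def grant(self, user: str, credential: KeytabCredential) -> None:
        """Allow `user` to use this one credential and nothing else."""
        self._grants.setdefault(credential, set()).add(user)

    def resolve(self, user: str, principal: str) -> KeytabCredential:
        """Return the credential for `principal` only if `user` was granted it."""
        for credential, users in self._grants.items():
            if credential.principal == principal and user in users:
                return credential
        raise PermissionError(f"{user} may not use principal {principal}")


if __name__ == "__main__":
    registry = CredentialRegistry()
    # Illustrative principals and paths only.
    etl = KeytabCredential("etl@EXAMPLE.COM", "/etc/security/keytabs/etl.keytab")
    ops = KeytabCredential("ops@EXAMPLE.COM", "/etc/security/keytabs/ops.keytab")
    registry.grant("alice", etl)
    registry.grant("bob", ops)

    print(registry.resolve("alice", "etl@EXAMPLE.COM").keytab_path)
    try:
        registry.resolve("alice", "ops@EXAMPLE.COM")
    except PermissionError as err:
        print(f"denied: {err}")
```

In this model, granting alice the etl principal says nothing about the ops principal, which mirrors the granular control described above.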
Kafka 1.1.1 support
In HDF 3.2, Kafka has been upgraded from 1.0.0 to 1.1.1. Key features and improvements have been added with respect to security and governance. In addition to bug fixes, an important new feature captures producer and topic metrics at the partition level without instrumenting or configuring interceptors on the clients. This provides a non-invasive way to capture important producer metrics without refactoring or modifying your existing Kafka clients.
Hive 3 support
Apache NiFi now supports Hive 3 running on HDP 3.0. This support brings better performance for Hive streaming to HDP and to S3, along with the ability to write directly to ORC from NiFi without first converting your datasets to Avro. Writing directly to ORC, which gives better Hive query performance, is accomplished with the NiFi PutORC processor. With HDF 3.2, a few other processors related to HBase and HDFS have also been updated and enhanced.
Streaming Analytics Manager (SAM) enhancements
SAM has been updated to be compatible with HDP 3.0 services. Some processors, such as Rule, Projection and Aggregate, support writing complex conditions. SAM has also been enhanced to support test mode without a cluster.
Addition of Knox Service
Apache Knox has been added to HDF as a first-class service providing gateway proxy and SSO services for various HDF services including NiFi.
In HDF 3.2, many expression language updates have been added to MiNiFi. Another cool feature is support for TensorFlow in MiNiFi. This version of MiNiFi also introduces a C-based SDK.
For more information on HDF, visit hortonworks.com/hdf