White paper: Improving Apache Hadoop Security and Data Governance
As organizations pursue Hadoop initiatives to capture new opportunities for data-driven insights, data governance and data security requirements can pose a key challenge. Hortonworks created an Apache Hadoop Data Governance Initiative to address the need for an open source governance solution that covers data classification, data lineage, security, and data lifecycle management.
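Managing and controlling data effectively cannot be passive or merely rhetorical. Centralized access control built on consistent data classification is the foundation of dynamic security and a core requirement of Open Enterprise Hadoop. To meet this goal, Hortonworks is announcing the release of a new public preview feature that uses Apache Atlas and Apache Ranger to bring data classification and security policy enforcement together.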
Apache Atlas, created as part of the Hadoop data governance initiative, empowers organizations to apply consistent data classification across the data ecosystem. Apache Ranger provides centralized security administration for Hadoop. By integrating Atlas with Ranger, Hortonworks empowers enterprises to institute dynamic access policies at run time that proactively prevent violations from occurring.
The Atlas/Ranger integration represents a paradigm shift for big data governance and data security in Apache Hadoop. By integrating Atlas with Ranger, enterprises can now implement dynamic classification-based security policies in addition to role-based security. Ranger’s centralized platform empowers data administrators to define security policy based on Atlas metadata tags or attributes and to apply this policy in real time to the entire hierarchy of data assets, including databases, tables, and columns.
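The idea of classification-based policy can be illustrated with a small, self-contained sketch. It does not use the Atlas or Ranger APIs; the names `Asset`, `TagPolicy`, and `is_allowed`, the `PII` tag, and the rule that a child asset inherits its parents' tags are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical data asset (database, table, or column) in a hierarchy."""
    name: str
    parent: "Asset | None" = None
    tags: set = field(default_factory=set)

    def effective_tags(self):
        """Tags set on this asset plus tags inherited from its ancestors.

        Inheritance down the hierarchy is an assumption of this sketch.
        """
        inherited = self.parent.effective_tags() if self.parent else set()
        return self.tags | inherited


@dataclass(frozen=True)
class TagPolicy:
    """Hypothetical policy: only the listed groups may read assets with `tag`."""
    tag: str
    allowed_groups: frozenset


def is_allowed(user_groups, asset, policies):
    """Deny access if any tag on the asset has a policy the user does not satisfy.

    Tags are looked up at call time, so reclassifying an asset changes the
    decision without editing any policy.
    """
    tags = asset.effective_tags()
    for policy in policies:
        if policy.tag in tags and not (set(user_groups) & policy.allowed_groups):
            return False
    return True


# Example hierarchy: database -> table -> columns.
sales_db = Asset("sales")
customers = Asset("customers", parent=sales_db)
ssn = Asset("ssn", parent=customers, tags={"PII"})
city = Asset("city", parent=customers)

policies = [TagPolicy("PII", frozenset({"compliance"}))]
```

In this sketch, tagging the `customers` table as `PII` later on would immediately restrict every column beneath it, which is the kind of dynamic behavior that separates tag-based policy from policy written against individual resources.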
Hortonworks empowers data managers to ensure the transparency, reproducibility, auditability, and consistency of the Data Lake and the assets it contains. Apache Atlas now provides the ability to visualize cross-component lineage, delivering a complete view of data movement across a number of analytic engines such as Apache Storm, Kafka, Falcon, and Hive. Hadoop stewards, operations, and compliance personnel can now visualize a data set’s lineage and then drill down into operational, security, and provenance-related details. Because this tracking is done at the platform level, any application that uses multiple engines is tracked natively, which extends visibility beyond the view of a single application.
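Cross-component lineage can be thought of as a directed graph whose edges record which engine moved data from one data set to another. The following is a minimal sketch under that assumption; `LineageGraph`, its methods, and the data set names are hypothetical and are not part of the Atlas API.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Hypothetical in-memory lineage store keyed by target data set."""

    def __init__(self):
        # target data set -> set of (source data set, engine) pairs
        self._inputs = defaultdict(set)

    def record(self, source, target, engine):
        """Record that `engine` produced `target` from `source`."""
        self._inputs[target].add((source, engine))

    def upstream(self, dataset):
        """Return every data set that feeds `dataset`, directly or indirectly."""
        seen = set()
        queue = deque([dataset])
        while queue:
            current = queue.popleft()
            for source, _engine in self._inputs.get(current, ()):
                if source not in seen:
                    seen.add(source)
                    queue.append(source)
        return seen


# Example flow spanning several engines (names are illustrative only).
graph = LineageGraph()
graph.record("kafka:clicks", "storm:sessionized", engine="Storm")
graph.record("storm:sessionized", "hive:sessions", engine="Hive")
graph.record("hive:users", "hive:sessions_enriched", engine="Hive")
graph.record("hive:sessions", "hive:sessions_enriched", engine="Hive")
```

Calling `graph.upstream("hive:sessions_enriched")` walks back through the Hive, Storm, and Kafka steps, which mirrors the point that lineage recorded at the platform level is not limited to what one application can see.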