December 03, 2018

Getting the Most Out of Your Data in the Cloud with Cloudbreak

By Jon Dybik

I want to focus on three capabilities that are common across cloud providers and look at how they work together and build on each other to help you maximize agility and data insights in the cloud: cloud storage, running workloads on demand, and elastic resource management. We’ll also talk about how you can pull this all together with Hortonworks Cloudbreak on a path toward big data insights in the cloud.

Let’s start with cloud storage. It lays the foundation for taking full advantage of the other capabilities we’ll talk about. Simply put, cloud storage is elastic and HDFS is not. This matters because capacity planning for a shared data environment is tough to get right: you commonly end up either with costly ad-hoc provisioning of unplanned resources or with low resource utilization caused by costly up-front provisioning. Cloud storage’s pay-as-you-go model lets you manage cost effectively as your storage needs grow, while the cloud storage provider provisions resources under the hood, transparently to you.
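To make the cost argument concrete, here is a minimal sketch that compares paying for up-front capacity with paying only for what is stored each month. The function names, the flat per-terabyte price, and the usage figures are all hypothetical and chosen only for illustration; real pricing differs by provider and by storage tier.

```python
def fixed_capacity_cost(monthly_usage_tb, provisioned_tb, price_per_tb):
    """Up-front provisioning: the full provisioned capacity is paid for every month."""
    return provisioned_tb * price_per_tb * len(monthly_usage_tb)


def pay_as_you_go_cost(monthly_usage_tb, price_per_tb):
    """Pay-as-you-go: each month is billed for the capacity actually used."""
    return sum(used * price_per_tb for used in monthly_usage_tb)


def average_utilization(monthly_usage_tb, provisioned_tb):
    """Fraction of the provisioned capacity that was actually used over the period."""
    return sum(monthly_usage_tb) / (provisioned_tb * len(monthly_usage_tb))


# Hypothetical growth from 10 TB to 40 TB over four months, at 1.0 cost unit per TB-month.
usage = [10.0, 20.0, 30.0, 40.0]
print(fixed_capacity_cost(usage, provisioned_tb=40.0, price_per_tb=1.0))
print(pay_as_you_go_cost(usage, price_per_tb=1.0))
print(average_utilization(usage, provisioned_tb=40.0))
```

The sketch assumes the same unit price in both models, which isolates the effect of paying for unused capacity while storage needs are still growing.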

Another big benefit of cloud storage is that we can now separate storage from compute and, as a result, launch use-case-specific workloads on demand in a shared data environment. For example, this separation allows different Apache Spark applications, such as a data engineering ETL job and an ad-hoc data science model training job, to run on their own clusters, avoiding the concurrency issues that affect multi-user, fixed-size Apache Hadoop clusters. This separation, and the flexibility to run disparate workloads on demand, not only lowers cost but also improves the user experience.

Now that you have separated storage from compute and can run disparate workloads on demand, you can truly take advantage of the elasticity that the cloud provides at a level of granularity that makes business and technical sense. For example, if a cluster is experiencing YARN memory saturation or needs higher data read throughput, you can simply scale up the existing cluster or launch a new, larger cluster for a shorter period of time to handle the increased workload and meet business demand.
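As a rough illustration of sizing a cluster against YARN memory demand, the sketch below estimates how many worker nodes would be needed and how many to add. The helper names, the 80% target utilization, and the example figures are assumptions made for this sketch; they are not Cloudbreak or YARN settings.

```python
import math


def nodes_needed(required_memory_gb, memory_per_node_gb, target_utilization=0.8):
    """Nodes required so that demand fits within the target share of each node's memory."""
    usable_per_node_gb = memory_per_node_gb * target_utilization
    return math.ceil(required_memory_gb / usable_per_node_gb)


def nodes_to_add(current_nodes, required_memory_gb, memory_per_node_gb,
                 target_utilization=0.8):
    """Additional nodes needed beyond the current cluster size (never negative)."""
    needed = nodes_needed(required_memory_gb, memory_per_node_gb, target_utilization)
    return max(0, needed - current_nodes)


# Hypothetical workload: 1000 GB of container memory demand on 64 GB worker nodes.
print(nodes_needed(1000, 64))
print(nodes_to_add(10, 1000, 64))
```

The same arithmetic applies whether you grow the existing cluster or size a new, short-lived one for the peak.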

How do we tie all this together and operationalize our big data environment in the cloud? That’s where Cloudbreak comes in. Cloudbreak simplifies the deployment of big data workloads with cloud storage on cloud providers such as Amazon Web Services, Microsoft Azure, Google Cloud Platform, and OpenStack. It is easy to get started: you can use the wizard interface to deploy your first Hadoop cluster in six easy steps using one of our prebuilt Apache Ambari blueprints for data science, EDW, or ETL workloads. When you are ready to take things to the next level, Cloudbreak offers a range of enterprise features, including:

  • A CLI and API to automate cluster provisioning or integrate with other orchestration and cloud management tools.
  • Security-first architecture for deploying Kerberized clusters, integrating your cluster with LDAP/Active Directory for authentication, and protecting your cluster with a secured gateway powered by Apache Knox.
  • Support for your unique configuration needs through custom Ambari blueprints, custom OS images, and injecting your own scripts into the cluster build process.
  • Auto-scaling of workloads based on time of day and Ambari metrics to optimize for peak workload demand and cost.
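The auto-scaling idea in the last bullet, combining a time-of-day schedule with a metric threshold, can be sketched as follows. This is a generic illustration, not Cloudbreak’s policy engine or API: `TimeRule`, `MetricRule`, `desired_nodes`, and every number below are hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class TimeRule:
    start: time      # rule applies from this time of day (inclusive)
    end: time        # until this time of day (exclusive)
    node_count: int  # baseline cluster size while the rule applies


@dataclass
class MetricRule:
    threshold: float  # scale out when the metric reaches this value
    step: int         # number of nodes to add when it does


def desired_nodes(now, metric_value, current_nodes, time_rules, metric_rule,
                  min_nodes, max_nodes):
    """Return the target node count from the schedule and the metric, clamped to limits."""
    # Baseline: the first schedule rule covering the current time, else the minimum size.
    baseline = min_nodes
    for rule in time_rules:
        if rule.start <= now < rule.end:
            baseline = rule.node_count
            break

    target = baseline
    # Metric-driven scale-out on top of the schedule.
    if metric_value >= metric_rule.threshold:
        target = max(baseline, current_nodes + metric_rule.step)

    return max(min_nodes, min(max_nodes, target))


# Hypothetical policy: 10 nodes during business hours, add 2 nodes at 90% memory use.
rules = [TimeRule(time(8), time(18), 10)]
metric = MetricRule(threshold=0.9, step=2)
print(desired_nodes(time(9), 0.5, 4, rules, metric, min_nodes=3, max_nodes=12))
print(desired_nodes(time(22), 0.95, 4, rules, metric, min_nodes=3, max_nodes=12))
```

Keeping the schedule as the baseline and letting the metric only add capacity is one possible design; clamping to a maximum keeps an unexpected spike from driving cost without bound.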

To learn more about Cloudbreak and see a live demonstration, please join us for an upcoming webinar on December 5th. Details can be found here.

Comments

Han says:

Thank you for the article! Super useful, will have a look into it.
Have you ever used MyAirBridge? It’s an amazing service and it’s a pity it is not mentioned often, I would love to read what you think.
