Amazon Elastic MapReduce (EMR) makes it easy and cost-effective to process vast amounts of data. It simplifies running distributed data processing frameworks such as Hadoop and Spark, letting you spread the work across scalable EC2 instances.
AWS takes this service one step further with the release of version 4.1.0, which introduces several new features:
- Spark 1.5.0 – upgraded from Spark 1.4.1, this release brings a host of new features and bug fixes, including additional functions for Spark SQL/DataFrames, new algorithms in MLlib, improvements to the Python API for Spark Streaming, and more.
- HUE 3.7.1 – Hadoop User Experience lets you easily create queries and workflows for Hadoop ecosystem applications, look at tables in the Hive Metastore, and browse files in Amazon S3 as well as on-cluster HDFS.
- Hadoop KMS for HDFS Transparent Encryption – Hadoop Key Management Service is a cryptographic key management server that can provide keys for HDFS Transparent Encryption. See the Hadoop KMS documentation for details.
- EMR Sandbox – this new feature gives you early access to applications for EMR that are still in development and not yet generally available. Version 4.1.0 currently includes:
  - Presto 0.119 – an open-source distributed SQL query engine that can query data sets of any size across multiple data sources.
  - Zeppelin 0.6 – an open-source GUI for creating interactive notebooks for data exploration and visualization. Several users can share and collaborate on a notebook, and they can publish their visualizations to external dashboards.
  - Oozie 4.0.1 – a workflow scheduler for Hadoop that runs workflows defined as directed acyclic graphs (DAGs) of actions.
- Intelligent Resize Feature – you can now change the size of your EMR cluster with minimal impact on your running jobs. When you add instances to your cluster, EMR can start using the provisioned capacity as soon as it becomes available.
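As a rough illustration of what a resize looks like programmatically, the sketch below builds the parameters for the EMR `ModifyInstanceGroups` API call. The helper name `build_resize_request` and the cluster and instance group IDs are hypothetical placeholders, and the sketch assumes you would send the request with an SDK such as boto3; it only prepares the request and does not contact AWS.

```python
def build_resize_request(cluster_id, instance_group_id, new_count):
    """Build keyword arguments for an EMR ModifyInstanceGroups request.

    Hypothetical helper for illustration; the IDs passed in are placeholders.
    """
    if new_count < 0:
        raise ValueError("instance count must not be negative")
    return {
        "ClusterId": cluster_id,
        "InstanceGroups": [
            {"InstanceGroupId": instance_group_id, "InstanceCount": new_count}
        ],
    }


# Placeholder IDs; substitute the values of a real cluster.
resize_request = build_resize_request("j-EXAMPLE", "ig-EXAMPLE", 6)

# Assuming boto3 is installed and credentials are configured, the request
# could then be sent like this:
#   import boto3
#   boto3.client("emr").modify_instance_groups(**resize_request)
```

Keeping the request construction separate from the API call makes it easy to review the target size before anything changes on the cluster.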
You can now start using EMR version 4.1.0 from the AWS Management Console. If you want to create a new cluster for your enterprise, or if you have concerns that are particular to your organization, please contact our AWS cloud consultants here at PolarSeven.
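If you prefer to script cluster creation rather than use the console, the following minimal sketch shows how a request for an EMR 4.1.0 cluster might be assembled for the `RunJobFlow` API. Several details are assumptions rather than part of the announcement: the helper name `build_cluster_request`, the cluster name, the `m3.xlarge` instance type, the default role names, and the application names (in particular the `-Sandbox` suffix for sandbox applications). Check them against the EMR documentation before use.

```python
def build_cluster_request(name, applications, instance_count=3):
    """Build keyword arguments for an EMR RunJobFlow request.

    Hypothetical helper for illustration; instance type and role names
    are assumed defaults and may differ in your account.
    """
    return {
        "Name": name,
        "ReleaseLabel": "emr-4.1.0",  # selects the 4.1.0 release
        "Applications": [{"Name": app} for app in applications],
        "Instances": {
            "MasterInstanceType": "m3.xlarge",  # assumed instance type
            "SlaveInstanceType": "m3.xlarge",
            "InstanceCount": instance_count,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default roles
        "ServiceRole": "EMR_DefaultRole",
    }


# Application names are assumptions; sandbox apps are assumed to carry
# a "-Sandbox" suffix.
cluster_request = build_cluster_request(
    "emr-410-demo", ["Spark", "Hue", "Presto-Sandbox"]
)

# Assuming boto3 is installed and credentials are configured:
#   import boto3
#   response = boto3.client("emr").run_job_flow(**cluster_request)
```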
The post Amazon EMR 4.1.0 Released appeared first on PolarSeven Cloud Consulting.