Wednesday, February 24, 2016

Leveraging the Serengeti API with vSphere Big Data Extensions

I've been working with VMware Big Data Extensions with a couple of customers lately as we look at providing Hadoop as a Service (HaaS) through the Serengeti API. So what are Big Data Extensions (BDE) and the Serengeti API, and why would you use them?

What is it?

BDE is an orchestration layer for deploying and managing Hadoop clusters. It's deployed as an OVA and registered as a plug-in in the vCenter web interface. What is unique about BDE is that it allows VMware administrators to manage a Hadoop cluster as a single instance, and it provides all of the under-the-hood orchestration. It supports both deploying and scaling clusters. BDE is available to vSphere Enterprise Plus customers and is supported by VMware. You can get it here:

http://www.vmware.com/go/download-bigdataextensions

While BDE is the commercially supported release, it's built on a project that VMware released to the open source community called Serengeti. The open source Serengeti project can be found here:

https://github.com/vmware-serengeti

Why would I use it?

The BDE plugin is preconfigured to manage Hadoop clusters as a single instance, which is great if you are a VMware admin with access to vCenter. But what happens when you need to offer HaaS to data scientists and you don't really want to give them access to vCenter? That's where the Serengeti API comes in: we can use it to call out to BDE from another platform.

If you already leverage vRealize Automation, you are in luck. VMware has pre-built a plugin pack for vRealize Automation and Orchestrator to offer HaaS; you can get it from the Solution Exchange here. But what happens if you use another portal? That's where calling the Serengeti API directly comes into play.
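To give you a feel for what that looks like, here is a minimal sketch of logging in to the Serengeti management server and listing clusters with Python and the requests library. The host name, credentials, and certificate handling are placeholders, and the endpoint paths follow the BDE 2.x REST API documentation, so verify them against your release before building on them.

import requests

# Placeholders for your environment -- swap in your own management server and credentials.
BDE_SERVER = "bde-mgmt.example.com"
BASE_URL = "https://{0}:8443/serengeti".format(BDE_SERVER)

session = requests.Session()
session.verify = False  # lab shortcut; point this at your CA bundle in production

# BDE fronts the API with Spring Security, so authenticate with a form POST first.
login = session.post(
    BASE_URL + "/j_spring_security_check",
    data={"j_username": "administrator@vsphere.local", "j_password": "changeme"},
)
login.raise_for_status()

# List the Hadoop clusters BDE is managing.
response = session.get(BASE_URL + "/api/clusters", headers={"Accept": "application/json"})
response.raise_for_status()
for cluster in response.json():
    print(cluster.get("name"), cluster.get("status"))

The same authenticated session can then drive the cluster create, scale, and delete calls from whatever portal you do have.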

Dig into the API after the break


Monday, February 8, 2016

Installing a signed SSL cert for EMC ECS Object services

I was working through a proof of concept for a customer backing up Cassandra to an S3 object store this weekend. Since I already had the EMC ECS Community Edition running in the lab, I had an S3 object store ready to go, but I needed to install a signed certificate on it to make my customer's backup of Cassandra data to object storage work.

If you are looking for an object store to play with, EMC ECS is available free for non-production use. You can get it here, and the EMC CODE team has been nice enough to package up Docker containers of the nodes.

Why do I need to do this?

You will want to leverage a signed cert any time your clients need secured access to the object store. You can use a cert from your internal certificate authority, as long as it is added to your clients' trusted root store, or one from a publicly trusted certificate authority.
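As an example of why the trust chain matters, here is a minimal sketch of an S3 client (boto3) pointed at an ECS endpoint and told to validate the server certificate against your own CA bundle. The endpoint, port, credentials, and CA path are placeholders from my lab, so adjust them for your deployment.

import boto3

# Placeholders: an ECS object user and the ECS S3 HTTPS port (9021) from my lab.
s3 = boto3.client(
    "s3",
    endpoint_url="https://ecs.lab.local:9021",
    aws_access_key_id="object-user",
    aws_secret_access_key="object-user-secret-key",
    verify="/etc/ssl/certs/lab-ca.pem",  # CA chain that signed the ECS certificate
)

# If the chain doesn't validate, this raises an SSL error rather than silently
# connecting to an untrusted endpoint -- exactly the behavior you want from a
# backup client writing to object storage.
for bucket in s3.list_buckets().get("Buckets", []):
    print(bucket["Name"])

The customer's Cassandra backup tooling behaves the same way: if the certificate isn't trusted, the backup fails, which is what sent me down the signed-cert path in the first place.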

Let's dive into the technical details after the break.