One of the biggest announcements coming out of Google Cloud Platform’s Next Conference last week was about Apple moving workloads from AWS, but there is much more to the story than the headline. The world of poly-cloud is shifting on several fronts, from financial justifications to customer migrations to product developments. We cover it in this week’s BUILD Newsletter.
Today, VMware and Pivotal share an important milestone in our promise to deliver a next generation, turnkey Cloud Native platform that will fundamentally transform how companies deliver and run custom enterprise software. We are announcing the availability of the open source Photon Platform Cloud Provider Interface (CPI) for Cloud Foundry’s BOSH, an API that is used to interact with an underlying IaaS to create and manage objects on an infrastructure, including images, VMs and disks. Simply put, now Cloud Foundry users have the ability to manage their application’s lifecycle on the lightweight VMware Photon IaaS.
The Apache Software Foundation (ASF) is one of the open source organizations participating in the Google Summer of Code 2016 (GSoC 2016) program. As a sponsor of the ASF, Pivotal is keen on supporting students looking to work in the complex and growing field of big data by developing features across a number of ASF Incubating projects that power our data products, including Apache Geode (incubating), Apache HAWQ (incubating) and Apache MADlib (incubating). For students around the world, it also offers an opportunity to pair with and learn from Pivotal’s data engineers, as well as earn $5500! The deadline to apply is Friday, March 25, 2016.
In this post, Scott Hajek, from Pivotal’s famous data science team, explains approaches for working with unstructured text, extracting the data and turning it into structured records. He explains entity recognition and related NLP techniques, such as human-supervised feedback loops, that use machine learning to automate extraction. Several examples are provided.
In this post, one of Pivotal’s data scientists, Scott Hajek, explains how Greenplum Database (either the open source or the Pivotal version) can be used for information extraction. After a brief introduction, he walks through the concepts, the capabilities within Greenplum and the processing steps with plenty of example code.
Open source software, such as the Apache Hadoop® standard within the Big Data realm, has become the default and dominant choice when companies deploy software. Not too long ago, most executives didn’t necessarily see how much of their operations ran on open source. Now, they do. There are three key reasons why, and we outline them in this post, along with an upcoming webinar on an Open Source Playbook for 2016.
I recently attended PGconf in Vienna, Austria, where we announced the open-sourcing of Pivotal Greenplum, which has become the first open source massively parallel data warehouse. Now known as Greenplum Database in its open source form, the product can be built by anyone who clones the GitHub repo, but there is another segment of the community that just wants to try out the functionality without going through that process. For that group, we now have the Pivotal Greenplum Sandbox Virtual Machine, a free trial that packages the open source Greenplum Database, the commercially available Pivotal Greenplum Command Center management tool, Apache MADlib (incubating), PostGIS, PL/R, PL/Perl, and PL/Java into an easy-to-use virtual machine that runs in either VirtualBox or VMware Fusion.
Welcome to the November 2015 edition of the Build Newsletter. For this edition, we start out with the big Dell acquisition news, talk about our donation of 5 million lines of code to open source, discuss unicorns and disruption, and provide a huge list of Cloud Native development updates.
It is an exciting time for customers using data-related software, with more data being processed and analysed than ever. To do this effectively, you need some heavy-duty technology, and one of the best for a long time has been the Pivotal Greenplum Database. Now, this database is available as open source, giving customers even more flexibility and enabling a flourishing community of developers and contributors. In this episode, host Simon Elisha explores what this means and how you can be a part of it.
Today, Pivotal unveiled the first open source massively parallel processing (MPP) data warehouse. The release of Greenplum Database is significant for two important reasons. First, while it is the only open source option for data warehousing, it also leads the industry with query optimization technology that is multiple times faster than any other commercial offering today. Second, it completes Pivotal's 10-month transition to engage more openly with all the communities we affect and to help companies of all types on their digital transformation journey. With nearly 10 million lines of code released this year, this release signals to the industry that a sea change has happened, and the days of closed source, vendor lock-in, legacy models are coming to an end.