Amazon Redshift has enjoyed a swift rise in popularity over the last year, but IT pros say third-party tools are necessary for a complete deployment.
Amazon's Redshift data warehouse, which received feature updates this week, allows enterprise IT pros to execute complex SQL queries against large data sets.
Users can interface with Redshift directly, without third-party software as a go-between. But partner tools that collect and transfer data into Redshift, as well as tools that query the data warehouse and display the results visually, are preferable to the service's native interfaces, IT pros said.
XO Group, Inc., the company behind such brands as TheKnot.com, a website that helps couples plan weddings, uses Redshift because of Redshift's partnership with a data collection company, San Francisco-based Segment.io. Segment made its integration with Redshift public this week, and XO Group was a beta tester of the integration.
XO Group might have adopted Redshift without this partnership, but not as quickly, said Jon Hawkins, director of SEO and data analytics for the New York-based company.
Segment joins several other Amazon Redshift partners that can collect and transform data for analysis in the Redshift data warehouse. It collects and streams data from browsers on desktops and mobile devices directly into Redshift, without another database acting as a middleman, according to the company.
XO Group initially used Segment alone, and sent streams of application data -- at a clip of about a billion events per month -- to individual product development teams for feedback on the applications they created. But the company needed to see the big picture to make strategic decisions that span multiple apps and products, Hawkins said.
That's where Redshift came in.
"We liked the speed -- that was the first thing we noticed," Hawkins said. "We worked with many different analytics providers; they all received the same data we put into Amazon Redshift, and only one of them [Mixpanel] was as fast."
One of the insights gleaned from Redshift on the back end was that users on the company's sites prefer to share content via text and email rather than social networks, which informed the development of sharing options in the application.
Extracting such insights required help from Amazon partners Mode Analytics and Chartio, both of which help users construct SQL queries through a visual interface and display the data returned by Redshift visually for analysis.
Another Amazon Redshift customer, online ticketing service Etix, based in Raleigh, N.C., said earlier this year that it uses a partner product, Attunity's CloudBeam, for data integration between an on-premises Oracle deployment and the Redshift service.
Amazon Redshift gets incremental update
The Amazon data warehouse was also brushed up this week with new features, including resource tagging, the ability to cancel running queries, enhanced data load and unload, and support for larger clusters of up to 128 nodes -- up from 40 nodes at the product's launch. The platform also supports 16 new SQL queries and commands.
"It makes sense to me that for a data warehouse they would focus on improving the basics: importing and exporting data, running queries, analytics functions, etc.," said Daniel Heacock, senior business systems analyst for Etix. "That being said, none of these are critical to our business."
In the future, "triggers and stored procedures would be nice," Heacock said. Etix has designed external operations for transforming data once it arrives in Redshift, which can be tricky to manage. The company would prefer to keep these operations contained in the database itself, as is customary with relational databases.