There are several tool options to consider when adding a cloud search engine to a website or application, but some cloud "super users" favor ElasticSearch.
ElasticSearch is the newest of three popular options for cloud search, all based on Apache Lucene, the open source search engine utility. Others mentioned in the AWS Super Users Online Meetup this week were Amazon's CloudSearch and Apache Solr.
"A great feature of ElasticSearch that I love is called 'more like this,' which means if you set up ElasticSearch in the right way, you can basically say, 'What are the documents that are most like this one?'" said Jeff Whelpley, chief architect at GetHuman LLC, a Boston-based proprietor of a website that helps users find shortcuts to customer service reps at large consumer companies.
This related search works "very fast and extremely well" in ElasticSearch, Whelpley said.
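For readers who have not seen the feature Whelpley describes, the following is a minimal sketch of what a "more like this" request body can look like in Elasticsearch's JSON query language. It assumes the `more_like_this` query syntax documented for recent Elasticsearch releases, which differs from the syntax of the versions available when the panel took place. The index name, document ID and field names are hypothetical and are not drawn from GetHuman's setup.

```python
import json


def more_like_this_query(index, doc_id, fields, max_results=5):
    """Build a request body asking for documents similar to one already indexed.

    The index, doc_id and fields values are illustrative placeholders.
    """
    return {
        "size": max_results,
        "query": {
            "more_like_this": {
                # Fields whose text is compared between documents.
                "fields": fields,
                # Reference an existing document instead of passing raw text.
                "like": [{"_index": index, "_id": doc_id}],
                # Consider terms that appear at least once in the source document.
                "min_term_freq": 1,
                # Cap how many terms are selected to build the similarity query.
                "max_query_terms": 25,
            }
        },
    }


# Hypothetical example: find entries similar to document "42" in a "companies" index.
body = more_like_this_query("companies", "42", ["name", "description"])
print(json.dumps(body, indent=2))
```

A body like this would typically be sent to the index's `_search` endpoint. The sketch only builds the request, so it runs without a cluster.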
Unlike other panelists on the webinar, Whelpley does not use Amazon Web Services (AWS) to host ElasticSearch; instead, GetHuman uses managed service provider Found.no.
While Found.no's customer service has been good, "ElasticSearch in particular is relatively new, and the companies in the space are also relatively small, and relatively new," Whelpley said. That has led to some frustration with customer service at Found.no competitors, including Bonsai and Qbox. ElasticSearch also had problems with data corruption prior to its 1.0 release last year, but those have been resolved, Whelpley added.
Another participant in the webinar said his company uses a self-managed open source Apache Solr framework to manage cloud search running on AWS, but it is intrigued by ElasticSearch and is considering a move.
"If I were doing this over again today, I would choose ElasticSearch over Solr, simply because, like AWS CloudSearch, ElasticSearch is very API [application programming interface]-based," said Jon Dokulil, VP of engineering at Agile Sports Technologies Inc., which runs Hudl, a website that hosts highlight videos and other content for high school and college athletes. "With Solr, you're going to find that you're shuffling XML files around a lot, as well as text files."
Hudl has a staff of mostly developers, so the more API-driven a utility can be, the better, Dokulil said, adding that while comparable with Solr, ElasticSearch wins from an operational standpoint.
AWS' CloudSearch was represented in the discussion by Tom Hill, solutions architect for CloudSearch with AWS. He discussed the advantages of a fully hosted cloud search engine implementation, in which AWS manages data replication among search indexes and redundancy among search clusters.
Dokulil considered CloudSearch, but said the costs were significantly higher than rolling his own with Solr. Using m3.xlarge instances on Amazon, Hudl runs a triply redundant cluster for about $1,500 a month. CloudSearch would have cost twice that, with redundancy over only two availability zones instead of three, he said. Found.no's pricing starts at $54 a month for a very small hosted ElasticSearch configuration with 256 MB of memory, and can exceed $2,500 a month for larger installations with 32 GB of memory, according to its website.
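The gap Dokulil describes can be put in rough numbers. The sketch below uses only the figures he quoted (about $1,500 a month across three availability zones for self-managed Solr, and twice that across two zones for CloudSearch). The cost-per-zone and annual figures are an illustrative derivation, not numbers cited on the panel.

```python
# Figures quoted by Dokulil; both are approximate.
solr_monthly = 1500                      # self-managed Solr on m3.xlarge instances
solr_zones = 3                           # triply redundant cluster
cloudsearch_monthly = 2 * solr_monthly   # "twice that"
cloudsearch_zones = 2                    # redundancy over two availability zones


def cost_per_zone(monthly_cost, zones):
    """Monthly spend divided by the number of availability zones covered."""
    return monthly_cost / zones


solr_per_zone = cost_per_zone(solr_monthly, solr_zones)
cloudsearch_per_zone = cost_per_zone(cloudsearch_monthly, cloudsearch_zones)
annual_difference = (cloudsearch_monthly - solr_monthly) * 12

print(f"Solr: ${solr_per_zone:,.0f} per zone per month")
print(f"CloudSearch: ${cloudsearch_per_zone:,.0f} per zone per month")
print(f"Annual difference: ${annual_difference:,.0f}")
```

On these rough figures, CloudSearch would cost about three times as much per availability zone. That comparison leaves out the operations staff time that Hudl already pays for.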
"Obviously you're paying for something there, in how they run it for you," Dokulil said, referring to Amazon CloudSearch. "But we have an operations team, since we need to run other infrastructures as well."
Self-managing AWS nodes also means more flexibility with instance types, Dokulil added.
"When we get SSDs [solid-state drives] available, for example, we can start using them tomorrow," he said.