Large scale distributed search using Elasticsearch

a_thakur
skek

Merging Sessions

skek and I have asked the DrupalCon Committee to merge this session with skek's session (https://events.drupal.org/barcelona2015/sessions/elasticsearch-connector-not-just-search-engine). We believe a collaboration would lead to a better session. skek would cover the roadmap of the Elasticsearch Connector module, feature requests and the rest of the module, and I would talk about the distributed nature of Elasticsearch, how to configure clustering in Elasticsearch, and its configuration with Drupal.

Introduction

Search engines are essential tools for managing and mining large volumes of text data. Recent years have seen a tremendous increase in text data, and managing and mining it is a key challenge. Many open source frameworks and libraries, such as Lucene, Solr, Sphinx and Elasticsearch, exist to tackle this challenge.

This session will cover how Elasticsearch (a distributed, scalable and highly available search server based on Lucene) can be integrated with a Drupal site to offer large-scale distributed search, which is one of the key demands of an enterprise Drupal site.

Topics

This session will cover the following topics:

  • Introduction to text retrieval
  • Difference between text retrieval and database retrieval
  • Introduction to Elasticsearch
  • Elasticsearch and Drupal integration
  • Clustering in Elasticsearch: we will touch on Elasticsearch concepts such as clusters, nodes and shards (primary and replica). This section will cover how clustering can be used to build a distributed search cluster with failover, and how adding nodes to the cluster scales search horizontally.
  • Integrating Elasticsearch with Drupal using the Elasticsearch Connector and Search API modules. This will be the fun part, where we start killing nodes in the cluster, see that search still works, and watch the remaining nodes readjust among themselves to provide failover :)
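
As a rough illustration of the failover demo described above, the following Python sketch reads the Elasticsearch cluster health endpoint (GET /_cluster/health) and turns the response into a one-line summary. It is not part of the session material: the function names fetch_cluster_health and summarize_health are hypothetical, and the base URL assumes a node listening on the default HTTP port 9200 on localhost.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200"):
    """Return the parsed JSON body of GET /_cluster/health.

    base_url is an assumption: a node reachable on the default HTTP port.
    """
    with urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def summarize_health(health):
    """Build a short, human-readable summary of a cluster health response."""
    status = health["status"]
    nodes = health["number_of_nodes"]
    unassigned = health["unassigned_shards"]
    if status == "green":
        note = "all primary and replica shards are allocated"
    elif status == "yellow":
        note = "all primaries allocated, some replicas are not"
    else:
        # "red": the cluster cannot serve every shard of every index.
        note = "at least one primary shard is unallocated"
    return f"{status}: {nodes} node(s), {unassigned} unassigned shard(s); {note}"
```

Calling summarize_health(fetch_cluster_health()) before and after stopping a node should show the node count drop. Whether the status then returns to green depends on how many nodes remain and how many replicas each index is configured with.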

Tools, Frameworks and Modules used

Session Format

The session will be a mix of presentation and live demo using the tools above.

By the end of the session, participants will know how easy it is to configure clustering in Elasticsearch and to integrate it with a Drupal site using the modules mentioned above.

Session Track

Coding and Development

Experience Level

Intermediate

Drupal Version