Here’s everything you need to know about Solr — the answer to all your enterprise search questions
Solr is an open-source enterprise search platform from the Apache Lucene project that significantly speeds up your search responses. Written in Java, this flexible and powerful cross-platform software provides full-text search, hit highlighting, database integration, rich document handling and much more.
Thanks to its scalable and versatile technology, it is the go-to open-source tool when websites need search capabilities. Running as a standalone server, Solr uses the Lucene Java search library under the hood and can be used from most programming languages through its HTTP APIs, which return XML or JSON.
Let us have a look at a basic Solr installation scenario and go into depth on how to get it running in high availability. If at any point you are unfamiliar with some of the terms used, you can take a look at this Solr terminology guide.
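To make the HTTP access concrete, here is a minimal Python sketch that builds a request URL for Solr's /select handler and reads the documents out of a JSON response. The base URL (a local Solr on the default port 8983), the collection name "techproducts" and the sample response body are assumptions for illustration only; no live server is contacted.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url: str, collection: str, query: str, rows: int = 10) -> str:
    """Build a URL for Solr's /select request handler, asking for a JSON response."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


def extract_docs(response_body: str) -> list:
    """Return the list of matching documents from a JSON response body."""
    return json.loads(response_body)["response"]["docs"]


# Assumed values: a local Solr instance and a collection named "techproducts".
url = build_select_url("http://localhost:8983/solr", "techproducts", "name:ipod", rows=5)
print(url)

# A hand-written sample body in the shape Solr returns, used instead of a live call.
sample = '{"response": {"numFound": 1, "start": 0, "docs": [{"id": "IW-02"}]}}'
print(extract_docs(sample))
```

Any language with an HTTP client and a JSON or XML parser can do the same, which is what makes Solr easy to integrate.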
Technical considerations for a Solr installation
Solr provides packages for Unix-based systems and Windows alike. The first step is a simple download, which you can find on the Solr downloads page. Depending on what you need and how you configure Solr, you can place the package on a single server or on multiple servers to form a clustered environment (which brings high availability in production). You choose between standalone and cloud mode by running a different Solr start command.
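As a rough illustration of that last point, the Python sketch below assembles the two variants of the start command. It assumes the "-c" (cloud mode), "-z" (ZooKeeper connection string) and "-p" (port) options of the bin/solr script as found in Solr releases up to 9.x; the host names are hypothetical, and you should check the options against the version you install.

```python
def solr_start_command(cloud: bool = False, zk_hosts=None, port: int = 8983) -> list:
    """Return the argument list for starting Solr in standalone or cloud mode."""
    cmd = ["bin/solr", "start", "-p", str(port)]
    if cloud:
        cmd.append("-c")  # start in SolrCloud mode
        if zk_hosts:
            # Point Solr at an external ZooKeeper ensemble.
            cmd += ["-z", ",".join(zk_hosts)]
    return cmd


# Standalone mode on the default port.
print(" ".join(solr_start_command()))

# Cloud mode against three hypothetical ZooKeeper hosts.
print(" ".join(solr_start_command(cloud=True, zk_hosts=["zk1:2181", "zk2:2181", "zk3:2181"])))
```

The list form can be passed directly to subprocess.run if you want to script the startup.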
There are a few points to consider before and during the installation process:
- If you're planning to go live with Solr, make sure you have a look at the Solr guide for taking it to production.
- Be aware that, by default, Solr's web interface is freely accessible and not password protected. It can, however, support basic authentication for users through the Basic Authentication Plugin. Make sure Solr servers are accessible only by the intended audience.
- Remember to use the Solr installation guide throughout the process.
- Don’t forget that settings can be changed in "solrconfig.xml", but you can also leave this file untouched and pass your options as JVM start parameters.
- As always with JVM applications, make sure appropriate -Xms and -Xmx values are set.
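The last two points can be sketched in Python as a small helper that renders lines for Solr's include file (solr.in.sh on Unix-based systems). SOLR_JAVA_MEM and SOLR_OPTS are variables from that file; the heap sizes are placeholders, and the property solr.autoSoftCommit.maxTime only has an effect if your solrconfig.xml references it through property substitution, which is an assumption here.

```python
import re

_SIZE = re.compile(r"^(\d+)([kmg])$", re.IGNORECASE)
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def to_bytes(size: str) -> int:
    """Convert a JVM-style size such as '512m' or '2g' to bytes."""
    match = _SIZE.match(size)
    if not match:
        raise ValueError(f"unrecognized heap size: {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2).lower()]


def solr_in_sh_lines(xms: str, xmx: str, system_props=None) -> list:
    """Render include-file lines for heap settings and extra -D properties."""
    if to_bytes(xms) > to_bytes(xmx):
        raise ValueError("-Xms must not be larger than -Xmx")
    lines = [f'SOLR_JAVA_MEM="-Xms{xms} -Xmx{xmx}"']
    for name, value in (system_props or {}).items():
        # Each property is appended to whatever SOLR_OPTS already holds.
        lines.append(f'SOLR_OPTS="$SOLR_OPTS -D{name}={value}"')
    return lines


# Placeholder values; size the heap for your own index and query load.
for line in solr_in_sh_lines("2g", "2g", {"solr.autoSoftCommit.maxTime": 3000}):
    print(line)
```

Setting -Xms and -Xmx to the same value is a common choice because it avoids heap resizing at runtime, but the right numbers depend on your data.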
Planning, sizing and optimization
There are also a number of things you need to consider before, during and after the installation process:
- By default, the installation script installs Solr under the /opt directory. Use the "-i" option when running the script to change the location.
- Running Solr as the root user is not recommended for security reasons. The installation script creates a user named solr, which owns the /opt/solr and /var/solr directories. If you need a different user, use the "-u" option when installing.
- Solr writes to a number of files while the application is running. By default these are located under /var/solr. To change this location, use the "-d" option when installing.
- If you want to share your ZooKeeper cluster between more than one Solr cluster (or share it with something else), make sure to isolate your Solr znodes via a ZooKeeper chroot.
- If there are multiple Solr servers, take the time to set each server's hostname (SOLR_HOST) so that it registers correctly with ZooKeeper.
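The options in the list above can be put together as follows. This Python sketch assumes the install_solr_service.sh script that ships with Solr and the ZK_HOST and SOLR_HOST variables of its include file; the archive name, host names and the "/solr" chroot are hypothetical placeholders.

```python
def install_command(archive: str, install_dir: str = "/opt",
                    data_dir: str = "/var/solr", user: str = "solr") -> list:
    """Return the argument list for the service installation script."""
    return ["sudo", "bash", "./install_solr_service.sh", archive,
            "-i", install_dir,   # where Solr is installed
            "-d", data_dir,      # where writable files live
            "-u", user]          # the user that owns and runs Solr


def cluster_settings(zk_hosts: list, chroot: str, solr_host: str) -> list:
    """Render include-file lines for the ZooKeeper chroot and the Solr hostname."""
    if not chroot.startswith("/"):
        raise ValueError("a ZooKeeper chroot must start with '/'")
    # The chroot is appended once, after the last host in the list.
    zk_connect = ",".join(zk_hosts) + chroot
    return [f'ZK_HOST="{zk_connect}"', f'SOLR_HOST="{solr_host}"']


# "solr-X.Y.Z.tgz" stands in for whichever archive you downloaded.
print(" ".join(install_command("solr-X.Y.Z.tgz")))

for line in cluster_settings(["zk1:2181", "zk2:2181", "zk3:2181"], "/solr", "solr1.example.com"):
    print(line)
```

Each Solr server gets its own SOLR_HOST value, while every server in the same Solr cluster shares the same ZK_HOST string, including the chroot.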
Technical considerations for ZooKeeper
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. (To find out more about ZooKeeper, check out the guide from Apache ZooKeeper.)
It’s important to note that you should run an odd number of ZooKeeper servers, with a minimum of three for redundancy. ZooKeeper stays available only while a majority of its servers can reach each other, so an even-numbered ensemble tolerates no more failures than the odd-numbered one just below it. While there are implementations out there that have Solr and ZooKeeper on the same server, or that don't use an odd number of ZooKeeper servers, this can be risky and more complicated to maintain.
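The reasoning behind the odd number comes down to simple majority arithmetic, sketched here in Python. The function names are made up for this example; the rule itself (more than half of the servers must be up) is how ZooKeeper decides whether it has a quorum.

```python
def quorum_size(servers: int) -> int:
    """Smallest number of servers that forms a majority."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can fail while a majority is still available."""
    return servers - quorum_size(servers)


# Compare ensemble sizes: going from 3 to 4 servers buys no extra fault tolerance.
for n in range(1, 7):
    print(f"{n} servers: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Three servers tolerate one failure, and so do four; it takes five to tolerate two. The extra even-numbered server adds cost and coordination overhead without adding resilience.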
Here’s how the setup looks in one of our production environments:
Is there an alternative way to get Solr?
For those who aren’t looking to install it themselves, there are other Solr implementations to choose from. When dealing with enterprise-level software, a faster and hassle-free way to use that software is always welcome.
As is the case with many other tools, AWS offers a Solr container that can be deployed in an instant. The container is secured and kept up to date as it's offered via Bitnami.
This allows you to:
- Deploy more than one node and create a Cluster Cloud Solution
- Manage nodes via AWS with ease
- Ensure the base image is slim with no unneeded software
- Save time as there are no more package installations
Solr as SaaS?
In short, yes. Stepping into the SaaS area, Solr can also be provided as a service and is available via AWS teamed with Measured Search.
Benefits of implementing Solr this way include:
- Fully managed Solr service options
- Scaling, backup, monitoring, alerting and analytics
- 24/7 service and support backed by SLAs
Stay tuned for our next blog post, where we take a hands-on approach to installing Solr and ZooKeeper.