Comments:"Techu Search Server v0.1-beta"
URL:http://georgepsarakis.github.io/techu-search-server/
HOW IT WORKS
The logic behind Techu is pretty straightforward, as you can see in the flow diagram to the right. Application code sends an HTTP request that corresponds to a specific action. There are two major groups of operations: indexing and searching. Indexing involves inserting or deleting a document from the index, or modifying a document's attributes or text fields. Searching, on the other hand, involves performing full-text searches and retrieving highlighted excerpts.
In the current beta version, most request data are passed via a single data parameter with a JSON-formatted value. There are some exceptions, usually requests that handle Sphinx configurations, but in the future all requests will follow this protocol for simplicity and uniformity. Oh, and yes, now you can keep your Sphinx configurations in order by storing them in Techu's MySQL DB schema (although this feature can also be bypassed). On each regeneration command, Techu will automatically restart the corresponding searchd.
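As a rough illustration of the single data parameter convention, here is a minimal sketch of how a client might encode a request body and how the value can be decoded again. The payload field names (id, title, content) and the helper name build_request_body are assumptions made for the example, not part of Techu's documented API.

```python
import json
from urllib.parse import urlencode, parse_qs


def build_request_body(payload: dict) -> str:
    """Encode a payload as a form body with a single `data` parameter."""
    return urlencode({"data": json.dumps(payload)})


# Hypothetical payload: the field names are illustrative only.
payload = {"id": 42, "title": "Hello", "content": "Full-text body"}
body = build_request_body(payload)

# On the receiving side, the value is recovered by reading `data`
# from the form body and parsing it as JSON.
decoded = json.loads(parse_qs(body)["data"][0])
```

The body can then be sent as a regular form-encoded POST with any HTTP client.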
After the application dispatches a request, Nginx receives it and the Django Python Web framework processes the request. As you can see in the diagram, indexing operations can optionally be queued for asynchronous execution. In that case, a Redis key is returned as the response, and the request is later converted to SphinxQL and sent to Sphinx by the script referred to as the applier (we should probably find a better name for this).
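The queue-then-apply flow described above can be sketched roughly as follows. This is not Techu's actual applier: a collections.deque stands in for the Redis list so the example runs without a server, and the key format, the function names, and the "%s" placeholder style (as used by common MySQL client libraries) are all assumptions.

```python
import json
import uuid
from collections import deque

# In-memory stand-in for a Redis list; a real deployment would push to
# and pop from a Redis list instead.
queue = deque()


def enqueue(action: str, index: str, document: dict) -> str:
    """Queue an indexing request and return the key handed back to the caller."""
    key = f"techu:queue:{uuid.uuid4().hex}"  # hypothetical key format
    item = {"key": key, "action": action, "index": index, "document": document}
    queue.append(json.dumps(item))
    return key


def to_sphinxql(index: str, document: dict):
    """Build a parameterized SphinxQL INSERT for one document.

    Index and column names are interpolated directly, so real code
    should validate them against a whitelist first.
    """
    columns = sorted(document)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {index} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, [document[c] for c in columns]


def apply_pending():
    """Drain the queue in FIFO order and return the statements to send to Sphinx."""
    statements = []
    while queue:
        item = json.loads(queue.popleft())
        if item["action"] == "insert":
            statements.append(to_sphinxql(item["index"], item["document"]))
    return statements
```

For example, calling enqueue("insert", "articles", {"id": 1, "title": "Hello"}) returns a key immediately, and a later call to apply_pending() produces the corresponding INSERT statement with its parameter list.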
If no queueing is required, the data are converted to a SphinxQL statement (or several statements if you are batch inserting documents). For a search operation, either a full-text search or highlighted excerpts, the response can originate directly from Redis (the cache); if there is no cache entry, the attribute filters and the query are converted to SphinxQL and the results are retrieved from Sphinx directly.
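The cache-first search path can be sketched like this. Again, this is an assumption-laden illustration rather than Techu's code: a plain dict stands in for Redis, the cache key scheme is invented, and run_query is a hypothetical callable that would execute the statement against searchd over the MySQL protocol.

```python
import hashlib
import json

# In-memory stand-in for the Redis cache.
cache = {}


def cache_key(index: str, query: str, filters: dict) -> str:
    """Derive a stable key from the search parameters (hypothetical scheme)."""
    raw = json.dumps({"index": index, "query": query, "filters": filters}, sort_keys=True)
    return "techu:cache:" + hashlib.sha1(raw.encode("utf-8")).hexdigest()


def build_search(index: str, query: str, filters: dict):
    """Convert a full-text query plus attribute filters to parameterized SphinxQL."""
    clauses = ["MATCH(%s)"]
    params = [query]
    for attr in sorted(filters):
        clauses.append(f"{attr} = %s")  # attribute names should be validated
        params.append(filters[attr])
    return f"SELECT * FROM {index} WHERE " + " AND ".join(clauses), params


def search(index: str, query: str, filters: dict, run_query):
    """Serve from the cache when possible, otherwise query Sphinx and cache the rows."""
    key = cache_key(index, query, filters)
    if key in cache:
        return json.loads(cache[key])
    sql, params = build_search(index, query, filters)
    rows = run_query(sql, params)
    cache[key] = json.dumps(rows)
    return rows
```

With this shape, a repeated search with the same query and filters is answered from the cache and never reaches Sphinx.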
WHY REDIS?
Key-value storage, optionally persistent, with a very large value length limit (512 MB), capable of storing a lot of text. Redis list and hash structures are key components of the caching and queueing subsystems.
WHY SPHINXQL?
SphinxQL is faster than the native Sphinx API.
WHY NGINX?
A fast web server that ensures high concurrency and low latency.
WHY THESE COMPONENTS OVERALL?
Every component is well-established software and excels in its area. They are also commonly found in most stacks. We wouldn't like to reinvent the wheel, plus there is no exotic configuration for you to learn or set up!