Map/Reduce two-liner

Despite being around for quite a few years and having been used successfully in a number of high-profile implementations, the map/reduce concept still causes a certain amount of confusion, especially among non-technical folks.

If you need to explain this concept (yes, still fresh in memory 🙂), here’s a two-line explanation:

1. Map: distribute data-gathering tasks across the grid.

2. Reduce: collect the results and eliminate duplicates.

That is the process in a nutshell. The devil, as usual, is in the details: the data grid refers to a distributed storage and computing infrastructure, and there is nothing trivial about parsing a query into parallel data-gathering tasks or processing the returned results.
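For the technically inclined, the two lines can be sketched in a few lines of Python. This is only a toy illustration, not how any real framework does it: the `GRID` dictionary, the node names, and the helpers `gather`, `map_phase`, `reduce_phase`, and `run_query` are all made up for this example, and a local thread pool stands in for the actual grid.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data grid: each "node" holds a slice of the records.
# Real grids are separate machines; a dict keeps the sketch self-contained.
GRID = {
    "node-a": ["alice", "bob", "carol"],
    "node-b": ["bob", "dave"],
    "node-c": ["carol", "erin", "alice"],
}


def gather(node, prefix=""):
    """Data-gathering task for one node: return its records matching the prefix."""
    return [name for name in GRID[node] if name.startswith(prefix)]


def map_phase(nodes, prefix=""):
    """Map: distribute one gathering task per node (threads stand in for the grid)."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda node: gather(node, prefix), nodes))


def reduce_phase(partials):
    """Reduce: collect the partial results and eliminate duplicates."""
    merged = set()
    for partial in partials:
        merged.update(partial)
    return sorted(merged)


def run_query(prefix=""):
    """Run the whole query: map across every node, then reduce."""
    return reduce_phase(map_phase(list(GRID), prefix))


if __name__ == "__main__":
    print(run_query())     # every distinct name in the grid
    print(run_query("c"))  # only the names starting with "c"
```

Everything hard about the real thing (splitting an arbitrary query into tasks, moving data between machines, coping with failed nodes) is exactly what this sketch leaves out.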

