Map-Reduce is a technique for computation in distributed systems. As the name suggests, a Map-Reduce job is divided into two major tasks. First, the input data, collected from multiple distributed systems, is broken down into small independent streams, and each element is mapped (converted) into the required format. Then a reduction operation takes the output of the map step and combines it into a single result.
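The two phases can be sketched with Java streams as follows. This is only a minimal single-machine illustration, not a distributed implementation; the class name `MapReduceSketch`, the method `totalLength`, and the sample words are assumptions made up for this example.

```java
import java.util.Arrays;
import java.util.List;

public class MapReduceSketch {

    // Map phase: convert each word into its length.
    // Reduce phase: fold all lengths into a single int, starting from 0.
    public static int totalLength(List<String> words) {
        return words.stream()
                .map(String::length)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("map", "reduce", "stream");
        System.out.println(totalLength(words)); // 3 + 6 + 6 = 15
    }
}
```

The `map` call changes the shape of each element, while `reduce` collapses the whole stream into one value, which mirrors the two tasks described above.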
A Map-Reduce job is generally broken down into smaller units that are computed on multiple threads, which makes it easy to scale.
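One way to picture this splitting is to do it by hand: cut the input into chunks, let a thread pool reduce each chunk to a partial result, then merge the partial results. The sketch below assumes a simple sum over integers; the class name `ChunkedSumSketch`, the method `chunkedSum`, and the chunk size and thread count are hypothetical choices for illustration, and a parallel stream does comparable work internally without this boilerplate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ChunkedSumSketch {

    // Splits the input into chunks, sums each chunk on a pool thread,
    // then combines the partial sums into one result.
    public static long chunkedSum(List<Integer> numbers, int chunkSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<Long>> partials = new ArrayList<>();
            for (int start = 0; start < numbers.size(); start += chunkSize) {
                int end = Math.min(start + chunkSize, numbers.size());
                List<Integer> chunk = numbers.subList(start, end);
                // Each task maps its chunk to long values and reduces them to a partial sum.
                partials.add(CompletableFuture.supplyAsync(
                        () -> chunk.stream().mapToLong(Integer::longValue).sum(), pool));
            }
            long total = 0;
            for (CompletableFuture<Long> partial : partials) {
                total += partial.join(); // final reduce: merge the partial sums
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(chunkedSum(numbers, 3, 4)); // 55
    }
}
```

Because the chunks are independent, adding more threads (or machines, in a distributed setting) lets more chunks be processed at the same time.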
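Before Java 8, bulk operations on a collection had to be written as explicit loops over data already loaded into memory, and parallelizing them meant managing threads by hand. Since the introduction of Streams in Java 8, bulk operations can be expressed as a pipeline that is evaluated lazily, element by element, without building intermediate collections, and the same pipeline can be run in parallel.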
package org.wesome.java8;

import lombok.AllArgsConstructor;
import lombok.Data;

import java.util.Arrays;
import java.util.List;

// Lombok generates the getters (e.g. getPrice()) and the all-args constructor.
@Data
@AllArgsConstructor
class Fruit {
    int fruitId;
    String fruitName;
    String taste;
    double price;
}

class Apple {
    public static void main(String[] args) {
        List<Fruit> fruits = Arrays.asList(
                new Fruit(1, "Macintosh", "sweet", 1.1),
                new Fruit(2, "Fuji", "sweet", 2.2),
                new Fruit(3, "Gala", "sour", 1.1),
                new Fruit(4, "Jonagold", "sour", 2.2));

        System.out.println("*--------------DoubleStream Sequential Map-Reduce--------------*");
        // Map each Fruit to its price, then reduce the prices to their sum.
        double priceSum = fruits.stream().mapToDouble(Fruit::getPrice).sum();
        System.out.println(priceSum);

        System.out.println("*--------------DoubleStream Parallel Map-Reduce--------------*");
        // Same pipeline, but the work may be split across multiple threads.
        priceSum = fruits.parallelStream().mapToDouble(Fruit::getPrice).sum();
        System.out.println(priceSum);
    }
}
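The point about not loading data into memory can be illustrated with a stream that generates its elements on demand instead of reading them from a collection. The following is a rough sketch under that assumption; the class name `LazySumSketch` and the method `sumOfSquares` are hypothetical names chosen for this example.

```java
import java.util.stream.LongStream;

public class LazySumSketch {

    // Generates 1..n on demand, maps each value to its square,
    // and reduces the squares to a single sum, optionally in parallel.
    // No list or array holding all n values is ever created.
    public static long sumOfSquares(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .map(x -> x * x)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000)); // 333833500
    }
}
```

Removing the `.parallel()` call gives the sequential version of the same pipeline; the result is the same because integer addition does not depend on the order in which the partial sums are combined.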