Friends, functions in Pig come in four types:

1. Eval function

– A function that takes one or more expressions and returns another expression.

– Some eval functions are aggregate functions, such as MAX, which operate on a bag of values and return a single value.

– Some aggregate functions are algebraic, which means that the result of the function may be calculated incrementally.

– In MapReduce terms, algebraic functions make use of the combiner and are much more efficient to calculate.

– Supports UDFs: import org.apache.pig.EvalFunc, extend EvalFunc, and override the exec method.
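
To make the "calculated incrementally" point concrete, here is a minimal plain-Java sketch of an average computed from partial results. It does not use Pig's API: the class AlgebraicAverageSketch and its summarize, merge and finish methods are hypothetical names chosen for illustration. They loosely mirror the initial, intermediate and final stages that an algebraic function goes through when a combiner is used.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is NOT Pig's Algebraic interface.
final class AlgebraicAverageSketch {

    // Partial result that a combiner-like step could pass along.
    static final class Partial {
        final double sum;
        final long count;

        Partial(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // Initial step: summarize one chunk of the input.
    static Partial summarize(List<Double> chunk) {
        double sum = 0;
        for (double value : chunk) {
            sum += value;
        }
        return new Partial(sum, chunk.size());
    }

    // Intermediate step: merge two partial results.
    static Partial merge(Partial a, Partial b) {
        return new Partial(a.sum + b.sum, a.count + b.count);
    }

    // Final step: turn the merged partial result into the average.
    static double finish(Partial p) {
        return p.count == 0 ? Double.NaN : p.sum / p.count;
    }

    public static void main(String[] args) {
        Partial left = summarize(Arrays.asList(1.0, 2.0, 3.0));
        Partial right = summarize(Arrays.asList(4.0, 5.0));
        System.out.println(finish(merge(left, right)));
    }
}
```

Because each chunk is reduced to a small (sum, count) pair before merging, far less data has to move between the map and reduce sides, which is where the efficiency gain comes from.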

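2. Filter function

– A special type of eval function that returns a logical Boolean result.

– Used with the FILTER operator to remove unwanted rows.

– Ex: IsEmpty, which tests whether a bag or map is empty.

– Supports UDFs: import org.apache.pig.FilterFunc, extend FilterFunc, and override the exec method.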
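
The idea of a Boolean-returning function that decides which rows survive can be sketched in plain Java. This is only a stand-in and does not use Pig's FilterFunc class: FilterSketch, NOT_EMPTY and filter are hypothetical names, and each "row" is modeled as a simple list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java illustration only; this is NOT Pig's FilterFunc.
final class FilterSketch {

    // Stand-in for a filter function: true means "keep this row".
    static final Predicate<List<String>> NOT_EMPTY = row -> !row.isEmpty();

    // Stand-in for the FILTER operator: keeps rows the predicate accepts.
    static <T> List<T> filter(List<T> rows, Predicate<T> keep) {
        List<T> kept = new ArrayList<>();
        for (T row : rows) {
            if (keep.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> rows = new ArrayList<>();
        rows.add(Arrays.asList("a", "b"));
        rows.add(Collections.<String>emptyList());
        rows.add(Arrays.asList("c"));
        System.out.println(filter(rows, NOT_EMPTY));
    }
}
```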

3. Load function

– Loads the data into a relation from external storage

– Supports UDFs: import org.apache.pig.LoadFunc and extend LoadFunc, but here you override several methods instead of exec, such as setLocation, getInputFormat, prepareToRead, and getNext.

4. Store function

– Specifies how to save the contents of a relation to external storage

– Ex: PigStorage, which loads data from delimited text files, can also store data in the same format.
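
The load/store symmetry described for PigStorage can be illustrated with a small plain-Java round trip over one tab-delimited line. This sketch does not use Pig at all: DelimitedTextSketch, load and store are hypothetical names, and a real loader also deals with input splits, schemas and type conversion, which are ignored here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Plain-Java illustration only; this is NOT Pig's PigStorage.
final class DelimitedTextSketch {

    // "Load": break one line of text into its fields.
    static List<String> load(String line, String delimiter) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        return Arrays.asList(line.split(Pattern.quote(delimiter), -1));
    }

    // "Store": write the fields back out in the same delimited format.
    static String store(List<String> fields, String delimiter) {
        return String.join(delimiter, fields);
    }

    public static void main(String[] args) {
        String line = "1950\t0\t1";
        List<String> fields = load(line, "\t");
        System.out.println(fields.size());
        System.out.println(store(fields, "\t").equals(line));
    }
}
```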

A detailed list is given below:

Pig built-in functions

| Category | Function | Description |
|---|---|---|
| Eval | AVG | Calculates the average (mean) value of entries in a bag. |
| | CONCAT | Concatenates byte arrays or character arrays together. |
| | COUNT | Calculates the number of non-null entries in a bag. |
| | COUNT_STAR | Calculates the number of entries in a bag, including those that are null. |
| | DIFF | Calculates the set difference of two bags. If the two arguments are not bags, returns a bag containing both if they are not equal; otherwise, returns an empty bag. |
| | MAX | Calculates the maximum value of entries in a bag. |
| | MIN | Calculates the minimum value of entries in a bag. |
| | SIZE | For character arrays, the number of characters; for byte arrays, the number of bytes; for containers (tuple, bag, map), the number of entries. |
| | SUM | Calculates the sum of the values of entries in a bag. |
| | TOBAG | Converts one or more expressions to individual tuples, which are then put in a bag. |
| | TOKENIZE | Tokenizes a character array into a bag of its constituent words. |
| | TOMAP | Converts an even number of expressions to a map of key-value pairs. |
| | TOP | Calculates the top n tuples in a bag. |
| | TOTUPLE | Converts one or more expressions to a tuple. |
| Filter | IsEmpty | Tests whether a bag or map is empty. |
| Load/Store | PigStorage | Loads or stores relations using a field-delimited text format; the delimiter defaults to a tab character. |
| | BinStorage | Loads or stores relations from or to binary files in a Pig-specific format that uses Hadoop Writable objects. |
| | TextLoader | Loads relations from a plain-text format. |
| | JsonLoader, JsonStorage | Loads or stores relations from or to a JSON format. |
| | HBaseStorage | Loads or stores relations from or to HBase. |
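
The difference between COUNT and COUNT_STAR is easy to miss, so here is a minimal plain-Java sketch of it. This is not Pig code: CountSketch, count and countStar are hypothetical names, and the bag is simplified to a list of single values, whereas a real Pig bag holds tuples.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; a Pig bag is modeled as a list of values.
final class CountSketch {

    // Like COUNT: null entries are skipped.
    static long count(List<Integer> bag) {
        long n = 0;
        for (Integer value : bag) {
            if (value != null) {
                n++;
            }
        }
        return n;
    }

    // Like COUNT_STAR: every entry is counted, null or not.
    static long countStar(List<Integer> bag) {
        return bag.size();
    }

    public static void main(String[] args) {
        List<Integer> bag = Arrays.asList(1, null, 3);
        System.out.println(count(bag) + " vs " + countStar(bag));
    }
}
```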