Member-only story

Hadoop PIG

Avinash Navlani
4 min readJan 1, 2022

In this tutorial, we will focus on scripting language Hadoop PIG for processing big data.

Hadoop Pig is a scripting platform that runs top on Hadoop. It is a high-level and declarative language. It is designed for non-java programmers. Pig uses Latin scripts data flow language.

Why do we need Pig?

Hadoop is written in Java and initially, most of the developers write map-reduce jobs in Java. It means till then Java is the only language to interact with the Hadoop system. Yahoo came with one scripting language known as Pig in 2009. Here are a few other reasons why Yahoo developed the Pig.

  • Lots of on-Programmers were unable to utilize MapReduce because they don’t know the Java programming language.
  • It is difficult to maintain, optimize and write productive code.
  • It is also difficult to consider all the MapReduce phases such as map, sort, and reduce during programming.

Yahoo’s managers have faced problems in performing small tasks. For each small and big change, they need to call the programming team. Also, programmers need to write lengthy codes for small tasks. To overcome these problems, yahoo developed a scripting platform Pig. Pig help researchers to analyze data with simple and few lines of declarative syntax.

Pig Features

--

--

Avinash Navlani
Avinash Navlani

Written by Avinash Navlani

Sr Data Scientist| Analytics Consulting | Data Science Communicator | Helping Clients to Improve Products & Services with Data

No responses yet