Big Data Engineer
Top 5% PySpark/Apache Spark contributor on StackOverflow
I have been a professional Big Data Engineer since 2016, and I am always eager to learn new technologies.
I have experience handling the complete lifecycle of data engineering projects — whether data migration, exploratory data analysis, or building a data lake on the cloud, I have delivered end to end with full dedication.
May 2019 - Present
Quantiphi is a category-defining Applied AI and Machine Learning software and services company focused on helping organizations translate the big promise of Big Data and Machine Learning technologies into quantifiable business impact.
I have been working as a Big Data Engineer, fulfilling clients' needs for handling large, complex datasets and creating data lakes on the cloud. My responsibilities here include:
- Design and implementation of a data lake on AWS using S3, Redshift, Glue, and DMS.
- Implementation of a document-style NoSQL data hub on AWS using DynamoDB, Glue, Step Functions, Athena, S3, EC2, and Infoworks.
- Creation of ETL pipelines with Apache Spark on Glue and EMR.
- Implementation of DynamoDB Streams with AWS Lambda to load Elasticsearch indexes for faster search results.
- Optimization of Apache Spark jobs by tuning joins and reducing data shuffle over the network.
- Optimization of Redshift cluster performance by choosing optimal distribution and sort keys and tuning SQL queries.
- Optimization of DynamoDB queries by implementing appropriate GSIs and choosing the best partition and sort keys.
- Implementation of a Flask-based API to fetch data from DynamoDB and serve customers, deployed on AWS Elastic Container Service (ECS) using Docker images.
- Creation of Docker images to enable the Spark history server, Glue local development endpoints, and data lineage for Glue jobs, and deployment of these images on ECS.
Tech Stack:
AWS (Redshift, Glue, S3, Lambda, Redshift Spectrum), Apache Spark, Python, Pandas, Boto3, PyArrow
Dec 2016 - May 2019
My role at TCS included working with the third-largest banking client, understanding their data needs, and developing architecture to streamline their processes in a big data ecosystem. The majority of my responsibilities included:
- Design and implementation of a Python-based framework to import data from various relational databases (Teradata, Oracle, SQL Server, DB2) into Hadoop using Sqoop, Hive, and Oozie.
- Implementation of a Spark JDBC-based data ingestion framework for the Informix database.
- Development of a Spark-based application to parse complex XML data received as files.
- Development of Spark-based reconciliation processes to maintain data integrity in Hadoop.
- Worked extensively with Spark DataFrames and pandas DataFrames.
- Worked in a Unix environment and created various Unix shell scripts as per requirements.
- Implementation of a Sqoop export framework for exporting data from Hadoop to Teradata.
- Orchestration of the data-flow pipeline from Dynatrace to HDFS using Oozie workflows.
- Handling of the Protobuf message format.
- End-to-end setup of a Flume agent for fetching live transactions.
- Used Spark 2.1.0 on Cloudera CDH 5.13 to perform analytics on data in Hive.
Environment:
Hadoop 2, Spark 2.1.0, Sqoop, Hive 1.4.x, Impala, Oozie, Cloudera CDH 5.13.5, Flume 1.6.0, Python 2.7, Java 1.7, Maven, Git, Jenkins
2012 - 2016
Maharana Pratap College of Technology
CGPA: 8.1
Developed a food-ordering website using Java Spring and Hibernate with an MVC architecture, and created a project for controlling streetlights remotely.
2000 - 2012
Nehru Higher Secondary School
Percentage: 84.4%
Nehru Higher Secondary School
Percentage: 82.3%