, US
Phone: xxx-xxx-xxxx
Email: xxx@xxxx.xxx
Looking For: Senior Data Engineer
Occupation: IT and Math
Degree: Bachelor's Degree
Career Level: Experienced
Languages:
Highlights:
Skills: Python, Spark, SQL, AWS, Airflow, Databricks
Senior Data Engineer 09/2022 - current
S&P Global, Hyderabad, Telangana, India
Industry: Finance
Data engineer with 8+ years of experience designing, building, and maintaining large-scale
data pipelines and infrastructure.
• Strong understanding of data warehousing principles and experience working with distributed data systems
such as Hadoop and Spark.
S&P Global, Senior Data Engineer Sep 2022 – present | Hyderabad, India
• Extensive experience designing, building, and maintaining large-scale data pipelines and infrastructure
• Experience designing and deploying data pipelines using Databricks, including using its notebook interface
and managing jobs, clusters, and the workspace
• Experience designing and implementing data workflows using Apache Airflow, including creating custom
operators and designing DAGs
• Strong knowledge of AWS services such as S3, EC2, RDS, and Redshift, and experience implementing data
pipelines utilizing these services
• Extensive implementation knowledge of the various pipeline layers used to forecast future oil supply
• Experience working with Delta Lake, including designing and implementing data lake architecture, managing
data lake storage and access, and optimizing data lake performance
• Strong understanding of Delta Lake's unique features such as schema validation, time travel, and data
versioning
• Experience with Delta Lake's integration with other big data systems, such as Databricks
• Experience with Delta Lake's data access features, including reading and writing data using Spark SQL and
Python DataFrame API
• Strong experience using the REST API to trigger Airflow DAGs and integrating them with Databricks notebooks
Data Engineer 08/2021 - 09/2022
VMware, Bangalore, India
Industry: Computer Software
VMware, Data Engineer Aug 2021 – Sep 2022 | Bangalore
• Managed a process re-engineering project to improve and consolidate end-to-end service processes;
restructured data flow among 10 departments and cut down time by 75%
• Maintained data pipeline uptime of 99.8% while ingesting transactional data across 8 different primary
sources using Spark, Hive, Presto, and Python
• Used Python for data pipelines and transformations; ingested data from multiple source systems (SAP MDG,
Salesforce, etc.) into Presto and automated ETL processes across billions of rows of data, reducing manual
workload
• Developed HQL scripts and complex SQL logic to build the data pipeline
• Pioneered an advanced data visualization dashboard in Tableau
Specialist - Technical Analytics 12/2019 - 08/2021
Novartis Health Care, India
Provided analytical support to Novartis internal customers (SCM) through analytical reports.
• Expanded data visualization dashboards in Qlik Sense/Tableau for the entire supply chain management function,
saving nearly $400k as part of continuous improvement
• Analyzed data quality issues, their trends, and their impact on the business through root cause analysis.
• Created and published monthly/weekly reports and KPI analysis.
• Automated data extracts from the Oracle database using Python scripting and R.
Systems Engineer 02/2016 - 12/2019
Tata Consultancy Services, Hyderabad, India
Tata Consultancy Services, Systems Engineer Feb 2016 – Dec 2019 | Hyderabad, India
Client: Honeywell, USA
• Worked with various source systems such as JDE, PeopleSoft, SQL, and Oracle to extract relevant data and
migrate it into the organization's central repository (SAP ERP)
• Developed ETL mappings/workflows using Informatica, where data is transformed, validated, and loaded
into Ivalua.
• Developed deduplication reports in Informatica Developer (IDQ) based on data from SAP to prevent
duplicate records from being loaded into SAP.
• Expanded and developed the data quality dashboard across various entities
Client: Lennar, USA
• Sales Warehouse Environment (SWE) is a group at Lennar that provides analytical data to various
business groups.
• This data helps the business community grow sales in various parts of the United States.
• Enhanced the existing application to provide more functionality.
• Developed ETL data warehouse loads using SCD Type 1, Type 2, and Type 3.
• Built complex SQL queries, procedures, and functions.
• Optimized existing long-running processes to run faster using the bulk loader, hints, and limits on
the bulk loader