Edwardsville IL, US
Phone: xxx-xxx-xxxx
Email: xxx@xxxx.xxx
Looking For: Data Engineer, Data Analyst
Occupation: Business and Finance
Degree: Master's Degree
Career Level: Experienced
Data Analytics Engineer 02/2024 - current
Southern Illinois University Edwardsville, Edwardsville, IL United States
Experienced Data Analytics Engineer with 5+ years of experience focused on the Microsoft analytics ecosystem, specializing in the design, implementation, and optimization of data warehousing, data lakehouse, and ETL processes. Proficient in developing sophisticated data models, visualizations, and analytics solutions; adept at collaborating with cross-functional teams to drive informed decision-making.
• Implemented Microsoft Fabric from scratch, integrating Azure data services to support comprehensive data analytics projects. Led ETL processes for data extraction, transformation, and loading into the data warehouse, ensuring data accuracy and completeness.
• Gathered requirements for new projects and created data flow models of the business requirements.
• Created and reviewed functional requirement specifications and supporting documents for business systems; experienced in database design and data modeling processes.
• Led the implementation of Medallion architecture in creating and developing the Lakehouse and Data Warehouse in Microsoft Fabric.
• Connected to various data sources such as Oracle and MySQL.
• Applied data science techniques, including regression analysis and machine learning (RandomForestRegressor), to predict student retention and enrollment trends.
• Managed data extraction, transformation, and loading (ETL) using SQL and Azure Data Factory, improving data accuracy by 20%.
• Performed dimensional modeling with Erwin, and designed and built a relational data mart. Developed database objects such as stored procedures, indexes, and views in SQL Server.
• Developed and maintained analytical dashboards and reports in Power BI, enhancing decision-making processes across the university. Collaborated with university departments to understand their data needs, contributing to a 35% increase in data-driven decisions.
Data Analyst - Graduate Assistant 06/2022 - 12/2023
Southern Illinois University Edwardsville, Edwardsville, IL United States
Led the design and development of visually appealing analytical dashboards, KPI scorecards, and reports using Microsoft Power BI, Excel, and Access, which enhanced data visibility and resulted in a 20% increase in timely decision-making based on enrollment numbers and admission trends.
• Utilized SQL and data management techniques to extract, transform, and load data from university databases. Improved data accuracy by 15% through data cleansing and transformation processes.
• Collaborated with stakeholders to define data requirements and specifications, and built strong relationships that fostered a data-driven culture, leading to a 30% increase in data-driven decisions.
• Developed and reviewed SQL queries with join clauses in Power BI Desktop to validate static and dynamic data.
• Used Power Query in Power BI to extract data from sources and shape it for reports.
• Utilized Power BI gateway to keep dashboards and reports up to date with on-premises data sources. Planned secure data extraction into Power BI from sources such as SQL Server.
• Ensured timely updating of dashboards and reports with on-premises data sources by configuring Power BI gateway.
• Wrote complex DAX expressions to generate calculated columns and filters for role-based security.
Senior Data Engineer 06/2021 - 01/2022
Tata Consultancy Services, Hyderabad, Telangana India
• Architected and implemented medium to large scale BI solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Data Lake Analytics, Stream Analytics, Azure SQL DW, Azure Databricks, NoSQL DB). Created and implemented a data lake architecture to seamlessly ingest raw data from various sources into Azure Data Lake using ETL services such as Azure Data Factory.
• Performed ETL on data from different source systems to Azure Data Storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and PySpark. Ingested data into one or more Azure services (Azure Data Lake Gen2, Azure Storage, Synapse SQL pools) and processed the data in Azure Databricks.
• Managed a team of 6 members in Agile software design and development of a complete BI ETL and Reporting platform for generating and distributing priority reports.
• Designed and built automated Azure cloud services for processing and storing data on a daily basis in Azure Data Lake with the help of Azure Databricks Spark, which made 60 TB of structured data available for data science projects.
• Converted existing workflows into scalable, fault-tolerant Azure Data Factory pipelines for ingesting large volumes of data. Migrated the existing reporting product to an open-source platform. Built and maintained more than 60 ETL jobs.
• Managed the full lifecycle of on-premises to cloud data migration, using Azure Data Factory and T-SQL for transformation, loading data into Power BI fact and dimension tables with incremental refresh, and establishing pipelines with T-SQL logic and Azure Logic Apps for handling failures and sending notifications. Collaborated with DevOps engineers to develop automated CI/CD and test-driven development pipelines on Azure per client requirements.
• Developed conceptual solutions and created proofs of concept to demonstrate the viability of solutions.
• Developed streaming pipelines in Databricks using Azure Event Hub, handling live complex JSON data, processed via PySpark notebooks or Scala code, and operationalized those Spark streaming jobs.
• Implemented continuous monitoring of the Spark Cluster using Log Analytics, enhancing cluster stability.
• Leveraged Azure Synapse for workload management, facilitating data delivery for analytics and business intelligence purposes.
• Collaborated with different teams, including data scientists, architects, analysts, and business partners, to ensure data quality and accessibility.
Data Engineer 06/2019 - 06/2021
Tata Consultancy Services, Hyderabad, Telangana India
Led dashboard development and analytics initiatives, including data preparation by extracting data from sources such as AWS and SQL Server. Used Python to build predictive models and Tableau to develop the presentation layer of customer-facing dashboards, ensuring data integrity and improving data quality, resulting in a 15% enhancement in data reliability.
• Crafted compelling data visualizations using Tableau and Power BI, enhancing data-driven storytelling, and achieving a 10% rise in stakeholder engagement.
• Conducted advanced statistical analyses, driving data-driven decision-making, and contributing to a 12% increase in operational efficiency.
• Worked on Azure Data Factory pipelines: automating different cases, handling pipeline failures, optimizing tables, introducing automation for data validation, and enhancing pipeline utility with custom triggers for stored procedures.
• Developed automated pipelines to deploy 30+ KPIs for the Supply Chain Customer Service hub, such as Fill Rate and its associated metrics, including Overall Cuts in Dollars and Cases, In Yard/In House Cuts, On-Time Delivery, Unit Accuracy, and Space Utilization.
• Implemented and managed a data pipeline using Airflow to ensure smooth and efficient data processing, while enhancing data quality by 15% through vigilant monitoring and troubleshooting of pipeline failures and inconsistencies.
• Employed Azure Data Factory and Azure Synapse to craft data flows, conducted various transformations leveraging Azure Databricks (PySpark, SQL), and efficiently loaded the processed data into Azure Synapse.
• Wrote complex SQL procedures and functions in MSSQL for building analytical reports.
• Automated Azure Data Factory pipeline for data validation, reducing manual effort and ensuring data accuracy with a 99.9% success rate. Designed Spark streaming pipelines integrating Azure Event Hub, merging both batch and streaming functionalities seamlessly. Interacted with Business Analysts, Users, and SMEs on requirements.
• Implemented data ingestion from sources such as HTTP endpoints and Azure Blob Storage into Azure Data Lake Storage Gen2 (ADLS Gen2) through Azure Data Factory (ADF).
• Utilized tools such as Azure SQL Server, Data Factory, and Databricks to construct end-to-end data pipelines for collecting, cleansing, and processing client data.
• Constructed Directed Acyclic Graphs (DAGs) in Apache Airflow for scheduling ETL processes, integrating Apache Airflow components like pools, executors, and multi-node capability to enhance workflow efficiency.
Southern Illinois University Edwardsville 01/2022 - 12/2023
Edwardsville, IL, United States
Degree: Master's Degree
Major: Management Information Systems
(GPA – 3.75)
• Project Management Fundamentals and Best Practices
• Project Procurement and Risk Management in Projects
• Project Management Standard Processes
• Information Systems and Technology
• Enterprise Resource Planning (SAP)
• Database Design
• Software Systems Design
• Quantitative Analysis