
Vivek Kovvuru
Senior Data Engineer
I'm a Data Engineer with over 10 years of experience, passionate about every stage of the data journey, from building robust pipelines to advanced analytics, visualization, and integrating cutting-edge AI and large language models (LLMs). I love experimenting with the modern data stack and sharing my insights with the data community.
Cloud Architecture
Data Engineering
ETL & Orchestration
AI Agents
Skill Dashboard
Technical Skill Matrix
Focus Areas for Data Engineering
KPIs
Tools & Platforms
Python
SQL
AWS
Azure



Docker
Jenkins

Professional Journey
Lead Data Engineer at HP Inc.
April 2022 – Present
- Manage a portfolio of 25+ data products and 95+ data pipelines, ensuring scalability and efficiency.
- Led migration from Amazon Redshift to Databricks Lakehouse.
- Optimized a 650TB Delta table, improving capacity by 40% and performance by 30%.
- Achieved over $300K in annual cost savings through compression and system rearchitecture.
- Enhanced data observability with a custom KPI model using Atlan and Unity Catalog.
- Championed data privacy, security policies, and data deletion frameworks.
- Drove innovation with proofs-of-concept and rearchitected pipelines with Delta Live Tables.
- Mentored team members to foster growth.
Senior Data Engineer at HP Inc.
July 2019 – April 2022
- Built scalable data pipelines using Python and Apache Spark (processing 5TB daily).
- Implemented automated ETL frameworks, data quality checks, and error handling (improving speed by 15%).
- Set up DAGs with Apache Airflow and deployed CI/CD pipelines with Jenkins.
- Managed Delta Lake and Redshift table operations, boosting processing speed by 25%.
Cloud Data Engineer at AOL Inc. (Verizon Digital Media)
August 2016 – July 2019
- Migrated 7TB of on-premises data to a cloud-based platform.
- Administered AWS services including S3, SES, and DynamoDB.
- Automated processes using AWS Lambda and CloudWatch.
Experiments
Tool Exploration: Serverless Data Pipelines
A scalable pipeline leveraging Databricks Serverless Compute for real-time analytics.
ETL Automation Suite
An automated ETL framework built with Python and Azure DevOps for CI/CD.
Real-Time Data Monitoring
An experiment in creating a real-time data monitoring dashboard using Grafana, Lakehouse, and custom data pipelines.