Vivek Kovvuru

Senior Data Engineer

I'm a Data Engineer with over 10 years of experience across every stage of the data journey, from building robust pipelines to advanced analytics, visualization, and integrating AI and large language models. I enjoy experimenting with the modern data stack and sharing what I learn with the data community.

Cloud Architecture | Data Engineering | ETL & Orchestration | AI Agents

Skill Dashboard

Technical Skill Matrix

Focus Areas for Data Engineering

KPIs

Tools & Platforms

Python
SQL
Databricks
dbt
AWS
Azure
GCP
Snowflake
Tableau
Power BI
Airflow
Kibana
Grafana
Splunk
ADF
Redshift
Docker
Kubernetes
Jenkins
Azure DevOps

Professional Journey

Lead Data Engineer at HP Inc.

April 2022 – Present

  • Manage a portfolio of 25+ data products and 95+ data pipelines, ensuring scalability and efficiency.
  • Led migration from AWS Redshift to Databricks Lakehouse.
  • Optimized a 650TB Delta table, improving capacity by 40% and performance by 30%.
  • Achieved over $300K in annual cost savings through compression and system rearchitecture.
  • Enhanced data observability with a custom KPI model using Atlan and Unity Catalog.
  • Championed data privacy, security policies, and data deletion frameworks.
  • Drove innovation with proofs-of-concept and rearchitected pipelines with Delta Live Tables.
  • Mentored team members to foster growth.

Senior Data Engineer at HP Inc.

July 2019 – April 2022

  • Built scalable data pipelines using Python and Apache Spark (processing 5TB daily).
  • Implemented automated ETL frameworks, data quality checks, and error handling (improving speed by 15%).
  • Set up DAGs with Apache Airflow and deployed CI/CD pipelines with Jenkins.
  • Managed Delta Lake and Redshift table operations, boosting processing speed by 25%.

Cloud Data Engineer at AOL Inc. (Verizon Digital Media)

August 2016 – July 2019

  • Migrated 7TB of on-premises data to a cloud-based platform.
  • Administered AWS services including S3, SES, and DynamoDB.
  • Automated processes using AWS Lambda and CloudWatch.

Experiments

Tool Exploration: Serverless Data Pipelines

A scalable pipeline leveraging Databricks Serverless Compute for real-time analytics.

ETL Automation Suite

An automated ETL framework built with Python, using Azure DevOps for CI/CD.

Real-Time Data Monitoring

An experiment in creating a real-time data monitoring dashboard using Grafana, Lakehouse, and custom data pipelines.

Connect With Me