Led migration from a legacy U-SQL-based stack to a new contract-driven, Spark-based workload running in Azure Synapse and integrated with Azure Purview. The new platform aggregates, transforms, and delivers 2+ TB of financial data monthly, representing $110 billion in revenue. It delivers an average 8x improvement in end-to-end processing time at one-fifth the cost of the old platform.
Worked on a big data engineering team to move an existing big data platform from the commercial space to a secure federal environment.
Lead Developer on a Microsoft-funded POC for the Air Force Test Center, demonstrating how modern technologies such as Azure Storage, Azure Data Lake, Data Factory, and Jupyter notebooks can streamline processes and enable access to and analytics on traditionally siloed data across test wings.
Technical Team Lead for a Data Integration Platform that allows data to be uploaded into a secure cloud environment and run through automated processing pipelines for visualization in Power BI.
Software consulting in web and mobile technologies
A Spark utility for generating indexes in a data lake to reduce the amount of data read when joining across massive datasets. Useful when data is poorly partitioned or not partitioned at all and cannot be moved, or when duplicating it is too costly. Supports indexes where a single file can contain multiple values, a case that a more typical in-place Iceberg setup would not handle.
Dayton, OH, US
Bachelor’s degree in computer science
Learn to develop software effectively and efficiently and to analyze programs in order to create packages that meet specifications, as well as develop familiarity with various packages used in the real world.
June 2019 – June 2022
March 2021 – March 2023