Description
During the summer of 2023, I interned at Capital One on the auto loan valuation team in Dallas, Texas. Without going into detail, the team uses customer data to train machine learning models that calculate tailored auto loan interest rates.
My responsibilities included the following:
-Introducing Kubeflow functionality to the auto loan valuation team's PySpark data processing pipelines
-Parallelizing the data processing build pipeline, cutting fitting time by about 47% (60 hrs to 32 hrs)
-Creating new PySpark transformers for accurate inflation modeling in the time-dependent data processing pipeline
-Refactoring scikit-learn transformers into PySpark to improve runtime efficiency on big data
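
As a quick check of the runtime figures in the list above, the short Python sketch below works out the relative reduction in fitting time and the equivalent speedup factor from the 60 hr and 32 hr numbers. The variable names are my own and chosen only for illustration; nothing here reflects the actual pipeline code.

```python
# Runtime figures taken from the list above; everything else is derived.
hours_before = 60
hours_after = 32

# Fraction of the original runtime that was eliminated.
reduction = (hours_before - hours_after) / hours_before

# Equivalent speedup factor (old runtime divided by new runtime).
speedup = hours_before / hours_after

print(f"runtime reduction: {reduction:.1%}")
print(f"speedup: {speedup:.3f}x")
```

The reduction comes out to roughly 47% of the original runtime, which corresponds to a speedup of a little under 2x.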