27 Nov

BASF Hackathon “Coding Chemistry”

Team ButterPy

Together with my team, I won the challenge “Beyond big data – Control the butterfly effect” at a hackathon organized by the chemical company BASF. Over 24 hours we developed Python code to combine information from various data sources, then analyzed the combined data to improve the chemical process and identify potential sources of error in production. The process was fun, but integrating several rather unstructured data sources into one data set was very challenging.

Our main finding was that, besides the obvious parameters (quality of the starting products, production cycle, etc.), external factors such as weekday and weather can have a substantial effect on the quantity and quality of the final product.
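To give a rough idea of what combining such sources can look like, here is a minimal sketch in plain Python. It is not our hackathon code: the record layout, the field names (`day`, `yield_kg`, `temp_c`) and all values are made up for illustration. It joins weather readings onto production batches by date, derives the weekday, and averages the yield per weekday.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical records: field names and values are invented for illustration.
batches = [
    {"day": "2017-11-06", "yield_kg": 101.0},
    {"day": "2017-11-07", "yield_kg": 99.0},
    {"day": "2017-11-13", "yield_kg": 103.0},
    {"day": "2017-11-14", "yield_kg": 97.0},
]
weather = [
    {"day": "2017-11-06", "temp_c": 8.0},
    {"day": "2017-11-07", "temp_c": 5.0},
    {"day": "2017-11-13", "temp_c": 9.0},
]


def merge_by_day(batches, weather):
    """Left-join weather onto batches by date and add the weekday name."""
    weather_by_day = {row["day"]: row for row in weather}
    merged = []
    for batch in batches:
        row = dict(batch)
        match = weather_by_day.get(batch["day"])
        # Keep batches without a weather reading; mark the gap explicitly.
        row["temp_c"] = match["temp_c"] if match else None
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        row["weekday"] = WEEKDAYS[date.fromisoformat(batch["day"]).weekday()]
        merged.append(row)
    return merged


def mean_yield_by_weekday(rows):
    """Average the yield of all batches that share a weekday."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["weekday"]].append(row["yield_kg"])
    return {weekday: mean(values) for weekday, values in groups.items()}


merged = merge_by_day(batches, weather)
summary = mean_yield_by_weekday(merged)
```

Keeping batches that have no matching weather reading (a left join) avoids silently dropping production data when one of the sources has gaps.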

 

26 Oct

Exploratory data analysis of loan data

In this project for the Udacity Data Analyst Nanodegree, I explored loan data from Prosper, a US-based lending platform. The data set contains 113,937 loans and 84 variables. The objectives of the analysis were to determine (1) the relationships between the variables of interest and (2) how the interest rates for individual loans can be predicted from the available data. Using R, I examined the data with a wide range of exploratory plots and linear regression analysis to determine the factors that influence interest rates of consumer loans in the US. The complete report, the data and the R code can be found in this GitHub repository.
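As a small illustration of the regression idea, here is a sketch of a one-variable least-squares fit. The actual analysis was done in R; this is Python, and the predictor (`credit_score`) and all numbers are made-up toy values, not taken from the Prosper data set.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up toy values for illustration only.
credit_score = [600, 650, 700, 750, 800]
interest_rate = [0.30, 0.25, 0.20, 0.15, 0.10]

slope, intercept = fit_line(credit_score, interest_rate)

# Predicted interest rate for a hypothetical borrower with a score of 720.
predicted = intercept + slope * 720
```

A real model for this data would use several predictors at once; the single-variable case only shows the mechanics.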
