I’m Kamil, currently working at Accenture Poland as a Data Engineer/Big Data Developer.
During my bachelor’s studies, I realized that I wanted to do something other than work in a factory as an automation control engineer. My university had just launched a new Data Science course taught in English, so I chose it for my master’s studies. During the course, I learned a lot about statistics, data and image processing, classifiers, big data environments, and data visualization.
All of my Python projects were done outside of my studies (you can see them on my website). In them, I had the chance to use an SQLite database, apply NLP to identify drug interactions from the summary of product characteristics, and predict the seasonality of sales. Additionally, I created my own customer segmentation model that compares two groups and indicates which one should be targeted. For that model, prediction requests are sent in JSON format using Postman.
- Python
- GCP
- Data Science
- PySpark
- SQL
- Professional Scrum Master (PSM-1) (https://www.scrum.org/user/833957)
- Microsoft Certified: Data Analyst Associate (DA-100: Analyzing Data with Microsoft Power BI) (https://docs.microsoft.com/en-gb/learn/certifications/data-analyst-associate/)
Project | Description | Language |
---|---|---|
Identification of drug interactions from the summary of product characteristics | The website (in Polish) processes the SmPC (Summary of Product Characteristics) to find interactions between substances. Every medicine authorized in Poland has an SmPC, which includes the section "Interactions with other medicinal products and other forms of interaction". Based on this section, I have tried to extract the names of substances that interact with the product. The list of substances is taken from the Register of Medicinal Products, which contains links to the SmPC of each medicine and the names of the active substances in foreign languages (English/Latin). The whole site is deployed using the Streamlit library. NLP processing is done with the spaCy library, and TheFuzz is used to compare the names of medicinal substances. | Python |
Prediction model of sales in alcohol stores by using the Prophet | A prediction model built with Prophet to indicate whether credit could be granted to given stores. For each store, information about its alcohol sales revenue is available. Clustering was used to find stores with similar attributes. | Python |
Forecasting of sales | The aim was to create a model that predicts sales for the next three weeks. Currently, the sales forecast is set three weeks ahead based on last week’s sales. The Weighted Absolute Percent Error (WAPE) is used as the comparison metric. The dataset consists of 3 CSV files. | Python |
Predicting profitable customer segments | Models for customer segmentation that compare two groups and indicate which one should be targeted. Five different models were built using different approaches. One of them (GradientBoostingClassifier) predicts whether the campaign should be launched for one of the groups or for neither; requests to it are sent in JSON format using Postman. | Python |
Stock price: jumping out and in of dividend stocks around ex dividend dates | The payment of dividends by a company is an attractive morsel for a shareholder. Such a company is better perceived because of its attractiveness and its willingness to share its profits with investors. But is it always profitable to own shares when dividends are paid? In this note, I will try to answer this question. For the analysis, I have chosen companies that regularly pay dividends on the Polish stock exchange. | Python |
NFZ API | The website allows for comprehensive data analysis, with emphasis on sales for given branches of the NFZ. It also includes a year-to-year analysis and an analysis of how age groups break down by attribute. | R |
Hotel reservation | This project demonstrates how a hotel reservation system can be built with R packages. The website includes a hotel gallery and a reservation system. Reservations can also be cancelled without contacting the hotel reception (using an ID and PIN). Once a booking is made, the Shiny app generates a QR code that can be scanned with a banking app to make the payment. | R |
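
The WAPE metric mentioned in the Forecasting of sales project can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `wape` and the sample numbers are assumptions, and it uses the common definition of WAPE, the sum of absolute errors divided by the sum of absolute actual values.

```python
def wape(actual, forecast):
    """Weighted Absolute Percent Error: sum(|a - f|) / sum(|a|).

    Hypothetical helper for illustration; returns a fraction (0.05 means 5%).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return total_error / total_actual


# Made-up weekly sales for three weeks (not data from the project).
actual_sales = [100, 120, 80]
# Baseline: repeat last week's sales (assumed here to be 110) for every week.
baseline_forecast = [110, 110, 110]
# Forecast from some model, also made up.
model_forecast = [105, 115, 85]

baseline_wape = wape(actual_sales, baseline_forecast)
model_wape = wape(actual_sales, model_forecast)
print(f"baseline WAPE: {baseline_wape:.3f}, model WAPE: {model_wape:.3f}")
```

A lower WAPE means the forecast is closer to the actual sales, so the two values can be compared directly when judging a model against the last-week baseline.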