About me
I am a Machine Learning Engineer and Data Scientist with 1.5 years of industry experience, applying Machine Learning and Deep Learning to complex real-world problems. My focus areas include Large Language Models (LLMs), Natural Language Processing (NLP), MLOps, and multimodal data analysis. I enjoy collaborating with diverse teams to build scalable solutions that support business growth. Outside of work, I take part in hackathons to challenge myself and learn from others. I am excited about the potential of data science to transform industries.
Languages: Python, Java, R, SQL, HTML, CSS, JavaScript (basics), Bash
Frameworks: Django, Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, NLTK, Keras, TensorFlow, PyTorch, transformers, MLflow, Airflow
Tools: Git, GitHub Actions, Jupyter Notebook, PyCharm, Visual Studio Code, Anaconda, Docker, Linux
Projects
This project, part of the Deep Learning course at Warsaw University of Technology, investigated various diffusion models for image generation. We implemented and tested three models—DDPM, DDIM, and PNDM—using the LSUN Bedroom dataset to generate 128x128 pixel images. We trained these models with PyTorch and the Hugging Face Diffusers library, comparing their performance over different epochs. Evaluations included FID scores and visual interpolation assessments. Comprehensive and reproducible results are detailed in the repository's README.
This project, part of the Deep Learning course at Warsaw University of Technology, developed custom RNN architectures, tested Whisper and the Audio Spectrogram Transformer, and evaluated the performance of all models. All models were trained with PyTorch on the Speech Commands dataset. The results are documented in the repository's README and are reproducible.
This project, part of the Deep Learning course at Warsaw University of Technology, developed custom CNN architectures and evaluated their performance against pretrained models. All models were trained with PyTorch on the CINIC-10 dataset. The results are documented in the repository's README and are reproducible.
This project, part of the Advanced Machine Learning course at Warsaw University of Technology, implemented Logistic Regression from scratch with three different optimizers: SGD, IRLS, and Adam. Their performance on a binary classification task was then evaluated alongside LDA, QDA, Decision Tree, and Random Forest. The results are documented in the repository's README and are reproducible.
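As a rough illustration of the from-scratch approach described above (not the repository's actual code), a minimal logistic regression trained with mini-batch SGD might look like the sketch below. The function names, hyperparameters, and synthetic data are all assumptions made for this example.

```python
import numpy as np


def sigmoid(z):
    # Clip the input so exp() does not overflow for large-magnitude values.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))


def fit_logistic_sgd(X, y, lr=0.1, epochs=200, batch_size=32, seed=0):
    """Fit weights and bias by mini-batch SGD on the mean log-loss."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        order = rng.permutation(n_samples)  # reshuffle every epoch
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            # Gradient of the log-loss with respect to the logits.
            error = sigmoid(X[idx] @ w + b) - y[idx]
            w -= lr * (X[idx].T @ error) / len(idx)
            b -= lr * error.mean()
    return w, b


def predict(X, w, b):
    return (sigmoid(X @ w + b) >= 0.5).astype(int)


# Synthetic, linearly separable toy data (hypothetical, for illustration only).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(400, 2))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
w_demo, b_demo = fit_logistic_sgd(X_demo, y_demo)
accuracy = (predict(X_demo, w_demo, b_demo) == y_demo).mean()
```

IRLS and Adam would replace only the two parameter-update lines. The gradient computation stays the same, which is what makes a side-by-side optimizer comparison straightforward.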
A Django web app that helps students manage their studies by providing a platform to create and manage tasks and projects. Each task can have a priority level. The app also includes a translator tool for notes and assignments in different languages, which can be useful for international students or those studying in a foreign language.
Two projects from my studies at Warsaw University of Technology. The first one is a classification task, where I had to predict the activity of a person based on the data from their phone. The second one is a clustering task, where I had to cluster documents based on their content.
Experience
December 2023 - March 2024: Junior Machine Learning Engineer - Grid Dynamics
- Utilizing acquired expertise in two client projects
- NLP project
  - Enhancing product recognition algorithms by refining regex pattern matching and adding algorithms based on Levenshtein distance and fuzzy search.
  - Working in a three-person team on collaborative problem-solving to optimize the product recognition algorithms.
  - Data cleaning and preprocessing using pandas
- Optimization project
  - Assisting in refactoring the code base and addressing performance issues on Databricks
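The Levenshtein-based fuzzy matching mentioned in the NLP project can be sketched as follows. This is a minimal standard-library illustration, not the client code: the function names, the `max_distance` threshold, and the sample catalog are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                   # deletion
                current[j - 1] + 1,                # insertion
                previous[j - 1] + (ch_a != ch_b),  # substitution
            ))
        previous = current
    return previous[-1]


def best_match(query, candidates, max_distance=2):
    """Return the closest candidate within max_distance, or None."""
    normalized = query.strip().lower()
    scored = [(levenshtein(normalized, c.lower()), c) for c in candidates]
    distance, match = min(scored)
    return match if distance <= max_distance else None


# Hypothetical product catalog, for illustration only.
catalog = ["espresso machine", "coffee grinder", "milk frother"]
print(best_match("expresso machine", catalog))
print(best_match("toaster", catalog))
```

The threshold trades recall for precision: a larger `max_distance` tolerates more typos but also accepts more wrong products, so in practice it would be tuned on labeled examples.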
June 2023 - December 2023: Machine Learning Engineer Intern - Grid Dynamics
- Acquiring proficiency in the following areas:
- Python
- SQL
- Flask and FastAPI
- Docker containerization
- Apache Airflow
- MLflow
- PySpark Data Processing
- Cloud Expertise (AWS, GCP)
- Jenkins & Terraform
- Achieved certification:
July 2022 - December 2022: NLP Intern - Samsung R&D
- Development of Bixby, working on automation tools for linguists
- Improving deep learning model performance in the NLU area (mainly TensorFlow/transformers; a significant 5% accuracy improvement on production data)
- Implementing features in an Android project written in Java
- Technical Skills: Python with TensorFlow and Hugging Face, Java (Android project), Linux tools, Scripting (Bash), Git, GitHub, GitHub Actions.
- Soft Skills: Teamwork, Time Management, Communication.