Piotr Bielak

Graph Machine Learning Researcher & Engineer

Wrocław University of Science and Technology

About me

I am a graph machine learning specialist with a recently completed PhD and over 4 years of industry experience. My research focuses on unsupervised and self-supervised graph representation learning, and my work has received over 100 citations. As an author of both conference and journal articles, I’ve introduced several methods in this field, such as GBT, AttrE2vec, and FILDNE. I am a dedicated Python enthusiast and a well-rounded practitioner, proficient in full-stack machine learning development, including DevOps/MLOps, as well as model implementation and evaluation.

Download my resumé.

Interests
  • Graph Representation Learning
  • Unsupervised Learning
  • Self-supervised Learning
  • Attributed Graphs
Education
  • PhD in Computer Science (Machine Learning), 2019-2023

    Wrocław University of Science and Technology

  • MSc in Computer Science (Data Science specialization), 2018-2019

    Wrocław University of Science and Technology

  • BEng in Computer Science, 2014-2018

    Wrocław University of Science and Technology

Publications

Retrofitting structural graph embeddings with node attribute information

Representation learning for graphs has attracted increasing attention in recent years. In this paper, we define and study a new problem of learning attributed graph embeddings. Our setting considers how to update existing node representations from structural graph embedding methods when some additional node attributes are given. To this end, we propose Graph Embedding RetroFitting (GERF), a method that delivers a compound node embedding that follows both the graph structure and attribute space similarity. Unlike other attributed graph embedding methods, GERF does not require recalculating the embedding from scratch; instead, it takes existing embeddings and retrofits them according to neighborhoods defined by the graph structure and the node attribute space. Moreover, our approach keeps the same embedding space throughout, which allows comparing the positions of embedding vectors and quantifying the impact of attributes on the representation update. GERF updates embedding vectors by optimizing the invariance loss, the graph neighbor loss, and the attribute neighbor loss to obtain high-quality embeddings. Experiments on the WikiCS, Amazon-CS, Amazon-Photo, and Coauthor-CS datasets demonstrate that our proposed algorithm achieves results similar to those of other state-of-the-art attributed graph embedding models despite working in a retrofitting manner.
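
The three loss terms named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the squared Euclidean distance, the averaging over neighbors, the weights lambda_g and lambda_x, and all function names are assumptions made only for illustration. It shows how an invariance term (stay close to the original structural embedding) might be combined with graph neighbor and attribute neighbor terms (stay close to both kinds of neighbors).

```python
# Illustrative sketch only; names, distances and weights are hypothetical,
# not the authors' code.

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors given as lists."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_dist_to_neighbors(z, i, neighbors):
    """Average squared distance from node i to its neighbors (0.0 if none)."""
    if not neighbors:
        return 0.0
    return sum(sq_dist(z[i], z[j]) for j in neighbors) / len(neighbors)


def retrofit_loss(z, z_orig, graph_nbrs, attr_nbrs, lambda_g=0.25, lambda_x=0.25):
    """Weighted sum of invariance, graph neighbor and attribute neighbor terms."""
    total = 0.0
    for i in range(len(z)):
        # Invariance: keep the updated vector near the original embedding.
        invariance = sq_dist(z[i], z_orig[i])
        # Graph neighbor term: stay close to neighbors in the graph structure.
        graph_term = mean_dist_to_neighbors(z, i, graph_nbrs[i])
        # Attribute neighbor term: stay close to neighbors in attribute space.
        attr_term = mean_dist_to_neighbors(z, i, attr_nbrs[i])
        total += ((1.0 - lambda_g - lambda_x) * invariance
                  + lambda_g * graph_term
                  + lambda_x * attr_term)
    return total


# Toy example: three nodes with 2-dimensional embeddings.
z_orig = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
graph_nbrs = [[1], [0], []]   # neighbors by graph structure (assumed)
attr_nbrs = [[2], [], [0]]    # neighbors by attribute similarity (assumed)
loss = retrofit_loss(z_orig, z_orig, graph_nbrs, attr_nbrs)
print(loss)
```

In such a setup, minimizing the loss with respect to z (for example by gradient descent) would move each vector toward its neighbors while the invariance term keeps it in the original embedding space.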

Experience
 
Visiting scholar
University of Notre Dame
Aug 2023 – Sep 2023 Notre Dame, IN, USA

During the research visit at the Lucy Institute for Data and Society (Prof. Nitesh Chawla), two projects were undertaken: (1) building representations of neural networks based on their weights and training dynamics (weight-space models), and (2) developing a novel self-supervised graph representation learning method founded on the Joint Embedding Predictive Architecture. Responsibilities spanned the full research stack, i.e., problem definition, model implementation, and experimental evaluation.

Tech stack: Python, PyTorch, DVC, Scikit-learn, Hydra, PyTorch-Geometric, GNN
 
ML Ops Developer
Sep 2022 – Jun 2023 Wrocław

Development of machine learning solutions tailored for debt collection processes. This involved the creation and deployment of predictive models, along with automation of data pipelines to enhance the efficiency and effectiveness of debt collection operations.

Tech stack: Python, PyTorch, DVC, Scikit-learn, Pandas, XGBoost
 
Senior Machine Learning Developer
Sep 2020 – Dec 2022 Wrocław

Development of machine learning-based recommendation solutions for company-company interactions. The role encompassed comprehensive responsibilities throughout the entire project pipeline, i.e., from the initial data preprocessing and feature engineering stages (text representations and graph building), through model development (GNN and recommendation) to the final deployment of these models, ensuring that the recommendations were fine-tuned for maximum effectiveness and tailored to the specific needs of a company.

Tech stack: Python, PyTorch, DVC, Pandas, Jupyter, PyTorch-Geometric, GNN, Scikit-learn, Sentence-Transformers, Docker, Google Cloud, MLflow, Weaviate, Airflow, Streamlit
 
Machine Learning Developer
Jun 2019 – Aug 2020 Wrocław

Development of a user behavior prediction model based on clickstream data, using gradient-boosted tree classification. Shared responsibilities across the full pipeline, from data cleaning and feature extraction to model training and evaluation, as well as demo preparation and a production-level PoC implementation.

Tech stack: Python, PyTorch, DVC, AWS, Docker, Jupyter, XGBoost, Redis
 
Machine Learning Developer
Jun 2019 – Dec 2019 Wrocław

Development of a model for predicting overdue financial transactions based on a transaction graph. Contributed to several stages of the project pipeline, with responsibilities in data preprocessing, feature extraction, model training, and evaluation.

Tech stack: Python, PyTorch, DVC, Docker, Jupyter, FeatureTools, NumPy, GNN, XGBoost
 
Research Assistant
Jan 2019 – Present Wrocław

Recently finished Ph.D. studies at the Department of Artificial Intelligence were accompanied by research in various areas, with a primary focus on graph representation learning, complemented by expertise in self-supervised and unsupervised learning. Responsible for leading a research group dedicated to graph representation learning. Additional practical experience in teaching, including active involvement in the development of educational materials for the Artificial Intelligence master’s degree program.

Research areas: graph representation learning, self-supervised learning

Teaching: Probabilistic Machine Learning, Representation Learning, Large-Scale Data Processing
 
Junior DevOps
Jan 2018 – Dec 2018 Wrocław

As a member of the Public Cloud team, responsibilities encompassed the maintenance and development of the OpenStack cloud infrastructure. Key tasks included implementing test automation using Jenkins and Gerrit to streamline the testing process. A notable achievement was presenting “From messy XML to wonderful YAML and pretty JobDSL – an in-Jenkins migration story” at the OpenStack Summit Berlin 2018.

Tech stack: Python, Bash, OpenStack, Jenkins, Gerrit
 
Software Developer Intern
Jul 2017 – Sep 2017 Gdańsk

The role involved research and development of a machine learning-based resource manager designed for modern cluster schedulers. The work applied state-of-the-art reinforcement learning to resource allocation, with the goal of improving the efficiency and scalability of cluster management systems. Responsibilities spanned the entire project pipeline, including environment preparation, model implementation, and result analysis.

Tech stack: Python, TensorFlow, Keras, Reinforcement Learning
 
Junior Java & JavaScript Developer
Nov 2016 – Mar 2017 Wrocław

Full-stack web development of an application dedicated to staff room allocation. Responsibilities covered both the frontend and the backend of the project, including the creation of interactive room maps and the development of the backend REST API, as well as bug fixing and the implementation of new features.

Tech stack: Java, Spring Boot, MongoDB, AngularJS
 
QA Test Automation Engineer
Feb 2016 – Oct 2016 Wrocław

Automation of integration tests for a management application for logistics companies. Responsibilities encompassed various aspects, including specification analysis, defect reporting, and the creation and review of test scripts. Additionally, a secondary project was undertaken involving the development of a test script crawler and result analyzer in Python, contributing to more efficient testing processes and quality assurance.

Tech stack: Python, BeautifulSoup

Contact