On the crossroads

Jacek Grela

Diagnosis of Alzheimer’s disease

[github]

tech stack: scikit-learn, lightgbm

A feasibility study of fractal-aware features for classifying Alzheimer's disease. Features are extracted from MRI scans, and three types of classifiers are investigated:

  1. k-nearest neighbor
  2. support vector classifier
  3. gradient-boosted decision trees
Training uses a nested cross-validation framework and a resampling procedure that compensates for the imbalanced dataset.
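
As a rough illustration of how nested cross-validation and resampling can fit together, here is a minimal sketch. It is not the project's actual code: the synthetic data, the random-oversampling strategy, the k-NN grid `k_grid`, the balanced-accuracy metric and the helper names `oversample`, `fit_and_score` and `nested_cv` are all assumptions made for the example. The key point is that resampling is applied to training splits only, so the held-out folds stay untouched.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.utils import resample


def oversample(X, y, seed=0):
    """Duplicate random rows of smaller classes until every class matches the largest."""
    classes, counts = np.unique(y, return_counts=True)
    X_out, y_out = [X], [y]
    for cls, count in zip(classes, counts):
        missing = counts.max() - count
        if missing > 0:
            extra = resample(X[y == cls], replace=True,
                             n_samples=missing, random_state=seed)
            X_out.append(extra)
            y_out.append(np.full(missing, cls))
    return np.vstack(X_out), np.concatenate(y_out)


def fit_and_score(k, X_tr, y_tr, X_te, y_te):
    """Fit k-NN on a resampled training split, score on the untouched test split."""
    X_bal, y_bal = oversample(X_tr, y_tr)
    model = KNeighborsClassifier(n_neighbors=k).fit(X_bal, y_bal)
    return balanced_accuracy_score(y_te, model.predict(X_te))


def nested_cv(X, y, k_grid=(3, 5, 9), n_outer=5, n_inner=3, seed=0):
    """Outer folds estimate performance; inner folds pick the hyperparameter."""
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed)
    inner = StratifiedKFold(n_splits=n_inner, shuffle=True, random_state=seed)
    outer_scores = []
    for tr, te in outer.split(X, y):
        X_tr, y_tr = X[tr], y[tr]
        # Inner loop: choose k using only the outer-training data.
        mean_inner = {
            k: np.mean([fit_and_score(k, X_tr[a], y_tr[a], X_tr[b], y_tr[b])
                        for a, b in inner.split(X_tr, y_tr)])
            for k in k_grid
        }
        best_k = max(mean_inner, key=mean_inner.get)
        outer_scores.append(fit_and_score(best_k, X_tr, y_tr, X[te], y[te]))
    return np.array(outer_scores)


# Synthetic stand-in for MRI-derived features (roughly 85% / 15% class split).
X, y = make_classification(n_samples=300, n_features=10,
                           weights=[0.85], random_state=0)
scores = nested_cv(X, y)
print("balanced accuracy per outer fold:", scores.round(3))
```

The same loop would work with a support vector classifier or a LightGBM model in place of `KNeighborsClassifier`, with the grid adapted to that model's hyperparameters.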

MMX - Meme trends analyzer

[github] [backend API] [test frontend]

tech stack: docker, flask, tensorflow_lite

A tool for scraping memes and for AI-based analysis of meme clusters. It runs as a Docker-based API server written in Flask. mmx has 4 containers:

Both the test backend and the example frontend are available online.

Video captioning pipeline

[github] [blog]

tech stack: flask, torchserve, huggingface, asyncio

A tool for creating video captions in bulk. Uses the BLIP-2 + OPT img2txt model as the captioning module. Comes in two variants:

Both versions come with a few example scripts for use on the request side. The single-GPU version is simpler and suited to smaller tasks, while the multi-GPU version is more flexible and can work in both batch and stream modes.
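
To make the difference between batch and stream modes concrete, here is a small request-side sketch using asyncio. It does not reproduce the project's scripts: `caption_video` is a hypothetical stand-in for a call to the captioning server, and `caption_batch` and `caption_stream` are illustrative names. Batch mode waits for all captions and returns them together, whereas stream mode hands back each caption as soon as it is ready.

```python
import asyncio


async def caption_video(path: str) -> str:
    """Hypothetical stand-in for a request to the captioning server."""
    await asyncio.sleep(0.01)  # simulates network and GPU latency
    return f"caption for {path}"


async def caption_batch(paths):
    """Batch mode: wait for every caption, return them in input order."""
    return await asyncio.gather(*(caption_video(p) for p in paths))


async def caption_stream(paths):
    """Stream mode: yield each (path, caption) pair as soon as it is ready."""
    async def tagged(p):
        return p, await caption_video(p)

    tasks = [asyncio.create_task(tagged(p)) for p in paths]
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def main():
    paths = ["a.mp4", "b.mp4", "c.mp4"]
    batch = await caption_batch(paths)
    streamed = [pair async for pair in caption_stream(paths)]
    return batch, streamed


batch, streamed = asyncio.run(main())
print(batch)
print(streamed)
```

In stream mode the order of results depends on completion time, so each caption is paired with its path.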

Scraping and Exploratory Data Analysis of social media data

[github] [kaggle] [blog]

tech stack: beautifulsoup, pandas

A set of scripts for a large-scale scraping project. They were used to download extensive information from the social media website wykop.pl, which is available on Kaggle as the public dataset wykop-data-2022. The dataset consists of six months of links published on the site under the 500 most popular tags. Each link has information on its creator, popularity and status. The dataset was used to prepare a set of blog posts:

The analysis concluded that the following exist on the website: