A tool for meme scraping and AI-based analysis of meme clustering. It runs as a Docker-based API server written in Flask. mmx has 4 containers:
- mmx:scrape - scrapes memes online and saves them to a MongoDB database
- mmx:feat_extract - continuously takes in the scraped memes and calculates a feature vector for each meme image, using an EfficientNetV2 model compressed with the tensorflow_lite package
- mmx:api - API server for user requests
- mmx:nginx - nginx reverse proxy used in the production-ready API deployment
Both a test backend and an example frontend are available online.
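As a rough illustration of how per-image feature vectors can feed a clustering step, the sketch below groups memes by cosine similarity. It is not taken from mmx: the function names `cosine_similarity` and `group_similar`, the threshold of 0.9, the greedy grouping strategy, and the three-element toy vectors are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def group_similar(features, threshold=0.9):
    """Greedy grouping (hypothetical, not the mmx method).

    Each meme joins the first group whose representative vector is at
    least `threshold` similar; otherwise it starts a new group.
    """
    groups = []  # list of (representative_vector, [meme_ids])
    for meme_id, vec in features.items():
        for rep, members in groups:
            if cosine_similarity(rep, vec) >= threshold:
                members.append(meme_id)
                break
        else:
            groups.append((vec, [meme_id]))
    return [members for _, members in groups]


# Toy vectors standing in for real model outputs (assumed data).
features = {
    "meme_a": [1.0, 0.0, 0.1],
    "meme_b": [0.9, 0.1, 0.1],
    "meme_c": [0.0, 1.0, 0.0],
}
clusters = group_similar(features)
print(clusters)
```

Real feature vectors are much longer than three elements, and a production setup would more likely use a vectorized library and a proper clustering algorithm; the greedy loop here only keeps the idea visible.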
A set of scripts for a large-scale dataset scraping project.
They were used to download extensive data from the social media website wykop.pl; the result is available on Kaggle as the public dataset wykop-data-2022.
The dataset covers 6 months of links published on the site under the 500 most popular tags.
Each link carries information on its creator, popularity and status. The dataset was used in preparation for a set of blog posts:
The analysis found the following on the website:
- time-resolved activity of some users that points towards bot-like activity
- a simple criterion for link promotion (see the plot above)
- cooperative user clusters that upvote/downvote the same links
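One possible way to look for cooperative voting is to compare the sets of votes cast by each pair of users. The sketch below uses Jaccard similarity for that; the blog posts' actual method is not described here, so the function names `jaccard` and `co_voting_pairs`, the 0.5 cutoff, and the vote records are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_voting_pairs(votes, min_similarity=0.5):
    """Return (user, user, score) for user pairs with overlapping votes.

    `votes` maps a user name to a set of (link_id, direction) tuples.
    This is an illustrative criterion, not the one used in the analysis.
    """
    pairs = []
    for u, v in combinations(sorted(votes), 2):
        score = jaccard(votes[u], votes[v])
        if score >= min_similarity:
            pairs.append((u, v, score))
    return pairs


# Made-up vote records for illustration only.
votes = {
    "user_1": {(101, "up"), (102, "up"), (103, "down")},
    "user_2": {(101, "up"), (102, "up"), (103, "down"), (104, "up")},
    "user_3": {(201, "up")},
}
pairs = co_voting_pairs(votes)
print(pairs)
```

Pairs above the cutoff could then be linked into a graph whose connected components serve as candidate user clusters. Comparing every pair scales quadratically with the number of users, so a dataset of this size would likely need a sparser approach.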