My Projects

Below are some of the key projects I've been working on:

Next Word Prediction

GitHub | Demo

This FastAPI application predicts the next word of a text using a pre-trained RoBERTa model. It serves a static HTML page and exposes an endpoint that returns predictions in real time.

Fast + Focused

GitHub | Demo

Fast + Focused is a web application that helps you read faster by displaying the words of a text file one at a time at a rapid pace. This speed-reading method lets users consume content faster by minimizing eye movement and keeping each word in central vision. The app reads a text file and displays each word individually at a user-defined speed.
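
The word-by-word pacing described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the names `word_delay`, `iter_words`, and `play`, and the words-per-minute parameter, are assumptions made for the example.

```python
import time
from typing import Callable, Iterator


def word_delay(words_per_minute: float) -> float:
    """Return how many seconds each word stays on screen."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return 60.0 / words_per_minute


def iter_words(text: str) -> Iterator[str]:
    """Yield the whitespace-separated words of a text, one at a time."""
    yield from text.split()


def play(text: str, words_per_minute: float = 300,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Print each word on its own line, pausing between words (hypothetical display loop)."""
    delay = word_delay(words_per_minute)
    for word in iter_words(text):
        print(word)
        sleep(delay)
```

At 300 words per minute each word is shown for 0.2 seconds. Passing a custom `sleep` callable makes the loop easy to test without waiting.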

Tasnif

GitHub

Tasnif is a Python package for clustering images into a user-defined number of classes based on their visual content. It uses deep learning to generate image embeddings, Principal Component Analysis (PCA) for dimensionality reduction, and K-means for clustering. Tasnif can run on either GPU or CPU, so it works across different computational environments.

from tasnif import Tasnif

# Initialize Tasnif with 5 classes, 16 PCA dimensions, and GPU disabled (CPU only)
classifier = Tasnif(num_classes=5, pca_dim=16, use_gpu=False)

# Read images from a specified directory
classifier.read('path/to/your/images')

# Calculate embeddings, PCA, and perform clustering
classifier.calculate()

# Export clustered images and grids
classifier.export('path/to/output')
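
To make the clustering step more concrete, here is a generic K-means (Lloyd's algorithm) sketch in plain Python. It is not Tasnif's implementation, whose internals are not shown here. The function name `kmeans` and the deterministic "first k points" initialization are assumptions chosen to keep the example small and reproducible. The input points stand in for PCA-reduced embeddings.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Cluster points with plain Lloyd's algorithm and return one label per point.

    Centroids start at the first k points, which keeps the sketch deterministic
    (real implementations usually use randomized or k-means++ initialization).
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels
```

Points that lie close together end up with the same label, which is the property the exported image groups rely on.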

Localscan

GitHub

Localscan is a simple network scanner written in Python. It uses ARP requests to discover devices on a given network, then runs a lightweight Nmap scan on each discovered device.
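
For readers unfamiliar with ARP discovery, the sketch below builds the raw bytes of an ARP "who-has" request for every host in a network, using only the standard library. It is an illustration of the packet format, not Localscan's code: the function names are hypothetical, and nothing is actually sent (sending raw frames needs elevated privileges and a raw socket or a packet library).

```python
import ipaddress
import socket
import struct
from typing import Iterator, Tuple


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build a broadcast Ethernet frame carrying an ARP "who-has" request."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    broadcast = b"\xff" * 6
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP).
    ethernet = broadcast + src_mac + struct.pack("!H", 0x0806)
    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # MAC length 6, IP length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += src_mac + socket.inet_aton(src_ip)         # sender MAC and IP
    arp += b"\x00" * 6 + socket.inet_aton(target_ip)  # target MAC unknown, target IP
    return ethernet + arp


def requests_for_network(src_mac: bytes, src_ip: str, cidr: str) -> Iterator[Tuple[str, bytes]]:
    """Yield (target_ip, frame) pairs for every host address in the network."""
    for host in ipaddress.ip_network(cidr).hosts():
        yield str(host), build_arp_request(src_mac, src_ip, str(host))
```

Devices that are online answer such a request with an ARP reply containing their MAC address, which is what makes them discoverable before the follow-up scan.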

Easy Web Summarizer

GitHub

A Python script that summarizes webpages from specified URLs using the LangChain framework and the ChatOllama model. It uses a language model to generate detailed summaries, which helps when you need to grasp the content of a web document quickly.

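# Run the script directly
python summarizer.py -u "http://example.com/document"

# Or build and run the Docker image
docker build -t web_summarizer .
docker run -p 7860:7860 web_summarizer

# If Ollama runs on the host, use host networking (published ports are discarded in this mode)
docker run -d --network='host' web_summarizer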

Image Clustering

GitHub

An easy-to-use command-line tool for clustering images.

usage: main.py [-h] [-i INPUT] [-c CLUSTER] [-p PCA]

Image caption CLI

optional arguments:
  -h, --help                        show this help message and exit
  -i INPUT, --input INPUT           Input directory path, such as ./images
  -c CLUSTER, --cluster CLUSTER     How many cluster will be
  -p PCA, --pca PCA                 PCA Dimensions
  --cpu                             Run on CPU

Contact Sheet Generator

GitHub

Contact Sheet Generator is a Python script that generates a contact sheet from a directory of images. It uses the PIL library to process images and multiprocessing to generate thumbnails in parallel. The contact sheet is created by arranging the thumbnails in a grid pattern.

python contract_sheet.py /path/to/images --shuffle --heic_to jpeg --img-size 500 --no-crop result.jpg
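
The grid arrangement can be sketched as a small layout calculation. This is an assumption-laden illustration rather than the script's own logic: the function name `grid_layout`, the `columns` parameter, and the use of square thumbnails are all hypothetical.

```python
import math
from typing import List, Tuple


def grid_layout(count: int, thumb_size: int, columns: int) -> Tuple[Tuple[int, int], List[Tuple[int, int]]]:
    """Return the sheet size and the top-left offset of each thumbnail.

    Thumbnails are square (thumb_size x thumb_size) and fill the grid
    left to right, top to bottom.
    """
    if count < 1 or columns < 1 or thumb_size < 1:
        raise ValueError("count, columns and thumb_size must be positive")
    rows = math.ceil(count / columns)
    sheet_size = (columns * thumb_size, rows * thumb_size)
    offsets = [
        ((i % columns) * thumb_size, (i // columns) * thumb_size)
        for i in range(count)
    ]
    return sheet_size, offsets
```

With Pillow, each offset could then be passed to `Image.paste(thumbnail, offset)` on a blank sheet of `sheet_size`.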

Audio Embedding

GitHub

A simple Python script for extracting audio embeddings.


Image Similarity Calculator

GitHub

This project provides a simple image similarity calculator using the CLIP (Contrastive Language-Image Pre-training) model. It consists of two Python scripts, predictor.py and app.py, that allow you to calculate the cosine similarity between two images.
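
Cosine similarity itself is a short formula: the dot product of two vectors divided by the product of their lengths. The sketch below computes it on plain Python lists, which here stand in for the image embeddings a model such as CLIP would produce. The function name is an assumption and this is not the code from predictor.py.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The result ranges from -1 (opposite directions) through 0 (orthogonal) to 1 (same direction), so more similar images score closer to 1.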


Audio Genre Detection

GitHub

The Audio Genre Detection project determines the genre of audio files. It uses Essentia, a library for audio analysis, together with TensorFlow for genre classification. A Dockerized environment is included to simplify running the system.


Image Captioning

GitHub

Captioning is an image-to-text (img2txt) tool that uses the BLIP model to generate captions for images and export them.


Unlabeled Image Autoencoder

GitHub

This project builds and uses an autoencoder for clustering unlabeled image datasets. The autoencoder compresses images into a lower-dimensional representation and then reconstructs them from this compressed form. The project covers training the autoencoder, extracting features from any test image dataset, and visualizing the embeddings with t-SNE, which further reduces their dimensionality for plotting.


KSampler Advanced Tile

GitHub

This node introduces enhancements to the KSamplerAdvanced node by adding tiling functionality. The key changes are encapsulated in two new classes: KSamplerAdvancedTile and CircularVAEDecode.

KSamplerAdvancedTile: This class brings in the capability to handle tiling along the X and Y axes independently. It includes methods for setting layer padding based on tiling parameters, applying asymmetric tiling to all convolutional layers, hijacking and restoring Conv2d methods for customized forward passes, and a tailored sampling method that accounts for tiling preferences, noise addition, and denoise levels.

CircularVAEDecode: A class that extends VAEDecode, introducing circular padding for Conv2d layers during the decoding process.
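
The idea behind per-axis tiling can be shown on a plain 2D list: an axis that should tile is padded circularly (values wrap around from the opposite edge), while a non-tiling axis is padded with zeros. This is only a conceptual sketch. The node applies the padding inside Conv2d layers, and the function name `pad_axis_aware` and its parameters are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def pad_axis_aware(grid: Grid, pad: int, tile_x: bool, tile_y: bool) -> Grid:
    """Pad a 2D grid by `pad` cells per side, wrapping around on tiled axes.

    A tiled axis uses circular padding; a non-tiled axis is zero-padded.
    """
    height, width = len(grid), len(grid[0])
    if not 0 <= pad <= min(height, width):
        raise ValueError("pad must be between 0 and the smaller grid dimension")
    if pad == 0:
        return [list(row) for row in grid]

    def pad_row(row: List[float]) -> List[float]:
        # X axis: wrap columns from the opposite edge, or fill with zeros.
        left = list(row[-pad:]) if tile_x else [0.0] * pad
        right = list(row[:pad]) if tile_x else [0.0] * pad
        return left + list(row) + right

    rows = [pad_row(row) for row in grid]
    padded_width = width + 2 * pad
    if tile_y:
        # Y axis: wrap rows from the opposite edge.
        top = [list(r) for r in rows[-pad:]]
        bottom = [list(r) for r in rows[:pad]]
    else:
        top = [[0.0] * padded_width for _ in range(pad)]
        bottom = [[0.0] * padded_width for _ in range(pad)]
    return top + rows + bottom
```

Because a convolution sees the wrapped values at the border, the output on a tiled axis lines up with itself when copies are placed side by side, which is what makes the generated image seamless along that axis.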