Speech AI / Generative AI / Computer Vision

Faisal Ahmed

AI researcher and developer building practical systems across language and speech, generative AI, computer vision, biomedical AI, and production ML workflows.

1 Open-source PyPI package

15 Selected projects

35K+ Tutorial views

1 IEEE publication

Profile

Applied AI with a research mindset.

I design and ship AI systems that move from experiments into usable products, with a focus on multilingual language intelligence, speech processing, perception, and reliable backend delivery.

My work spans automatic speech recognition, transformer-based prediction models, document question answering, agentic and multimodal AI, face verification, custom OCR, data pipelines, visualization apps, medical image classification, and road-safety vision systems.

Language & Speech: Information extraction, speech-to-text, text-to-speech, conversational AI, and RAG.
Generative AI: LLM applications, agentic AI systems, and multimodal generation across speech, vision, and text.
Computer Vision: Smart traffic monitoring, vehicle detection, OCR, virtual try-on, and pattern recognition.
Biomedical & Healthcare AI: Biometric recognition, facial recognition, biomedical signal processing, and medical imaging.

Experience

Engineering AI systems from prototypes to services.

Recent roles blend research, model development, backend APIs, and practical integrations for production-oriented machine learning work.

Business Automation Limited, Dhaka (November 2024 - Present)

Machine Learning Engineer

Developed an automatic speech recognition system for English-language workflows.
Built AI auto-suggestion and cluster-based remark suggestion features for a product.
Developed face-verification authentication and OCR-based extraction of user information from PDFs and images.
Built data archiving pipelines using Apache NiFi and Apache Airflow.
Created interactive data visualization web apps with Plotly Dash.
Developed FastAPI and PostgreSQL backend services with SMS, email, and custom PDF generation modules.

Next Solution Lab, Dhaka (February 2024 - October 2024)

AI Engineer

Developed a deep learning-based virtual try-on product for clothes and accessories.
Created OCR solutions tailored to business-specific document requirements.
Researched transformer-based non-English language models for text classification, question answering, and summarization.

Next Solution Lab, Dhaka (April 2022 - January 2024)

Associate AI Engineer

Developed industrial analog and digital meter recognition using object detection and text recognition.
Implemented named entity recognition for non-English documents.
Conducted R&D for computer vision and NLP paper implementation, product features, and model improvements.

Projects

Selected applied AI work.

A focused set of systems that show the range of my research and engineering work across LLMs, RAG, Bangla NLP, medical AI, OCR, real-time vision, and production tooling.

Open-source PyPI package

LogLensX

A public Python package for the developer community, built to make application log analysis easier for FastAPI and Flask projects.

Published on PyPI as loglensx with MIT licensing and Python 3.8+ support.
Adds a responsive log analytics dashboard to FastAPI or Flask apps with one setup function.
Parses application logs, folds multiline tracebacks, and summarizes errors, warnings, files, and loggers.
Includes searchable tables, Plotly visualizations, JSON APIs, and JSON, CSV, or NDJSON exports.

Install: pip install loglensx
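
The traceback folding and summary steps listed above can be illustrated with a minimal, standard-library sketch. This is not the loglensx implementation or its API: the log line format, the `RECORD_START` pattern, and the `fold_records` and `summarize` helpers are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed line format (hypothetical): "2024-05-01 12:00:00,123 ERROR app.api: message"
RECORD_START = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:,\d{3})?) "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.]+): (?P<message>.*)$"
)


def fold_records(lines):
    """Group raw lines into records; lines that do not start a new record
    (for example traceback frames) are attached to the previous record."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        match = RECORD_START.match(line)
        if match:
            record = match.groupdict()
            record["extra"] = []
            records.append(record)
        elif records and line.strip():
            records[-1]["extra"].append(line)
    return records


def summarize(records):
    """Count records per level and per logger."""
    return {
        "by_level": Counter(r["level"] for r in records),
        "by_logger": Counter(r["logger"] for r in records),
    }


sample = [
    "2024-05-01 12:00:00,123 INFO app.main: started",
    "2024-05-01 12:00:01,456 ERROR app.api: request failed",
    "Traceback (most recent call last):",
    '  File "app/api.py", line 10, in handler',
    "ZeroDivisionError: division by zero",
    "2024-05-01 12:00:02,000 WARNING app.db: slow query",
]
records = fold_records(sample)
summary = summarize(records)
```

Here the three traceback lines end up inside the single ERROR record instead of being counted as separate entries, which is what keeps per-level counts meaningful.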

Open Source, PyPI, Python, FastAPI, Flask, Plotly

Document QA RAG System

Built a document question-answering workflow with retrieval-augmented generation (RAG).
Supported Bengali and English document querying with FAISS vector search, FastAPI, and Docker deployment.
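
The retrieval step in a workflow like this can be sketched as a brute-force cosine-similarity search, which is conceptually what a flat vector index does. The sketch below uses plain Python instead of FAISS so it runs anywhere; the `top_k` helper, the toy three-dimensional vectors, and the chunk texts are hypothetical and do not come from the project.

```python
from math import sqrt


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def top_k(query, doc_vectors, k=2):
    """Return (index, score) pairs for the k document vectors most similar to the query."""
    q = normalize(query)
    scored = []
    for index, vector in enumerate(doc_vectors):
        d = normalize(vector)
        scored.append((sum(a * b for a, b in zip(q, d)), index))
    scored.sort(reverse=True)
    return [(index, score) for score, index in scored[:k]]


# Toy data: in a real system the vectors would come from an embedding model.
chunks = ["refund policy", "shipping times", "account deletion"]
vectors = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]

hits = top_k([1.0, 0.2, 0.0], vectors, k=2)
context = [chunks[index] for index, _ in hits]
```

The retrieved `context` chunks would then be placed into the LLM prompt alongside the user's question.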

LLM, RAG, Llama 3.3, FAISS, Docker

Bangla Sentence Punctuation Restoration

Built a Bangla punctuation restoration model using transformer-based language modeling.
Used Llama 3.2 for sentence correction and served the workflow through FastAPI and Docker.

Bangla NLP, BanglaBERT, Llama 3.2, FastAPI

Chattogram to Standard Bangla Conversion

Developed a transformer-based Seq2Seq model that converts the local Chattogram language into Standard Bangla.
Processed data with a SentencePiece tokenizer and trained an encoder-decoder model with attention.

NLP, PyTorch, Transformers, Seq2Seq

Brain Tumor Classification Vision Transformer

Classified brain MRI images into four tumor categories using transfer learning.
Applied a ViT-based medical imaging workflow with PyTorch experimentation.

Medical AI, PyTorch, ViT, Transformer

Wrong-Side Vehicle Detection

Detected wrong-side vehicle movement with a YOLO-based computer vision system.
Added license plate extraction using OCR for downstream traffic workflows.

Vision, YOLOv10, EasyOCR, PyTorch

AI Agent MCP Server

Built a Python MCP server foundation for AI agent tooling and workflow integration.
Structured the project for agent-server experiments and reusable tool endpoints.

AI Agents, MCP, Python, Systems

GraphRAG Chatbot with FastAPI

Developed a chatbot workflow using GraphRAG and a FastAPI service layer.
Organized an LLM backend for retrieval-grounded responses and API integration.

LLM, GraphRAG, FastAPI, Python

Anomaly Detection FastAPI

Packaged an anomaly detection model workflow behind a FastAPI service interface.
Prepared API endpoints for scoring and integration with production-style ML pipelines.

ML Systems, FastAPI, Python, Anomaly Detection

Automatic Speech Recognition

Developed an automatic speech recognition repository for speech-to-text experimentation.
Focused on model workflow foundations for audio processing and language AI.

Speech, ASR, Python, NLP

Image Cryptography with Autoencoders

Implemented chaotic-map image encryption and decryption with autoencoder experiments.
Included image-processing utilities and metrics such as SSIM, NPCR, and UACI.
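
For reference, NPCR (number of pixels change rate) and UACI (unified average changing intensity) compare two cipher images of the same size. The sketch below computes both for 8-bit grayscale images stored as nested lists; the `npcr_uaci` function name and the tiny sample images are hypothetical and are not taken from the repository.

```python
def npcr_uaci(cipher_a, cipher_b, max_value=255):
    """Return (NPCR, UACI) as percentages for two equally sized grayscale images.

    NPCR: share of pixel positions whose values differ.
    UACI: mean absolute intensity difference, relative to max_value.
    """
    flat_a = [pixel for row in cipher_a for pixel in row]
    flat_b = [pixel for row in cipher_b for pixel in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and have the same size")

    total = len(flat_a)
    changed = sum(1 for a, b in zip(flat_a, flat_b) if a != b)
    intensity = sum(abs(a - b) for a, b in zip(flat_a, flat_b))

    npcr = 100.0 * changed / total
    uaci = 100.0 * intensity / (max_value * total)
    return npcr, uaci


# Two 2x2 sample images that differ at two of four positions.
image_a = [[0, 255], [10, 10]]
image_b = [[255, 255], [10, 20]]
npcr, uaci = npcr_uaci(image_a, image_b)
```

In encryption studies the two inputs are usually the cipher images produced from plain images that differ in a single pixel, so high NPCR and UACI values indicate sensitivity to small input changes.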

Deep Learning, Autoencoder, Security, Python

Building Infrastructure Recognition

Studied building infrastructure recognition using deep learning methods.
Combined classification and object-detection experiments for infrastructure analysis.

Vision, Deep Learning, Object Detection, Research

Resume Categorization with Transformers

Built a transformer-based resume categorization workflow.
Used EDA and NLP modeling to map resume text into role categories.

NLP, Transformers, BERT, EDA

Document Similarity with Doc2Vec

Implemented document similarity measurement using Doc2Vec and Gensim.
Applied NLP feature learning for semantic comparison workflows.

NLP, Doc2Vec, Gensim, Python

Face and Eye Blink Detection

Built face and eye-blink detection with dlib and OpenCV.
Focused on real-time vision signals for monitoring and liveness-style interfaces.
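
A common way to detect blinks from facial landmarks is the eye aspect ratio (EAR), which drops sharply when the eye closes. Whether this project uses EAR is an assumption; the sketch below only shows the technique on six (x, y) eye landmarks in the usual p1..p6 order, and the threshold and frame count are illustrative values, not tuned ones.

```python
from math import dist  # Python 3.8+


def eye_aspect_ratio(eye):
    """EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|) for six eye landmarks."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)


def count_blinks(ear_series, threshold=0.2, min_frames=2):
    """Count runs of at least min_frames consecutive frames with EAR below threshold.

    A run still in progress at the end of the series is not counted.
    """
    blinks = 0
    closed_frames = 0
    for ear in ear_series:
        if ear < threshold:
            closed_frames += 1
        else:
            if closed_frames >= min_frames:
                blinks += 1
            closed_frames = 0
    return blinks


# Hypothetical landmark coordinates for an open eye.
open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear_open = eye_aspect_ratio(open_eye)

# Per-frame EAR values: one two-frame closure (a blink) and one single-frame dip (ignored).
blinks = count_blinks([0.3, 0.3, 0.1, 0.1, 0.3, 0.1, 0.3])
```

In practice the six points per eye would come from a landmark predictor such as the one dlib provides, evaluated on each video frame.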

Vision, OpenCV, dlib, Python

Skills

Tools for research, product, and deployment.

Languages

Python
C/C++
Java
JavaScript
PHP
LaTeX

ML & AI

PyTorch
TensorFlow
Keras
SpaCy
Transformers
LangChain

Full-Stack Development

FastAPI
Flask
Django
React.js
Next.js

Data Analytics

Plotly
Tableau
Power BI

Delivery

Docker
CI/CD
Apache NiFi
Apache Airflow
Redis

Databases

MongoDB
MySQL
PostgreSQL

Teaching

Teaching practical AI for the Bangla community.

Flagship Bangla NLP playlist

Natural Language Processing with Python in Bangla

One of my strongest contributions to the Bangla AI learning community: a practical NLP series that helps learners study modern language processing in their own language.

Explains core NLP ideas in Bangla for students, self-learners, and early AI practitioners.
Connects Python text processing, preprocessing, feature extraction, and model-building workflows.
Designed as community-first content for learners who want a local-language path into NLP.

Computer Science Tutorials

Published tutorials on YouTube with 35K+ views and 450+ subscribers.

Publication & Education

Research, academic foundation, and leadership.

Building Infrastructure Classification with Hybrid CNN Architecture

F. Ahmed, M. Mahmudul Islam and S. M. Masudul Ahsan, EICT 2021. DOI: 10.1109/EICT54103.2021.9733635.

Education

B.Sc. in Computer Science and Engineering from Khulna University of Engineering & Technology, Khulna, Bangladesh.

February 2017 - April 2022

Thesis

A Study of Building Infrastructure Recognition Using Deep Learning Methods.

Leadership

General Secretary, Software Research & Development Community of KUET.

Connect

Contact

Let us build something useful with AI.

Reach out for AI engineering, research collaboration, product prototypes, teaching, or consulting.

faisal.cse16.kuet@gmail.com