Arabic Sign Language Recognition - Documentation
This is the documentation for the Word-Level Arabic Sign Language Recognition project. It is organized as an Obsidian vault of interconnected pages covering both high-level concepts and function-level source code documentation.
Tags: #documentation #karsl #sign-language #deep-learning
Quick Start
- Getting Started - Installation, configuration, and first-time setup
- Architecture Overview - System architecture and component interactions
- Troubleshooting - Common issues and solutions
Conceptual Documentation
API & Backend
- FastAPI Application - Application structure, middleware, and routing
- WebSocket Communication - Real-time communication patterns
- Live Processing Pipeline - Frame processing workflow
Core Components
- MediaPipe Integration - Landmark detection and keypoint extraction
- Keypoint Visualization - Drawing and visualization strategies
Data Processing
- Dataset Overview - KArSL-502 dataset description and structure
- Data Preparation Pipeline - Preprocessing workflow
- Memory-Mapped Datasets - Efficient dataset implementation
Models
- Architecture Design - AttentionBiLSTM model architecture
- Training Process - Training, hyperparameters, and best practices
- ONNX Export Process - Model export and conversion
Frontend
- Web Interface Design - HTML/CSS/JS architecture
- WebSocket Client - Client-side implementation
Deployment & Development
- Docker Setup - Container configuration and usage
- Environment Configuration - Environment variables
- Project Structure - Repository organization
- Contributing Guide - Contribution guidelines
- Makefile Commands - Build automation
Source Code Documentation
Complete function-level documentation mirroring the repository structure:
API Source Code (src/api/)
- main.py - FastAPI application setup and routes
- websocket.py - WebSocket handler for real-time detection
- live_processing.py - Frame buffer and processing
- cv2_utils.py - OpenCV utilities
- run.py - Application entry point
Core Source Code (src/core/)
- constants.py - System constants and configuration
- mediapipe_utils.py - MediaPipe processing
- utils.py - General utilities
- draw_kps.py - Keypoint visualization
Data Source Code (src/data/)
- data_preparation.py - Dataset preparation
- dataloader.py - PyTorch DataLoader
- lazy_dataset.py - Lazy loading dataset
- mmap_dataset.py - Memory-mapped dataset
- mmap_dataset_preprocessing.py - Preprocessing
- prepare_npz_kps.py - NPZ keypoint preparation
- shared_elements.py - Shared utilities
- write-signs-to-json.py - JSON export
- generate_mediapipe_face_symmetry_map.py - Face symmetry mapping
Modelling Source Code (src/modelling/)
- model.py - Neural network architecture
- train.py - Training script
- parallel_train.py - Parallel training
- export.py - Model export to ONNX
- onnx_benchmark.py - ONNX benchmarking
- visualize_model_performance.py - Performance visualization
Dashboard (src/modelling/dashboard/)
- app.py - Dashboard application
- loader.py - Data loader
- views.py - Dashboard views
- visualization.py - Visualization utilities
Frontend Source Code (static/)
- live-signs.js - WebSocket client and camera handling
- index.html - Main HTML structure
- styles.css - Styling and layout
Configuration Files
- Dockerfile - Container image configuration
- docker-compose.yml - Multi-container orchestration
- Makefile - Build automation
- pyproject.toml - Python project configuration
Reference
- Function Index - Alphabetical index of all functions
- Class Index - Alphabetical index of all classes
- API Endpoints - Complete API reference
- Configuration Options - All configuration variables
- Dataset Citation - KArSL dataset citation
Project Overview
This project implements a real-time Arabic Sign Language (ArSL) recognition system using:
- Dataset: KArSL-502 (502 Arabic sign words)
- Keypoint Extraction: MediaPipe (pose, face, hands)
- Model: Attention-based Bidirectional LSTM
- Inference: ONNX Runtime for optimized CPU execution
- Frontend: HTML5/JavaScript with WebSocket communication
- Backend: FastAPI with async WebSocket support
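To give a rough feel for the "attention-based" part of the model listed above, here is a minimal, dependency-free sketch of attention pooling over a sequence of per-frame feature vectors. It is an illustration only, not the project's code: the function names `softmax` and `attention_pool`, the dot-product scoring, and the fixed `query` vector are all assumptions, and the real AttentionBiLSTM in `model.py` may score and combine frames differently.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    states: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Collapse T per-frame vectors into one vector using attention weights.

    `states` stands in for the per-frame outputs of a recurrent layer and
    `query` for a learned scoring vector (both hypothetical here).
    """
    # One scalar score per frame: dot product of the frame vector and the query.
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    # Weighted sum of the frame vectors, one output dimension at a time.
    pooled = [
        sum(w * state[d] for w, state in zip(weights, states)) for d in range(dim)
    ]
    return pooled, weights
```

With a zero `query`, every frame gets the same weight and the result is the plain average of the frames; a non-zero query shifts weight toward the frames it scores highest.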
π Documentation Conventions
- Wiki Links: Use [[page_name]] to navigate between pages
- Tags: Use #tag for categorization
- Code References: Functions and classes link to their detailed documentation
- Bidirectional Links: Each function shows where it's called from and what it calls
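The wiki-link and tag conventions above are plain text patterns, so they can be checked or indexed with a small script. The following is a minimal sketch, assuming Obsidian-style `[[page]]`, `[[page|alias]]`, and `[[page#heading]]` links and tags that start with a letter; the function name `extract_links_and_tags` is hypothetical, and the tag pattern is a simplification of Obsidian's actual tag rules.

```python
import re
from typing import List, Tuple

# [[Page]], [[Page|alias]] or [[Page#Heading]]; group 1 captures the page name only.
WIKI_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")

# #tag not preceded by a word character or another '#'
# (so the '#Heading' part of a link is not counted as a tag).
TAG = re.compile(r"(?<![\w#])#([A-Za-z][\w/-]*)")


def extract_links_and_tags(text: str) -> Tuple[List[str], List[str]]:
    """Return (linked page names, tags) found in one note's text."""
    links = [match.group(1).strip() for match in WIKI_LINK.finditer(text)]
    tags = TAG.findall(text)
    return links, tags


if __name__ == "__main__":
    note = "See [[Architecture Overview]] and [[model.py|the model]]. #karsl #sign-language"
    print(extract_links_and_tags(note))
```

Running this over every note in the vault would be one way to find links that point at pages that do not exist, or to rebuild a list of which pages reference a given function.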
Last Updated: 2026-01-27