Closer — Smart Glasses for Sign Language

About the Project

The Vision Behind Closer

Closer is an advanced wearable solution integrated into a sleek smart glasses form factor. Utilizing an ultra-compact compute core, the glasses capture hand gestures, process spatial landmarks via MediaPipe, and leverage edge intelligence to provide instantaneous English Sign Language translation. By translating signs into high-fidelity synthesized speech, Closer empowers the deaf and hard-of-hearing community with seamless, real-time social interaction.

👓

Wearable Core

Integrated camera and compute core designed for a lightweight glasses form factor, enabling mobility and hands-free use.

🧠

Neural Translation

GestureNet: a custom multi-layer perceptron (MLP) classifier trained on hand landmarks to recognize English signs.

🔊

Text-to-Speech

Recognized signs are spoken aloud instantly via the Windows Speech API with configurable audio delay.

⚡

Low Latency

End-to-end recognition latency under 300ms — from hand gesture to spoken word.

Dual-Mode System

Closer Modes

Closer is a two-way communication bridge. Whether you communicate with sign language or speech, a single launcher puts both modes at your fingertips.

🤟

Sign Language Mode

For deaf & mute users who communicate via sign language.

Raspberry Pi 5 camera captures hand gestures
GestureNet AI classifies 20 English signs
Predicted word is spoken aloud via TTS
Gesture Dashboard lets you remap any label

python launcher.py  →  click Sign Language

🎤

Speech → LCD Mode

For hearing users speaking to a deaf/mute person.

Bluetooth Headset Microphone via Raspberry Pi Zero records speech
Faster-Whisper (tiny.en) transcribes the audio to text
Text is displayed on the Pi Zero's I²C LCD screen
Full scrollable transcript shown in simulator GUI

python launcher.py  →  click Speech → LCD

🤟 Hand Gesture → Pi 5 Camera → GestureNet → 🔊 Spoken Word

— OR —

🎤 Voice → Pi Zero Mic → Whisper AI → 📺 LCD Text

The People

Meet the Team

The brilliant minds behind Closer

Abdelrahman Khaled Fawzy Mansour Abdel Fattah

Project Member

Abdelrahman Ayman Mahmoud Abu El-Makarim

Project Member

Abdelmohsen Mohamed Mahmoud Naeina

Project Member

See It in Action

Project Demo

Watch Closer recognize English sign language in real time.

Visual Overview

Project Gallery

Snapshots from development, testing, and deployment.

Technical Overview

Documentation

A concise technical breakdown for academic review.

📌 Problem Statement

Deaf and mute individuals face significant communication barriers in everyday life. Existing solutions are costly, require specialized hardware, or do not support two-way communication. Closer addresses this by providing an affordable, open-source English sign language recognition and speech-to-text system that runs on widely available hardware.

🏗️ System Architecture

The solution utilizes a high-performance distributed architecture:

Wearable Node (Smart Glasses): Integrated capture module → high-frequency hand landmark extraction → localized pre-processing.
Inference Cloud/Server: Advanced neural classification (GestureNet) → temporal smoothing → speech synthesis (TTS).

This offloaded computation model ensures the wearable remains lightweight and thermal-efficient while delivering server-grade AI performance.
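The hand-off between the two nodes can be illustrated with a minimal sketch. It assumes the wearable sends one preprocessed 63-value landmark vector per frame as a JSON message; the field names `frame_id` and `features`, and the functions `encode_frame` and `decode_frame`, are hypothetical and not taken from the project's code.

```python
import json

NUM_VALUES = 63  # 21 landmarks x 3 coordinates, matching the GestureNet input size


def encode_frame(features, frame_id):
    """Wearable side: pack one landmark vector into a JSON message.

    The message layout is an assumption made for illustration only.
    """
    if len(features) != NUM_VALUES:
        raise ValueError(f"expected {NUM_VALUES} values, got {len(features)}")
    payload = {
        "frame_id": frame_id,
        # Rounding keeps the message small; 5 decimals is an arbitrary choice.
        "features": [round(float(v), 5) for v in features],
    }
    return json.dumps(payload)


def decode_frame(message):
    """Server side: unpack and validate a message before classification."""
    payload = json.loads(message)
    features = payload["features"]
    if len(features) != NUM_VALUES:
        raise ValueError(f"expected {NUM_VALUES} values, got {len(features)}")
    return payload["frame_id"], features
```

Sending landmarks rather than video frames keeps each message to a few hundred bytes, which is one plausible way to keep the wearable's network and compute load low.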

🤖 AI Model — GestureNet

Property	Value
Architecture	Multi-Layer Perceptron (MLP)
Input	63 normalized landmark values
Output	20 classes (English sign vocabulary)
Training Samples	40,000 (20k original + 20k mirrored)
Training Epochs	100
Confidence Threshold	60%
Smoothing	Majority-vote window (n=2)
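
The confidence threshold and majority-vote smoothing listed in the table could work roughly as sketched below. The class name `PredictionSmoother` is hypothetical. Clearing the window on a low-confidence frame and requiring a strict majority are assumptions, not confirmed details of the project.

```python
from collections import Counter, deque

CONFIDENCE_THRESHOLD = 0.60  # from the table: 60%
WINDOW_SIZE = 2              # from the table: n=2


class PredictionSmoother:
    """Filters per-frame predictions before they are sent to TTS."""

    def __init__(self, window_size=WINDOW_SIZE, threshold=CONFIDENCE_THRESHOLD):
        self.window = deque(maxlen=window_size)
        self.threshold = threshold

    def update(self, label, confidence):
        """Return the smoothed label, or None if nothing should be spoken."""
        if confidence < self.threshold:
            # Assumption: a low-confidence frame resets the vote.
            self.window.clear()
            return None
        self.window.append(label)
        if len(self.window) < self.window.maxlen:
            return None  # not enough frames collected yet
        winner, votes = Counter(self.window).most_common(1)[0]
        # Strict majority: with n=2 both frames must agree.
        if votes * 2 > len(self.window):
            return winner
        return None
```

With a window of two, this reduces to "speak only when two consecutive confident frames agree", which suppresses single-frame misclassifications at the cost of one extra frame of delay.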

🔬 Data Collection & Augmentation

Training data was collected using a webcam and MediaPipe for landmark extraction. Each of the 20 gesture classes was recorded with 1,000 samples. To make the model hand-agnostic (works with both left and right hands), the dataset was doubled via Mirror Augmentation: all X-coordinates were negated, mathematically mirroring every right-hand sample into a left-hand equivalent without any additional data collection.
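
Mirror Augmentation can be sketched in a few lines. The sketch assumes each sample is a flattened vector ordered as x0, y0, z0, x1, y1, z1, and so on, and that samples are already wrist-centered so that negating X mirrors the hand about the wrist. The function names are hypothetical.

```python
def mirror_sample(features):
    """Return a copy of a flattened landmark vector with every X negated."""
    mirrored = list(features)
    # X values sit at indices 0, 3, 6, ... under the assumed x, y, z layout.
    for i in range(0, len(mirrored), 3):
        mirrored[i] = -mirrored[i]
    return mirrored


def augment_dataset(samples, labels):
    """Double the dataset: originals followed by their mirrored copies."""
    mirrored = [mirror_sample(s) for s in samples]
    return list(samples) + mirrored, list(labels) + list(labels)
```

Applied to the 20,000 recorded samples, this yields the 40,000 training samples reported for GestureNet, with each mirrored sample keeping its original label.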

🧪 Preprocessing Pipeline

Raw landmarks from MediaPipe are normalized before training and inference:

Translation: Wrist (landmark 0) is moved to the origin.
Scale normalization: Divided by the wrist-to-middle-MCP distance, making the representation hand-size agnostic.
Flatten: 21 × 3D points → 63-dimensional vector.

This ensures the model recognizes signs regardless of how far the hand is from the camera.
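
The three steps above can be sketched as follows. The sketch assumes the input is a list of 21 (x, y, z) tuples in MediaPipe Hands order, where index 0 is the wrist and index 9 is the middle-finger MCP joint. The function name `normalize_landmarks` is hypothetical.

```python
import math

WRIST = 0
MIDDLE_MCP = 9  # MediaPipe Hands index of the middle-finger MCP joint


def normalize_landmarks(points):
    """Translate, scale, and flatten 21 hand landmarks into 63 values."""
    if len(points) != 21:
        raise ValueError(f"expected 21 landmarks, got {len(points)}")

    # 1. Translation: move the wrist to the origin.
    wx, wy, wz = points[WRIST]
    centered = [(x - wx, y - wy, z - wz) for x, y, z in points]

    # 2. Scale normalization: divide by the wrist-to-middle-MCP distance.
    scale = math.sqrt(sum(c * c for c in centered[MIDDLE_MCP]))
    if scale == 0:
        raise ValueError("wrist and middle MCP coincide; cannot normalize")

    # 3. Flatten: 21 x 3 points become one 63-dimensional vector.
    return [c / scale for point in centered for c in point]
```

Because every coordinate is divided by a length measured on the same hand, doubling all input coordinates (a hand twice as close to the camera) produces the same output vector.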

📊 Results

Metric	Value
End-to-end latency	< 300ms
Signs supported	20 English gestures
Hand support	Both left & right
Operating environment	Any indoor lighting
Network requirement	Local Wi-Fi only

The Road Ahead

What's Next?

Planned improvements and future development directions.

Vocabulary Expansion

Full English Sign Language Alphabet

Expanding the vocabulary from 20 signs to the full English Sign Language alphabet and common phrases, enabling richer real-world communication.

Mobile App

Smartphone-Based Recognition

Bringing the recognition system to Android devices using TensorFlow Lite, eliminating the need for any external hardware.

Continuous Signing

Sentence-Level Recognition

Moving from isolated word recognition to detecting continuous signing sequences with temporal modeling (LSTM / Transformer).

Form Factor

Integrated AR HUD

Developing an Augmented Reality Heads-Up Display to show translated text directly on the glasses lens for the wearer.

Gratitude

Acknowledgements

🏛️

Global Innovation University

For providing the resources, labs, and academic environment that made this project possible.

🤝

Open Source Community

MediaPipe (Google), FastAPI, PyTorch, OpenCV, and all the open-source libraries that power this system.

Engineered for accessibility. Redefining human communication. 💚

Global Innovation University · Faculty of Technology