Innovation Labs
Offline Local Japanese Learning App
Fully offline JLPT N5 study app with spaced repetition, an AI conversation tutor, speaking practice, a kanji hub, and 13 study modes. No internet. No subscriptions. No data sent anywhere. Everything runs on your PC.
Overview
The Japanese N5 Tutor covers the full JLPT N5 curriculum: 572 vocabulary words, 100 kanji, hiragana, katakana, grammar patterns, reading passages, and 15 AI tutor conversation lessons. The target is conversational confidence in VRChat-style Japanese, not exam preparation.
At the core is FSRS-6, a state-of-the-art spaced repetition algorithm. Every word, kana, and kanji you study gets its own memory model tracking personal stability and difficulty. The app tells you exactly when to review each item and gates new content behind mastery thresholds so you're never overwhelmed.
The AI tutor is Yuki, a conversation-first teacher powered by a local LLM. She remembers your name, hobbies, and background across sessions. Every Japanese line she speaks plays automatically via VOICEVOX, a local text-to-speech engine. All speaking practice runs through Whisper for offline transcription and pronunciation scoring.
Spaced Repetition: FSRS-6
Every item you study (each word, kana character, and kanji) gets its own memory model with two personal parameters: Stability (how long you'll remember it) and Difficulty (how hard this specific item is for you). After each answer, FSRS-6 updates both and schedules exactly when you should see it next.
Correct answers push the next review further out. Wrong answers reset it to tomorrow. Items climb a five-step status ladder, with streak requirements at each gate to prove genuine retention.
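As a rough illustration of how stability turns into a review date, here is a minimal sketch of the FSRS-style forgetting curve. It is not the app's code. In FSRS, stability is the interval after which recall probability drops to 90%. The default decay of 0.5 is the fixed value from FSRS-4.5 and is used here only as an assumption; FSRS-6 fits the decay per learner. The function names and the desired_retention default are hypothetical.

```python
RECALL_AT_STABILITY = 0.9  # FSRS defines stability as the interval where recall is 90%


def curve_factor(decay: float) -> float:
    """Scaling constant that makes retrievability equal 0.9 when elapsed == stability."""
    return RECALL_AT_STABILITY ** (-1.0 / decay) - 1.0


def retrievability(days_elapsed: float, stability: float, decay: float = 0.5) -> float:
    """Estimated probability of recalling an item after days_elapsed days."""
    return (1.0 + curve_factor(decay) * days_elapsed / stability) ** (-decay)


def next_interval(stability: float, desired_retention: float = 0.9, decay: float = 0.5) -> int:
    """Days to wait until predicted recall falls to desired_retention (at least 1)."""
    days = stability / curve_factor(decay) * (desired_retention ** (-1.0 / decay) - 1.0)
    return max(1, round(days))


def days_until_review(stability: float, answered_correctly: bool) -> int:
    """A wrong answer brings the item back tomorrow; a correct one follows the curve."""
    if not answered_correctly:
        return 1
    return next_interval(stability)


print(round(retrievability(10, 10), 3))   # recall estimate at exactly one stability
print(days_until_review(10, True), days_until_review(10, False))
```

Lowering desired_retention stretches the interval, which is the trade-off between review load and how often an item is forgotten. Updating stability and difficulty after each answer uses FSRS's fitted weights and is deliberately left out of this sketch.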
Study Modes: 13 Ways to Learn
AI Tutor: Yuki's 15 Lessons
Conversation-first, grammar-second. Yuki shows Japanese examples before she explains them, and never leads with a grammar table.
Voice Pipeline: 100% Offline
Text-to-speech runs through VOICEVOX, a local Japanese TTS engine using Kasukabe Tsumugi, a clear, gentle voice designed for natural pacing. Every synthesised phrase is cached by MD5 hash so the same sentence is never generated twice.
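The MD5 cache described above could look roughly like the following sketch. The file layout, the function names, and the synthesize callable (a stand-in for whatever actually calls VOICEVOX) are assumptions made for illustration, not the app's real code.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, text: str) -> Path:
    """Map a phrase to a stable file name derived from the MD5 of its UTF-8 bytes."""
    key = hashlib.md5(text.encode("utf-8")).hexdigest()
    return cache_dir / f"{key}.wav"


def get_audio(text: str, cache_dir: Path, synthesize: Callable[[str], bytes]) -> bytes:
    """Return audio for text, calling synthesize only on a cache miss.

    synthesize is a hypothetical callable that returns WAV bytes for a phrase.
    """
    path = cache_path(cache_dir, text)
    if path.exists():
        return path.read_bytes()          # cache hit: no TTS call
    audio = synthesize(text)              # cache miss: generate once
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

MD5 is used here only as a fast fingerprint for file names, not for security. If more than one voice were in use, the speaker would also need to be part of the hashed key.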
Speech-to-text uses faster-whisper, running in-process on CPU with no server required. Three model sizes are available (tiny / base / small) with "small" as the default for accuracy. Recordings are captured with sounddevice and fed directly to Whisper as numpy arrays.
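A minimal sketch of that record-and-transcribe path, assuming the sounddevice and faster-whisper packages are installed and the model files are already on disk. The helper names, the 16 kHz mono setting, and compute_type="int8" are assumptions for illustration; the app's actual settings are not documented here.

```python
SAMPLE_RATE = 16_000  # Whisper models work on 16 kHz mono audio


def frames_for(seconds: float, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of audio frames needed to record the given duration."""
    return int(round(seconds * sample_rate))


def record(seconds: float):
    """Record mono float32 audio from the default microphone as a 1-D numpy array."""
    import sounddevice as sd  # imported lazily so the helper above works without it

    audio = sd.rec(frames_for(seconds), samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()                  # block until the recording has finished
    return audio.reshape(-1)   # (frames, 1) -> (frames,)


def transcribe(audio, model_size: str = "small") -> str:
    """Transcribe a 16 kHz float32 array in-process on CPU."""
    from faster_whisper import WhisperModel

    # Loading the model per call keeps the sketch short; a real app would load it once.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(audio, language="ja")
    return "".join(segment.text for segment in segments).strip()
```

Calling transcribe(record(3.0)) would capture three seconds of speech and return the recognised text. Passing "tiny" or "base" as model_size trades accuracy for speed.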
Pronunciation scoring works by character-level comparison between what Whisper heard and the target text, giving a score from 0 to 100 with no dependency on any external API.
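One way a character-level score like this could be computed is sketched below using Python's standard difflib. The similarity metric, the normalisation rules, and the function names are assumptions; the app may compare characters differently.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(text: str) -> str:
    """Apply NFKC normalisation, then drop punctuation and whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in text
        if unicodedata.category(ch)[0] not in ("P", "Z") and not ch.isspace()
    )


def pronunciation_score(heard: str, target: str) -> int:
    """Return a 0-100 similarity score between the transcript and the target."""
    heard_n, target_n = normalise(heard), normalise(target)
    if not heard_n or not target_n:
        return 0
    ratio = SequenceMatcher(None, heard_n, target_n, autojunk=False).ratio()
    return round(ratio * 100)


print(pronunciation_score("こんにちは。", "こんにちは"))
print(pronunciation_score("こんにちわ", "こんにちは"))
```

Stripping punctuation first keeps the score from dropping just because Whisper added or omitted a full stop. Because the comparison is on written characters, it measures how closely the transcript matches the target rather than acoustic quality.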
Database: SQLite, 20+ Tables
Learner Analytics
Analytics are computed and stored after every session. They feed into Yuki's post-session written report and next lesson plan.
Tech Stack