App Preview Videos for Japanese Users

How to design and produce App Store preview videos that work for Japanese users: pacing, captions, music, and what to show in the first three seconds.

App Store preview videos auto-play on the listing page, which makes them one of the strongest conversion levers on the listing. The format that works in Japan differs from the one that works in the US. This article covers what to do differently.

For broader listing strategy, see the Japanese ASO complete guide.

How Japanese Users Watch Preview Videos

Two behaviors shape the design:

1. Sound is mostly off. The App Store auto-plays previews muted by default, and Japanese users very rarely tap to enable sound. Your video must work silently: captions, on-screen text, and visual storytelling carry it.

2. The decision happens in 3 seconds. Users decide whether to keep watching or scroll past within the first 3 seconds. The opening must answer "what is this and is it for me?"

The implication: the JP preview video is a captioned, visual-first short film. Voiceovers are mostly wasted budget; cinematic openings without text are wasted seconds.

The 3-Second Hook

Your first 3 seconds need to deliver:

  • The app name or a recognizable brand cue.
  • The category (what kind of app).
  • A glimpse of the value (what you get from using it).

Three opening structures that work in JP:

Hook A: Tagline + product shot

Frame 1 (0–1s): Tagline in large text. Frame 2 (1–3s): App in action with a caption.

Example:

  • 0–1s: 「習慣を、続けやすく。」
  • 1–3s: User checking off a habit with caption 「1日3分のチェックイン」

Hook B: Problem + solution

Frame 1 (0–1.5s): Problem stated as text. Frame 2 (1.5–3s): App solving it.

Example:

  • 0–1.5s: 「三日坊主、卒業しよう。」
  • 1.5–3s: User on day 30 of a streak with caption 「続いた人、100万人」

Hook C: Trust signal first

Frame 1 (0–1.5s): Number/award/ranking. Frame 2 (1.5–3s): Brand + category.

Example:

  • 0–1.5s: 「100万人が使う」
  • 1.5–3s: 「家計簿アプリ Money Mate」

Use Hook C only if your trust signal is genuinely strong. Otherwise it underdelivers and the user drops off.

Captions and On-Screen Text

Every spoken or implied word should be captioned. Captions in JP preview videos follow some conventions:

  • Position: top-third or bottom-third, not centered over key UI.
  • Background: semi-transparent block, not floating text. Floating text gets lost on busy screens.
  • Font weight: bold or semibold; thin weights are unreadable at preview size.
  • Color: high contrast — typically white on dark, or dark on light pastel.
  • Timing: 2–4 seconds per caption. Shorter than 2 seconds and JP users can't read it in time; longer than 4 and you lose pacing.
  • Language: Japanese only, no English text. (Brand names stay as-is.)

Caption pacing for Japanese

Japanese text is denser than Latin-script text: a 12-character caption in Japanese says roughly what a 25-character caption says in English. Don't pad captions to match the English character count.
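The 2–4 second timing guideline can be checked mechanically before an edit goes to review. The sketch below is only an illustration: the `Caption` type, the `check_caption_timing` helper, and the sample timings are hypothetical, not part of any captioning tool.

```python
from dataclasses import dataclass

# Range taken from the timing guideline above: 2–4 seconds per caption.
MIN_CAPTION_SECONDS = 2.0
MAX_CAPTION_SECONDS = 4.0


@dataclass
class Caption:
    text: str
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video


def check_caption_timing(captions):
    """Return a list of problems; an empty list means every caption is in range."""
    problems = []
    for c in captions:
        duration = c.end - c.start
        if duration < MIN_CAPTION_SECONDS:
            problems.append(f"too short ({duration:.1f}s): {c.text}")
        elif duration > MAX_CAPTION_SECONDS:
            problems.append(f"too long ({duration:.1f}s): {c.text}")
    return problems


# Hypothetical timings, chosen so that one caption passes and two fail.
captions = [
    Caption("1日3分のチェックイン", 3.0, 6.0),
    Caption("続いた人、100万人", 6.0, 7.5),
    Caption("今すぐダウンロード", 7.5, 12.0),
]
for problem in check_caption_timing(captions):
    print(problem)
```

A check like this only covers on-screen duration. Whether a caption is readable still depends on its length, weight, and contrast.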

Pacing

JP preview videos hold each scene longer than US videos. The typical cadence:

  • US convention: 1–1.5 seconds per scene, fast cuts, energetic.
  • JP convention: 2–3 seconds per scene, calmer pacing, more time to read.

The exception: gaming previews, which run at faster pacing in both markets.

This isn't a hard rule — it's a calibration. If you re-edit a US video for JP, slow the cuts ~30%.
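As a rough worked example of the ~30% adjustment, the sketch below stretches every scene of a US edit by the same factor and checks the new total against the 30-second limit. It assumes "slow the cuts ~30%" means each scene runs about 1.3 times as long; the `slow_cuts` helper and the scene list are hypothetical.

```python
APPLE_MAX_SECONDS = 30  # upper length limit cited in this article


def slow_cuts(scene_seconds, stretch=1.3):
    """Lengthen every scene by the same factor (1.3 = about 30% slower cuts)."""
    return [round(s * stretch, 2) for s in scene_seconds]


# Hypothetical 15-second US edit alternating 1.0s and 1.5s scenes.
us_scenes = [1.0, 1.5] * 6
jp_scenes = slow_cuts(us_scenes)

jp_total = round(sum(jp_scenes), 2)
print(f"US total: {sum(us_scenes):.1f}s, JP total: {jp_total:.1f}s")
if jp_total > APPLE_MAX_SECONDS:
    print("Over the limit: drop or shorten a scene rather than speeding the cuts back up.")
```

A 15-second edit stretched this way runs about 19.5 seconds, still inside the limit. A US edit longer than roughly 23 seconds would overshoot 30 seconds and need a scene removed.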

What to Show

A workable JP preview video structure (15 seconds total — Apple's limit is 30, but 15 is a sweet spot):

Time      Content
0–3s      Hook: tagline + brand
3–6s      Feature 1: shown + captioned
6–9s      Feature 2: shown + captioned
9–12s     Feature 3 or social proof
12–15s    CTA: 「今すぐダウンロード」 + brand

Each feature segment should:

  • Open with a caption explaining the feature.
  • Show the feature in use (not a static screen).
  • End with a moment of "result" — saved file, completed habit, sent message.
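If the storyboard lives in a script or spreadsheet export, a small check can catch gaps between segments and totals outside the 15–30 second range before editing starts. This is a minimal sketch with hypothetical names (`storyboard`, `check_storyboard`); the segments mirror the 15-second structure above.

```python
# Length limits as cited in this article (15 to 30 seconds).
APPLE_MIN_SECONDS = 15
APPLE_MAX_SECONDS = 30

# (start, end, content) in seconds, mirroring the 15-second structure above.
storyboard = [
    (0, 3, "Hook: tagline + brand"),
    (3, 6, "Feature 1: shown + captioned"),
    (6, 9, "Feature 2: shown + captioned"),
    (9, 12, "Feature 3 or social proof"),
    (12, 15, "CTA: 今すぐダウンロード + brand"),
]


def check_storyboard(segments):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    expected_start = 0
    for start, end, label in segments:
        if start != expected_start:
            problems.append(f"gap or overlap before: {label}")
        if end <= start:
            problems.append(f"segment has no length: {label}")
        expected_start = end
    total = expected_start  # end time of the last segment
    if not APPLE_MIN_SECONDS <= total <= APPLE_MAX_SECONDS:
        problems.append(f"total length {total}s is outside 15–30s")
    return problems


print(check_storyboard(storyboard) or "storyboard OK")
```

The same check works for a 30-second extended cut: add segments and rerun it.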

Music

JP users mostly watch with sound off, but the small percentage that taps for sound matters. Music conventions:

  • Tempo: matches the pacing — calmer for productivity, energetic for fitness.
  • Genre: instrumental electronic, light acoustic, or upbeat pop instrumental work universally. Avoid lyrics in any language; English lyrics are particularly mismatched.
  • Licensing: use a JP-friendly music licensing source (Artlist, Epidemic Sound) — make sure the license covers App Store distribution.

If budget is tight, the audio doesn't have to be perfect. Viewers with sound off won't hear it.

Voiceover

Voiceovers are mostly wasted budget for JP previews because most viewers have sound off. The exceptions:

  • Wellness and meditation apps where sound design is part of the experience.
  • Audio-first products (podcast apps, music apps) where demonstrating audio is the point.
  • Apps for older demographics who may be more likely to have sound on.

If you do voiceover:

  • Use a native Japanese voice actor.
  • Match the brand voice register (polite-neutral, casual, etc.).
  • Keep the script short — voiceovers in JP previews are typically 5–8 phrases over 15 seconds.

Localizing an Existing US Video

If you have a US video and a small budget:

Minimum viable localization

  • Translate captions to Japanese.
  • Replace any English UI shown in the video with Japanese UI.
  • Re-export.

This is the lowest level of effort. It produces a video that's "fine" but feels imported.

Better

Everything above, plus slow the cuts by ~30% and add a JP-style trust signal in the opening.

Best

Re-edit from scratch with JP pacing, a JP-style hook, and JP-native captioning. This is what JP apps with significant budgets do.

Production Constraints to Plan Around

The Apple App Store has specific requirements:

  • Length: 15–30 seconds.
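  • Resolution: fixed per device size. Check Apple's app preview specifications for the exact pixel dimensions of each display size you target.
  • Aspect ratios: portrait for iPhone (for example 1080x1920 on some display sizes), plus iPad-specific aspect ratios.
  • Format: MP4, M4V, or MOV.
  • Up to 3 videos per device size and locale.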

For Japan, deliver a JP-specific video, not a multi-locale video with subtitles. Uploaded under the Japanese localization, it is shown automatically on the JP listing.

A/B Testing

Apple's Product Page Optimization lets you A/B test up to 3 variants of your preview video against the default, with each test running for up to 90 days. For Japan, useful variants to test:

  • Hook variant: tagline-first vs. trust-signal-first vs. problem-first.
  • Pacing variant: 15-second tight vs. 30-second extended.
  • First-feature variant: which of your 3 features to lead with.

Change one variable per test and run the tests one after another, not simultaneously, so that each effect can be isolated.

A Self-Check Before Submitting

  • Video plays meaningfully with sound off.
  • First 3 seconds answer "what is this and is it for me?"
  • Captions are in Japanese, bold, with high contrast.
  • No English text on screen (except brand names).
  • Pacing matches JP convention (2–3 seconds per scene for non-game).
  • App UI shown in video is in Japanese, not English.
  • Final frame includes brand + CTA in Japanese.
  • Music has no English (or any language) lyrics.

Where to Go Next


We produce Japanese App Store preview videos as part of full listing engagements, including JP-native captioning, pacing, and edit. Contact us for sample work.