
First-time YouTube video processing randomly truncates transcript #1898

Description

Summary

When processing a YouTube video for the first time (uncached), Gemini randomly truncates the transcript to a small portion of the video (0.5%-27% coverage). Subsequent requests for the same video work correctly due to caching.

Environment details

  • Programming language: Python
  • OS: Linux (WSL2)
  • Language runtime version: Python 3.13.6
  • Package version: google-genai 1.52.0

Configuration

  • Model: gemini-3-flash-preview
  • Settings: thinking_budget=0, fps=0.1, media_resolution=LOW

Observations

We ran two sequential tests on each video: first without explicit offsets, then with offsets. The first test processes the video fresh; the second benefits from caching.

Video 1: https://www.youtube.com/watch?v=lMAnY2B1UnM (27:21 duration)

Initial run (video not previously processed):

Test Order | Offsets | Cached | Last Timestamp | Coverage
1st        | No      | No     | 00:09          | 0.5%
2nd        | Yes     | Yes    | 27:08          | 99.2%

Video 2: https://www.youtube.com/watch?v=tdIUMkXxtHg (25:30 duration)

Initial run (video not previously processed):

Test Order | Offsets | Cached | Last Timestamp | Coverage
1st        | No      | No     | 25:21          | 99.4%
2nd        | Yes     | Yes    | 06:49          | 26.7%

Subsequent runs (both cached):

Both tests achieve 90-99% coverage consistently once caching is active.
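
Coverage above is derived by comparing the last returned timestamp against the known video duration. A minimal sketch of that calculation (the parse_mmss and coverage helpers are ours, written for this report, not part of the SDK):

# Sketch: coverage = last transcript timestamp / known video duration (MM:SS).
def parse_mmss(ts: str) -> int:
    minutes, seconds = ts.split(":")
    return int(minutes) * 60 + int(seconds)

def coverage(last_timestamp: str, duration: str) -> float:
    return parse_mmss(last_timestamp) / parse_mmss(duration)

print(f"{coverage('00:09', '27:21'):.1%}")  # 0.5%, video 1 first test
print(f"{coverage('27:08', '27:21'):.1%}")  # 99.2%, video 1 second test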

Key Findings

  1. First request is unreliable: The first (uncached) request for a video randomly truncates the transcript.
  2. Caching masks the issue: The second request reports a nonzero cached_content_token_count and returns a complete transcript.
  3. Not offset-related: Truncation occurs randomly regardless of offset settings.
  4. Same input tokens: Both requests report an identical prompt_token_count, confirming that the full video data is sent (see the usage-metadata check after this list).
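
For reference, here is roughly how we checked whether a request was served from cache. The snippet assumes a response object from generate_content as in the Reproduction section below; prompt_token_count, cached_content_token_count, and candidates_token_count are the usage-metadata fields exposed by recent google-genai releases, so treat this as a sketch rather than verified output.

# Inspect usage metadata after generate_content to see whether the implicit cache was hit.
usage = response.usage_metadata
print("prompt tokens:       ", usage.prompt_token_count)
print("cached prompt tokens:", usage.cached_content_token_count)  # None/0 on the first, uncached request
print("output tokens:       ", usage.candidates_token_count)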

Reproduction

from google import genai
from google.genai import types

VIDEO_URL = "https://www.youtube.com/watch?v=NEW_VIDEO_ID"  # Use a fresh video

PROMPT = """Transcribe this video. Return JSON with format:
{
  "transcript_segments": [
    {"timestamp": "MM:SS", "text": "transcribed text"}
  ]
}
Include all speech from the entire video."""

SCHEMA = {
    "type": "object",
    "properties": {
        "transcript_segments": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "timestamp": {"type": "string"},
                    "text": {"type": "string"},
                },
                "required": ["timestamp", "text"],
            },
        }
    },
    "required": ["transcript_segments"],
}

client = genai.Client(api_key="...")

# First request to a NEW video (uncached) - randomly truncates
video_part = types.Part(
    file_data=types.FileData(
        file_uri=VIDEO_URL,
        mime_type="video/mp4"
    ),
    video_metadata=types.VideoMetadata(fps=0.1),
)

config = types.GenerateContentConfig(
    response_mime_type="application/json",
    response_json_schema=SCHEMA,
    media_resolution=types.MediaResolution.MEDIA_RESOLUTION_LOW,
    max_output_tokens=65536,
    thinking_config=types.ThinkingConfig(thinking_budget=0),
)

response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents=[video_part, PROMPT],
    config=config,
)
# Response may be truncated to first few seconds of video
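
To make the truncation visible without manual inspection, we also dump the last segment's timestamp from the JSON response. Only json and response.text are used here; the rest is a convenience check, not part of the SDK.

import json

# Quick truncation check: how far into the video does the transcript reach?
data = json.loads(response.text)
segments = data["transcript_segments"]
print("segments returned:", len(segments))
print("last timestamp:   ", segments[-1]["timestamp"] if segments else "none")
# On a bad first run this prints something like 00:09 for a 27-minute video.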

Expected Behavior

First-time video processing should reliably transcribe the entire video, not randomly truncate.

Impact

  • Applications processing new videos may silently receive incomplete transcripts
  • The issue is masked by caching, making it hard to detect in testing
  • Users may only notice when processing videos for the first time

Metadata

Labels

  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
