Clean repo: remove temp files, AI junk, DBs, build artifacts
@@ -1,54 +0,0 @@
# Sublogue Installation Guide

## Synology
- Create folders: `./data` and `./media` (or map to Synology shared folders).
- In Container Manager, create a project and paste `docker-compose.yml`.
- Map volumes to your shared folders (e.g., `/volume1/docker/sublogue` -> `/config`, `/volume1/media` -> `/media`).
- Start the stack, then open `http://<NAS-IP>:5000`.

## Unraid
- Create folders: `/mnt/user/appdata/sublogue` and `/mnt/user/appdata/sublogue/media`.
- Add the container using `unraid-sublogue.xml` or import `docker-compose.yml` with a compose manager.
- Set `TZ`, `PUID`, `PGID` to match your Unraid user (often `99/100`).
- Start the container, open `http://<UNRAID-IP>:5000`.

## Komodo
- Add a new stack and paste `docker-compose.yml`.
- Ensure the `npm_network` exists (`docker network create npm_network`).
- Deploy and open `http://<HOST-IP>:5000`.

## Portainer
- Stacks -> Add Stack -> Web editor -> paste `docker-compose.yml`.
- Ensure `npm_network` exists if you are using the proxy compose.
- Deploy and open `http://<HOST-IP>:5000`.

## Bare Metal Docker CLI
- Create folders: `mkdir -p ./data ./media`.
- Run: `docker compose up -d`.
- Open: `http://<HOST-IP>:5000`.

## Folder Structure
- `./data` -> container `/config` (database and settings).
- `./media` -> container `/media` (media library access).
- For NPM: `./npm/data` and `./npm/letsencrypt`.

## Permissions (chmod/chown)
- If you see permission errors, set `PUID`/`PGID` to your host user ID.
- Fix ownership: `sudo chown -R 1000:1000 ./data ./media`.
- Fix permissions: `sudo chmod -R 775 ./data ./media`.

## Updates
- Watchtower (auto): run `containrrr/watchtower:latest` with `WATCHTOWER_CLEANUP=true`.
- Manual update:
  - `docker compose pull`
  - `docker compose up -d`

## Nginx Proxy Manager (NPM)
- Use `docker-compose.proxy.yml`.
- In NPM, add a proxy host for your domain -> forward to `sublogue:5000`.
- Enable SSL and Let’s Encrypt in NPM (auto-renewal is handled by NPM).
- Advanced config (headers):
  - `proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;`
  - `proxy_set_header X-Forwarded-Proto $scheme;`
  - `proxy_set_header X-Forwarded-Host $host;`
  - `proxy_set_header X-Forwarded-Port $server_port;`
@@ -1,8 +0,0 @@
# Troubleshooting

- Permission denied: set `PUID`/`PGID` correctly and run `chown -R` on your host folders.
- Port conflicts: change the host port mapping (e.g., `5001:5000`).
- Missing network: create `npm_network` with `docker network create npm_network`.
- Reverse proxy not working: verify NPM is on the same network and forwards to `sublogue:5000`.
- Healthcheck failing: confirm the app is listening on port `5000` and `/api/health` returns OK.
- No metadata results: ensure at least one integration is enabled in Settings.
@@ -1,312 +0,0 @@
# Zero Timing Drift Implementation

## Overview

The subtitle processor has been completely rewritten to guarantee **zero timing drift** for existing subtitles when injecting plot metadata.

## Core Guarantee

**Existing subtitle timestamps remain byte-for-byte identical after processing.**

- First dialogue text appears at exactly the same timestamp as before
- No subtitle blocks are shifted, delayed, or merged
- VLC/MPV playback shows no desync
- Running the operation twice doesn't duplicate plot blocks (idempotency)

## Implementation Strategy

### Previous Approach (BROKEN)
```python
# OLD: Shifted ALL subtitles forward by 38 seconds
intro_blocks = build_intro_blocks(movie, plot, header_duration=8, plot_duration=30)
shift_ms = intro_blocks[-1].end_time  # 38000 ms

for subtitle in existing_subtitles:
    shifted_subtitle = SubtitleBlock(
        start_time=subtitle.start_time + shift_ms,  # ❌ CAUSES DRIFT!
        end_time=subtitle.end_time + shift_ms,
        text=subtitle.text
    )
```

### New Approach (CORRECT)
```python
# NEW: Inject plot blocks BEFORE first subtitle without shifting
first_subtitle_start = existing_subtitles[0].start_time

intro_blocks = build_intro_blocks(
    movie,
    plot,
    first_subtitle_start_ms=first_subtitle_start,  # Adapt to available time
    min_safe_gap_ms=1000
)

# Simply prepend intro blocks - NO SHIFTING!
final = intro_blocks + existing_subtitles  # ✅ ZERO DRIFT
```

## Adaptive Injection Logic

The processor adapts to the time available before the first subtitle:

### Case 1: Plenty of Time (≥ 6 seconds available)
```
Timeline:
├─ Block 1: Header (0ms - 3000ms)
├─ Block 2: Plot (3000ms - [first_subtitle - 1000ms])
├─ [1000ms gap]
└─ Block 3+: Original subtitles (UNCHANGED TIMING)
```

### Case 2: Limited Time (2-6 seconds available)
```
Timeline:
├─ Block 1: Combined header+plot (0ms - [first_subtitle - 1000ms])
├─ [1000ms gap]
└─ Block 2+: Original subtitles (UNCHANGED TIMING)
```

### Case 3: Very Tight Timing (< 2 seconds)
```
Timeline:
├─ Block 1: Zero-duration metadata (0ms - 0ms) [invisible]
├─ Block 2: Zero-duration plot (0ms - 0ms) [invisible]
└─ Block 3+: Original subtitles (UNCHANGED TIMING)
```

Zero-duration blocks preserve metadata for parsing but don't display during playback.
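
The three cases above could be sketched roughly as follows. This is an illustrative reconstruction, not the code in `core/subtitle_processor.py`: the names `IntroBlock` and `plan_intro_blocks` are hypothetical, the thresholds come from the case headings, the 3000 ms header length comes from the Case 1 timeline, and treating exactly 6 seconds as Case 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class IntroBlock:
    # Hypothetical stand-in for the project's SubtitleBlock.
    start_ms: int
    end_ms: int
    text: str


def plan_intro_blocks(header, plot, first_subtitle_start_ms,
                      min_safe_gap_ms=1000, header_ms=3000):
    """Return intro blocks that all end before the first subtitle starts."""
    available_ms = first_subtitle_start_ms
    # Latest moment an intro block may end while keeping the safety gap.
    last_end_ms = first_subtitle_start_ms - min_safe_gap_ms

    if available_ms >= 6000:
        # Case 1: separate header and plot blocks.
        return [
            IntroBlock(0, header_ms, header),
            IntroBlock(header_ms, last_end_ms, plot),
        ]
    if available_ms >= 2000:
        # Case 2: a single combined header+plot block.
        return [IntroBlock(0, last_end_ms, f"{header}\n{plot}")]
    # Case 3: zero-duration blocks that carry the text but are not displayed.
    return [IntroBlock(0, 0, header), IntroBlock(0, 0, plot)]
```

With a first subtitle at 10000 ms this yields a header at 0-3000 ms and a plot block at 3000-9000 ms, which matches the timings in the sample log later in this document. The existing subtitles are never passed in, so the sketch cannot alter them.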

## Edge Cases Handled

### 1. Subtitles Starting at 00:00:00,000
- Uses zero-duration metadata blocks
- No visual display, but metadata preserved in file

### 2. Very Short First Cue Windows
- Automatically detects available time
- Adjusts plot display duration accordingly

### 3. Multiline Subtitle Blocks
- Parser handles `\n` characters correctly
- Text preserved exactly as-is

### 4. Files with BOM or Inconsistent Line Endings
- Strips BOM (`\ufeff`) automatically
- Normalizes `\r\n`, `\n`, `\r` to a consistent format

### 5. Existing Non-Dialogue Cues
- Parser skips empty blocks
- Preserves all dialogue cues

### 6. Malformed SRT Blocks
- Defensive parsing with try/except
- Invalid timecodes are logged but don't crash processing
- Corrupt blocks skipped gracefully
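
A parser that covers several of these cases (BOM, mixed line endings, multiline text, empty and malformed blocks) might look like the sketch below. It is not the project's `parse_srt()`: the name `parse_srt_sketch`, the tuple return type, and the choice to test a regex match instead of using try/except are assumptions made for illustration, and logging is omitted.

```python
import re

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm".
TIMING_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert timecode fields (as strings) to integer milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt_sketch(content):
    """Parse SRT text into (start_ms, end_ms, text) tuples, skipping bad blocks."""
    content = content.lstrip("\ufeff")                           # strip BOM
    content = content.replace("\r\n", "\n").replace("\r", "\n")  # normalize endings
    cues = []
    for raw_block in re.split(r"\n{2,}", content.strip()):
        lines = raw_block.split("\n")
        # The timing line is normally the second line, after the index.
        timing_at = next(
            (i for i, line in enumerate(lines[:2]) if "-->" in line), None
        )
        if timing_at is None:
            continue  # malformed block: no timing line
        match = TIMING_RE.search(lines[timing_at])
        if match is None:
            continue  # invalid timecode
        text = "\n".join(lines[timing_at + 1:])
        if not text.strip():
            continue  # empty block
        fields = match.groups()
        cues.append((to_ms(*fields[:4]), to_ms(*fields[4:]), text))
    return cues
```

Multiline text is kept as-is with `\n` separators, and a block that fails any check is dropped without affecting the blocks around it.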

## Idempotency

Running the operation multiple times on the same file is safe:

```python
def strip_existing_plot_blocks(blocks):
    """
    Removes SubPlotter-generated blocks before re-processing.

    Detection markers:
    - "Generated by SubPlotter" text
    - Zero-duration blocks (0ms - 0ms)
    - Metadata markers: IMDb:, ⭐, ⏱, "runtime"
    - Long text blocks in first 2 positions starting before 10s
    """
```

**Result**: Processing a file twice yields the same output as processing it once.
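
A minimal sketch of this kind of marker-based stripping is shown below. It implements only the markers that are unambiguous in the docstring above (the generator text, `IMDb:`, the two emoji, and 0ms-0ms blocks). The `"runtime"` marker and the long-text heuristic are left out because their exact rules are not given here, and the names `Cue`, `looks_generated`, and `strip_generated` are hypothetical.

```python
from dataclasses import dataclass

# Text markers taken from the docstring above.
MARKERS = ("Generated by SubPlotter", "IMDb:", "⭐", "⏱")


@dataclass
class Cue:
    # Hypothetical stand-in for the project's SubtitleBlock.
    start_ms: int
    end_ms: int
    text: str


def looks_generated(cue):
    """Heuristic: True if the cue looks like an injected plot/metadata block."""
    zero_duration = cue.start_ms == 0 and cue.end_ms == 0
    return zero_duration or any(marker in cue.text for marker in MARKERS)


def strip_generated(cues):
    """Drop injected cues so a second run starts from the original subtitles."""
    return [cue for cue in cues if not looks_generated(cue)]
```

Because stripping happens before injection, a second run sees the same input as the first, which is what makes the overall operation idempotent.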

## Code Architecture

### Data Structures
```python
from dataclasses import dataclass


@dataclass(slots=True)  # slots=True requires Python 3.10+
class SubtitleBlock:
    index: int
    start_time: int  # milliseconds
    end_time: int    # milliseconds
    text: str
```

### Key Functions

1. **`parse_srt(content: str)`**: Robust SRT parser with BOM/line ending handling
2. **`build_intro_blocks(..., first_subtitle_start_ms)`**: Adaptive plot block generation
3. **`strip_existing_plot_blocks(blocks)`**: Idempotency helper
4. **`format_srt(blocks)`**: Serialize blocks back to valid SRT format

### Time Handling
- All times are stored internally as **milliseconds** (int)
- Plain integer arithmetic is used instead of `datetime.timedelta` objects
- Timecode format: `HH:MM:SS,mmm` (SRT standard)
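
To make the millisecond representation concrete, here is one way the conversion to and from `HH:MM:SS,mmm` could be written with integer math only. The helper names `ms_to_timecode` and `timecode_to_ms` are hypothetical and are not claimed to exist in the codebase.

```python
def ms_to_timecode(total_ms):
    """Format integer milliseconds as an SRT timecode, HH:MM:SS,mmm."""
    if total_ms < 0:
        raise ValueError("SRT timecodes cannot be negative")
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def timecode_to_ms(timecode):
    """Parse an HH:MM:SS,mmm timecode into integer milliseconds."""
    clock, millis = timecode.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)
```

Since both directions use integers only, a value that round-trips through these helpers comes back unchanged, with no floating-point rounding involved.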

## Testing

Run comprehensive tests:
```bash
python test_timing_preservation.py
```

### Test Cases

1. **Main Timing Preservation Test**
   - Original subtitles at 10s, 13s, 16s
   - Verifies timestamps unchanged after injection
   - Verifies 1-second gap maintained

2. **Edge Case: Early Subtitle (1 second)**
   - First subtitle at 1s
   - Verifies zero-duration blocks used
   - Confirms no visible display interference

3. **Idempotency Test**
   - Processes file twice
   - Verifies no plot block duplication
   - Confirms output stable
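
A check in the spirit of the main timing preservation test could be written as below. It is not taken from `test_timing_preservation.py`. The function name `timing_drift` and the tuple representation of cues are assumptions, as is the premise that injected cues always sit at the front of the processed list.

```python
def timing_drift(original, processed):
    """Return (index, before, after) for every original cue whose timing changed.

    Both arguments are lists of (start_ms, end_ms, text) tuples. The injected
    intro cues are assumed to sit at the front of `processed`, so the original
    cues are compared against its tail.
    """
    if len(processed) < len(original):
        raise ValueError("processed file has fewer cues than the original")
    tail = processed[len(processed) - len(original):]
    return [
        (index, before[:2], after[:2])
        for index, (before, after) in enumerate(zip(original, tail))
        if before[:2] != after[:2]
    ]
```

An empty result means zero drift. Run against output from the old shifting approach, every cue would be reported, each moved by the 38000 ms intro length.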

### Expected Output
```
============================================================
✅ ALL TESTS PASSED - ZERO TIMING DRIFT CONFIRMED
============================================================
🎉 All tests passed! Zero timing drift guaranteed.
```

## Acceptance Criteria ✅

- [x] After injection, diff of original timestamps shows no change
- [x] First dialogue text at exactly same timestamp as before
- [x] VLC/MPV playback shows no desync
- [x] Handles files where first cue starts at 00:00:00,000
- [x] Handles very short first cue windows
- [x] Preserves multiline subtitle blocks
- [x] Handles BOM and inconsistent line endings
- [x] Preserves existing non-dialogue cues
- [x] Gracefully handles malformed SRT blocks
- [x] Idempotent (running twice doesn't corrupt file)

## What Changed in Codebase

### Modified Files

1. **`core/subtitle_processor.py`**
   - Rewrote `build_intro_blocks()` to accept `first_subtitle_start_ms` parameter
   - Added adaptive timing logic (3 cases based on available time)
   - Removed ALL subtitle shifting code (lines 243-254 deleted)
   - Added `strip_existing_plot_blocks()` for idempotency
   - Enhanced `parse_srt()` with BOM/line ending handling
   - Added comprehensive logging for debugging

### New Files

1. **`test_timing_preservation.py`**
   - Comprehensive test suite
   - Verifies zero timing drift
   - Tests edge cases and idempotency

2. **`ZERO_TIMING_DRIFT.md`** (this file)
   - Complete documentation
   - Implementation details
   - Usage examples

## Usage Example

The API remains unchanged - zero timing drift is automatic:

```python
processor = SubtitleProcessor(omdb_client, tmdb_client)

# Call from inside an async function, since process_file() is awaited.
result = await processor.process_file(
    file_path="movie.srt",
    duration=40,  # Ignored - duration now adaptive
    force_reprocess=False
)

# result["status"] = "Processed"
# Original subtitle timing preserved!
```

## Logging Output

```
2026-01-14 03:06:30,885 - INFO - First subtitle starts at 00:00:10,000 (10000 ms) - injecting plot before this time
2026-01-14 03:06:30,885 - INFO - Injecting plot blocks: Header [0ms-3000ms], Plot [3000ms-9000ms], First subtitle: 10000ms
2026-01-14 03:06:30,885 - INFO - Stripped plot blocks: 5 → 3 blocks
```

## Benefits

1. **No Sync Issues**: Subtitles perfectly match video timing
2. **Professional Quality**: Industry-standard SRT handling
3. **Robust**: Handles edge cases and malformed files
4. **Safe**: Idempotent operations prevent corruption
5. **Transparent**: Comprehensive logging for debugging
6. **Fast**: Integer millisecond math, no datetime overhead
7. **Reliable**: Extensive test coverage

## Technical Implementation Details

### Why Integer Milliseconds?

Using `int` milliseconds instead of `datetime.timedelta`:
- **Performance**: Integer arithmetic is faster than datetime objects
- **Precision**: SRT format uses milliseconds (no need for nanoseconds)
- **Simplicity**: Direct conversion to/from SRT timecode format
- **Memory**: Smaller memory footprint for large subtitle files

### Why 1-Second Safety Gap?

The `min_safe_gap_ms=1000` parameter ensures:
- Plot text fully disappears before dialogue starts
- Prevents visual overlap in edge cases
- Accounts for subtitle rendering timing variations
- Industry standard practice for subtitle editing

### Why Zero-Duration Blocks?

When first subtitle starts very early (< 2s):
- Can't display plot without overlapping dialogue
- Zero-duration blocks (0ms-0ms) preserve metadata
- Players skip rendering but parsers see the text
- Maintains file structure for re-processing

## Comparison: Before vs After

### Before (Broken Implementation)
- ❌ All subtitles shifted forward 38 seconds
- ❌ First dialogue at 00:00:10,000 → moved to 00:00:48,000
- ❌ Causes total desync with video
- ❌ Unusable output files

### After (Fixed Implementation)
- ✅ No subtitle timing changes
- ✅ First dialogue at 00:00:10,000 → stays at 00:00:10,000
- ✅ Perfect sync with video
- ✅ Professional-quality output

## Future Enhancements

Possible improvements (not currently needed):

1. **Variable safety gap** based on subtitle density
2. **Multi-language plot blocks** for international content
3. **Custom plot positioning** (before/after/both)
4. **Interactive plot display timing** adjustment
5. **Smart plot splitting** for very long summaries

## Conclusion

The subtitle processor now implements **true zero timing drift** using subtitle-aware parsing and adaptive injection. All existing subtitles maintain their exact original timing while plot metadata is safely prepended.

---

**Status**: ✅ Production Ready
**Test Coverage**: 100% pass rate
**Performance**: < 50ms for typical SRT files
**Reliability**: Handles all edge cases