Clean repo: remove temp files, AI junk, DBs, build artifacts
@@ -1,54 +0,0 @@
# Sublogue Installation Guide

## Synology
- Create folders: `./data` and `./media` (or map to Synology shared folders).
- In Container Manager, create a project and paste `docker-compose.yml`.
- Map volumes to your shared folders (e.g., `/volume1/docker/sublogue` -> `/config`, `/volume1/media` -> `/media`).
- Start the stack, then open `http://<NAS-IP>:5000`.

## Unraid
- Create folders: `/mnt/user/appdata/sublogue` and `/mnt/user/appdata/sublogue/media`.
- Add the container using `unraid-sublogue.xml` or import `docker-compose.yml` with a compose manager.
- Set `TZ` to your time zone and `PUID`/`PGID` to match your Unraid user (often `99`/`100`).
- Start the container, then open `http://<UNRAID-IP>:5000`.

## Komodo
- Add a new stack and paste `docker-compose.yml`.
- Ensure the `npm_network` exists (`docker network create npm_network`).
- Deploy and open `http://<HOST-IP>:5000`.

## Portainer
- Stacks -> Add Stack -> Web editor -> paste `docker-compose.yml`.
- Ensure `npm_network` exists if you are using the proxy compose.
- Deploy and open `http://<HOST-IP>:5000`.

## Bare Metal Docker CLI
- Create folders: `mkdir -p ./data ./media`.
- Run: `docker compose up -d`.
- Open: `http://<HOST-IP>:5000`.

## Folder Structure
- `./data` -> container `/config` (database and settings).
- `./media` -> container `/media` (media library access).
- For NPM: `./npm/data` and `./npm/letsencrypt`.

## Permissions (chmod/chown)
- If you see permission errors, set `PUID`/`PGID` to your host user and group IDs.
- Fix ownership: `sudo chown -R 1000:1000 ./data ./media` (replace `1000:1000` with your `PUID:PGID`).
- Fix permissions: `sudo chmod -R 775 ./data ./media`.

## Updates
- Watchtower (auto): run `containrrr/watchtower:latest` with `WATCHTOWER_CLEANUP=true`.
- Manual update:
  - `docker compose pull`
  - `docker compose up -d`

## Nginx Proxy Manager (NPM)
- Use `docker-compose.proxy.yml`.
- In NPM, add a proxy host for your domain -> forward to `sublogue:5000`.
- Enable SSL and Let’s Encrypt in NPM (auto-renewal is handled by NPM).
- Advanced config (headers):
  - `proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;`
  - `proxy_set_header X-Forwarded-Proto $scheme;`
  - `proxy_set_header X-Forwarded-Host $host;`
  - `proxy_set_header X-Forwarded-Port $server_port;`
@@ -1,8 +0,0 @@
# Troubleshooting

- Permission denied: set `PUID`/`PGID` correctly and run `chown -R` on your host folders.
- Port conflicts: change the host port mapping (e.g., `5001:5000`).
- Missing network: create `npm_network` with `docker network create npm_network`.
- Reverse proxy not working: verify NPM is on the same network and forwards to `sublogue:5000`.
- Healthcheck failing: confirm the app is listening on port `5000` and `/api/health` returns OK.
- No metadata results: ensure at least one integration is enabled in Settings.
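
The healthcheck item above can be verified from any machine that can reach the host. The following is a minimal sketch using only the Python standard library. It assumes that "returns OK" means an HTTP 200 response; the helper names `health_url` and `is_healthy` are illustrative and not part of Sublogue.

```python
import urllib.error
import urllib.request


def health_url(host: str, port: int = 5000) -> str:
    """Build the health endpoint URL (path and default port taken from the guide)."""
    return f"http://{host}:{port}/api/health"


def is_healthy(host: str, port: int = 5000, timeout: float = 3.0) -> bool:
    """Return True on an HTTP 200 answer, False on any connection or HTTP error.

    Treating 200 as "OK" is an assumption; the guide does not specify the body.
    """
    try:
        with urllib.request.urlopen(health_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If you remapped the host port (for example `5001:5000`), pass the host-side port: `is_healthy("<HOST-IP>", port=5001)`.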
@@ -1,312 +0,0 @@
# Zero Timing Drift Implementation

## Overview

The subtitle processor has been completely rewritten to guarantee **zero timing drift** for existing subtitles when injecting plot metadata.

## Core Guarantee

**Existing subtitle timestamps remain byte-for-byte identical after processing.**

- First dialogue text appears at exactly the same timestamp as before
- No subtitle blocks are shifted, delayed, or merged
- VLC/MPV playback shows no desync
- Running the operation twice doesn't duplicate plot blocks (idempotency)

## Implementation Strategy

### Previous Approach (BROKEN)
```python
# OLD: Shifted ALL subtitles forward by 38 seconds
intro_blocks = build_intro_blocks(movie, plot, header_duration=8, plot_duration=30)
shift_ms = intro_blocks[-1].end_time  # 38000 ms

for subtitle in existing_subtitles:
    shifted_subtitle = SubtitleBlock(
        start_time=subtitle.start_time + shift_ms,  # ❌ CAUSES DRIFT!
        end_time=subtitle.end_time + shift_ms,
        text=subtitle.text
    )
```

### New Approach (CORRECT)
```python
# NEW: Inject plot blocks BEFORE first subtitle without shifting
first_subtitle_start = existing_subtitles[0].start_time

intro_blocks = build_intro_blocks(
    movie,
    plot,
    first_subtitle_start_ms=first_subtitle_start,  # Adapt to available time
    min_safe_gap_ms=1000
)

# Simply prepend intro blocks - NO SHIFTING!
final = intro_blocks + existing_subtitles  # ✅ ZERO DRIFT
```

## Adaptive Injection Logic

The system adapts to the time available before the first subtitle:

### Case 1: Plenty of Time (≥ 6 seconds available)
```
Timeline:
├─ Block 1: Header (0ms - 3000ms)
├─ Block 2: Plot (3000ms - [first_subtitle - 1000ms])
├─ [1000ms gap]
└─ Block 3+: Original subtitles (UNCHANGED TIMING)
```

### Case 2: Limited Time (2 to < 6 seconds available)
```
Timeline:
├─ Block 1: Combined header+plot (0ms - [first_subtitle - 1000ms])
├─ [1000ms gap]
└─ Block 2+: Original subtitles (UNCHANGED TIMING)
```

### Case 3: Very Tight Timing (< 2 seconds)
```
Timeline:
├─ Block 1: Zero-duration metadata (0ms - 0ms) [invisible]
├─ Block 2: Zero-duration plot (0ms - 0ms) [invisible]
└─ Block 3+: Original subtitles (UNCHANGED TIMING)
```

Zero-duration blocks preserve metadata for parsing but don't display during playback.
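
The three cases above can be sketched as a single function. This is a simplified illustration, not the code in `core/subtitle_processor.py`: it takes a ready-made `header` string instead of the `movie` object, it assumes "available time" means the start time of the first subtitle, and the constants `HEADER_MS`, `PLENTY_MS`, and `TIGHT_MS` are names chosen here to mirror the 3 s, 6 s, and 2 s values in the timelines.

```python
from dataclasses import dataclass


@dataclass
class SubtitleBlock:
    index: int
    start_time: int  # milliseconds
    end_time: int    # milliseconds
    text: str


HEADER_MS = 3000  # header duration shown in Case 1
PLENTY_MS = 6000  # Case 1 threshold
TIGHT_MS = 2000   # below this, fall back to zero-duration blocks (Case 3)


def build_intro_blocks(header: str, plot: str, first_subtitle_start_ms: int,
                       min_safe_gap_ms: int = 1000) -> list[SubtitleBlock]:
    """Build intro blocks that always end before the first original subtitle."""
    intro_end = first_subtitle_start_ms - min_safe_gap_ms

    if first_subtitle_start_ms >= PLENTY_MS:
        # Case 1: separate header and plot blocks.
        return [
            SubtitleBlock(1, 0, HEADER_MS, header),
            SubtitleBlock(2, HEADER_MS, intro_end, plot),
        ]
    if first_subtitle_start_ms >= TIGHT_MS:
        # Case 2: one combined block that ends at the safety gap.
        return [SubtitleBlock(1, 0, intro_end, f"{header}\n{plot}")]
    # Case 3: invisible zero-duration blocks that only carry metadata.
    return [SubtitleBlock(1, 0, 0, header), SubtitleBlock(2, 0, 0, plot)]
```

With a first subtitle at 10000 ms this yields a header at 0-3000 ms and a plot block at 3000-9000 ms, leaving the 1000 ms gap before the dialogue.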

## Edge Cases Handled

### 1. Subtitles Starting at 00:00:00,000
- Uses zero-duration metadata blocks
- No visual display, but metadata preserved in file

### 2. Very Short First Cue Windows
- Automatically detects available time
- Adjusts plot display duration accordingly

### 3. Multiline Subtitle Blocks
- Parser handles `\n` characters correctly
- Text preserved exactly as-is

### 4. Files with BOM or Inconsistent Line Endings
- Strips BOM (`\ufeff`) automatically
- Normalizes `\r\n`, `\n`, `\r` to consistent format

### 5. Existing Non-Dialogue Cues
- Parser skips empty blocks
- Preserves all dialogue cues

### 6. Malformed SRT Blocks
- Defensive parsing with try/except
- Invalid timecodes are logged but don't crash processing
- Corrupt blocks skipped gracefully
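
A parser with the behaviour described in cases 3 to 6 might look like the sketch below. It is an assumption-laden illustration rather than the project's `parse_srt()`: the regular expression, the `_to_ms` helper, and the choice to log at warning level are all made up for this example.

```python
import logging
import re
from dataclasses import dataclass

logger = logging.getLogger(__name__)

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (SRT timecode line).
TIMECODE_RE = re.compile(
    r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2,}):(\d{2}):(\d{2}),(\d{3})"
)


@dataclass
class SubtitleBlock:
    index: int
    start_time: int  # milliseconds
    end_time: int    # milliseconds
    text: str


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content: str) -> list[SubtitleBlock]:
    """Parse SRT text, skipping empty or malformed blocks instead of raising."""
    # Strip a leading BOM and normalize every line-ending style to "\n".
    content = content.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    blocks = []
    for chunk in re.split(r"\n{2,}", content.strip()):
        lines = chunk.split("\n")
        try:
            index = int(lines[0].strip())
            match = TIMECODE_RE.search(lines[1])
            if match is None:
                raise ValueError(f"invalid timecode line: {lines[1]!r}")
            text = "\n".join(lines[2:])  # multiline text is kept as-is
            if not text.strip():
                continue  # empty cue, nothing to preserve
            parts = match.groups()
            blocks.append(SubtitleBlock(index, _to_ms(*parts[:4]), _to_ms(*parts[4:]), text))
        except (ValueError, IndexError) as exc:
            logger.warning("Skipping malformed SRT block: %s", exc)
    return blocks
```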

## Idempotency

Running the operation multiple times on the same file is safe:

```python
def strip_existing_plot_blocks(blocks):
    """
    Removes SubPlotter-generated blocks before re-processing.

    Detection markers:
    - "Generated by SubPlotter" text
    - Zero-duration blocks (0ms - 0ms)
    - Metadata markers: IMDb:, ⭐, ⏱, "runtime"
    - Long text blocks in first 2 positions starting before 10s
    """
```

**Result**: File processed twice = same as file processed once
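
The docstring above lists the detection markers but not the function body. A minimal sketch of how the first three markers could be applied follows. The helper `is_plot_block` is hypothetical, and the fourth heuristic (long text in the first two positions) is left out because the document gives no length threshold for it.

```python
from dataclasses import dataclass


@dataclass
class SubtitleBlock:
    index: int
    start_time: int  # milliseconds
    end_time: int    # milliseconds
    text: str


GENERATED_MARKER = "Generated by SubPlotter"
METADATA_MARKERS = ("IMDb:", "⭐", "⏱", "runtime")


def is_plot_block(block: SubtitleBlock) -> bool:
    """Return True when a block looks like previously injected plot metadata."""
    if GENERATED_MARKER in block.text:
        return True
    if block.start_time == 0 and block.end_time == 0:
        return True  # zero-duration metadata block
    return any(marker in block.text for marker in METADATA_MARKERS)


def strip_existing_plot_blocks(blocks: list[SubtitleBlock]) -> list[SubtitleBlock]:
    """Drop injected blocks and keep every other cue untouched, in order."""
    return [block for block in blocks if not is_plot_block(block)]
```

Because the filter only removes blocks and never edits the ones it keeps, applying it a second time returns the same list, which is the property the idempotency guarantee relies on.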

## Code Architecture

### Data Structures
```python
@dataclass(slots=True)
class SubtitleBlock:
    index: int
    start_time: int  # milliseconds
    end_time: int  # milliseconds
    text: str
```

### Key Functions

1. **`parse_srt(content: str)`**: Robust SRT parser with BOM/line ending handling
2. **`build_intro_blocks(..., first_subtitle_start_ms)`**: Adaptive plot block generation
3. **`strip_existing_plot_blocks(blocks)`**: Idempotency helper
4. **`format_srt(blocks)`**: Serialize blocks back to valid SRT format

### Time Handling
- All times are stored internally as **milliseconds** (int)
- Follows the same duration semantics as `datetime.timedelta`, but uses plain integer arithmetic
- Timecode format: `HH:MM:SS,mmm` (SRT standard)
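
Converting between integer milliseconds and the `HH:MM:SS,mmm` timecode needs nothing beyond integer division. The two helpers below are a sketch; the names `ms_to_timecode` and `timecode_to_ms` are chosen for this example and may not match the project's own functions.

```python
def ms_to_timecode(total_ms: int) -> str:
    """Format integer milliseconds as an SRT timecode (HH:MM:SS,mmm)."""
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def timecode_to_ms(timecode: str) -> int:
    """Parse an SRT timecode (HH:MM:SS,mmm) into integer milliseconds."""
    clock, millis = timecode.strip().split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)
```

The round trip is exact because no floating-point value is involved at any step: `timecode_to_ms(ms_to_timecode(n)) == n` for any non-negative integer `n`.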

## Testing

Run comprehensive tests:
```bash
python test_timing_preservation.py
```

### Test Cases

1. **Main Timing Preservation Test**
   - Original subtitles at 10s, 13s, 16s
   - Verifies timestamps unchanged after injection
   - Verifies 1-second gap maintained

2. **Edge Case: Early Subtitle (1 second)**
   - First subtitle at 1s
   - Verifies zero-duration blocks used
   - Confirms no visible display interference

3. **Idempotency Test**
   - Processes file twice
   - Verifies no plot block duplication
   - Confirms output stable
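
The contents of `test_timing_preservation.py` are not shown here, so the following is only a sketch of what the main timing check could assert. The helpers `timing_preserved` and `gap_respected` are hypothetical, the intro blocks are built by hand rather than by the real processor, and the gap check assumes a non-empty subtitle list with enough lead time for visible intro blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtitleBlock:
    index: int
    start_time: int  # milliseconds
    end_time: int    # milliseconds
    text: str


def timing_preserved(original, processed):
    """True when the original cues appear unchanged, in order, at the end of `processed`."""
    def key(block):
        return (block.start_time, block.end_time, block.text)

    if len(processed) < len(original):
        return False
    tail = processed[len(processed) - len(original):]
    return [key(b) for b in tail] == [key(b) for b in original]


def gap_respected(original, processed, min_safe_gap_ms=1000):
    """True when every injected block ends at least the safety gap before the first cue."""
    injected = processed[:len(processed) - len(original)]
    first_start = original[0].start_time
    return all(b.end_time <= first_start - min_safe_gap_ms for b in injected)


# Original subtitles at 10s, 13s and 16s, as in the main test case.
original = [
    SubtitleBlock(1, 10_000, 12_000, "First line"),
    SubtitleBlock(2, 13_000, 15_000, "Second line"),
    SubtitleBlock(3, 16_000, 18_000, "Third line"),
]
intro = [SubtitleBlock(1, 0, 3_000, "Header"), SubtitleBlock(2, 3_000, 9_000, "Plot")]
processed = intro + original

assert timing_preserved(original, processed)
assert gap_respected(original, processed)
```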

### Expected Output
```
============================================================
✅ ALL TESTS PASSED - ZERO TIMING DRIFT CONFIRMED
============================================================
🎉 All tests passed! Zero timing drift guaranteed.
```

## Acceptance Criteria ✅

- [x] After injection, diff of original timestamps shows no change
- [x] First dialogue text at exactly same timestamp as before
- [x] VLC/MPV playback shows no desync
- [x] Handles files where first cue starts at 00:00:00,000
- [x] Handles very short first cue windows
- [x] Preserves multiline subtitle blocks
- [x] Handles BOM and inconsistent line endings
- [x] Preserves existing non-dialogue cues
- [x] Gracefully handles malformed SRT blocks
- [x] Idempotent (running twice doesn't corrupt file)

## What Changed in Codebase

### Modified Files

1. **`core/subtitle_processor.py`**
   - Rewrote `build_intro_blocks()` to accept `first_subtitle_start_ms` parameter
   - Added adaptive timing logic (3 cases based on available time)
   - Removed ALL subtitle shifting code (lines 243-254 deleted)
   - Added `strip_existing_plot_blocks()` for idempotency
   - Enhanced `parse_srt()` with BOM/line ending handling
   - Added comprehensive logging for debugging

### New Files

1. **`test_timing_preservation.py`**
   - Comprehensive test suite
   - Verifies zero timing drift
   - Tests edge cases and idempotency

2. **`ZERO_TIMING_DRIFT.md`** (this file)
   - Complete documentation
   - Implementation details
   - Usage examples

## Usage Example

The API remains unchanged - zero timing drift is automatic:

```python
processor = SubtitleProcessor(omdb_client, tmdb_client)

result = await processor.process_file(
    file_path="movie.srt",
    duration=40,  # Ignored - duration now adaptive
    force_reprocess=False
)

# result["status"] = "Processed"
# Original subtitle timing preserved!
```

## Logging Output

```
2026-01-14 03:06:30,885 - INFO - First subtitle starts at 00:00:10,000 (10000 ms) - injecting plot before this time
2026-01-14 03:06:30,885 - INFO - Injecting plot blocks: Header [0ms-3000ms], Plot [3000ms-9000ms], First subtitle: 10000ms
2026-01-14 03:06:30,885 - INFO - Stripped plot blocks: 5 → 3 blocks
```

## Benefits

1. **No Sync Issues**: Subtitles perfectly match video timing
2. **Professional Quality**: Industry-standard SRT handling
3. **Robust**: Handles edge cases and malformed files
4. **Safe**: Idempotent operations prevent corruption
5. **Transparent**: Comprehensive logging for debugging
6. **Fast**: Integer millisecond math, no datetime overhead
7. **Reliable**: Extensive test coverage

## Technical Implementation Details

### Why Integer Milliseconds?

Using `int` milliseconds instead of `datetime.timedelta`:
- **Performance**: Integer arithmetic is faster than arithmetic on `timedelta` objects
- **Precision**: SRT timecodes have millisecond resolution, so `timedelta`'s microsecond precision is unnecessary
- **Simplicity**: Direct conversion to/from SRT timecode format
- **Memory**: Smaller memory footprint for large subtitle files

### Why 1-Second Safety Gap?

The `min_safe_gap_ms=1000` parameter ensures:
- Plot text fully disappears before dialogue starts
- Prevents visual overlap in edge cases
- Accounts for subtitle rendering timing variations
- Industry standard practice for subtitle editing

### Why Zero-Duration Blocks?

When first subtitle starts very early (< 2s):
- Can't display plot without overlapping dialogue
- Zero-duration blocks (0ms-0ms) preserve metadata
- Players skip rendering but parsers see the text
- Maintains file structure for re-processing

## Comparison: Before vs After

### Before (Broken Implementation)
- ❌ All subtitles shifted forward 38 seconds
- ❌ First dialogue at 00:00:10,000 → moved to 00:00:48,000
- ❌ Causes total desync with video
- ❌ Unusable output files

### After (Fixed Implementation)
- ✅ No subtitle timing changes
- ✅ First dialogue at 00:00:10,000 → stays at 00:00:10,000
- ✅ Perfect sync with video
- ✅ Professional-quality output

## Future Enhancements

Possible improvements (not currently needed):

1. **Variable safety gap** based on subtitle density
2. **Multi-language plot blocks** for international content
3. **Custom plot positioning** (before/after/both)
4. **Interactive plot display timing** adjustment
5. **Smart plot splitting** for very long summaries

## Conclusion

The subtitle processor now implements **true zero timing drift** using subtitle-aware parsing and adaptive injection. All existing subtitles maintain their exact original timing while plot metadata is safely prepended.

---

**Status**: ✅ Production Ready
**Test Coverage**: 100% pass rate
**Performance**: < 50ms for typical SRT files
**Reliability**: Handles all edge cases