Fixes to author parsing
112
VERIFICATION.md
Normal file
@@ -0,0 +1,112 @@
# Verification Workflow

Use `data/verified_author_overrides.tsv` for manual metadata corrections.

## Using `generate_abs_mock_report.py`

The script generates a non-destructive TSV report with proposed Audiobookshelf paths.
It does not rename or move files.

What it does:
- scans the audiobook library tree
- detects audiobook roots based on audio files
- tries to infer author, title, series, sequence, year, and narrator from folder names and sidecar OPF files
- applies manual corrections from `data/verified_author_overrides.tsv`
- writes a TSV report with proposed target paths for Audiobookshelf

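
The root-detection step above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `find_audiobook_roots` and the `AUDIO_EXTENSIONS` set are assumptions, and the sketch treats every directory that directly contains an audio file as one audiobook root.

```python
from pathlib import Path

# Assumed set of audio extensions; the real script may use a different list.
AUDIO_EXTENSIONS = {".mp3", ".m4a", ".m4b", ".flac", ".ogg", ".opus"}


def find_audiobook_roots(library_root: Path) -> list[Path]:
    """Return every directory that directly contains at least one audio file."""
    roots = set()
    for path in library_root.rglob("*"):
        # Compare suffixes case-insensitively so "01.MP3" is also detected.
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            roots.add(path.parent)
    return sorted(roots)
```

Under this simple rule, a book split into `CD1`/`CD2` subfolders would be reported as two roots, so a real implementation would likely need extra handling for multi-disc layouts.
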
What it does not do:
- does not rename files
- does not move directories
- does not modify the library itself

Basic usage:

```bash
python3 generate_abs_mock_report.py
```

Default behavior:
- reads the library from `/mnt/nextcloudExtDS/Ksiazki/Audiobooki`
- writes the report to `reports/audiobookshelf_mock_report.tsv`
- applies manual corrections from `data/verified_author_overrides.tsv`

Available options:

```bash
python3 generate_abs_mock_report.py --help
```

```text
--root ROOT Path to the current audiobook library
--output OUTPUT TSV output path
--overrides OVERRIDES Optional TSV with verified metadata overrides
```

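
For orientation, options like these could be declared with `argparse` as in the sketch below. The help output above resembles `argparse` formatting, but that is an assumption: the function name `build_parser` and the constant names are hypothetical, and only the three flags, their help texts, and the default paths come from this document.

```python
import argparse

# Defaults copied from the "Default behavior" list in this document.
DEFAULT_ROOT = "/mnt/nextcloudExtDS/Ksiazki/Audiobooki"
DEFAULT_OUTPUT = "reports/audiobookshelf_mock_report.tsv"
DEFAULT_OVERRIDES = "data/verified_author_overrides.tsv"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the three documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Generate a non-destructive Audiobookshelf mock report."
    )
    parser.add_argument("--root", default=DEFAULT_ROOT,
                        help="Path to the current audiobook library")
    parser.add_argument("--output", default=DEFAULT_OUTPUT,
                        help="TSV output path")
    parser.add_argument("--overrides", default=DEFAULT_OVERRIDES,
                        help="Optional TSV with verified metadata overrides")
    return parser
```

Calling `build_parser().parse_args(["--root", "/path/to/library"])` would then override only the library root and keep the other two defaults.
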
Examples:

```bash
python3 generate_abs_mock_report.py \
--root /mnt/nextcloudExtDS/Ksiazki/Audiobooki
```

```bash
python3 generate_abs_mock_report.py \
--root /path/to/library \
--output reports/custom_report.tsv \
--overrides data/verified_author_overrides.tsv
```

Typical workflow:
1. Run `python3 generate_abs_mock_report.py`.
2. Open `reports/audiobookshelf_mock_report.tsv`.
3. Review rows with `status=review` first, then ambiguous `unverified` rows.
4. Add confirmed metadata to `data/verified_author_overrides.tsv`.
5. Run the script again to regenerate the report with overrides applied.

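
Step 3 of the workflow can be partly automated. The helper below is a sketch that lists the rows in review order; it assumes the report is UTF-8 encoded and starts with a header row naming the `status`, `current_path`, and `verification_status` columns. The function name `rows_needing_review` is hypothetical and is not part of the script.

```python
import csv
from pathlib import Path


def rows_needing_review(report_path: Path) -> list[dict[str, str]]:
    """Return rows to check by hand: `review` rows first, then `unverified` ones."""
    with report_path.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    review = [row for row in rows if row.get("status") == "review"]
    unverified = [
        row for row in rows
        if row.get("status") != "review"
        and row.get("verification_status") == "unverified"
    ]
    return review + unverified
```

For example, `rows_needing_review(Path("reports/audiobookshelf_mock_report.tsv"))` returns the rows as dictionaries, so `row["current_path"]` and `row["notes"]` can be printed while checking each book.
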
What the script prints after completion:
- `library_root` used for the scan
- `report` path to the generated TSV
- `books` number of detected audiobook roots
- `ready` rows with enough metadata to propose a target path
- `review` rows that still need manual verification

Main output file:
- `reports/audiobookshelf_mock_report.tsv`

Important columns in the TSV:
- `status`
- `current_path`
- `author`
- `series`
- `sequence`
- `title`
- `proposed_abs_path`
- `notes`
- `verification_status`
- `verification_source`

How to read the main status fields:
- `status=ready` means the row has enough metadata to build a proposed target path.
- `status=review` means the row still needs manual verification.
- `verification_status=unverified` means no manual override has been applied yet.
- `verification_status=verified_web` means the row was corrected or confirmed from a web source stored in `verification_source`.

Notes about paths:
- `current_path` is the detected source folder in the current library.
- `proposed_abs_path` is the suggested logical Audiobookshelf path relative to the author/series/title structure.
- The script creates the parent directory for the output TSV automatically if it does not exist.

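
The two path behaviours noted above might look like the sketch below. Both function names are hypothetical, and the `sequence - title` folder naming is an assumed convention for illustration; the document only states that the proposed path follows the author/series/title structure and that the output's parent directory is created when missing.

```python
from pathlib import Path, PurePosixPath


def build_proposed_path(author: str, title: str, series: str = "", sequence: str = "") -> str:
    """Join the metadata into an author/series/title style relative path."""
    parts = [author]
    if series:
        parts.append(series)
    # Hypothetical convention: prefix the title with the sequence when there is one.
    folder = f"{sequence} - {title}" if sequence else title
    parts.append(folder)
    return str(PurePosixPath(*parts))


def prepare_output_path(output: Path) -> Path:
    """Create the parent directory of the report file if it is missing."""
    output.parent.mkdir(parents=True, exist_ok=True)
    return output
```

The sketch does not sanitize characters such as `/` inside titles, which a real implementation would have to handle before joining the parts.
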
Source preference:
- Prefer a direct audiobook/store/catalog page when it clearly confirms the metadata.
- `lubimyczytac.pl` is an approved auxiliary source for verifying author, title, and series/cycle names.
- Use `lubimyczytac.pl` especially when path-derived guesses are ambiguous or when storefront metadata is incomplete.

Recommended fields to confirm:
- author
- title
- series
- sequence

When adding an override:
- Put the confirming page URL in `verification_source`.
- Keep the note in `verification_note` short and only add it when it explains a correction or ambiguity.

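
To illustrate how such override rows could be applied to a report row, here is a minimal sketch. It assumes the overrides TSV has a header row and is keyed by a `current_path` column; that key, the constant names, and both function names are hypothetical. Only `verification_source`, `verification_note`, and the four recommended fields are taken from this document, and treating every override as `verified_web` is also an assumption.

```python
import csv
from pathlib import Path

KEY_COLUMN = "current_path"  # assumed key column; the real file may be keyed differently
OVERRIDE_FIELDS = ("author", "title", "series", "sequence",
                   "verification_source", "verification_note")


def load_overrides(path: Path) -> dict[str, dict[str, str]]:
    """Read the overrides TSV into a mapping keyed by KEY_COLUMN."""
    if not path.exists():
        return {}  # the overrides file is optional
    with path.open(newline="", encoding="utf-8") as handle:
        return {row[KEY_COLUMN]: row for row in csv.DictReader(handle, delimiter="\t")}


def apply_override(row: dict[str, str], overrides: dict[str, dict[str, str]]) -> dict[str, str]:
    """Return a copy of `row` with non-empty override values applied."""
    override = overrides.get(row.get(KEY_COLUMN, ""))
    if override is None:
        return {**row, "verification_status": "unverified"}
    merged = dict(row)
    for field in OVERRIDE_FIELDS:
        if override.get(field):  # skip empty cells so they do not erase guesses
            merged[field] = override[field]
    merged["verification_status"] = "verified_web"
    return merged
```

Skipping empty cells means an override row only needs to fill in the fields that were actually corrected or confirmed.
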