Cache Structure¶
Technical reference for Image Namer's caching system.
Overview¶
Image Namer uses a repository-local cache to store AI analysis results, avoiding redundant API calls.
Location: .image_namer/ in your working directory
Persistence: Cache is persistent across runs and survives system reboots
Scope: Per-directory (each project has its own cache)
Directory Structure¶
```
.image_namer/
├── version              # Cache schema version (plain text)
└── cache/
    ├── analysis/        # Name assessments (NameAssessment)
    │   ├── <key1>.json
    │   ├── <key2>.json
    │   └── ...
    └── names/           # Proposed names (ProposedName)
        ├── <key1>.json
        ├── <key2>.json
        └── ...
```
version File¶
Contains the cache schema version as plain text. Image Namer uses this to detect incompatible cache formats.
cache/analysis/ Directory¶
Stores name assessment results (whether current filename is suitable).
cache/names/ Directory¶
Stores proposed name results (what the new filename should be).
Cache Keys¶
Each cache entry is keyed by a composite identifier of the form `<sha256>__<provider>__<model>__v<rubric>`.
Components¶
| Component | Description | Example |
|---|---|---|
| `<sha256>` | SHA-256 hash of image bytes | `a1b2c3d4e5f6...7890` |
| `<provider>` | AI provider name | `ollama` or `openai` |
| `<model>` | Model name | `gemma3:27b` or `gpt-4o` |
| `v<rubric>` | Rubric version | `v1` |
Example Key¶
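Composed from the component examples above:

```
a1b2c3d4e5f6...7890__ollama__gemma3:27b__v1
```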
File Extension¶
All cache entries are stored with a `.json` extension, i.e. `<key>.json`.
Cache Entry Formats¶
Assessment Cache (analysis/)¶
Stores the `NameAssessment` model.
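A minimal sketch of an entry; the field name `suitable` is an assumption based on the boolean flag used in the caching strategy below, so the real schema may differ:

```json
{
  "suitable": true
}
```

or

```json
{
  "suitable": false
}
```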
Purpose: Determine if the current filename already follows the rubric.
Name Cache (names/)¶
Stores the `ProposedName` model.
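A sketch of an entry; the field names `stem` and `extension` are assumptions based on the per-entry size note below ("stem + extension"):

```json
{
  "stem": "golden-retriever-playing-fetch",
  "extension": ".jpg"
}
```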
Purpose: Store the AI-generated proposed filename.
Two-Tier Caching Strategy¶
Image Namer uses a two-tier cache for optimal performance:
Tier 1: Assessment Cache¶
Check first: "Is the current filename already suitable?"
```python
assessment = load_from_cache("analysis/<key>.json")
if assessment.suitable:
    skip_rename()  # File already has good name
    return
```
Benefit: Avoid generating new names for files that are already well-named.
Tier 2: Name Cache¶
Check second: "What should the new name be?"
```python
if not assessment.suitable:
    proposed = load_from_cache("names/<key>.json")
    if proposed:
        use_cached_name(proposed)
    else:
        proposed = call_ai_model()
        save_to_cache("names/<key>.json", proposed)
```
Benefit: Avoid redundant AI calls for unsuitable filenames.
Cache Lifecycle¶
Write Flow¶
```
1. Process image
2. Compute SHA-256 hash
3. Build cache key (<hash>__<provider>__<model>__v<rubric>)
4. Check assessment cache
   ├─ Hit:  Use cached assessment
   └─ Miss: Call AI, save result to analysis/<key>.json
5. If unsuitable, check name cache
   ├─ Hit:  Use cached name
   └─ Miss: Call AI, save result to names/<key>.json
```
Read Flow¶
```
1. Compute cache key
2. Check if cache file exists
   ├─ Exists:  Load JSON, return result
   └─ Missing: Return None (cache miss)
```
Invalidation¶
Cache entries are invalidated when:
- Image bytes change: New SHA-256 hash → new cache key
- Provider changes: Different provider → new cache key
- Model changes: Different model → new cache key
- Rubric version bumps: New rubric version → new cache key
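Because invalidation is purely key-based, nothing is deleted: a changed input simply maps to a new key, and the old entry is left behind until cleaned up. For example, using the `generate_cache_key` helper defined in the next section (hypothetical values):

```python
from pathlib import Path

# The same image analyzed with two different models yields
# two independent cache entries.
key_a = generate_cache_key(Path("photo.jpg"), "ollama", "gemma3:27b", 1)
key_b = generate_cache_key(Path("photo.jpg"), "openai", "gpt-4o", 1)
# key_a -> '<sha256>__ollama__gemma3:27b__v1'
# key_b -> '<sha256>__openai__gpt-4o__v1'
```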
Cache Key Generation¶
Python Implementation¶
```python
import hashlib
from pathlib import Path


def generate_cache_key(
    image_path: Path,
    provider: str,
    model: str,
    rubric_version: int,
) -> str:
    # Compute SHA-256 of the image bytes
    sha256 = hashlib.sha256()
    with open(image_path, "rb") as f:
        sha256.update(f.read())
    content_hash = sha256.hexdigest()
    # Build the composite key
    return f"{content_hash}__{provider}__{model}__v{rubric_version}"
```
Cache File Path¶
```python
from pathlib import Path


def assessment_cache_path(key: str) -> Path:
    return Path(".image_namer/cache/analysis") / f"{key}.json"


def name_cache_path(key: str) -> Path:
    return Path(".image_namer/cache/names") / f"{key}.json"
```
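Putting the helpers together (hypothetical image and values):

```python
key = generate_cache_key(Path("photo.jpg"), "ollama", "gemma3:27b", 1)
print(assessment_cache_path(key))
# .image_namer/cache/analysis/<sha256>__ollama__gemma3:27b__v1.json
```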
Cache Storage Format¶
All cache entries use JSON with UTF-8 encoding.
Writing¶
```python
import json
from pathlib import Path


def save_to_cache(path: Path, data: dict) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False)
```
Reading¶
```python
import json
from pathlib import Path


def load_from_cache(path: Path) -> dict | None:
    if not path.exists():
        return None
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```
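A round-trip sketch using the helpers above (the `stem`/`extension` fields follow the assumed `ProposedName` shape from earlier):

```python
path = name_cache_path("example-key")  # hypothetical key
save_to_cache(path, {"stem": "sunset-over-lake", "extension": ".png"})
print(load_from_cache(path))
# {'stem': 'sunset-over-lake', 'extension': '.png'}
```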
Cache Size¶
Per-Entry Size¶
- Assessment: ~50 bytes (just a boolean)
- Proposed name: ~100-200 bytes (stem + extension)
Total Cache Size¶
For 100 images with 2 providers:

- Assessments: 100 × 2 × 50 bytes = 10 KB
- Names: 100 × 2 × 150 bytes = 30 KB
- Total: ~40 KB
Cache size is negligible for typical use cases.
Cache Operations¶
Initialize Cache¶
No explicit step is needed: the directory tree shown in Directory Structure is created on demand, since `save_to_cache` above calls `mkdir(parents=True, exist_ok=True)` before writing each entry.
View Cache Entries¶
```bash
# List all assessments
ls .image_namer/cache/analysis/

# List all proposed names
ls .image_namer/cache/names/

# Count entries
ls .image_namer/cache/analysis/ | wc -l
ls .image_namer/cache/names/ | wc -l
```
Read Cache Entry¶
```bash
# View assessment
cat .image_namer/cache/analysis/<key>.json

# View proposed name
cat .image_namer/cache/names/<key>.json
```
Clear Cache¶
```bash
# Clear all cache
rm -rf .image_namer/

# Clear specific provider/model
rm .image_namer/cache/analysis/*__openai__gpt-4o__v1.json
rm .image_namer/cache/names/*__openai__gpt-4o__v1.json

# Clear old entries (older than 90 days)
find .image_namer/cache -type f -mtime +90 -delete
```
Cache Isolation¶
Per-Directory Isolation¶
Each working directory has its own cache:
```
~/ProjectA/
└── .image_namer/    # Cache for ProjectA

~/ProjectB/
└── .image_namer/    # Separate cache for ProjectB
```
Per-Provider Isolation¶
Different providers/models have separate cache entries:
```
.image_namer/cache/names/
├── abc123...def__ollama__gemma3:27b__v1.json
├── abc123...def__ollama__llama3:8b__v1.json
├── abc123...def__openai__gpt-4o__v1.json
└── xyz789...uvw__ollama__gemma3:27b__v1.json
```
The same image (`abc123...def`) has three entries for three different models.
Cache Sharing¶
Git Repository¶
The cache can be committed to git:
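```bash
git add .image_namer/
git commit -m "Add Image Namer cache"
```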
Pros:

- Team shares cache (saves time and API costs)
- Consistent naming across team

Cons:

- Cache size grows over time
- May cause merge conflicts if many images are processed
Recommended: .gitignore¶
For most projects, exclude the cache:
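```gitignore
# .gitignore
.image_namer/
```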
Each developer maintains their own local cache.
Shared Cache via Symlink¶
Share cache across multiple local projects:
```bash
# Create global cache
mkdir -p ~/.image_namer_global

# Link projects to global cache
cd ~/ProjectA
ln -s ~/.image_namer_global .image_namer

cd ~/ProjectB
ln -s ~/.image_namer_global .image_namer
```
Now both projects share the same cache.
Cache Debugging¶
Find Cache Entry for Image¶
```bash
# Compute SHA-256 of image
shasum -a 256 photo.jpg
# Output: a1b2c3d4e5f6...7890  photo.jpg

# Find matching cache entries
ls .image_namer/cache/*/a1b2c3d4e5f6...7890__*
```
Verify Cache Key¶
```python
import hashlib

# Compute SHA-256 of the image bytes
sha256 = hashlib.sha256()
with open("photo.jpg", "rb") as f:
    sha256.update(f.read())
print(sha256.hexdigest())
```
Cache Statistics¶
```bash
# Total cache size
du -sh .image_namer/

# Number of assessments
ls .image_namer/cache/analysis/ | wc -l

# Number of proposed names
ls .image_namer/cache/names/ | wc -l

# List providers/models in cache
ls .image_namer/cache/names/ | sed 's/.*__\(.*\)__\(.*\)__v[0-9]*\.json/\1 \2/' | sort -u
```
Troubleshooting¶
Cache Not Working¶
Symptoms: Image Namer re-analyzes every run despite cache
Possible causes:

1. Image bytes changed (new hash)
2. Different provider/model
3. Cache directory doesn't exist
4. Cache files corrupted
Solutions:
```bash
# Check if cache exists
ls .image_namer/cache/

# Check image hash
shasum -a 256 photo.jpg

# Verify provider/model in cache
ls .image_namer/cache/names/ | grep -i ollama
```
Cache Corruption¶
Symptoms: JSON parsing errors
Solution: delete the cache; Image Namer regenerates entries on the next run:
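```bash
# Remove the corrupted cache; it is rebuilt on the next run
rm -rf .image_namer/
```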
Large Cache Size¶
Symptoms: .image_namer/ is very large
Causes:

- Many images processed
- Multiple providers/models
- Old entries not cleaned up
Solutions:
```bash
# Remove old entries (90+ days)
find .image_namer/cache -type f -mtime +90 -delete

# Clear specific provider (entries live under analysis/ and names/)
rm .image_namer/cache/*/*__openai__*

# Clear entire cache
rm -rf .image_namer/
```
Performance Impact¶
Cache Hit (Warm Cache)¶
- Latency: ~0.1 seconds
- API calls: 0
- Cost: $0
Cache Miss (Cold Cache)¶
- Latency: 2-5 seconds (Ollama) or 1-2 seconds (OpenAI)
- API calls: 1-2 (assessment + name generation)
- Cost: $0 (Ollama) or ~$0.01-0.02 (OpenAI)
Speedup: ~20-50x faster with warm cache
Future Enhancements¶
Per-Run Manifests (Not Yet Implemented)¶
Future versions may add per-run audit trails.
Cache Index (Not Yet Implemented)¶
Future versions may add a quick-lookup index, which would speed up cache queries for large caches (1000+ entries).
Next Steps¶
- Understanding the Cache - User-friendly cache guide
- CLI Commands - Commands that use the cache
- Configuration - Provider and model settings