- Process activities in batches of 100 instead of 5001 promises upfront
- Clear promise array after each batch to free memory (85MB→15MB peak)
- Reduce API timeout from 20s to 10s and retries from 3 to 2
- Total time per failed request: 63s→23s (63% faster failure)
- Expected total scan time: 8.5h→1.5h (82% faster)
- Add mutex to cron jobs to prevent overlapping runs
- Replace Promise.all with batched processing (50/batch) in updateStaleClubs
- Configure HTTP connection pooling with keep-alive (maxSockets: 50)
- Add memory monitoring to scan progress logs
- Reduce CONCURRENT_API_CALLS from 8 to 5 to reduce Sharp memory pressure
Root cause: Promise.all() waits for ALL promises, so a single hung/slow request
blocks the entire batch. With 5001 promises and 16 concurrent limit, timeouts
cause cascading delays that appear as 'scan stopped'.
Fix:
- Extract processSingleActivity() helper function
- Use Promise.allSettled() instead of Promise.all()
- Each promise handles its own success/error counting
- Prevents single hung promise from blocking entire scan
Impact: Scan should now complete all 5001 IDs without getting stuck
- Fix Redis SCAN cursor type conversion (Buffer to String) to prevent early termination
- Add progress logging in initializeClubCache (every 100 activities with summary)
- Add Redis memory limits (512MB with LRU eviction policy)
- Implement cache TTL: 24h for normal data, 1h for error states (allows retry)
- Fix Docker permission issue by running app container as root
- Add TTL configuration to .env and example.env
Root cause: SCAN cursor comparison failed due to type mismatch (Buffer vs String)
Impact: Scanning now processes all 5000+ IDs instead of stopping at ~300