Commit Graph

6 Commits

Author SHA1 Message Date
antanst
ada6cda4ac Fix snapshot overwrite logic to preserve successful responses
- Prevent overwriting snapshots that have valid response codes
- Ensure URL is removed from queue when snapshot update is skipped
- Add last_crawled timestamp tracking for better crawl scheduling
- Remove SkipIdenticalContent flag, simplify content deduplication logic
- Update database schema with last_crawled column and indexes

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-18 11:23:56 +03:00
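
The overwrite rule described in commit ada6cda4ac can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `Snapshot` and `Store` types, the `Save` method, the in-memory maps, and the 10-69 status range used as the test for a "valid" response are all assumptions made for the example. It also assumes that a skipped update still refreshes `last_crawled`, which the commit message does not state.

```go
package main

import (
	"fmt"
	"time"
)

// Snapshot is a hypothetical, simplified record of one crawled URL.
type Snapshot struct {
	URL          string
	ResponseCode int   // 0 means no response was received (assumption)
	LastCrawled  int64 // Unix seconds; stands in for the last_crawled column
}

// hasValidResponse reports whether s carries a usable response code.
// Assumption: any two-digit Gemini status (10-69) counts as valid.
func hasValidResponse(s *Snapshot) bool {
	return s != nil && s.ResponseCode >= 10 && s.ResponseCode <= 69
}

// Store is an in-memory stand-in for the database and the crawl queue.
type Store struct {
	Snapshots map[string]*Snapshot
	Queue     map[string]bool
}

func NewStore() *Store {
	return &Store{Snapshots: map[string]*Snapshot{}, Queue: map[string]bool{}}
}

// Save stores incoming unless that would replace a snapshot with a valid
// response by one without. It reports whether the snapshot was written.
func (st *Store) Save(incoming *Snapshot, now int64) bool {
	// The URL leaves the queue whether or not the snapshot is written.
	delete(st.Queue, incoming.URL)

	existing := st.Snapshots[incoming.URL]
	if hasValidResponse(existing) && !hasValidResponse(incoming) {
		// Keep the earlier successful snapshot; only record the crawl time.
		existing.LastCrawled = now
		return false
	}
	incoming.LastCrawled = now
	st.Snapshots[incoming.URL] = incoming
	return true
}

func main() {
	const target = "gemini://example.org/"
	st := NewStore()
	now := time.Now().Unix()

	st.Queue[target] = true
	fmt.Println("written:", st.Save(&Snapshot{URL: target, ResponseCode: 20}, now))

	// A later crawl fails; the stored snapshot is kept, the queue entry is not.
	st.Queue[target] = true
	fmt.Println("written:", st.Save(&Snapshot{URL: target}, now+3600))
	fmt.Println("stored code:", st.Snapshots[target].ResponseCode, "queued:", len(st.Queue))
}
```

Removing the queue entry before the overwrite check means a skipped update cannot leave the URL stuck in the queue, which is the second point in the commit message.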
antanst
9938dc542b Refine content deduplication and improve configuration 2025-06-16 17:09:26 +03:00
ecaa7f338d Update and refactor core functionality
- Update common package utilities
- Refactor network code for better error handling
- Remove deprecated files and functionality
- Enhance blacklist and filtering capabilities
- Improve snapshot handling and processing
2025-05-22 12:47:01 +03:00
94429b2224 Change errors to use xerrors package. 2025-05-12 20:37:58 +03:00
03e1849191 Add mode that prints multiple worker status in console 2025-01-16 10:04:02 +02:00
b52df073e9 Add first version of gemini-grc. 2024-12-27 12:09:55 +02:00