| Author | Commit | Message | Date |
|---|---|---|---|
| antanst | 2357135d5a | Fix snapshot overwrite logic to preserve successful responses<br>- Prevent overwriting snapshots that have valid response codes<br>- Ensure URL is removed from queue when snapshot update is skipped<br>- Add last_crawled timestamp tracking for better crawl scheduling<br>- Remove SkipIdenticalContent flag, simplify content deduplication logic<br>- Update database schema with last_crawled column and indexes | 2025-06-29 22:38:38 +03:00 |
| antanst | 8b498a2603 | Refine content deduplication and improve configuration | 2025-06-29 22:38:38 +03:00 |
| | a8173544e7 | Update and refactor core functionality<br>- Update common package utilities<br>- Refactor network code for better error handling<br>- Remove deprecated files and functionality<br>- Enhance blacklist and filtering capabilities<br>- Improve snapshot handling and processing | 2025-06-29 22:38:38 +03:00 |
| | 03e1849191 | Add mode that prints multiple worker status in console | 2025-01-16 10:04:02 +02:00 |
| | b52df073e9 | Add first version of gemini-grc. | 2024-12-27 12:09:55 +02:00 |