Enhance crawler with seed list and SQL utilities
Add seedList module for URL initialization, comprehensive SQL utilities for database analysis, and update project configuration.
13
misc/sql/recent_snapshot_activity.sql
Normal file
@@ -0,0 +1,13 @@
-- File: recent_snapshot_activity.sql
-- Shows URLs with most snapshots in the last 7 days
-- Usage: \i misc/sql/recent_snapshot_activity.sql

SELECT
    url,
    COUNT(*) as snapshot_count
FROM snapshots
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY url
HAVING COUNT(*) > 1
ORDER BY snapshot_count DESC
LIMIT 20;