Commit Graph

  • f362a1d2da Update README.md (branch: main) antanst 2025-06-29 23:10:14 +03:00
  • 7b3ad38f03 Remove old file antanst 2025-06-29 23:04:03 +03:00
  • 8e30a6a365 Remove old file antanst 2025-06-29 23:03:44 +03:00
  • 26311a6d2b Update README to reflect command-line flag configuration antanst 2025-06-29 22:25:38 +03:00
  • 57eb2555c5 Improve crawler performance and logging antanst 2025-06-29 22:27:20 +03:00
  • 453cf2294a Update log message for clarity antanst 2025-06-19 10:19:29 +03:00
  • ee2076f337 Clean up logging in worker processing antanst 2025-06-19 10:04:46 +03:00
  • acbac15c20 Improve crawler performance and worker coordination antanst 2025-06-19 09:59:50 +03:00
  • ddbe6b461b Update log message to reflect crawl date update behavior antanst 2025-06-18 12:03:37 +03:00
  • 55bb0d96d0 Update last_crawled timestamp when skipping duplicate content and improve error handling antanst 2025-06-18 12:02:55 +03:00
  • 349968d019 Improve error handling and add duplicate snapshot cleanup antanst 2025-06-18 11:56:26 +03:00
  • 2357135d5a Fix snapshot overwrite logic to preserve successful responses antanst 2025-06-18 11:23:56 +03:00
  • 98d3ed6707 Fix infinite recrawl loop with skip-identical-content antanst 2025-06-17 10:41:17 +03:00
  • 8b498a2603 Refine content deduplication and improve configuration antanst 2025-06-16 17:09:26 +03:00
  • 8588414b14 Enhance crawler with seed list and SQL utilities antanst 2025-06-16 12:29:33 +03:00
  • 5e6dabf1e7 Update documentation and project configuration antanst 2025-05-22 13:26:11 +03:00
  • a8173544e7 Update and refactor core functionality antanst 2025-05-22 12:47:01 +03:00
  • 3d07b56e8c Modernize host pool management antanst 2025-05-22 12:46:42 +03:00
  • c54c093a10 Implement context-aware database operations antanst 2025-05-22 12:46:36 +03:00
  • 57f5c0e865 Add whitelist functionality antanst 2025-05-22 12:46:28 +03:00
  • dc6eb610a2 Add robots.txt parsing and matching functionality antanst 2025-05-22 12:46:21 +03:00
  • 39e9ead982 Add context-aware network operations antanst 2025-05-22 12:45:58 +03:00
  • 5f4da4f806 Improve error handling with xerrors package antanst 2025-05-22 12:45:46 +03:00
  • 4ef3f70f1f Implement structured logging with slog antanst 2025-05-22 12:44:08 +03:00
  • b8ea6fab4a Change errors to use xerrors package. antanst 2025-05-12 20:37:58 +03:00
  • 5fe1490f1e Fix Makefile. antanst 2025-03-10 16:54:06 +02:00
  • a41490f834 Fix linter warnings in gemini/network.go antanst 2025-03-10 11:33:56 +02:00
  • 701a5df44f Improvements in error handling & descriptions antanst 2025-02-27 09:20:22 +02:00
  • 5b84960c5a Use go_errors library everywhere. antanst 2025-02-26 13:31:46 +02:00
  • be38104f05 Update license and readme. antanst 2025-02-26 10:37:37 +02:00
  • d70d6c35a3 update gitignore antanst 2025-02-26 10:37:20 +02:00
  • 8399225046 Improve main error handling antanst 2025-02-26 10:37:09 +02:00
  • e8e26ec76a Use Go race detector antanst 2025-02-26 10:36:51 +02:00
  • f6ac5003b0 Tidy go mod antanst 2025-02-26 10:36:41 +02:00
  • e626aabecb Add gemget script that downloads Gemini pages antanst 2025-02-26 10:35:44 +02:00
  • ebf59c50b8 Add Gopherspace crawling! antanst 2025-02-26 10:35:28 +02:00
  • 2a041fec7c Simplify host pool antanst 2025-02-26 10:35:11 +02:00
  • ca008b0796 Reorganize code for more granular imports antanst 2025-02-26 10:34:25 +02:00
  • 8350e106d6 Reorganize errors antanst 2025-02-26 10:32:38 +02:00
  • 9c7502b2a8 Improve blacklist to use regex matching antanst 2025-02-26 10:32:01 +02:00
  • dda21e833c Add regex matching function to util antanst 2025-01-16 22:36:03 +02:00
  • b0e7052c10 Add tidy & update Makefile targets antanst 2025-01-16 22:35:31 +02:00
  • 43b207c9ab Simplify duplicate code antanst 2025-01-16 13:58:14 +02:00
  • 285f2955e7 Proper package in tests antanst 2025-01-16 10:03:12 +02:00
  • 998b0e74ec Add DB scan error antanst 2025-01-16 10:02:54 +02:00
  • 766ee26f68 Simplify IP pool and convert it to host pool antanst 2025-01-16 09:39:20 +02:00
  • 5357ceb04d Break up Gemtext link parsing code and improve tests. antanst 2025-01-16 09:38:28 +02:00
  • 03e1849191 Add mode that prints multiple worker status in console antanst 2025-01-16 09:37:29 +02:00
  • ccb8f6838e Update DB init instructions & README antanst 2025-01-04 15:39:21 +02:00
  • 4e6fad873b Break up common functions and small refactor. antanst 2025-01-04 15:31:26 +02:00
  • b78fe00221 Add license. antanst 2024-12-27 12:13:05 +02:00
  • 90f6ecd024 Add README.md and Makefile. antanst 2024-12-27 12:11:35 +02:00
  • b52df073e9 Add first version of gemini-grc. antanst 2024-12-27 12:09:55 +02:00
  • 93822b239e Initial commit. antanst 2024-12-26 21:34:54 +02:00