Commit Graph

3 Commits

Author SHA1 Message Date
Marvin Zhang
b2ff8baed8 fix(spider): update node selection to use active nodes instead of all nodes
fix(spider): optimize form update logic to watch specific fields for changes
fix(grpc): adjust sync request ID handling for git and regular spiders
2025-12-02 11:42:29 +08:00
Marvin Zhang
18c5eb3956 fix: replace string slicing with filepath.Dir() in gRPC file sync
- Fix directory path calculation bug in downloadFileGRPC()
- Bug caused nested directory creation to fail (e.g., crawlab_project/spiders/)
- String slicing incorrectly truncated paths mid-character
- Now uses filepath.Dir() for correct parent directory extraction
- Fixes 'no such file or directory' errors during worker file sync
- Resolves spider task failures on worker nodes after gRPC migration

Validated by: REL-004, REL-005 test cases
2025-10-30 15:22:53 +08:00
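
The fix in 18c5eb3956 can be illustrated with a minimal sketch. The log does not show the original slicing code, so `parentBySlicing` below is a hypothetical stand-in for that class of bug, and `ensureParent` is an assumed helper name, not the actual body of downloadFileGRPC(). Only `filepath.Dir` and `os.MkdirAll` are real standard-library calls.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// parentBySlicing is a HYPOTHETICAL example of the bug class: it assumes the
// file name has a fixed byte length and cuts that many bytes off the end, so
// the result can stop in the middle of a path component.
func parentBySlicing(p string, nameLen int) string {
	if nameLen > len(p) {
		return ""
	}
	return p[:len(p)-nameLen]
}

// parentDir returns the parent directory using the standard library, which
// splits on the last path separator regardless of the file name's length.
func parentDir(p string) string {
	return filepath.Dir(p)
}

// ensureParent (assumed name) creates every missing directory above target,
// so a later os.Create(target) does not fail with "no such file or directory".
func ensureParent(target string) error {
	return os.MkdirAll(filepath.Dir(target), 0o755)
}

func main() {
	p := "crawlab_project/spiders/example.py"
	fmt.Println("slicing: ", parentBySlicing(p, 8))
	fmt.Println("filepath:", parentDir(p))
}
```

With nested paths such as crawlab_project/spiders/, the sliced value is not a real directory, which is consistent with the worker-side errors the commit message describes.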
Marvin Zhang
f441265cc2 feat(sync): add gRPC file synchronization service and integrate end-to-end
- add proto/services/sync_service.proto and generate Go pb + grpc bindings
- implement SyncServiceServer (streaming file scan + download) with:
  - request deduplication, in-memory cache (TTL), chunked streaming
  - concurrent-safe broadcast to waiters and server-side logging
- register SyncSvr in gRPC server and expose sync client in GrpcClient:
  - add syncClient field, registration and safe getters with reconnection-aware timeouts
- integrate gRPC sync into runner:
  - split syncFiles into syncFilesHTTP (legacy) and syncFilesGRPC
  - Runner now chooses implementation via config flag and performs streaming scan/download
- controller improvements:
  - add semaphore-based rate limiting for sync scan requests with in-flight counters and logs
- misc:
  - add utils.IsSyncGrpcEnabled() config helper
  - improve HTTP sync error diagnostics (Content-Type validation, response previews)
  - update/regenerate many protobuf and gRPC generated files (protoc / protoc-gen-go / protoc-gen-go-grpc version bumps)
2025-10-20 12:48:53 +08:00