Commit Graph

6162 Commits

Author SHA1 Message Date
Marvin Zhang
ba6d989c7e fix(controllers/health): return after responding OK to avoid falling through; tidy imports 2025-10-20 16:34:55 +08:00
Marvin Zhang
ec3dd2d077 fix(grpc/client): protect GetGrpcClient with _clientMux lock to avoid race during singleton init 2025-10-20 13:43:55 +08:00
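Note on ec3dd2d077: a rough sketch of lock-protected singleton initialization follows. The names `GetGrpcClient` and `_clientMux` come from the commit message; the `GrpcClient` struct body, the `_client` variable, and the `created` counter are placeholders assumed for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// GrpcClient is a stand-in type; the real client carries connection state.
type GrpcClient struct {
	id int
}

var (
	_client    *GrpcClient // hypothetical package-level singleton
	_clientMux sync.Mutex
	created    int // counts constructions, for demonstration only
)

// GetGrpcClient returns the shared client, creating it on first use.
// Holding the lock across the nil check and the assignment prevents two
// goroutines from both seeing nil and each building a client.
func GetGrpcClient() *GrpcClient {
	_clientMux.Lock()
	defer _clientMux.Unlock()
	if _client == nil {
		created++
		_client = &GrpcClient{id: created}
	}
	return _client
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			GetGrpcClient()
		}()
	}
	wg.Wait()
	fmt.Println("constructed:", created) // constructed: 1
}
```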
Marvin Zhang
2dfc66743b fix(grpc/client,node/task/handler): add RetryWithBackoff, stabilize reconnection, and retry gRPC ops
- add RetryWithBackoff helper to grpc client for exponential retry with backoff and reconnection-aware handling
- increase reconnectionClientTimeout to 90s and introduce connectionStabilizationDelay; wait briefly after reconnection to avoid immediate flapping
- refresh reconnection flag while waiting for client registration and improve cancellation message
- replace direct heartbeat RPC with RetryWithBackoff in WorkerService (use extended timeout)
- use RetryWithBackoff for worker node status updates in task handler and propagate errors
2025-10-20 13:01:10 +08:00
Marvin Zhang
f441265cc2 feat(sync): add gRPC file synchronization service and integrate end-to-end
- add proto/services/sync_service.proto and generate Go pb + grpc bindings
- implement SyncServiceServer (streaming file scan + download) with:
  - request deduplication, in-memory cache (TTL), chunked streaming
  - concurrent-safe broadcast to waiters and server-side logging
- register SyncSvr in gRPC server and expose sync client in GrpcClient:
  - add syncClient field, registration and safe getters with reconnection-aware timeouts
- integrate gRPC sync into runner:
  - split syncFiles into syncFilesHTTP (legacy) and syncFilesGRPC
  - Runner now chooses implementation via config flag and performs streaming scan/download
- controller improvements:
  - add semaphore-based rate limiting for sync scan requests with in-flight counters and logs
- misc:
  - add utils.IsSyncGrpcEnabled() config helper
  - improve HTTP sync error diagnostics (Content-Type validation, response previews)
  - update/regenerate many protobuf and gRPC generated files (protoc/protoc-gen-go / protoc-gen-go-grpc version bumps)
2025-10-20 12:48:53 +08:00
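Note on f441265cc2: the semaphore-based rate limiting with in-flight counters could look roughly like the sketch below. `ScanLimiter` and its methods are hypothetical names, and rejecting callers when no slot is free (rather than making them wait) is an assumption; the commit does not say which behaviour the controller uses.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ScanLimiter bounds concurrent scan requests. A buffered channel acts
// as the semaphore; inFlight is kept separately so it can be logged.
type ScanLimiter struct {
	inFlight int64 // first field, so it stays 64-bit aligned for atomics
	slots    chan struct{}
}

// NewScanLimiter allows at most max scans to run at once.
func NewScanLimiter(max int) *ScanLimiter {
	return &ScanLimiter{slots: make(chan struct{}, max)}
}

// TryAcquire takes a slot without blocking. A false result means the
// limit is reached and the caller should reject or retry later.
func (l *ScanLimiter) TryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		atomic.AddInt64(&l.inFlight, 1)
		return true
	default:
		return false
	}
}

// Release frees a slot taken by a successful TryAcquire.
func (l *ScanLimiter) Release() {
	atomic.AddInt64(&l.inFlight, -1)
	<-l.slots
}

// InFlight reports how many scans currently hold a slot.
func (l *ScanLimiter) InFlight() int64 {
	return atomic.LoadInt64(&l.inFlight)
}

func main() {
	l := NewScanLimiter(2)
	fmt.Println(l.TryAcquire(), l.TryAcquire(), l.TryAcquire()) // true true false
	l.Release()
	fmt.Println(l.TryAcquire(), l.InFlight()) // true 2
}
```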
Marvin Zhang
61604e1817 fix(task/handler): ensure latest gRPC client is used for task fetch/subscribe
Add svc.getGrpcClient() helper and use it when obtaining TaskClient so task fetch and
subscribe operations don't hold a stale client instance after ResetGrpcClient().
2025-10-20 12:22:34 +08:00
Marvin Zhang
4baa5fad59 fix(grpc/client): trigger reconnection on bad conn state and improve connection logging
- Trigger reconnection proactively from Get*WithTimeout when underlying connection is in
  SHUTDOWN or TRANSIENT_FAILURE to avoid returning stale/unusable clients.
- Add debug/info logs around client registration, connection attempts, closing existing
  connections, connection initiation, reconnection start, backoff retry and successful
  reconnection (including current state and registration status).
- Surface more context in reconnection and connection logs to aid diagnostics.
2025-10-20 11:34:41 +08:00
Marvin Zhang
6020fef30b chore(node): add timing logs and improve node status diagnostics
- master: add TIMING logs in setWorkerNodeOnline to mark start and completed DB update
- handler: log node status for reconnection debugging and include active/enabled values in "node not active or enabled" error
2025-10-20 11:14:55 +08:00
Marvin Zhang
49165b2165 refactor(node): reorganize task reconciliation, prioritize worker cache, add periodic cleanup
- Move and document reconciliation constants and add sectioned organization/comments.
- Split large monolithic logic into smaller functions:
  - reconcileDisconnectedTasks / reconcileDisconnectedTask
  - reconcileAbandonedAssignedTasks
  - reconcileStalePendingTasks / handleStalePendingTask
  - getActualTaskStatus / getStatusFromWorkerCache / triggerWorkerStatusSync
  - queryProcessStatus / requestProcessStatusFromWorker / mapProcessStatusToTaskStatus
  - findTasksByStatus / markTaskDisconnected / findAvailableNodeForTask
  - updateTaskStatus / saveTask / shouldMarkTaskAbnormal / markTaskAbnormal
- Add periodic background workers:
  - StartPeriodicReconciliation -> runPeriodicReconciliation to reconcile running/disconnected tasks
  - runPeriodicAssignedTaskCleanup -> cleanupStuckAssignedTasks to detect and recover stuck assigned tasks
- Prioritize worker-side cached status and attempt sync from task runner before querying worker processes.
- Introduce a placeholder createWorkerClient for future gRPC worker discovery/invocation.
- Replace ad-hoc DB updates with saveTask using retry/backoff and centralize status update logic.
- Improve logging and error messages, and tighten conditions for marking tasks abnormal.

This refactor clarifies responsibilities, improves reliability of status updates, and prepares the codebase for future worker gRPC integration.
2025-10-20 10:54:32 +08:00
Marvin Zhang
883c954b4e feat: update Dockerfile to include 'go mod tidy' before installation 2025-10-09 12:39:12 +08:00
Marvin Zhang
44fd0809e6 feat: disable backend unit tests and document reasons for integration test requirements 2025-10-09 12:35:09 +08:00
Marvin Zhang
5bff8823a8 feat: update test workflows to skip API tests and document controller test status 2025-10-09 11:34:26 +08:00
Marvin Zhang
2a211923da Update core/task/handler/runner_sync.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-10-09 11:14:32 +08:00
Marvin Zhang
587d9d0960 Merge branch 'test' into develop 2025-10-09 11:13:51 +08:00
Marvin Zhang
4c508557e7 feat: parameterize ports in Docker Compose for better configurability 2025-09-29 14:31:33 +08:00
Marvin Zhang
29ef8d67da feat: implement synchronization and error handling improvements in task reconciliation and file synchronization 2025-09-28 17:42:23 +08:00
Marvin Zhang
e80256aa61 feat: add support for Chinese locale in Docker setup 2025-09-17 16:00:27 +08:00
Marvin Zhang
b6e14a13fe refactor: remove obsolete task reconciliation service tests 2025-09-17 11:05:27 +08:00
Marvin Zhang
afa5fab4c1 feat: enhance task reconciliation with worker-side status caching and synchronization 2025-09-17 11:03:35 +08:00
Marvin Zhang
8c2c23d9b6 feat: Update gRPC service definitions and implement CheckProcess method
- Downgraded protoc-gen-go-grpc and protoc versions for compatibility.
- Added CheckProcess method to TaskService with corresponding request and response types.
- Updated Subscribe and Connect methods to use new generic client stream types.
- Refactored server and client implementations for Subscribe and Connect methods.
- Ensured backward compatibility by maintaining existing method signatures where applicable.
- Added necessary handler for CheckProcess in the service descriptor.
2025-09-17 10:37:03 +08:00
Marvin Zhang
c6834e9964 feat: enhance task reconciliation logic with improved status handling and error messaging 2025-09-17 10:18:13 +08:00
Marvin Zhang
8ebdd98f99 feat: enhance ORM functionality with toggle support and UI updates 2025-09-16 16:09:33 +08:00
Marvin Zhang
bfe40e7c67 fix: comment out AI Assistant toggle button in Header.vue 2025-09-16 15:45:56 +08:00
Marvin Zhang
39f83d71b1 fix: update NotificationRequestDTO to include BSON field names for setting and channel 2025-09-16 15:43:43 +08:00
Marvin Zhang
196273c423 feat: implement ORM support with toggle functionality and UI updates 2025-09-16 15:18:35 +08:00
Marvin Zhang
875ca290b5 fix: update UseORM description to specify supported databases (MySQL, PostgreSQL, SQL Server) 2025-09-16 14:04:01 +08:00
Marvin Zhang
293e630f6f feat: add UseORM field to Database struct for ORM support 2025-09-16 13:30:50 +08:00
Marvin Zhang
8450b074c0 fix: comment out unused 'models' menu item in SystemDetail.vue 2025-09-16 09:33:15 +08:00
Marvin Zhang
56277e47be fix: remove unused build scripts to streamline the build process 2025-09-16 09:32:23 +08:00
Marvin Zhang
9ade564d0a Remove unused TypeScript declaration files for task, token, and user components in the Crawlab UI, streamlining the codebase and improving maintainability. 2025-09-16 09:31:34 +08:00
Marvin Zhang
7d1a61581e feat: add support for multi-architecture Docker builds with configurable input 2025-09-14 16:39:29 +08:00
Marvin Zhang
72177b2728 fix: update node disconnected status styling and behavior 2025-09-14 15:20:01 +08:00
Marvin Zhang
437c30b699 fix: ensure worker services depend on healthy master service 2025-09-14 15:02:06 +08:00
Marvin Zhang
829fcac3ff feat: add multi-platform support for Docker builds 2025-09-14 14:49:53 +08:00
Marvin Zhang
7c33fec784 refactor: remove unused fields from WorkerService struct 2025-09-12 18:17:36 +08:00
Marvin Zhang
e221e3c640 feat: enhance gRPC client handling with improved reconnection logic and monitoring 2025-09-12 18:16:52 +08:00
Marvin Zhang
07bb7f8ba9 fix: enhance node metrics handling by checking state before accessing metrics map 2025-09-12 16:34:40 +08:00
Marvin Zhang
316878e129 test: add comprehensive tests for task reconciliation service handling offline nodes 2025-09-12 16:10:00 +08:00
Marvin Zhang
60be5072e5 feat: add node disconnection handling and update task statuses accordingly 2025-09-12 15:40:29 +08:00
Marvin Zhang
05a71da26e refactor: comment out AI chat features in NormalLayout.vue for future implementation 2025-09-12 15:01:29 +08:00
Marvin Zhang
b5f10cb6a8 feat: add TypeScript interfaces for Vuex store modules
- Introduced new interfaces for various store modules including environment, file, git, layout, node, notification alerts, channels, requests, settings, plugins, projects, roles, schedules, spiders, systems, tags, tasks, tokens, and users.
- Each module includes state, getters, mutations, and actions definitions to enhance type safety and maintainability.
- Added utility interfaces for file handling and view-specific types for database, git, login, node, project, result, schedule, spider, task, and user.
- Improved overall structure and organization of TypeScript typings for better developer experience.
2025-09-12 14:57:42 +08:00
Marvin Zhang
14a94ff798 refactor: enhance error logging in writeLogLines to respect circuit breaker state 2025-09-12 14:34:27 +08:00
Marvin Zhang
c0e230e5d8 refactor: rename PING code to HEARTBEAT in node service and update related proto files 2025-09-12 14:17:49 +08:00
Marvin Zhang
d39c265483 feat: add PING message handling for connection health checks
- Implemented PING message handling in TaskServiceServer to acknowledge health check pings.
- Updated isConnectionHealthy method in Runner to use a non-blocking approach for health checks, preventing interference with log streams.
- Introduced lastConnCheck timestamp to optimize health check frequency based on recent activity.
- Added PING code to TaskServiceConnectCode enum in proto definition and generated files.
- Updated gRPC client and server interfaces to support new PING functionality.
2025-09-12 13:58:16 +08:00
Marvin Zhang
333dfd44c0 refactor: implement circuit breaker for log connections to prevent flooding during failures 2025-09-12 13:55:44 +08:00
Marvin Zhang
3edd2a1210 refactor: optimize connection health checks to reduce log stream interference; adjust health check intervals and implement non-blocking pings 2025-08-16 17:42:07 +08:00
Marvin Zhang
65aeb3ed8c feat: add PING mechanism for connection health checks; update proto and generated files
- Introduced PING code in TaskServiceConnectCode enum for health checks.
- Updated Runner to use proper PING messages instead of fake log messages for connection health checks.
- Modified TaskServiceServer to handle PING requests and acknowledge them.
- Adjusted generated gRPC files to reflect changes in proto definitions and ensure compatibility.
2025-08-16 17:19:21 +08:00
Marvin Zhang
babecc46c0 refactor: update DependencySetupDialog to use language key for tag label; modify useDependencyList to dispatch node retrieval on mount 2025-08-08 11:12:05 +08:00
Marvin Zhang
45913ad7e4 refactor: implement health service for master and worker nodes; add health check script and integrate health checks into service lifecycle 2025-08-08 00:05:00 +08:00
Marvin Zhang
78f9e0ca8d refactor: update task worker pool to support dynamic max workers and improve queue management; enhance configuration defaults for node runners and task queue size 2025-08-07 18:16:23 +08:00
Marvin Zhang
6340a9b880 refactor: Move context initialization for graceful shutdown to appropriate locations 2025-08-07 17:27:11 +08:00