Commit Graph

98 Commits

Author SHA1 Message Date
Marvin Zhang
14a94ff798 refactor: enhance error logging in writeLogLines to respect circuit breaker state 2025-09-12 14:34:27 +08:00
Marvin Zhang
d39c265483 feat: add PING message handling for connection health checks
- Implemented PING message handling in TaskServiceServer to acknowledge health check pings.
- Updated isConnectionHealthy method in Runner to use a non-blocking approach for health checks, preventing interference with log streams.
- Introduced lastConnCheck timestamp to optimize health check frequency based on recent activity.
- Added PING code to TaskServiceConnectCode enum in proto definition and generated files.
- Updated gRPC client and server interfaces to support new PING functionality.
2025-09-12 13:58:16 +08:00
Marvin Zhang
333dfd44c0 refactor: implement circuit breaker for log connections to prevent flooding during failures 2025-09-12 13:55:44 +08:00
Marvin Zhang
3edd2a1210 refactor: optimize connection health checks to reduce log stream interference; adjust health check intervals and implement non-blocking pings 2025-08-16 17:42:07 +08:00
Marvin Zhang
65aeb3ed8c feat: add PING mechanism for connection health checks; update proto and generated files
- Introduced PING code in TaskServiceConnectCode enum for health checks.
- Updated Runner to use proper PING messages instead of fake log messages for connection health checks.
- Modified TaskServiceServer to handle PING requests and acknowledge them.
- Adjusted generated gRPC files to reflect changes in proto definitions and ensure compatibility.
2025-08-16 17:19:21 +08:00
Marvin Zhang
78f9e0ca8d refactor: update task worker pool to support dynamic max workers and improve queue management; enhance configuration defaults for node runners and task queue size 2025-08-07 18:16:23 +08:00
Marvin Zhang
6340a9b880 refactor: Move context initialization for graceful shutdown to appropriate locations 2025-08-07 17:27:11 +08:00
Marvin Zhang
6912b92501 refactor: enhance context handling across task runner and service components; ensure proper cancellation chains and prevent goroutine leaks 2025-08-07 15:40:48 +08:00
Marvin Zhang
e1251d808b refactor: update method receivers to value type for cleanup and connection methods; enhance context usage for task client operations 2025-08-07 11:53:42 +08:00
Marvin Zhang
d042bc8cd7 refactor: improve connection readiness check and enhance goroutine management in gRPC client; ensure proper context handling in stream listeners 2025-08-07 11:12:46 +08:00
Marvin Zhang
44dd68918f refactor: improve goroutine management and context handling in task and stream operations; ensure graceful shutdown and prevent leaks 2025-08-07 00:16:46 +08:00
Marvin Zhang
784ffc8b52 feat: implement task management service operations, stream manager, and worker pool
- Added service_operations.go for task management including run, cancel, and execution logic.
- Introduced stream_manager.go to handle task streams and manage cancellation signals.
- Created worker_pool.go to manage a bounded pool of workers for executing tasks concurrently.
- Implemented graceful shutdown and cleanup mechanisms for task runners and streams.
- Enhanced error handling and logging throughout the task management process.
2025-08-06 18:29:08 +08:00
Marvin Zhang
3678d14082 feat: implement bounded goroutine pools for task execution and notification handling; enhance task scheduler with graceful shutdown and cleanup routines; update metric component for new time range options 2025-08-06 17:57:37 +08:00
Marvin Zhang
a2d13fae36 feat: temporarily disable batch file saving route and implement alternative handler in spider controller 2025-07-23 14:55:04 +08:00
Marvin Zhang
46c0cd6298 refactor: update gRPC client access patterns to use safe getter methods for improved error handling 2025-07-08 18:08:46 +08:00
Marvin Zhang
00daa0ed96 fix: enhance gRPC client reconnection logic and add goroutine monitoring for potential leaks 2025-07-08 13:39:39 +08:00
Marvin Zhang
92046a8c2e fix: improve task cancellation and connection health check logic with timeout handling 2025-06-27 14:02:24 +08:00
Marvin Zhang
9f251f3ebe fix: enhance task cancellation logic with graceful termination and stuck task cleanup 2025-06-27 13:50:21 +08:00
Marvin Zhang
89514b0154 feat: implement zombie process prevention and cleanup mechanisms in task runner 2025-06-23 13:54:43 +08:00
Marvin Zhang
1008886715 fix: enhance task service resilience with connection health monitoring and periodic cleanup 2025-06-23 11:57:05 +08:00
Marvin Zhang
13038d4a1a fix: unable to sync files from master in worker nodes 2025-06-20 14:42:52 +08:00
Marvin Zhang
09cfe37272 fix: unable to sync files to worker nodes when running tasks 2025-06-18 21:50:53 +08:00
Marvin Zhang
79b7e074e1 refactor: update filter parameter in API requests and improve component structure 2025-06-12 00:17:40 +08:00
Marvin Zhang
622dce51c3 fix: goroutine cleanup and error handling during shutdown 2025-06-11 22:30:37 +08:00
Marvin Zhang
8a00af115c feat: refactor to remove unused 'getAllList' methods and related state properties 2025-06-09 10:20:39 +08:00
Marvin Zhang
8d32d54fe8 feat: add development and index generation run configurations; update SVG files and improve icon styles 2025-06-06 14:59:59 +08:00
Marvin Zhang
1aa58f5065 refactor: remove unused task type field in runner test setup
- Eliminated the unused 'Type' field from the Task model in the runner test setup to enhance code clarity and maintainability.
2025-04-22 16:50:56 +08:00
Marvin Zhang
4f57d277e7 refactor: standardize timestamp fields and improve code clarity
- Updated timestamp fields across the codebase from `*_ts` to `*_at` for consistency and clarity.
- Renamed constants for node status from "on"/"off" to "online"/"offline" to better reflect their meanings.
- Enhanced validation and error handling in various components to ensure data integrity.
- Refactored test cases to align with the new naming conventions and improve readability.
2025-04-21 18:13:22 +08:00
Marvin Zhang
1ce6f87ad5 feat: add global node_bin path configuration in Runner
- Introduced a new method to configure the global node_bin path in the Runner, ensuring it is included in the system PATH if not already present.
- Added GetNodeBinPath function in utils to retrieve the node_bin path from configuration, with a default fallback.
- Enhanced environment variable management for better integration of Node.js binaries.
2025-04-15 20:55:47 +08:00
Marvin Zhang
ce0143ca06 refactor: enhance health check function and add comprehensive test coverage
- Updated GetHealthFn to return an error for better error handling and clarity.
- Introduced a new test file for schedule management, covering various endpoints including creation, retrieval, updating, and deletion of schedules.
- Added tests for task management, including task creation, retrieval, updating, and cancellation.
- Implemented utility tests for filtering and response generation to ensure consistent API behavior.
- Improved logging in the task scheduler service for better traceability.
2025-03-13 18:10:24 +08:00
Marvin Zhang
67181700c8 feat: improve task runner environment configuration
- Remove Crawlab-specific environment variables from the task runner's environment
- Automatically create workspace directory if it doesn't exist
- Enhance environment setup to prevent potential configuration conflicts
2025-02-14 14:02:04 +08:00
Marvin Zhang
8d8b47e474 refactor: streamline file service retrieval and enhance spider template handling
- Replaced direct calls to getBaseFileFsSvc with a new method fs.GetBaseFileFsSvc in base_file.go for improved clarity and maintainability.
- Introduced SpiderTemplateService interface and implemented registry service for managing spider templates, enhancing template handling in the spider controller.
- Added template-related fields to the Spider model to support template functionality.
- Created utility functions for string case conversions in utils/string.go to facilitate consistent formatting across the codebase.
- Updated environment configuration to retrieve the Python path dynamically, improving flexibility in the task runner's setup.
2025-01-06 18:09:45 +08:00
Marvin Zhang
f5d9ccfbfc feat: initialize configuration and enhance IPC handling in task runner tests
- Added configuration initialization in db.go to ensure proper setup of application settings.
- Refactored runner_test.go to streamline IPC message handling by introducing a setupPipe function and an initRunner function for better readability and maintainability.
- Improved synchronization in tests by using channels for signaling readiness and processing, enhancing the reliability of IPC message handling.
- Updated test cases to validate IPC message processing and error handling, ensuring robustness in the task runner's functionality.
2025-01-06 14:41:38 +08:00
Marvin Zhang
8aa801e2ba feat: add Go path configuration to task runner
- Introduced a new method configureGoPath in runner.go to set the GOPATH environment variable based on the retrieved Go path.
- Updated configureEnv to call configureGoPath, ensuring the Go path is configured alongside Node.js paths.
- Added a new utility function GetGoPath in config.go to retrieve the Go path from configuration, with a default fallback.
- These changes enhance the task runner's environment setup by supporting Go development alongside existing Node.js configurations.
2025-01-06 13:42:40 +08:00
Marvin Zhang
37d77f7342 refactor: enhance IPC handling in task runner tests
- Updated IPC reader initialization in runner_test.go to use a channel for signaling readiness, improving synchronization.
- Added error logging when writing to the pipe to enhance traceability during tests.
- These changes improve the reliability and clarity of the test setup for the task runner.
2025-01-03 16:56:36 +08:00
Marvin Zhang
ff5cd32de4 refactor: streamline Node.js path configuration in task runner
- Removed redundant home directory retrieval and nvm checks in the configureNodePath method.
- Introduced a new utility function GetNodeModulesPath to centralize the logic for determining the global node_modules path.
- Updated environment variable setup to use the new utility function, improving clarity and maintainability of the code.
2025-01-03 16:49:24 +08:00
Marvin Zhang
a585ab16f7 feat: enhance task runner with task status updates and process command execution
- Added a task status update to 'processing' at the start of the Run method in runner.go, improving task tracking.
- Removed redundant task status update from the end of the Run method to streamline the execution flow.
- Updated command execution in process.go to use 'bash' instead of 'sh' for better compatibility across environments.
2025-01-03 16:44:38 +08:00
Marvin Zhang
47094b8e64 refactor: update setting routes and enhance dependency management
- Changed route parameter from ':id' to ':key' in settings-related routes for better clarity and consistency.
- Updated GetSetting, PostSetting, and PutSetting functions to use the new ':key' parameter.
- Introduced IsAutoInstallEnabled method in DependencyInstallerService to check auto-installation status.
- Enhanced the task runner to check if auto installation is enabled before proceeding with dependency installation.
- Improved initialization of settings data in the system service, ensuring proper insertion of initial settings.
2025-01-01 22:37:44 +08:00
Marvin Zhang
b056105246 feat: add dependency installer service and enhance task runner with dependency management
- Introduced a new DependencyInstallerService interface to define methods for managing dependency installation commands.
- Implemented registry service for managing the DependencyInstallerService instance.
- Enhanced the task runner to install dependencies if available, including command execution and logging for stdout and stderr.
- Improved error handling and logging throughout the task runner's dependency installation process.
- Updated the runner's methods to utilize the new dependency management features, ensuring better integration and functionality.
2025-01-01 20:51:55 +08:00
Marvin Zhang
136daffa26 refactor: improve IPC handling and logging in task runner tests
- Enhanced the IPC message handling in runner_test.go by adding detailed logging for better traceability.
- Refactored the test setup to use channels for synchronization and improved error handling during message processing.
- Updated the runner.go file to rename variables for clarity and streamline the IPC reader implementation.
- Improved the cleanup process in tests to ensure proper resource management and context cancellation.
2025-01-01 15:18:40 +08:00
Marvin Zhang
db2549e3cd fix: enhance error logging in file log driver and update default task log path
- Improved error messages in the FileLogDriver's cleanup method to include error details for better debugging.
- Updated the default task log path from '/app/logs/tasks' to '/var/log/crawlab/tasks' to ensure consistency across environments.
2025-01-01 14:26:10 +08:00
Marvin Zhang
7b6805a834 feat: enhance task runner with improved logging and dependency support
- Added support for new dependency file types: 'go.mod' and 'pom.xml' in dependency.go.
- Refactored command configuration in runner.go to improve logging and error handling.
- Introduced a new method to configure Node.js paths, enhancing environment setup for tasks.
- Enhanced IPC message handling with detailed logging for better traceability.
- Updated service logging to remove unnecessary prefixes for cleaner output.
- Improved command execution handling in process.go for better compatibility across platforms.
2024-12-31 22:52:21 +08:00
Marvin Zhang
dc59599509 refactor: remove db module and update imports to core/mongo
- Deleted the db module, consolidating database-related functionality into the core/mongo package for better organization and maintainability.
- Updated all import paths across the codebase to replace references to the removed db module with core/mongo.
- Cleaned up unused code and dependencies, enhancing overall project clarity and reducing complexity.
- This refactor improves the structure of the codebase by centralizing database operations and simplifying module management.
2024-12-25 10:28:21 +08:00
Marvin Zhang
3276083994 refactor: replace apex/log with structured logger across multiple services
- Replaced all instances of apex/log with a structured logger interface in various services, including Api, Server, Config, and others, to enhance logging consistency and context.
- Updated logging calls to utilize the new logger methods, improving error tracking and service monitoring.
- Added logger initialization in services and controllers to ensure proper logging setup.
- Improved error handling and logging messages for better clarity during service operations.
- Removed unused apex/log imports and cleaned up related code for better maintainability.
2024-12-24 19:11:19 +08:00
Marvin Zhang
99ed4396d1 refactor: improve logging messages and update configuration constants
- Updated logging messages in GrpcClient to provide clearer context, changing "ready" to "client is now ready" and "stopped" to "client has stopped".
- Refactored test setup in runner_test.go to remove unnecessary error checks during gRPC client start for cleaner code.
- Renamed GetDependencySetupScriptRoot to GetInstallRoot and updated related constants for better clarity and consistency in configuration management.
2024-12-23 18:19:08 +08:00
Marvin Zhang
29af5a366b feat: enhance gRPC client with state management and reconnection logic
- Introduced state management in GrpcClient to monitor and handle connection states effectively.
- Added a reconnect channel and a state monitoring goroutine to facilitate automatic reconnections on state changes.
- Updated the connect method to initiate a reconnection loop upon connection loss.
- Enhanced logging for connection state changes and errors during connection attempts.
- Refactored tests to ensure proper initialization of gRPC client and server, improving test reliability and coverage.
2024-12-21 21:41:00 +08:00
Marvin Zhang
c897fb58e4 refactor: streamline error handling and improve HTTP request management
- Removed the print flag from handleError function, simplifying error logging based on the development environment.
- Introduced a new performRequest function for standardized HTTP requests with JSON bodies, enhancing code reusability.
- Updated SendIMNotification and related functions to utilize the new RequestParam type for better clarity and consistency.
- Normalized HTTP request paths in the createHttpRequest method to ensure correct URL formatting.
- Added detailed error logging for JSON unmarshaling failures in syncFiles method.
- Introduced a NewHttpClient function to create HTTP clients with customizable timeouts.
2024-12-21 11:27:58 +08:00
Marvin Zhang
3cb74d76f9 feat: enhance gRPC client functionality and improve logging
- Added WaitForReady method to GrpcClient for blocking until the client is ready.
- Updated WorkerService to utilize WaitForReady for ensuring gRPC client readiness before starting.
- Refactored ModelService to consistently use GetGrpcClient for context management.
- Changed logging level for received metrics in MetricServiceServer from Info to Debug.
- Modified error handling in HandleError to conditionally print errors based on the environment.
- Cleaned up unused GrpcClient references in various services, improving code clarity.
2024-12-20 20:34:04 +08:00
Marvin Zhang
f736b2c58e fix: getting stream error for dependency server 2024-12-18 17:43:41 +08:00
Marvin Zhang
272371d9ce feat: allow set max runners for nodes 2024-12-11 22:05:34 +08:00