Commit Graph

245 Commits

Author SHA1 Message Date
Marvin Zhang
7caa0127dc fix(config): ensure workspace directory is created if it does not exist 2025-04-10 17:27:52 +08:00
Marvin Zhang
c937e0f45f refactor: enhance Spider model and string utility functions
- Updated the Spider model to introduce a new SpiderTemplateParams struct for improved template handling.
- Refactored string utility functions in utils/string.go to include a new replaceChars function, streamlining character replacement across multiple functions.
- Enhanced ToSnakeCase and ToKebabCase functions to utilize the new replaceChars function for better maintainability and readability.
- Added splitStringWithQuotes function to facilitate string manipulation with quotes, improving overall utility in string processing.
2025-01-07 13:21:16 +08:00
Marvin Zhang
c3c629a7d7 feat: add ToKebabCase utility function for string formatting
- Introduced a new function ToKebabCase in utils/string.go to convert strings to kebab-case format.
- The function trims whitespace, converts to lowercase, and replaces spaces and underscores with hyphens, enhancing string manipulation capabilities in the codebase.
2025-01-06 22:37:19 +08:00
Marvin Zhang
8d8b47e474 refactor: streamline file service retrieval and enhance spider template handling
- Replaced direct calls to getBaseFileFsSvc with a new method fs.GetBaseFileFsSvc in base_file.go for improved clarity and maintainability.
- Introduced SpiderTemplateService interface and implemented registry service for managing spider templates, enhancing template handling in the spider controller.
- Added template-related fields to the Spider model to support template functionality.
- Created utility functions for string case conversions in utils/string.go to facilitate consistent formatting across the codebase.
- Updated environment configuration to retrieve the Python path dynamically, improving flexibility in the task runner's setup.
2025-01-06 18:09:45 +08:00
Marvin Zhang
f5d9ccfbfc feat: initialize configuration and enhance IPC handling in task runner tests
- Added configuration initialization in db.go to ensure proper setup of application settings.
- Refactored runner_test.go to streamline IPC message handling by introducing a setupPipe function and an initRunner function for better readability and maintainability.
- Improved synchronization in tests by using channels for signaling readiness and processing, enhancing the reliability of IPC message handling.
- Updated test cases to validate IPC message processing and error handling, ensuring robustness in the task runner's functionality.
2025-01-06 14:41:38 +08:00
Marvin Zhang
8aa801e2ba feat: add Go path configuration to task runner
- Introduced a new method configureGoPath in runner.go to set the GOPATH environment variable based on the retrieved Go path.
- Updated configureEnv to call configureGoPath, ensuring the Go path is configured alongside Node.js paths.
- Added a new utility function GetGoPath in config.go to retrieve the Go path from configuration, with a default fallback.
- These changes enhance the task runner's environment setup by supporting Go development alongside existing Node.js configurations.
2025-01-06 13:42:40 +08:00
Marvin Zhang
37d77f7342 refactor: enhance IPC handling in task runner tests
- Updated IPC reader initialization in runner_test.go to use a channel for signaling readiness, improving synchronization.
- Added error logging when writing to the pipe to enhance traceability during tests.
- These changes improve the reliability and clarity of the test setup for the task runner.
2025-01-03 16:56:36 +08:00
Marvin Zhang
ff5cd32de4 refactor: streamline Node.js path configuration in task runner
- Removed redundant home directory retrieval and nvm checks in the configureNodePath method.
- Introduced a new utility function GetNodeModulesPath to centralize the logic for determining the global node_modules path.
- Updated environment variable setup to use the new utility function, improving clarity and maintainability of the code.
2025-01-03 16:49:24 +08:00
Marvin Zhang
a585ab16f7 feat: enhance task runner with task status updates and process command execution
- Added a task status update to 'processing' at the start of the Run method in runner.go, improving task tracking.
- Removed redundant task status update from the end of the Run method to streamline the execution flow.
- Updated command execution in process.go to use 'bash' instead of 'sh' for better compatibility across environments.
2025-01-03 16:44:38 +08:00
Marvin Zhang
47094b8e64 refactor: update setting routes and enhance dependency management
- Changed route parameter from ':id' to ':key' in settings-related routes for better clarity and consistency.
- Updated GetSetting, PostSetting, and PutSetting functions to use the new ':key' parameter.
- Introduced IsAutoInstallEnabled method in DependencyInstallerService to check auto-installation status.
- Enhanced the task runner to check if auto installation is enabled before proceeding with dependency installation.
- Improved initialization of settings data in the system service, ensuring proper insertion of initial settings.
2025-01-01 22:37:44 +08:00
Marvin Zhang
b056105246 feat: add dependency installer service and enhance task runner with dependency management
- Introduced a new DependencyInstallerService interface to define methods for managing dependency installation commands.
- Implemented registry service for managing the DependencyInstallerService instance.
- Enhanced the task runner to install dependencies if available, including command execution and logging for stdout and stderr.
- Improved error handling and logging throughout the task runner's dependency installation process.
- Updated the runner's methods to utilize the new dependency management features, ensuring better integration and functionality.
2025-01-01 20:51:55 +08:00
Marvin Zhang
136daffa26 refactor: improve IPC handling and logging in task runner tests
- Enhanced the IPC message handling in runner_test.go by adding detailed logging for better traceability.
- Refactored the test setup to use channels for synchronization and improved error handling during message processing.
- Updated the runner.go file to rename variables for clarity and streamline the IPC reader implementation.
- Improved the cleanup process in tests to ensure proper resource management and context cancellation.
2025-01-01 15:18:40 +08:00
Marvin Zhang
db2549e3cd fix: enhance error logging in file log driver and update default task log path
- Improved error messages in the FileLogDriver's cleanup method to include error details for better debugging.
- Updated the default task log path from '/app/logs/tasks' to '/var/log/crawlab/tasks' to ensure consistency across environments.
2025-01-01 14:26:10 +08:00
Marvin Zhang
7e7ac621ec fix: update default task log path for consistency across environments
- Changed the default task log path from '/var/log/crawlab/tasks' to '/app/logs/tasks' to align with the application's directory structure and improve portability in different deployment environments.
2025-01-01 11:54:48 +08:00
Marvin Zhang
7b6805a834 feat: enhance task runner with improved logging and dependency support
- Added support for new dependency file types: 'go.mod' and 'pom.xml' in dependency.go.
- Refactored command configuration in runner.go to improve logging and error handling.
- Introduced a new method to configure Node.js paths, enhancing environment setup for tasks.
- Enhanced IPC message handling with detailed logging for better traceability.
- Updated service logging to remove unnecessary prefixes for cleaner output.
- Improved command execution handling in process.go for better compatibility across platforms.
2024-12-31 22:52:21 +08:00
Marvin Zhang
25fe273a62 refactor: improve logging in gRPC services by removing service prefixes
- Updated log messages in NodeServiceServer and TaskServiceServer to remove the "[NodeServiceServer]" and "[TaskServiceServer]" prefixes for cleaner output.
- This change enhances log readability and maintains consistency across logging practices in the application.
2024-12-31 13:30:02 +08:00
Marvin Zhang
9bdb0c969f refactor: remove default_version field from DependencyConfig
- Eliminated the default_version field from the DependencyConfig struct in dependency_config.go to streamline the configuration model.
- This change simplifies the dependency management process and aligns with recent updates in the application structure.
2024-12-30 15:14:21 +08:00
Marvin Zhang
ef499a03e0 fix: improve logging in master and worker services
- Added logging for error handling in the MasterService when setting a worker node offline, replacing the previous trace.PrintError with a more informative log message.
- Enhanced WorkerService subscription method with debug logs to indicate subscription attempts and status, improving traceability during connection processes.
2024-12-29 19:19:36 +08:00
Marvin Zhang
54800974eb feat: refactor system info retrieval and enhance logo output
- Replaced viper calls with utility functions in GetSystemInfo to improve code clarity and maintainability.
- Added a new system.go file with utility functions for retrieving system version and edition information.
- Enhanced PrintLogoWithWelcomeInfo to include detailed system information, improving user experience during server startup.
- Updated output formatting for better readability and consistency in welcome messages.
2024-12-29 19:01:23 +08:00
Marvin Zhang
17f8917d0a chore: update dependencies and enhance gRPC services
- Updated various dependencies in go.mod and go.sum files, including cloud.google.com/go/compute/metadata to v0.5.2, google.golang.org/grpc to v1.69.2, and google.golang.org/protobuf to v1.36.1.
- Refactored gRPC service definitions to use the latest protoc-gen-go and protoc versions, ensuring compatibility with the latest gRPC-Go features.
- Introduced a new logger utility in core/utils/logger.go to improve logging capabilities across the application.
- Added a README.md for gRPC setup and compilation guidance, enhancing developer experience.
- Improved the Python installation script to handle version listing more effectively and ensure user-specific environment setup.
2024-12-25 17:46:49 +08:00
Marvin Zhang
9e67e50c6c fix: add new line after logo and welcome info prints
- Added a new line after the logo and welcome information prints in the PrintLogoWithWelcomeInfo function to improve output readability.
- This change enhances the user experience by ensuring that the printed information is visually separated, making it easier to read during server startup.
2024-12-25 14:22:18 +08:00
Marvin Zhang
2a33bd40f5 chore: update dependencies in go.mod and go.sum files
- Added github.com/common-nighthawk/go-figure as a new indirect dependency in both backend and core modules to support logo rendering functionality.
- Removed the github.com/imroc/req dependency from backend and go.sum files, streamlining the dependency list and improving project organization.
2024-12-25 14:16:38 +08:00
Marvin Zhang
a13893b627 feat: add logo printing functionality and update logger usage
- Introduced a new utility function to print a logo and welcome information for the Crawlab server, enhancing user experience during startup.
- Updated logger variable names in the apps package for consistency and clarity.
- Added a new dependency on github.com/common-nighthawk/go-figure to facilitate logo rendering.
- Improved the server command to display the logo when the server starts, provided the user is not using the pro version.
2024-12-25 13:08:56 +08:00
Marvin Zhang
dc59599509 refactor: remove db module and update imports to core/mongo
- Deleted the db module, consolidating database-related functionality into the core/mongo package for better organization and maintainability.
- Updated all import paths across the codebase to replace references to the removed db module with core/mongo.
- Cleaned up unused code and dependencies, enhancing overall project clarity and reducing complexity.
- This refactor improves the structure of the codebase by centralizing database operations and simplifying module management.
2024-12-25 10:28:21 +08:00
Marvin Zhang
a28ffbf66c refactor: simplify interfaces and improve configuration handling
- Removed unused ApiApp and ServerApp interfaces from core/apps/interfaces.go to streamline the codebase.
- Updated the GetApi method in the Server struct to return a pointer to the Api type for better type handling.
- Simplified the GetGinMode function in core/utils/config.go to always return gin.ReleaseMode, removing unnecessary conditional checks for development mode.
- These changes enhance code clarity and maintainability by eliminating redundant code and improving type safety.
2024-12-24 23:05:41 +08:00
Marvin Zhang
3276083994 refactor: replace apex/log with structured logger across multiple services
- Replaced all instances of apex/log with a structured logger interface in various services, including Api, Server, Config, and others, to enhance logging consistency and context.
- Updated logging calls to utilize the new logger methods, improving error tracking and service monitoring.
- Added logger initialization in services and controllers to ensure proper logging setup.
- Improved error handling and logging messages for better clarity during service operations.
- Removed unused apex/log imports and cleaned up related code for better maintainability.
2024-12-24 19:11:19 +08:00
Marvin Zhang
e064889795 refactor: replace apex/log with structured logger in master and worker services
- Removed direct usage of apex/log in favor of a structured logger interface for improved logging consistency and context.
- Updated logging calls in MasterService and WorkerService to utilize the new logger, enhancing error tracking and service monitoring.
- Added logger initialization in both services to ensure proper logging setup.
- Improved error handling and logging messages for better clarity during service operations.
2024-12-23 21:45:38 +08:00
Marvin Zhang
99ed4396d1 refactor: improve logging messages and update configuration constants
- Updated logging messages in GrpcClient to provide clearer context, changing "ready" to "client is now ready" and "stopped" to "client has stopped".
- Refactored test setup in runner_test.go to remove unnecessary error checks during gRPC client start for cleaner code.
- Renamed GetDependencySetupScriptRoot to GetInstallRoot and updated related constants for better clarity and consistency in configuration management.
2024-12-23 18:19:08 +08:00
Marvin Zhang
c3f4c4ae05 feat: enhance gRPC client with structured logging and dependency actions
- Added DependencyActionSync and DependencyActionSetup constants to improve dependency management.
- Refactored GrpcClient to utilize a logger interface for consistent logging across connection states and errors.
- Updated Start, Stop, and connection methods to replace direct log calls with logger methods, enhancing log context and readability.
- Simplified test cases by removing error checks on gRPC client start, ensuring cleaner test setup.
2024-12-23 17:17:21 +08:00
Marvin Zhang
34455dc47c feat: enhance DependencyConfig model with additional fields
- Added TotalDependencies and SearchReady fields to the DependencyConfig struct for improved dependency management.
- Retained the Setup field for backward compatibility while ensuring proper JSON and BSON handling.
2024-12-23 15:18:04 +08:00
Marvin Zhang
ed8fb78c3b feat: expand Logger interface and implement additional logging methods in ServiceLogger
- Added Debug, Info, Warn, Error, and Fatal methods to the Logger interface for comprehensive logging capabilities.
- Implemented corresponding methods in ServiceLogger to facilitate structured logging with service context.
- Enhanced the logging functionality to support various log levels, improving error tracking and debugging.
2024-12-23 14:25:48 +08:00
Marvin Zhang
bdc347aef7 feat: add Fatalf logging method to Logger interface and ServiceLogger
- Introduced Fatalf method in Logger interface for logging fatal messages with formatted content.
- Implemented Fatalf in ServiceLogger to log fatal messages and exit the program, enhancing error handling capabilities.
2024-12-23 13:15:18 +08:00
Marvin Zhang
afe21b7ca0 feat: implement logging interfaces and service logger
- Added Logger interface in core/interfaces/logger.go for standardized logging methods (Debugf, Infof, Warnf, Errorf).
- Introduced ServiceLogger in core/utils/log.go, which prefixes log messages with the service name for better context.
- Implemented logging methods in ServiceLogger to utilize the apex/log package for structured logging.
2024-12-21 23:00:30 +08:00
Marvin Zhang
e44b416e34 feat: enhance model base service with BSON ID normalization
- Added utility function to normalize BSON ObjectId in query parameters for gRPC methods.
- Updated GetOne, GetMany, DeleteOne, DeleteMany, UpdateOne, UpdateMany, ReplaceOne, and UpsertOne methods to utilize the new normalization function.
- Introduced new DependencyConfigSetup model instance in base model definitions for improved structure.
2024-12-21 22:07:54 +08:00
Marvin Zhang
29af5a366b feat: enhance gRPC client with state management and reconnection logic
- Introduced state management in GrpcClient to monitor and handle connection states effectively.
- Added a reconnect channel and a state monitoring goroutine to facilitate automatic reconnections on state changes.
- Updated the connect method to initiate a reconnection loop upon connection loss.
- Enhanced logging for connection state changes and errors during connection attempts.
- Refactored tests to ensure proper initialization of gRPC client and server, improving test reliability and coverage.
2024-12-21 21:41:00 +08:00
Marvin Zhang
75da4fff0f refactor: update gRPC client initialization in model service tests
- Replaced direct gRPC client instantiation with a call to GetGrpcClient().Start() for improved consistency and maintainability across test cases.
- Cleaned up unnecessary imports in model_service.go to streamline the codebase.
- Enhanced test setup by ensuring the gRPC client is properly initialized before executing service methods.
2024-12-21 20:27:02 +08:00
Marvin Zhang
41e973a4d3 fix: misaligned nodes when running tasks from a schedule through enhancement by adding new route and handler for running schedules
- Introduced a new POST endpoint '/:id/run' in the router to trigger the execution of a schedule.
- Implemented the PostScheduleRun function to handle the execution logic, including validation of input parameters and scheduling tasks.
- Refactored the PostSpiderRun function to streamline the scheduling process by directly calling the admin service.
- Updated the FsFileInfo struct to clarify the children field description.
- Modified the Task model to make NodeIds field optional in JSON serialization.
2024-12-21 15:33:07 +08:00
Marvin Zhang
b88b41afe2 feat: add API configuration options for port and path
- Introduced GetApiPort and GetApiPath functions to retrieve API port and path from configuration, with defaults provided.
- Updated GetApiEndpoint to construct the API endpoint URL using the new port and path functions, ensuring proper formatting.
- Added constants for default API port and path to enhance configurability.
2024-12-21 14:05:50 +08:00
Marvin Zhang
c897fb58e4 refactor: streamline error handling and improve HTTP request management
- Removed the print flag from handleError function, simplifying error logging based on the development environment.
- Introduced a new performRequest function for standardized HTTP requests with JSON bodies, enhancing code reusability.
- Updated SendIMNotification and related functions to utilize the new RequestParam type for better clarity and consistency.
- Normalized HTTP request paths in the createHttpRequest method to ensure correct URL formatting.
- Added detailed error logging for JSON unmarshaling failures in syncFiles method.
- Introduced a NewHttpClient function to create HTTP clients with customizable timeouts.
2024-12-21 11:27:58 +08:00
Marvin Zhang
3cb74d76f9 feat: enhance gRPC client functionality and improve logging
- Added WaitForReady method to GrpcClient for blocking until the client is ready.
- Updated WorkerService to utilize WaitForReady for ensuring gRPC client readiness before starting.
- Refactored ModelService to consistently use GetGrpcClient for context management.
- Changed logging level for received metrics in MetricServiceServer from Info to Debug.
- Modified error handling in HandleError to conditionally print errors based on the environment.
- Cleaned up unused GrpcClient references in various services, improving code clarity.
2024-12-20 20:34:04 +08:00
Marvin Zhang
f5631f965d chore: updated deps 2024-12-20 12:00:14 +08:00
Marvin Zhang
be93f9d17d feat: added retry for worker node start 2024-12-20 11:40:21 +08:00
Marvin Zhang
f736b2c58e fix: getting stream error for dependency server 2024-12-18 17:43:41 +08:00
Marvin Zhang
e77d4cdd31 feat: updated dependency management 2024-12-17 19:32:16 +08:00
Marvin Zhang
79c1d5d14b feat: updated dependency config setup 2024-12-16 21:44:03 +08:00
Marvin Zhang
c5c08dfba6 feat: added dependency config setup (wip) 2024-12-15 23:09:10 +08:00
Marvin Zhang
25580b4694 feat: prepare for dependency lang install/setup 2024-12-14 22:18:22 +08:00
Marvin Zhang
298420b0a6 feat: updated the way syncing pypi projects 2024-12-14 21:47:11 +08:00
Marvin Zhang
272371d9ce feat: allow set max runners for nodes 2024-12-11 22:05:34 +08:00
Marvin Zhang
1fe74fa8a5 fix: optimized node runners calculation 2024-12-11 20:43:40 +08:00