Commit Graph

255 Commits

Author SHA1 Message Date
Marvin Zhang
14fcf2ba11 feat: add LLM provider and model data models
Introduced two new data models for managing Language Learning Models (LLMs):
- LLMProvider: Represents LLM providers like OpenAI, Anthropic
- LLMModel: Represents specific models within a provider

Models include key attributes such as:
- Naming and display information
- Enabled/priority status
- Supported features
- Token pricing
- Configuration schemas
2025-03-09 20:10:37 +08:00
Marvin Zhang
00e0352ef7 Remove core/docs directory and related files
Deleted the entire core/docs directory, which contained:
- .gitignore
- package.json
- API documentation files (index.html, openapi.yaml)
- Publishing script for documentation
2025-03-04 22:58:12 +08:00
Marvin Zhang
a1f870715f feat: add GET endpoint for retrieving user's token list
Implemented a new GetTokenList handler in the token controller to:
- Fetch tokens created by the current user
- Support pagination and filtering
- Return tokens with total count
- Handle cases with no documents gracefully
2025-02-27 16:39:51 +08:00
Marvin Zhang
6048d4eeb8 fix: missing routes in get me api
https://github.com/crawlab-team/crawlab/issues/1550
2025-02-27 13:33:48 +08:00
Marvin Zhang
fae5c62b0a refactor: reorder fields in Spider struct for improved readability 2025-02-26 22:07:34 +08:00
Marvin Zhang
3237923f02 fix: prevent unnecessary error handling in export download endpoint
Adds an early return after handling bad request errors in the GetExportDownload function to avoid unnecessary error processing and improve error handling clarity.
2025-02-17 16:27:39 +08:00
Marvin Zhang
67181700c8 feat: improve task runner environment configuration
- Remove Crawlab-specific environment variables from the task runner's environment
- Automatically create workspace directory if it doesn't exist
- Enhance environment setup to prevent potential configuration conflicts
2025-02-14 14:02:04 +08:00
Marvin Zhang
4317a03971 fix: reset DefaultInstallEnvs to an empty string
Reverts the previous change that set DefaultInstallEnvs to "node,browser", returning it to an empty default value to maintain configuration flexibility.
2025-02-14 12:45:05 +08:00
Marvin Zhang
63cb7d445a feat: update DefaultInstallEnvs to include node and browser 2025-02-14 11:24:38 +08:00
Marvin Zhang
c4d0836063 feat: add GetInstallEnvs utility function for configuration management
- Introduced a new function GetInstallEnvs in config.go to retrieve installation environment variables
- Added DefaultInstallEnvs constant with an empty default value
- Implemented fallback mechanism to split DefaultInstallEnvs if no configuration is found
- Enhances configuration flexibility for installation environments
2025-02-14 11:16:32 +08:00
Marvin Zhang
51835947a3 feat: add spacing to logo output for improved readability 2025-02-11 13:27:52 +08:00
Marvin Zhang
c937e0f45f refactor: enhance Spider model and string utility functions
- Updated the Spider model to introduce a new SpiderTemplateParams struct for improved template handling.
- Refactored string utility functions in utils/string.go to include a new replaceChars function, streamlining character replacement across multiple functions.
- Enhanced ToSnakeCase and ToKebabCase functions to utilize the new replaceChars function for better maintainability and readability.
- Added splitStringWithQuotes function to facilitate string manipulation with quotes, improving overall utility in string processing.
2025-01-07 13:21:16 +08:00
Marvin Zhang
c3c629a7d7 feat: add ToKebabCase utility function for string formatting
- Introduced a new function ToKebabCase in utils/string.go to convert strings to kebab-case format.
- The function trims whitespace, converts to lowercase, and replaces spaces and underscores with hyphens, enhancing string manipulation capabilities in the codebase.
2025-01-06 22:37:19 +08:00
Marvin Zhang
8d8b47e474 refactor: streamline file service retrieval and enhance spider template handling
- Replaced direct calls to getBaseFileFsSvc with a new method fs.GetBaseFileFsSvc in base_file.go for improved clarity and maintainability.
- Introduced SpiderTemplateService interface and implemented registry service for managing spider templates, enhancing template handling in the spider controller.
- Added template-related fields to the Spider model to support template functionality.
- Created utility functions for string case conversions in utils/string.go to facilitate consistent formatting across the codebase.
- Updated environment configuration to retrieve the Python path dynamically, improving flexibility in the task runner's setup.
2025-01-06 18:09:45 +08:00
Marvin Zhang
f5d9ccfbfc feat: initialize configuration and enhance IPC handling in task runner tests
- Added configuration initialization in db.go to ensure proper setup of application settings.
- Refactored runner_test.go to streamline IPC message handling by introducing a setupPipe function and an initRunner function for better readability and maintainability.
- Improved synchronization in tests by using channels for signaling readiness and processing, enhancing the reliability of IPC message handling.
- Updated test cases to validate IPC message processing and error handling, ensuring robustness in the task runner's functionality.
2025-01-06 14:41:38 +08:00
Marvin Zhang
8aa801e2ba feat: add Go path configuration to task runner
- Introduced a new method configureGoPath in runner.go to set the GOPATH environment variable based on the retrieved Go path.
- Updated configureEnv to call configureGoPath, ensuring the Go path is configured alongside Node.js paths.
- Added a new utility function GetGoPath in config.go to retrieve the Go path from configuration, with a default fallback.
- These changes enhance the task runner's environment setup by supporting Go development alongside existing Node.js configurations.
2025-01-06 13:42:40 +08:00
Marvin Zhang
37d77f7342 refactor: enhance IPC handling in task runner tests
- Updated IPC reader initialization in runner_test.go to use a channel for signaling readiness, improving synchronization.
- Added error logging when writing to the pipe to enhance traceability during tests.
- These changes improve the reliability and clarity of the test setup for the task runner.
2025-01-03 16:56:36 +08:00
Marvin Zhang
ff5cd32de4 refactor: streamline Node.js path configuration in task runner
- Removed redundant home directory retrieval and nvm checks in the configureNodePath method.
- Introduced a new utility function GetNodeModulesPath to centralize the logic for determining the global node_modules path.
- Updated environment variable setup to use the new utility function, improving clarity and maintainability of the code.
2025-01-03 16:49:24 +08:00
Marvin Zhang
a585ab16f7 feat: enhance task runner with task status updates and process command execution
- Added a task status update to 'processing' at the start of the Run method in runner.go, improving task tracking.
- Removed redundant task status update from the end of the Run method to streamline the execution flow.
- Updated command execution in process.go to use 'bash' instead of 'sh' for better compatibility across environments.
2025-01-03 16:44:38 +08:00
Marvin Zhang
47094b8e64 refactor: update setting routes and enhance dependency management
- Changed route parameter from ':id' to ':key' in settings-related routes for better clarity and consistency.
- Updated GetSetting, PostSetting, and PutSetting functions to use the new ':key' parameter.
- Introduced IsAutoInstallEnabled method in DependencyInstallerService to check auto-installation status.
- Enhanced the task runner to check if auto installation is enabled before proceeding with dependency installation.
- Improved initialization of settings data in the system service, ensuring proper insertion of initial settings.
2025-01-01 22:37:44 +08:00
Marvin Zhang
b056105246 feat: add dependency installer service and enhance task runner with dependency management
- Introduced a new DependencyInstallerService interface to define methods for managing dependency installation commands.
- Implemented registry service for managing the DependencyInstallerService instance.
- Enhanced the task runner to install dependencies if available, including command execution and logging for stdout and stderr.
- Improved error handling and logging throughout the task runner's dependency installation process.
- Updated the runner's methods to utilize the new dependency management features, ensuring better integration and functionality.
2025-01-01 20:51:55 +08:00
Marvin Zhang
136daffa26 refactor: improve IPC handling and logging in task runner tests
- Enhanced the IPC message handling in runner_test.go by adding detailed logging for better traceability.
- Refactored the test setup to use channels for synchronization and improved error handling during message processing.
- Updated the runner.go file to rename variables for clarity and streamline the IPC reader implementation.
- Improved the cleanup process in tests to ensure proper resource management and context cancellation.
2025-01-01 15:18:40 +08:00
Marvin Zhang
db2549e3cd fix: enhance error logging in file log driver and update default task log path
- Improved error messages in the FileLogDriver's cleanup method to include error details for better debugging.
- Updated the default task log path from '/app/logs/tasks' to '/var/log/crawlab/tasks' to ensure consistency across environments.
2025-01-01 14:26:10 +08:00
Marvin Zhang
7e7ac621ec fix: update default task log path for consistency across environments
- Changed the default task log path from '/var/log/crawlab/tasks' to '/app/logs/tasks' to align with the application's directory structure and improve portability in different deployment environments.
2025-01-01 11:54:48 +08:00
Marvin Zhang
7b6805a834 feat: enhance task runner with improved logging and dependency support
- Added support for new dependency file types: 'go.mod' and 'pom.xml' in dependency.go.
- Refactored command configuration in runner.go to improve logging and error handling.
- Introduced a new method to configure Node.js paths, enhancing environment setup for tasks.
- Enhanced IPC message handling with detailed logging for better traceability.
- Updated service logging to remove unnecessary prefixes for cleaner output.
- Improved command execution handling in process.go for better compatibility across platforms.
2024-12-31 22:52:21 +08:00
Marvin Zhang
25fe273a62 refactor: improve logging in gRPC services by removing service prefixes
- Updated log messages in NodeServiceServer and TaskServiceServer to remove the "[NodeServiceServer]" and "[TaskServiceServer]" prefixes for cleaner output.
- This change enhances log readability and maintains consistency across logging practices in the application.
2024-12-31 13:30:02 +08:00
Marvin Zhang
9bdb0c969f refactor: remove default_version field from DependencyConfig
- Eliminated the default_version field from the DependencyConfig struct in dependency_config.go to streamline the configuration model.
- This change simplifies the dependency management process and aligns with recent updates in the application structure.
2024-12-30 15:14:21 +08:00
Marvin Zhang
ef499a03e0 fix: improve logging in master and worker services
- Added logging for error handling in the MasterService when setting a worker node offline, replacing the previous trace.PrintError with a more informative log message.
- Enhanced WorkerService subscription method with debug logs to indicate subscription attempts and status, improving traceability during connection processes.
2024-12-29 19:19:36 +08:00
Marvin Zhang
54800974eb feat: refactor system info retrieval and enhance logo output
- Replaced viper calls with utility functions in GetSystemInfo to improve code clarity and maintainability.
- Added a new system.go file with utility functions for retrieving system version and edition information.
- Enhanced PrintLogoWithWelcomeInfo to include detailed system information, improving user experience during server startup.
- Updated output formatting for better readability and consistency in welcome messages.
2024-12-29 19:01:23 +08:00
Marvin Zhang
17f8917d0a chore: update dependencies and enhance gRPC services
- Updated various dependencies in go.mod and go.sum files, including cloud.google.com/go/compute/metadata to v0.5.2, google.golang.org/grpc to v1.69.2, and google.golang.org/protobuf to v1.36.1.
- Refactored gRPC service definitions to use the latest protoc-gen-go and protoc versions, ensuring compatibility with the latest gRPC-Go features.
- Introduced a new logger utility in core/utils/logger.go to improve logging capabilities across the application.
- Added a README.md for gRPC setup and compilation guidance, enhancing developer experience.
- Improved the Python installation script to handle version listing more effectively and ensure user-specific environment setup.
2024-12-25 17:46:49 +08:00
Marvin Zhang
9e67e50c6c fix: add new line after logo and welcome info prints
- Added a new line after the logo and welcome information prints in the PrintLogoWithWelcomeInfo function to improve output readability.
- This change enhances the user experience by ensuring that the printed information is visually separated, making it easier to read during server startup.
2024-12-25 14:22:18 +08:00
Marvin Zhang
2a33bd40f5 chore: update dependencies in go.mod and go.sum files
- Added github.com/common-nighthawk/go-figure as a new indirect dependency in both backend and core modules to support logo rendering functionality.
- Removed the github.com/imroc/req dependency from backend and go.sum files, streamlining the dependency list and improving project organization.
2024-12-25 14:16:38 +08:00
Marvin Zhang
a13893b627 feat: add logo printing functionality and update logger usage
- Introduced a new utility function to print a logo and welcome information for the Crawlab server, enhancing user experience during startup.
- Updated logger variable names in the apps package for consistency and clarity.
- Added a new dependency on github.com/common-nighthawk/go-figure to facilitate logo rendering.
- Improved the server command to display the logo when the server starts, provided the user is not using the pro version.
2024-12-25 13:08:56 +08:00
Marvin Zhang
dc59599509 refactor: remove db module and update imports to core/mongo
- Deleted the db module, consolidating database-related functionality into the core/mongo package for better organization and maintainability.
- Updated all import paths across the codebase to replace references to the removed db module with core/mongo.
- Cleaned up unused code and dependencies, enhancing overall project clarity and reducing complexity.
- This refactor improves the structure of the codebase by centralizing database operations and simplifying module management.
2024-12-25 10:28:21 +08:00
Marvin Zhang
a28ffbf66c refactor: simplify interfaces and improve configuration handling
- Removed unused ApiApp and ServerApp interfaces from core/apps/interfaces.go to streamline the codebase.
- Updated the GetApi method in the Server struct to return a pointer to the Api type for better type handling.
- Simplified the GetGinMode function in core/utils/config.go to always return gin.ReleaseMode, removing unnecessary conditional checks for development mode.
- These changes enhance code clarity and maintainability by eliminating redundant code and improving type safety.
2024-12-24 23:05:41 +08:00
Marvin Zhang
3276083994 refactor: replace apex/log with structured logger across multiple services
- Replaced all instances of apex/log with a structured logger interface in various services, including Api, Server, Config, and others, to enhance logging consistency and context.
- Updated logging calls to utilize the new logger methods, improving error tracking and service monitoring.
- Added logger initialization in services and controllers to ensure proper logging setup.
- Improved error handling and logging messages for better clarity during service operations.
- Removed unused apex/log imports and cleaned up related code for better maintainability.
2024-12-24 19:11:19 +08:00
Marvin Zhang
e064889795 refactor: replace apex/log with structured logger in master and worker services
- Removed direct usage of apex/log in favor of a structured logger interface for improved logging consistency and context.
- Updated logging calls in MasterService and WorkerService to utilize the new logger, enhancing error tracking and service monitoring.
- Added logger initialization in both services to ensure proper logging setup.
- Improved error handling and logging messages for better clarity during service operations.
2024-12-23 21:45:38 +08:00
Marvin Zhang
99ed4396d1 refactor: improve logging messages and update configuration constants
- Updated logging messages in GrpcClient to provide clearer context, changing "ready" to "client is now ready" and "stopped" to "client has stopped".
- Refactored test setup in runner_test.go to remove unnecessary error checks during gRPC client start for cleaner code.
- Renamed GetDependencySetupScriptRoot to GetInstallRoot and updated related constants for better clarity and consistency in configuration management.
2024-12-23 18:19:08 +08:00
Marvin Zhang
c3f4c4ae05 feat: enhance gRPC client with structured logging and dependency actions
- Added DependencyActionSync and DependencyActionSetup constants to improve dependency management.
- Refactored GrpcClient to utilize a logger interface for consistent logging across connection states and errors.
- Updated Start, Stop, and connection methods to replace direct log calls with logger methods, enhancing log context and readability.
- Simplified test cases by removing error checks on gRPC client start, ensuring cleaner test setup.
2024-12-23 17:17:21 +08:00
Marvin Zhang
34455dc47c feat: enhance DependencyConfig model with additional fields
- Added TotalDependencies and SearchReady fields to the DependencyConfig struct for improved dependency management.
- Retained the Setup field for backward compatibility while ensuring proper JSON and BSON handling.
2024-12-23 15:18:04 +08:00
Marvin Zhang
ed8fb78c3b feat: expand Logger interface and implement additional logging methods in ServiceLogger
- Added Debug, Info, Warn, Error, and Fatal methods to the Logger interface for comprehensive logging capabilities.
- Implemented corresponding methods in ServiceLogger to facilitate structured logging with service context.
- Enhanced the logging functionality to support various log levels, improving error tracking and debugging.
2024-12-23 14:25:48 +08:00
Marvin Zhang
bdc347aef7 feat: add Fatalf logging method to Logger interface and ServiceLogger
- Introduced Fatalf method in Logger interface for logging fatal messages with formatted content.
- Implemented Fatalf in ServiceLogger to log fatal messages and exit the program, enhancing error handling capabilities.
2024-12-23 13:15:18 +08:00
Marvin Zhang
afe21b7ca0 feat: implement logging interfaces and service logger
- Added Logger interface in core/interfaces/logger.go for standardized logging methods (Debugf, Infof, Warnf, Errorf).
- Introduced ServiceLogger in core/utils/log.go, which prefixes log messages with the service name for better context.
- Implemented logging methods in ServiceLogger to utilize the apex/log package for structured logging.
2024-12-21 23:00:30 +08:00
Marvin Zhang
e44b416e34 feat: enhance model base service with BSON ID normalization
- Added utility function to normalize BSON ObjectId in query parameters for gRPC methods.
- Updated GetOne, GetMany, DeleteOne, DeleteMany, UpdateOne, UpdateMany, ReplaceOne, and UpsertOne methods to utilize the new normalization function.
- Introduced new DependencyConfigSetup model instance in base model definitions for improved structure.
2024-12-21 22:07:54 +08:00
Marvin Zhang
29af5a366b feat: enhance gRPC client with state management and reconnection logic
- Introduced state management in GrpcClient to monitor and handle connection states effectively.
- Added a reconnect channel and a state monitoring goroutine to facilitate automatic reconnections on state changes.
- Updated the connect method to initiate a reconnection loop upon connection loss.
- Enhanced logging for connection state changes and errors during connection attempts.
- Refactored tests to ensure proper initialization of gRPC client and server, improving test reliability and coverage.
2024-12-21 21:41:00 +08:00
Marvin Zhang
75da4fff0f refactor: update gRPC client initialization in model service tests
- Replaced direct gRPC client instantiation with a call to GetGrpcClient().Start() for improved consistency and maintainability across test cases.
- Cleaned up unnecessary imports in model_service.go to streamline the codebase.
- Enhanced test setup by ensuring the gRPC client is properly initialized before executing service methods.
2024-12-21 20:27:02 +08:00
Marvin Zhang
41e973a4d3 fix: misaligned nodes when running tasks from a schedule through enhancement by adding new route and handler for running schedules
- Introduced a new POST endpoint '/:id/run' in the router to trigger the execution of a schedule.
- Implemented the PostScheduleRun function to handle the execution logic, including validation of input parameters and scheduling tasks.
- Refactored the PostSpiderRun function to streamline the scheduling process by directly calling the admin service.
- Updated the FsFileInfo struct to clarify the children field description.
- Modified the Task model to make NodeIds field optional in JSON serialization.
2024-12-21 15:33:07 +08:00
Marvin Zhang
b88b41afe2 feat: add API configuration options for port and path
- Introduced GetApiPort and GetApiPath functions to retrieve API port and path from configuration, with defaults provided.
- Updated GetApiEndpoint to construct the API endpoint URL using the new port and path functions, ensuring proper formatting.
- Added constants for default API port and path to enhance configurability.
2024-12-21 14:05:50 +08:00
Marvin Zhang
c897fb58e4 refactor: streamline error handling and improve HTTP request management
- Removed the print flag from handleError function, simplifying error logging based on the development environment.
- Introduced a new performRequest function for standardized HTTP requests with JSON bodies, enhancing code reusability.
- Updated SendIMNotification and related functions to utilize the new RequestParam type for better clarity and consistency.
- Normalized HTTP request paths in the createHttpRequest method to ensure correct URL formatting.
- Added detailed error logging for JSON unmarshaling failures in syncFiles method.
- Introduced a NewHttpClient function to create HTTP clients with customizable timeouts.
2024-12-21 11:27:58 +08:00
Marvin Zhang
3cb74d76f9 feat: enhance gRPC client functionality and improve logging
- Added WaitForReady method to GrpcClient for blocking until the client is ready.
- Updated WorkerService to utilize WaitForReady for ensuring gRPC client readiness before starting.
- Refactored ModelService to consistently use GetGrpcClient for context management.
- Changed logging level for received metrics in MetricServiceServer from Info to Debug.
- Modified error handling in HandleError to conditionally print errors based on the environment.
- Cleaned up unused GrpcClient references in various services, improving code clarity.
2024-12-20 20:34:04 +08:00