Merge pull request #18 from crawlab-team/develop

Develop
暗音
2020-01-07 00:53:33 +08:00
committed by GitHub
50 changed files with 1699 additions and 561 deletions

149
CHANGELOG-zh.md Normal file

@@ -0,0 +1,149 @@
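# 0.4.3 (2020-01-07)
### Features / Enhancement
- **Dependency Installation**. Allow users to install/uninstall dependencies and add programming languages (Node.js only for now) on the platform web interface.
- **Pre-install Programming Languages in Docker**. Allow Docker users to set `CRAWLAB_SERVER_LANG_NODE` to `Y` to pre-install the `Node.js` environment.
- **Add Schedule List in Spider Detail Page**. Allow users to view, add, and edit scheduled jobs in the spider detail page. [#360](https://github.com/crawlab-team/crawlab/issues/360)
- **Align Cron Expression with Linux**. Change the cron expression from 6 fields to 5 fields, consistent with Linux.
- **Enable/Disable Scheduled Jobs**. Allow users to enable/disable scheduled jobs. [#297](https://github.com/crawlab-team/crawlab/issues/297)
- **Better Task Management**. Allow users to batch delete tasks. [#341](https://github.com/crawlab-team/crawlab/issues/341)
- **Better Spider Management**. Allow users to filter and sort spiders in the spider list page.
- **Added Chinese `CHANGELOG`**.
- **Added GitHub Star Button at the Top**.
### Bug Fixes
- **Scheduled Job Issue**. [#423](https://github.com/crawlab-team/crawlab/issues/423)
- **Upload Spider Zip File Issue**. [#403](https://github.com/crawlab-team/crawlab/issues/403) [#407](https://github.com/crawlab-team/crawlab/issues/407)
- **Crash due to Network Failure**. [#340](https://github.com/crawlab-team/crawlab/issues/340)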
# 0.4.2 (2019-12-26)
### Features / Enhancement
- **Disclaimer**. Added a disclaimer.
- **Get Version Number via API**. [#371](https://github.com/crawlab-team/crawlab/issues/371)
- **Allow User Registration via Configuration**. [#346](https://github.com/crawlab-team/crawlab/issues/346)
- **Allow Adding New Users**.
- **More Advanced File Management**. Allow users to add, edit, rename, and delete code files. [#286](https://github.com/crawlab-team/crawlab/issues/286)
- **Optimized Spider Creation Process**. Allow users to create an empty customized spider before uploading the zip file.
- **Better Task Management**. Allow users to filter tasks by selected criteria. [#341](https://github.com/crawlab-team/crawlab/issues/341)
### Bug Fixes
- **Duplicated Nodes**. [#391](https://github.com/crawlab-team/crawlab/issues/391)
- **"mongodb no reachable" Error**. [#373](https://github.com/crawlab-team/crawlab/issues/373)
# 0.4.1 (2019-12-13)
### Features / Enhancement
- **Spiderfile Optimization**. Stages changed from an array to a dictionary. [#358](https://github.com/crawlab-team/crawlab/issues/358)
- **Baidu Tongji Update**.
### Bug Fixes
- **Unable to Display Schedules**. [#353](https://github.com/crawlab-team/crawlab/issues/353)
- **Duplicate Node Registration**. [#334](https://github.com/crawlab-team/crawlab/issues/334)
# 0.4.0 (2019-12-06)
### Features / Enhancement
- **Configurable Spider**. Allow users to add a `Spiderfile` to configure crawling rules.
- **Execution Mode**. Allow users to select from 3 task execution modes: *All Nodes*, *Selected Nodes* and *Random*.
### Bug Fixes
- **Task Accidentally Killed**. [#306](https://github.com/crawlab-team/crawlab/issues/306)
- **Documentation Fix**. [#301](https://github.com/crawlab-team/crawlab/issues/258) [#301](https://github.com/crawlab-team/crawlab/issues/258)
- **Direct Deploy Incompatible with Windows**. [#288](https://github.com/crawlab-team/crawlab/issues/288)
- **Log Files Lost**. [#269](https://github.com/crawlab-team/crawlab/issues/269)
# 0.3.5 (2019-10-28)
### Features / Enhancement
- **Graceful Shutdown**. [detail](https://github.com/crawlab-team/crawlab/commit/63fab3917b5a29fd9770f9f51f1572b9f0420385)
- **Node Info Optimization**. [detail](https://github.com/crawlab-team/crawlab/commit/973251a0fbe7a2184ac0da09e0404a17c736aee7)
- **Append System Environment Variables to Tasks**. [detail](https://github.com/crawlab-team/crawlab/commit/4ab4892471965d6342d30385578ca60dc51f8ad3)
- **Auto Refresh Task Log**. [detail](https://github.com/crawlab-team/crawlab/commit/4ab4892471965d6342d30385578ca60dc51f8ad3)
- **Enable HTTPS Deployment**. [detail](https://github.com/crawlab-team/crawlab/commit/5d8f6f0c56768a6e58f5e46cbf5adff8c7819228)
### Bug Fixes
- **Unable to Fetch Spider List in Scheduled Jobs**. [detail](https://github.com/crawlab-team/crawlab/commit/311f72da19094e3fa05ab4af49812f58843d8d93)
- **Unable to Fetch Worker Node Info**. [detail](https://github.com/crawlab-team/crawlab/commit/6af06efc17685a9e232e8c2b5fd819ec7d2d1674)
- **Unable to Select Node when Running Spider Tasks**. [detail](https://github.com/crawlab-team/crawlab/commit/31f8e03234426e97aed9b0bce6a50562f957edad)
- **Unable to Fetch Result Count when Result Volume is Large**. [#260](https://github.com/crawlab-team/crawlab/issues/260)
- **Node Issue in Scheduled Jobs**. [#244](https://github.com/crawlab-team/crawlab/issues/244)
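# 0.3.1 (2019-08-25)
### Features / Enhancement
- **Docker Image Optimization**. Split the Docker image further into alpine-based master, worker, and frontend images.
- **Unit Tests**. Cover part of the backend code with unit tests.
- **Frontend Optimization**. Login page, button size, and upload UI hints.
- **More Flexible Node Registration**. Allow users to pass a variable as the registration key instead of the default MAC address.
### Bug Fixes
- **Uploading Large Spider Files Error**. Memory crash when uploading large spider files. [#150](https://github.com/crawlab-team/crawlab/issues/150)
- **Unable to Sync Spiders**. Fixed by raising the write permission level when synchronizing spider files. [#114](https://github.com/crawlab-team/crawlab/issues/114)
- **Spider Page Issue**. Fixed by removing the `Site` field. [#112](https://github.com/crawlab-team/crawlab/issues/112)
- **Node Display Issue**. Nodes were not displayed correctly when running Docker containers on multiple machines. [#99](https://github.com/crawlab-team/crawlab/issues/99)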
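# 0.3.0 (2019-07-31)
### Features / Enhancement
- **Golang Backend**: Refactored the backend from Python to Golang, greatly improving stability and performance.
- **Node Network Graph**: Visualization of the node topology.
- **Node System Info**: View system info including operating system, number of CPUs, and executables.
- **Node Monitoring Enhancement**: Nodes are monitored and registered through Redis.
- **File Management**: Edit spider files online, with code highlighting.
- **Login Page / Register Page / User Management**: Require users to log in before using Crawlab; allow user registration and user management, with some role-based authorization.
- **Automatic Spider Deployment**: Spiders are deployed or synchronized to all online nodes automatically.
- **Smaller Docker Image**: Slimmed Docker image; a multi-stage build reduces the image size from 1.3G to around 700M.
### Bug Fixes
- **Node Status**. Node status did not update when a node went offline. [#87](https://github.com/tikazyq/crawlab/issues/87)
- **Spider Deployment Error**. Fixed through automatic spider deployment. [#83](https://github.com/tikazyq/crawlab/issues/83)
- **Node Not Showing**. Nodes could not be shown as online. [#81](https://github.com/tikazyq/crawlab/issues/81)
- **Scheduled Jobs Not Working**. Fixed through the Golang backend. [#64](https://github.com/tikazyq/crawlab/issues/64)
- **Flower Error**. Fixed through the Golang backend. [#57](https://github.com/tikazyq/crawlab/issues/57)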
# 0.2.4 (2019-07-07)
### Features / Enhancement
- **Documentation**: Better and more detailed documentation.
- **Better Crontab**: Generate cron expressions through the UI.
- **Better Performance**: Switched from the native flask engine to `gunicorn`. [#78](https://github.com/tikazyq/crawlab/issues/78)
### Bug Fixes
- **Deleting Spider**. Deleting a spider removes not only the database record but also the related folder, tasks, and schedules. [#69](https://github.com/tikazyq/crawlab/issues/69)
- **MongoDB Auth**. Allow users to specify `authenticationDatabase` to connect to `mongodb`. [#68](https://github.com/tikazyq/crawlab/issues/68)
- **Windows Compatibility**. Added `eventlet` to `requirements.txt`. [#59](https://github.com/tikazyq/crawlab/issues/59)
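# 0.2.3 (2019-06-12)
### Features / Enhancement
- **Docker**: Users can run the Docker image to speed up deployment.
- **CLI**: Allow users to run Crawlab programs from the command line.
- **Upload Spider**: Allow users to upload customized spiders to Crawlab.
- **Edit Fields on Preview**: Allow users to edit fields while previewing data in a configurable spider.
### Bug Fixes
- **Spider Pagination**. Fixed the pagination problem in the spider list page.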
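# 0.2.2 (2019-05-30)
### Features / Enhancement
- **Automatic Field Extraction**: Automatically extract fields in list pages for configurable spiders.
- **Download Results**: Allow downloading results as a CSV file.
- **Baidu Tongji**: Allow users to choose whether to send statistics to Baidu Tongji.
### Bug Fixes
- **Results Page Pagination**. [#45](https://github.com/tikazyq/crawlab/issues/45)
- **Scheduled Jobs Triggered Twice**: Set Flask DEBUG to False so that scheduled jobs cannot be triggered twice. [#32](https://github.com/tikazyq/crawlab/issues/32)
- **Frontend Environment**: Added `VUE_APP_BASE_URL` as a production-mode environment variable so that the API is no longer always `localhost`. [#30](https://github.com/tikazyq/crawlab/issues/30)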
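# 0.2.1 (2019-05-27)
- **Configurable Spider**: Allow users to create spiders that crawl data without writing code.
# 0.2 (2019-05-10)
- **Advanced Statistics**: Advanced statistics in the spider detail page.
- **Sites Data**: Added a list of sites (China), allowing users to view information such as robots.txt and home page response time.
# 0.1.1 (2019-04-23)
- **Basic Statistics**: Users can view basic statistics, including the number of failed tasks and the number of results, in the spider and task pages.
- **Near Realtime Task Info**: Poll the server periodically (every 5 seconds) to view task info in near real time.
- **Scheduled Jobs**: Scheduled jobs implemented with apscheduler, allowing users to set Cron-like schedules.
# 0.1 (2019-04-17)
- **Initial Release**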


@@ -1,3 +1,21 @@
# 0.4.3 (2020-01-07)
### Features / Enhancement
- **Dependency Installation**. Allow users to install/uninstall dependencies and add programming languages (Node.js only for now) on the platform web interface.
- **Pre-install Programming Languages in Docker**. Allow Docker users to set `CRAWLAB_SERVER_LANG_NODE` to `Y` to pre-install the `Node.js` environment.
- **Add Schedule List in Spider Detail Page**. Allow users to view / add / edit schedule cron jobs in the spider detail page. [#360](https://github.com/crawlab-team/crawlab/issues/360)
- **Align Cron Expression with Linux**. Change the cron expression from 6 fields to 5 fields, matching Linux cron.
- **Enable/Disable Schedule Cron**. Allow users to enable/disable the schedule jobs. [#297](https://github.com/crawlab-team/crawlab/issues/297)
- **Better Task Management**. Allow users to batch delete tasks. [#341](https://github.com/crawlab-team/crawlab/issues/341)
- **Better Spider Management**. Allow users to sort and filter spiders in the spider list page.
- **Added Chinese `CHANGELOG`**.
- **Added GitHub Star Button at Nav Bar**.
### Bug Fixes
- **Schedule Cron Task Issue**. [#423](https://github.com/crawlab-team/crawlab/issues/423)
- **Upload Spider Zip File Issue**. [#403](https://github.com/crawlab-team/crawlab/issues/403) [#407](https://github.com/crawlab-team/crawlab/issues/407)
- **Exit due to Network Failure**. [#340](https://github.com/crawlab-team/crawlab/issues/340)
# 0.4.2 (2019-12-26)
### Features / Enhancement
- **Disclaimer**. Added page for Disclaimer.
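
The 0.4.3 entry on cron expressions moves from 6 fields to 5. The sketch below shows what that migration could look like for a stored expression. It assumes the extra field in the old format was a leading seconds field, which the changelog does not state, and `toLinuxCron` is a hypothetical helper, not a Crawlab function.

```go
package main

import (
	"fmt"
	"strings"
)

// toLinuxCron converts a 6-field cron expression to the 5-field Linux form
// by dropping the first field. Assumption: the first field holds seconds.
// A 5-field expression is returned unchanged apart from whitespace cleanup.
func toLinuxCron(expr string) (string, error) {
	fields := strings.Fields(expr)
	switch len(fields) {
	case 5:
		return strings.Join(fields, " "), nil
	case 6:
		return strings.Join(fields[1:], " "), nil
	default:
		return "", fmt.Errorf("expected 5 or 6 fields, got %d", len(fields))
	}
}

func main() {
	converted, err := toLinuxCron("0 */5 * * * *")
	if err != nil {
		panic(err)
	}
	fmt.Println(converted) // */5 * * * *
}
```

Dropping the field loses sub-minute precision, so an old expression with a non-zero seconds value would fire at a slightly different moment after conversion.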


@@ -59,4 +59,4 @@ EXPOSE 8080
EXPOSE 8000
# start backend
CMD ["/bin/sh", "/app/docker_init.sh"]
CMD ["/bin/bash", "/app/docker_init.sh"]


@@ -57,4 +57,4 @@ EXPOSE 8080
EXPOSE 8000
# start backend
CMD ["/bin/sh", "/app/docker_init.sh"]
CMD ["/bin/bash", "/app/docker_init.sh"]


@@ -1,16 +1,16 @@
# Crawlab
![](http://114.67.75.98:8082/buildStatus/icon?job=crawlab%2Fmaster)
![](https://img.shields.io/github/release/crawlab-team/crawlab.svg)
![](https://img.shields.io/docker/cloud/build/tikazyq/crawlab.svg?label=build&logo=docker)
![](https://img.shields.io/docker/pulls/tikazyq/crawlab?label=pulls&logo=docker)
![](https://img.shields.io/github/release/crawlab-team/crawlab.svg?logo=github)
![](https://img.shields.io/github/last-commit/crawlab-team/crawlab.svg)
![](https://img.shields.io/github/issues/crawlab-team/crawlab.svg)
![](https://img.shields.io/github/contributors/crawlab-team/crawlab.svg)
![](https://img.shields.io/docker/pulls/tikazyq/crawlab)
![](https://img.shields.io/github/issues/crawlab-team/crawlab/bug.svg?label=bugs&color=red)
![](https://img.shields.io/github/issues/crawlab-team/crawlab/enhancement.svg?label=enhancements&color=cyan)
![](https://img.shields.io/github/license/crawlab-team/crawlab.svg)
中文 | [English](https://github.com/crawlab-team/crawlab)
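[Installation](#安装) | [Run](#运行) | [Screenshot](#截图) | [Architecture](#架构) | [Integration](#与其他框架的集成) | [Comparison](#与其他框架比较) | [Related Articles](#相关文章) | [Community & Sponsorship](#社区--赞助) | [Disclaimer](https://github.com/crawlab-team/crawlab/blob/master/DISCLAIMER-zh.md)
[Installation](#安装) | [Run](#运行) | [Screenshot](#截图) | [Architecture](#架构) | [Integration](#与其他框架的集成) | [Comparison](#与其他框架比较) | [Related Articles](#相关文章) | [Community & Sponsorship](#社区--赞助) | [Changelog](https://github.com/crawlab-team/crawlab/blob/master/CHANGELOG-zh.md) | [Disclaimer](https://github.com/crawlab-team/crawlab/blob/master/DISCLAIMER-zh.md)
A Golang-based distributed web crawler management platform, supporting various programming languages including Python, NodeJS, Go, Java, and PHP, as well as various web crawler frameworks.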
@@ -19,9 +19,9 @@
## Installation
Three methods:
1. [Docker](https://tikazyq.github.io/crawlab-docs/Installation/Docker.html) (recommended)
2. [Direct Deploy](https://tikazyq.github.io/crawlab-docs/Installation/Direct.html) (to understand the internals)
3. [Kubernetes](https://mp.weixin.qq.com/s/3Q1BQATUIEE_WXcHPqhYbA)
1. [Docker](http://docs.crawlab.cn/Installation/Docker.html) (recommended)
2. [Direct Deploy](http://docs.crawlab.cn/Installation/Direct.html) (to understand the internals)
3. [Kubernetes](https://juejin.im/post/5e0a02d851882549884c27ad) (multi-node deployment)
### Requirements (Docker)
- Docker 18.03+
@@ -31,9 +31,17 @@
### Requirements (Direct Deploy)
- Go 1.12+
- Node 8.12+
- Redis
- Redis 5.x+
- MongoDB 3.6+
## Quick Start
```bash
git clone https://github.com/crawlab-team/crawlab
cd crawlab
docker-compose up -d
```
## Run
### Docker
@@ -123,6 +131,10 @@ For Docker deployment details, see the [relevant documentation](https://tikazyq.github.io/crawlab-d
![](https://raw.githubusercontent.com/tikazyq/crawlab-docs/master/images/schedule.png)
#### Dependency Installation
![](http://static-docs.crawlab.cn/node-install-dependencies.png)
## Architecture
Crawlab's architecture consists of a Master Node, multiple Worker Nodes, and the Redis and MongoDB databases that handle communication and data storage.


@@ -1,16 +1,16 @@
# Crawlab
![](http://114.67.75.98:8082/buildStatus/icon?job=crawlab%2Fmaster)
![](https://img.shields.io/github/release/crawlab-team/crawlab.svg)
![](https://img.shields.io/docker/cloud/build/tikazyq/crawlab.svg?label=build&logo=docker)
![](https://img.shields.io/docker/pulls/tikazyq/crawlab?label=pulls&logo=docker)
![](https://img.shields.io/github/release/crawlab-team/crawlab.svg?logo=github)
![](https://img.shields.io/github/last-commit/crawlab-team/crawlab.svg)
![](https://img.shields.io/github/issues/crawlab-team/crawlab.svg)
![](https://img.shields.io/github/contributors/crawlab-team/crawlab.svg)
![](https://img.shields.io/docker/pulls/tikazyq/crawlab)
![](https://img.shields.io/github/issues/crawlab-team/crawlab/bug.svg?label=bugs&color=red)
![](https://img.shields.io/github/issues/crawlab-team/crawlab/enhancement.svg?label=enhancements&color=cyan)
![](https://img.shields.io/github/license/crawlab-team/crawlab.svg)
[中文](https://github.com/crawlab-team/crawlab/blob/master/README-zh.md) | English
[Installation](#installation) | [Run](#run) | [Screenshot](#screenshot) | [Architecture](#architecture) | [Integration](#integration-with-other-frameworks) | [Compare](#comparison-with-other-frameworks) | [Community & Sponsorship](#community--sponsorship) | [Disclaimer](https://github.com/crawlab-team/crawlab/blob/master/DISCLAIMER.md)
[Installation](#installation) | [Run](#run) | [Screenshot](#screenshot) | [Architecture](#architecture) | [Integration](#integration-with-other-frameworks) | [Compare](#comparison-with-other-frameworks) | [Community & Sponsorship](#community--sponsorship) | [CHANGELOG](https://github.com/crawlab-team/crawlab/blob/master/CHANGELOG.md) | [Disclaimer](https://github.com/crawlab-team/crawlab/blob/master/DISCLAIMER.md)
Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, PHP and various web crawler frameworks including Scrapy, Puppeteer, Selenium.
@@ -19,9 +19,9 @@ Golang-based distributed web crawler management platform, supporting various lan
## Installation
Three methods:
1. [Docker](https://tikazyq.github.io/crawlab-docs/Installation/Docker.html) (Recommended)
2. [Direct Deploy](https://tikazyq.github.io/crawlab-docs/Installation/Direct.html) (Understand the Internals)
3. [Kubernetes](https://mp.weixin.qq.com/s/3Q1BQATUIEE_WXcHPqhYbA)
1. [Docker](http://docs.crawlab.cn/Installation/Docker.html) (Recommended)
2. [Direct Deploy](http://docs.crawlab.cn/Installation/Direct.html) (Understand the Internals)
3. [Kubernetes](https://juejin.im/post/5e0a02d851882549884c27ad) (Multi-Node Deployment)
### Pre-requisite (Docker)
- Docker 18.03+
@@ -31,9 +31,17 @@ Two methods:
### Pre-requisite (Direct Deploy)
- Go 1.12+
- Node 8.12+
- Redis
- Redis 5.x+
- MongoDB 3.6+
## Quick Start
```bash
git clone https://github.com/crawlab-team/crawlab
cd crawlab
docker-compose up -d
```
## Run
### Docker
@@ -121,6 +129,10 @@ For Docker Deployment details, please refer to [relevant documentation](https://
![](https://raw.githubusercontent.com/tikazyq/crawlab-docs/master/images/schedule.png)
#### Dependency Installation
![](http://static-docs.crawlab.cn/node-install-dependencies.png)
## Architecture
The architecture of Crawlab consists of a Master Node and multiple Worker Nodes, plus Redis and MongoDB databases, which mainly handle node communication and data storage.


@@ -26,12 +26,15 @@ server:
# MAC address or IP address; if IP is chosen, the IP must be specified manually
type: "mac"
ip: ""
lang: # language environments to install: Y installs, N skips; only effective for Docker
python: "Y"
node: "N"
spider:
path: "./spiders"
task:
workers: 4
other:
tmppath: "./tmp"
version: 0.4.1
tmppath: "/tmp"
version: 0.4.3
setting:
allowRegister: "N"
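
The new `server.lang` block lines up with the `CRAWLAB_SERVER_LANG_NODE` variable named in the changelog. The sketch below shows one way such a variable name could map onto a nested config key. It assumes a `CRAWLAB_` prefix and a `_` to `.` replacement, a common convention for environment overrides; `envToConfigKey` is a hypothetical helper and the actual lookup code is not part of this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// envToConfigKey maps an environment variable name such as
// CRAWLAB_SERVER_LANG_NODE to a dotted config key such as server.lang.node.
// Assumption: variables carry the given prefix and use "_" wherever the
// config file nests keys. The second result is false when the prefix is
// missing or nothing follows it.
func envToConfigKey(name, prefix string) (string, bool) {
	if !strings.HasPrefix(name, prefix) {
		return "", false
	}
	rest := strings.TrimPrefix(name, prefix)
	if rest == "" {
		return "", false
	}
	return strings.ToLower(strings.ReplaceAll(rest, "_", ".")), true
}

func main() {
	key, ok := envToConfigKey("CRAWLAB_SERVER_LANG_NODE", "CRAWLAB_")
	fmt.Println(key, ok) // server.lang.node true
}
```

A mapping like this cannot round-trip camelCase keys such as `allowRegister`, so it should be read as an illustration of the naming pattern only.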


@@ -0,0 +1,6 @@
package constants
const (
ASCENDING = "ascending"
DESCENDING = "descending"
)
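
These two constants back the `sort_direction` query parameter handled in the spider list route later in this diff. A minimal sketch of that mapping as a standalone function follows; `buildSortStr` is a hypothetical name, and the `-`/`+` prefixes and the `-_id` default are taken from the handler code shown in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	ASCENDING  = "ascending"
	DESCENDING = "descending"
)

// buildSortStr returns a sort string with a "-" prefix for descending order
// and a "+" prefix for ascending order. It falls back to "-_id" when either
// the key or the direction is empty, and rejects unknown directions.
func buildSortStr(sortKey, sortDirection string) (string, error) {
	if sortKey == "" || sortDirection == "" {
		return "-_id", nil
	}
	switch sortDirection {
	case DESCENDING:
		return "-" + sortKey, nil
	case ASCENDING:
		return "+" + sortKey, nil
	default:
		return "", errors.New("invalid sort_direction")
	}
}

func main() {
	sortStr, err := buildSortStr("name", DESCENDING)
	if err != nil {
		panic(err)
	}
	fmt.Println(sortStr) // -name
}
```

Returning an error instead of writing the HTTP response inside the helper makes it harder to forget the early return in the caller.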

9
backend/constants/rpc.go Normal file

@@ -0,0 +1,9 @@
package constants
const (
RpcInstallLang = "install_lang"
RpcInstallDep = "install_dep"
RpcUninstallDep = "uninstall_dep"
RpcGetDepList = "get_dep_list"
RpcGetInstalledDepList = "get_installed_dep_list"
)
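
These constants name the RPC methods added for dependency management. How a message is routed to a worker is not shown in this diff, so the sketch below only illustrates one common way to dispatch on such method names. The `handler` type, the `dispatch` function, and the parameter map are all hypothetical.

```go
package main

import "fmt"

const (
	RpcInstallLang         = "install_lang"
	RpcInstallDep          = "install_dep"
	RpcUninstallDep        = "uninstall_dep"
	RpcGetDepList          = "get_dep_list"
	RpcGetInstalledDepList = "get_installed_dep_list"
)

// handler is a hypothetical signature for code that serves one RPC method.
type handler func(params map[string]string) (string, error)

// dispatch looks up the handler registered for method and invokes it.
// Unknown method names produce an error instead of being silently ignored.
func dispatch(handlers map[string]handler, method string, params map[string]string) (string, error) {
	h, ok := handlers[method]
	if !ok {
		return "", fmt.Errorf("unknown rpc method: %s", method)
	}
	return h(params)
}

func main() {
	handlers := map[string]handler{
		RpcInstallDep: func(params map[string]string) (string, error) {
			// A real worker would run the package manager here.
			return "installed " + params["dep_name"], nil
		},
	}
	result, err := dispatch(handlers, RpcInstallDep, map[string]string{"dep_name": "requests"})
	if err != nil {
		panic(err)
	}
	fmt.Println(result) // installed requests
}
```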


@@ -1,7 +1,7 @@
package constants
const (
ScheduleStatusStop = "stop"
ScheduleStatusStop = "stopped"
ScheduleStatusRunning = "running"
ScheduleStatusError = "error"


@@ -8,6 +8,6 @@ const (
const (
Python = "python"
NodeJS = "node"
Nodejs = "node"
Java = "java"
)


@@ -93,5 +93,14 @@ func InitMongo() error {
// assign to the global mongo session
Session = sess
}
//Add Unique index for 'key'
keyIndex := mgo.Index{
Key: []string{"key"},
Unique: true,
}
s, c := GetCol("nodes")
defer s.Close()
c.EnsureIndex(keyIndex)
return nil
}


@@ -36,6 +36,19 @@ func (r *Redis) RPush(collection string, value interface{}) error {
defer utils.Close(c)
if _, err := c.Do("RPUSH", collection, value); err != nil {
log.Error(err.Error())
debug.PrintStack()
return err
}
return nil
}
func (r *Redis) LPush(collection string, value interface{}) error {
c := r.pool.Get()
defer utils.Close(c)
if _, err := c.Do("LPUSH", collection, value); err != nil {
log.Error(err.Error())
debug.PrintStack()
return err
}
@@ -58,6 +71,7 @@ func (r *Redis) HSet(collection string, key string, value string) error {
defer utils.Close(c)
if _, err := c.Do("HSET", collection, key, value); err != nil {
log.Error(err.Error())
debug.PrintStack()
return err
}
@@ -70,6 +84,8 @@ func (r *Redis) HGet(collection string, key string) (string, error) {
value, err2 := redis.String(c.Do("HGET", collection, key))
if err2 != nil {
log.Error(err2.Error())
debug.PrintStack()
return value, err2
}
return value, nil
@@ -80,6 +96,8 @@ func (r *Redis) HDel(collection string, key string) error {
defer utils.Close(c)
if _, err := c.Do("HDEL", collection, key); err != nil {
log.Error(err.Error())
debug.PrintStack()
return err
}
return nil
@@ -91,11 +109,29 @@ func (r *Redis) HKeys(collection string) ([]string, error) {
value, err2 := redis.Strings(c.Do("HKeys", collection))
if err2 != nil {
log.Error(err2.Error())
debug.PrintStack()
return []string{}, err2
}
return value, nil
}
func (r *Redis) BRPop(collection string, timeout int) (string, error) {
if timeout <= 0 {
timeout = 60
}
c := r.pool.Get()
defer utils.Close(c)
values, err := redis.Strings(c.Do("BRPOP", collection, timeout))
if err != nil {
log.Error(err.Error())
debug.PrintStack()
return "", err
}
return values[1], nil
}
func NewRedisPool() *redis.Pool {
var address = viper.GetString("redis.address")
var port = viper.GetString("redis.port")
@@ -112,8 +148,8 @@ func NewRedisPool() *redis.Pool {
Dial: func() (conn redis.Conn, e error) {
return redis.DialURL(url,
redis.DialConnectTimeout(time.Second*10),
redis.DialReadTimeout(time.Second*10),
redis.DialWriteTimeout(time.Second*15),
redis.DialReadTimeout(time.Second*600),
redis.DialWriteTimeout(time.Second*10),
)
},
TestOnBorrow: func(c redis.Conn, t time.Time) error {
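
The Redis helpers in this file pair list pushes with `BRPOP`, which replies with a two-element array of key and value (hence `values[1]` in `BRPop`). Which end of the list a push targets decides whether the list behaves as a queue or a stack. The in-memory sketch below mimics only the ordering semantics; `redisList` and `drain` are hypothetical stand-ins, and blocking and timeouts are left out.

```go
package main

import "fmt"

// redisList is an in-memory stand-in for a Redis list.
// Index 0 is the head (left end); the last index is the tail (right end).
type redisList struct{ items []string }

// LPush inserts v at the head, like the LPUSH command.
func (l *redisList) LPush(v string) { l.items = append([]string{v}, l.items...) }

// RPush appends v at the tail, like the RPUSH command.
func (l *redisList) RPush(v string) { l.items = append(l.items, v) }

// RPop removes and returns the tail element, like RPOP or a BRPOP that
// finds data immediately. The second result is false on an empty list.
func (l *redisList) RPop() (string, bool) {
	n := len(l.items)
	if n == 0 {
		return "", false
	}
	v := l.items[n-1]
	l.items = l.items[:n-1]
	return v, true
}

// drain pops from the tail until the list is empty and returns the order seen.
func drain(l *redisList) []string {
	var out []string
	for {
		v, ok := l.RPop()
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	queue := &redisList{}
	stack := &redisList{}
	for _, v := range []string{"a", "b", "c"} {
		queue.LPush(v)
		stack.RPush(v)
	}
	fmt.Println(drain(queue)) // [a b c]
	fmt.Println(drain(stack)) // [c b a]
}
```

Pushing on the left and popping on the right yields first-in, first-out delivery, which is why an `LPush` helper that issues `RPUSH` would silently turn a queue into a stack.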


@@ -110,12 +110,20 @@ func main() {
// initialize the dependency fetcher service
if err := services.InitDepsFetcher(); err != nil {
log.Error("init user service error:" + err.Error())
log.Error("init dependency fetcher error:" + err.Error())
debug.PrintStack()
panic(err)
}
log.Info("initialized dependency fetcher successfully")
// initialize the RPC service
if err := services.InitRpcService(); err != nil {
log.Error("init rpc service error:" + err.Error())
debug.PrintStack()
panic(err)
}
log.Info("initialized rpc service successfully")
// the following services run on the master node only
if model.IsMaster() {
// middleware
@@ -139,6 +147,9 @@ func main() {
authGroup.GET("/nodes/:id/langs", routes.GetLangList) // list of language environments on the node
authGroup.GET("/nodes/:id/deps", routes.GetDepList) // list of third-party dependencies for the node
authGroup.GET("/nodes/:id/deps/installed", routes.GetInstalledDepList) // list of third-party dependencies installed on the node
authGroup.POST("/nodes/:id/deps/install", routes.InstallDep) // install a dependency on the node
authGroup.POST("/nodes/:id/deps/uninstall", routes.UninstallDep) // uninstall a dependency from the node
authGroup.POST("/nodes/:id/langs/install", routes.InstallLang) // install a language on the node
// spiders
authGroup.GET("/spiders", routes.GetSpiderList) // spider list
authGroup.GET("/spiders/:id", routes.GetSpider) // spider detail
@@ -157,7 +168,7 @@ func main() {
authGroup.POST("/spiders/:id/file/rename", routes.RenameSpiderFile) // rename a spider file
authGroup.GET("/spiders/:id/dir", routes.GetSpiderDir) // spider directory
authGroup.GET("/spiders/:id/stats", routes.GetSpiderStats) // spider statistics
authGroup.GET("/spider/types", routes.GetSpiderTypes) // spider types
authGroup.GET("/spiders/:id/schedules", routes.GetSpiderSchedules) // spider schedules
// configurable spiders
authGroup.GET("/config_spiders/:id/config", routes.GetConfigSpiderConfig) // get the configurable spider config
authGroup.POST("/config_spiders/:id/config", routes.PostConfigSpiderConfig) // update the configurable spider config
@@ -178,13 +189,13 @@ func main() {
authGroup.GET("/tasks/:id/results", routes.GetTaskResults) // task results
authGroup.GET("/tasks/:id/results/download", routes.DownloadTaskResultsCsv) // download task results
// schedules
authGroup.GET("/schedules", routes.GetScheduleList) // schedule list
authGroup.GET("/schedules/:id", routes.GetSchedule) // schedule detail
authGroup.PUT("/schedules", routes.PutSchedule) // create a schedule
authGroup.POST("/schedules/:id", routes.PostSchedule) // update a schedule
authGroup.DELETE("/schedules/:id", routes.DeleteSchedule) // delete a schedule
authGroup.POST("/schedules/:id/stop", routes.StopSchedule) // stop a schedule
authGroup.POST("/schedules/:id/run", routes.RunSchedule) // run a schedule
authGroup.GET("/schedules", routes.GetScheduleList) // schedule list
authGroup.GET("/schedules/:id", routes.GetSchedule) // schedule detail
authGroup.PUT("/schedules", routes.PutSchedule) // create a schedule
authGroup.POST("/schedules/:id", routes.PostSchedule) // update a schedule
authGroup.DELETE("/schedules/:id", routes.DeleteSchedule) // delete a schedule
authGroup.POST("/schedules/:id/disable", routes.DisableSchedule) // disable a schedule
authGroup.POST("/schedules/:id/enable", routes.EnableSchedule) // enable a schedule
// statistics
authGroup.GET("/stats/home", routes.GetHomeStats) // home page statistics
// users
@@ -196,7 +207,8 @@ func main() {
// release version
authGroup.GET("/version", routes.GetVersion) // get the released version
// system
authGroup.GET("/system/deps", routes.GetAllDepList) // list of all third-party dependencies for nodes
authGroup.GET("/system/deps/:lang", routes.GetAllDepList) // list of all third-party dependencies for nodes
authGroup.GET("/system/deps/:lang/:dep_name/json", routes.GetDepJson) // third-party dependency JSON for nodes
}
}


@@ -173,8 +173,8 @@ func GetNode(id bson.ObjectId) (Node, error) {
defer s.Close()
if err := c.FindId(id).One(&node); err != nil {
log.Errorf("get node error: %s, id: %s", err.Error(), id.Hex())
debug.PrintStack()
//log.Errorf("get node error: %s, id: %s", err.Error(), id.Hex())
//debug.PrintStack()
return node, err
}
return node, nil


@@ -16,20 +16,17 @@ type Schedule struct {
Name string `json:"name" bson:"name"`
Description string `json:"description" bson:"description"`
SpiderId bson.ObjectId `json:"spider_id" bson:"spider_id"`
//NodeId bson.ObjectId `json:"node_id" bson:"node_id"`
//NodeKey string `json:"node_key" bson:"node_key"`
Cron string `json:"cron" bson:"cron"`
EntryId cron.EntryID `json:"entry_id" bson:"entry_id"`
Param string `json:"param" bson:"param"`
RunType string `json:"run_type" bson:"run_type"`
NodeIds []bson.ObjectId `json:"node_ids" bson:"node_ids"`
// status
Status string `json:"status" bson:"status"`
Status string `json:"status" bson:"status"`
Enabled bool `json:"enabled" bson:"enabled"`
// fields for frontend display
SpiderName string `json:"spider_name" bson:"spider_name"`
NodeName string `json:"node_name" bson:"node_name"`
Nodes []Node `json:"nodes" bson:"nodes"`
Message string `json:"message" bson:"message"`
CreateTs time.Time `json:"create_ts" bson:"create_ts"`
@@ -84,20 +81,15 @@ func GetScheduleList(filter interface{}) ([]Schedule, error) {
var schs []Schedule
for _, schedule := range schedules {
// TODO: get node name
//if schedule.NodeId == bson.ObjectIdHex(constants.ObjectIdNull) {
// // all nodes selected
// schedule.NodeName = "All Nodes"
//} else {
// // a single node selected
// node, err := GetNode(schedule.NodeId)
// if err != nil {
// schedule.Status = constants.ScheduleStatusError
// schedule.Message = constants.ScheduleStatusErrorNotFoundNode
// } else {
// schedule.NodeName = node.Name
// }
//}
// get node names
schedule.Nodes = []Node{}
if schedule.RunType == constants.RunTypeSelectedNodes {
for _, nodeId := range schedule.NodeIds {
// look up each selected node
node, _ := GetNode(nodeId)
schedule.Nodes = append(schedule.Nodes, node)
}
}
// get spider name
spider, err := GetSpider(schedule.SpiderId)


@@ -107,13 +107,13 @@ func (spider *Spider) Delete() error {
}
// get spider list
func GetSpiderList(filter interface{}, skip int, limit int) ([]Spider, int, error) {
func GetSpiderList(filter interface{}, skip int, limit int, sortStr string) ([]Spider, int, error) {
s, c := database.GetCol("spiders")
defer s.Close()
// get spider list
var spiders []Spider
if err := c.Find(filter).Skip(skip).Limit(limit).Sort("+name").All(&spiders); err != nil {
if err := c.Find(filter).Skip(skip).Limit(limit).Sort(sortStr).All(&spiders); err != nil {
debug.PrintStack()
return spiders, 0, err
}
@@ -275,27 +275,7 @@ func GetSpiderCount() (int, error) {
return count, nil
}
// get spider types
func GetSpiderTypes() ([]*entity.SpiderType, error) {
s, c := database.GetCol("spiders")
defer s.Close()
group := bson.M{
"$group": bson.M{
"_id": "$type",
"count": bson.M{"$sum": 1},
},
}
var types []*entity.SpiderType
if err := c.Pipe([]bson.M{group}).All(&types); err != nil {
log.Errorf("get spider types error: %s", err.Error())
debug.PrintStack()
return nil, err
}
return types, nil
}
// get spider schedules
func GetConfigSpiderData(spider Spider) (entity.ConfigSpiderData, error) {
// build the config data
configData := entity.ConfigSpiderData{}


@@ -117,18 +117,12 @@ func GetTaskList(filter interface{}, skip int, limit int, sortKey string) ([]Tas
for i, task := range tasks {
// get spider name
spider, err := task.GetSpider()
if err != nil || spider.Id.Hex() == "" {
_ = spider.Delete()
} else {
if spider, err := task.GetSpider(); err == nil {
tasks[i].SpiderName = spider.DisplayName
}
// get node name
node, err := task.GetNode()
if node.Id.Hex() == "" || err != nil {
_ = task.Delete()
} else {
if node, err := task.GetNode(); err == nil {
tasks[i].NodeName = node.Name
}
}
@@ -142,6 +136,8 @@ func GetTaskListTotal(filter interface{}) (int, error) {
var result int
result, err := c.Find(filter).Count()
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return result, err
}
return result, nil
@@ -168,6 +164,8 @@ func AddTask(item Task) error {
item.UpdateTs = time.Now()
if err := c.Insert(&item); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return err
}
return nil
@@ -179,6 +177,8 @@ func RemoveTask(id string) error {
var result Task
if err := c.FindId(id).One(&result); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return err
}


@@ -110,9 +110,9 @@ func DeleteSchedule(c *gin.Context) {
}
// stop a schedule
func StopSchedule(c *gin.Context) {
func DisableSchedule(c *gin.Context) {
id := c.Param("id")
if err := services.Sched.Stop(bson.ObjectIdHex(id)); err != nil {
if err := services.Sched.Disable(bson.ObjectIdHex(id)); err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
@@ -120,9 +120,9 @@ func StopSchedule(c *gin.Context) {
}
// run a schedule
func RunSchedule(c *gin.Context) {
func EnableSchedule(c *gin.Context) {
id := c.Param("id")
if err := services.Sched.Run(bson.ObjectIdHex(id)); err != nil {
if err := services.Sched.Enable(bson.ObjectIdHex(id)); err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}


@@ -27,22 +27,38 @@ import (
)
func GetSpiderList(c *gin.Context) {
pageNum, _ := c.GetQuery("pageNum")
pageSize, _ := c.GetQuery("pageSize")
pageNum, _ := c.GetQuery("page_num")
pageSize, _ := c.GetQuery("page_size")
keyword, _ := c.GetQuery("keyword")
t, _ := c.GetQuery("type")
sortKey, _ := c.GetQuery("sort_key")
sortDirection, _ := c.GetQuery("sort_direction")
// filter
filter := bson.M{
"name": bson.M{"$regex": bson.RegEx{Pattern: keyword, Options: "im"}},
}
if t != "" && t != "all" {
filter["type"] = t
}
// sort
sortStr := "-_id"
if sortKey != "" && sortDirection != "" {
if sortDirection == constants.DESCENDING {
sortStr = "-" + sortKey
} else if sortDirection == constants.ASCENDING {
sortStr = "+" + sortKey
} else {
HandleErrorF(http.StatusBadRequest, c, "invalid sort_direction")
return
}
}
// pagination
page := &entity.Page{}
page.GetPage(pageNum, pageSize)
results, count, err := model.GetSpiderList(filter, page.Skip, page.Limit)
results, count, err := model.GetSpiderList(filter, page.Skip, page.Limit, sortStr)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
@@ -693,20 +709,6 @@ func RenameSpiderFile(c *gin.Context) {
})
}
// spider types
func GetSpiderTypes(c *gin.Context) {
types, err := model.GetSpiderTypes()
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
Data: types,
})
}
func GetSpiderStats(c *gin.Context) {
type Overview struct {
TaskCount int `json:"task_count" bson:"task_count"`
@@ -826,3 +828,25 @@ func GetSpiderStats(c *gin.Context) {
},
})
}
func GetSpiderSchedules(c *gin.Context) {
id := c.Param("id")
if !bson.IsObjectIdHex(id) {
HandleErrorF(http.StatusBadRequest, c, "spider_id is invalid")
return
}
// get schedules
list, err := model.GetScheduleList(bson.M{"spider_id": bson.ObjectIdHex(id)})
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
Data: list,
})
}
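
`GetSpiderSchedules` validates the `id` path parameter with `bson.IsObjectIdHex` before converting it, which matters because mgo's `bson.ObjectIdHex` panics on invalid input. The standard-library sketch below shows what that check amounts to: a 12-byte ObjectId written as 24 hex characters. `isObjectIDHex` is a hypothetical reimplementation for illustration, not the library function.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// isObjectIDHex reports whether s is a 24-character hexadecimal string,
// i.e. the textual form of a 12-byte MongoDB ObjectId.
func isObjectIDHex(s string) bool {
	if len(s) != 24 {
		return false
	}
	_, err := hex.DecodeString(s)
	return err == nil
}

func main() {
	fmt.Println(isObjectIDHex("5e0a02d851882549884c27ad")) // true
	fmt.Println(isObjectIDHex("not-an-id"))                // false
}
```

Handlers that pass a path parameter straight to `bson.ObjectIdHex`, as the schedule enable and disable routes in this diff do, could use the same guard to answer a malformed ID with 400 rather than panicking.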


@@ -32,24 +32,8 @@ func GetDepList(c *gin.Context) {
return
}
depList = list
} else {
HandleErrorF(http.StatusBadRequest, c, fmt.Sprintf("%s is not implemented", lang))
return
}
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
Data: depList,
})
}
func GetInstalledDepList(c *gin.Context) {
nodeId := c.Param("id")
lang := c.Query("lang")
var depList []entity.Dependency
if lang == constants.Python {
list, err := services.GetPythonInstalledDepList(nodeId)
} else if lang == constants.Nodejs {
list, err := services.GetNodejsDepList(nodeId, depName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
@@ -67,8 +51,56 @@ func GetInstalledDepList(c *gin.Context) {
})
}
func GetAllDepList(c *gin.Context) {
func GetInstalledDepList(c *gin.Context) {
nodeId := c.Param("id")
lang := c.Query("lang")
var depList []entity.Dependency
if lang == constants.Python {
if services.IsMasterNode(nodeId) {
list, err := services.GetPythonLocalInstalledDepList(nodeId)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
depList = list
} else {
list, err := services.GetPythonRemoteInstalledDepList(nodeId)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
depList = list
}
} else if lang == constants.Nodejs {
if services.IsMasterNode(nodeId) {
list, err := services.GetNodejsLocalInstalledDepList(nodeId)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
depList = list
} else {
list, err := services.GetNodejsRemoteInstalledDepList(nodeId)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
depList = list
}
} else {
HandleErrorF(http.StatusBadRequest, c, fmt.Sprintf("%s is not implemented", lang))
return
}
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
Data: depList,
})
}
func GetAllDepList(c *gin.Context) {
lang := c.Param("lang")
depName := c.Query("dep_name")
// get the list of all dependencies
@@ -108,3 +140,176 @@ func GetAllDepList(c *gin.Context) {
Data: returnList,
})
}
func InstallDep(c *gin.Context) {
type ReqBody struct {
Lang string `json:"lang"`
DepName string `json:"dep_name"`
}
nodeId := c.Param("id")
var reqBody ReqBody
if err := c.ShouldBindJSON(&reqBody); err != nil {
HandleError(http.StatusBadRequest, c, err)
return
}
if reqBody.Lang == constants.Python {
if services.IsMasterNode(nodeId) {
_, err := services.InstallPythonLocalDep(reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
} else {
_, err := services.InstallPythonRemoteDep(nodeId, reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
}
} else if reqBody.Lang == constants.Nodejs {
if services.IsMasterNode(nodeId) {
_, err := services.InstallNodejsLocalDep(reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
} else {
_, err := services.InstallNodejsRemoteDep(nodeId, reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
}
} else {
HandleErrorF(http.StatusBadRequest, c, fmt.Sprintf("%s is not implemented", reqBody.Lang))
return
}
// TODO: check if install is successful
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
})
}
func UninstallDep(c *gin.Context) {
type ReqBody struct {
Lang string `json:"lang"`
DepName string `json:"dep_name"`
}
nodeId := c.Param("id")
var reqBody ReqBody
if err := c.ShouldBindJSON(&reqBody); err != nil {
HandleError(http.StatusBadRequest, c, err)
return
}
if reqBody.Lang == constants.Python {
if services.IsMasterNode(nodeId) {
_, err := services.UninstallPythonLocalDep(reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
} else {
_, err := services.UninstallPythonRemoteDep(nodeId, reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
}
} else if reqBody.Lang == constants.Nodejs {
if services.IsMasterNode(nodeId) {
_, err := services.UninstallNodejsLocalDep(reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
} else {
_, err := services.UninstallNodejsRemoteDep(nodeId, reqBody.DepName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
}
} else {
HandleErrorF(http.StatusBadRequest, c, fmt.Sprintf("%s is not implemented", reqBody.Lang))
return
}
// TODO: check if uninstall is successful
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
})
}
func GetDepJson(c *gin.Context) {
depName := c.Param("dep_name")
lang := c.Param("lang")
var dep entity.Dependency
if lang == constants.Python {
_dep, err := services.FetchPythonDepInfo(depName)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
dep = _dep
} else {
HandleErrorF(http.StatusBadRequest, c, fmt.Sprintf("%s is not implemented", lang))
return
}
c.Header("Cache-Control", "max-age=86400")
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
Data: dep,
})
}
func InstallLang(c *gin.Context) {
type ReqBody struct {
Lang string `json:"lang"`
}
nodeId := c.Param("id")
var reqBody ReqBody
if err := c.ShouldBindJSON(&reqBody); err != nil {
HandleError(http.StatusBadRequest, c, err)
return
}
if reqBody.Lang == constants.Nodejs {
if services.IsMasterNode(nodeId) {
_, err := services.InstallNodejsLocalLang()
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
} else {
_, err := services.InstallNodejsRemoteLang(nodeId)
if err != nil {
HandleError(http.StatusInternalServerError, c, err)
return
}
}
} else {
HandleErrorF(http.StatusBadRequest, c, fmt.Sprintf("%s is not implemented", reqBody.Lang))
return
}
// TODO: check if install is successful
c.JSON(http.StatusOK, Response{
Status: "ok",
Message: "success",
})
}
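
Review note: `InstallDep`, `UninstallDep` and `InstallLang` above repeat the same language and master/worker branching. The following is a minimal sketch, not the PR's implementation, of a lookup table that picks the installer pair by language. The names `depInstaller`, `installers`, `installDep` and `errNotImplemented` are hypothetical, the map keys `"python"` and `"node"` are assumed values for `constants.Python` and `constants.Nodejs`, and the stub functions stand in for `services.InstallPythonLocalDep` and its siblings.

```go
package main

import (
	"errors"
	"fmt"
)

// depInstaller pairs the local and remote install functions for one language.
// Hypothetical type, not part of the PR.
type depInstaller struct {
	local  func(depName string) (string, error)
	remote func(nodeId, depName string) (string, error)
}

// installers maps the "lang" request field to its installer pair.
// The stubs only build a description string; real code would call the services package.
var installers = map[string]depInstaller{
	"python": {
		local:  func(dep string) (string, error) { return "pip install " + dep, nil },
		remote: func(node, dep string) (string, error) { return "rpc " + node + ": pip install " + dep, nil },
	},
	"node": {
		local:  func(dep string) (string, error) { return "npm install " + dep + " -g", nil },
		remote: func(node, dep string) (string, error) { return "rpc " + node + ": npm install " + dep + " -g", nil },
	},
}

// errNotImplemented is returned for languages without an installer entry.
var errNotImplemented = errors.New("language is not implemented")

// installDep selects the installer for lang and runs it locally on the
// master node or remotely on a worker node.
func installDep(lang, nodeId, depName string, isMaster bool) (string, error) {
	inst, ok := installers[lang]
	if !ok {
		return "", fmt.Errorf("%s: %w", lang, errNotImplemented)
	}
	if isMaster {
		return inst.local(depName)
	}
	return inst.remote(nodeId, depName)
}

func main() {
	out, err := installDep("python", "node-1", "requests", true)
	fmt.Println(out, err)
	_, err = installDep("java", "node-1", "junit", true)
	fmt.Println(err)
}
```

With a table like this, each handler would have a single error path, which makes a missing `return` after `HandleError` harder to introduce.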

View File

@@ -0,0 +1,17 @@
#!/usr/bin/env bash
# install nvm
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.2/install.sh | bash
export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh" # This loads nvm
# install Node.js v8.12
nvm install 8.12
# create soft links
ln -s $HOME/.nvm/versions/node/v8.12.0/bin/npm /usr/local/bin/npm
ln -s $HOME/.nvm/versions/node/v8.12.0/bin/node /usr/local/bin/node
# environments manipulation
export NODE_PATH=$HOME/.nvm/versions/node/v8.12.0/lib/node_modules
export PATH=$NODE_PATH:$PATH

View File

@@ -12,6 +12,7 @@ import (
"encoding/json"
"fmt"
"github.com/apex/log"
"github.com/globalsign/mgo"
"github.com/globalsign/mgo/bson"
"github.com/gomodule/redigo/redis"
"runtime/debug"
@@ -116,7 +117,7 @@ func handleNodeInfo(key string, data *Data) {
defer s.Close()
var node model.Node
if err := c.Find(bson.M{"key": key}).One(&node); err != nil {
if err := c.Find(bson.M{"key": key}).One(&node); err != nil && err == mgo.ErrNotFound {
// The node does not exist in the database
node = model.Node{
Key: key,
@@ -133,7 +134,7 @@ func handleNodeInfo(key string, data *Data) {
log.Errorf(err.Error())
return
}
} else {
} else if node.Key != "" {
// The node already exists in the database
node.Status = constants.StatusOnline
node.UpdateTs = time.Now()
@@ -190,35 +191,7 @@ func UpdateNodeData() {
log.Errorf(err.Error())
return
}
// Commented out: this handling is unnecessary. Simply overwrite the node info stored under the key. by xyz 2020.01.01
// First fetch all node keys from Redis
/*list, _ := database.RedisClient.HKeys("nodes")
if i := utils.Contains(list, key); i == false {
// Build the node data
data := Data{
Key: key,
Mac: mac,
Ip: ip,
Master: model.IsMaster(),
UpdateTs: time.Now(),
UpdateTsUnix: time.Now().Unix(),
}
// Register the node in Redis
dataBytes, err := json.Marshal(&data)
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return
}
if err := database.RedisClient.HSet("nodes", key, utils.BytesToString(dataBytes)); err != nil {
log.Errorf(err.Error())
return
}
}*/
}
func MasterNodeCallback(message redis.Message) (err error) {

231
backend/services/rpc.go Normal file
View File

@@ -0,0 +1,231 @@
package services
import (
"crawlab/constants"
"crawlab/database"
"crawlab/entity"
"crawlab/model"
"crawlab/utils"
"encoding/json"
"fmt"
"github.com/apex/log"
uuid "github.com/satori/go.uuid"
"runtime/debug"
)
type RpcMessage struct {
Id string `json:"id"`
Method string `json:"method"`
Params map[string]string `json:"params"`
Result string `json:"result"`
}
func RpcServerInstallLang(msg RpcMessage) RpcMessage {
lang := GetRpcParam("lang", msg.Params)
if lang == constants.Nodejs {
output, _ := InstallNodejsLocalLang()
msg.Result = output
}
return msg
}
func RpcClientInstallLang(nodeId string, lang string) (output string, err error) {
params := map[string]string{}
params["lang"] = lang
data, err := RpcClientFunc(nodeId, constants.RpcInstallLang, params, 600)()
if err != nil {
return
}
output = data
return
}
func RpcServerInstallDep(msg RpcMessage) RpcMessage {
lang := GetRpcParam("lang", msg.Params)
depName := GetRpcParam("dep_name", msg.Params)
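if lang == constants.Python {
output, _ := InstallPythonLocalDep(depName)
msg.Result = output
} else if lang == constants.Nodejs {
// InstallNodejsRemoteDep also sends its requests through this RPC method
output, _ := InstallNodejsLocalDep(depName)
msg.Result = output
}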
return msg
}
func RpcClientInstallDep(nodeId string, lang string, depName string) (output string, err error) {
params := map[string]string{}
params["lang"] = lang
params["dep_name"] = depName
data, err := RpcClientFunc(nodeId, constants.RpcInstallDep, params, 10)()
if err != nil {
return
}
output = data
return
}
func RpcServerUninstallDep(msg RpcMessage) RpcMessage {
lang := GetRpcParam("lang", msg.Params)
depName := GetRpcParam("dep_name", msg.Params)
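if lang == constants.Python {
output, _ := UninstallPythonLocalDep(depName)
msg.Result = output
} else if lang == constants.Nodejs {
// UninstallNodejsRemoteDep also sends its requests through this RPC method
output, _ := UninstallNodejsLocalDep(depName)
msg.Result = output
}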
return msg
}
func RpcClientUninstallDep(nodeId string, lang string, depName string) (output string, err error) {
params := map[string]string{}
params["lang"] = lang
params["dep_name"] = depName
data, err := RpcClientFunc(nodeId, constants.RpcUninstallDep, params, 60)()
if err != nil {
return
}
output = data
return
}
func RpcServerGetInstalledDepList(nodeId string, msg RpcMessage) RpcMessage {
lang := GetRpcParam("lang", msg.Params)
if lang == constants.Python {
depList, _ := GetPythonLocalInstalledDepList(nodeId)
resultStr, _ := json.Marshal(depList)
msg.Result = string(resultStr)
} else if lang == constants.Nodejs {
depList, _ := GetNodejsLocalInstalledDepList(nodeId)
resultStr, _ := json.Marshal(depList)
msg.Result = string(resultStr)
}
return msg
}
func RpcClientGetInstalledDepList(nodeId string, lang string) (list []entity.Dependency, err error) {
params := map[string]string{}
params["lang"] = lang
data, err := RpcClientFunc(nodeId, constants.RpcGetInstalledDepList, params, 10)()
if err != nil {
return
}
// Deserialize the result
if err := json.Unmarshal([]byte(data), &list); err != nil {
return list, err
}
return
}
func RpcClientFunc(nodeId string, method string, params map[string]string, timeout int) func() (string, error) {
return func() (result string, err error) {
// Request ID
id := uuid.NewV4().String()
// Build the RPC message
msg := RpcMessage{
Id: id,
Method: method,
Params: params,
Result: "",
}
// Send the RPC message
msgStr := ObjectToString(msg)
if err := database.RedisClient.LPush(fmt.Sprintf("rpc:%s", nodeId), msgStr); err != nil {
return result, err
}
// Wait for the RPC reply message
dataStr, err := database.RedisClient.BRPop(fmt.Sprintf("rpc:%s", nodeId), timeout)
if err != nil {
return result, err
}
// Deserialize the message
if err := json.Unmarshal([]byte(dataStr), &msg); err != nil {
return result, err
}
return msg.Result, err
}
}
func GetRpcParam(key string, params map[string]string) string {
return params[key]
}
func ObjectToString(params interface{}) string {
bytes, _ := json.Marshal(params)
return utils.BytesToString(bytes)
}
var IsRpcStopped = false
func StopRpcService() {
IsRpcStopped = true
}
func InitRpcService() error {
go func() {
for {
// Get the current node
node, err := model.GetCurrentNode()
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
continue
}
// Fetch a message from the message queue
dataStr, err := database.RedisClient.BRPop(fmt.Sprintf("rpc:%s", node.Id.Hex()), 300)
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
continue
}
// Deserialize the message
var msg RpcMessage
if err := json.Unmarshal([]byte(dataStr), &msg); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
continue
}
// Call the local function that matches Method
var replyMsg RpcMessage
if msg.Method == constants.RpcInstallDep {
replyMsg = RpcServerInstallDep(msg)
} else if msg.Method == constants.RpcUninstallDep {
replyMsg = RpcServerUninstallDep(msg)
} else if msg.Method == constants.RpcInstallLang {
replyMsg = RpcServerInstallLang(msg)
} else if msg.Method == constants.RpcGetInstalledDepList {
replyMsg = RpcServerGetInstalledDepList(node.Id.Hex(), msg)
} else {
continue
}
// Send the reply message
if err := database.RedisClient.LPush(fmt.Sprintf("rpc:%s", node.Id.Hex()), ObjectToString(replyMsg)); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
continue
}
// Return if the RPC service has been stopped
if IsRpcStopped {
return
}
}
}()
return nil
}
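
Review note: `RpcClientFunc` pushes the request to `rpc:<nodeId>` and then waits for the reply on the same key, so the caller appears able to pop its own request before the worker does. The sketch below shows one possible alternative: the worker replies on a per-request key. It is an illustration only. `memQueue` is a hypothetical in-memory stand-in for the Redis lists, and the key layout `rpc:<nodeId>:<requestId>`, the helper names `serveOnce` and `call`, and the method name `install_dep` are assumptions that do not appear in the PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sync"
	"time"
)

// RpcMessage mirrors the struct added in backend/services/rpc.go.
type RpcMessage struct {
	Id     string            `json:"id"`
	Method string            `json:"method"`
	Params map[string]string `json:"params"`
	Result string            `json:"result"`
}

// memQueue is a hypothetical in-memory stand-in for Redis lists (FIFO per key).
type memQueue struct {
	mu     sync.Mutex
	queues map[string]chan string
}

func newMemQueue() *memQueue { return &memQueue{queues: map[string]chan string{}} }

// queue returns the channel for key, creating it on first use.
func (q *memQueue) queue(key string) chan string {
	q.mu.Lock()
	defer q.mu.Unlock()
	ch, ok := q.queues[key]
	if !ok {
		ch = make(chan string, 16)
		q.queues[key] = ch
	}
	return ch
}

func (q *memQueue) push(key, value string) { q.queue(key) <- value }

// pop blocks until a value arrives on key or the timeout expires.
func (q *memQueue) pop(key string, timeout time.Duration) (string, error) {
	select {
	case v := <-q.queue(key):
		return v, nil
	case <-time.After(timeout):
		return "", errors.New("timeout waiting on " + key)
	}
}

// serveOnce handles one request from the node's request queue and pushes
// the reply to a queue keyed by the request ID.
func serveOnce(q *memQueue, nodeId string) error {
	raw, err := q.pop("rpc:"+nodeId, time.Second)
	if err != nil {
		return err
	}
	var msg RpcMessage
	if err := json.Unmarshal([]byte(raw), &msg); err != nil {
		return err
	}
	msg.Result = "handled " + msg.Method + " for " + msg.Params["dep_name"]
	reply, err := json.Marshal(msg)
	if err != nil {
		return err
	}
	q.push("rpc:"+nodeId+":"+msg.Id, string(reply))
	return nil
}

// call sends a request and waits on the per-request reply queue, so it can
// never consume its own request.
func call(q *memQueue, nodeId string, msg RpcMessage) (string, error) {
	req, err := json.Marshal(msg)
	if err != nil {
		return "", err
	}
	q.push("rpc:"+nodeId, string(req))
	raw, err := q.pop("rpc:"+nodeId+":"+msg.Id, time.Second)
	if err != nil {
		return "", err
	}
	var reply RpcMessage
	if err := json.Unmarshal([]byte(raw), &reply); err != nil {
		return "", err
	}
	return reply.Result, nil
}

func main() {
	q := newMemQueue()
	go func() { _ = serveOnce(q, "node-1") }()
	res, err := call(q, "node-1", RpcMessage{
		Id:     "req-1",
		Method: "install_dep",
		Params: map[string]string{"dep_name": "requests"},
	})
	fmt.Println(res, err)
}
```

Separating the request and reply keys also lets several callers target the same node at once without reading each other's replies.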

View File

@@ -53,6 +53,8 @@ func AddScheduleTask(s model.Schedule) func() {
Param: s.Param,
}
if err := AddTask(t); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return
}
if err := AssignTask(t); err != nil {
@@ -137,7 +139,7 @@ func (s *Scheduler) Start() error {
func (s *Scheduler) AddJob(job model.Schedule) error {
spec := job.Cron
// Add the task
// Add the scheduled task
eid, err := s.cron.AddFunc(spec, AddScheduleTask(job))
if err != nil {
log.Errorf("add func task error: %s", err.Error())
@@ -147,7 +149,12 @@ func (s *Scheduler) AddJob(job model.Schedule) error {
// Update the EntryID
job.EntryId = eid
// Update the status
job.Status = constants.ScheduleStatusRunning
job.Enabled = true
// Save the scheduled task
if err := job.Save(); err != nil {
log.Errorf("job save error: %s", err.Error())
debug.PrintStack()
@@ -176,8 +183,8 @@ func ParserCron(spec string) error {
return nil
}
// Stop a scheduled task
func (s *Scheduler) Stop(id bson.ObjectId) error {
// Disable a scheduled task
func (s *Scheduler) Disable(id bson.ObjectId) error {
schedule, err := model.GetSchedule(id)
if err != nil {
return err
@@ -185,17 +192,22 @@ func (s *Scheduler) Stop(id bson.ObjectId) error {
if schedule.EntryId == 0 {
return errors.New("entry id not found")
}
// Remove the task from the cron service
s.cron.Remove(schedule.EntryId)
// Update the status
schedule.Status = constants.ScheduleStatusStop
schedule.Enabled = false
if err = schedule.Save(); err != nil {
return err
}
return nil
}
// Run a task
func (s *Scheduler) Run(id bson.ObjectId) error {
// Enable a scheduled task
func (s *Scheduler) Enable(id bson.ObjectId) error {
schedule, err := model.GetSchedule(id)
if err != nil {
return err

View File

@@ -143,7 +143,7 @@ func ReadFileByStep(filePath string, handle func([]byte, *mgo.GridFile), fileCre
// Publish all spiders
func PublishAllSpiders() {
// Get the spider list
spiders, _, _ := model.GetSpiderList(nil, 0, constants.Infinite)
spiders, _, _ := model.GetSpiderList(nil, 0, constants.Infinite, "-_id")
if len(spiders) == 0 {
return
}

View File

@@ -13,6 +13,7 @@ import (
"github.com/apex/log"
"github.com/imroc/req"
"os/exec"
"path"
"regexp"
"runtime/debug"
"sort"
@@ -20,29 +21,10 @@ import (
"sync"
)
type PythonDepJsonData struct {
Info PythonDepJsonDataInfo `json:"info"`
}
type PythonDepJsonDataInfo struct {
Name string `json:"name"`
Summary string `json:"summary"`
Version string `json:"version"`
}
type PythonDepNameDict struct {
Name string `json:"name"`
Weight int `json:"weight"`
}
type PythonDepNameDictSlice []PythonDepNameDict
func (s PythonDepNameDictSlice) Len() int { return len(s) }
func (s PythonDepNameDictSlice) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func (s PythonDepNameDictSlice) Less(i, j int) bool { return s[i].Weight > s[j].Weight }
// Channel map for system info
var SystemInfoChanMap = utils.NewChanMap()
// Get system info from a remote node
func GetRemoteSystemInfo(nodeId string) (sysInfo entity.SystemInfo, err error) {
// Send the message
msg := entity.NodeMessage{
@@ -70,6 +52,7 @@ func GetRemoteSystemInfo(nodeId string) (sysInfo entity.SystemInfo, err error) {
return sysInfo, nil
}
// Get system info
func GetSystemInfo(nodeId string) (sysInfo entity.SystemInfo, err error) {
if IsMasterNode(nodeId) {
sysInfo, err = model.GetLocalSystemInfo()
@@ -79,11 +62,12 @@ func GetSystemInfo(nodeId string) (sysInfo entity.SystemInfo, err error) {
return
}
// Get the language list
func GetLangList(nodeId string) []entity.Lang {
list := []entity.Lang{
{Name: "Python", ExecutableName: "python", ExecutablePath: "/usr/local/bin/python", DepExecutablePath: "/usr/local/bin/pip"},
{Name: "NodeJS", ExecutableName: "node", ExecutablePath: "/usr/local/bin/node"},
{Name: "Java", ExecutableName: "java", ExecutablePath: "/usr/local/bin/java"},
{Name: "Node.js", ExecutableName: "node", ExecutablePath: "/usr/local/bin/node", DepExecutablePath: "/usr/local/bin/npm"},
//{Name: "Java", ExecutableName: "java", ExecutablePath: "/usr/local/bin/java"},
}
for i, lang := range list {
list[i].Installed = IsInstalledLang(nodeId, lang)
@@ -91,6 +75,7 @@ func GetLangList(nodeId string) []entity.Lang {
return list
}
// Get the language entry by language name
func GetLangFromLangName(nodeId string, name string) entity.Lang {
langList := GetLangList(nodeId)
for _, lang := range langList {
@@ -101,6 +86,70 @@ func GetLangFromLangName(nodeId string, name string) entity.Lang {
return entity.Lang{}
}
// Whether the language is installed
func IsInstalledLang(nodeId string, lang entity.Lang) bool {
sysInfo, err := GetSystemInfo(nodeId)
if err != nil {
return false
}
for _, exec := range sysInfo.Executables {
if exec.Path == lang.ExecutablePath {
return true
}
}
return false
}
// Whether the dependency is installed
func IsInstalledDep(installedDepList []entity.Dependency, dep entity.Dependency) bool {
for _, _dep := range installedDepList {
if strings.ToLower(_dep.Name) == strings.ToLower(dep.Name) {
return true
}
}
return false
}
// Initialization function
func InitDepsFetcher() error {
c := cron.New(cron.WithSeconds())
c.Start()
if _, err := c.AddFunc("0 */5 * * * *", UpdatePythonDepList); err != nil {
return err
}
go func() {
UpdatePythonDepList()
}()
return nil
}
// =========
// Python
// =========
type PythonDepJsonData struct {
Info PythonDepJsonDataInfo `json:"info"`
}
type PythonDepJsonDataInfo struct {
Name string `json:"name"`
Summary string `json:"summary"`
Version string `json:"version"`
}
type PythonDepNameDict struct {
Name string `json:"name"`
Weight int `json:"weight"`
}
type PythonDepNameDictSlice []PythonDepNameDict
func (s PythonDepNameDictSlice) Len() int { return len(s) }
func (s PythonDepNameDictSlice) Swap(i, j int) { s[i], s[j] = s[j], s[i] }
func (s PythonDepNameDictSlice) Less(i, j int) bool { return s[i].Weight > s[j].Weight }
// Get the Python dependency list for a search term
func GetPythonDepList(nodeId string, searchDepName string) ([]entity.Dependency, error) {
var list []entity.Dependency
@@ -129,22 +178,51 @@ func GetPythonDepList(nodeId string, searchDepName string) ([]entity.Dependency,
}
}
// Get the installed dependencies
installedDepList, err := GetPythonInstalledDepList(nodeId)
if err != nil {
return list, err
// Get the list of installed dependencies
var installedDepList []entity.Dependency
if IsMasterNode(nodeId) {
installedDepList, err = GetPythonLocalInstalledDepList(nodeId)
if err != nil {
return list, err
}
} else {
installedDepList, err = GetPythonRemoteInstalledDepList(nodeId)
if err != nil {
return list, err
}
}
// Fetch data from the dependency source
var goSync sync.WaitGroup
// Sort the dependency names by weight (descending)
sort.Stable(depNameList)
// Iterate over the dependency names and take the first 20
for i, depNameDict := range depNameList {
if i >= 20 {
break
}
dep := entity.Dependency{
Name: depNameDict.Name,
}
dep.Installed = IsInstalledDep(installedDepList, dep)
list = append(list, dep)
}
// Fetch details from the dependency source
//list, err = GetPythonDepListWithInfo(list)
return list, nil
}
// Fetch Python dependency metadata from the dependency source
func GetPythonDepListWithInfo(depList []entity.Dependency) ([]entity.Dependency, error) {
var goSync sync.WaitGroup
for i, dep := range depList {
if i > 10 {
break
}
goSync.Add(1)
go func(depName string, n *sync.WaitGroup) {
url := fmt.Sprintf("https://pypi.org/pypi/%s/json", depName)
go func(i int, dep entity.Dependency, depList []entity.Dependency, n *sync.WaitGroup) {
url := fmt.Sprintf("https://pypi.org/pypi/%s/json", dep.Name)
res, err := req.Get(url)
if err != nil {
n.Done()
@@ -155,21 +233,38 @@ func GetPythonDepList(nodeId string, searchDepName string) ([]entity.Dependency,
n.Done()
return
}
dep := entity.Dependency{
Name: depName,
Version: data.Info.Version,
Description: data.Info.Summary,
}
dep.Installed = IsInstalledDep(installedDepList, dep)
list = append(list, dep)
depList[i].Version = data.Info.Version
depList[i].Description = data.Info.Summary
n.Done()
}(depNameDict.Name, &goSync)
}(i, dep, depList, &goSync)
}
goSync.Wait()
return list, nil
return depList, nil
}
func FetchPythonDepInfo(depName string) (entity.Dependency, error) {
url := fmt.Sprintf("https://pypi.org/pypi/%s/json", depName)
res, err := req.Get(url)
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return entity.Dependency{}, err
}
var data PythonDepJsonData
if err := res.ToJSON(&data); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return entity.Dependency{}, err
}
dep := entity.Dependency{
Name: depName,
Version: data.Info.Version,
Description: data.Info.Summary,
}
return dep, nil
}
// Get the Python dependency list from Redis
func GetPythonDepListFromRedis() ([]string, error) {
var list []string
@@ -192,28 +287,7 @@ func GetPythonDepListFromRedis() ([]string, error) {
return list, nil
}
func IsInstalledLang(nodeId string, lang entity.Lang) bool {
sysInfo, err := GetSystemInfo(nodeId)
if err != nil {
return false
}
for _, exec := range sysInfo.Executables {
if exec.Path == lang.ExecutablePath {
return true
}
}
return false
}
func IsInstalledDep(installedDepList []entity.Dependency, dep entity.Dependency) bool {
for _, _dep := range installedDepList {
if strings.ToLower(_dep.Name) == strings.ToLower(dep.Name) {
return true
}
}
return false
}
// Fetch the dependency list from the Python dependency source and return it
func FetchPythonDepList() ([]string, error) {
// Dependency source URL
url := "https://pypi.tuna.tsinghua.edu.cn/simple"
@@ -251,6 +325,7 @@ func FetchPythonDepList() ([]string, error) {
return list, nil
}
// Update the Python dependency list in Redis
func UpdatePythonDepList() {
// Fetch the list from the dependency source
list, _ := FetchPythonDepList()
@@ -271,7 +346,8 @@ func UpdatePythonDepList() {
}
}
func GetPythonInstalledDepList(nodeId string) ([]entity.Dependency, error){
// Get the list of Python dependencies installed locally
func GetPythonLocalInstalledDepList(nodeId string) ([]entity.Dependency, error) {
var list []entity.Dependency
lang := GetLangFromLangName(nodeId, constants.Python)
@@ -301,11 +377,206 @@ func GetPythonInstalledDepList(nodeId string) ([]entity.Dependency, error){
return list, nil
}
func InitDepsFetcher() error {
c := cron.New(cron.WithSeconds())
c.Start()
if _, err := c.AddFunc("0 */5 * * * *", UpdatePythonDepList); err != nil {
return err
// Get the list of Python dependencies installed on a remote node
func GetPythonRemoteInstalledDepList(nodeId string) ([]entity.Dependency, error) {
depList, err := RpcClientGetInstalledDepList(nodeId, constants.Python)
if err != nil {
return depList, err
}
return nil
return depList, nil
}
// Install a Python dependency locally
func InstallPythonLocalDep(depName string) (string, error) {
// Dependency mirror URL
url := "https://pypi.tuna.tsinghua.edu.cn/simple"
cmd := exec.Command("pip", "install", depName, "-i", url)
outputBytes, err := cmd.Output()
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return fmt.Sprintf("error: %s", err.Error()), err
}
return string(outputBytes), nil
}
// Install a Python dependency on a remote node
func InstallPythonRemoteDep(nodeId string, depName string) (string, error) {
output, err := RpcClientInstallDep(nodeId, constants.Python, depName)
if err != nil {
return output, err
}
return output, nil
}
// Uninstall a Python dependency locally
func UninstallPythonLocalDep(depName string) (string, error) {
cmd := exec.Command("pip", "uninstall", "-y", depName)
outputBytes, err := cmd.Output()
if err != nil {
log.Errorf(string(outputBytes))
log.Errorf(err.Error())
debug.PrintStack()
return fmt.Sprintf("error: %s", err.Error()), err
}
return string(outputBytes), nil
}
// Uninstall a Python dependency on a remote node
func UninstallPythonRemoteDep(nodeId string, depName string) (string, error) {
output, err := RpcClientUninstallDep(nodeId, constants.Python, depName)
if err != nil {
return output, err
}
return output, nil
}
// ==============
// Node.js
// ==============
func InstallNodejsLocalLang() (string, error) {
cmd := exec.Command("/bin/sh", path.Join("scripts", "install-nodejs.sh"))
output, err := cmd.Output()
if err != nil {
log.Error(err.Error())
debug.PrintStack()
return string(output), err
}
// TODO: check if Node.js is installed successfully
return string(output), nil
}
// Install Node.js on a remote node
func InstallNodejsRemoteLang(nodeId string) (string, error) {
output, err := RpcClientInstallLang(nodeId, constants.Nodejs)
if err != nil {
return output, err
}
return output, nil
}
// Get the list of Node.js dependencies installed locally
func GetNodejsLocalInstalledDepList(nodeId string) ([]entity.Dependency, error) {
var list []entity.Dependency
lang := GetLangFromLangName(nodeId, constants.Nodejs)
if !IsInstalledLang(nodeId, lang) {
return list, errors.New("nodejs is not installed")
}
cmd := exec.Command("npm", "ls", "-g", "--depth", "0")
outputBytes, _ := cmd.Output()
//if err != nil {
// log.Error("error: " + string(outputBytes))
// debug.PrintStack()
// return list, err
//}
regex := regexp.MustCompile("\\s(.*)@(.*)")
for _, line := range strings.Split(string(outputBytes), "\n") {
arr := regex.FindStringSubmatch(line)
if len(arr) < 3 {
continue
}
dep := entity.Dependency{
Name: strings.ToLower(arr[1]),
Version: arr[2],
Installed: true,
}
list = append(list, dep)
}
return list, nil
}
// Get the list of Node.js dependencies installed on a remote node
func GetNodejsRemoteInstalledDepList(nodeId string) ([]entity.Dependency, error) {
depList, err := RpcClientGetInstalledDepList(nodeId, constants.Nodejs)
if err != nil {
return depList, err
}
return depList, nil
}
// Install a Node.js dependency locally
func InstallNodejsLocalDep(depName string) (string, error) {
// Dependency mirror URL
url := "https://registry.npm.taobao.org"
cmd := exec.Command("npm", "install", depName, "-g", "--registry", url)
outputBytes, err := cmd.Output()
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return fmt.Sprintf("error: %s", err.Error()), err
}
return string(outputBytes), nil
}
// Install a Node.js dependency on a remote node
func InstallNodejsRemoteDep(nodeId string, depName string) (string, error) {
output, err := RpcClientInstallDep(nodeId, constants.Nodejs, depName)
if err != nil {
return output, err
}
return output, nil
}
// Uninstall a Node.js dependency locally
func UninstallNodejsLocalDep(depName string) (string, error) {
cmd := exec.Command("npm", "uninstall", depName, "-g")
outputBytes, err := cmd.Output()
if err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return fmt.Sprintf("error: %s", err.Error()), err
}
return string(outputBytes), nil
}
// Uninstall a Node.js dependency on a remote node
func UninstallNodejsRemoteDep(nodeId string, depName string) (string, error) {
output, err := RpcClientUninstallDep(nodeId, constants.Nodejs, depName)
if err != nil {
return output, err
}
return output, nil
}
// Search Node.js dependencies by name
func GetNodejsDepList(nodeId string, searchDepName string) (depList []entity.Dependency, err error) {
// Run the shell command
cmd := exec.Command("npm", "search", "--json", searchDepName)
outputBytes, _ := cmd.Output()
// Get the list of installed dependencies
var installedDepList []entity.Dependency
if IsMasterNode(nodeId) {
installedDepList, err = GetNodejsLocalInstalledDepList(nodeId)
if err != nil {
return depList, err
}
} else {
installedDepList, err = GetNodejsRemoteInstalledDepList(nodeId)
if err != nil {
return depList, err
}
}
// Deserialize
if err := json.Unmarshal(outputBytes, &depList); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return depList, err
}
// Mark which dependencies are already installed
for i, dep := range depList {
depList[i].Installed = IsInstalledDep(installedDepList, dep)
}
return depList, nil
}
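
Review note: `GetNodejsLocalInstalledDepList` above extracts names and versions from `npm ls -g --depth 0` output with the pattern `\s(.*)@(.*)`. The sketch below isolates that parsing step so it can be exercised without running `npm`. The `Dependency` struct is a trimmed stand-in for `entity.Dependency`, `parseNpmLs` is a hypothetical helper, the sample output is illustrative rather than captured from a real run, and the `strings.TrimSpace` call on the version is an addition that the PR does not have.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Dependency is a trimmed stand-in for entity.Dependency.
type Dependency struct {
	Name      string
	Version   string
	Installed bool
}

// npmLsLine is the same pattern the PR uses: whitespace, name, "@", version.
// The greedy first group means a scoped name such as "@vue/cli" stays intact.
var npmLsLine = regexp.MustCompile(`\s(.*)@(.*)`)

// parseNpmLs turns `npm ls -g --depth 0` output into a dependency list.
// Lines without whitespace followed by name@version are skipped.
func parseNpmLs(output string) []Dependency {
	var list []Dependency
	for _, line := range strings.Split(output, "\n") {
		arr := npmLsLine.FindStringSubmatch(line)
		if len(arr) < 3 {
			continue
		}
		list = append(list, Dependency{
			Name:      strings.ToLower(arr[1]),
			Version:   strings.TrimSpace(arr[2]), // not in the PR: guards against trailing whitespace
			Installed: true,
		})
	}
	return list
}

func main() {
	// Illustrative sample, not captured from a real installation.
	sample := "/usr/local/lib\n├── npm@6.4.1\n└── @vue/cli@4.1.2\n"
	for _, dep := range parseNpmLs(sample) {
		fmt.Printf("%s %s\n", dep.Name, dep.Version)
	}
}
```

Keeping the parser separate from the `exec.Command` call would also make it easier to report a failed `npm` invocation instead of silently returning an empty list.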

View File

@@ -19,6 +19,7 @@ import (
"runtime"
"runtime/debug"
"strconv"
"strings"
"sync"
"syscall"
"time"
@@ -104,6 +105,17 @@ func AssignTask(task model.Task) error {
// Set environment variables
func SetEnv(cmd *exec.Cmd, envs []model.Env, taskId string, dataCol string) *exec.Cmd {
// By default, add the global Node.js node_modules directory to the environment variables
envPath := os.Getenv("PATH")
for _, _path := range strings.Split(envPath, ":") {
if strings.Contains(_path, "/.nvm/versions/node/") {
pathNodeModules := strings.Replace(_path, "/bin", "/lib/node_modules", -1)
_ = os.Setenv("PATH", pathNodeModules+":"+envPath)
_ = os.Setenv("NODE_PATH", pathNodeModules)
break
}
}
// Default environment variables
cmd.Env = append(os.Environ(), "CRAWLAB_TASK_ID="+taskId)
cmd.Env = append(cmd.Env, "CRAWLAB_COLLECTION="+dataCol)
@@ -615,11 +627,15 @@ func AddTask(t model.Task) error {
// Save the task to the database
if err := model.AddTask(t); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return err
}
// Add the task to the task queue
if err := AssignTask(t); err != nil {
log.Errorf(err.Error())
debug.PrintStack()
return err
}
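
Review note: the `SetEnv` hunk in this file calls `os.Setenv` on every task, which changes the backend's own environment and prepends to `PATH` again each time. A possible alternative, sketched below under assumptions, derives the global `node_modules` directory from `PATH` as a pure function and adds it only to the child's environment. The helpers `nodeModulesDir` and `withNodeEnv` are hypothetical, and unlike the PR this sketch sets only `NODE_PATH` and leaves `PATH` alone.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeModulesDir returns the global node_modules directory for the first
// nvm-managed bin directory found in envPath (a colon-separated PATH value).
func nodeModulesDir(envPath string) (string, bool) {
	for _, p := range strings.Split(envPath, ":") {
		if strings.Contains(p, "/.nvm/versions/node/") && strings.HasSuffix(p, "/bin") {
			// Swap only the trailing "/bin", not every occurrence in the path.
			return strings.TrimSuffix(p, "/bin") + "/lib/node_modules", true
		}
	}
	return "", false
}

// withNodeEnv returns a copy of env with NODE_PATH appended when an nvm
// Node.js installation is present on envPath. The input slice is not modified.
func withNodeEnv(env []string, envPath string) []string {
	out := append([]string{}, env...)
	if dir, ok := nodeModulesDir(envPath); ok {
		out = append(out, "NODE_PATH="+dir)
	}
	return out
}

func main() {
	path := "/usr/local/bin:/root/.nvm/versions/node/v8.12.0/bin:/usr/bin"
	fmt.Println(withNodeEnv([]string{"CRAWLAB_TASK_ID=abc"}, path))
}
```

In `SetEnv` the result could be assigned to `cmd.Env`, so each task gets the variable without the parent process being modified.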

View File

@@ -8,12 +8,15 @@ services:
CRAWLAB_SERVER_MASTER: "Y" # whether this is the master node: "Y" for the master node, "N" for a worker node
CRAWLAB_MONGO_HOST: "mongo" # MongoDB host address; inside the docker compose network, reference the service name directly
CRAWLAB_REDIS_ADDRESS: "redis" # Redis host address; inside the docker compose network, reference the service name directly
# CRAWLAB_SERVER_LANG_NODE: "Y" # pre-install the Node.js language environment
ports:
- "8080:8080" # frontend port mapping
- "8000:8000" # backend port mapping
depends_on:
- mongo
- redis
volumes:
- "/Users/marvzhang/projects/crawlab-team/crawlab/docker_init.sh:/app/docker_init.sh"
worker:
image: tikazyq/crawlab:latest
container_name: worker

View File

@@ -1,4 +1,4 @@
#!/bin/sh
#!/bin/bash
# replace default api path to new one
if [ "${CRAWLAB_API_ADDRESS}" = "" ];
@@ -22,5 +22,12 @@ fi
# start nginx
service nginx start
# install languages: Node.js
if [ "${CRAWLAB_SERVER_LANG_NODE}" = "Y" ];
then
echo "installing node.js"
/bin/sh /app/backend/scripts/install-nodejs.sh
fi
# start backend
crawlab

View File

@@ -6,6 +6,10 @@
<meta name="renderer" content="webkit">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">
<link rel="icon" href="/static/favicon.ico" type="image/x-icon">
<!-- Place this tag in your head or just before your close body tag. -->
<script async defer src="https://buttons.github.io/buttons.js"></script>
<title>Crawlab</title>
</head>
<body>

View File

@@ -1,6 +1,6 @@
{
"name": "crawlab",
"version": "0.4.2",
"version": "0.4.3",
"private": true,
"scripts": {
"serve": "vue-cli-service serve --ip=0.0.0.0 --mode=development",
@@ -35,6 +35,7 @@
"vue-ba": "^1.2.5",
"vue-codemirror": "^4.0.6",
"vue-codemirror-lite": "^1.0.4",
"vue-github-button": "^1.1.2",
"vue-i18n": "^8.9.0",
"vue-router": "^3.0.1",
"vue-virtual-scroll-list": "^1.3.9",

View File

@@ -6,6 +6,10 @@
<meta name="renderer" content="webkit">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">
<link rel="icon" href="/favicon.ico" type="image/x-icon">
<!-- Place this tag in your head or just before your close body tag. -->
<script async defer src="https://buttons.github.io/buttons.js"></script>
<title>Crawlab</title>
</head>
<body>

View File

@@ -55,7 +55,7 @@ export default {
})
}
})
this.$st.sendEv('节点详情', '保存')
this.$st.sendEv('节点详情', '概览', '保存')
}
}
}

View File

@@ -51,6 +51,10 @@
</el-form>
</el-row>
<el-row class="button-container" v-if="!isView">
<el-button size="normal" v-if="isShowRun" type="danger" @click="onCrawl"
icon="el-icon-video-play" style="margin-right: 10px">
{{$t('Run')}}
</el-button>
<el-upload
v-if="spiderForm.type === 'customized'"
:action="$request.baseUrl + `/spiders/${spiderForm._id}/upload`"
@@ -69,8 +73,7 @@
icon="el-icon-video-play">
{{$t('Run')}}
</el-button>
<el-button size="small" type="success" @click="onSave"
icon="el-icon-check">
<el-button size="small" type="success" @click="onSave" icon="el-icon-check">
{{$t('Save')}}
</el-button>
</el-row>

View File

@@ -1,18 +1,29 @@
<template>
<div class="node-installation">
<el-form inline>
<el-form-item v-if="!isShowInstalled">
<el-form-item>
<el-autocomplete
v-if="activeLang.executable_name === 'python'"
v-model="depName"
style="width: 240px"
:placeholder="$t('Search Dependencies')"
:fetchSuggestions="fetchAllDepList"
minlength="2"
:minlength="2"
@select="onSearch"
/>
<el-input
v-else
v-model="depName"
style="width: 240px"
:placeholder="$t('Search Dependencies')"
/>
</el-form-item>
<el-form-item>
<el-button icon="el-icon-search" type="success" @click="onSearch">
<el-button
icon="el-icon-search"
type="success"
@click="onSearch"
>
{{$t('Search')}}
</el-button>
</el-form-item>
@@ -20,7 +31,7 @@
<el-checkbox v-model="isShowInstalled" :label="$t('Show installed')" @change="onIsShowInstalledChange"/>
</el-form-item>
</el-form>
<el-tabs v-model="activeTab">
<el-tabs v-model="activeTab" @tab-click="onTabChange">
<el-tab-pane v-for="lang in langList" :key="lang.name" :label="lang.name" :name="lang.executable_name"/>
</el-tabs>
<template v-if="activeLang.installed">
@@ -37,7 +48,7 @@
width="180"
/>
<el-table-column
:label="$t('Latest Version')"
:label="!isShowInstalled ? $t('Latest Version') : $t('Version')"
prop="version"
width="100"
/>
@@ -51,10 +62,24 @@
>
<template slot-scope="scope">
<el-button
v-if="!scope.row.installed"
v-loading="getDepLoading(scope.row)"
:disabled="getDepLoading(scope.row)"
size="mini"
:type="scope.row.installed ? 'danger' : 'primary' "
type="primary"
@click="onClickInstallDep(scope.row)"
>
{{scope.row.installed ? $t('Uninstall') : $t('Install')}}
{{$t('Install')}}
</el-button>
<el-button
v-else
v-loading="getDepLoading(scope.row)"
:disabled="getDepLoading(scope.row)"
size="mini"
type="danger"
@click="onClickUninstallDep(scope.row)"
>
{{$t('Uninstall')}}
</el-button>
</template>
</el-table-column>
@@ -63,7 +88,13 @@
<template v-else>
<div class="install-wrapper">
<h3>{{activeLang.name + $t(' is not installed, do you want to install it?')}}</h3>
<el-button type="primary" style="width: 240px;font-weight: bolder;font-size: 18px">
<el-button
v-loading="isLoadingInstallLang"
:disabled="isLoadingInstallLang"
type="primary"
style="width: 240px;font-weight: bolder;font-size: 18px"
@click="onClickInstallLang"
>
{{$t('Install')}}
</el-button>
</div>
@@ -86,7 +117,9 @@ export default {
depList: [],
loading: false,
isShowInstalled: false,
installedDepList: []
installedDepList: [],
depLoadingDict: {},
isLoadingInstallLang: false
}
},
computed: {
@@ -101,6 +134,9 @@ export default {
}
return {}
},
activeLangName () {
return this.activeLang.executable_name
},
computedDepList () {
if (this.isShowInstalled) {
return this.installedDepList
@@ -117,7 +153,19 @@ export default {
dep_name: this.depName
})
this.loading = false
this.depList = res.data.data.sort((a, b) => a.name > b.name ? 1 : -1)
this.depList = res.data.data
if (this.activeLangName === 'python') {
// Sort by name
this.depList = this.depList.sort((a, b) => a.name > b.name ? 1 : -1)
// Asynchronously fetch additional Python package info
this.depList.map(async dep => {
const res = await this.$request.get(`/system/deps/${this.activeLang.executable_name}/${dep.name}/json`)
dep.version = res.data.data.version
dep.description = res.data.data.description
})
}
},
async getInstalledDepList () {
this.loading = true
@@ -128,8 +176,7 @@ export default {
this.installedDepList = res.data.data
},
async fetchAllDepList (queryString, callback) {
const res = await this.$request.get('/system/deps', {
lang: this.activeLang.executable_name,
const res = await this.$request.get(`/system/deps/${this.activeLang.executable_name}`, {
dep_name: queryString
})
callback(res.data.data ? res.data.data.map(d => {
@@ -137,16 +184,101 @@ export default {
}) : [])
},
onSearch () {
if (!this.isShowInstalled) {
this.getDepList()
} else {
this.getInstalledDepList()
}
this.isShowInstalled = false
this.getDepList()
this.$st.sendEv('节点详情', '安装', '搜索依赖')
},
onIsShowInstalledChange (val) {
if (val) {
this.getInstalledDepList()
} else {
this.depName = ''
this.depList = []
}
this.$st.sendEv('节点详情', '安装', '点击查看已安装')
},
async onClickInstallDep (dep) {
const name = dep.name
this.$set(this.depLoadingDict, name, true)
const arr = this.$route.path.split('/')
const id = arr[arr.length - 1]
const data = await this.$request.post(`/nodes/${id}/deps/install`, {
lang: this.activeLang.executable_name,
dep_name: name
})
if (!data || data.error) {
this.$notify.error({
title: this.$t('Installing dependency failed'),
message: this.$t('The dependency installation is unsuccessful: ') + name
})
} else {
this.$notify.success({
title: this.$t('Installing dependency successful'),
message: this.$t('You have successfully installed a dependency: ') + name
})
dep.installed = true
}
this.$set(this.depLoadingDict, name, false)
this.$st.sendEv('节点详情', '安装', '安装依赖')
},
async onClickUninstallDep (dep) {
const name = dep.name
this.$set(this.depLoadingDict, name, true)
const arr = this.$route.path.split('/')
const id = arr[arr.length - 1]
const data = await this.$request.post(`/nodes/${id}/deps/uninstall`, {
lang: this.activeLang.executable_name,
dep_name: name
})
if (!data || data.error) {
this.$notify.error({
title: this.$t('Uninstalling dependency failed'),
message: this.$t('The dependency uninstallation is unsuccessful: ') + name
})
} else {
this.$notify.success({
title: this.$t('Uninstalling dependency successful'),
message: this.$t('You have successfully uninstalled a dependency: ') + name
})
dep.installed = false
}
this.$set(this.depLoadingDict, name, false)
this.$st.sendEv('节点详情', '安装', '卸载依赖')
},
getDepLoading (dep) {
const name = dep.name
if (this.depLoadingDict[name] === undefined) {
return false
}
return this.depLoadingDict[name]
},
async onClickInstallLang () {
this.isLoadingInstallLang = true
const res = await this.$request.post(`/nodes/${this.nodeForm._id}/langs/install`, {
lang: this.activeLang.executable_name
})
if (!res || res.error) {
this.$notify.error({
title: this.$t('Installing language failed'),
message: this.$t('The language installation is unsuccessful: ') + this.activeLang.name
})
} else {
this.$notify.success({
title: this.$t('Installing language successful'),
message: this.$t('You have successfully installed a language: ') + this.activeLang.name
})
}
this.isLoadingInstallLang = false
this.$st.sendEv('节点详情', '安装', '安装语言')
},
onTabChange () {
if (this.isShowInstalled) {
this.getInstalledDepList()
} else {
this.depName = ''
this.depList = []
}
this.$st.sendEv('节点详情', '安装', '切换标签')
}
},
async created () {
@@ -160,5 +292,13 @@ export default {
</script>
<style scoped>
.node-installation >>> .el-button .el-loading-spinner {
margin-top: -13px;
height: 28px;
}
.node-installation >>> .el-button .el-loading-spinner .circular {
width: 28px;
height: 28px;
}
</style>
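Review note on the `getDepList` hunk above: the per-package lookups are started inside `map` without being awaited, so the method returns before any `version` or `description` has arrived, and a single failed request surfaces as an unhandled rejection. A minimal sketch of one alternative is shown here, assuming the lookup can be wrapped in a function. `enrichDeps` and `fetchInfo` are hypothetical names used only for illustration and are not part of the Crawlab codebase.

```javascript
// Hypothetical helper: resolve extra info for every dependency in parallel.
// `fetchInfo(name)` stands in for the per-package request made in the diff
// and is assumed to resolve to an object with `version` and `description`.
async function enrichDeps (deps, fetchInfo) {
  return Promise.all(deps.map(async dep => {
    try {
      const info = await fetchInfo(dep.name)
      // Return a new object so the caller can assign the whole list at once.
      return { ...dep, version: info.version, description: info.description }
    } catch (e) {
      // A failed lookup keeps the bare entry instead of rejecting the batch.
      return dep
    }
  }))
}
```

Assigning the returned array to `depList` in one step would also sidestep adding new properties to objects that are already in component state, which may not be picked up by Vue 2 reactivity.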


@@ -223,6 +223,8 @@ export default {
'error': '错误',
'Not Found Node': '节点配置错误',
'Not Found Spider': '爬虫配置错误',
'[minute] [hour] [day] [month] [day of week]': '[分] [时] [天] [月] [星期几]',
'Enable/Disable': '启用/禁用',
// Site
'Site': '网站',
@@ -267,6 +269,7 @@ export default {
'Number of CPU': 'CPU数',
'Executables': '执行文件',
'Latest Version': '最新版本',
'Version': '版本',
// Dialog
'Notification': '提示',
@@ -299,7 +302,26 @@ export default {
'Disclaimer': '免责声明',
'Please search dependencies': '请搜索依赖',
'No Data': '暂无数据',
'Show installed': '看已安装',
'Show installed': '看已安装',
'Installing dependency successful': '安装依赖成功',
'Installing dependency failed': '安装依赖失败',
'You have successfully installed a dependency: ': '您已成功安装依赖: ',
'The dependency installation is unsuccessful: ': '安装依赖失败: ',
'Uninstalling dependency successful': '卸载依赖成功',
'Uninstalling dependency failed': '卸载依赖失败',
'You have successfully uninstalled a dependency: ': '您已成功卸载依赖: ',
'The dependency uninstallation is unsuccessful: ': '卸载依赖失败: ',
'Installing language successful': '安装语言成功',
'Installing language failed': '安装语言失败',
'You have successfully installed a language: ': '您已成功安装语言: ',
'The language installation is unsuccessful: ': '安装语言失败: ',
'Enabling the schedule successful': '启用定时任务成功',
'Disabling the schedule successful': '禁用定时任务成功',
'Enabling the schedule unsuccessful': '启用定时任务失败',
'Disabling the schedule unsuccessful': '禁用定时任务失败',
'The schedule has been removed': '已删除定时任务',
'The schedule has been added': '已添加定时任务',
'The schedule has been saved': '已保存定时任务',
// Login
'Sign in': '登录',
@@ -334,5 +356,8 @@ export default {
add_cron: '生成Cron',
// Cron Format: [second] [minute] [hour] [day of month] [month] [day of week]
cron_format: 'Cron 格式: [秒] [分] [小时] [日] [月] [周]'
}
},
// Others
'Star crawlab-team/crawlab on GitHub': '在 GitHub 上为 Crawlab 加星吧'
}


@@ -21,7 +21,12 @@ const actions = {
getScheduleList ({ state, commit }) {
request.get('/schedules')
.then(response => {
commit('SET_SCHEDULE_LIST', response.data.data)
commit('SET_SCHEDULE_LIST', response.data.data.map(d => {
const arr = d.cron.split(' ')
arr.splice(0, 1)
d.cron = arr.join(' ')
return d
}))
})
},
addSchedule ({ state }) {
@@ -33,11 +38,11 @@ const actions = {
removeSchedule ({ state }, id) {
request.delete(`/schedules/${id}`)
},
stopSchedule ({ state, dispatch }, id) {
return request.post(`/schedules/${id}/stop`)
enableSchedule ({ state, dispatch }, id) {
return request.post(`/schedules/${id}/enable`)
},
runSchedule ({ state, dispatch }, id) {
return request.post(`/schedules/${id}/run`)
disableSchedule ({ state, dispatch }, id) {
return request.post(`/schedules/${id}/disable`)
}
}
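Review note on the cron handling in this change: `getScheduleList` drops the first field of each stored expression for display, and `onAddSubmit` in the schedule list view prepends `'0 '` before saving. The sketch below shows that round trip as two small functions, assuming the backend keeps 6-field expressions with seconds first. `toFiveFieldCron` and `toSixFieldCron` are hypothetical names, not functions from the repository.

```javascript
// Hypothetical helpers illustrating the 6-field <-> 5-field conversion.

// Drop the leading seconds field of a 6-field expression for display.
function toFiveFieldCron (cron) {
  const fields = cron.trim().split(/\s+/)
  // Anything that is not exactly 6 fields is returned unchanged.
  return fields.length === 6 ? fields.slice(1).join(' ') : cron
}

// Prepend a fixed "0" seconds field before sending a 5-field expression.
function toSixFieldCron (cron) {
  const fields = cron.trim().split(/\s+/)
  // Anything that is not exactly 5 fields is returned unchanged.
  return fields.length === 5 ? ['0', ...fields].join(' ') : cron
}
```

Checking the field count first means an expression that is already in the target format is left alone, whereas an unconditional `splice(0, 1)` would silently drop the minute field of a 5-field value.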


@@ -180,6 +180,11 @@ const actions = {
async getTemplateList ({ state, commit }) {
const res = await request.get(`/config_spiders_templates`)
commit('SET_TEMPLATE_LIST', res.data.data)
},
async getScheduleList ({ state, commit }, payload) {
const { id } = payload
const res = await request.get(`/spiders/${id}/schedules`)
commit('schedule/SET_SCHEDULE_LIST', res.data.data, { root: true })
}
}


@@ -34,6 +34,17 @@
</a>
<el-dropdown-menu slot="dropdown"></el-dropdown-menu>
</el-dropdown>
<el-dropdown class="github right">
<!-- Place this tag where you want the button to render. -->
<github-button
href="https://github.com/crawlab-team/crawlab"
data-color-scheme="no-preference: light; light: light; dark: dark;"
data-size="large"
data-show-count="true"
:aria-label="$t('Star crawlab-team/crawlab on GitHub')">
Star
</github-button>
</el-dropdown>
</div>
</template>
@@ -41,11 +52,13 @@
import { mapGetters } from 'vuex'
import Breadcrumb from '@/components/Breadcrumb'
import Hamburger from '@/components/Hamburger'
import GithubButton from 'vue-github-button'
export default {
components: {
Breadcrumb,
Hamburger
Hamburger,
GithubButton
},
computed: {
...mapGetters([
@@ -122,6 +135,12 @@ export default {
}
}
.github {
height: 50px;
margin-right: 35px;
margin-top: -10px;
}
.right {
float: right
}


@@ -49,8 +49,7 @@ export default {
},
methods: {
onTabClick (name) {
if (name === 'installation') {
}
this.$st.sendEv('节点详情', '切换标签', name)
},
onNodeChange (id) {
this.$router.push(`/nodes/${id}`)


@@ -240,7 +240,7 @@ export default {
message: 'Deleted successfully'
})
})
})
}))
},
onDeploy (row) {
this.$store.dispatch('spider/getSpiderData', row._id)


@@ -32,8 +32,30 @@
/>
</el-select>
</el-form-item>
<el-form-item :label="$t('Spider')" prop="spider_id" required>
<el-select v-model="scheduleForm.spider_id" :placeholder="$t('Spider')" filterable>
<el-form-item v-if="!isDisabledSpiderSchedule" :label="$t('Spider')" prop="spider_id" required>
<el-select
v-model="scheduleForm.spider_id"
:placeholder="$t('Spider')"
filterable
:disabled="isDisabledSpiderSchedule"
>
<el-option
v-for="op in spiderList"
:key="op._id"
:value="op._id"
:label="`${op.display_name} (${op.name})`"
:disabled="isDisabledSpider(op)"
>
</el-option>
</el-select>
</el-form-item>
<el-form-item v-else :label="$t('Spider')" required>
<el-select
:value="spiderId"
:placeholder="$t('Spider')"
filterable
:disabled="isDisabledSpiderSchedule"
>
<el-option
v-for="op in spiderList"
:key="op._id"
@@ -45,8 +67,10 @@
</el-select>
</el-form-item>
<el-form-item :label="$t('Cron')" prop="cron" required>
<el-input v-model="scheduleForm.cron"
:placeholder="$t('schedules.cron')">
<el-input
v-model="scheduleForm.cron"
:placeholder="`${$t('[minute] [hour] [day] [month] [day of week]')}`"
>
</el-input>
<!--<el-button size="small" style="width:100px" type="primary" @click="onShowCronDialog">{{$t('schedules.add_cron')}}</el-button>-->
</el-form-item>
@@ -115,13 +139,28 @@
</el-tag>
</template>
</el-table-column>
<el-table-column v-else-if="col.name === 'run_type'" :key="col.name" :label="$t(col.label)">
<el-table-column v-else-if="col.name === 'run_type'" :key="col.name" :label="$t(col.label)" :width="col.width">
<template slot-scope="scope">
<template v-if="scope.row.run_type === 'all-nodes'">{{$t('All Nodes')}}</template>
<template v-else-if="scope.row.run_type === 'selected-nodes'">{{$t('Selected Nodes')}}</template>
<template v-else-if="scope.row.run_type === 'random'">{{$t('Random')}}</template>
</template>
</el-table-column>
<el-table-column v-else-if="col.name === 'node_names'" :key="col.name" :label="$t(col.label)" :width="col.width">
<template slot-scope="scope">
{{scope.row.nodes.map(d => d.name).join(', ')}}
</template>
</el-table-column>
<el-table-column v-else-if="col.name === 'enable'" :key="col.name" :label="$t(col.label)" :width="col.width">
<template slot-scope="scope">
<el-switch
v-model="scope.row.enabled"
active-color="#13ce66"
inactive-color="#ff4949"
@change="onEnabledChange(scope.row)"
/>
</template>
</el-table-column>
<el-table-column v-else :key="col.name"
:property="col.name"
:label="$t(col.label)"
@@ -133,7 +172,7 @@
</template>
</el-table-column>
</template>
<el-table-column :label="$t('Action')" align="left" width="180px" fixed="right">
<el-table-column :label="$t('Action')" align="left" width="auto" fixed="right">
<template slot-scope="scope">
<!-- Edit -->
<el-tooltip :content="$t('Edit')" placement="top">
@@ -166,14 +205,15 @@ export default {
data () {
return {
columns: [
{ name: 'name', label: 'Name', width: '180' },
{ name: 'cron', label: 'Cron', width: '120' },
{ name: 'run_type', label: 'Run Type', width: '150' },
{ name: 'node_name', label: 'Node', width: '150' },
{ name: 'spider_name', label: 'Spider', width: '150' },
{ name: 'param', label: 'Parameters', width: '150' },
{ name: 'description', label: 'Description', width: 'auto' },
{ name: 'status', label: 'Status', width: 'auto' }
{ name: 'name', label: 'Name', width: '150px' },
{ name: 'cron', label: 'Cron', width: '120px' },
{ name: 'run_type', label: 'Run Type', width: '120px' },
{ name: 'node_names', label: 'Node', width: '150px' },
{ name: 'spider_name', label: 'Spider', width: '150px' },
{ name: 'param', label: 'Parameters', width: '150px' },
{ name: 'description', label: 'Description', width: '200px' },
{ name: 'enable', label: 'Enable/Disable', width: '120px' }
// { name: 'status', label: 'Status', width: '100px' }
],
isEdit: false,
dialogTitle: '',
@@ -199,6 +239,9 @@ export default {
}
}
return {}
},
isDisabledSpiderSchedule () {
return false
}
},
methods: {
@@ -217,23 +260,27 @@ export default {
onAddSubmit () {
this.$refs.scheduleForm.validate(res => {
if (res) {
const form = JSON.parse(JSON.stringify(this.scheduleForm))
form.cron = '0 ' + this.scheduleForm.cron
if (this.isEdit) {
request.post(`/schedules/${this.scheduleForm._id}`, this.scheduleForm).then(response => {
request.post(`/schedules/${this.scheduleForm._id}`, form).then(response => {
if (response.data.error) {
this.$message.error(response.data.error)
return
}
this.dialogVisible = false
this.$store.dispatch('schedule/getScheduleList')
this.$message.success(this.$t('The schedule has been saved'))
})
} else {
request.put('/schedules', this.scheduleForm).then(response => {
request.put('/schedules', form).then(response => {
if (response.data.error) {
this.$message.error(response.data.error)
return
}
this.dialogVisible = false
this.$store.dispatch('schedule/getScheduleList')
this.$message.success(this.$t('The schedule has been added'))
})
}
}
@@ -258,57 +305,13 @@ export default {
.then(() => {
setTimeout(() => {
this.$store.dispatch('schedule/getScheduleList')
this.$message.success(`Schedule "${row.name}" has been removed`)
this.$message.success(this.$t('The schedule has been removed'))
}, 100)
})
}).catch(() => {
})
this.$st.sendEv('定时任务', '删除定时任务')
},
onCrawl (row) {
// Stop the schedule
if (!row.status || row.status === 'running') {
this.$confirm(this.$t('Are you sure to delete the schedule task?'), this.$t('Notification'), {
confirmButtonText: this.$t('Confirm'),
cancelButtonText: this.$t('Cancel'),
type: 'warning'
}).then(() => {
this.$store.dispatch('schedule/stopSchedule', row._id)
.then((resp) => {
if (resp.data.status === 'ok') {
this.$store.dispatch('schedule/getScheduleList')
return
}
this.$message({
type: 'error',
message: resp.data.error
})
})
}).catch(() => {
})
}
// Run the schedule
if (row.status === 'stop') {
this.$confirm(this.$t('Are you sure to delete the schedule task?'), this.$t('Notification'), {
confirmButtonText: this.$t('Confirm'),
cancelButtonText: this.$t('Cancel'),
type: 'warning'
}).then(() => {
this.$store.dispatch('schedule/runSchedule', row._id)
.then((resp) => {
if (resp.data.status === 'ok') {
this.$store.dispatch('schedule/getScheduleList')
return
}
this.$message({
type: 'error',
message: resp.data.error
})
})
}).catch(() => {
})
}
},
isDisabledSpider (spider) {
if (spider.type === 'customized') {
return !spider.cmd
@@ -324,6 +327,19 @@ export default {
} else if (row.status === 'error') {
return 'Start'
}
},
async onEnabledChange (row) {
let res
if (row.enabled) {
res = await this.$store.dispatch('schedule/enableSchedule', row._id)
} else {
res = await this.$store.dispatch('schedule/disableSchedule', row._id)
}
if (!res || res.data.error) {
this.$message.error(this.$t(`${row.enabled ? 'Enabling' : 'Disabling'} the schedule unsuccessful`))
} else {
this.$message.success(this.$t(`${row.enabled ? 'Enabling' : 'Disabling'} the schedule successful`))
}
}
},
created () {
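Review note on `onEnabledChange`: because the switch is bound with `v-model`, `row.enabled` has already flipped by the time the request is sent, so a failed request appears to leave the switch showing a state the server never accepted. A minimal sketch of a variant that restores the previous value is given below. `toggleSchedule` and the `api` object are hypothetical and only stand in for the `enableSchedule` / `disableSchedule` store actions.

```javascript
// Hypothetical helper: call the matching endpoint and undo the local flip
// if the request fails or the response carries an error.
// `api.enable(id)` and `api.disable(id)` are assumed to return a promise
// resolving to a response shaped like `{ data: { error?: string } }`.
async function toggleSchedule (row, api) {
  const wanted = row.enabled
  try {
    const res = wanted ? await api.enable(row._id) : await api.disable(row._id)
    if (!res || res.data.error) {
      throw new Error(res ? res.data.error : 'empty response')
    }
    return true
  } catch (e) {
    // Restore the value the switch had before the user clicked it.
    row.enabled = !wanted
    return false
  }
}
```

The boolean result can then drive the success or error message, so the notification logic from the diff stays unchanged.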


@@ -25,6 +25,9 @@
<el-tab-pane :label="$t('Analytics')" name="analytics">
<spider-stats ref="spider-stats"/>
</el-tab-pane>
<el-tab-pane :label="$t('Schedules')" name="schedules">
<spider-schedules/>
</el-tab-pane>
</el-tabs>
</div>
</template>
@@ -38,10 +41,12 @@ import SpiderOverview from '../../components/Overview/SpiderOverview'
import EnvironmentList from '../../components/Environment/EnvironmentList'
import SpiderStats from '../../components/Stats/SpiderStats'
import ConfigList from '../../components/Config/ConfigList'
import SpiderSchedules from './SpiderSchedules'
export default {
name: 'SpiderDetail',
components: {
SpiderSchedules,
ConfigList,
SpiderStats,
EnvironmentList,


@@ -164,73 +164,92 @@
<!--tabs-->
<el-tabs v-model="filter.type" @tab-click="onClickTab">
<el-tab-pane :label="$t('All')" name="all"></el-tab-pane>
<el-tab-pane :label="$t('Configurable')" name="configurable"></el-tab-pane>
<el-tab-pane :label="$t('Customized')" name="customized"></el-tab-pane>
<el-tab-pane :label="$t('Configurable')" name="configurable"></el-tab-pane>
</el-tabs>
<!--./tabs-->
<!--table list-->
<el-table :data="spiderList"
class="table"
:header-cell-style="{background:'rgb(48, 65, 86)',color:'white'}"
border
@row-click="onRowClick"
<el-table
:data="spiderList"
class="table"
:header-cell-style="{background:'rgb(48, 65, 86)',color:'white'}"
border
@row-click="onRowClick"
@sort-change="onSortChange"
>
<template v-for="col in columns">
<el-table-column v-if="col.name === 'type'"
:key="col.name"
:label="$t(col.label)"
align="left"
:width="col.width">
<el-table-column
v-if="col.name === 'type'"
:key="col.name"
:label="$t(col.label)"
align="left"
:width="col.width"
:sortable="col.sortable"
>
<template slot-scope="scope">
{{$t(scope.row.type)}}
</template>
</el-table-column>
<el-table-column v-else-if="col.name === 'last_5_errors'"
:key="col.name"
:label="$t(col.label)"
:width="col.width"
align="center">
<el-table-column
v-else-if="col.name === 'last_5_errors'"
:key="col.name"
:label="$t(col.label)"
:width="col.width"
:sortable="col.sortable"
align="center"
>
<template slot-scope="scope">
<div :style="{color:scope.row[col.name]>0?'red':''}">
{{scope.row[col.name]}}
</div>
</template>
</el-table-column>
<el-table-column v-else-if="col.name === 'cmd'"
:key="col.name"
:label="$t(col.label)"
:width="col.width"
align="left">
<el-table-column
v-else-if="col.name === 'cmd'"
:key="col.name"
:label="$t(col.label)"
:width="col.width"
:sortable="col.sortable"
align="left"
>
<template slot-scope="scope">
<el-input v-model="scope.row[col.name]"></el-input>
</template>
</el-table-column>
<el-table-column v-else-if="col.name.match(/_ts$/)"
:key="col.name"
:label="$t(col.label)"
:sortable="col.sortable"
:align="col.align"
:width="col.width">
<el-table-column
v-else-if="col.name.match(/_ts$/)"
:key="col.name"
:label="$t(col.label)"
:sortable="col.sortable"
:align="col.align"
:width="col.width"
>
<template slot-scope="scope">
{{getTime(scope.row[col.name])}}
</template>
</el-table-column>
<el-table-column v-else-if="col.name === 'last_status'"
:key="col.name"
:label="$t(col.label)"
align="left" :width="col.width">
<el-table-column
v-else-if="col.name === 'last_status'"
:key="col.name"
:label="$t(col.label)"
align="left"
:width="col.width"
:sortable="col.sortable"
>
<template slot-scope="scope">
<status-tag :status="scope.row.last_status"/>
</template>
</el-table-column>
<el-table-column v-else
:key="col.name"
:property="col.name"
:label="$t(col.label)"
:sortable="col.sortable"
:align="col.align || 'left'"
:width="col.width">
<el-table-column
v-else
:key="col.name"
:property="col.name"
:label="$t(col.label)"
:sortable="col.sortable"
:align="col.align || 'left'"
:width="col.width"
>
</el-table-column>
</template>
<el-table-column :label="$t('Action')" align="left" fixed="right">
@@ -301,10 +320,14 @@ export default {
keyword: '',
type: 'all'
},
sort: {
sortKey: '',
sortDirection: null
},
types: [],
columns: [
{ name: 'display_name', label: 'Name', width: '160', align: 'left' },
{ name: 'type', label: 'Spider Type', width: '120' },
{ name: 'display_name', label: 'Name', width: '160', align: 'left', sortable: true },
{ name: 'type', label: 'Spider Type', width: '120', sortable: true },
{ name: 'last_status', label: 'Last Status', width: '120' },
{ name: 'last_run_ts', label: 'Last Run', width: '140' },
// { name: 'update_ts', label: 'Update Time', width: '140' },
@@ -544,24 +567,26 @@ export default {
onRowClick (row, column, event) {
this.onView(row, event)
},
onSortChange ({ column, prop, order }) {
this.sort.sortKey = order ? prop : ''
this.sort.sortDirection = order
this.getList()
},
onClickTab (tab) {
this.filter.type = tab.name
this.getList()
},
getList () {
let params = {
pageNum: this.pagination.pageNum,
pageSize: this.pagination.pageSize,
page_num: this.pagination.pageNum,
page_size: this.pagination.pageSize,
sort_key: this.sort.sortKey,
sort_direction: this.sort.sortDirection,
keyword: this.filter.keyword,
type: this.filter.type
}
this.$store.dispatch('spider/getSpiderList', params)
}
// getTypes () {
// request.get(`/spider/types`).then(resp => {
// this.types = resp.data.data
// })
// }
},
async created () {
// fetch spider types
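Review note on the sorting hunk above: `onSortChange` copies the table's sort state into `this.sort`, and `getList` forwards it as `sort_key` / `sort_direction` next to the renamed `page_num` / `page_size` parameters. The sketch below restates that mapping as two pure functions so it can be checked in isolation. `applySortChange` and `buildListParams` are hypothetical names, and the sketch assumes the `sort-change` payload carries `prop` and an `order` of `'ascending'`, `'descending'`, or `null`.

```javascript
// Hypothetical helpers mirroring onSortChange and getList from the diff.

// Translate a sort-change payload into the component's sort state.
// A null order means sorting was cleared, so the key is reset as well.
function applySortChange ({ prop, order }) {
  return { sortKey: order ? prop : '', sortDirection: order }
}

// Build the query parameters sent with the spider list request.
function buildListParams (pagination, sort, filter) {
  return {
    page_num: pagination.pageNum,
    page_size: pagination.pageSize,
    sort_key: sort.sortKey,
    sort_direction: sort.sortDirection,
    keyword: filter.keyword,
    type: filter.type
  }
}
```

If the backend is meant to do the ordering, the column's `sortable` value may need to be `'custom'` rather than `true`, since `true` also sorts the current page on the client.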


@@ -0,0 +1,55 @@
<script>
import ScheduleList from '../schedule/ScheduleList'
export default {
name: 'SpiderSchedules',
extends: ScheduleList,
computed: {
isDisabledSpiderSchedule () {
return true
},
spiderId () {
const arr = this.$route.path.split('/')
return arr[arr.length - 1]
}
},
methods: {
getNodeList () {
this.$request.get('/nodes', {}).then(response => {
this.nodeList = response.data.data.map(d => {
d.systemInfo = {
os: '',
arch: '',
num_cpu: '',
executables: []
}
return d
})
})
},
getSpiderList () {
this.$request.get('/spiders', {})
.then(response => {
this.spiderList = response.data.data.list || []
})
},
onAdd () {
this.isEdit = false
this.dialogVisible = true
this.$store.commit('schedule/SET_SCHEDULE_FORM', { node_ids: [], spider_id: this.spiderId })
this.$st.sendEv('定时任务', '添加定时任务')
}
},
created () {
const arr = this.$route.path.split('/')
const id = arr[arr.length - 1]
this.$store.dispatch(`spider/getScheduleList`, { id })
// Node list
this.getNodeList()
// Spider list
this.getSpiderList()
}
}
</script>


@@ -345,9 +345,9 @@ export default {
this.$store.dispatch('node/getNodeList')
},
mounted () {
// this.handle = setInterval(() => {
// this.$store.dispatch('task/getTaskList')
// }, 5000)
this.handle = setInterval(() => {
this.$store.dispatch('task/getTaskList')
}, 5000)
},
destroyed () {
clearInterval(this.handle)


@@ -3932,6 +3932,11 @@ getpass@^0.1.1:
dependencies:
assert-plus "^1.0.0"
github-buttons@^2.3.0:
version "2.6.0"
resolved "https://registry.npm.taobao.org/github-buttons/download/github-buttons-2.6.0.tgz#fa3e031451cee7ba05c3254fa67c73fe783104dc"
integrity sha1-+j4DFFHO57oFwyVPpnxz/ngxBNw=
glob-base@^0.3.0:
version "0.3.0"
resolved "http://registry.npm.taobao.org/glob-base/download/glob-base-0.3.0.tgz#dbb164f6221b1c0b1ccf82aea328b497df0ea3c4"
@@ -8587,6 +8592,13 @@ vue-eslint-parser@^5.0.0:
esquery "^1.0.1"
lodash "^4.17.11"
vue-github-button@^1.1.2:
version "1.1.2"
resolved "https://registry.npm.taobao.org/vue-github-button/download/vue-github-button-1.1.2.tgz#318518c3a31d0fbd278ebcc80fbc5f88d68836e6"
integrity sha1-MYUYw6MdD70njrzID7xfiNaINuY=
dependencies:
github-buttons "^2.3.0"
vue-hot-reload-api@^2.3.0:
version "2.3.2"
resolved "http://registry.npm.taobao.org/vue-hot-reload-api/download/vue-hot-reload-api-2.3.2.tgz#1fcc1495effe08a790909b46bf7b5c4cfeb6f21b"


@@ -9,6 +9,7 @@ services:
CRAWLAB_MONGO_HOST: "mongo"
CRAWLAB_REDIS_ADDRESS: "redis"
CRAWLAB_LOG_PATH: "/var/logs/crawlab"
CRAWLAB_SETTING_ALLOWREGISTER: "Y"
ports:
- "8080:8080" # frontend
- "8000:8000" # backend


@@ -1,178 +0,0 @@
#!/usr/bin/env bash
# Use this script to test if a given TCP host/port are available
WAITFORIT_cmdname=${0##*/}
echoerr() { if [[ $WAITFORIT_QUIET -ne 1 ]]; then echo "$@" 1>&2; fi }
usage()
{
cat << USAGE >&2
Usage:
$WAITFORIT_cmdname host:port [-s] [-t timeout] [-- command args]
-h HOST | --host=HOST Host or IP under test
-p PORT | --port=PORT TCP port under test
Alternatively, you specify the host and port as host:port
-s | --strict Only execute subcommand if the test succeeds
-q | --quiet Don't output any status messages
-t TIMEOUT | --timeout=TIMEOUT
Timeout in seconds, zero for no timeout
-- COMMAND ARGS Execute command with args after the test finishes
USAGE
exit 1
}
wait_for()
{
if [[ $WAITFORIT_TIMEOUT -gt 0 ]]; then
echoerr "$WAITFORIT_cmdname: waiting $WAITFORIT_TIMEOUT seconds for $WAITFORIT_HOST:$WAITFORIT_PORT"
else
echoerr "$WAITFORIT_cmdname: waiting for $WAITFORIT_HOST:$WAITFORIT_PORT without a timeout"
fi
WAITFORIT_start_ts=$(date +%s)
while :
do
if [[ $WAITFORIT_ISBUSY -eq 1 ]]; then
nc -z $WAITFORIT_HOST $WAITFORIT_PORT
WAITFORIT_result=$?
else
(echo > /dev/tcp/$WAITFORIT_HOST/$WAITFORIT_PORT) >/dev/null 2>&1
WAITFORIT_result=$?
fi
if [[ $WAITFORIT_result -eq 0 ]]; then
WAITFORIT_end_ts=$(date +%s)
echoerr "$WAITFORIT_cmdname: $WAITFORIT_HOST:$WAITFORIT_PORT is available after $((WAITFORIT_end_ts - WAITFORIT_start_ts)) seconds"
break
fi
sleep 1
done
return $WAITFORIT_result
}
wait_for_wrapper()
{
# In order to support SIGINT during timeout: http://unix.stackexchange.com/a/57692
if [[ $WAITFORIT_QUIET -eq 1 ]]; then
timeout $WAITFORIT_BUSYTIMEFLAG $WAITFORIT_TIMEOUT $0 --quiet --child --host=$WAITFORIT_HOST --port=$WAITFORIT_PORT --timeout=$WAITFORIT_TIMEOUT &
else
timeout $WAITFORIT_BUSYTIMEFLAG $WAITFORIT_TIMEOUT $0 --child --host=$WAITFORIT_HOST --port=$WAITFORIT_PORT --timeout=$WAITFORIT_TIMEOUT &
fi
WAITFORIT_PID=$!
trap "kill -INT -$WAITFORIT_PID" INT
wait $WAITFORIT_PID
WAITFORIT_RESULT=$?
if [[ $WAITFORIT_RESULT -ne 0 ]]; then
echoerr "$WAITFORIT_cmdname: timeout occurred after waiting $WAITFORIT_TIMEOUT seconds for $WAITFORIT_HOST:$WAITFORIT_PORT"
fi
return $WAITFORIT_RESULT
}
# process arguments
while [[ $# -gt 0 ]]
do
case "$1" in
*:* )
WAITFORIT_hostport=(${1//:/ })
WAITFORIT_HOST=${WAITFORIT_hostport[0]}
WAITFORIT_PORT=${WAITFORIT_hostport[1]}
shift 1
;;
--child)
WAITFORIT_CHILD=1
shift 1
;;
-q | --quiet)
WAITFORIT_QUIET=1
shift 1
;;
-s | --strict)
WAITFORIT_STRICT=1
shift 1
;;
-h)
WAITFORIT_HOST="$2"
if [[ $WAITFORIT_HOST == "" ]]; then break; fi
shift 2
;;
--host=*)
WAITFORIT_HOST="${1#*=}"
shift 1
;;
-p)
WAITFORIT_PORT="$2"
if [[ $WAITFORIT_PORT == "" ]]; then break; fi
shift 2
;;
--port=*)
WAITFORIT_PORT="${1#*=}"
shift 1
;;
-t)
WAITFORIT_TIMEOUT="$2"
if [[ $WAITFORIT_TIMEOUT == "" ]]; then break; fi
shift 2
;;
--timeout=*)
WAITFORIT_TIMEOUT="${1#*=}"
shift 1
;;
--)
shift
WAITFORIT_CLI=("$@")
break
;;
--help)
usage
;;
*)
echoerr "Unknown argument: $1"
usage
;;
esac
done
if [[ "$WAITFORIT_HOST" == "" || "$WAITFORIT_PORT" == "" ]]; then
echoerr "Error: you need to provide a host and port to test."
usage
fi
WAITFORIT_TIMEOUT=${WAITFORIT_TIMEOUT:-15}
WAITFORIT_STRICT=${WAITFORIT_STRICT:-0}
WAITFORIT_CHILD=${WAITFORIT_CHILD:-0}
WAITFORIT_QUIET=${WAITFORIT_QUIET:-0}
# check to see if timeout is from busybox?
WAITFORIT_TIMEOUT_PATH=$(type -p timeout)
WAITFORIT_TIMEOUT_PATH=$(realpath $WAITFORIT_TIMEOUT_PATH 2>/dev/null || readlink -f $WAITFORIT_TIMEOUT_PATH)
if [[ $WAITFORIT_TIMEOUT_PATH =~ "busybox" ]]; then
WAITFORIT_ISBUSY=1
WAITFORIT_BUSYTIMEFLAG="-t"
else
WAITFORIT_ISBUSY=0
WAITFORIT_BUSYTIMEFLAG=""
fi
if [[ $WAITFORIT_CHILD -gt 0 ]]; then
wait_for
WAITFORIT_RESULT=$?
exit $WAITFORIT_RESULT
else
if [[ $WAITFORIT_TIMEOUT -gt 0 ]]; then
wait_for_wrapper
WAITFORIT_RESULT=$?
else
wait_for
WAITFORIT_RESULT=$?
fi
fi
if [[ $WAITFORIT_CLI != "" ]]; then
if [[ $WAITFORIT_RESULT -ne 0 && $WAITFORIT_STRICT -eq 1 ]]; then
echoerr "$WAITFORIT_cmdname: strict mode, refusing to execute subprocess"
exit $WAITFORIT_RESULT
fi
exec "${WAITFORIT_CLI[@]}"
else
exit $WAITFORIT_RESULT
fi