diff --git a/CHANGELOG.md b/CHANGELOG.md
index 95ef9cd7..671aa8a1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,33 @@
+# 0.4.2 (unknown)
+### Features / Enhancement
+- **Disclaimer**. Added page for Disclaimer.
+- **Call API to fetch version**. [#371](https://github.com/crawlab-team/crawlab/issues/371)
+- **Configure to allow user registration**. [#346](https://github.com/crawlab-team/crawlab/issues/346)
+- **Allow adding new users**.
+
+### Bug Fixes
+- **"mongodb no reachable" error**. [#373](https://github.com/crawlab-team/crawlab/issues/373)
+
+# 0.4.1 (2019-12-13)
+### Features / Enhancement
+- **Spiderfile Optimization**. Stages changed from dictionary to array. [#358](https://github.com/crawlab-team/crawlab/issues/358)
+- **Baidu Tongji Update**.
+
+### Bug Fixes
+- **Unable to display schedule tasks**. [#353](https://github.com/crawlab-team/crawlab/issues/353)
+- **Duplicate node registration**. [#334](https://github.com/crawlab-team/crawlab/issues/334)
+
+# 0.4.0 (2019-12-06)
+### Features / Enhancement
+- **Configurable Spider**. Allow users to add spiders using *Spiderfile* to configure crawling rules.
+- **Execution Mode**. Allow users to select 3 modes for task execution: *All Nodes*, *Selected Nodes* and *Random*.
+
+### Bug Fixes
+- **Task accidentally killed**. [#306](https://github.com/crawlab-team/crawlab/issues/306)
+- **Documentation fix**. [#258](https://github.com/crawlab-team/crawlab/issues/258) [#301](https://github.com/crawlab-team/crawlab/issues/301)
+- **Direct deploy incompatible with Windows**. [#288](https://github.com/crawlab-team/crawlab/issues/288)
+- **Log files lost**. [#269](https://github.com/crawlab-team/crawlab/issues/269)
+
 # 0.3.5 (2019-10-28)
 ### Features / Enhancement
 - **Graceful Shutdown**. [detail](https://github.com/crawlab-team/crawlab/commit/63fab3917b5a29fd9770f9f51f1572b9f0420385)
diff --git a/DISCLAIMER-zh.md b/DISCLAIMER-zh.md
new file mode 100644
index 00000000..a329e4e9
--- /dev/null
+++ b/DISCLAIMER-zh.md
@@ -0,0 +1,12 @@
+# 免责声明
+
+本免责及隐私保护声明(以下简称“免责声明”或“本声明”)适用于 Crawlab 开发组(以下简称“开发组”)研发的系列软件(以下简称“Crawlab”)。在您阅读本声明后,若不同意此声明中的任何条款,或对本声明存在质疑,请立刻停止使用我们的软件。若您已经开始或正在使用 Crawlab,则表示您已阅读并同意本声明的所有条款之约定。
+
+1. 总则:您通过安装 Crawlab 并使用 Crawlab 提供的服务与功能,即表示您已经同意与开发组订立本协议。开发组可随时全权决定更改“条款”。经修订的“条款”一经在 GitHub 免责声明页面上公布后,立即自动生效。
+2. 本产品是基于Golang的分布式爬虫管理平台,支持Python、NodeJS、Go、Java、PHP等多种编程语言以及多种爬虫框架。
+3. 一切因使用 Crawlab 而引致之任何意外、疏忽、合约毁坏、诽谤、版权或知识产权侵犯及其所造成的损失(包括在非官方站点下载 Crawlab 而感染电脑病毒),Crawlab 开发组概不负责,亦不承担任何法律责任。
+4. 用户对使用 Crawlab 自行承担风险,我们不做任何形式的保证;因网络状况、通讯线路等任何技术原因而导致用户不能正常升级更新,我们也不承担任何法律责任。
+5. 用户使用 Crawlab 对目标网站进行抓取时,需遵从《网络安全法》等与爬虫相关的法律法规,切勿擅自采集公民个人信息、以 DDoS 等方式造成目标网站瘫痪,或不遵从目标网站的 robots.txt 协议。
+6. Crawlab 尊重并保护所有用户的个人隐私权,不会窃取任何用户计算机中的信息。
+7. 系统的版权:Crawlab 开发组对所有开发的或合作开发的产品拥有知识产权、著作权和使用权,这些产品受到适用的知识产权、版权、商标、服务商标、专利或其他法律的保护。
+8. 传播:任何公司或个人均可在网络上发布、传播我们的软件,但因公司或个人传播软件而可能造成的任何法律和刑事事件,Crawlab 开发组不负任何责任。
diff --git a/DISCLAIMER.md b/DISCLAIMER.md
new file mode 100644
index 00000000..72aae961
--- /dev/null
+++ b/DISCLAIMER.md
@@ -0,0 +1,12 @@
+# Disclaimer
+
+This disclaimer and privacy protection statement (hereinafter referred to as the "disclaimer" or "this statement") applies to the series of software (hereinafter referred to as "Crawlab") developed by the Crawlab development group (hereinafter referred to as the "development group"). After reading this statement, if you do not agree with any of its terms or have doubts about it, please stop using our software immediately. If you have started or are already using Crawlab, you are deemed to have read and agreed to all terms of this statement.
+
+1. General: by installing Crawlab and using the services and functions it provides, you agree to enter into this agreement with the development group. The development group may change these terms at any time at its sole discretion. Amended terms take effect automatically as soon as they are published on the GitHub disclaimer page.
+2. This product is a Golang-based distributed crawler management platform that supports multiple programming languages, including Python, NodeJS, Go, Java and PHP, as well as a variety of crawler frameworks.
+3. The Crawlab development group accepts no responsibility and assumes no legal liability for any accident, negligence, breach of contract, defamation, or copyright or intellectual property infringement arising from the use of Crawlab, nor for any loss so caused (including computer virus infection from downloading Crawlab from an unofficial site).
+4. Users use Crawlab at their own risk. We make no guarantee of any kind, and we assume no legal liability if users cannot upgrade or update normally due to technical reasons such as network conditions or communication lines.
+5. When using Crawlab to crawl a target website, users must comply with crawler-related laws and regulations such as the Cybersecurity Law. Do not collect citizens' personal information without authorization, paralyze a target website through DDoS or similar attacks, or ignore a target website's robots.txt protocol.
+6. Crawlab respects and protects the personal privacy of all users and will not steal any information from users' computers.
+7. Copyright: the Crawlab development group owns the intellectual property rights, copyright and rights of use for all products it develops or co-develops, and these products are protected by applicable intellectual property, copyright, trademark, service mark, patent and other laws.
+8. Distribution: any company or individual may publish and distribute our software on the Internet, but the Crawlab development group assumes no responsibility for any legal or criminal matters that may result from such distribution.
\ No newline at end of file
diff --git a/Dockerfile b/Dockerfile
index 0809a0ba..ddb4d47e 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -37,7 +37,6 @@ RUN apt-get update \
 RUN pip install scrapy pymongo bs4 requests -i https://pypi.tuna.tsinghua.edu.cn/simple
 
 # copy backend files
-COPY --from=backend-build /go/src/app .
 COPY --from=backend-build /go/bin/crawlab /usr/local/bin
 
 # install nginx
diff --git a/Dockerfile.local b/Dockerfile.local
index ed4e7e96..ddb4d47e 100644
--- a/Dockerfile.local
+++ b/Dockerfile.local
@@ -4,16 +4,17 @@ WORKDIR /go/src/app
 COPY ./backend .
 
 ENV GO111MODULE on
-ENV GOPROXY https://mirrors.aliyun.com/goproxy/
+ENV GOPROXY https://goproxy.io
 
 RUN go install -v ./...
 
-FROM node:8.16.0 AS frontend-build
+FROM node:8.16.0-alpine AS frontend-build
 
 ADD ./frontend /app
 WORKDIR /app
 
 # install frontend
+RUN npm config set unsafe-perm true
 RUN npm install -g yarn && yarn install --registry=https://registry.npm.taobao.org
 RUN npm run build:prod
 
@@ -36,7 +37,6 @@ RUN apt-get update \
 RUN pip install scrapy pymongo bs4 requests -i https://pypi.tuna.tsinghua.edu.cn/simple
 
 # copy backend files
-COPY --from=backend-build /go/src/app .
COPY --from=backend-build /go/bin/crawlab /usr/local/bin # install nginx @@ -56,4 +56,4 @@ EXPOSE 8080 EXPOSE 8000 # start backend -CMD ["/bin/sh", "/app/docker_init.sh"] \ No newline at end of file +CMD ["/bin/sh", "/app/docker_init.sh"] diff --git a/README-zh.md b/README-zh.md index a12eacc4..0c943c3e 100644 --- a/README-zh.md +++ b/README-zh.md @@ -10,7 +10,7 @@ 中文 | [English](https://github.com/crawlab-team/crawlab) -[安装](#安装) | [运行](#运行) | [截图](#截图) | [架构](#架构) | [集成](#与其他框架的集成) | [比较](#与其他框架比较) | [相关文章](#相关文章) | [社区&赞助](#社区--赞助) +[安装](#安装) | [运行](#运行) | [截图](#截图) | [架构](#架构) | [集成](#与其他框架的集成) | [比较](#与其他框架比较) | [相关文章](#相关文章) | [社区&赞助](#社区--赞助) | [免责声明](https://github.com/crawlab-team/crawlab/blob/master/DISCLAIMER-zh.md) 基于Golang的分布式爬虫管理平台,支持Python、NodeJS、Go、Java、PHP等多种编程语言以及多种爬虫框架。 diff --git a/README.md b/README.md index 70822b1d..11ac8383 100644 --- a/README.md +++ b/README.md @@ -10,7 +10,7 @@ [中文](https://github.com/crawlab-team/crawlab/blob/master/README-zh.md) | English -[Installation](#installation) | [Run](#run) | [Screenshot](#screenshot) | [Architecture](#architecture) | [Integration](#integration-with-other-frameworks) | [Compare](#comparison-with-other-frameworks) | [Community & Sponsorship](#community--sponsorship) +[Installation](#installation) | [Run](#run) | [Screenshot](#screenshot) | [Architecture](#architecture) | [Integration](#integration-with-other-frameworks) | [Compare](#comparison-with-other-frameworks) | [Community & Sponsorship](#community--sponsorship) | [Disclaimer](https://github.com/crawlab-team/crawlab/blob/master/DISCLAIMER.md) Golang-based distributed web crawler management platform, supporting various languages including Python, NodeJS, Go, Java, PHP and various web crawler frameworks including Scrapy, Puppeteer, Selenium. 
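The 0.4.2 entries above add two read-only endpoints, wired up in backend/main.go and the new backend/routes/setting.go later in this patch: GET /setting is registered on the anonymous group and reports whether user registration is allowed, while GET /version sits behind the auth middleware and returns the release version from config. A minimal sketch of how a client might consume the anonymous endpoint — the base URL (the docker-compose default http://localhost:8000) and the lowercase JSON tags on the shared Response struct are assumptions, not something this diff confirms:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// settingResponse mirrors routes.Response with routes.SettingBody as Data;
// only the "allow_register" tag is confirmed by the SettingBody definition
// in this patch, the other field tags are assumed.
type settingResponse struct {
	Status  string `json:"status"`
	Message string `json:"message"`
	Data    struct {
		AllowRegister string `json:"allow_register"`
	} `json:"data"`
}

func main() {
	resp, err := http.Get("http://localhost:8000/setting")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var body settingResponse
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		panic(err)
	}

	// "Y" lets the frontend show the registration entry, "N" hides it
	// (matching the allowRegister: "N" default added to backend/conf/config.yml).
	fmt.Printf("allow register: %s\n", body.Data.AllowRegister)
}
```

GET /version works the same way but requires the Authorization token issued by /login.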
diff --git a/backend/conf/config.yml b/backend/conf/config.yml
index 60d2bd41..a6522ba5 100644
--- a/backend/conf/config.yml
+++ b/backend/conf/config.yml
@@ -32,3 +32,6 @@ task:
   workers: 4
 other:
   tmppath: "/tmp"
+version: 0.4.1
+setting:
+  allowRegister: "N"
\ No newline at end of file
diff --git a/backend/database/mongo.go b/backend/database/mongo.go
index e72baeaa..d646285d 100644
--- a/backend/database/mongo.go
+++ b/backend/database/mongo.go
@@ -61,10 +61,36 @@ func InitMongo() error {
 		dialInfo.Password = mongoPassword
 		dialInfo.Source = mongoAuth
 	}
-	sess, err := mgo.DialWithInfo(&dialInfo)
-	if err != nil {
-		return err
+
+	// mongo session
+	var sess *mgo.Session
+
+	// error count
+	errNum := 0
+
+	// keep retrying the mongo connection
+	for {
+		var err error
+
+		// connect to mongo
+		sess, err = mgo.DialWithInfo(&dialInfo)
+
+		if err != nil {
+			// on connection error, sleep for 1 second and increment the error count
+			time.Sleep(1 * time.Second)
+			errNum++
+
+			// if the error count reaches 30, return the error
+			if errNum >= 30 {
+				return err
+			}
+		} else {
+			// no error, exit the loop
+			break
+		}
 	}
+
+	// assign to the global mongo session
 	Session = sess
 }
 return nil
diff --git a/backend/entity/config_spider.go b/backend/entity/config_spider.go
index 3fe28bc9..d9e085d2 100644
--- a/backend/entity/config_spider.go
+++ b/backend/entity/config_spider.go
@@ -5,7 +5,7 @@ type ConfigSpiderData struct {
 	Engine     string            `yaml:"engine" json:"engine"`
 	StartUrl   string            `yaml:"start_url" json:"start_url"`
 	StartStage string            `yaml:"start_stage" json:"start_stage"`
-	Stages     map[string]Stage  `yaml:"stages" json:"stages"`
+	Stages     []Stage           `yaml:"stages" json:"stages"`
 	Settings   map[string]string `yaml:"settings" json:"settings"`
 }
diff --git a/backend/main.go b/backend/main.go
index 0d7b7cc1..d14f64b7 100644
--- a/backend/main.go
+++ b/backend/main.go
@@ -114,9 +114,9 @@ func main() {
 	app.Use(middlewares.CORSMiddleware())
 	anonymousGroup := app.Group("/")
 	{
-		anonymousGroup.POST("/login", routes.Login)  // user login
-		anonymousGroup.PUT("/users", routes.PutUser) // add user
-
+		anonymousGroup.POST("/login", routes.Login)       // user login
+		anonymousGroup.PUT("/users", routes.PutUser)      // add user
+		anonymousGroup.GET("/setting", routes.GetSetting) // get settings
 	}
 	authGroup := app.Group("/", middlewares.AuthorizationMiddleware())
 	{
@@ -176,6 +176,8 @@ func main() {
 		authGroup.POST("/users/:id", routes.PostUser)     // update user
 		authGroup.DELETE("/users/:id", routes.DeleteUser) // delete user
 		authGroup.GET("/me", routes.GetMe)                // get own account
+		// release version
+		authGroup.GET("/version", routes.GetVersion) // get released version
 	}
 }
diff --git a/backend/mock/schedule.go b/backend/mock/schedule.go
index 702e8754..57d757e7 100644
--- a/backend/mock/schedule.go
+++ b/backend/mock/schedule.go
@@ -10,12 +10,15 @@ import (
 	"time"
 )
 
+var NodeIdss = []bson.ObjectId{bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
+	bson.ObjectIdHex("5d429e6c19f7abede924fee1")}
+
 var scheduleList = []model.Schedule{
 	{
 		Id:       bson.ObjectId("5d429e6c19f7abede924fee2"),
 		Name:     "test schedule",
 		SpiderId: "123",
-		NodeId:   bson.ObjectId("5d429e6c19f7abede924fee2"),
+		NodeIds:  NodeIdss,
 		Cron:     "***1*",
 		EntryId:  10,
 		// for frontend display
@@ -29,7 +32,7 @@
 		Id:       bson.ObjectId("xx429e6c19f7abede924fee2"),
 		Name:     "test schedule2",
 		SpiderId: "234",
-		NodeId:   bson.ObjectId("5d429e6c19f7abede924fee2"),
+		NodeIds:  NodeIdss,
 		Cron:     "***1*",
 		EntryId:  10,
 		// for frontend display
@@ -100,8 +103,10 @@ func PutSchedule(c *gin.Context) {
 	}
 
 	// if a node id is empty, set it to the null ObjectId
-	if item.NodeId == "" {
-		item.NodeId = bson.ObjectIdHex(constants.ObjectIdNull)
+	for i, nodeId := range item.NodeIds {
+		if nodeId == "" {
+			// assign via index: a range variable is a copy, so writing to it would be a no-op
+			item.NodeIds[i] = bson.ObjectIdHex(constants.ObjectIdNull)
+		}
 	}
 
 	c.JSON(http.StatusOK, Response{
diff --git a/backend/mock/schedule_test.go b/backend/mock/schedule_test.go
index 12843c75..2c5c2701 100644
--- a/backend/mock/schedule_test.go
+++ b/backend/mock/schedule_test.go
@@ -75,7 +75,7 @@ func TestPostSchedule(t *testing.T) {
 		Id:       bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
 		Name:     "test schedule",
 		SpiderId: bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
-		NodeId:   bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
+		NodeIds:  NodeIdss,
 		Cron:     "***1*",
 		EntryId:  10,
 		// for frontend display
@@ -112,7 +112,7 @@ func TestPutSchedule(t *testing.T) {
 		Id:       bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
 		Name:     "test schedule",
 		SpiderId: bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
-		NodeId:   bson.ObjectIdHex("5d429e6c19f7abede924fee2"),
+		NodeIds:  NodeIdss,
 		Cron:     "***1*",
 		EntryId:  10,
 		// for frontend display
diff --git a/backend/model/config_spider/common.go b/backend/model/config_spider/common.go
index c803755a..4d244fe1 100644
--- a/backend/model/config_spider/common.go
+++ b/backend/model/config_spider/common.go
@@ -15,16 +15,12 @@ func GetAllFields(data entity.ConfigSpiderData) []entity.Field {
 func GetStartStageName(data entity.ConfigSpiderData) string {
 	// if start_stage is set, return it
 	if data.StartStage != "" {
-		for stageName := range data.Stages {
-			if stageName == data.StartStage {
-				return data.StartStage
-			}
-		}
+		return data.StartStage
 	}
 
 	// otherwise return the first stage
-	for stageName := range data.Stages {
-		return stageName
+	for _, stage := range data.Stages {
+		return stage.Name
 	}
 	return ""
 }
diff --git a/backend/model/config_spider/scrapy.go b/backend/model/config_spider/scrapy.go
index 6fcb77f0..ee24a3e7 100644
--- a/backend/model/config_spider/scrapy.go
+++ b/backend/model/config_spider/scrapy.go
@@ -83,7 +83,8 @@ func (g ScrapyGenerator) ProcessSpider() error {
 	// replace parsers
 	strParser := ""
-	for stageName, stage := range g.ConfigData.Stages {
+	for _, stage := range g.ConfigData.Stages {
+		stageName := stage.Name
 		stageStr := g.GetParserString(stageName, stage)
 		strParser += stageStr
 	}
diff --git a/backend/model/node.go b/backend/model/node.go
index d662ab6d..2fe810f8 100644
--- a/backend/model/node.go
+++ b/backend/model/node.go
@@ -55,7 +55,7 @@ func GetCurrentNode() (Node, error) {
 	for {
 		// if the error count exceeds 10
 		if errNum >= 10 {
-			panic("cannot get current node")
+			return node, errors.New("cannot get current node")
 		}
 
 		// try to fetch the node
diff --git a/backend/model/schedule.go b/backend/model/schedule.go
index 39e1244f..c1923885 100644
--- a/backend/model/schedule.go
+++ b/backend/model/schedule.go
@@ -12,15 +12,18 @@ import (
 )
 
 type Schedule struct {
-	Id          bson.ObjectId `json:"_id" bson:"_id"`
-	Name        string        `json:"name" bson:"name"`
-	Description string        `json:"description" bson:"description"`
-	SpiderId    bson.ObjectId `json:"spider_id" bson:"spider_id"`
-	NodeId      bson.ObjectId `json:"node_id" bson:"node_id"`
-	NodeKey     string        `json:"node_key" bson:"node_key"`
-	Cron        string        `json:"cron" bson:"cron"`
-	EntryId     cron.EntryID  `json:"entry_id" bson:"entry_id"`
-	Param       string        `json:"param" bson:"param"`
+	Id          bson.ObjectId `json:"_id" bson:"_id"`
+	Name        string        `json:"name" bson:"name"`
+	Description string        `json:"description" bson:"description"`
+	SpiderId    bson.ObjectId `json:"spider_id" bson:"spider_id"`
+	//NodeId    bson.ObjectId `json:"node_id" bson:"node_id"`
+	//NodeKey   string        `json:"node_key" bson:"node_key"`
+	Cron        string        `json:"cron" bson:"cron"`
+	EntryId     cron.EntryID  `json:"entry_id" bson:"entry_id"`
+	Param       string        `json:"param" bson:"param"`
+	RunType     string        `json:"run_type" bson:"run_type"`
+	NodeIds     []bson.ObjectId `json:"node_ids" bson:"node_ids"`
+
 	// status
 	Status string `json:"status" bson:"status"`
 
@@ -49,26 +52,26 @@ func (sch *Schedule) Delete() error {
 	return c.RemoveId(sch.Id)
 }
 
-func (sch *Schedule) SyncNodeIdAndSpiderId(node Node, spider Spider) {
-	sch.syncNodeId(node)
-	sch.syncSpiderId(spider)
-}
+//func (sch *Schedule) SyncNodeIdAndSpiderId(node Node, spider Spider) {
+//	sch.syncNodeId(node)
+//	sch.syncSpiderId(spider)
+//}
 
-func (sch *Schedule) syncNodeId(node Node) {
-	if node.Id.Hex() == sch.NodeId.Hex() {
-		return
-	}
-	sch.NodeId = node.Id
-	_ = sch.Save()
-}
+//func (sch *Schedule) syncNodeId(node Node) {
+//	if node.Id.Hex() == sch.NodeId.Hex() {
+//		return
+//	}
+//	sch.NodeId = node.Id
+//	_ = sch.Save()
+//}
 
-func (sch *Schedule) syncSpiderId(spider Spider) {
-	if spider.Id.Hex() == sch.SpiderId.Hex() {
-		return
-	}
-	sch.SpiderId = spider.Id
-	_ = sch.Save()
-}
+//func (sch *Schedule) syncSpiderId(spider Spider) {
+//	if spider.Id.Hex() == sch.SpiderId.Hex() {
+//		return
+//	}
+//	sch.SpiderId = spider.Id
+//	_ = sch.Save()
+//}
 
 func GetScheduleList(filter interface{}) ([]Schedule, error) {
 	s, c := database.GetCol("schedules")
@@ -81,20 +84,20 @@ func GetScheduleList(filter interface{}) ([]Schedule, error) {
 	var schs []Schedule
 	for _, schedule := range schedules {
-		// get the node name
-		if schedule.NodeId == bson.ObjectIdHex(constants.ObjectIdNull) {
-			// all nodes selected
-			schedule.NodeName = "All Nodes"
-		} else {
-			// a single node selected
-			node, err := GetNode(schedule.NodeId)
-			if err != nil {
-				schedule.Status = constants.ScheduleStatusError
-				schedule.Message = constants.ScheduleStatusErrorNotFoundNode
-			} else {
-				schedule.NodeName = node.Name
-			}
-		}
+		// TODO: get the node name
+		//if schedule.NodeId == bson.ObjectIdHex(constants.ObjectIdNull) {
+		//	// all nodes selected
+		//	schedule.NodeName = "All Nodes"
+		//} else {
+		//	// a single node selected
+		//	node, err := GetNode(schedule.NodeId)
+		//	if err != nil {
+		//		schedule.Status = constants.ScheduleStatusError
+		//		schedule.Message = constants.ScheduleStatusErrorNotFoundNode
+		//	} else {
+		//		schedule.NodeName = node.Name
+		//	}
+		//}
 
 		// get the spider name
 		spider, err := GetSpider(schedule.SpiderId)
@@ -130,12 +133,13 @@ func UpdateSchedule(id bson.ObjectId, item Schedule) error {
 	if err := c.FindId(id).One(&result); err != nil {
 		return err
 	}
-	node, err := GetNode(item.NodeId)
-	if err != nil {
-		return err
-	}
+	//node, err := GetNode(item.NodeId)
+	//if err != nil {
+	//	return err
+	//}
 
-	item.NodeKey = node.Key
+	item.UpdateTs = time.Now()
+	//item.NodeKey = node.Key
 	if err := item.Save(); err != nil {
 		return err
 	}
@@ -146,15 +150,15 @@ func AddSchedule(item Schedule) error {
 	s, c := database.GetCol("schedules")
 	defer s.Close()
 
-	node, err := GetNode(item.NodeId)
-	if err != nil {
-		return err
-	}
+	//node, err := GetNode(item.NodeId)
+	//if err != nil {
+	//	return err
+	//}
 
 	item.Id = bson.NewObjectId()
 	item.CreateTs = time.Now()
 	item.UpdateTs = time.Now()
-	item.NodeKey = node.Key
+	//item.NodeKey = node.Key
 
 	if err := c.Insert(&item); err != nil {
 		debug.PrintStack()
diff --git a/backend/model/spider.go b/backend/model/spider.go
index a0d72c1c..78adc4d0 100644
--- a/backend/model/spider.go
+++ b/backend/model/spider.go
@@ -319,11 +319,5 @@ func GetConfigSpiderData(spider Spider) (entity.ConfigSpiderData, error) {
 		return configData, err
 	}
 
-	// assign stage_name
-	for stageName, stage := range configData.Stages {
-		stage.Name = stageName
-		configData.Stages[stageName] = stage
-	}
-
 	return configData, nil
 }
diff --git a/backend/routes/setting.go b/backend/routes/setting.go
new file mode 100644
index 00000000..4429873e
--- /dev/null
+++ b/backend/routes/setting.go
@@ -0,0 +1,33 @@
+package routes
+
+import (
+	"github.com/gin-gonic/gin"
+	"github.com/spf13/viper"
+	"net/http"
+)
+
+type SettingBody struct {
+	AllowRegister string `json:"allow_register"`
+}
+
+func GetVersion(c *gin.Context) {
+	version := viper.GetString("version")
+
+	c.JSON(http.StatusOK, Response{
+		Status:  "ok",
+		Message: "success",
+		Data:    version,
+	})
+}
+
+func GetSetting(c *gin.Context) {
+	allowRegister := viper.GetString("setting.allowRegister")
+
+	body := SettingBody{AllowRegister: allowRegister}
+
+	c.JSON(http.StatusOK, Response{
+		Status:  "ok",
+		Message: "success",
+		Data:    body,
+	})
+}
diff --git a/backend/routes/task.go b/backend/routes/task.go
index 6b91ed66..d5e3cacc 100644
--- a/backend/routes/task.go
+++ b/backend/routes/task.go
@@ -119,7 +119,6 @@ func PutTask(c *gin.Context) {
 				return
 			}
 		}
-
 	} else if reqBody.RunType == constants.RunTypeRandom {
 		// random
 		t := model.Task{
@@ -130,7 +129,6 @@
 			HandleError(http.StatusInternalServerError, c, err)
 			return
 		}
-
 	} else if reqBody.RunType == constants.RunTypeSelectedNodes {
 		// selected nodes
 		for _, nodeId := range reqBody.NodeIds {
@@ -145,7 +143,6 @@
 				return
 			}
 		}
-
 	} else {
 		HandleErrorF(http.StatusInternalServerError, c, "invalid run_type")
 		return
diff --git a/backend/routes/user.go b/backend/routes/user.go
index a6d44cae..33b6a958 100644
--- a/backend/routes/user.go
+++ b/backend/routes/user.go
@@ -21,6 +21,7 @@ type UserListRequestData struct {
 type UserRequestData struct {
 	Username string `json:"username"`
 	Password string `json:"password"`
+	Role     string `json:"role"`
 }
 
 func GetUser(c *gin.Context) {
@@ -88,11 +89,16 @@ func PutUser(c *gin.Context) {
 		return
 	}
 
+	// default to the normal user role
+	if reqData.Role == "" {
+		reqData.Role = constants.RoleNormal
+	}
+
 	// add user
 	user := model.User{
 		Username: strings.ToLower(reqData.Username),
 		Password: utils.EncryptPassword(reqData.Password),
-		Role:     constants.RoleNormal,
+		Role:     reqData.Role,
 	}
 	if err := user.Add(); err != nil {
 		HandleError(http.StatusInternalServerError, c, err)
diff --git a/backend/services/config_spider.go b/backend/services/config_spider.go
index 7c736cc7..fe0a3da1 100644
--- a/backend/services/config_spider.go
+++ b/backend/services/config_spider.go
@@ -61,7 +61,9 @@ func ValidateSpiderfile(configData entity.ConfigSpiderData) error {
 	// validate stages
 	dict := map[string]int{}
-	for stageName, stage := range configData.Stages {
+	for _, stage := range configData.Stages {
+		stageName := stage.Name
+
 		// the stage name must not be empty
 		if stageName == "" {
 			return errors.New("spiderfile invalid: stage name is empty")
@@ -152,12 +154,6 @@ func IsUniqueConfigSpiderFields(fields []entity.Field) bool {
 func ProcessSpiderFilesFromConfigData(spider model.Spider, configData entity.ConfigSpiderData) error {
 	spiderDir := spider.Src
 
-	// assign stage_name
-	for stageName, stage := range configData.Stages {
-		stage.Name = stageName
-		configData.Stages[stageName] = stage
-	}
-
 	// remove existing spider files
 	for _, fInfo := range utils.ListDir(spiderDir) {
 		// do not remove the Spiderfile
diff --git a/backend/services/node.go b/backend/services/node.go
index e6c2ac08..d14ce4ae 100644
--- a/backend/services/node.go
+++ b/backend/services/node.go
@@ -167,27 +167,34 @@ func UpdateNodeData() {
 		debug.PrintStack()
 		return
 	}
-	// construct node data
-	data := Data{
-		Key:          key,
-		Mac:          mac,
-		Ip:           ip,
-		Master:       model.IsMaster(),
-		UpdateTs:     time.Now(),
-		UpdateTsUnix: time.Now().Unix(),
+
+	// first fetch all node keys registered in Redis
+	list, _ := database.RedisClient.HKeys("nodes")
+
+	if exists := utils.Contains(list, key); !exists {
+		// construct node data
+		data := Data{
+			Key:          key,
+			Mac:          mac,
+			Ip:           ip,
+			Master:       model.IsMaster(),
+			UpdateTs:     time.Now(),
+			UpdateTsUnix: time.Now().Unix(),
+		}
+
+		// register the node in Redis
+		dataBytes, err := json.Marshal(&data)
+		if err != nil {
+			log.Errorf(err.Error())
+			debug.PrintStack()
+			return
+		}
+		if err := database.RedisClient.HSet("nodes", key, utils.BytesToString(dataBytes)); err != nil {
+			log.Errorf(err.Error())
+			return
+		}
 	}
-
-	// register the node in Redis
-	dataBytes, err := json.Marshal(&data)
-	if err != nil {
-		log.Errorf(err.Error())
-		debug.PrintStack()
-		return
-	}
-	if err := database.RedisClient.HSet("nodes", key, utils.BytesToString(dataBytes)); err != nil {
-		log.Errorf(err.Error())
-		return
-	}
 }
 
 func MasterNodeCallback(message redis.Message) (err error) {
diff --git a/backend/services/schedule.go b/backend/services/schedule.go
index 7a8defde..53938aea 100644
--- a/backend/services/schedule.go
+++ b/backend/services/schedule.go
@@ -7,7 +7,7 @@ import (
 	"errors"
 	"github.com/apex/log"
 	"github.com/globalsign/mgo/bson"
-	"github.com/satori/go.uuid"
+	uuid "github.com/satori/go.uuid"
 	"runtime/debug"
 )
 
@@ -19,48 +19,87 @@ type Scheduler struct {
 
 func AddScheduleTask(s model.Schedule) func() {
 	return func() {
-		node, err := model.GetNodeByKey(s.NodeKey)
-		if err != nil || node.Id.Hex() == "" {
-			log.Errorf("get node by key error: %s", err.Error())
-			debug.PrintStack()
-			return
-		}
-
-		spider := model.GetSpiderByName(s.SpiderName)
-		if spider == nil || spider.Id.Hex() == "" {
-			log.Errorf("get spider by name error: %s", err.Error())
-			debug.PrintStack()
-			return
-		}
-
-		// sync ids to the schedule
-		s.SyncNodeIdAndSpiderId(node, *spider)
-
 		// generate task id
 		id := uuid.NewV4()
 
-		// build the task model
-		t := model.Task{
-			Id:       id.String(),
-			SpiderId: spider.Id,
-			NodeId:   node.Id,
-			Status:   constants.StatusPending,
-			Param:    s.Param,
-		}
+		if s.RunType == constants.RunTypeAllNodes {
+			// all nodes
+			nodes, err := model.GetNodeList(nil)
+			if err != nil {
+				return
+			}
+			for _, node := range nodes {
+				t := model.Task{
+					// a fresh id per node, so tasks on different nodes do not share one id
+					Id:       uuid.NewV4().String(),
+					SpiderId: s.SpiderId,
+					NodeId:   node.Id,
+					Param:    s.Param,
+				}
+
+				if err := AddTask(t); err != nil {
+					return
+				}
+				if err := AssignTask(t); err != nil {
+					log.Errorf(err.Error())
+					debug.PrintStack()
+					return
+				}
+			}
+		} else if s.RunType == constants.RunTypeRandom {
+			// random
+			t := model.Task{
+				Id:       id.String(),
+				SpiderId: s.SpiderId,
+				Param:    s.Param,
+			}
+			if err := AddTask(t); err != nil {
+				return
+			}
+			if err := AssignTask(t); err != nil {
+				log.Errorf(err.Error())
+				debug.PrintStack()
+				return
+			}
+		} else if s.RunType == constants.RunTypeSelectedNodes {
+			// selected nodes
+			for _, nodeId := range s.NodeIds {
+				t := model.Task{
+					// a fresh id per node, so tasks on different nodes do not share one id
+					Id:       uuid.NewV4().String(),
+					SpiderId: s.SpiderId,
+					NodeId:   nodeId,
+					Param:    s.Param,
+				}
+
+				if err := AddTask(t); err != nil {
+					return
+				}
+
+				if err := AssignTask(t); err != nil {
+					log.Errorf(err.Error())
+					debug.PrintStack()
+					return
+				}
+			}
+		} else {
+			return
+		}
 
-		// save the task to the database
-		if err := model.AddTask(t); err != nil {
-			log.Errorf(err.Error())
-			debug.PrintStack()
-			return
-		}
-
-		// add the task to the queue
-		if err := AssignTask(t); err != nil {
-			log.Errorf(err.Error())
-			debug.PrintStack()
-			return
-		}
+		//node, err := model.GetNodeByKey(s.NodeKey)
+		//if err != nil || node.Id.Hex() == "" {
+		//	log.Errorf("get node by key error: %s", err.Error())
+		//	debug.PrintStack()
+		//	return
+		//}
+		//
+		//spider := model.GetSpiderByName(s.SpiderName)
+		//if spider == nil || spider.Id.Hex() == "" {
+		//	log.Errorf("get spider by name error: %s", err.Error())
+		//	debug.PrintStack()
+		//	return
+		//}
+		//
+		//// sync ids to the schedule
+		//s.SyncNodeIdAndSpiderId(node, *spider)
 	}
 }
diff --git a/backend/template/spiderfile/Spiderfile.163_news b/backend/template/spiderfile/Spiderfile.163_news
index 29d58279..c2a73be7 100644
--- a/backend/template/spiderfile/Spiderfile.163_news
+++ b/backend/template/spiderfile/Spiderfile.163_news
@@ -4,17 +4,17 @@ start_url: "http://news.163.com/special/0001386F/rank_news.html"
 start_stage: "list"
 engine: "scrapy"
 stages:
-  list:
-    is_list: true
-    list_css: "table tr:not(:first-child)"
-    fields:
-      - name: "title"
-        css: "td:nth-child(1) > a"
-      - name: "url"
-        css: "td:nth-child(1) > a"
-        attr: "href"
-      - name: "clicks"
-        css: "td.cBlue"
+- name: list
+  is_list: true
+  list_css: "table tr:not(:first-child)"
+  fields:
+  - name: "title"
+    css: "td:nth-child(1) > a"
+  - name: "url"
+    css: "td:nth-child(1) > a"
+    attr: "href"
+  - name: "clicks"
+    css: "td.cBlue"
 settings:
   ROBOTSTXT_OBEY: false
   USER_AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36
diff --git a/backend/template/spiderfile/Spiderfile.baidu b/backend/template/spiderfile/Spiderfile.baidu
index fbf720e4..5643c980 100644
--- a/backend/template/spiderfile/Spiderfile.baidu
+++ b/backend/template/spiderfile/Spiderfile.baidu
@@ -4,19 +4,19 @@ start_url: http://www.baidu.com/s?wd=crawlab
 start_stage: list
 engine: scrapy
 stages:
-  list:
-    is_list: true
-    list_xpath: //*[contains(@class, "c-container")]
-    page_xpath: //*[@id="page"]//a[@class="n"][last()]
-    page_attr: href
-    fields:
-      - name: title
-        xpath: .//h3/a
-      - name: url
-        xpath: .//h3/a
-        attr: href
-      - name: abstract
-        xpath: .//*[@class="c-abstract"]
+- name: list
+  is_list: true
+  list_xpath: //*[contains(@class, "c-container")]
+  page_xpath: //*[@id="page"]//a[@class="n"][last()]
+  page_attr: href
+  fields:
+  - name: title
+    xpath: .//h3/a
+  - name: url
+    xpath: .//h3/a
+    attr: href
+  - name: abstract
+    xpath: .//*[@class="c-abstract"]
 settings:
   ROBOTSTXT_OBEY: false
   USER_AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36
diff --git a/backend/template/spiderfile/Spiderfile.toscrapy_books b/backend/template/spiderfile/Spiderfile.toscrapy_books
index 4bf18f61..247b4f40 100644
--- a/backend/template/spiderfile/Spiderfile.toscrapy_books
+++ b/backend/template/spiderfile/Spiderfile.toscrapy_books
@@ -4,25 +4,25 @@ start_url: "http://books.toscrape.com"
 start_stage: "list"
 engine: "scrapy"
 stages:
-  list:
-    is_list: true
-    list_css: "section article.product_pod"
-    page_css: "ul.pager li.next a"
-    page_attr: "href"
-    fields:
-      - name: "title"
-        css: "h3 > a"
-      - name: "url"
-        css: "h3 > a"
-        attr: "href"
-        next_stage: "detail"
-      - name: "price"
-        css: ".product_price > .price_color"
-  detail:
-    is_list: false
-    fields:
-      - name: "description"
-        css: "#product_description + p"
+- name: list
+  is_list: true
+  list_css: "section article.product_pod"
+  page_css: "ul.pager li.next a"
+  page_attr: "href"
+  fields:
+  - name: "title"
+    css: "h3 > a"
+  - name: "url"
+    css: "h3 > a"
+    attr: "href"
+    next_stage: "detail"
+  - name: "price"
+    css: ".product_price > .price_color"
+- name: detail
+  is_list: false
+  fields:
+  - name: "description"
+    css: "#product_description + p"
 settings:
   ROBOTSTXT_OBEY: true
   AUTOTHROTTLE_ENABLED: true
diff --git a/backend/utils/helpers.go b/backend/utils/helpers.go
index 8a80e9e8..e181c66c 100644
--- a/backend/utils/helpers.go
+++ b/backend/utils/helpers.go
@@ -6,6 +6,7 @@ import (
 	"github.com/apex/log"
 	"github.com/gomodule/redigo/redis"
 	"io"
+	"reflect"
 	"runtime/debug"
 	"unsafe"
 )
@@ -40,3 +41,20 @@ func Close(c io.Closer) {
 		//log.WithError(err).Error("failed to close the resource file")
 	}
 }
+
+// Contains reports whether val is an element of array, which is expected to be a slice.
+func Contains(array interface{}, val interface{}) (fla bool) {
+	fla = false
+	switch reflect.TypeOf(array).Kind() {
+	case reflect.Slice:
+		{
+			s := reflect.ValueOf(array)
+			for i := 0; i < s.Len(); i++ {
+				if reflect.DeepEqual(val, s.Index(i).Interface()) {
+					fla = true
+					return
+				}
+			}
+		}
+	}
+	return
+}
diff --git a/docker-compose.yml b/docker-compose.yml
index 270c986c..bea50fb1 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -4,37 +4,37 @@ services:
     image: tikazyq/crawlab:latest
     container_name: master
     environment:
-      CRAWLAB_API_ADDRESS: "http://localhost:8000"
-      CRAWLAB_SERVER_MASTER: "Y"
-      CRAWLAB_MONGO_HOST: "mongo"
-      CRAWLAB_REDIS_ADDRESS: "redis"
+      CRAWLAB_API_ADDRESS: "http://localhost:8000" # backend API address, set to http://<host IP>:<mapped port>
+      CRAWLAB_SERVER_MASTER: "Y" # whether this is the master node: Y for master, N for worker
+      CRAWLAB_MONGO_HOST: "mongo1" # MongoDB host address; within the docker compose network, reference the service name directly
+      CRAWLAB_REDIS_ADDRESS: "redis" # Redis host address; within the docker compose network, reference the service name directly
     ports:
-      - "8080:8080" # frontend
-      - "8000:8000" # backend
+      - "8080:8080" # frontend port mapping
+      - "8000:8000" # backend port mapping
     depends_on:
-      - mongo
+      - mongo1
       - redis
   worker:
     image: tikazyq/crawlab:latest
     container_name: worker
     environment:
       CRAWLAB_SERVER_MASTER: "N"
-      CRAWLAB_MONGO_HOST: "mongo"
+      CRAWLAB_MONGO_HOST: "mongo1"
       CRAWLAB_REDIS_ADDRESS: "redis"
     depends_on:
-      - mongo
+      - mongo1
       - redis
-  mongo:
+  mongo1:
     image: mongo:latest
     restart: always
-    volumes:
-      - "/opt/crawlab/mongo/data/db:/data/db"
-    ports:
-      - "27017:27017"
+    # volumes:
+    #   - "/opt/crawlab/mongo/data/db:/data/db" # make data persistent
+    # ports:
+    #   - "27017:27017" # expose the port to the host machine
   redis:
     image: redis:latest
     restart: always
-    volumes:
-      - "/opt/crawlab/redis/data:/data"
-    ports:
-      - "6379:6379"
\ No newline at end of file
+    # volumes:
+    #   - "/opt/crawlab/redis/data:/data" # make data persistent
+    # ports:
+    #   - "6379:6379" # expose the port to the host machine
diff --git a/docker_init.sh b/docker_init.sh
index 09f63e9b..97c505dc 100755
--- a/docker_init.sh
+++ b/docker_init.sh
@@ -22,4 +22,5 @@ fi
 # start nginx
 service nginx start
 
+# start backend
 crawlab
\ No newline at end of file
diff --git a/frontend/package.json b/frontend/package.json
index 724b5e36..d11b503b 100644
--- a/frontend/package.json
+++ b/frontend/package.json
@@ -1,6 +1,6 @@
 {
   "name": "crawlab",
-  "version": "0.3.5",
+  "version": "0.4.1",
   "private": true,
   "scripts": {
     "serve": "vue-cli-service serve --ip=0.0.0.0 --mode=development",
@@ -29,6 +29,7 @@
     "normalize.css": "7.0.0",
     "nprogress": "0.2.0",
     "path": "^0.12.7",
+    "showdown": "^1.9.1",
     "vcrontab": "^0.3.3",
     "vue": "^2.5.22",
     "vue-ba": "^1.2.5",
diff --git a/frontend/src/App.vue b/frontend/src/App.vue
index 06a41cce..a7ba8069 100644
--- a/frontend/src/App.vue
+++ b/frontend/src/App.vue
@@ -6,6 +6,9 @@
diff --git a/frontend/src/components/Environment/EnvironmentList.vue b/frontend/src/components/Environment/EnvironmentList.vue
index e7c89a51..1b50715d 100644
--- a/frontend/src/components/Environment/EnvironmentList.vue
+++ b/frontend/src/components/Environment/EnvironmentList.vue
@@ -49,11 +49,11 @@ export default {
         name: '',
         value: ''
       })
-
this.$st.sendEv('爬虫详情-环境', '添加') + this.$st.sendEv('爬虫详情', '环境', '添加') }, deleteEnv (index) { this.spiderForm.envs.splice(index, 1) - this.$st.sendEv('爬虫详情-环境', '删除') + this.$st.sendEv('爬虫详情', '环境', '删除') }, save () { this.$store.dispatch('spider/editSpider') @@ -63,7 +63,7 @@ export default { .catch(error => { this.$message.error(error) }) - this.$st.sendEv('爬虫详情-环境', '保存') + this.$st.sendEv('爬虫详情', '环境', '保存') } } } diff --git a/frontend/src/components/InfoView/SpiderInfoView.vue b/frontend/src/components/InfoView/SpiderInfoView.vue index bf05a2ba..f804bca1 100644 --- a/frontend/src/components/InfoView/SpiderInfoView.vue +++ b/frontend/src/components/InfoView/SpiderInfoView.vue @@ -112,7 +112,7 @@ export default { methods: { onCrawl () { this.crawlConfirmDialogVisible = true - this.$st.sendEv('爬虫详情-概览', '点击运行') + this.$st.sendEv('爬虫详情', '概览', '点击运行') }, onSave () { this.$refs['spiderForm'].validate(res => { @@ -126,7 +126,7 @@ export default { }) } }) - this.$st.sendEv('爬虫详情-概览', '保存') + this.$st.sendEv('爬虫详情', '概览', '保存') }, fetchSiteSuggestions (keyword, callback) { this.$request.get('/sites', { diff --git a/frontend/src/components/Overview/TaskOverview.vue b/frontend/src/components/Overview/TaskOverview.vue index 603f28ef..cd3de0a8 100644 --- a/frontend/src/components/Overview/TaskOverview.vue +++ b/frontend/src/components/Overview/TaskOverview.vue @@ -52,11 +52,11 @@ export default { methods: { onClickNodeTitle () { this.$router.push(`/nodes/${this.nodeForm._id}`) - this.$st.sendEv('任务详情-概览', '点击节点详情') + this.$st.sendEv('任务详情', '概览', '点击节点详情') }, onClickSpiderTitle () { this.$router.push(`/spiders/${this.spiderForm._id}`) - this.$st.sendEv('任务详情-概览', '点击爬虫详情') + this.$st.sendEv('任务详情', '概览', '点击爬虫详情') } }, created () { diff --git a/frontend/src/components/TableView/FieldsTableView.vue b/frontend/src/components/TableView/FieldsTableView.vue index 87bf88f1..26b6010d 100644 --- a/frontend/src/components/TableView/FieldsTableView.vue +++ b/frontend/src/components/TableView/FieldsTableView.vue @@ -1,11 +1,5 @@ @@ -165,7 +167,8 @@ export default { return { columns: [ { name: 'name', label: 'Name', width: '180' }, - { name: 'cron', label: 'schedules.cron', width: '120' }, + { name: 'cron', label: 'Cron', width: '120' }, + { name: 'run_type', label: 'Run Type', width: '150' }, { name: 'node_name', label: 'Node', width: '150' }, { name: 'spider_name', label: 'Spider', width: '150' }, { name: 'param', label: 'Parameters', width: '150' }, @@ -208,8 +211,8 @@ export default { onAdd () { this.isEdit = false this.dialogVisible = true - this.$store.commit('schedule/SET_SCHEDULE_FORM', {}) - this.$st.sendEv('定时任务', '添加') + this.$store.commit('schedule/SET_SCHEDULE_FORM', { node_ids: [] }) + this.$st.sendEv('定时任务', '添加定时任务') }, onAddSubmit () { this.$refs.scheduleForm.validate(res => { @@ -235,7 +238,7 @@ export default { } } }) - this.$st.sendEv('定时任务', '提交') + this.$st.sendEv('定时任务', '提交定时任务') }, isShowRun (row) { }, @@ -243,12 +246,12 @@ export default { this.$store.commit('schedule/SET_SCHEDULE_FORM', row) this.dialogVisible = true this.isEdit = true - this.$st.sendEv('定时任务', '修改', 'id', row._id) + this.$st.sendEv('定时任务', '修改定时任务') }, onRemove (row) { - this.$confirm('确定删除定时任务?', '提示', { - confirmButtonText: '确定', - cancelButtonText: '取消', + this.$confirm(this.$t('Are you sure to delete the schedule task?'), this.$t('Notification'), { + confirmButtonText: this.$t('Confirm'), + cancelButtonText: this.$t('Cancel'), type: 'warning' }).then(() => { this.$store.dispatch('schedule/removeSchedule', 
row._id)
@@ -258,15 +261,16 @@
             this.$message.success(`Schedule "${row.name}" has been removed`)
           }, 100)
         })
-      }).catch(() => {})
-      this.$st.sendEv('定时任务', '删除', 'id', row._id)
+      }).catch(() => {
+      })
+      this.$st.sendEv('定时任务', '删除定时任务')
     },
     onCrawl (row) {
      // stop the schedule task
      if (!row.status || row.status === 'running') {
-        this.$confirm('确定停止定时任务?', '提示', {
-          confirmButtonText: '确定',
-          cancelButtonText: '取消',
+        this.$confirm(this.$t('Are you sure to stop the schedule task?'), this.$t('Notification'), {
+          confirmButtonText: this.$t('Confirm'),
+          cancelButtonText: this.$t('Cancel'),
           type: 'warning'
         }).then(() => {
           this.$store.dispatch('schedule/stopSchedule', row._id)
@@ -280,13 +284,14 @@
               message: resp.data.error
             })
           })
-        }).catch(() => {})
+        }).catch(() => {
+        })
       }
       // run the schedule task
       if (row.status === 'stop') {
-        this.$confirm('确定运行定时任务?', '提示', {
-          confirmButtonText: '确定',
-          cancelButtonText: '取消',
+        this.$confirm(this.$t('Are you sure to run the schedule task?'), this.$t('Notification'), {
+          confirmButtonText: this.$t('Confirm'),
+          cancelButtonText: this.$t('Cancel'),
           type: 'warning'
         }).then(() => {
           this.$store.dispatch('schedule/runSchedule', row._id)
@@ -300,7 +305,24 @@
             message: resp.data.error
           })
         })
-      }).catch(() => {})
+        }).catch(() => {
+        })
+      }
+    },
+    isDisabledSpider (spider) {
+      if (spider.type === 'customized') {
+        return !spider.cmd
+      } else {
+        return false
+      }
+    },
+    getStatusTooltip (row) {
+      if (row.status === 'stop') {
+        return 'Start'
+      } else if (row.status === 'running') {
+        return 'Stop'
+      } else if (row.status === 'error') {
+        return 'Start'
       }
     }
   },
@@ -338,6 +360,7 @@
     min-height: 360px;
     margin-top: 10px;
   }
+
   .status-tag {
     cursor: pointer;
   }
diff --git a/frontend/src/views/spider/SpiderList.vue b/frontend/src/views/spider/SpiderList.vue
index 78c87a36..74177b62 100644
--- a/frontend/src/views/spider/SpiderList.vue
+++ b/frontend/src/views/spider/SpiderList.vue
@@ -370,17 +370,17 @@ export default {
       }
       await this.$store.dispatch('spider/getSpiderList')
       this.$router.push(`/spiders/${res2.data.data._id}`)
-      this.$st.sendEv('爬虫', '添加爬虫-可配置爬虫')
+      this.$st.sendEv('爬虫列表', '添加爬虫', '可配置爬虫')
     })
   },
   onAddCustomized () {
     this.addDialogVisible = false
     this.addCustomizedDialogVisible = true
-    this.$st.sendEv('爬虫', '添加爬虫-自定义爬虫')
+    this.$st.sendEv('爬虫列表', '添加爬虫', '自定义爬虫')
   },
   onRefresh () {
     this.getList()
-    this.$st.sendEv('爬虫', '刷新')
+    this.$st.sendEv('爬虫列表', '刷新')
   },
   onSubmit () {
     const vm = this
@@ -434,19 +434,19 @@
         message: 'Deleted successfully'
       })
     })
-    this.$st.sendEv('爬虫', '删除')
+    this.$st.sendEv('爬虫列表', '删除爬虫')
   })
 },
 onCrawl (row, ev) {
   ev.stopPropagation()
   this.crawlConfirmDialogVisible = true
   this.activeSpiderId = row._id
-  this.$st.sendEv('爬虫', '点击运行')
+  this.$st.sendEv('爬虫列表', '点击运行')
 },
 onView (row, ev) {
   ev.stopPropagation()
   this.$router.push('/spiders/' + row._id)
-  this.$st.sendEv('爬虫', '查看')
+  this.$st.sendEv('爬虫列表', '查看爬虫')
 },
 onImport () {
   this.$refs.importForm.validate(valid => {
@@ -467,7 +467,7 @@
       })
     }
   })
-  this.$st.sendEv('爬虫', '导入爬虫')
+  this.$st.sendEv('爬虫列表', '导入爬虫')
 },
 openImportDialog () {
   this.dialogVisible = true
@@ -495,10 +495,6 @@
     callback(data)
   })
 },
-onSiteSelect (item) {
-  this.$store.commit('spider/SET_FILTER_SITE', item._id)
-  this.$st.sendEv('爬虫', '搜索网站')
-},
 onAddConfigurableSiteSelect (item) {
   this.spiderForm.site = item._id
 },
diff --git a/frontend/src/views/task/TaskDetail.vue b/frontend/src/views/task/TaskDetail.vue
index d61394e8..9097344e
100644 --- a/frontend/src/views/task/TaskDetail.vue +++ b/frontend/src/views/task/TaskDetail.vue @@ -97,7 +97,7 @@ export default { }, downloadCSV () { this.$store.dispatch('task/getTaskResultExcel', this.$route.params.id) - this.$st.sendEv('任务详情-结果', '下载CSV') + this.$st.sendEv('任务详情', '结果', '下载CSV') }, getTaskLog () { if (this.$route.params.id) { diff --git a/frontend/src/views/task/TaskList.vue b/frontend/src/views/task/TaskList.vue index 8013c080..becc7d0b 100644 --- a/frontend/src/views/task/TaskList.vue +++ b/frontend/src/views/task/TaskList.vue @@ -221,7 +221,7 @@ export default { }, onRefresh () { this.$store.dispatch('task/getTaskList') - this.$st.sendEv('任务', '搜索') + this.$st.sendEv('任务列表', '搜索') }, onRemoveMultipleTask () { if (this.multipleSelection.length === 0) { @@ -267,20 +267,20 @@ export default { message: 'Deleted successfully' }) }) - this.$st.sendEv('任务', '删除', 'id', row._id) + this.$st.sendEv('任务列表', '删除任务') }) }, onView (row) { this.$router.push(`/tasks/${row._id}`) - this.$st.sendEv('任务', '搜索', 'id', row._id) + this.$st.sendEv('任务列表', '查看任务') }, onClickSpider (row) { this.$router.push(`/spiders/${row.spider_id}`) - this.$st.sendEv('任务', '点击爬虫详情', 'id', row.spider_id) + this.$st.sendEv('任务列表', '点击爬虫详情') }, onClickNode (row) { this.$router.push(`/nodes/${row.node_id}`) - this.$st.sendEv('任务', '点击节点详情', 'id', row.node_id) + this.$st.sendEv('任务列表', '点击节点详情') }, onPageChange () { setTimeout(() => { diff --git a/frontend/src/views/user/UserList.vue b/frontend/src/views/user/UserList.vue index a0e2029f..26cbedea 100644 --- a/frontend/src/views/user/UserList.vue +++ b/frontend/src/views/user/UserList.vue @@ -3,14 +3,14 @@ - - + + - + - - + + @@ -27,7 +27,7 @@
- + 添加用户
@@ -109,6 +109,7 @@ export default {
     }
     return {
       dialogVisible: false,
+      isAdd: false,
       rules: {
         password: [{ validator: validatePass }]
       }
@@ -145,6 +146,7 @@
       return dayjs(ts).format('YYYY-MM-DD HH:mm:ss')
     },
     onEdit (row) {
+      this.isAdd = false
       this.$store.commit('user/SET_USER_FORM', row)
       this.dialogVisible = true
     },
@@ -161,24 +163,48 @@
             message: this.$t('Deleted successfully')
           })
         })
-      this.$st.sendEv('用户', '删除', 'id', row._id)
+          .then(() => {
+            this.$store.dispatch('user/getUserList')
+          })
+        this.$st.sendEv('用户列表', '删除用户')
       })
       // this.$store.commit('user/SET_USER_FORM', row)
     },
     onConfirm () {
-      this.dialogVisible = false
       this.$refs.form.validate(valid => {
-        if (valid) {
+        if (!valid) return
+        if (this.isAdd) {
+          // add user
+          this.$store.dispatch('user/addUser')
+            .then(() => {
+              this.$message({
+                type: 'success',
+                message: this.$t('Saved successfully')
+              })
+              this.dialogVisible = false
+              this.$st.sendEv('用户列表', '添加用户')
+            })
+            .then(() => {
+              this.$store.dispatch('user/getUserList')
+            })
+        } else {
+          // edit user
           this.$store.dispatch('user/editUser')
             .then(() => {
               this.$message({
                 type: 'success',
                 message: this.$t('Saved successfully')
               })
+              this.dialogVisible = false
+              this.$st.sendEv('用户列表', '编辑用户')
             })
         }
       })
-      this.$st.sendEv('用户', '编辑')
+    },
+    onClickAddUser () {
+      this.isAdd = true
+      this.$store.commit('user/SET_USER_FORM', {})
+      this.dialogVisible = true
     }
   },
   created () {
@@ -192,15 +218,18 @@
   display: flex;
   justify-content: space-between;
   margin-bottom: 8px;
-  .filter-search {
-    width: 240px;
-  }
-  .right {
-    .btn {
-      margin-left: 10px;
-    }
-  }
+  .filter-search {
+    width: 240px;
+  }
+
+  .right {
+
+    .btn {
+      margin-left: 10px;
+    }
+
+  }
 }
 
 .el-table {
diff --git a/frontend/yarn.lock b/frontend/yarn.lock
index ef361b2b..a6600a96 100644
--- a/frontend/yarn.lock
+++ b/frontend/yarn.lock
@@ -1158,6 +1158,11 @@ ansi-regex@^4.0.0:
   version "4.0.0"
   resolved "http://registry.npm.taobao.org/ansi-regex/download/ansi-regex-4.0.0.tgz#70de791edf021404c3fd615aa89118ae0432e5a9"
 
+ansi-regex@^4.1.0:
+  version "4.1.0"
+  resolved "https://registry.npm.taobao.org/ansi-regex/download/ansi-regex-4.1.0.tgz?cache=0&sync_timestamp=1570188570027&other_urls=https%3A%2F%2Fregistry.npm.taobao.org%2Fansi-regex%2Fdownload%2Fansi-regex-4.1.0.tgz#8b9f8f08cf1acb843756a839ca8c7e3168c51997"
+  integrity sha1-i5+PCM8ay4Q3Vqg5yox+MWjFGZc=
+
 ansi-styles@^2.2.1:
   version "2.2.1"
   resolved "http://registry.npm.taobao.org/ansi-styles/download/ansi-styles-2.2.1.tgz#b432dd3358b634cf75e1e4664368240533c1ddbe"
@@ -2086,6 +2091,15 @@ cliui@^4.0.0, cliui@^4.1.0:
     strip-ansi "^4.0.0"
     wrap-ansi "^2.0.0"
 
+cliui@^5.0.0:
+  version "5.0.0"
+  resolved "https://registry.npm.taobao.org/cliui/download/cliui-5.0.0.tgz#deefcfdb2e800784aa34f46fa08e06851c7bbbc5"
+  integrity sha1-3u/P2y6AB4SqNPRvoI4GhRx7u8U=
+  dependencies:
+    string-width "^3.1.0"
+    strip-ansi "^5.2.0"
+    wrap-ansi "^5.1.0"
+
 clone-deep@^2.0.1:
   version "2.0.2"
   resolved "http://registry.npm.taobao.org/clone-deep/download/clone-deep-2.0.2.tgz#00db3a1e173656730d1188c3d6aced6d7ea97713"
@@ -3884,6 +3898,11 @@ get-caller-file@^1.0.1:
   version "1.0.3"
   resolved "http://registry.npm.taobao.org/get-caller-file/download/get-caller-file-1.0.3.tgz#f978fa4c90d1dfe7ff2d6beda2a515e713bdcf4a"
 
+get-caller-file@^2.0.1:
+  version "2.0.5"
+  resolved "https://registry.npm.taobao.org/get-caller-file/download/get-caller-file-2.0.5.tgz#4f94412a82db32f36e3b0b9741f8a97feb031f7e"
+  integrity sha1-T5RBKoLbMvNuOwuXQfipf+sDH34=
+
 get-stdin@^4.0.1:
   version "4.0.1"
   resolved
"http://registry.npm.taobao.org/get-stdin/download/get-stdin-4.0.1.tgz#b968c6b0a04384324902e8bf1a5df32579a450fe" @@ -7271,6 +7290,11 @@ require-main-filename@^1.0.1: version "1.0.1" resolved "http://registry.npm.taobao.org/require-main-filename/download/require-main-filename-1.0.1.tgz#97f717b69d48784f5f526a6c5aa8ffdda055a4d1" +require-main-filename@^2.0.0: + version "2.0.0" + resolved "https://registry.npm.taobao.org/require-main-filename/download/require-main-filename-2.0.0.tgz#d0b329ecc7cc0f61649f62215be69af54aa8989b" + integrity sha1-0LMp7MfMD2Fkn2IhW+aa9UqomJs= + require-uncached@^1.0.3: version "1.0.3" resolved "http://registry.npm.taobao.org/require-uncached/download/require-uncached-1.0.3.tgz#4e0d56d6c9662fd31e43011c4b95aa49955421d3" @@ -7586,6 +7610,13 @@ shellwords@^0.1.1: version "0.1.1" resolved "http://registry.npm.taobao.org/shellwords/download/shellwords-0.1.1.tgz#d6b9181c1a48d397324c84871efbcfc73fc0654b" +showdown@^1.9.1: + version "1.9.1" + resolved "https://registry.npm.taobao.org/showdown/download/showdown-1.9.1.tgz#134e148e75cd4623e09c21b0511977d79b5ad0ef" + integrity sha1-E04UjnXNRiPgnCGwURl315ta0O8= + dependencies: + yargs "^14.2" + sigmund@^1.0.1: version "1.0.1" resolved "http://registry.npm.taobao.org/sigmund/download/sigmund-1.0.1.tgz#3ff21f198cad2175f9f3b781853fd94d0d19b590" @@ -7890,6 +7921,15 @@ string-width@^3.0.0: is-fullwidth-code-point "^2.0.0" strip-ansi "^5.0.0" +string-width@^3.1.0: + version "3.1.0" + resolved "https://registry.npm.taobao.org/string-width/download/string-width-3.1.0.tgz?cache=0&other_urls=https%3A%2F%2Fregistry.npm.taobao.org%2Fstring-width%2Fdownload%2Fstring-width-3.1.0.tgz#22767be21b62af1081574306f69ac51b62203961" + integrity sha1-InZ74htirxCBV0MG9prFG2IgOWE= + dependencies: + emoji-regex "^7.0.1" + is-fullwidth-code-point "^2.0.0" + strip-ansi "^5.1.0" + string.prototype.padend@^3.0.0: version "3.0.0" resolved "http://registry.npm.taobao.org/string.prototype.padend/download/string.prototype.padend-3.0.0.tgz#f3aaef7c1719f170c5eab1c32bf780d96e21f2f0" @@ -7940,6 +7980,13 @@ strip-ansi@^5.0.0: dependencies: ansi-regex "^4.0.0" +strip-ansi@^5.1.0, strip-ansi@^5.2.0: + version "5.2.0" + resolved "https://registry.npm.taobao.org/strip-ansi/download/strip-ansi-5.2.0.tgz#8c9a536feb6afc962bdfa5b104a5091c1ad9c0ae" + integrity sha1-jJpTb+tq/JYr36WxBKUJHBrZwK4= + dependencies: + ansi-regex "^4.1.0" + strip-bom@3.0.0, strip-bom@^3.0.0: version "3.0.0" resolved "http://registry.npm.taobao.org/strip-bom/download/strip-bom-3.0.0.tgz#2334c18e9c759f7bdd56fdef7e9ae3d588e68ed3" @@ -8834,6 +8881,15 @@ wrap-ansi@^2.0.0: string-width "^1.0.1" strip-ansi "^3.0.1" +wrap-ansi@^5.1.0: + version "5.1.0" + resolved "https://registry.npm.taobao.org/wrap-ansi/download/wrap-ansi-5.1.0.tgz#1fd1f67235d5b6d0fee781056001bfb694c03b09" + integrity sha1-H9H2cjXVttD+54EFYAG/tpTAOwk= + dependencies: + ansi-styles "^3.2.0" + string-width "^3.0.0" + strip-ansi "^5.0.0" + wrappy@1: version "1.0.2" resolved "http://registry.npm.taobao.org/wrappy/download/wrappy-1.0.2.tgz#b5243d8f3ec1aa35f1364605bc0d1036e30ab69f" @@ -8911,6 +8967,14 @@ yargs-parser@^11.1.1: camelcase "^5.0.0" decamelize "^1.2.0" +yargs-parser@^15.0.0: + version "15.0.0" + resolved "https://registry.npm.taobao.org/yargs-parser/download/yargs-parser-15.0.0.tgz#cdd7a97490ec836195f59f3f4dbe5ea9e8f75f08" + integrity sha1-zdepdJDsg2GV9Z8/Tb5eqej3Xwg= + dependencies: + camelcase "^5.0.0" + decamelize "^1.2.0" + yargs-parser@^5.0.0: version "5.0.0" resolved 
"http://registry.npm.taobao.org/yargs-parser/download/yargs-parser-5.0.0.tgz#275ecf0d7ffe05c77e64e7c86e4cd94bf0e1228a" @@ -8974,6 +9038,23 @@ yargs@^12.0.5: y18n "^3.2.1 || ^4.0.0" yargs-parser "^11.1.1" +yargs@^14.2: + version "14.2.2" + resolved "https://registry.npm.taobao.org/yargs/download/yargs-14.2.2.tgz?cache=0&sync_timestamp=1574137859196&other_urls=https%3A%2F%2Fregistry.npm.taobao.org%2Fyargs%2Fdownload%2Fyargs-14.2.2.tgz#2769564379009ff8597cdd38fba09da9b493c4b5" + integrity sha1-J2lWQ3kAn/hZfN04+6CdqbSTxLU= + dependencies: + cliui "^5.0.0" + decamelize "^1.2.0" + find-up "^3.0.0" + get-caller-file "^2.0.1" + require-directory "^2.1.1" + require-main-filename "^2.0.0" + set-blocking "^2.0.0" + string-width "^3.0.0" + which-module "^2.0.0" + y18n "^4.0.0" + yargs-parser "^15.0.0" + yargs@^7.0.0: version "7.1.0" resolved "http://registry.npm.taobao.org/yargs/download/yargs-7.1.0.tgz#6ba318eb16961727f5d284f8ea003e8d6154d0c8" diff --git a/jenkins/develop/docker-compose.yaml b/jenkins/develop/docker-compose.yaml index ec95ae9f..745c0bdc 100644 --- a/jenkins/develop/docker-compose.yaml +++ b/jenkins/develop/docker-compose.yaml @@ -27,10 +27,6 @@ services: mongo: image: mongo:latest restart: always - ports: - - "27027:27017" redis: image: redis:latest restart: always - ports: - - "6389:6379" \ No newline at end of file diff --git a/jenkins/master/docker-compose.yaml b/jenkins/master/docker-compose.yaml index 1b7a476b..ff9dd64e 100644 --- a/jenkins/master/docker-compose.yaml +++ b/jenkins/master/docker-compose.yaml @@ -29,12 +29,9 @@ services: restart: always volumes: - "/opt/crawlab/mongo/data/db:/data/db" - ports: - - "27017:27017" + - "/opt/crawlab/mongo/tmp:/tmp" redis: image: redis:latest restart: always volumes: - "/opt/crawlab/redis/data:/data" - ports: - - "6379:6379" \ No newline at end of file diff --git a/wait-for-it.sh b/wait-for-it.sh new file mode 100755 index 00000000..607a7d67 --- /dev/null +++ b/wait-for-it.sh @@ -0,0 +1,178 @@ +#!/usr/bin/env bash +# Use this script to test if a given TCP host/port are available + +WAITFORIT_cmdname=${0##*/} + +echoerr() { if [[ $WAITFORIT_QUIET -ne 1 ]]; then echo "$@" 1>&2; fi } + +usage() +{ + cat << USAGE >&2 +Usage: + $WAITFORIT_cmdname host:port [-s] [-t timeout] [-- command args] + -h HOST | --host=HOST Host or IP under test + -p PORT | --port=PORT TCP port under test + Alternatively, you specify the host and port as host:port + -s | --strict Only execute subcommand if the test succeeds + -q | --quiet Don't output any status messages + -t TIMEOUT | --timeout=TIMEOUT + Timeout in seconds, zero for no timeout + -- COMMAND ARGS Execute command with args after the test finishes +USAGE + exit 1 +} + +wait_for() +{ + if [[ $WAITFORIT_TIMEOUT -gt 0 ]]; then + echoerr "$WAITFORIT_cmdname: waiting $WAITFORIT_TIMEOUT seconds for $WAITFORIT_HOST:$WAITFORIT_PORT" + else + echoerr "$WAITFORIT_cmdname: waiting for $WAITFORIT_HOST:$WAITFORIT_PORT without a timeout" + fi + WAITFORIT_start_ts=$(date +%s) + while : + do + if [[ $WAITFORIT_ISBUSY -eq 1 ]]; then + nc -z $WAITFORIT_HOST $WAITFORIT_PORT + WAITFORIT_result=$? + else + (echo > /dev/tcp/$WAITFORIT_HOST/$WAITFORIT_PORT) >/dev/null 2>&1 + WAITFORIT_result=$? 
+ fi + if [[ $WAITFORIT_result -eq 0 ]]; then + WAITFORIT_end_ts=$(date +%s) + echoerr "$WAITFORIT_cmdname: $WAITFORIT_HOST:$WAITFORIT_PORT is available after $((WAITFORIT_end_ts - WAITFORIT_start_ts)) seconds" + break + fi + sleep 1 + done + return $WAITFORIT_result +} + +wait_for_wrapper() +{ + # In order to support SIGINT during timeout: http://unix.stackexchange.com/a/57692 + if [[ $WAITFORIT_QUIET -eq 1 ]]; then + timeout $WAITFORIT_BUSYTIMEFLAG $WAITFORIT_TIMEOUT $0 --quiet --child --host=$WAITFORIT_HOST --port=$WAITFORIT_PORT --timeout=$WAITFORIT_TIMEOUT & + else + timeout $WAITFORIT_BUSYTIMEFLAG $WAITFORIT_TIMEOUT $0 --child --host=$WAITFORIT_HOST --port=$WAITFORIT_PORT --timeout=$WAITFORIT_TIMEOUT & + fi + WAITFORIT_PID=$! + trap "kill -INT -$WAITFORIT_PID" INT + wait $WAITFORIT_PID + WAITFORIT_RESULT=$? + if [[ $WAITFORIT_RESULT -ne 0 ]]; then + echoerr "$WAITFORIT_cmdname: timeout occurred after waiting $WAITFORIT_TIMEOUT seconds for $WAITFORIT_HOST:$WAITFORIT_PORT" + fi + return $WAITFORIT_RESULT +} + +# process arguments +while [[ $# -gt 0 ]] +do + case "$1" in + *:* ) + WAITFORIT_hostport=(${1//:/ }) + WAITFORIT_HOST=${WAITFORIT_hostport[0]} + WAITFORIT_PORT=${WAITFORIT_hostport[1]} + shift 1 + ;; + --child) + WAITFORIT_CHILD=1 + shift 1 + ;; + -q | --quiet) + WAITFORIT_QUIET=1 + shift 1 + ;; + -s | --strict) + WAITFORIT_STRICT=1 + shift 1 + ;; + -h) + WAITFORIT_HOST="$2" + if [[ $WAITFORIT_HOST == "" ]]; then break; fi + shift 2 + ;; + --host=*) + WAITFORIT_HOST="${1#*=}" + shift 1 + ;; + -p) + WAITFORIT_PORT="$2" + if [[ $WAITFORIT_PORT == "" ]]; then break; fi + shift 2 + ;; + --port=*) + WAITFORIT_PORT="${1#*=}" + shift 1 + ;; + -t) + WAITFORIT_TIMEOUT="$2" + if [[ $WAITFORIT_TIMEOUT == "" ]]; then break; fi + shift 2 + ;; + --timeout=*) + WAITFORIT_TIMEOUT="${1#*=}" + shift 1 + ;; + --) + shift + WAITFORIT_CLI=("$@") + break + ;; + --help) + usage + ;; + *) + echoerr "Unknown argument: $1" + usage + ;; + esac +done + +if [[ "$WAITFORIT_HOST" == "" || "$WAITFORIT_PORT" == "" ]]; then + echoerr "Error: you need to provide a host and port to test." + usage +fi + +WAITFORIT_TIMEOUT=${WAITFORIT_TIMEOUT:-15} +WAITFORIT_STRICT=${WAITFORIT_STRICT:-0} +WAITFORIT_CHILD=${WAITFORIT_CHILD:-0} +WAITFORIT_QUIET=${WAITFORIT_QUIET:-0} + +# check to see if timeout is from busybox? +WAITFORIT_TIMEOUT_PATH=$(type -p timeout) +WAITFORIT_TIMEOUT_PATH=$(realpath $WAITFORIT_TIMEOUT_PATH 2>/dev/null || readlink -f $WAITFORIT_TIMEOUT_PATH) +if [[ $WAITFORIT_TIMEOUT_PATH =~ "busybox" ]]; then + WAITFORIT_ISBUSY=1 + WAITFORIT_BUSYTIMEFLAG="-t" + +else + WAITFORIT_ISBUSY=0 + WAITFORIT_BUSYTIMEFLAG="" +fi + +if [[ $WAITFORIT_CHILD -gt 0 ]]; then + wait_for + WAITFORIT_RESULT=$? + exit $WAITFORIT_RESULT +else + if [[ $WAITFORIT_TIMEOUT -gt 0 ]]; then + wait_for_wrapper + WAITFORIT_RESULT=$? + else + wait_for + WAITFORIT_RESULT=$? + fi +fi + +if [[ $WAITFORIT_CLI != "" ]]; then + if [[ $WAITFORIT_RESULT -ne 0 && $WAITFORIT_STRICT -eq 1 ]]; then + echoerr "$WAITFORIT_cmdname: strict mode, refusing to execute subprocess" + exit $WAITFORIT_RESULT + fi + exec "${WAITFORIT_CLI[@]}" +else + exit $WAITFORIT_RESULT +fi \ No newline at end of file
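A closing note on the Spiderfile change that motivates several hunks above (backend/entity/config_spider.go, the config_spider model and service, and the three template files): stages moved from a YAML dictionary to a YAML array, so stage order is now well defined and each stage carries its own name key. A self-contained sketch of how the new shape unmarshals — the trimmed-down structs are hypothetical stand-ins; only the []Stage slice, the name key and the snake_case YAML tags are confirmed by this patch:

```go
package main

import (
	"fmt"

	yaml "gopkg.in/yaml.v2"
)

// stage is a hypothetical subset of entity.Stage; the real struct has more fields.
type stage struct {
	Name    string `yaml:"name"`
	IsList  bool   `yaml:"is_list"`
	ListCss string `yaml:"list_css"`
}

// spiderfile is a hypothetical subset of entity.ConfigSpiderData.
type spiderfile struct {
	StartStage string  `yaml:"start_stage"`
	Stages     []stage `yaml:"stages"`
}

const doc = `
start_stage: "list"
stages:
- name: list
  is_list: true
  list_css: "table tr:not(:first-child)"
- name: detail
  is_list: false
`

func main() {
	var sf spiderfile
	if err := yaml.Unmarshal([]byte(doc), &sf); err != nil {
		panic(err)
	}

	// With a sequence, "the first stage" is well defined; under the old
	// map[string]Stage shape, Go's randomized map iteration made the
	// GetStartStageName fallback nondeterministic, which is why issue #358
	// reworked it.
	fmt.Println(sf.Stages[0].Name) // list
}
```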