49 Commits

Author SHA1 Message Date
marvzhang
562ae39eb2 added spider copying 2020-02-24 09:12:03 +08:00
marvzhang
57f1edc207 updated dockerpush.yml 2020-02-23 09:19:18 +08:00
marvzhang
8c059c912e added batch deletion of spider tasks 2020-02-22 17:35:51 +08:00
marvzhang
0cb2e8134c added Git sync 2020-02-18 12:15:40 +08:00
marvzhang
bb972e98ef added support for adding scrapy spiders 2020-02-17 14:06:16 +08:00
marvzhang
a448c0f06b fixed unable to sync spiders to nodes error 2020-02-03 16:08:43 +08:00
marvzhang
de2d230e1a changed dir 2020-02-03 11:58:05 +08:00
marvzhang
be3235cefa added demo for general spiders 2020-02-03 10:30:04 +08:00
marvzhang
cb6a8f79d8 added demo spiders 2020-02-03 09:21:41 +08:00
marvzhang
5740774ccc added demo spiders 2020-02-02 22:56:11 +08:00
marvzhang
633740f1c2 fixed https://github.com/crawlab-team/crawlab/issues/485 2020-02-01 19:18:30 +08:00
陈景阳
81edf9ab72 fix spiders not being synced in time 2020-01-28 15:43:57 +08:00
marvzhang
04905f38b5 added spider list sorting 2020-01-06 13:18:29 +08:00
marvzhang
01aea28c9c added file management feature (backend) 2019-12-25 20:49:43 +08:00
marvzhang
5405dba540 added stage settings for configurable spiders 2019-11-30 10:58:54 +08:00
marvzhang
9c282ddb4d added configurable spiders 2019-11-24 17:57:12 +08:00
marvzhang
7003951561 fixed https://github.com/crawlab-team/crawlab/issues/315 2019-11-24 12:20:44 +08:00
yaziming
ee808e0e60 refactor(all): refactor code
remove redundant code and some code refactor
2019-10-11 16:01:57 +08:00
陈景阳
4a40d38844 fix inconsistent md5 values 2019-10-07 12:49:37 +08:00
陈景阳
dabf5cacf1 fix directory creation error 2019-10-07 12:21:32 +08:00
陈景阳
41556cab74 fix spider deletion issue 2019-09-30 12:09:37 +08:00
陈景阳
698d240bd6 fix bug 2019-09-26 21:13:25 +08:00
陈景阳
0ddb294885 completed spider list 2019-09-26 20:53:05 +08:00
陈景阳
3845e57612 fix upload issue 2019-09-26 19:44:12 +08:00
陈景阳
5f158ddb44 completed spider fetching 2019-09-26 19:12:02 +08:00
陈景阳
31be4c1839 optimized spider fetching logic 2019-09-26 16:43:32 +08:00
陈景阳
a11544d809 optimized spider fetching logic 2019-09-26 16:26:32 +08:00
陈景阳
947b561653 optimized spider fetching logic 2019-09-26 11:38:13 +08:00
陈景阳
05c28230b7 changed spider logic to fetch from GridFS 2019-09-26 11:28:20 +08:00
陈景阳
42e7647cd4 fix messages not being subscribable
fix possible duplicate spiders
2019-09-10 14:26:50 +08:00
yaziming
e10d8fd996 refactor(backend): Use more efficient bytes to string methods and remove unnecessary type conversions
detail:
    1. add utils.BytesToString function instead of string() convert bytes to string.
    2. use bytes.NewReader instead of strings.NewReader(string(sb)).
    3. use w.Body.Bytes() instead of []byte(w.Body.String()).
2019-09-03 15:17:32 +08:00
陈景阳
cd78e6c745 reverted code 2019-09-03 09:06:04 +08:00
陈景阳
eb92d34f9b Merge branch 'develop' of https://github.com/crawlab-team/crawlab into develop 2019-09-03 08:57:21 +08:00
陈景阳
b5084c964b reverted code 2019-09-02 18:14:34 +08:00
陈景阳
8b237ddc2a fix inability to delete faulty spiders 2019-09-02 18:04:47 +08:00
陈景阳
93dd3b714a fix: if no spiders are read from dir, remove all spiders 2019-09-02 17:37:48 +08:00
yaziming
a9346a0934 backend:
    1. Mongo dial add 5 seconds connection timeout.
    2. Redis uses connection pool mode.
    3. Redis pool new connection have 10 seconds write timeout and read timeout and connection timeout.
2019-09-01 17:18:08 +08:00
yaziming
81f6cf021f Backend:
improve
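     - AuthMiddleware injects the current user's information
     - added a Context service for quick access to the currently logged-in user's information
     - refactored the Login/GetMe endpoint logic to avoid redundant database queries
     - standardized error message declarations (backward compatible; old code can be migrated gradually)
     - fixed some code that did not follow conventions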
2019-08-31 21:26:56 +08:00
陈景阳
b9ed176950 added log printing 2019-08-31 17:56:42 +08:00
陈景阳
e423ff564c fix spider directory failing to open 2019-08-31 17:04:49 +08:00
陈景阳
4944883f95 fix node registration exceptions 2019-08-31 13:49:34 +08:00
陈景阳
771cb76277 fix frontend console errors \
fix inability to print Chinese characters
2019-08-31 12:04:12 +08:00
陈景阳
d5dce56712 Merge branch 'develop' of https://github.com/crawlab-team/crawlab into v0.4.0 2019-08-28 16:00:27 +08:00
陈景阳
cae11e3796 added log printing 2019-08-28 16:00:09 +08:00
陈景阳
c7e137a6aa delete files along with the spider when deleting a spider 2019-08-27 09:39:16 +08:00
hantmac
007f10b83b bug fix:fix permission bug when create spiders dir caused by umask 2019-08-26 17:44:01 +08:00
hantmac
4446f8704a Code optimization:change the crawler zip file to slice read 2019-08-20 17:09:43 +08:00
hantmac
a599db1810 Code optimization:change the crawler zip file to slice read 2019-08-20 16:59:53 +08:00
Marvin Zhang
56c99b314f added golang 2019-07-22 12:51:52 +08:00