updated README

This commit is contained in:
Marvin Zhang
2019-06-20 12:42:10 +08:00
parent 212c291a05
commit 312ba656cd
2 changed files with 32 additions and 55 deletions

View File

@@ -10,7 +10,7 @@
Celery-based distributed crawler management platform that supports multiple programming languages and multiple spider frameworks.
[View Demo](http://114.67.75.98:8080) | [Documentation](https://tikazyq.github.io/crawlab)
[View Demo](http://114.67.75.98:8080) | [Documentation](https://tikazyq.github.io/crawlab-docs)
## Requirements
- Python 3.6+
@@ -29,7 +29,7 @@
#### Home Page
![](https://user-gold-cdn.xitu.io/2019/3/6/169524d4c7f117f7?imageView2/0/w/1280/h/960/format/webp/ignore-error/1)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/home.png)
#### Spider List
@@ -37,12 +37,20 @@
#### Spider Detail - Overview
![](https://user-gold-cdn.xitu.io/2019/3/6/169524e0794d6be1?imageView2/0/w/1280/h/960/format/webp/ignore-error/1)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/spider-detail-overview.png)
#### Spider Detail - Analytics
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/spider-detail-analytics.png)
#### Task Detail - Results
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/task-detail-results.png)
#### Cron Schedule
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/schedule-generate-cron.png)
## Architecture
Crawlab's architecture is very similar to Celery's, but adds extra modules, including the frontend, spiders, and Flower, to support spider management. The architecture diagram is shown below.

View File

@@ -10,7 +10,7 @@
Celery-based web crawler admin platform for managing distributed web spiders regardless of languages and frameworks.
[Demo](http://114.67.75.98:8080) | [Documentation](https://tikazyq.github.io/crawlab)
[Demo](http://114.67.75.98:8080) | [Documentation](https://tikazyq.github.io/crawlab-docs)
## Prerequisites
- Python 3.6+
@@ -20,49 +20,42 @@ Celery-based web crawler admin platform for managing distributed web spiders reg
## Installation
```bash
# install the requirements for backend
pip install -r requirements.txt
```
```bash
# install frontend node modules
cd frontend
npm install
```
## Configure
Please edit the configuration file `config.py` to configure the API and database connections.
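As an illustration, a `config.py` might look like the sketch below. The exact setting names (other than `PROJECT_SOURCE_FILE_FOLDER`, which is referenced later in this README) are assumptions; check the shipped `config.py` for the real keys.

```python
# config.py -- minimal sketch; setting names other than
# PROJECT_SOURCE_FILE_FOLDER are illustrative assumptions.
FLASK_HOST = '0.0.0.0'                   # API bind address
FLASK_PORT = 8000                        # API port
MONGO_HOST = 'localhost'                 # MongoDB for spider/task metadata
MONGO_PORT = 27017
BROKER_URL = 'redis://localhost:6379/0'  # Celery message broker
PROJECT_SOURCE_FILE_FOLDER = '/path/to/spiders'  # where spider projects live
```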
## Quick Start
```bash
python manage.py serve
```
Three methods:
1. [Docker](https://tikazyq.github.io/crawlab/Installation/Docker.md) (Recommended)
2. [Direct Deploy](https://tikazyq.github.io/crawlab/Installation/Direct.md)
3. [Preview](https://tikazyq.github.io/crawlab/Installation/Direct.md) (Quick start)
## Screenshot
#### Home Page
![](https://user-gold-cdn.xitu.io/2019/3/6/169524d4c7f117f7?imageView2/0/w/1280/h/960/format/webp/ignore-error/1)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/home.png)
#### Spider List
![](https://user-gold-cdn.xitu.io/2019/3/6/169524daf9c8ccef?imageView2/0/w/1280/h/960/format/webp/ignore-error/1)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/spider-list.png)
#### Spider Detail - Overview
![](https://user-gold-cdn.xitu.io/2019/3/6/169524e0794d6be1?imageView2/0/w/1280/h/960/format/webp/ignore-error/1)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/spider-detail-overview.png)
#### Spider Detail - Analytics
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/spider-detail-analytics.png)
#### Task Detail - Results
![](https://user-gold-cdn.xitu.io/2019/3/6/169524e4064c7f0a?imageView2/0/w/1280/h/960/format/webp/ignore-error/1)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/task-detail-results.png)
#### Cron Schedule
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/schedule-generate-cron.png)
## Architecture
Crawlab's architecture is very similar to Celery's, but a few more modules, including the frontend, spiders, and Flower, are added to support the crawling management functionality.
![crawlab-architecture](./docs/img/crawlab-architecture.png)
![](https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/architecture.png)
### Nodes
@@ -70,16 +63,7 @@ Nodes are actually the workers defined in Celery. A node is running and connecte
### Spiders
##### Auto Discovery
In the `config.py` file, set `PROJECT_SOURCE_FILE_FOLDER` to the directory where the spider projects are located. The web app will discover spider projects automatically. How simple is that!
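Conceptually, auto discovery could be as simple as the sketch below: scan `PROJECT_SOURCE_FILE_FOLDER` and treat each subdirectory as a spider project. Crawlab's actual discovery logic may differ; this only illustrates the idea.

```python
# Sketch of directory-based spider discovery (illustrative, not Crawlab's code).
import os

def discover_spiders(folder):
    """Return the names of candidate spider projects under `folder`."""
    if not os.path.isdir(folder):
        return []
    # Each subdirectory is treated as one spider project.
    return sorted(
        name for name in os.listdir(folder)
        if os.path.isdir(os.path.join(folder, name))
    )
```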
##### Deploy Spiders
All spiders need to be deployed to a specific node before crawling. Simply click the "Deploy" button on the spider detail page and the spiders will be deployed to all active nodes.
##### Run Spiders
After deploying the spider, you can click the "Run" button on the spider detail page and select a specific node to start crawling. This triggers a crawling task, which you can view in detail on the tasks page.
The spider source code and configured crawling rules are stored on the `App`, and need to be deployed to each `worker` node.
### Tasks
@@ -146,26 +130,11 @@ Crawlab is easy to use, general enough to adapt spiders in any language and any
| [ScrapydWeb](https://github.com/my8100/scrapydweb) | Admin Platform | Y | Y | Y
| [Scrapyd](https://github.com/scrapy/scrapyd) | Web Service | Y | N | N/A
## TODOs
##### Backend
- [ ] File Management
- [ ] MySQL Database Support
- [ ] Task Restart
- [ ] Node Monitoring
- [ ] More spider examples
##### Frontend
- [x] Task Stats/Analytics
- [x] Table Filters
- [x] Multi-Language Support (中文)
- [ ] Login & User Management
- [ ] General Search
## Community & Sponsorship
If you feel Crawlab could benefit your daily work or your company, please add the author's WeChat account with the note "Crawlab" to join the discussion group. Or scan the Alipay QR code below to give us a reward to upgrade our teamwork software or buy us a coffee.
<p align="center">
<img src="https://user-gold-cdn.xitu.io/2019/3/15/169814cbd5e600e9?imageslim" height="360">
<img src="https://raw.githubusercontent.com/tikazyq/crawlab/master/docs/img/payment.jpg" height="360">
<img src="https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/qrcode.png" height="360">
<img src="https://crawlab.oss-cn-hangzhou.aliyuncs.com/gitbook/payment.jpg" height="360">
</p>