<!DOCTYPE HTML>
< html lang = "zh-tw" >
< head >
< meta charset = "UTF-8" >
< meta http-equiv = "X-UA-Compatible" content = "IE=edge" / >
< title > Example: Concurrent Web Crawler | The Go Programming Language< / title >
< meta content = "text/html; charset=utf-8" http-equiv = "Content-Type" >
< meta name = "description" content = "" >
< meta name = "generator" content = "GitBook 2.5.2" >
< meta name = "HandheldFriendly" content = "true" / >
< meta name = "viewport" content = "width=device-width, initial-scale=1, user-scalable=no" >
< meta name = "apple-mobile-web-app-capable" content = "yes" >
< meta name = "apple-mobile-web-app-status-bar-style" content = "black" >
< link rel = "apple-touch-icon-precomposed" sizes = "152x152" href = "../gitbook/images/apple-touch-icon-precomposed-152.png" >
< link rel = "shortcut icon" href = "../gitbook/images/favicon.ico" type = "image/x-icon" >
< link rel = "stylesheet" href = "../gitbook/style.css" >
< link rel = "stylesheet" href = "../gitbook/plugins/gitbook-plugin-highlight/website.css" >
< link rel = "stylesheet" href = "../gitbook/plugins/gitbook-plugin-fontsettings/website.css" >
< link rel = "next" href = "../ch8/ch8-07.html" / >
< link rel = "prev" href = "../ch8/ch8-05.html" / >
< / head >
< body >
< div class = "book" data-level = "8.6" data-chapter-title = "示例: 併發的Web爬蟲" data-filepath = "ch8/ch8-06.md" data-basepath = ".." data-revision = "Fri Dec 25 2015 12:32:44 GMT+0800 (中国标准时间)" >
< div class = "book-body" >
< div class = "body-inner" >
< div class = "book-header" role = "navigation" >
<!-- Actions Left -->
<!-- Title -->
< h1 >
< i class = "fa fa-circle-o-notch fa-spin" > < / i >
< a href = "../" > The Go Programming Language< / a >
< / h1 >
< / div >
< div class = "page-wrapper" tabindex = "-1" role = "main" >
< div class = "page-inner" >
< section class = "normal" id = "section-" >
< h2 id = "86-示例-併發的web爬蟲" > 8.6. Example: Concurrent Web Crawler < / h2 >
< p > In Section 5.6, we built a simple web crawler that explored the link graph of a website in breadth-first order. In this section, we make it concurrent so that independent calls to crawl can perform their I/O in parallel and make the best use of the available network resources. The crawl function is the same as the one in gopl.io/ch5/findlinks3. < / p >
< pre > < code class = "lang-go" > < span class = "hljs-comment" > // gopl.io/ch8/crawl1< / span >
< span class = "hljs-keyword" > func< / span > crawl(url < span class = "hljs-typename" > string< / span > ) []< span class = "hljs-typename" > string< / span > {
    fmt.Println(url)
    list, err := links.Extract(url)
    < span class = "hljs-keyword" > if< / span > err != < span class = "hljs-constant" > nil< / span > {
        log.Print(err)
    }
    < span class = "hljs-keyword" > return< / span > list
}
< / code > < / pre >
< p > The main function resembles breadthFirst (breadth-first search) from Section 5.6. As before, a worklist records the queue of items that need processing, each item being a list of URLs to crawl. This time, however, we represent the queue with a channel instead of a slice. Each call to crawl runs in its own goroutine and sends the links it discovers back to the worklist. < / p >
< pre > < code class = "lang-go" > < span class = "hljs-keyword" > func< / span > main() {
    worklist := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > chan< / span > []< span class = "hljs-typename" > string< / span > )
    < span class = "hljs-comment" > // Start with the command-line arguments.< / span >
    < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > () { worklist &lt;- os.Args[< span class = "hljs-number" > 1< / span > :] }()
    < span class = "hljs-comment" > // Crawl the web concurrently.< / span >
    seen := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > map< / span > [< span class = "hljs-typename" > string< / span > ]< span class = "hljs-typename" > bool< / span > )
    < span class = "hljs-keyword" > for< / span > list := < span class = "hljs-keyword" > range< / span > worklist {
        < span class = "hljs-keyword" > for< / span > _, link := < span class = "hljs-keyword" > range< / span > list {
            < span class = "hljs-keyword" > if< / span > !seen[link] {
                seen[link] = < span class = "hljs-constant" > true< / span >
                < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > (link < span class = "hljs-typename" > string< / span > ) {
                    worklist &lt;- crawl(link)
                }(link)
            }
        }
    }
}
< / code > < / pre >
< p > Notice that the crawl goroutine takes link as an explicit parameter to avoid the problem of loop variable capture (explained in Section 5.6.1). Also notice that the initial send of the command-line arguments to the worklist runs in its own goroutine. This avoids deadlock, a stuck situation in which the main goroutine and a crawler goroutine each try to send to the other over the channel while neither is ready to receive. An alternative solution would be to use a buffered channel. < / p >
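< p > The buffered-channel alternative mentioned above can be sketched as follows. This is a minimal illustration, not the book's code: the helper name seed and the capacity of 1 are assumptions made for the example, and it addresses only the initial send. Crawler goroutines would still send their results from their own goroutines. < / p >

```go
package main

import "fmt"

// seed is a hypothetical helper that returns a work list already
// holding the initial URLs. Because the channel has a buffer of
// capacity 1, the send completes even though no receiver is ready,
// so no extra goroutine is needed for it.
func seed(args []string) chan []string {
	worklist := make(chan []string, 1)
	worklist <- args // does not block: one buffer slot is free
	return worklist
}

func main() {
	worklist := seed([]string{"http://gopl.io/"})
	fmt.Println(<-worklist) // prints "[http://gopl.io/]"
}
```

< p > With an unbuffered channel the same send would block forever, because main has not yet reached the loop that receives from the worklist. < / p >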
< p > The crawler is now highly concurrent and prints a storm of URLs, but it has two problems. The first shows up as error messages in the log after the program has been running for a while: < / p >
< pre > < code > $ go build gopl.io/ch8/crawl1
$ ./crawl1 http://gopl.io/
http://gopl.io/
https://golang.org/help/
https://golang.org/doc/
https://golang.org/blog/
...
2015/07/15 18:22:12 Get ...: dial tcp: lookup blog.golang.org: no such host
2015/07/15 18:22:12 Get ...: dial tcp 23.21.222.120:443: socket:
too many open files
...
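< / code > < / pre > < p > The initial error message is a surprising report of a DNS lookup failure for a perfectly reliable domain. The subsequent error message reveals the cause: the program created so many network connections at once that it exceeded the per-process limit on the number of open files, which caused operations such as DNS lookups and calls to net.Dial to start failing. < / p >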
< p > The program is too parallel. Unbounded parallelism is rarely a good idea, since there is always a limiting factor in the system, such as the number of CPU cores for compute-bound workloads, the number of spindles and heads for local disk I/O operations, the bandwidth of the network for downloads, or the serving capacity of a web service. The solution is to limit the resources a concurrent program uses so that they match what its environment can provide. The simplest way to do that in our example is to ensure that no more than n calls to links.Extract are active at once, where n is comfortably less than the file descriptor limit (20, say). This is analogous to the way a doorman at a crowded nightclub admits a new guest only when some other guest leaves. < / p >
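< p > One way to check that a chosen n really is comfortably below the file descriptor limit is to query the limit at startup. The sketch below assumes a Unix-like system where syscall.Getrlimit and syscall.RLIMIT_NOFILE are available. The helper maxConcurrent, the one-quarter fraction, and the cap of 20 are illustrative choices, not part of the original program. < / p >

```go
package main

import (
	"fmt"
	"syscall"
)

// maxConcurrent is a hypothetical helper that picks a concurrency
// limit well below the soft open-file limit: a quarter of the limit,
// capped at 20 and never less than 1.
func maxConcurrent(softLimit uint64) int {
	n := softLimit / 4
	if n > 20 {
		n = 20
	}
	if n < 1 {
		n = 1
	}
	return int(n)
}

func main() {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		fmt.Println("getrlimit:", err)
		return
	}
	// rl.Cur is the soft limit on open files for this process.
	soft := uint64(rl.Cur)
	fmt.Printf("soft limit %d, using n = %d\n", soft, maxConcurrent(soft))
}
```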
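< p > We can limit parallelism using a buffered channel of capacity n, which models the concurrency primitive known in operating systems as a counting semaphore. Conceptually, each of the n vacant slots in the channel buffer represents a token that entitles its holder to proceed. Sending a value into the channel acquires a token, and receiving a value from the channel releases a token, creating a new vacant slot. This ensures that at most n sends can occur without an intervening receive. (It might be more intuitive to treat the filled slots in the channel buffer as tokens, but using vacant slots avoids having to fill the buffer after creating the channel.) Since the channel element type is not important, we use struct{}, whose values have size zero. < / p >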
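< p > The counting-semaphore idea can be tried in isolation before applying it to the crawler. The following sketch is an illustration under assumptions: the function runLimited, the simulated one-millisecond task, and the use of sync.WaitGroup and sync.Mutex for bookkeeping are not from the original program. It starts many goroutines but records the highest number that were ever inside the guarded region at once. < / p >

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited starts `jobs` simulated tasks, lets at most `limit` of
// them run at once, and returns the peak concurrency it observed.
func runLimited(jobs, limit int) int {
	tokens := make(chan struct{}, limit) // counting semaphore
	var (
		mu           sync.Mutex // guards active and peak
		active, peak int
		wg           sync.WaitGroup
	)
	for i := 0; i < jobs; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			tokens <- struct{}{} // acquire a token; blocks when none are free

			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()

			time.Sleep(time.Millisecond) // stand-in for an I/O call

			mu.Lock()
			active--
			mu.Unlock()

			<-tokens // release the token
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(runLimited(100, 5) <= 5) // prints "true"
}
```

< p > Because the counter is incremented only after a token is acquired and decremented before it is released, the observed peak can never exceed the channel capacity. < / p >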
< p > 讓 我 們 重 寫 crawl函 數 , 將 對 links.Extract的 調 用 操 作 用 穫 取 、 釋 放 token的 操 作 包 裹 起 來 , 來 確 保 同 一 時 間 對 其 隻 有 20個 調 用 。 信 號 量 數 量 和 其 能 操 作 的 IO資 源 數 量 應 保 持 接 近 。 < / p >
< pre > < code class = "lang-go" > < span class = "hljs-comment" > // gopl.io/ch8/crawl2< / span >

< span class = "hljs-comment" > // tokens is a counting semaphore used to< / span >
< span class = "hljs-comment" > // enforce a limit of 20 concurrent requests.< / span >
< span class = "hljs-keyword" > var< / span > tokens = < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > chan< / span > < span class = "hljs-keyword" > struct< / span > {}, < span class = "hljs-number" > 20< / span > )

< span class = "hljs-keyword" > func< / span > crawl(url < span class = "hljs-typename" > string< / span > ) []< span class = "hljs-typename" > string< / span > {
    fmt.Println(url)
    tokens < - < span class = "hljs-keyword" > struct< / span > {}{} < span class = "hljs-comment" > // acquire a token< / span >
    list, err := links.Extract(url)
    < -tokens < span class = "hljs-comment" > // release the token< / span >
    < span class = "hljs-keyword" > if< / span > err != < span class = "hljs-constant" > nil< / span > {
        log.Print(err)
    }
    < span class = "hljs-keyword" > return< / span > list
}
< / code > < / pre >
< p > The second problem is that the program never terminates, even after it has discovered every link reachable from the initial URLs. (You are unlikely to notice this problem unless you choose the initial URLs carefully or implement the depth limit of Exercise 8.6.) For the program to terminate, we need to break out of the main loop when the worklist is empty and no crawl goroutines are active. < / p >
< pre > < code class = "lang-go" > < span class = "hljs-keyword" > func< / span > main() {
    worklist := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > chan< / span > []< span class = "hljs-typename" > string< / span > )
    < span class = "hljs-keyword" > var< / span > n < span class = "hljs-typename" > int< / span > < span class = "hljs-comment" > // number of pending sends to worklist< / span >

    < span class = "hljs-comment" > // Start with the command-line arguments.< / span >
    n++
    < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > () { worklist < - os.Args[< span class = "hljs-number" > 1< / span > :] }()

    < span class = "hljs-comment" > // Crawl the web concurrently.< / span >
    seen := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > map< / span > [< span class = "hljs-typename" > string< / span > ]< span class = "hljs-typename" > bool< / span > )
    < span class = "hljs-keyword" > for< / span > ; n > < span class = "hljs-number" > 0< / span > ; n-- {
        list := < -worklist
        < span class = "hljs-keyword" > for< / span > _, link := < span class = "hljs-keyword" > range< / span > list {
            < span class = "hljs-keyword" > if< / span > !seen[link] {
                seen[link] = < span class = "hljs-constant" > true< / span >
                n++
                < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > (link < span class = "hljs-typename" > string< / span > ) {
                    worklist < - crawl(link)
                }(link)
            }
        }
    }
}
< / code > < / pre >
< p > In this version, the counter n keeps track of the number of sends to worklist that are yet to occur. Each time we know that an item needs to be sent to worklist, we increment n: once before sending the initial command-line arguments, and again each time we start a crawler goroutine. The main loop terminates when n falls to zero, because no work remains. < / p >
< p > The concurrent crawler now runs about 20 times faster than the breadth-first crawler from Section 5.6, without errors, and it terminates correctly once it has completed its task. < / p >
< p > The program below shows another way to avoid excessive concurrency. This version uses the original crawl function, which has no counting semaphore, but calls it from one of 20 long-lived crawler goroutines, so that at most 20 HTTP requests are active concurrently. < / p >
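< p > The counting scheme above can be exercised without any network access. In this sketch, graph and fakeCrawl are made-up stand-ins for the web and for crawl, and crawlAll is a hypothetical wrapper around the same main loop. It returns only because n reaches zero. < / p >

```go
package main

import "fmt"

// graph is a made-up link structure standing in for the web.
var graph = map[string][]string{
	"a": {"b", "c"},
	"b": {"c", "d"},
	"c": {"a"},
	"d": nil,
}

// fakeCrawl is a hypothetical stand-in for crawl: it looks links up in
// graph instead of fetching a page.
func fakeCrawl(url string) []string { return graph[url] }

// crawlAll visits everything reachable from start and returns the number
// of distinct URLs seen.
func crawlAll(start []string) int {
	worklist := make(chan []string)
	n := 1 // one pending send: the initial list
	go func() { worklist <- start }()

	seen := make(map[string]bool)
	for ; n > 0; n-- {
		for _, link := range <-worklist {
			if !seen[link] {
				seen[link] = true
				n++ // one more send to worklist is now pending
				go func(link string) { worklist <- fakeCrawl(link) }(link)
			}
		}
	}
	return len(seen)
}

func main() {
	fmt.Println(crawlAll([]string{"a"})) // prints 4
}
```

< p > Every increment of n is matched by exactly one later send, and every receive is matched by one decrement, so the loop ends exactly when no send is outstanding. < / p >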
< pre > < code class = "lang-go" > < span class = "hljs-keyword" > func< / span > main() {
    worklist := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > chan< / span > []< span class = "hljs-typename" > string< / span > ) < span class = "hljs-comment" > // lists of URLs, may have duplicates< / span >
    unseenLinks := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > chan< / span > < span class = "hljs-typename" > string< / span > ) < span class = "hljs-comment" > // de-duplicated URLs< / span >

    < span class = "hljs-comment" > // Add command-line arguments to worklist.< / span >
    < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > () { worklist < - os.Args[< span class = "hljs-number" > 1< / span > :] }()

    < span class = "hljs-comment" > // Create 20 crawler goroutines to fetch each unseen link.< / span >
    < span class = "hljs-keyword" > for< / span > i := < span class = "hljs-number" > 0< / span > ; i < < span class = "hljs-number" > 20< / span > ; i++ {
        < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > () {
            < span class = "hljs-keyword" > for< / span > link := < span class = "hljs-keyword" > range< / span > unseenLinks {
                foundLinks := crawl(link)
                < span class = "hljs-keyword" > go< / span > < span class = "hljs-keyword" > func< / span > () { worklist < - foundLinks }()
            }
        }()
    }

    < span class = "hljs-comment" > // The main goroutine de-duplicates worklist items< / span >
    < span class = "hljs-comment" > // and sends the unseen ones to the crawlers.< / span >
    seen := < span class = "hljs-built_in" > make< / span > (< span class = "hljs-keyword" > map< / span > [< span class = "hljs-typename" > string< / span > ]< span class = "hljs-typename" > bool< / span > )
    < span class = "hljs-keyword" > for< / span > list := < span class = "hljs-keyword" > range< / span > worklist {
        < span class = "hljs-keyword" > for< / span > _, link := < span class = "hljs-keyword" > range< / span > list {
            < span class = "hljs-keyword" > if< / span > !seen[link] {
                seen[link] = < span class = "hljs-constant" > true< / span >
                unseenLinks < - link
            }
        }
    }
}
< / code > < / pre >
< p > The crawler goroutines are all fed by the same channel, unseenLinks. The main goroutine de-duplicates the items it receives from worklist and sends each unseen one over the unseenLinks channel to a crawler goroutine. < / p >
< p > The seen map is confined within the main goroutine; that is, it can be accessed only by that goroutine. Like other forms of information hiding, confinement helps us reason about the correctness of a program. For example, local variables cannot be referred to by name from outside the function that declares them; variables that do not escape (§2.3.4) from a function cannot be accessed from outside that function; and the encapsulated fields of an object cannot be accessed except by the methods of that object. In every case, information hiding limits unintended interactions between parts of the program. < / p >
< p > Links found by crawl are sent to worklist from a dedicated goroutine to avoid deadlock. To save space, we have not addressed the problem of termination in this example. < / p >
< p > Exercise 8.6: Add depth limiting to the concurrent crawler. That is, if the user sets depth=3, only pages reachable from the start page by following at most three links are fetched. < / p >
< p > Exercise 8.7: Write a concurrent program that creates a local mirror of a web site, fetching each reachable page and writing it to the local disk. To keep things simple, fetch only pages within the original domain (for example, golang.org; translator's note: external links presumably do not count). Links within the mirrored pages should be rewritten as needed so that they point to the mirrored pages rather than to the originals. < / p >
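< p > One possible way to add termination to the worker-pool version is to reuse the pending-send counter from the previous program. The sketch below is an assumption, not the source's solution: crawlPool, graph, and fakeCrawl are hypothetical names, and the link graph is made up so the example runs without network access. < / p >

```go
package main

import "fmt"

// graph is a made-up link structure standing in for the web.
var graph = map[string][]string{
	"a": {"b", "c"},
	"b": {"c", "d"},
	"c": {"a"},
	"d": nil,
}

// fakeCrawl is a hypothetical stand-in for crawl.
func fakeCrawl(url string) []string { return graph[url] }

// crawlPool feeds unseen links to a fixed pool of workers and returns the
// number of distinct URLs seen once no list remains to be received.
func crawlPool(start []string, workers int) int {
	worklist := make(chan []string) // lists of URLs, may have duplicates
	unseenLinks := make(chan string) // de-duplicated URLs

	go func() { worklist <- start }()

	for i := 0; i < workers; i++ {
		go func() {
			for link := range unseenLinks {
				found := fakeCrawl(link)
				// Send from a dedicated goroutine to avoid deadlock.
				go func() { worklist <- found }()
			}
		}()
	}

	// seen is confined to the calling goroutine.
	seen := make(map[string]bool)
	pending := 1 // lists still to be received from worklist
	for ; pending > 0; pending-- {
		for _, link := range <-worklist {
			if !seen[link] {
				seen[link] = true
				pending++ // this link will produce exactly one list
				unseenLinks <- link
			}
		}
	}
	close(unseenLinks) // lets the idle workers exit their range loops
	return len(seen)
}

func main() {
	fmt.Println(crawlPool([]string{"a"}, 3)) // prints 4
}
```

< p > Each link handed to a worker results in exactly one list sent back, so the counter reaches zero only when every worker is idle, and closing unseenLinks at that point is safe. < / p >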
< p > Translator's note, further reading:
< a href = "http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/" target = "_blank" > http://marcio.io/2015/07/handling-1-million-requests-per-minute-with-golang/< / a > < / p >
< / section >
< / div >
< / div >
< / div >
< a href = "../ch8/ch8-05.html" class = "navigation navigation-prev " aria-label = "Previous page: Looping in Parallel" > < i class = "fa fa-angle-left" > < / i > < / a >
< a href = "../ch8/ch8-07.html" class = "navigation navigation-next " aria-label = "Next page: Multiplexing with select" > < i class = "fa fa-angle-right" > < / i > < / a >
< / div >
< / div >
< script src = "../gitbook/app.js" > < / script >
< script src = "../gitbook/plugins/gitbook-plugin-sharing/buttons.js" > < / script >
< script src = "../gitbook/plugins/gitbook-plugin-fontsettings/buttons.js" > < / script >
< script >
require(["gitbook"], function(gitbook) {
var config = {"highlight":{},"sharing":{"facebook":true,"twitter":true,"google":false,"weibo":false,"instapaper":false,"vk":false,"all":["facebook","google","twitter","weibo","instapaper"]},"fontsettings":{"theme":"white","family":"sans","size":2}};
gitbook.start(config);
});
< / script >
< / body >
< / html >