Automated Data Scraping Techniques (IV): Regular Expressions (Regex)
Tutorials and documentation
Regex Cheat Sheet: see the [rexegg.com tutorial](https://www.rexegg.com/regex-quickstart.html#lookarounds)
regular-expressions.info: see the tutorial on that site
Common scenarios
Specifying the number of occurrences
For details, see the quantifier reference below:
?, ??: 0 or 1 occurrences (?? is lazy, ? is greedy)
*, *?: any number of occurrences
+, +?: at least one occurrence
{n}: exactly n occurrences
{n,m}: n to m occurrences, inclusive
{n,m}?: n to m occurrences, lazy
{n,}, {n,}?: at least n occurrences
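The difference between the greedy and lazy forms is easiest to see on a concrete string. The snippet below is a minimal sketch using Python's built-in `re` module; the sample strings are made up for illustration.

```python
import re

text = "<a><b>"

# Greedy: .+ grabs as much as possible, so the match runs to the last '>'.
greedy = re.findall(r"<.+>", text)

# Lazy: .+? stops at the first '>' that lets the match succeed.
lazy = re.findall(r"<.+?>", text)

# {n,m} is greedy by default; {n,m}? takes as few repetitions as it can.
greedy_count = re.match(r"\d{2,4}", "123456").group()
lazy_count = re.match(r"\d{2,4}?", "123456").group()

print(greedy)        # ['<a><b>']
print(lazy)          # ['<a>', '<b>']
print(greedy_count)  # 1234
print(lazy_count)    # 12
```

Both forms accept the same range of repetitions; they differ only in which count the engine tries first.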
Example:
To match "exactly n or m" occurrences, you need to write the quantified regex twice (for example as an alternation, X{n}|X{m}), unless n and m are related in a special way:
X{n,m} if m = n+1
(?:X{n}){1,2} if m = 2n
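The three cases can be checked with a short Python sketch. The helper `lengths_matched` and the choice of `\d` as the repeated atom are assumptions made for this illustration only; `fullmatch` is used so that the whole string has to fit the pattern.

```python
import re

# General case (2 or 5): no shortcut applies, so the quantified atom is written twice.
two_or_five = re.compile(r"(?:\d{2}|\d{5})")

# m = n + 1 (2 or 3): a plain range is enough.
two_or_three = re.compile(r"\d{2,3}")

# m = 2n (2 or 4): repeat a group of n once or twice.
two_or_four = re.compile(r"(?:\d{2}){1,2}")


def lengths_matched(pattern, max_len=6):
    """Return every length k <= max_len for which a run of k digits fully matches."""
    return [k for k in range(max_len + 1) if pattern.fullmatch("1" * k)]


print(lengths_matched(two_or_five))   # [2, 5]
print(lengths_matched(two_or_three))  # [2, 3]
print(lengths_matched(two_or_four))   # [2, 4]
```

When the pattern is used for searching rather than full matching, it usually needs anchors or boundaries around it, otherwise a longer run will still match on a sub-run.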
Miscellaneous
Chinese half-length dash (—)
Find and replace with regex
Replace (\d{4})-(\d{4}) with \1—\2
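The same find-and-replace can be run programmatically. This is a minimal sketch with Python's `re.sub`; the sample text is invented, and the pattern is used exactly as written above, without word boundaries, so it would also fire inside longer digit runs.

```python
import re

# Hypothetical sample text containing year ranges joined by a hyphen.
text = "1990-1995 and 2001-2008"

# \1 and \2 refer back to the two captured four-digit groups;
# the hyphen between them is swapped for the half-length dash.
result = re.sub(r"(\d{4})-(\d{4})", r"\1—\2", text)

print(result)  # 1990—1995 and 2001—2008
```

If only standalone four-digit numbers should be affected, wrapping the pattern in `\b` on both sides is one way to narrow it.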