How do you match Chinese characters with a regular expression?
In general, Chinese characters can be matched with a Unicode-range character class such as [\u4e00-\u9fa5], the same class used in the code further down.
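As a quick illustration of that character class on its own, here is a minimal sketch. The sample string is made up for the example; the range \u4e00-\u9fa5 covers the commonly used CJK ideographs, so full-width punctuation such as "," is not matched.

```python
import re

# Hypothetical sample text mixing Chinese, ASCII, and full-width punctuation.
text = "Python正则表达式,match中文 123"

# [\u4e00-\u9fa5] matches one common CJK ideograph;
# "+" groups consecutive ideographs into a single match.
chinese_runs = re.findall('[\u4e00-\u9fa5]+', text)
print(chinese_runs)  # ['正则表达式', '中文']
```

Each run of consecutive Chinese characters comes back as one list element, and anything outside the range acts as a separator.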
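First locate the node, either with BeautifulSoup or with a regular expression, then match its text against the character class above.
Assuming there is only one such node, the usage looks like this: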
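from bs4 import BeautifulSoup
import requests as req
import re

URL = 'XXX'
html = req.get(URL).text

# Option 1: locate the node with BeautifulSoup.
bs = BeautifulSoup(html, 'html.parser')
span = bs.find_all('span', 'pro-title')  # second argument filters by CSS class

# Option 2: locate the node with a regular expression instead.
# span = re.findall(r'<span\sclass="pro-title">[^<]+</span>', html)
# s = span[0]
# m = re.findall('[\u4e00-\u9fa5]+', s)

# Convert the result to a string, then pull out the runs of Chinese characters.
s = str(span)
m = re.findall('[\u4e00-\u9fa5]+', s)
print(m)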
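The snippet above needs network access and assumes a single node. For trying the idea offline, here is a rough sketch of the regex-only route applied to several nodes. The HTML string and its contents are invented for the example, and the pattern assumes the span is written exactly as `<span class="pro-title">` with no other attributes.

```python
import re

# Hypothetical HTML standing in for the downloaded page.
html = (
    '<div><span class="pro-title">华为 Mate 手机</span>'
    '<span class="pro-title">Apple 苹果电脑</span></div>'
)

# Capture the inner text of each <span class="pro-title"> node.
nodes = re.findall(r'<span\sclass="pro-title">([^<]+)</span>', html)

# Apply the Chinese character class to each node separately,
# so matches from different nodes stay in separate lists.
matches = [re.findall('[\u4e00-\u9fa5]+', node) for node in nodes]
print(matches)  # [['华为', '手机'], ['苹果电脑']]
```

Matching per node keeps the results grouped by the element they came from, which str(span) on the whole result list does not do. For real pages with varying attributes, locating the nodes with BeautifulSoup is likely the more robust choice.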