ASP by Example: Scraping Online Real Estate Listings (2)
Author: unknown   Source: unknown   Published: 2006-8-13 0:22:43   Posted by: chinazhan
next

function RemoveTag(body)
    ' Collect up to 15 <a>...</a> elements and strip table tags from each
    Set regEx = New RegExp
    regEx.Pattern = "<[a].*?<\/[a]>"
    regEx.IgnoreCase = True
    regEx.Global = True
    Set Matches = regEx.Execute(body)
    dim i,arr(15),ifexit
    i=0
    j=0
    For Each Match in Matches
        TempStr = Match.Value
        TempStr=replace(TempStr,"<td>","")
        TempStr=replace(TempStr,"</td>","")
        TempStr=replace(TempStr,"<tr>","")
        TempStr=replace(TempStr,"</tr>","")
        arr(i)=TempStr
        i=i+1
        if(i>=15) then
            exit for
        end if
    Next
    Set regEx=nothing
    Set Matches =nothing
    RemoveTag=arr
end function

function RegexHtml(body)
    ' Return every <a>...</a> element found in body (at most 48)
    dim r_arr(47),r_temp
    Set regEx2 = New RegExp
    regEx2.Pattern ="<a.*?<\/a>"
    regEx2.IgnoreCase = True
    regEx2.Global = True
    Set Matches2 = regEx2.Execute(body)
    iii=0
    For Each Match in Matches2
        ' Guard added: stop before writing past the end of r_arr
        if iii>47 then exit for
        r_arr(iii)=Match.Value
        iii=iii+1
    Next
    RegexHtml=r_arr
    set regEx2=nothing
    set Matches2=nothing
end function
'======================================================
conn.close
set conn=nothing
%>
</body>
</html>

function.asp

<%
'**************************************************
'Function: gotTopic
'Purpose:  truncate a string; a Chinese character counts as two characters, an English character as one
'Params:   str    ---- the original string
'          strlen ---- the length to keep
'Returns:  the truncated string
'**************************************************
function gotTopic(str,strlen)
    if str="" then
        gotTopic=""
        exit function
    end if
    dim l,t,c, i
    ' Decode common HTML entities before counting
    str=replace(replace(replace(replace(str,"&nbsp;"," "),"&quot;",chr(34)),"&gt;",">"),"&lt;","<")
    str=replace(str,"?","")
    l=len(str)
    t=0
    for i=1 to l
        c=Abs(Asc(Mid(str,i,1)))
        if c>255 then
            t=t+2
        else
            t=t+1
        end if
        if t>=strlen then
            gotTopic=left(str,i) & "…"
            exit for
        else
            gotTopic=str
        end if
    next
    ' Re-encode the entities for output
    gotTopic=replace(replace(replace(replace(gotTopic," ","&nbsp;"),chr(34),"&quot;"),">","&gt;"),"<","&lt;")
end function

'=========================================================
'Function: RemoveHTML(strHTML)
'Purpose:  strip HTML tags
'Params:   strHTML -- the string to strip HTML tags from
'=========================================================
Function RemoveHTML(strHTML)
    Dim objRegExp, Match, Matches
    Set objRegExp = New Regexp
    objRegExp.IgnoreCase = True
    objRegExp.Global = True
    'Match a closed <> pair
    objRegExp.Pattern = "<.+?>"
    'Run the match
    Set Matches = objRegExp.Execute(strHTML)
    'Walk the match collection and replace each matched item
    For Each Match in Matches
        strHtml=Replace(strHTML,Match.Value,"")
    Next
    RemoveHTML=strHTML
    Set objRegExp = Nothing
    set Matches=nothing
End Function
%>

conn.asp

<%
'on error resume next
set conn=server.CreateObject("adodb.connection")
con= "driver={Microsoft Access Driver (*.mdb)};dbq=" & Server.MapPath("stest.mdb")
conn.open con

sub connclose
    conn.close
    set conn=nothing
end sub
%>

Appendix: example of a detail page from which the information is scraped

Serial number: 479280
Category: For rent
City: Jinan
Location: intersection of Hualong Road and Huaxin Road (华龙路华信路交界口)
Property type: Other
Floor: 6
Usable area: between 24 and 240 square meters
Price: 0 [rent: yuan/month; sale: 10,000 yuan/unit]
Other notes: Small units on floors 3 to 6 of the Huaxin Business Building (华信商务楼) are for rent (from 0.5 yuan per square meter). The building is purely for commercial office use and is suitable as office space. Surrounding amenities are complete and transport is convenient (bus routes 37, 80 and K95 pass in front of the building). Full property rights with a municipal certificate. The building has water, electricity, heating and elevators. Interested parties may inquire by phone!
Contact: Lu, Wang (鲁、王)
Phone: 88017966, 86812217
Source: 2005-8-4 8:28:55 from: 218.98.86.175
Views: 19

Please play fair: when reposting, credit chinazhan 中国站长 (www.ChinaZhan.com).
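
RegexHtml stores its matches in a fixed-size array, r_arr(47), so the number of links a page may contain is baked into the code. As a minimal sketch of an alternative, the version below sizes the array from Matches.Count instead. The name ExtractLinks and the sample HTML string (including the show.asp URLs) are hypothetical and are not part of the original article. Note that the pattern is the article's own, so it also matches other tags that start with "<a" and will not match a link that spans a line break.

```vbscript
<%
' Hypothetical helper (not from the original article): returns every
' <a>...</a> element in body as a dynamically sized array.
Function ExtractLinks(body)
    Dim regEx, Matches, Match, n
    Dim result()
    Set regEx = New RegExp
    regEx.Pattern = "<a.*?<\/a>"   ' same pattern as RegexHtml
    regEx.IgnoreCase = True
    regEx.Global = True
    Set Matches = regEx.Execute(body)
    If Matches.Count = 0 Then
        ' No links found: return an empty array (UBound = -1)
        ExtractLinks = Array()
    Else
        ' Size the array to the actual number of matches
        ReDim result(Matches.Count - 1)
        n = 0
        For Each Match In Matches
            result(n) = Match.Value
            n = n + 1
        Next
        ExtractLinks = result
    End If
    Set Matches = Nothing
    Set regEx = Nothing
End Function

' Usage with made-up sample markup
Dim html, links, k
html = "<td><a href=""show.asp?id=1"">A</a></td>" & _
       "<td><a href=""show.asp?id=2"">B</a></td>"
links = ExtractLinks(html)
For k = 0 To UBound(links)
    ' HTMLEncode so the tag text is shown rather than rendered
    Response.Write Server.HTMLEncode(links(k)) & "<br>"
Next
%>
```

Callers iterate up to UBound(links) rather than a hard-coded 47, so the loop also handles pages with no links or with more links than expected.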