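How to Parse HTML and XML with Objective-C
by Elton on Feb. 25, 2010, under Mac
There are two ways to parse HTML or XML in Objective-C that ship with the system: libxml2 and NSXMLParser. Both, however, require you to write a lot of code yourself to process the fetched content, and neither is very intuitive.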
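To give a feel for how much code the NSXMLParser route takes, here is a minimal sketch that collects the text of every `td` element. The class name `CellCollector` and the function `CellTextsInXML` are made up for this illustration. The sketch assumes ARC (under manual reference counting the objects would need matching release calls) and well-formed XML input, since NSXMLParser does not tolerate sloppy real-world HTML.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: collects the text of every <td> element it sees.
@interface CellCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *cells;        // finished cell texts
@property (nonatomic, strong) NSMutableString *currentText; // non-nil only inside a <td>
@end

@implementation CellCollector

- (instancetype)init {
    self = [super init];
    if (self) {
        _cells = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    // Start buffering text when a cell opens.
    if ([elementName isEqualToString:@"td"]) {
        self.currentText = [NSMutableString string];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element, so append.
    // Outside a <td> currentText is nil and this is a no-op.
    [self.currentText appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    // Store the buffered text when the cell closes.
    if ([elementName isEqualToString:@"td"] && self.currentText != nil) {
        [self.cells addObject:[self.currentText copy]];
        self.currentText = nil;
    }
}

@end

// Hypothetical entry point: returns the cell texts, or nil if parsing fails.
static NSArray *CellTextsInXML(NSData *xmlData) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xmlData];
    CellCollector *collector = [[CellCollector alloc] init];
    parser.delegate = collector;
    if (![parser parse]) {
        return nil;
    }
    return collector.cells;
}
```

Note that this picks up every cell in the document. Restricting it to, say, the second row of the third table would mean adding counters for tables and rows to the delegate, which is exactly the kind of bookkeeping a single XPath expression replaces.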
A better option is hpple, a lightweight wrapper framework that solves this problem nicely. It uses XPath to locate and parse nodes in HTML or XML.
Installation steps:
-Add the libxml2 headers to your project
Menu Project->Edit Project Settings
Search for "Header Search Paths"
Add a new search path "${SDKROOT}/usr/include/libxml2"
Enable recursive option
-Link the libxml2 library into your project
Menu Project->Edit Project Settings
Search for "Other Linker Flags"
Add a new flag "-lxml2"
-Add the following hpple source files to your project:
TFHpple.h
TFHpple.m
TFHppleElement.h
TFHppleElement.m
XPathQuery.h
XPathQuery.m
-XPath tutorial: http://www.w3schools.com/XPath/default.asp
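The two build settings from the steps above can also be kept in an .xcconfig file instead of being typed into the project settings dialog. This is only a sketch: the file name `libxml2.xcconfig` is hypothetical, and it assumes the project or target has been set to use this file as its base configuration.

```
// libxml2.xcconfig (hypothetical file name)

// Let the compiler find the <libxml/...> headers
HEADER_SEARCH_PATHS = $(inherited) $(SDKROOT)/usr/include/libxml2

// Link against libxml2
OTHER_LDFLAGS = $(inherited) -lxml2
```

The `$(inherited)` entries keep whatever values the project already defines for these settings.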
Sample code:

    #import "TFHpple.h"

    NSData *data = [[NSData alloc] initWithContentsOfFile:@"example.html"];

    // Create parser
    TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data];

    // Get all the cells of the 2nd row of the 3rd table
    NSArray *elements = [xpathParser search:@"//table[3]/tr[2]/td"];

    // Access the first cell (objectAtIndex: raises if the query matched nothing)
    TFHppleElement *element = [elements objectAtIndex:0];

    // Get the text within the cell tag
    NSString *content = [element content];

    // Manual reference counting: release what we alloc'ed
    [xpathParser release];
    [data release];

There is also a similar solution worth a look:
ElementParser http://github.com/Objective3/ElementParser



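February 25th, 2010 at 13:19
Very useful! Thanks a lot~
February 25th, 2010 at 21:07
I've always used NSXMLParser, which is rather tedious. Thanks for the introduction, I'll look into it.
May 17th, 2010 at 10:33
Thanks. Also, that photo of yours is great!
I have a question I'd like your advice on.
How precise is touch detection on the iPhone's multi-touch screen?
Can it sense the contact area of a finger pressed against it, take that touch as input, and display the shape of the contact area? What would be a good way to implement that?
May 17th, 2010 at 20:59
Thanks for the kind words. I don't think the iPhone's screen is that sensitive, but you could vary the brush width dynamically based on how fast the finger moves, the way some drawing apps do. I haven't looked into it myself, so the relevant documentation would be a good place to start.
May 22nd, 2010 at 12:32
Thanks, and here's hoping you come up with another clever trick and strike it rich~
May 29th, 2010 at 06:28
Haha, thank you.
June 24th, 2010 at 15:08
Hello Elton, while parsing HTML with hpple I ran into garbled Chinese characters. Have you come across this? How can it be solved?
June 25th, 2010 at 16:34
I haven't scraped any Chinese text; everything I've pulled has been numbers. I'll give it a try another day.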
August 7th, 2010 at 01:56
Excellent read, I must say. You’ve researched the topic very well :)
August 30th, 2010 at 03:14
A thoughtful insight and ideas I will use on my website. You’ve obviously spent a lot of time on this. Well done!
September 2nd, 2010 at 21:21
Thanks for posting this article.
September 3rd, 2010 at 20:34
Could not have been written any better than this. Skimming through this post brings to mind my old room colleague! He continuously kept blabbing about this.
September 4th, 2010 at 00:02
I love what you guys are always doing
September 6th, 2010 at 07:12
I often read your blog and always find it very interesting. Thought it was about time I let you know. Keep up the great work
September 6th, 2010 at 08:47
Hello,
I wanted to let you know that I have been reading for a few months on and off and I would like to sign up for the daily feed. I am not too computer smart so I’ll give it a try but I might need some help. This is a great find and I would hate to lose contact, and maybe never find it again.
Anyway, thanks again and I look forward to posting again sometime!
October 15th, 2010 at 06:18
A customer may always be in the command of worth but module e’er origin above a cat’s organ
October 15th, 2010 at 07:06
Thanks for making the effort to make clear the terminology in this blog to the beginners!
October 15th, 2010 at 16:59
In searching for sites related to web hosting and specifically comparison hosting linux plan web, your site came up.
April 30th, 2011 at 22:02
Has the author ever tried parsing Chinese text and run into the garbled-character problem? I've been stuck on it all day without solving it, and would appreciate any pointers.