Retrieving website information

How can we write a Java program that searches a website for information about a particular subject and saves that information to a file? Has this already been done, or am I trying to reinvent the wheel?
At the beginning of the course I believe Mark said something about doing this.

There's no simple, general way to search a website for information, because web pages don't share a common structure. That said, you can certainly write a Java program to extract specific information (e.g. the weather) from a particular web page if you know that page's format. Also, more and more websites are using XML and web services (e.g. Amazon, Google, and eBay all provide APIs) to make it easier to get information from them.
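
To give a rough idea of the "known format" approach, here is a minimal sketch. It assumes a hypothetical weather page that marks the temperature as <span class="temp">72</span>; that markup, the class name WeatherScraper, and both method names are made up for illustration, so the pattern would need to be adapted to whatever page you actually target. It also assumes Java 11 or later for java.net.http and Files.writeString.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WeatherScraper {
    // Assumed (hypothetical) page format: <span class="temp">72</span>
    private static final Pattern TEMP =
        Pattern.compile("<span\\s+class=\"temp\">\\s*(-?\\d+)\\s*</span>");

    // Returns the first temperature found in the HTML, or empty if the
    // page does not contain the assumed markup.
    static Optional<Integer> extractTemperature(String html) {
        Matcher m = TEMP.matcher(html);
        return m.find()
            ? Optional.of(Integer.parseInt(m.group(1)))
            : Optional.empty();
    }

    // Downloads the page body as a string.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Usage: java WeatherScraper.java <url> <output-file>
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: WeatherScraper <url> <output-file>");
            return;
        }
        String html = fetch(args[0]);
        String line = extractTemperature(html)
            .map(t -> "temperature=" + t)
            .orElse("temperature not found");
        // Save the extracted information to the requested file.
        Files.writeString(Path.of(args[1]), line + System.lineSeparator());
    }
}
```

A regular expression like this breaks as soon as the site changes its layout, which is one reason an XML feed or web service API is usually the better choice when the site offers one.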

Can you be more specific about what type of information you want to obtain from a website?
