
MFoV P6 cases page

MFoV members:

The legend of P6
When we first read the requirements for P6, we were all freaking out. If you've ever visited listen.com, you may know that it links to many different web sites, and we didn't know how we were going to parse each of them. Luckily, the requirements were changed a bit, which made our lives a lot easier. By the time we found out what the new requirements were, though, we had already written a parser for IUMA.com (oh well). We got mp3.com working early on, and then pulled an all-nighter at the CoC on Wednesday to get listen.com working. This was also the week that some of us had tests and another program due on Thursday. It was a god-awful week, if I remember correctly.

P6 Goals
1. To parse mp3.com (both artist pages and search pages)
2. To parse listen.com for links to mp3.com
3. Once the pages have been parsed, generate a list of links
4. Let the user choose a file, and then download that file
5. Add the downloaded file to the playlist (this was a P7 requirement, but we went ahead and did it anyway)

What was hard?
1. Figuring out how to grab the links out of the pages we were parsing
2. Figuring out how to get only a certain number of links, and not all of them
3. The rest of the project wasn't too hard. We tried to figure out how to download files without grabbing all the bytes at once. We never did figure that out, but we came up with an alternative: when you use our program to get songs from mp3.com, you are downloading the lo-fi MP3s. Most of these are relatively small (usually 1 MB or less) and can be downloaded by Squeak without crashing the VM. If you try to get a file all at once and it is too big to fit in VM memory, the VM will crash (yes, this means the lowSpaceHandler doesn't work). A rough sketch of the one-shot download is below.
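Here is a minimal sketch of that one-shot download, assuming Squeak's HTTPSocket httpGet: answers a stream on success and an error String on failure. That behavior, along with the URL and the file name, is just illustrative, not exactly what our code does.

  "Fetch a small lo-fi MP3 in one shot and write it to a local file.
   The URL is made up; httpGet: answering an error String on failure
   is an assumption about the HTTPSocket protocol."
  | response file |
  response := HTTPSocket httpGet: 'http://example.mp3.com/lofi/song.mp3'.
  response isString
      ifTrue: [Transcript show: 'Download failed: ', response; cr]
      ifFalse: [
          file := FileStream newFileNamed: 'song.mp3'.
          file binary.
          file nextPutAll: response contents.
          file close]

This only works because the lo-fi files are small enough to fit in memory; a bigger file would have to be read and written in chunks, which is exactly the part we never got working.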

Implementation
First off, let me say this: HtmlParser rules! The way you use it is to retrieve the contents of some web page (I won't say exactly how) and pass those contents to the HtmlParser class. It gives you back a parse tree of the page you sent it, with everything broken into parts. This is a very useful and, as Professor Shivers would say, sexy tool. Once you have the parsed page, you can just use forAllSubentitiesDo: and match patterns. We thought parsing pages was going to be hard, but thanks to HtmlParser it turned out to be much easier than we expected.

But don't let this get you too excited. One of the requirements was that we let the user specify the number of searches they want to do. This made parsing a little more difficult, because you have to tell your do loop when to quit. I also wanted to show the user the artist and title of the song they were going to download, so once I found a link I had to tell the loop to go find the artist, which took a little messing around to get right.

Parsing listen.com was more difficult for several reasons. The main one is that we first had to check which songs had any links to them at all, and then, for each of those, whether any link pointed to mp3.com. All this link checking makes for a pretty slow search, even if you only search for a few songs.

Anyway, below is a rough sketch of the link-grabbing loop, followed by a collage of the steps you would go through if you were to get a song from mp3.com/listen.com using our program:
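The sketch assumes the raw HTML is already in the String pageContents, that HtmlParser parse: takes a ReadStream on that String, and that each entity answers its tag name via #name and its attributes via a Dictionary from #attributes. Those selector names are assumptions from memory, so check HtmlEntity's actual protocol before relying on them.

  "Collect the first n anchor hrefs from a parsed page.
   pageContents is assumed to hold the raw HTML as a String;
   #name and #attributes are assumed accessors on the entities."
  | doc links n |
  n := 5.  "however many results the user asked for"
  links := OrderedCollection new.
  doc := HtmlParser parse: (ReadStream on: pageContents).
  doc forAllSubentitiesDo: [:ent |
      (links size < n
          and: [ent name = 'a'
          and: [ent attributes includesKey: 'href']])
              ifTrue: [links add: (ent attributes at: 'href')]].
  links

Note that forAllSubentitiesDo: still visits every entity in the tree; the block just stops collecting once it has n links, which is one way to tell the loop "enough".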


