Posts tagged search

I am hosting sproutpics on my Dreamhost account, which is mostly unused. I have lots of extra disk space to play with, so I have written a LiveJournal aggregator. The basic concept is similar to SproutSearch: read an XML feed of blog information, store it, organize it, and display it. Since Dreamhost now offers PHP 5 hosting, I switched sproutpics to PHP 5 so I could use some of the new functionality it offers.
Instead of using my own HTML parser, as I did with SproutSearch, I used SimpleXML to parse LiveJournal's RSS feed. This worked very nicely: it takes a lot less code than SproutSearch, and it's no doubt quite a bit faster as well.
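To illustrate how little code a proper XML parser needs for this, here is a minimal sketch. It is written in Python with `xml.etree.ElementTree` as a stand-in for PHP's SimpleXML, so it is not the code running on sproutpics. The sample feed, the `parse_rss_items` name, and the choice of fields are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real LiveJournal feed would be fetched over HTTP.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example journal</title>
    <item>
      <title>First entry</title>
      <link>http://example.com/1.html</link>
      <description>Hello world</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.com/2.html</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> element in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is missing
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_FEED):
        print(entry["title"], entry["link"])
```

The parser does all the tokenizing, so the only custom logic left is deciding which child elements of each item to keep.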
You can see my modest page here:
http://www.sproutpics.com/livejournal.php

When I first created my blog search engine, it was a page within the SproutWorks website. I eventually bought sproutsearch.com and mirrored the content there, while also leaving it on SproutWorks. The traffic to SproutSearch was very low until around May, when I made the search engine organize the blogs into topic pages. The traffic to both sites climbed, with SproutWorks getting a lot of traffic (for me, anyway). At some point, SproutWorks traffic dropped considerably, while SproutSearch continued to climb. Now the two sites get about the same amount of traffic.
I am assuming that one or both of these sites has been penalized because they share the same content. So, as of today, I am listing only a few blogs in each topic on SproutWorks, while keeping the full content on SproutSearch. I am hoping that SproutSearch will gain at least as much traffic as SproutWorks loses.
I have started working on an RSS search engine to add to SproutSearch (http://www.sproutsearch.com). I used SimpleXML to parse the RSS, which makes the feeds really easy to read. The general idea is to store items from the feeds in a database, along with their tag information. Then I can generate a bunch of tag pages once I have gathered enough items.
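The store-then-generate idea can be sketched roughly as follows. This is a Python illustration using an in-memory SQLite database, not the actual SproutSearch schema. The table names, the function names, and the rule that a tag needs a minimum number of items before it gets a page are all assumptions for the example.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical two-table layout: feed items and their tags."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "id INTEGER PRIMARY KEY, title TEXT, link TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS item_tags ("
        "item_id INTEGER, tag TEXT, UNIQUE(item_id, tag))"
    )


def store_item(conn, title, link, tags):
    """Insert an item once (keyed by link) and attach normalized tags."""
    conn.execute(
        "INSERT OR IGNORE INTO items (title, link) VALUES (?, ?)", (title, link)
    )
    item_id = conn.execute(
        "SELECT id FROM items WHERE link = ?", (link,)
    ).fetchone()[0]
    for tag in tags:
        conn.execute(
            "INSERT OR IGNORE INTO item_tags (item_id, tag) VALUES (?, ?)",
            (item_id, tag.strip().lower()),
        )
    return item_id


def tag_page(conn, tag):
    """Return (title, link) rows for one tag, oldest insert first."""
    return conn.execute(
        "SELECT items.title, items.link FROM items "
        "JOIN item_tags ON item_tags.item_id = items.id "
        "WHERE item_tags.tag = ? ORDER BY items.id",
        (tag.strip().lower(),),
    ).fetchall()


def tags_with_enough_items(conn, minimum):
    """List tags that have at least `minimum` items, i.e. worth a page."""
    rows = conn.execute(
        "SELECT tag, COUNT(*) FROM item_tags "
        "GROUP BY tag HAVING COUNT(*) >= ? ORDER BY tag",
        (minimum,),
    ).fetchall()
    return [tag for tag, _count in rows]
```

Keeping tags in their own table means a tag page is a single join, and re-reading the same feed does not create duplicate rows because the link and the (item, tag) pair are both unique.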
The tags will have their own pages on SproutSearch, but I am also thinking of combining all the different sources into one page. A topic page might then have some listings from Blogger, some from LiveJournal, and some from RSS. As I add more data sources, those can be integrated as well. Each topic page would also link to a more detailed listing for each data source.
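One way to picture the combined topic page is as a small grouping step: take listings from every source, keep a short preview per source, and attach a link to the full listing. The Python sketch below is only an illustration of that idea. The `build_topic_page` function, the dictionary fields, and the `/topic/...` URL pattern are hypothetical and do not describe how SproutSearch is built.

```python
from collections import defaultdict


def build_topic_page(listings, topic, per_source=2):
    """Group listings for one topic by source, with a short preview each."""
    by_source = defaultdict(list)
    for entry in listings:
        if entry["topic"] == topic:
            by_source[entry["source"]].append(entry["title"])

    page = []
    for source in sorted(by_source):
        titles = by_source[source]
        page.append({
            "source": source,
            "preview": titles[:per_source],  # the few listings shown inline
            "total": len(titles),
            # Hypothetical URL pattern for the detailed per-source listing
            "more_url": "/topic/%s/%s" % (topic, source.lower()),
        })
    return page


if __name__ == "__main__":
    sample = [
        {"source": "Blogger", "topic": "php", "title": "b1"},
        {"source": "LiveJournal", "topic": "php", "title": "l1"},
        {"source": "Blogger", "topic": "php", "title": "b2"},
    ]
    for section in build_topic_page(sample, "php"):
        print(section["source"], section["preview"], section["more_url"])
```

Because sources are discovered from the data rather than hard-coded, a new data source shows up on the topic page as soon as its listings are stored.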
My blog and RSS feed search engine has almost reached the point of indexing 5 million blogs from Blogger.
I have implemented a new rating system. When you click on a blog, please vote on it.
http://www.sproutsearch.com

I have just signed up with the search engine submission service www.blastengine.com. They submit to over 1 million search engines. I used this service a while ago, and it resulted in more traffic for this site.