Monday 2 May 2016

Web Crawler:

In this project I crawled the website http://lums.edu.pk/ to a depth of 10 pages from the starting page. I developed the assignment in two parts:

- The first is URLsFetcher, which fetches all the URLs and their HTML content and saves the data in two separate files on the D drive. A URL and its content share the same index in their respective files (I used this basic form of indexing to make the search faster).

- The second is the application interface, in which the user enters a query to search. It parses the content txt files to get the results and saves the searched queries persistently (you can reopen the application and all the previously searched terms will appear in a suggestive drop-down using auto-complete). The txt files can be updated by running URLsFetcher again, and the web crawler application will work on the updated files as well. I used multi-threading for searching and for writing the content.
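The two parts above can be sketched roughly as follows. This is not the project's C# code. It is a minimal Python illustration of the same idea, assuming one record per line so that line i of the URL file and line i of the content file describe the same page. The function names, the file names, and the one-line-per-page layout are all my assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_index(pages, urls_path, content_path):
    """Save (url, html) pairs into two files that share a line index."""
    with open(urls_path, "w", encoding="utf-8") as urls, \
         open(content_path, "w", encoding="utf-8") as content:
        for url, html in pages:
            urls.write(url + "\n")
            # Collapse whitespace so every page occupies exactly one line.
            content.write(" ".join(html.split()) + "\n")


def search(query, urls_path, content_path, workers=4):
    """Return the URLs whose content line contains the query (case-insensitive)."""
    urls = Path(urls_path).read_text(encoding="utf-8").splitlines()
    lines = Path(content_path).read_text(encoding="utf-8").splitlines()
    needle = query.lower()
    # A thread pool mirrors the multi-threaded search described above.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = list(pool.map(lambda line: needle in line.lower(), lines))
    # The shared index maps each matching content line back to its URL.
    return [url for url, hit in zip(urls, hits) if hit]


def remember_query(query, history_path):
    """Persist a searched term once, and return the full saved history."""
    path = Path(history_path)
    seen = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    if query not in seen:
        seen.append(query)
        path.write_text("\n".join(seen) + "\n", encoding="utf-8")
    return seen
```

Calling write_index once and then search on the same two files returns the matching URLs, and the list returned by remember_query is what an auto-complete drop-down could be filled from when the application is reopened.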

PS: I have attached a screenshot of the app, the code for both programs, and all the txt files. Technology and IDE used: C#, Windows Forms Application, VS2012.

PDF: https://goo.gl/n6vbS3
Project Files: https://goo.gl/fyrXd1
