In this article, I will show you how to create a website url crawler using asp.net c#. you can crawl web pages and extract data from a website by inputs the url. It requests the web page and then get response the data programmatically.
The piece of following code written in c# language helps to crawl web page and extract data from website.
Html Design Code:
Create an asp.net web application and right click on the application and create a new web form and name it as CrawlData.aspx. Copy and paste the following design code on it.
<form id="form1" runat="server" style="text-align: center">
<div style="border: 1px solid #DED8D8; width: 750px; height: 550px; font-family: Arial;">
<h2>Crawl website URL</h2>
<asp:TextBox ID="txtUrl" runat="server"></asp:TextBox>
<asp:Button ID="btnCrawl" Text="Crawl" runat="server" OnClick="btnCrawl_Click" Style="height: 26px" />
<br />
<br />
<iframe style="width: 750px; height: 100%;" id="irm1" src="CrawlData/new.html" runat="server"></iframe>
</div>
</form>
Code behind:
protected void btnCrawl_Click(object sender, EventArgs e)
{
string url = txtUrl.Text;
WebRequest request = WebRequest.Create(url);
string path = Server.MapPath("~/CrawlData/");
using (WebResponse response = request.GetResponse())
{
using (StreamReader responseReader =
new StreamReader(response.GetResponseStream()))
{
string responseData =responseReader.ReadToEnd();
using (StreamWriter writer=
new StreamWriter(path+ "new.html"))
{
writer.Write(responseData);
}
}
}
irm1.Src = "CrawlData/new.html";
}
}
Description: Run the application and enter the page url you want to crawl. Click the “crawl” button, It request the web page and crawl data from website and save it in the project folder “CrawlData”. Here I have entered this url "https://www.google.co.in" , it crawls the web page and displayed on the iframe.
Post your comments / questions
Recent Article
- ImportError: cannot import name 'url' from 'django.conf.urls' - Django Error
- How to Enable Virtualization in BIOS Security Settings in Intel Processors For Android Studio?
- Dependency 'androidx.activity:activity:1.8.0' requires libraries and applications that depend on it.
- AttributeError: 'NoneType' object has no attribute 'get_text' - Python
- ModuleNotFoundError: No module named 'openpyxl' - Python
- How to get thumbnail from vimeo video URL in Python?
- Remove all special characters, punctuation except spaces from string - Python
- OSError: cannot write mode RGBA as JPEG- Python
Related Article