Class HtmlWeb

Namespace
HtmlAgilityPack
Assembly
HtmlAgilityPack.dll

A utility class for getting HTML documents over HTTP.

public class HtmlWeb
Inheritance
HtmlWeb

Constructors

HtmlWeb()

public HtmlWeb()

Fields

PostResponse

Occurs after an HTTP request has been executed.

public HtmlWeb.PostResponseHandler PostResponse

Field Value

HtmlWeb.PostResponseHandler

PreHandleDocument

Occurs before an HTML document is handled.

public HtmlWeb.PreHandleDocumentHandler PreHandleDocument

Field Value

HtmlWeb.PreHandleDocumentHandler

PreRequest

Occurs before an HTTP request is executed.

public HtmlWeb.PreRequestHandler PreRequest

Field Value

HtmlWeb.PreRequestHandler
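
The three handler fields above can be assigned like ordinary delegates. The following is a minimal sketch, not a definitive usage pattern. The delegate shapes are not stated on this page and are assumptions to verify against the delegate documentation: `PreRequestHandler` is assumed to take an `HttpWebRequest` and return a `bool` (with `true` letting the request proceed), `PostResponseHandler` is assumed to take the request and the response, and `PreHandleDocumentHandler` is assumed to take the `HtmlDocument`. `HtmlDocument.OptionFixNestedTags` comes from elsewhere in the library, and the URL and the class name `HandlerExample` are placeholders.

```csharp
using System;
using HtmlAgilityPack;

public static class HandlerExample
{
    // Builds an HtmlWeb with the three hook fields assigned.
    public static HtmlWeb CreateWebWithHooks()
    {
        var web = new HtmlWeb();

        // Assumed delegate shape: bool PreRequestHandler(HttpWebRequest request).
        // Returning true is assumed to let the request go ahead.
        web.PreRequest = request =>
        {
            request.Headers.Add("Accept-Language", "en");
            return true;
        };

        // Assumed delegate shape: void PostResponseHandler(HttpWebRequest, HttpWebResponse).
        web.PostResponse = (request, response) =>
            Console.WriteLine($"{(int)response.StatusCode} {response.ResponseUri}");

        // Assumed delegate shape: void PreHandleDocumentHandler(HtmlDocument document).
        web.PreHandleDocument = document => document.OptionFixNestedTags = true;

        return web;
    }

    public static void Main()
    {
        HtmlWeb web = CreateWebWithHooks();
        HtmlDocument doc = web.Load("https://example.com/"); // placeholder URL
        Console.WriteLine(doc.DocumentNode.OuterHtml.Length);
    }
}
```

Because these members are fields rather than events, assigning with `=` replaces any handler that was set earlier; use `+=` if several handlers should run.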

Properties

AutoDetectEncoding

Gets or sets a value indicating whether the document encoding must be detected automatically.

public bool AutoDetectEncoding { get; set; }

Property Value

bool

AutomaticDecompression

Gets or sets the decompression methods applied automatically to the response.

public DecompressionMethods AutomaticDecompression { get; set; }

Property Value

DecompressionMethods

The automatic decompression.

CacheOnly

Gets or sets a value indicating whether to get the document only from the cache. If this is set to true and the document is not found in the cache, nothing will be loaded.

public bool CacheOnly { get; set; }

Property Value

bool

CachePath

Gets or sets the cache path. If null, no caching mechanism will be used.

public string CachePath { get; set; }

Property Value

string

CaptureRedirect

Gets or sets a value indicating whether a redirect should be captured instead of the current location.

public bool CaptureRedirect { get; set; }

Property Value

bool

true if redirects are captured; otherwise, false.

FromCache

Gets a value indicating whether the last document was retrieved from the cache.

public bool FromCache { get; }

Property Value

bool

MaxAutoRedirects

Gets or sets the maximum number of redirects that will be followed. To disable redirects, do not set this value to 0; set CaptureRedirect to true instead.

public int? MaxAutoRedirects { get; set; }

Property Value

int?

Must be greater than 0.

OverrideEncoding

Gets or sets the Encoding used to override the encoding of the response stream for any web request.

public Encoding OverrideEncoding { get; set; }

Property Value

Encoding

RequestDuration

Gets the last request duration in milliseconds.

public int RequestDuration { get; }

Property Value

int

ResponseUri

Gets the URI of the Internet resource that actually responded to the request.

public Uri ResponseUri { get; }

Property Value

Uri

StatusCode

Gets the last request status.

public HttpStatusCode StatusCode { get; }

Property Value

HttpStatusCode

StreamBufferSize

Gets or sets the size of the buffer used for memory operations.

public int StreamBufferSize { get; set; }

Property Value

int

Timeout

Gets or sets the timeout value in milliseconds. The value must be greater than zero, or -1 for an infinite timeout.

public int Timeout { get; set; }

Property Value

int

UseCookies

Gets or sets a value indicating whether cookies will be stored.

public bool UseCookies { get; set; }

Property Value

bool

UserAgent

Gets or sets the User-Agent HTTP 1.1 header sent with any web request.

public string UserAgent { get; set; }

Property Value

string

UsingCache

Gets or sets a value indicating whether the caching mechanisms should be used.

public bool UsingCache { get; set; }

Property Value

bool

UsingCacheIfExists

Gets or sets a value indicating whether to get the document from the cache if it exists, otherwise from the web.

public bool UsingCacheIfExists { get; set; }

Property Value

bool
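
As an illustration of how the settable properties above combine with the read-only ones, here is a hedged sketch that configures an instance, loads a page, and then reads back the request details. The specific values (user agent string, timeout, redirect limit), the URL, and the class name `ConfigExample` are made up for the example.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class ConfigExample
{
    // Returns an HtmlWeb configured with illustrative values.
    public static HtmlWeb CreateConfiguredWeb()
    {
        return new HtmlWeb
        {
            UserAgent = "ExampleBot/1.0",          // hypothetical user agent string
            Timeout = 10000,                       // milliseconds
            UseCookies = true,
            AutoDetectEncoding = true,
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
            MaxAutoRedirects = 5                   // must be greater than 0
        };
    }

    public static void Main()
    {
        HtmlWeb web = CreateConfiguredWeb();
        HtmlDocument doc = web.Load("https://example.com/"); // placeholder URL

        // These read-only properties describe the last request.
        Console.WriteLine($"Status:       {web.StatusCode}");
        Console.WriteLine($"Responded by: {web.ResponseUri}");
        Console.WriteLine($"Duration:     {web.RequestDuration} ms");
        Console.WriteLine($"From cache:   {web.FromCache}");
        Console.WriteLine($"Document has {doc.DocumentNode.ChildNodes.Count} top-level nodes");
    }
}
```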

Methods

CreateInstance(string, string, XsltArgumentList, Type)

Creates an instance of the given type from the specified Internet resource.

public object CreateInstance(string htmlUrl, string xsltUrl, XsltArgumentList xsltArgs, Type type)

Parameters

htmlUrl string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

xsltUrl string

The URL that specifies the XSLT stylesheet to load.

xsltArgs XsltArgumentList

A System.Xml.Xsl.XsltArgumentList containing the namespace-qualified arguments used as input to the transform.

type Type

The requested type.

Returns

object

A newly created instance.

CreateInstance(string, string, XsltArgumentList, Type, string)

Creates an instance of the given type from the specified Internet resource.

public object CreateInstance(string htmlUrl, string xsltUrl, XsltArgumentList xsltArgs, Type type, string xmlPath)

Parameters

htmlUrl string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

xsltUrl string

The URL that specifies the XSLT stylesheet to load.

xsltArgs XsltArgumentList

A System.Xml.Xsl.XsltArgumentList containing the namespace-qualified arguments used as input to the transform.

type Type

The requested type.

xmlPath string

A file path where the temporary XML before transformation will be saved. Mostly used for debugging purposes.

Returns

object

A newly created instance.

CreateInstance(string, Type)

Creates an instance of the given type from the specified Internet resource.

public object CreateInstance(string url, Type type)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

type Type

The requested type.

Returns

object

A newly created instance.

Get(string, string)

Gets an HTML document from an Internet resource and saves it to the specified file.

public void Get(string url, string path)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

path string

The location of the file where you want to save the document.

Get(string, string, WebProxy, NetworkCredential)

Gets an HTML document from an Internet resource and saves it to the specified file. This overload is proxy aware.

public void Get(string url, string path, WebProxy proxy, NetworkCredential credentials)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

path string

The location of the file where you want to save the document.

proxy WebProxy

Proxy to use with this request.

credentials NetworkCredential

Credentials to use when authenticating.

Get(string, string, WebProxy, NetworkCredential, string)

Gets an HTML document from an Internet resource and saves it to the specified file. This overload is proxy aware.

public void Get(string url, string path, WebProxy proxy, NetworkCredential credentials, string method)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

path string

The location of the file where you want to save the document.

proxy WebProxy

Proxy to use with this request.

credentials NetworkCredential

Credentials to use when authenticating.

method string

The HTTP method used to open the connection, such as GET, POST, PUT, or PROPFIND.

Get(string, string, string)

Gets an HTML document from an Internet resource and saves it to the specified file.

public void Get(string url, string path, string method)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

path string

The location of the file where you want to save the document.

method string

The HTTP method used to open the connection, such as GET, POST, PUT, or PROPFIND.
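
To illustrate the `Get` overloads above, the sketch below saves a page to a file in the temporary directory, first with the two-argument overload and then with the HTTP method given explicitly. The URL, the file name, and the helper `BuildTargetPath` are hypothetical and exist only for this example.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class GetExample
{
    // Hypothetical helper: picks a file in the temp directory to save into.
    public static string BuildTargetPath(string fileName)
    {
        return Path.Combine(Path.GetTempPath(), fileName);
    }

    public static void Main()
    {
        var web = new HtmlWeb();
        string path = BuildTargetPath("example-page.html");

        // Two-argument overload: fetch the URL and save the document to the file.
        web.Get("https://example.com/", path);

        // Three-argument overload: same, with the HTTP method given explicitly.
        web.Get("https://example.com/", path, "GET");

        Console.WriteLine($"Saved {new FileInfo(path).Length} bytes to {path}");
    }
}
```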

GetCachePath(Uri)

Gets the cache file path for a specified URL.

public string GetCachePath(Uri uri)

Parameters

uri Uri

The URL for which to retrieve the cache path. May not be null.

Returns

string

The cache file path.
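
The caching properties and `GetCachePath` can be tried together without making a request. This is a sketch under assumptions: it sets `CachePath` before `UsingCache` on the assumption that the cache cannot be enabled without a path, and it assumes `GetCachePath` only computes a path and does not touch the network. The directory name, the URL, and the class name `CacheExample` are placeholders.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class CacheExample
{
    // Returns an HtmlWeb that caches documents under the given directory.
    public static HtmlWeb CreateCachingWeb(string cacheDirectory)
    {
        var web = new HtmlWeb();
        web.CachePath = cacheDirectory; // set the path first (assumed to be required)
        web.UsingCache = true;          // then switch caching on
        return web;
    }

    public static void Main()
    {
        string directory = Path.Combine(Path.GetTempPath(), "hap-cache"); // hypothetical location
        Directory.CreateDirectory(directory);

        HtmlWeb web = CreateCachingWeb(directory);
        var uri = new Uri("https://example.com/docs/index.html"); // placeholder URL

        // Shows where the document for this URL would be stored.
        Console.WriteLine(web.GetCachePath(uri));
    }
}
```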

GetContentTypeForExtension(string, string)

Gets the MIME content type for a given path extension.

public static string GetContentTypeForExtension(string extension, string def)

Parameters

extension string

The input path extension.

def string

The default content type to return if any error occurs.

Returns

string

The path extension's MIME content type.

GetExtensionForContentType(string, string)

Gets the path extension for a given MIME content type.

public static string GetExtensionForContentType(string contentType, string def)

Parameters

contentType string

The input MIME content type.

def string

The default path extension to return if any error occurs.

Returns

string

The MIME content type's path extension.
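
The two static lookups above can be called without creating an `HtmlWeb` instance. A minimal sketch follows; the fallback values are arbitrary, and no particular result is shown because the mapping may depend on the platform and library version.

```csharp
using System;
using HtmlAgilityPack;

public static class MimeLookupExample
{
    // Looks up a content type, falling back to a generic binary type.
    public static string ContentTypeFor(string extension)
    {
        return HtmlWeb.GetContentTypeForExtension(extension, "application/octet-stream");
    }

    // Looks up a path extension, falling back to ".bin" (arbitrary choice).
    public static string ExtensionFor(string contentType)
    {
        return HtmlWeb.GetExtensionForContentType(contentType, ".bin");
    }

    public static void Main()
    {
        Console.WriteLine(ContentTypeFor(".html"));
        Console.WriteLine(ExtensionFor("text/html"));
    }
}
```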

Load(string)

Gets an HTML document from an Internet resource.

public HtmlDocument Load(string url)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

Returns

HtmlDocument

A new HTML document.
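
A typical use of `Load(string)` is to fetch a page and then query the returned document. The sketch below reads the page title. It relies on `HtmlDocument.DocumentNode`, `SelectSingleNode`, and `InnerText`, which belong to other classes in the library and are not described on this page; the URL and the helper `GetTitle` are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class LoadExample
{
    // Returns the trimmed text of the first <title> element, or null if there is none.
    public static string GetTitle(HtmlDocument document)
    {
        HtmlNode title = document.DocumentNode.SelectSingleNode("//title");
        return title?.InnerText.Trim();
    }

    public static void Main()
    {
        var web = new HtmlWeb();
        HtmlDocument document = web.Load("https://example.com/"); // placeholder URL

        Console.WriteLine($"Status: {web.StatusCode}");
        Console.WriteLine($"Title:  {GetTitle(document) ?? "(none)"}");
    }
}
```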

Load(string, string)

Loads an HTML document from an Internet resource.

public HtmlDocument Load(string url, string method)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

method string

The HTTP method used to open the connection, such as GET, POST, PUT, or PROPFIND.

Returns

HtmlDocument

A new HTML document.

Load(string, string, int, string, string)

Gets an HTML document from an Internet resource.

public HtmlDocument Load(string url, string proxyHost, int proxyPort, string userId, string password)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

proxyHost string

Host to use for the proxy.

proxyPort int

Port the proxy is on.

userId string

User ID for authentication.

password string

Password for authentication.

Returns

HtmlDocument

A new HTML document.

Load(string, string, WebProxy, NetworkCredential)

Loads an HTML document from an Internet resource.

public HtmlDocument Load(string url, string method, WebProxy proxy, NetworkCredential credentials)

Parameters

url string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

method string

The HTTP method used to open the connection, such as GET, POST, PUT, or PROPFIND.

proxy WebProxy

Proxy to use with this request.

credentials NetworkCredential

Credentials to use when authenticating.

Returns

HtmlDocument

A new HTML document.
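
For the proxy-aware overload above, the proxy and credentials are ordinary `System.Net` objects. The following sketch shows one way to build them and pass them in. The proxy host, port, user name, and password are placeholders, and `CreateProxy` is a hypothetical helper.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class ProxyLoadExample
{
    // Hypothetical helper: builds a WebProxy for the given host and port.
    public static WebProxy CreateProxy(string host, int port)
    {
        return new WebProxy(host, port);
    }

    public static void Main()
    {
        var web = new HtmlWeb();
        WebProxy proxy = CreateProxy("proxy.example.local", 8080);       // placeholder proxy
        var credentials = new NetworkCredential("user", "password");     // placeholder credentials

        // Load(string url, string method, WebProxy proxy, NetworkCredential credentials)
        HtmlDocument document = web.Load("https://example.com/", "GET", proxy, credentials);

        Console.WriteLine($"Status: {web.StatusCode}");
        Console.WriteLine($"Top-level nodes: {document.DocumentNode.ChildNodes.Count}");
    }
}
```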

Load(Uri)

Gets an HTML document from an Internet resource.

public HtmlDocument Load(Uri uri)

Parameters

uri Uri

The requested Uri, such as new Uri("http://Myserver/Mypath/Myfile.asp").

Returns

HtmlDocument

A new HTML document.

Load(Uri, string)

Loads an HTML document from an Internet resource.

public HtmlDocument Load(Uri uri, string method)

Parameters

uri Uri

The requested URL, such as new Uri("http://Myserver/Mypath/Myfile.asp").

method string

The HTTP method used to open the connection, such as GET, POST, PUT, or PROPFIND.

Returns

HtmlDocument

A new HTML document.

Load(Uri, string, int, string, string)

Gets an HTML document from an Internet resource.

public HtmlDocument Load(Uri uri, string proxyHost, int proxyPort, string userId, string password)

Parameters

uri Uri

The requested Uri, such as new Uri("http://Myserver/Mypath/Myfile.asp").

proxyHost string

Host to use for the proxy.

proxyPort int

Port the proxy is on.

userId string

User ID for authentication.

password string

Password for authentication.

Returns

HtmlDocument

A new HTML document.

Load(Uri, string, WebProxy, NetworkCredential)

Loads an HTML document from an Internet resource.

public HtmlDocument Load(Uri uri, string method, WebProxy proxy, NetworkCredential credentials)

Parameters

uri Uri

The requested Uri, such as new Uri("http://Myserver/Mypath/Myfile.asp").

method string

The HTTP method used to open the connection, such as GET, POST, PUT, or PROPFIND.

proxy WebProxy

Proxy to use with this request.

credentials NetworkCredential

Credentials to use when authenticating.

Returns

HtmlDocument

A new HTML document.

LoadHtmlAsXml(string, string, XsltArgumentList, XmlTextWriter)

Loads an HTML document from an Internet resource and saves it to the specified XmlTextWriter, after an XSLT transformation.

public void LoadHtmlAsXml(string htmlUrl, string xsltUrl, XsltArgumentList xsltArgs, XmlTextWriter writer)

Parameters

htmlUrl string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

xsltUrl string

The URL that specifies the XSLT stylesheet to load.

xsltArgs XsltArgumentList

An XsltArgumentList containing the namespace-qualified arguments used as input to the transform.

writer XmlTextWriter

The XmlTextWriter to which you want to save.

LoadHtmlAsXml(string, string, XsltArgumentList, XmlTextWriter, string)

Loads an HTML document from an Internet resource and saves it to the specified XmlTextWriter, after an XSLT transformation.

public void LoadHtmlAsXml(string htmlUrl, string xsltUrl, XsltArgumentList xsltArgs, XmlTextWriter writer, string xmlPath)

Parameters

htmlUrl string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp". May not be null.

xsltUrl string

The URL that specifies the XSLT stylesheet to load.

xsltArgs XsltArgumentList

An XsltArgumentList containing the namespace-qualified arguments used as input to the transform.

writer XmlTextWriter

The XmlTextWriter to which you want to save.

xmlPath string

A file path where the temporary XML before transformation will be saved. Mostly used for debugging purposes.

LoadHtmlAsXml(string, XmlTextWriter)

Loads an HTML document from an Internet resource and saves it to the specified XmlTextWriter.

public void LoadHtmlAsXml(string htmlUrl, XmlTextWriter writer)

Parameters

htmlUrl string

The requested URL, such as "http://Myserver/Mypath/Myfile.asp".

writer XmlTextWriter

The XmlTextWriter to which you want to save.
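
As a sketch of the simplest `LoadHtmlAsXml` overload, the example below writes a page as XML to a file through an `XmlTextWriter`. The URL, the output file name, and the helper `CreateWriter` are hypothetical, and the indentation setting is optional.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using HtmlAgilityPack;

public static class XmlExportExample
{
    // Hypothetical helper: opens an indented UTF-8 XmlTextWriter on the given file.
    public static XmlTextWriter CreateWriter(string path)
    {
        var writer = new XmlTextWriter(path, Encoding.UTF8);
        writer.Formatting = System.Xml.Formatting.Indented;
        return writer;
    }

    public static void Main()
    {
        var web = new HtmlWeb();
        string path = Path.Combine(Path.GetTempPath(), "example-page.xml");

        // Disposing the writer flushes and closes the output file.
        using (XmlTextWriter writer = CreateWriter(path))
        {
            web.LoadHtmlAsXml("https://example.com/", writer); // placeholder URL
        }

        Console.WriteLine($"Wrote {new FileInfo(path).Length} bytes to {path}");
    }
}
```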