Python PyQuery Module – Complete API & Commands

PyQuery is a Python module for parsing HTML. Its main appeal is that it exposes jQuery-style selectors and functions, so if you know a little jQuery you can easily pull the part of a web page you need out of its source.

How to Install PyQuery Module

Install the module with pip by running pip install pyquery.

Learn Python PyQuery Module

Now that the module is installed, we can use it. Let’s get the title of the site we’re connecting to.

As you can see, jQuery-style attribute methods work on the k object. We can also remove an attribute.

We removed the class attribute. We can also select an element inside another element, search for elements, and change the contents of what we find. Let’s do all of that at once.

For example, let’s collect the URLs of the links in the menu section of the site we’re connecting to.

As you can see, if you have a bit of jQuery knowledge, parsing HTML with this module is really easy.

pyquery — PyQuery complete API

class pyquery.pyquery.PyQuery(*args, **kwargs)

The main class

class Fn

Hook for defining custom functions (like jQuery.fn):

PyQuery.addClass(value)

Add a css class to elements:

PyQuery.after(value)

add value after nodes

PyQuery.append(value)

Append value to each node.

PyQuery.appendTo(value)

append nodes to value

PyQuery.base_url

Return the url of current html document or None if not available.

PyQuery.before(value)

insert value before nodes

PyQuery.children(selector=None)

Filter elements that are direct children of self using optional selector:

PyQuery.clone()

return a copy of nodes

PyQuery.closest(selector=None)

PyQuery.contents()

Return contents (with text nodes):

PyQuery.each(func)

Apply func to each node.

PyQuery.empty()

remove nodes content

PyQuery.encoding

return the xml encoding of the root element

PyQuery.end()

Break out of a level of traversal and return to the parent level.

PyQuery.eq(index)

Return PyQuery of only the element with the provided index:

PyQuery.extend(other)

Extend with another PyQuery object

PyQuery.filter(selector)

Filter elements in self using selector (string or function):

PyQuery.find(selector)

Find elements using selector traversing down from self:

PyQuery.hasClass(name)

Return True if element has class:

PyQuery.height(value=<NoDefault>)

set/get height of element

PyQuery.hide()

add display:none to elements style

PyQuery.html(value=<NoDefault>, **kwargs)

Get or set the html representation of sub nodes.

Get the HTML value:

Extra args are passed to lxml.etree.tostring:

Set the HTML value:

PyQuery.insertAfter(value)

insert nodes after value

PyQuery.insertBefore(value)

insert nodes before value

PyQuery.is_(selector)

Returns True if selector matches at least one current element, else False:

PyQuery.items(selector=None)

Iter over elements. Return PyQuery objects:

PyQuery.make_links_absolute(base_url=None)

Make all links absolute.

PyQuery.map(func)

Returns a new PyQuery after transforming current items with func.

func should take two arguments – ‘index’ and ‘element’. Elements can also be referred to as ‘this’ inside of func:

PyQuery.nextAll(selector=None)

PyQuery.not_(selector)

Return elements that don’t match the given selector:

PyQuery.outerHtml()

Get the html representation of the first selected element:

PyQuery.parents(selector=None)

PyQuery.prepend(value)

prepend value to nodes

PyQuery.prependTo(value)

prepend nodes to value

PyQuery.prevAll(selector=None)

PyQuery.remove(expr=<NoDefault>)

Remove nodes:

PyQuery.removeAttr(name)

Remove an attribute:

PyQuery.removeClass(value)

Remove a css class from elements:

PyQuery.remove_namespaces()

Remove all namespaces:

PyQuery.replaceAll(expr)

replace nodes by expr

PyQuery.replaceWith(value)

replace nodes by value

PyQuery.root

return the xml root element

PyQuery.show()

add display:block to elements style

PyQuery.siblings(selector=None)

PyQuery.text(value=<NoDefault>)

Get or set the text representation of sub nodes.

Get the text value:

Set the text value:

PyQuery.toggleClass(value)

Toggle a css class on elements

PyQuery.val(value=<NoDefault>)

Set the attribute value:

Get the attribute value:

PyQuery.width(value=<NoDefault>)

set/get width of element

PyQuery.wrap(value)

A string of HTML that will be created on the fly and wrapped around each target:

PyQuery.wrapAll(value)

Wrap all the elements in the matched set into a single wrapper element:

PyQuery.xhtml_to_html()

Remove xhtml namespace: