Id Or Class Are Simply.docx

Uploaded by: K Cor

May 2020


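Id and class are simply “attributes” of an HTML tag such as div. The id attribute should be unique within the document, whereas a class may be shared by many elements. This is akin to href being an attribute of an a tag. For example, <div class="you-are-classy"> is a div tag with a class attribute that has an assignment of “you-are-classy”.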

xpath = '//table'  This directs to all table elements within the entire HTML code.
xpath = '/html/body/div[2]//table'  This directs to all table elements within the second child div of the body.
xpath = '//span[@class="span-class"]'  This collects all span elements that have a class attribute equal to “span-class”. We could substitute span with div or any other tag.

The asterisk (*) is a wildcard character. For example, xpath = '/html/body/*' will lead to all child elements within the body, regardless of tag.

xpath = '//p[@class="class-1"]'  This directs to all paragraph elements with a class attribute equal to class-1.
xpath = '//*[@id="uid"]'  The wildcard matches any element, whatever its tag, that has an id attribute equal to uid.
xpath = '//div[@id="uid"]/p[2]'  A step further from above: the second paragraph child of any div element that has an id attribute equal to uid.

Contains function: contains(@attr-name, "string-expr")
xpath = '//*[contains(@class,"class-1")]'  This expression chooses all elements whose class attribute contains the substring “class-1”. Matches may include “class-1”, “class-1 class-2”, or “class-12”, for example.

xpath = '/html/body/div/p[2]/@class'  Using xpath = '/html/body/div/p[2]' alone directs us to the paragraph element itself. Appending /@class returns the value of that element's class attribute instead.
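
The path expressions above can be tried without Scrapy. The following is a minimal sketch using Python's standard-library xml.etree.ElementTree, which understands only a subset of XPath: it has no contains() and no trailing /@attr step, and paths must be written relative to an element (starting with a period). It is therefore a stand-in for illustration, not the Scrapy API. The sample markup, its class and id values, and all variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (ElementTree needs valid XML).
html = """
<html>
  <body>
    <div id="uid">
      <p class="class-1">First</p>
      <p class="class-2">Second</p>
    </div>
    <div>
      <span class="span-class">A span</span>
      <table><tr><td>cell</td></tr></table>
    </div>
  </body>
</html>
"""

root = ET.fromstring(html)  # root is the <html> element

# '//table': every table anywhere below the root.
all_tables = root.findall(".//table")

# '/html/body/div[2]//table': tables inside the second div child of body.
tables_in_second_div = root.findall("./body/div[2]//table")

# '//span[@class="span-class"]': spans whose class attribute equals the string.
spans = root.findall(".//span[@class='span-class']")

# '/html/body/*': every child of body, whatever its tag.
body_children = root.findall("./body/*")

# '//*[@id="uid"]': any element with id "uid".
uid_elements = root.findall(".//*[@id='uid']")

# '//div[@id="uid"]/p[2]': second p child of the div with id "uid".
second_p = root.find(".//div[@id='uid']/p[2]")

# '/html/body/div/p[2]/@class': ElementTree has no attribute step,
# so the attribute is read from the matched element instead.
second_p_class = root.find("./body/div/p[2]").get("class")
```

With a Scrapy selector the same expressions would be passed to sel.xpath() unchanged, including the leading slash and the /@class step.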

Selecting Selectors
An HTML string is passed to Scrapy's Selector class to create a selector object. An example setup is as follows:
from scrapy import Selector
sel = Selector(text=html)
Here “html” is a previously defined string. Consider “sel” as having selected the entire HTML document.

The xpath selector method creates new selector objects. An example: sel.xpath('//p') selects all paragraphs from our running example (the HTML text). The output will be a selector list of two selector objects, such as:
[<Selector xpath='//p' data='<p>Hello World!</p>'>, <Selector xpath='//p' data='<p>Enjoy DataCamp!</p>'>]

ps = sel.xpath('//p') creates the selector list, with all paragraph selector objects contained within ps.
second_p = ps[1] chooses the second selector object from that list (indexing starts at 0).
second_p.extract() applies the extract method to that single selector. A selector holds only one piece of data, so the output may look like:
out: '<p>Enjoy DataCamp!</p>'

XPath Chaining
Using the selector (assuming the name is “sel” in this case), you can chain xpath calls to produce the same results. For example:
sel.xpath('/html/body/div[2]')
sel.xpath('/html').xpath('./body/div[2]')
sel.xpath('/html').xpath('./body').xpath('./div[2]')
These all produce the same results. Make certain to “glue” the calls together with the period that comes before the forward slash of each subsequent chain; the period makes the path relative to the elements already selected.
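
The idea behind chaining, that each relative step is applied to whatever the previous step selected, can be sketched with the standard library. This is an illustration only: it uses xml.etree.ElementTree rather than Scrapy, and the helper name chain and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page with two div children under body.
html = "<html><body><div>one</div><div>two</div></body></html>"
root = ET.fromstring(html)  # the <html> element


def chain(elements, *steps):
    """Apply each relative path to every element matched by the previous step."""
    for step in steps:
        elements = [match for element in elements for match in element.findall(step)]
    return elements


# One long path from the root ...
direct = root.findall("./body/div[2]")

# ... or the same walk split into two or three relative steps.
two_steps = chain([root], "./body/div[2]")
three_steps = chain([root], "./body", "./div[2]")
```

All three variables end up holding the same single div element, which mirrors the three equivalent sel.xpath(...) chains above.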

HTML text to Selector
We eventually need to get a webpage’s HTML code. This can be accomplished with the requests.get method.
Import the Python library “requests”.
Create a string identifying the url. For example: url = 'https://www.datacamp.com/courses/all'
Create the html variable by then passing the url to requests.get. For example:

html = requests.get( url ).content
The above puts the HTML contents from https://www.datacamp.com/courses/all into a variable called “html”. Note that .content holds the raw bytes of the response; requests also exposes the decoded string as .text.
sel = Selector( text = html )
The above passes the content of the HTML source to the selector.

CSS LOCATOR
CSS Locator notation is like XPath, with the following substitutions.

/ replaced by > (except first character)
XPath: /html/body/div
CSS Locator: html > body > div
Each of the two examples above moves forward one generation at a time on the HTML tree.

// replaced by a blank space (except first character)
XPath: //div/span//p
CSS Locator: div > span p
From the two examples immediately above, the double forward slash in XPath is equivalent to a blank space in CSS Locator notation. Both perform the task of looking forward to all generations.

[N] replaced by :nth-of-type(N)
XPath: //div/p[2]
CSS Locator: div > p:nth-of-type(2)

The two following locators are equivalent:
xpath = '/html/body//div/p[2]'
css = 'html > body div > p:nth-of-type(2)'
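
The three substitution rules can be written out as a small translator. This is a toy sketch, not a general CSS-to-XPath converter: the function name css_to_xpath is hypothetical, and it handles only tag names joined by >, blank spaces, and :nth-of-type(N). Because a CSS locator matches anywhere in the document, the sketch always starts the result with a double slash; for a path beginning at html that selects the same elements as the single-slash form.

```python
import re


def css_to_xpath(css: str) -> str:
    """Translate a simple CSS locator into XPath using the three rules above."""
    # Remove spaces around '>' so that only descendant spaces remain.
    css = re.sub(r"\s*>\s*", ">", css.strip())
    # Collapse any remaining runs of whitespace to a single space.
    css = re.sub(r"\s+", " ", css)
    # '>' becomes '/', a blank space becomes '//'.
    xpath = "//" + css.replace(">", "/").replace(" ", "//")
    # ':nth-of-type(N)' becomes '[N]'.
    return re.sub(r":nth-of-type\((\d+)\)", r"[\1]", xpath)


examples = {
    css: css_to_xpath(css)
    for css in [
        "html > body > div",
        "div > span p",
        "div > p:nth-of-type(2)",
        "html > body div > p:nth-of-type(2)",
    ]
}
```

For instance, css_to_xpath("div > span p") gives '//div/span//p', matching the pair shown above.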

To find an element by class in CSS, use a period. For example: p.class-1 selects all paragraph elements belonging to class-1

To find an element by id, use a pound (#) sign. For example: div#uid selects the div element with id equal to uid

Usage example: css_locator = 'div#uid > p.class1'
The above line first navigates to the div element whose id is uid and then to its child paragraph elements whose class is class1.
An alternative to the above: css_locator = '.class1'
This directs to all elements in the HTML document that belong to class1, even if they also belong to other classes. For example, it matches an element whose class attribute is “class1” as well as one whose class attribute lists class1 alongside another class (such as “class1 class2”).

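This is different from xpath = '//*[@class="class1"]', which forces an exact match of the whole class attribute. It is also different from xpath = '//*[contains(@class,"class1")]', which matches any class attribute containing the substring class1 (including, for example, “class12”).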
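
The difference between the three matching rules is easiest to see when they are applied to the same class attribute strings. The sketch below spells each rule out in plain Python; the function names and the sample strings are hypothetical, and no selector library is involved.

```python
def exact_match(class_attr: str, name: str) -> bool:
    """Like [@class="name"]: the whole attribute string must equal name."""
    return class_attr == name


def substring_match(class_attr: str, name: str) -> bool:
    """Like contains(@class, "name"): name may appear anywhere in the string."""
    return name in class_attr


def token_match(class_attr: str, name: str) -> bool:
    """Like the CSS locator .name: name must be one of the whitespace-separated classes."""
    return name in class_attr.split()


samples = ["class1", "class1 class2", "class12"]

# Maps each sample to (exact, substring, token) results for "class1".
results = {
    s: (exact_match(s, "class1"), substring_match(s, "class1"), token_match(s, "class1"))
    for s in samples
}
```

Only the CSS-style rule accepts “class1 class2” while rejecting “class12”.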

To find all the children of an element whose id is equal to “uid”: css_locator = "#uid > *" Remember to add the star so that the locator follows through to the child elements.

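To select an attribute value rather than the element itself:
XPath: <xpath-to-element>/@attr-name  xpath = '//div[@id="uid"]/a/@href'
CSS: <css-to-element>::attr(attr-name)  css_locator = 'div#uid > a::attr(href)'
The double colon followed by attr() selects the desired attribute (the name in between the parentheses); a web address in this case. Note that ::attr() is an extension provided by Scrapy's selectors, not part of standard CSS.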

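Text Extraction
In some instances, you may want to extract text from an element. For example, take a paragraph whose id is “p-example” and whose displayed text is:

Hello world! Try DataCamp today!

Here the word “DataCamp” sits inside a nested child element of the paragraph. We may just want to extract the text. To do this, we need to navigate to the paragraph whose id is equal to “p-example”.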

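sel.xpath('//p[@id="p-example"]/text()').extract()
The above line takes a Scrapy selector with an XPath that navigates to what we need. The single slash before text() restricts the result to text that sits directly inside the paragraph, so it extracts the following: ['\n Hello world!\n Try ', ' today!\n']

sel.xpath('//p[@id="p-example"]//text()').extract()
With a double slash, text() also reaches the text inside the paragraph's descendants, so the above results in: ['\n Hello world!\n Try ', 'DataCamp', ' today!\n']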
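
The difference between /text() and //text() can be reproduced with the standard library. This is a sketch under assumptions: the exact markup of the example paragraph is not preserved in these notes, so the <a> element wrapping “DataCamp” is a guess, and xml.etree.ElementTree stands in for the Scrapy selector.

```python
import xml.etree.ElementTree as ET

# Assumed markup for the example paragraph; the nested <a> tag is hypothetical.
html = '<p id="p-example">\n Hello world!\n Try <a>DataCamp</a> today!\n</p>'
p = ET.fromstring(html)

# Analogue of '/text()': only text nodes that are direct children of p.
# These are p's own leading text plus the tail text that follows each child.
direct_text = [t for t in [p.text] + [child.tail for child in p] if t]

# Analogue of '//text()': text from p and from all of its descendants, in order.
all_text = list(p.itertext())

# Joining the pieces and normalising whitespace yields the displayed sentence.
displayed = " ".join("".join(all_text).split())
```

Here direct_text skips the word inside the nested element, all_text includes it, and displayed reads 'Hello world! Try DataCamp today!'.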
