How to handle HTML entities in CSS locators and text

Praveen David Mathew
2 min read · Jan 23


What are entities on the web?

An HTML entity is a piece of text (“string”) that begins with an ampersand (&) and ends with a semicolon (;). Entities are frequently used to display reserved characters (which would otherwise be interpreted as HTML code), and invisible characters (like non-breaking spaces). You can also use them in place of other characters that are difficult to type with a standard keyboard.

Some examples would be &lt; for <, &gt; for >, &amp; for &, and &nbsp; for a non-breaking space.

Problem:

Imagine you have an element like:

<div data="Praveen&nbsp;Mathew">Praveen&nbsp;Mathew</div>

How do we locate this element using its data attribute and verify its text content?

  1. Can we use the CSS locator [data="Praveen&nbsp;Mathew"]?
  2. Can we use the CSS locator [data="Praveen Mathew"]?

Solution:

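Both of the locators above will fail to locate the element. When the browser parses the HTML, it decodes &nbsp; into a single non-breaking space character. The first locator fails because, inside a CSS selector, &nbsp; is treated as plain literal text and is never decoded. The second fails because a non-breaking space is not the same character as the regular space typed in the locator.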
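To see why both attempts miss, it can help to compare the characters involved. The following is a minimal sketch in plain JavaScript (it should run in Node or a browser console); the variable names are made up for illustration, and `parsedValue` simply stands in for what the browser would hold after decoding the attribute.

```javascript
// What the browser stores after decoding data="Praveen&nbsp;Mathew":
// the entity becomes one non-breaking space character (U+00A0).
const parsedValue = "Praveen\u00a0Mathew";

// What the two failing locators actually compare against.
const withEntityText = "Praveen&nbsp;Mathew"; // the literal characters & n b s p ;
const withPlainSpace = "Praveen Mathew";      // a regular space, U+0020

console.log(parsedValue === withEntityText); // false
console.log(parsedValue === withPlainSpace); // false

// Code point (in hex) of the character between the two names (index 7).
console.log(parsedValue.charCodeAt(7).toString(16));    // a0
console.log(withPlainSpace.charCodeAt(7).toString(16)); // 20
```

The two strings look identical on screen, which is why this kind of mismatch is easy to overlook.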

So how do we find such an element?

The answer is Unicode, the unicorn of the web.

We can use the Unicode equivalent of an entity to match it. The Unicode code point of the non-breaking space (&nbsp;) is U+00A0, which is written as \u00a0 in a JavaScript string.

So the correct locator would be: [data="Praveen\u00a0Mathew"]
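As a rough sketch of how this locator could be built and how the text check might look, here is a small plain JavaScript example. The helpers `attrSelector` and `normalizeSpaces` are hypothetical names invented for this illustration, not part of any library. Whether a given driver returns the non-breaking space as-is or as a regular space is an assumption that may vary, so the sketch normalizes it before comparing.

```javascript
const NBSP = "\u00a0"; // non-breaking space, the character &nbsp; decodes to

// Hypothetical helper: builds an exact-match attribute selector.
// It only escapes backslashes and double quotes in the value.
function attrSelector(name, value) {
  const escaped = value.replace(/["\\]/g, "\\$&");
  return `[${name}="${escaped}"]`;
}

const selector = attrSelector("data", `Praveen${NBSP}Mathew`);
console.log(selector === '[data="Praveen\u00a0Mathew"]'); // true

// Hypothetical helper for the text check: treat non-breaking spaces as
// ordinary spaces so the comparison does not depend on which one comes back.
function normalizeSpaces(text) {
  return text.replace(/\u00a0/g, " ");
}

console.log(normalizeSpaces(`Praveen${NBSP}Mathew`) === "Praveen Mathew"); // true

// In a WebdriverIO test this might be used roughly as follows (not run here):
//   const el = await $(selector);
//   const text = normalizeSpaces(await el.getText());
```

Keeping the non-breaking space in a named constant makes it visible in the test code, since the raw character is indistinguishable from a regular space in most editors.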

Note:

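  • The hex digits of a Unicode escape are case insensitive, so \u00A0 and \u00a0 are the same (but the ‘u’ itself must be lowercase: \u).
  • When checking for the element in the browser (for example, in the DevTools element search, which takes plain CSS rather than a JavaScript string), use the CSS escape \00a0 and skip the ‘u’.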
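The two notations in the notes above can be compared side by side. This is a minimal sketch in plain JavaScript; `toCssEscape` is a hypothetical helper written for this illustration. It appends a trailing space, which CSS treats as the end of the escape, so that a following character that happens to be a hex digit (0-9, a-f) is not absorbed into the escape.

```javascript
// The hex digits in a JavaScript escape are case insensitive.
console.log("\u00A0" === "\u00a0"); // true

// Hypothetical helper: CSS escape for a single character.
// CSS escapes are a backslash plus 1 to 6 hex digits; one optional
// whitespace character after the digits terminates the escape.
function toCssEscape(ch) {
  const hex = ch.codePointAt(0).toString(16).padStart(4, "0");
  return "\\" + hex + " ";
}

const nbsp = "\u00a0";
console.log(toCssEscape(nbsp) === "\\00a0 "); // true

// Selector text as it could be typed where plain CSS is expected.
const rawCssSelector = `[data="Praveen${toCssEscape(nbsp)}Mathew"]`;
console.log(rawCssSelector); // [data="Praveen\00a0 Mathew"]
```

In this particular value the trailing space is optional, because "M" is not a hex digit, so \00a0Mathew parses the same way.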


Praveen David Mathew

An open source advocate / WebdriverIO project committer / Postman Supernova / Postman-html-extra contributor / Stack Overflow SQA moderator / Speaker