ScraperSensei provides a powerful selector engine based on CSS selectors with additional capabilities. This page demonstrates common selector patterns with examples to help you effectively extract data from web pages.
Text Matching
Basic Text Matching
article:has-text("ScraperSensei")
Matches any <article> tag that contains the text “ScraperSensei”, even nested inside child elements.
<!-- Will match -->
<article>
<div>Welcome to ScraperSensei</div>
</article>
<!-- Will match -->
<article>ScraperSensei Documentation</article>
<!-- Will not match -->
<div>ScraperSensei</div>
Exact Text Matching
button:text-is("Log")
Matches elements whose text is exactly “Log”. Matching is case-sensitive and ignores leading and trailing whitespace.
<!-- Will match -->
<button> Log <span>in</span></button>
<!-- Will not match -->
<button>Log in</button>
<button>log</button>
<button>Login</button>
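The match and non-match cases above are consistent with exact matching being applied to an element's own text nodes rather than to its combined text. The Python sketch below illustrates that reading. It is an assumption inferred from the example, not ScraperSensei's actual implementation, and `own_text_nodes` and `text_is` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def own_text_nodes(elem):
    """Yield the text nodes that belong directly to elem, skipping child elements."""
    if elem.text:
        yield elem.text
    for child in elem:
        # Text that follows a child element still belongs to the parent.
        if child.tail:
            yield child.tail


def text_is(elem, expected):
    """Assumed semantics: one own text node equals `expected` after trimming."""
    return any(node.strip() == expected for node in own_text_nodes(elem))


samples = [
    "<button> Log <span>in</span></button>",
    "<button>Log in</button>",
    "<button>log</button>",
    "<button>Login</button>",
]
results = [text_is(ET.fromstring(markup), "Log") for markup in samples]
print(results)
```

Only the first button has a text node that trims to exactly "Log", which is why it is the only match under this reading.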
Text Pattern Matching
button:text-matches("Log\\s*in", "i")
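Matches text against a regular expression; the second argument supplies the regex flags. With the “i” (case-insensitive) flag, the example matches “Login”, “Log in”, “log IN”, and similar variants.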
<!-- All of these will match -->
<button>Login</button>
<button>Log in</button>
<button>LOG IN</button>
<button>log in</button>
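To check what a pattern like this accepts before putting it in a selector, it can help to try it in isolation. The sketch below uses Python's `re` module as a stand-in; it assumes the selector engine's regex flavor behaves the same way for this simple pattern, and the extra "Log out" sample is added here only for contrast.

```python
import re

# A raw string needs only one backslash; the doubled backslash in the
# selector is presumably string escaping (an assumption).
pattern = re.compile(r"Log\s*in", re.IGNORECASE)

texts = ["Login", "Log in", "LOG IN", "log in", "Log out"]
matched = [text for text in texts if pattern.search(text)]
print(matched)
```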
Layout-Based Selectors
Position-Based
input:right-of(:text("Username"))
Matches input fields that are positioned to the right of the text “Username”.
<!-- Will match -->
<div>
<label>Username</label>
<input type="text" />
</div>
<!-- Will not match -->
<div>
<input type="text" />
<label>Username</label>
</div>
button:near(.promo-card)
Matches buttons within 50 pixels of an element with class “promo-card”. 50 pixels is the distance used when none is specified.
<!-- Will match if button is within 50px of the div -->
<div class="promo-card">Special Offer!</div>
<button>Buy Now</button>
Distance Specification
button:near(:text("Username"), 120)
Matches buttons within 120 pixels of text “Username”
<!-- Will match if button is within 120px -->
<div>Username</div>
<button>Click me</button>
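One way to picture the distance check is as the gap between two rendered bounding boxes. The following Python sketch makes that concrete. The `Box` type, the `gap` and `near` helpers, and the choice of edge-to-edge Euclidean distance are all assumptions for illustration; the exact metric ScraperSensei uses is not stated on this page.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """Hypothetical bounding box in page pixels."""
    x: float
    y: float
    width: float
    height: float


def gap(a: Box, b: Box) -> float:
    """Shortest distance between the edges of two boxes (0 if they overlap)."""
    dx = max(b.x - (a.x + a.width), a.x - (b.x + b.width), 0)
    dy = max(b.y - (a.y + a.height), a.y - (b.y + b.height), 0)
    return hypot(dx, dy)


def near(a: Box, b: Box, max_distance: float = 50) -> bool:
    """True if the boxes are within max_distance pixels (default 50)."""
    return gap(a, b) <= max_distance


label = Box(x=0, y=0, width=100, height=20)
button = Box(x=0, y=100, width=80, height=30)

print(gap(button, label))        # vertical gap between label bottom and button top
print(near(button, label))       # default 50 px threshold
print(near(button, label, 120))  # explicit 120 px threshold
```

With these made-up coordinates the button sits 80 pixels below the label, so it fails the default threshold but passes the 120 pixel one.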
Element State and Visibility
button:visible
Matches only visible button elements
<!-- Will match -->
<button>Visible Button</button>
<!-- Will not match -->
<button style="display: none">Hidden Button</button>
<button hidden>Hidden Button</button>
<button style="visibility: hidden">Hidden Button</button>
Nested Elements
Has-Text with Specific Elements
article:has-text("ScraperSensei")
Matches article elements containing “ScraperSensei” text anywhere inside
<!-- Will match -->
<article>
<h2>Getting Started with ScraperSensei</h2>
<p>Some content...</p>
</article>
<!-- Will match -->
<article>ScraperSensei Guide</article>
<!-- Will not match -->
<section>ScraperSensei Guide</section>
Parent-Child Relationships
#nav-bar :text("Home")
Matches elements with text “Home” inside #nav-bar element
<!-- Will match -->
<nav id="nav-bar">
<a>Home</a>
<a>About</a>
</nav>
<!-- Will not match -->
<div id="content">
<a>Home</a>
</div>
Multiple Conditions
button:has-text("Log in"), button:has-text("Sign in")
Matches buttons containing either “Log in” or “Sign in” text
<!-- Both will match -->
<button>Log in</button>
<button>Sign in</button>
<!-- Will not match -->
<button>Register</button>
XPath Selectors
xpath=//button
Matches any button element anywhere in the document
<!-- All will match -->
<button>Click me</button>
<div><button>Nested button</button></div>
<form><button>Submit</button></form>
xpath=//div[@id='main']//button
Matches button elements at any depth inside the div with id=“main”
<!-- Will match -->
<div id="main">
<button>Inside main</button>
<div>
<button>Nested inside main</button>
</div>
</div>
<!-- Will not match -->
<div id="sidebar">
<button>Outside main</button>
</div>
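The two XPath expressions above can be tried outside the scraper with Python's standard library, which supports a limited XPath subset. This is only a sketch for experimenting with the expressions: the `<body>` wrapper is added so the snippet parses as a single XML document, and ElementTree paths are relative, so `//button` is written as `.//button`.

```python
import xml.etree.ElementTree as ET

page = ET.fromstring("""
<body>
  <div id="main">
    <button>Inside main</button>
    <div>
      <button>Nested inside main</button>
    </div>
  </div>
  <div id="sidebar">
    <button>Outside main</button>
  </div>
</body>
""")

# Equivalent of xpath=//button: every button at any depth.
all_buttons = [b.text for b in page.findall(".//button")]

# Equivalent of xpath=//div[@id='main']//button: buttons at any depth under #main.
main_buttons = [b.text for b in page.findall(".//div[@id='main']//button")]

print(all_buttons)
print(main_buttons)
```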
label:has-text("Password")
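Matches label elements containing the text “Password”. The matched label can then be used to target its associated input, whether the input is linked through the for attribute or nested inside the label.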
<!-- Will match -->
<label for="pwd">Password:</label>
<input id="pwd" type="password">
<!-- Will match -->
<label>
Password:
<input type="password">
</label>
Framework-Specific Selectors
React Components
_react=BookItem
Matches React components named BookItem
// Will match
<BookItem title="React Guide" />
// Will match
<BookItem author="John Doe" year={2023} />
_react=BookItem[author = "Steven King"]
Matches BookItem components with specific author prop
// Will match
<BookItem author="Steven King" />
// Will not match
<BookItem author="John Doe" />
Vue Components
_vue=book-item
Matches Vue components named book-item
<!-- Will match -->
<book-item></book-item>
<book-item title="Vue Guide"></book-item>
_vue=book-item[author = "Steven King"]
Matches book-item components with specific author prop
<!-- Will match -->
<book-item author="Steven King"></book-item>
<!-- Will not match -->
<book-item author="John Doe"></book-item>
Testing-Specific Attributes
data-testid=submit
Matches elements with data-testid=“submit”
<!-- Will match -->
<button data-testid="submit">Submit</button>
<input data-testid="submit" type="submit">
<!-- Will not match -->
<button data-testid="cancel">Cancel</button>
id=login-form
Matches elements with id=“login-form”
<!-- Will match -->
<form id="login-form">
<input type="text">
</form>
<!-- Will not match -->
<form id="signup-form">
<input type="text">
</form>
CSS Selectors
Basic CSS
css=button
Matches any button element using standard CSS selector
<!-- All will match -->
<button>Click me</button>
<button class="primary">Submit</button>
<button id="cancel">Cancel</button>
CSS with Text Matching
css=#nav-bar :text("Home")
Matches the smallest element containing the text “Home” inside #nav-bar
<!-- Will match the <a> element -->
<div id="nav-bar">
<a>Home</a>
<div>Welcome Home</div>
</div>
<!-- Will not match -->
<div id="content">
<a>Home</a>
</div>
CSS with Has Selector
article:has(div.promo)
Matches article elements that contain a div with class “promo”
<!-- Will match -->
<article>
<div class="promo">Special offer!</div>
</article>
<!-- Will not match -->
<article>
<div class="content">No promo here</div>
</article>
Nth Match Selectors
:nth-match(:text("Buy"), 3)
Matches the third element containing text “Buy”
<!-- Third button will match -->
<section> <button>Buy</button> </section>
<article><div> <button>Buy</button> </div></article>
<div><div> <button>Buy</button> </div></div>
Chaining Selectors
Basic Chaining
article >> .bar > .baz >> span[attr=value]
Chains multiple selectors, each queried relative to the previous match
<!-- Will match the span -->
<article>
<div class="bar">
<div class="baz">
<span attr="value">Target</span>
</div>
</div>
</article>
<!-- Will not match -->
<article>
<span attr="value">Wrong place</span>
</article>
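The idea that each chained selector is queried relative to the previous match can be sketched as a loop over scopes. The Python below is an illustration only: `chain` is a hypothetical helper, and the three ElementTree paths are hand-written stand-ins for `article`, `.bar > .baz`, and `span[attr=value]`, not output of the ScraperSensei engine.

```python
import xml.etree.ElementTree as ET


def chain(root, steps):
    """Run each step against the matches produced by the previous step."""
    current = [root]
    for step in steps:
        found, seen = [], set()
        for scope in current:
            for match in scope.findall(step):
                # Keep document order and avoid duplicates across scopes.
                if id(match) not in seen:
                    seen.add(id(match))
                    found.append(match)
        current = found
    return current


page = ET.fromstring("""
<body>
  <article>
    <div class="bar">
      <div class="baz">
        <span attr="value">Target</span>
      </div>
    </div>
  </article>
  <article>
    <span attr="value">Wrong place</span>
  </article>
</body>
""")

steps = [
    ".//article",                          # article
    ".//*[@class='bar']/*[@class='baz']",  # .bar > .baz
    ".//span[@attr='value']",              # span[attr=value]
]
texts = [span.text for span in chain(page, steps)]
print(texts)
```

The second article drops out at the middle step because it has no `.bar > .baz` inside it, so its span is never reached.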
*css=article >> text=Hello
The * prefix captures the article element instead of the text element
<!-- Will match the article element -->
<article>
<div>Hello</div>
<p>World</p>
</article>
<!-- Will not match -->
<section>
<div>Hello</div>
</section>
Layout Combinations
[type=radio]:left-of(:text("Label 3")):near(.form-group)
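Matches radio inputs that are to the left of the text “Label 3” and near an element with class “form-group”, combining a position condition with a proximity condition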
<!-- Will match if within proximity -->
<div class="form-group">
<input type="radio">
<label>Label 3</label>
</div>
<!-- Will not match if too far -->
<div class="form-group">
<input type="radio">
</div>
<div>
<label>Label 3</label>
</div>
Role-Based Selectors
[role="button"][aria-label="Submit"]
Matches elements that have both a role=“button” attribute and an aria-label=“Submit” attribute. Both attributes must be present explicitly.
<!-- Will match -->
<div role="button" aria-label="Submit">Click me</div>
<!-- Will not match -->
<div role="button">Submit</div>
<button aria-label="Submit">Click me</button>
Union Selectors
xpath=//span[contains(@class, 'spinner__loading')]|//div[@id='confirmation']
Matches elements that satisfy either condition
<!-- Both will match -->
<span class="spinner__loading"></span>
<div id="confirmation">Confirmed!</div>
<!-- Will not match -->
<span class="spinner">Loading...</span>
<div id="other">Not confirmation</div>
CSS Pseudo-Classes
Visibility Matching
button:visible
Only matches visible buttons, useful to distinguish between similar elements
<!-- Will match -->
<button>Visible button</button>
<!-- Will not match any of these -->
<button style="display: none">Invisible</button>
<button style="visibility: hidden">Hidden</button>
<button hidden>Hidden</button>
Text Content Matching
article:has-text("ScraperSensei")
Matches elements containing specified text somewhere inside
<!-- Will match -->
<article>
<div>Testing with ScraperSensei</div>
<p>Some other content</p>
</article>
<!-- Will not match -->
<div>ScraperSensei</div>
Multiple Text Conditions
button:has-text("Log in"), button:has-text("Sign in")
Matches elements that satisfy any of the text conditions
<!-- Both will match -->
<button>Log in</button>
<button>Sign in</button>
<!-- Will not match -->
<button>Register</button>
<a>Log in</a>
Element Containment
article:has(div.promo)
Matches article elements that contain an element matching div.promo at any depth, not only as a direct child
<!-- Will match -->
<article>
<div class="promo">Special offer!</div>
<p>Content</p>
</article>
<!-- Will not match -->
<article>
<div>No promo here</div>
</article>
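In standard CSS, `:has(div.promo)` looks at descendants at any depth. The Python sketch below mimics that check with the standard library to make the behavior visible. `has_descendant` is a hypothetical helper, and the `id` attributes and the third, more deeply nested article are added here purely for illustration.

```python
import xml.etree.ElementTree as ET


def has_descendant(elem, tag, css_class):
    """True if elem contains a `tag` element carrying `css_class`, at any depth."""
    return any(
        css_class in candidate.get("class", "").split()
        for candidate in elem.iter(tag)
        if candidate is not elem
    )


page = ET.fromstring("""
<body>
  <article id="a1"><div class="promo">Special offer!</div><p>Content</p></article>
  <article id="a2"><div>No promo here</div></article>
  <article id="a3"><section><div class="promo big">Nested offer</div></section></article>
</body>
""")

matches = [
    article.get("id")
    for article in page.iter("article")
    if has_descendant(article, "div", "promo")
]
print(matches)
```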
Nth Element Selection
:nth-match(:text("Buy"), 3)
Matches the nth occurrence of an element (1-based index)
<!-- Third "Buy" button will match -->
<div>
<button>Buy</button>
<button>Buy</button>
<button>Buy</button> <!-- This one matches -->
<button>Buy</button>
</div>
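Because the index is 1-based, the third match is the item at position 2 of a zero-based list. A minimal Python sketch of that bookkeeping follows; `nth_match` is a hypothetical helper and the `id` attributes are added only so the result can be identified.

```python
import xml.etree.ElementTree as ET


def nth_match(matches, n):
    """Return the n-th match using a 1-based index, or None if out of range."""
    if n < 1 or n > len(matches):
        return None
    return matches[n - 1]


page = ET.fromstring("""
<div>
  <button id="b1">Buy</button>
  <button id="b2">Buy</button>
  <button id="b3">Buy</button>
  <button id="b4">Buy</button>
</div>
""")

# Collect candidates in document order, then pick by 1-based position.
buy_buttons = [b for b in page.iter("button") if (b.text or "").strip() == "Buy"]
third = nth_match(buy_buttons, 3)
print(third.get("id"))
```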
Advanced Layout Selectors
Above/Below
button:above(.footer)
Matches buttons that are above the footer element
<!-- Will match if positioned above -->
<button>Click me</button>
<div class="content">Some content</div>
<footer class="footer">Footer content</footer>
input:below(.header)
Matches inputs that are below the header element
<!-- Will match if positioned below -->
<header class="header">Header content</header>
<div class="content">
<input type="text" /> <!-- This will match -->
</div>
Left/Right Positioning
button:right-of(.sidebar)
Matches buttons positioned to the right of the sidebar
<!-- Will match if positioned to the right -->
<div class="layout">
<div class="sidebar">Menu</div>
<button>Action</button> <!-- This will match -->
</div>
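The directional selectors depend on where elements are rendered, not on their order in the markup. One plausible reading is a comparison of bounding boxes, sketched below in Python. The `Box` type, the four predicates, the coordinates, and the rule that boxes must not overlap along the compared axis are all assumptions for illustration, not ScraperSensei's documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical bounding box in page pixels (origin at top-left)."""
    x: float
    y: float
    width: float
    height: float


def right_of(a: Box, b: Box) -> bool:
    return a.x >= b.x + b.width


def left_of(a: Box, b: Box) -> bool:
    return a.x + a.width <= b.x


def above(a: Box, b: Box) -> bool:
    return a.y + a.height <= b.y


def below(a: Box, b: Box) -> bool:
    return a.y >= b.y + b.height


sidebar = Box(x=0, y=0, width=200, height=600)
button = Box(x=220, y=40, width=100, height=30)
header = Box(x=0, y=0, width=800, height=60)
text_input = Box(x=20, y=100, width=200, height=30)

print(right_of(button, sidebar))  # button starts past the sidebar's right edge
print(below(text_input, header))  # input starts past the header's bottom edge
```

This also shows why the examples say "if positioned": the same markup can match or not depending on the layout the stylesheet produces.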
Near Elements
button:near(.card, 100)
Matches buttons within 100 pixels of a card element
<!-- Will match if within 100px -->
<div class="card">
Product details
<button>Add to cart</button>
</div>
<!-- Will not match if beyond 100px -->
<div class="card">Product details</div>
<div class="spacer"></div>
<button>Add to cart</button>
React Component Properties
Multiple Property Matching
_react=BookItem[author *= "king" i][year = 1990]
Matches React components with multiple property conditions
// Will match
<BookItem author="Stephen King" year={1990} />
// Will not match
<BookItem author="Stephen King" year={1991} />
<BookItem author="John Doe" year={1990} />
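Stacked property conditions behave like a logical AND: every bracketed condition has to hold for the component to match. The Python sketch below mirrors the selector above against plain dictionaries standing in for component props. `contains_ci` and `matches_book` are hypothetical helpers, and reading `*=` with the `i` flag as a case-insensitive substring test follows the usual CSS attribute-selector meaning, which is assumed to carry over here.

```python
def contains_ci(value, needle):
    """Case-insensitive substring test, mirroring `*= "..." i`."""
    return isinstance(value, str) and needle.lower() in value.lower()


def matches_book(props):
    """Mirror of [author *= "king" i][year = 1990]: both conditions must hold."""
    return contains_ci(props.get("author"), "king") and props.get("year") == 1990


components = [
    {"author": "Stephen King", "year": 1990},
    {"author": "Stephen King", "year": 1991},
    {"author": "John Doe", "year": 1990},
]
results = [matches_book(props) for props in components]
print(results)
```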
Nested Property Values
_react=[some.nested.value = 12]
Matches components with specific nested property values
// Will match
<Component some={{ nested: { value: 12 }}} />
// Will not match
<Component some={{ nested: { value: 13 }}} />
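A dotted path such as `some.nested.value` can be read as a sequence of key lookups into the component's props. The Python sketch below shows that resolution on nested dictionaries; `get_path` is a hypothetical helper, and treating a missing key as "no match" is an assumption.

```python
def get_path(props, dotted_path):
    """Follow a dotted path through nested dicts; return None if any key is missing."""
    current = props
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


matching = {"some": {"nested": {"value": 12}}}
non_matching = {"some": {"nested": {"value": 13}}}

print(get_path(matching, "some.nested.value") == 12)
print(get_path(non_matching, "some.nested.value") == 12)
print(get_path({"some": {}}, "some.nested.value"))
```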
Property Pattern Matching
_react=BookItem[author = /Steven(\\s+King)?/i]
Matches components where properties match a regex pattern
// All of these will match
<BookItem author="Steven" />
<BookItem author="Steven King" />
<BookItem author="steven king" />
// Will not match
<BookItem author="Stephen King" />