Overview
Welcome to ScraperSensei docs!
What is ScraperSensei?
ScraperSensei is a service that helps you extract data from websites and automate your workflows right in your browser.
Ready to get started?
Quickstart
Learn how to get started with ScraperSensei in minutes
Features
Tips for Prompting
Reference
Learn more about the ScraperSensei API
Glossary
Learn about the terms used in the ScraperSensei documentation
FAQ
Can ScraperSensei plan the actions intelligently from my one-line goal, such as “Tweet ‘hello world’”?
ScraperSensei is an automation assistance SDK whose key feature is action stability: the same actions are performed in each run. To maintain this stability, we encourage you to provide detailed instructions that help the AI understand each step of your task.
If you require a ‘goal-to-task’ AI planning tool, you can develop one based on ScraperSensei.
Related Docs:
Limitations
There are some limitations with ScraperSensei. We are still working on them.
- The interaction types are limited to tap, type, keyboard press, and scroll.
- It’s not 100% stable. Even GPT-4o can’t return the right answer all the time. Following the Prompting Tips will help improve stability.
- Getting AI to automate your browser accurately is challenging. ScraperSensei relies on servers that receive your OpenAI API key to perform some tasks, such as OCR, drawing boxes (like microsoft/Omniparser), and post-processing the OpenAI result to make it more accurate. We don’t store your OpenAI API key, and you can revoke access at any time.
About the token cost
Here are some typical figures measured with GPT-4o.
Task | Resolution | Prompt Tokens / Price | Completion Tokens / Price |
---|---|---|---|
Plan the steps to search on eBay homepage | 1280x800 | 6,975 / $0.034875 | 150 / $0.00225 |
Locate the search box on the eBay homepage | 1280x800 | 8,004 / $0.04002 | 92 / $0.00138 |
Query the information about the item in the search results | 1280x800 | 13,403 / $0.067015 | 95 / $0.001425 |
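
The prices in the table are consistent with a rate of $5 per million prompt tokens and $15 per million completion tokens. These rates are inferred from the rows above and are not taken from an official price list, so treat them as assumptions and check current pricing before budgeting. A minimal sketch of the arithmetic, with the hypothetical helper `estimate_cost`:

```python
from decimal import Decimal

# Assumed per-million-token rates in USD, inferred from the table above.
# They are NOT read from an official price list and may be out of date.
PROMPT_RATE_PER_MILLION = Decimal("5")
COMPLETION_RATE_PER_MILLION = Decimal("15")
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> tuple[Decimal, Decimal]:
    """Return (prompt_cost, completion_cost) in USD for a single request."""
    prompt_cost = Decimal(prompt_tokens) * PROMPT_RATE_PER_MILLION / ONE_MILLION
    completion_cost = Decimal(completion_tokens) * COMPLETION_RATE_PER_MILLION / ONE_MILLION
    return prompt_cost, completion_cost


# Token counts copied from the three table rows: (label, prompt, completion).
ROWS = [
    ("plan", 6975, 150),
    ("locate", 8004, 92),
    ("query", 13403, 95),
]

if __name__ == "__main__":
    for label, prompt_tokens, completion_tokens in ROWS:
        prompt_cost, completion_cost = estimate_cost(prompt_tokens, completion_tokens)
        total = prompt_cost + completion_cost
        print(f"{label}: prompt ${prompt_cost}, completion ${completion_cost}, total ${total}")
```

`Decimal` is used instead of `float` so that the small dollar amounts are not distorted by binary rounding. In all three rows the prompt tokens account for most of the cost.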
What data is sent to the LLM?
Currently, the following is sent:
- the key information extracted from the DOM, such as text content, class name, tag name, and coordinates;
- a screenshot of the page.
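
To make the two items above concrete, here is a rough sketch of what such a payload could look like. It is an illustration only: the names `ElementInfo` and `build_payload`, the JSON keys, and the base64 encoding of the screenshot are all assumptions and do not describe ScraperSensei's actual wire format.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class ElementInfo:
    """Key information for one DOM element (hypothetical field names)."""
    text: str
    class_name: str
    tag_name: str
    x: float  # assumed: horizontal page coordinate of the element
    y: float  # assumed: vertical page coordinate of the element


def build_payload(elements: list[ElementInfo], screenshot_png: bytes) -> str:
    """Serialize the element info and a base64-encoded screenshot as JSON."""
    return json.dumps({
        "elements": [asdict(element) for element in elements],
        "screenshot_base64": base64.b64encode(screenshot_png).decode("ascii"),
    })


if __name__ == "__main__":
    search_box = ElementInfo("Search", "search-input", "input", 420.0, 36.0)
    # Placeholder bytes stand in for real PNG data in this sketch.
    print(build_payload([search_box], b"placeholder-bytes"))
```

The point of the sketch is that visible page text and element attributes leave your browser together with the screenshot, so avoid running tasks on pages that show data you would not want to share with the model provider.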