How to scrape CiteSeerX with AgentQL
Looking for a better way to scrape CiteSeerX? Say goodbye to fragile XPath and DOM selectors that break whenever a website is updated. AgentQL uses AI to find data from a natural-language description rather than from the page structure, so the same query is designed to keep working on CiteSeerX, or on any other website, even when the UI changes.
Not just for scraping CiteSeerX
Smart selectors work anywhere
https://citeseerx.ist.psu.edu
URL
Input any webpage.
{
  papers[] {
    title
    authors[]
    year
  }
}
Query
Describe data in natural language.
{
  "papers": [
    {
      "title": "A Novel Approach to Text Summarization",
      "authors": [
        "John Smith",
        "Alice Johnson"
      ],
      "year": 2023
    },
    {
      "title": "Deep Learning for Natural Language Processing",
      "authors": [
        "Bob Williams",
        "Eva Davis"
      ],
      "year": 2022
    }
  ]
}
Returns
Receive accurate output in seconds.
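Once a response of this shape comes back, it can be handled like any other JSON. The following is a minimal sketch that uses only the Python standard library and the sample output shown above. The helper name `format_papers` is hypothetical and is not part of AgentQL.

```python
import json

# Sample response copied from the "Returns" example (illustrative data only).
SAMPLE_RESPONSE = """
{
  "papers": [
    {"title": "A Novel Approach to Text Summarization",
     "authors": ["John Smith", "Alice Johnson"], "year": 2023},
    {"title": "Deep Learning for Natural Language Processing",
     "authors": ["Bob Williams", "Eva Davis"], "year": 2022}
  ]
}
"""


def format_papers(response: dict) -> list[str]:
    """Turn the `papers` list into one readable line per paper."""
    lines = []
    for paper in response.get("papers", []):
        # Fall back to an empty list if a paper has no authors field.
        authors = ", ".join(paper.get("authors") or [])
        lines.append(f'{paper.get("title")} ({paper.get("year")}) by {authors}')
    return lines


formatted = format_papers(json.loads(SAMPLE_RESPONSE))
for line in formatted:
    print(line)
```

Because the query names the fields (`title`, `authors[]`, `year`), the code that consumes the result can rely on those keys rather than on the page layout.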
How to use AgentQL on CiteSeerX



1
Install the SDK
Install commands for JavaScript (npm) and Python (pip)
npm install agentql
pip3 install agentql
3
Run your script
Commands to initialize AgentQL and run a Python script
agentql init
python example.py
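For reference, here is a rough sketch of what a script like `example.py` might contain. It assumes the AgentQL Python SDK is used together with Playwright, that `agentql.wrap()` and `page.query_data()` behave as described in the AgentQL documentation, and that an API key has already been configured (for example through `agentql init`). The function name `scrape_papers` is hypothetical. The CiteSeerX home page may not list any papers, so in practice the URL would likely need to point at a search results page.

```python
# Query and URL taken from the example at the top of this page.
QUERY = """
{
    papers[] {
        title
        authors[]
        year
    }
}
"""
URL = "https://citeseerx.ist.psu.edu"


def scrape_papers(url: str = URL, query: str = QUERY) -> dict:
    """Open `url` in a browser and return the data described by `query`."""
    # Imported inside the function so the module can be loaded
    # even where the SDK and Playwright are not installed.
    import agentql
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=True)
        try:
            # Assumption: wrap() adds AgentQL query methods to a Playwright page.
            page = agentql.wrap(browser.new_page())
            page.goto(url)
            return page.query_data(query)
        finally:
            browser.close()
```

Calling `scrape_papers()` should return a dictionary shaped like the "Returns" example, although the actual contents depend on the page that is loaded.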
Get started
AgentQL holds no opinions on the what or the how. Build whatever makes sense to you.