Start with ready-made AI agents with instructions on how to manage them on the marketplace. Browse the library

Русский
English

Automated Trustpilot Review Collection and Analysis with DeepSeek and OpenAI

Learn how to automate review collection from Trustpilot and conduct in-depth sentiment analysis using OpenAI and DeepSeek. This article provides a step-by-step guide to scraping and setting up AI monitoring for effective brand reputation management. Optimize your customer support and reduce customer churn without breaking the bank.

Created by:

John

Last update:

21 March 2026

What is Trustpilot review scraping?

Trustpilot.com is the largest review site in the world and is expected to contain over one hundred twenty million reviews by 2026. Many businesses utilize the site as social proof of their product(s) or service(s) quality, and consumers utilize the site as a tool for evaluating and ultimately making purchasing decisions. Automating the collection of reviews from the Trustpilot.com website can assist a business owner in solving three main objectives simultaneously:

Monitor the real-time reputation of their businesses;
Analyze their competitive set in regards to the companies reviewed and identify useful information to compare and analyze.
Identifying current trends in user reviews.

How a Review Scraper Works: Key Concepts and an Example

A "Review Scraper" is a tool that automatically visits web pages of different companies, gathers all of the text of reviews (the actual sentences) and their associated ratings (from 1 to 5 stars), dates, authors’ names and any other relevant data). Because this is being done automatically, it allows for rapid analysis without having to manually copy reviews and also allows the data to be consistently up to date.

Automated Trustpilot Review Collection and Analysis with DeepSeek and OpenAI

The main steps required to operate a review scraper include:

HTML Parsing – using either CSS selectors or XPath, the scraper visits public web pages to locate and gather desired data fields (e.g., the review text is generally found in the HTML class “.review-content-body__text”).
Pagination – because reviews are usually “paginated” (meaning grouped into sets of 20-30 reviews per page), the scraper continues to increment (e.g., ?page=1 to ?page=2) until all reviews have either been aggregated or recorded in the scraper database.
Metadata – can include: rating; date of publication; if the purchase is verified; and responses provided by the company.
Request Management – with regard to rate management, the scraper offers delay periods between requests, uses multiple User-Agents, and uses proxy servers; since Trustpilot actively fights against bots, it will respond with a 429 (rate limit exceeded) error for any series of requests made within a brief period of time from the same IP address.

A practical example of a Review Scraper's operation is the response of the support team of a crypto currency platform reviewing negative reviews (rating = 1 or 2) posted after the delay of more than 6 hours from posting. As a result of the operation of the Review Scraper, a notification was sent to the support team via Telegram and email, which provided a significant improvement in the response time from several days to one hour. Python is the preferred programming language by most scraper developers, who utilize the “requests,” “beautiful Soup,” and “lxml” libraries to scrape web pages; and additionally, for dynamic web sites, many will also utilize headless browser utilities, like “selenium” and “playwright.” Asynchronous frameworks like Aiohttp or Scrapy will be utilized for high-volume scraping to increase speed on sites with large numbers of pages.

Disadvantages and Characteristics of Web Scraping

One problem associated with scraping data from the Trustpilot reviews section is that Trustpilot allows for review pages to utilize JavaScript, which makes it necessary to perform full rendering of a webpage to obtain the full data set. There are headless browsers (Puppeteer, Playwright and Selenium) which will create a complete webpage rendering.

Rate Limitations: Most websites will rate limit individual IP addresses or will limit the number of requests allowed to be made by an IP address. If an IP address exceeds the limit set by the site, then a 429 Error will be produced or the IP Address will be blocked. To avoid the limitations of an IP address when scrapping, rotating proxies can be used along with a time delay rule of 2-5 seconds between requests produced by the scraper.

Website Design Updates: With every design change or CSS class change on a website, the scraper will also need to be updated to reflect the change in selectors and other elements.

Captcha: In cases where Trustpilot detects suspicious activity being executed on the website, Trustpilot will add additional verification checks to verify that visitors are human. This results in slowing down the entire scraping process and will ultimately increase the cost of scraping the information.

Limited API Access: Trustpilot's API is available only to verified business partners on a paid basis. For companies without API access, web scraping remains the primary alternative — though it comes with the legal and technical considerations outlined above.

Legal and ethical aspects

As it relates to ethics and legal issues, the act of gathering publicly accessible data is generally deemed to be legal; as long as the person gathering the data is not using any method of circumventing protections or security implementing by the website. However, Trustpilot's terms of use do not allow for automated scrapers to gather data from the site without prior written consent from Trustpilot. Failure to comply with Trustpilot's terms can result in the individual being blocked from accessing the site or may subject the individual to legal action taken by Trustpilot for violating its terms. In the case of hiQ Labs, the court ruled that the collection of publicly available data will not be classified as "unauthorized use" under the Computer Fraud and Abuse Act of 1986. The hiQ ruling supports the analysis of the legality of scraping data from Trustpilot or from a website in general. It is always best to speak with an attorney for assistance prior to engaging in extensive data collection, and you need to comply with ethical standards:

Limit interaction frequency for requests.
Handle data thoughtfully to avoid spamming or manipulation.
Erase any check-ins or personal data if it was captured by mistake.
Comply with robots.txt and the internal operating procedures of the service you are utilizing.

Scraping allows businesses to track and respond to reviews as they appear. However, reproducing review text publicly — outside the original platform — may infringe on the reviewer's copyright or violate Trustpilot's Terms of Service. When in doubt, consult a legal professional.

How AI Helps Analyze Review Sentiment

Sentiment analysis is an automated process of determining how someone feels about something in a written format (positive, negative or neutral). This makes it easier for a company to assess the status of their products to identify areas for improvement without reading every single review.

Through the use of sentiment analysis tools, it is possible for companies to develop visual and meaningful metrics such as the percentage of negative reviews, average overall rating and the trend of the sentiment. Therefore, by utilizing sentiment analysis:

Identify problems earlier: If there is an increase in negative feedback of 15%-20% as compared to the prior week, that indicates that action needs to be taken immediately. This discovery is usually too late without automation; typically when the customer has already left for good.
Prioritize your resources: Customer support staff cannot manage or keep up with the number of customer complaints being received. The sentiment analysis tool can help support staff prioritize which issues/fixes are the most critical (i.e., refunds, product violations, etc.) and then focus their efforts on resolving these first.
Support product decisions: The fact that 34% of negative reviews mention 'long delivery time' clearly identifies it as an area for improvement — no guesswork needed.

Here is an example: An online school implemented a sentiment analysis tool to evaluate the opinions of their students using reviews. The data showed that 28% of dissatisfied reviews were directly attributed to a poorly designed webinar platform. After improvements were made to the technology, negative comments about the system were reduced to 7%, and average ratings increased from 3.8 stars to 4.3 stars. Companies respond faster after analyzing sentiment data (31% faster) and improve their customer service efficiency by 22%. Well-designed automation allows for the quick identification and correction of customer pain points.

Using OpenAI Models for Analysis

The OpenAI API (GPT-4, GPT-3.5-turbo) provides reliable sentiment analyses because they have a superior understanding of contextual meaning. These models can detect sarcasm and irony, handle multiple languages, and process complex text structures, enabling them to achieve reliable sentiment analysis results — for example:


Analyze the sentiment of the review and return a JSON with:
- sentiment: positive, negative, neutral
- confidence: from 0 to 1
- key_topics: list of key topics
Review: ""Delivery took two weeks instead of three days. The product arrived damaged.""
{
"sentiment": "negative",
"confidence": 0.95,
"key_topics": ["long delivery", "product damage"]
}

The biggest advantages of GPT over traditional methods (VADER, TextBlob) are:

Sophisticated understanding of context (including subtle irony and sarcasm)
Supports multiple languages without requiring additional training
Providing detailed reasons and identifying main themes
Does not require labeled datasets in large quantities.

Real-World Applications:

Shopify automatically reviews all merchant reviews to create Jira tickets detailing common problems.
Notion compiles both user complaints and competitor mentions in order to improve the quality of their product.
A cryptocurrency service uses both Trustpilot and social media channels (Reddit and Twitter) to monitor reputation threats.

Limitations and Rules

Costs: The range of processing costs for GPT-4 will be between $10 and $30 based on the volume (number of text reviews to be processed – sample size: 10,000);
Processing time: Each review takes 1–3 seconds; using an asynchronous queue is recommended for real-time processing;
Hallucination: Models may generate inaccurate content; carefully crafted prompts and few-shot examples help mitigate this risk.

When determining which model will process prompt well under various conditions, organisations should evaluate how the selected model can process prompt under various levels of subtlety, identify irony within prompt, and maintain the same voice throughout user interactions. In addition, the generated text should be evaluated against multiple metrics to evaluate text generated on classification accuracy, depth of contextual understanding, and the costs associated with the processing of each review. The following table shows how the some available popular competing algorithms compare with respect to these metrics:

Model/Algorithm	Classification (F1 Score)	Contextual Recognition	Processing Time	Cost
GPT-4	92% to 95%	Excellent	1 to 3 sec	Very High
GPT-3.5 Turbo	88% to 91%	Good	0.5 to 1 sec	Moderate
BERT (fine-tuned)	85% to 89%	Average	0.1 to 3 sec	Free
VADER	70% to 75%	Poor	<0.01 sec	Free
TextBlob	65% to 70%	Very Poor	<0.01 sec	Free

According to results from testing of a sample of 5,000 ratings of an online retailer, the accuracy of recognising feelings of comfort when customers rated products were; GPT-4 (94%), GPT-3.5 (89%), VADER (72%). Errors caused by irony and negative phrasing are negligible / drop to zero.

Negative Examples of Customer Feedback:

A cryptocurrency exchange found that 41% of negative reviews received were due to length of time to complete KYC verification; as of today (when verification time was reduced to 6 hours), 9% of negative reviews are due to long verification times.
When analyzing feedback for online degree programs, researchers found that 33% of negative experiences were recorded as neutral reviews.

Through the DeepSeek platform, reviews can be analyzed

DeepSeek is an AI-powered service that analyzes customer reviews, replacing the need for manual, open-ended feedback evaluation. The advantage to DeepSeek is that it will not only organize reviews into categories but also identify and will analyze the different themes of those reviews, as well as track the changing nature of how people feel about reviews over time.

Using the OpenAI API to enhance analysis by integrating DeepSeek

DeepSeek acts as a pre-processing layer: it filters out spam, extracts key entities, and detects language before forwarding only high-priority reviews to the OpenAI API. This two-stage pipeline reduces OpenAI API consumption by up to 80% and significantly speeds up the overall analysis workflow.

The way DeepSeek processes reviews from start to finish includes:

The collection of reviews from various sources.
DeepSeek's filtering of the reviews that it maintains only negative and/or critical reviews.
The extraction of key phrases (for example; "refund", "not working") from the reviews.
Each of the key phrases is then sent to OpenAI for processing with the instructions "determine the reason(s) for the negative review and offer a solution for how to correct the negative review."
Collection of the response from OpenAI.
Notification and reporting via Slack or Telegram.

The positive benefits for DeepSeek are:

Cost savings on the use of the OpenAI API (savings of up to 80%).
Process up to 500 reviews per minute (our only limit will be according to the API rate limit).
Improved accuracy due to the use of context and meta data.

Automated monitoring of reviews through DeepSeek

Automation through various sources such as Trustpilot to collect reviews;
Real time alerts for customers who have provided critical (≤2-star) reviews (provides both text and recommendations from the AI);
Analytics demonstrating the changes in positive/negative sentiments, with anomaly detection;
Provides key metrics for companies to compare against benchmark standards.

In addition, the use of the DeepSeek AI in responding to reviews has improved our response time by as much as 70%. For example, once a product has been launched, DeepSeek will check the reviews for updates every two (2) hours. The moment five distinct negative reviews stated, that it “does not function” on iPhone, the team had received notification and released a resolution in less than 2 hours. Therefore, a disaster was avoided.

Comparative advantages of DeepSeek over other market platforms

Platform	Price per Month	No Code	AI Analysis	Alerts	API Limitations
DeepSeek	$29—$199	Yes	GPT-4	Yes	50+
Brandwatch	$800—$3000	Partial	Proprietary NLP	Yes	30+
Sprinklr	> $2000	No	Proprietary NLP	Yes	100+

Key components of DeepSeek

Map of Typical Negative Reaction to Location – Displays geographical distribution of negative sentiment regarding local suppliers.
Comparative Analysis of Competitors – Provides data to assess performance vis-a-vis one's competition on important measurables.
Legal Validity of AI Response Templates – Provides pre-approved template responses from a legal perspective for regulated industries.

Complete Instructions For Automating Review Collection and Processing

The following establish the step-by-step method by which one can automate the collection of Trustpilot reviews and the creation and delivery of responses to customers.

Step 1: Setting Up the Trustpilot Scraper

In order to facilitate the automated and ongoing collection of Trustpilot reviews, you must first set up a process to collect new Trustpilot reviews on a regular basis, and store the information in a database or table.

Alternative A: No Code with ASCN.AI

Create an ASCN.AI Account
Set Up a Workflow, "Trustpilot Review Scraper"
Add the Schedule Trigger to your Workflow (6 Hour Interval)
Add the HTTP Request Node - https://www.trustpilot.com/review/[company] - replace [company] with the company's slug

Option B: Custom Python Script


import requests
from bs4 import BeautifulSoup
import time
def scrape_trustpilot(company, pages=5):
base_url = f"https://www.trustpilot.com/review/{company}"
all_reviews = []
for page in range(1, pages+1):
url = f"{base_url}?page={page}"
headers = {"User-Agent": "Mozilla/5.0"}
resp = requests.get(url, headers=headers)
if resp.status_code != 200:
print(f"Error loading page {page}")
continue
soup = BeautifulSoup(resp.content, 'html.parser')
reviews = soup.select('div.review-card')
for r in reviews:
text = r.select_one('p.review-content__text').text.strip()
rating = int(r.select_one('[data-service-review-rating]').get('data-service-review-rating'))
date = r.select_one('time').get('datetime')
author = r.select_one('span.consumer-information__name').text.strip()
all_reviews.append({
'text': text,
'rating': rating,
'date': date,
'author': author
})
time.sleep(3)
return all_reviews
data = scrape_trustpilot('amazon.com', pages=10)
print(f"Collected {len(data)} reviews")

Step 2: Move Your Data over to the AI Models (OpenAI)

Get an API Key at platform.openai.com.
Add an AI Agent node or build a direct connection to OpenAI through ASCN.AI.
Use the Prompt.
Set the temperature to 0.3 (for more consistent, deterministic outputs).

Step 3: Interpret and Visualize Results

Aggregate data: the respective shares of positive vs. negative in addition to top issues.
Set up a dashboard in ASCN.AI or Google Data Studio that contains a sentiment pie chart, rating trends and complaint frequency charts.
Automatically send reports through team Telegram or Email.

Step 4: Setup Automatic Notifications

Set parameters for alerts (if sentiment = negative and rating ≤ 2 send a notification through Telegram with review text and recommendation);
Integrate with CRM to create tickets automatically;
Create an escalation process if a review is not processed within 2 hours of initial posting;
Add trend monitoring and alerts if negative coverage increases 10% or more in a single day.

Reducing response times from 24–48 hours to 1–2 hours leads to a 30–40% increase in customer retention.

Frequently Asked Questions / FAQ:

Is it legal and ethical to collect reviews?

Collecting publicly available reviews is legal as long as there are no technical barriers or copyright infringement. However, you may be blocked by Terms of Service prohibiting you from doing so without permission.

Recommendations:

Do not make frequent requests and/or overload the system;
Do not use data obtained for spam or other unethical uses;
Do not forget to remove personal data prior to exporting;
If your company intends to use this information in a serious manner, you should consult an attorney.

Accuracy of Sentiment Analysis:

The following actions will help achieve a higher degree of accuracy:

Use GPT-4 with clear prompts and a limited number of examples (Few-Shot);
Consider irony and contradiction when reviewing different categories of reviews (mixed opinions);
Manually evaluate a sample prior to adjusting model settings and prompts.

What Can OpenAI and DeepSeek Do Together?

Collect and parse reviews from Trustpilot, Google Reviews, Yandex.Market and all other platforms;
Create sentiment identification, key problems identified and comparison against competitors;
Generate AI responses to reviews and thus speeding up the support process;
Comparative tools against competitors and real-time dynamic monitoring of chats;
Integrated with CRM and messengers for the automation of these processes;
Affordable pricing.

DeepSeek data collected and analyzed can easily be re-formatted for convenience.

Compatibility with other review sites:

DeepSeek can collect and analyze reviews from Google Reviews, App Store, Play Market and Amazon Reviews and many more to provide a comprehensive overview of a Brand's reputation.

Disclaimer:

The information provided in this message is of a general nature and is for informational purposes only and does not provide, replace, or represent investment, legal, or security advice. Use of AI assistants requires that you understand how each platform specifically functions, and you must be conscious about using AI Assistants.

FAQ

Still have a question

Do I need coding skills to set up this template?

No coding skills required! This template is designed for no-code users. Simply follow the step-by-step setup guide, connect your accounts, and you're ready to go.

How does this template help maintain data security?

All data is processed securely through official APIs with OAuth authentication. Your credentials are never stored in the workflow, and you maintain full control over connected accounts and permissions.

What is a module?

A module is a single building block in the workflow that performs a specific action — like sending a message, fetching data, or processing information. Modules connect together to create the complete automation.

Can I customize the template to fit my organization's specific needs?

Absolutely! You can modify triggers, add new integrations, adjust AI prompts, and customize responses to match your organization's workflow and branding requirements.

How customizable are the AI responses?

Fully customizable. You can edit the AI system prompt to change the tone, language, response format, and behavior. Add specific instructions for your use case or industry terminology.

Will this template work with my existing IT support tools?

This template integrates with popular tools like Gmail, Google Calendar, Slack, and Baserow. Additional integrations can be added using available API connectors or webhooks.

What if my FAQ knowledge base is empty?

No problem! The template includes setup instructions to help you populate your FAQ database with commonly asked questions and answers. Start small. As new questions arise, you can easily add more FAQs over time.

Is there a way to track unresolved issues that require follow-up?

Yes! You can configure the workflow to log unresolved queries to a database or spreadsheet, send notifications to your team, or create tickets in your issue tracking system for manual follow-up.

What if I want to switch from Slack to Microsoft Teams (or another chat tool)?

Simply replace the Slack module with a Microsoft Teams or other chat integration module. The core logic remains the same — just reconnect the input and output to your preferred platform.

If you have questions about the template or want to launch it for the best results, contact us and we'll help you set it up quickly

Order turnkey Ask a question

ArbitrageScan Developers LTD Office A, RAK DAO Business Centre, RAK BANK ROC Office, Ground Floor, Al Rifaa, Sheikh Mohammed Bin Zayed Road, Ras Al Khaimah, United Arab Emirates

By continuing to use our site, you agree to the use of cookies.