Learn how to automate review collection from Trustpilot and conduct in-depth sentiment analysis using OpenAI and DeepSeek. This article provides a step-by-step guide to scraping and setting up AI monitoring for effective brand reputation management. Optimize your customer support and reduce customer churn without breaking the bank.

Brands lose millions of dollars to unresolved complaints, yet many devote little time or few resources to recovering those losses. A common reason is that too few people are assigned to monitor negative Trustpilot reviews: customers churn after weeks of unresolved issues, and the business loses more revenue. The usual alternative for a company with a large customer base, a team of analysts who manually read and sort thousands of reviews before deciding how to respond, is costly in both time and money and does not scale as review volumes grow.
After analyzing over 40,000 reviews over three years using an automated process, I found that adding AI to a review management system does more than speed up operations: it changes how a company manages its online reputation. Companies that use AI for review management have responded to customer dissatisfaction up to 10 times faster and reduced customer attrition by 23% to 34%.
In this article, I will show how to set up a system for collecting customer reviews from Trustpilot, how to analyze those reviews with OpenAI, and how to automate the collection process through ASCN.AI without a development team or the high cost of a custom solution.
Trustpilot.com is the largest review site in the world and is expected to contain over 120 million reviews by 2026. Many businesses use the site as social proof of the quality of their products or services, and consumers use it to evaluate companies and make purchasing decisions. Automating the collection of reviews from Trustpilot.com can help a business owner achieve three main objectives simultaneously:
A "Review Scraper" is a tool that automatically visits the web pages of different companies and gathers the text of each review together with its rating (from 1 to 5 stars), date, author name, and any other relevant data. Because this is done automatically, it allows rapid analysis without manually copying reviews and keeps the data consistently up to date.

The main steps required to operate a review scraper include:
- Extracting the review text from each page with a CSS selector (for example, “.review-content-body__text”).
- Moving through the pagination (from ?page=1 to ?page=2, and so on) until all reviews have been aggregated or recorded in the scraper database.

A practical example of a review scraper in operation comes from the support team of a cryptocurrency platform, which needed to catch negative reviews (rating of 1 or 2) that had gone more than 6 hours without attention after being posted. Once the scraper was running, it sent notifications to the support team via Telegram and email, which cut the response time from several days to one hour.

Python is the language most scraper developers prefer. They typically use the “requests,” “Beautiful Soup,” and “lxml” libraries to fetch and parse web pages, and many add headless browser tools such as “Selenium” and “Playwright” for dynamic websites. For high-volume scraping, asynchronous tools like aiohttp or Scrapy increase speed on sites with large numbers of pages.
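The alerting rule from the cryptocurrency example can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the function name `overdue_negative_reviews` and the `replied` and `posted_at` fields are hypothetical, and delivery of the notification (Telegram, email) is left out.

```python
from datetime import datetime, timedelta, timezone


def overdue_negative_reviews(reviews, now, max_age_hours=6):
    """Return reviews rated 1 or 2 that are older than max_age_hours and have no reply."""
    cutoff = now - timedelta(hours=max_age_hours)
    return [
        r for r in reviews
        if r["rating"] <= 2 and not r["replied"] and r["posted_at"] < cutoff
    ]


# Sample data (hypothetical): only review "a" is negative, unanswered, and old enough.
now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
reviews = [
    {"id": "a", "rating": 1, "replied": False, "posted_at": now - timedelta(hours=8)},
    {"id": "b", "rating": 2, "replied": False, "posted_at": now - timedelta(hours=2)},
    {"id": "c", "rating": 5, "replied": False, "posted_at": now - timedelta(hours=9)},
    {"id": "d", "rating": 1, "replied": True, "posted_at": now - timedelta(hours=10)},
]
overdue = overdue_negative_reviews(reviews, now)
print([r["id"] for r in overdue])
```

Each review returned by the filter would then be passed to whatever notification channel the team uses.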
One problem with scraping the Trustpilot reviews section is that the review pages rely on JavaScript, so a page has to be fully rendered before the complete data set is available. Headless browser tools (Puppeteer, Playwright, and Selenium) can perform this full rendering.
Rate Limitations: Most websites limit the number of requests an individual IP address may make. If an IP address exceeds the limit, the site returns a 429 error or blocks the address. To stay within these limits when scraping, use rotating proxies together with a delay of 2 to 5 seconds between requests.
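The delay rule and the handling of 429 responses can be sketched as follows. This is an illustration rather than a complete scraper: `polite_fetch`, its parameters, and the injected `fetch` callable are hypothetical names, the exponential backoff schedule is an assumption, and proxy rotation is not shown. A stand-in `fake_fetch` is used so the example runs without network access.

```python
import random
import time


def polite_fetch(urls, fetch, sleep=time.sleep, min_delay=2.0, max_delay=5.0, max_retries=3):
    """Fetch URLs one by one, pausing between requests and backing off on HTTP 429.

    `fetch` is any callable that takes a URL and returns (status_code, body).
    """
    results = {}
    for url in urls:
        for attempt in range(max_retries + 1):
            status, body = fetch(url)
            if status != 429:
                break
            # Rate limited: wait longer after each failed attempt.
            sleep(max_delay * 2 ** attempt)
        results[url] = body if status == 200 else None
        # Random pause of 2 to 5 seconds before the next request.
        sleep(random.uniform(min_delay, max_delay))
    return results


waits = []
attempts = {"count": 0}


def fake_fetch(url):
    # Stand-in for a real HTTP call: the second request is rate limited once.
    attempts["count"] += 1
    if attempts["count"] == 2:
        return 429, ""
    return 200, f"<html>{url}</html>"


pages = polite_fetch(["?page=1", "?page=2"], fake_fetch, sleep=waits.append)
print(pages)
print(waits)
```

Passing `sleep` as a parameter keeps the pacing logic testable; in real use the default `time.sleep` applies the actual delays.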
Website Design Updates: With every design change or CSS class change on a website, the scraper will also need to be updated to reflect the change in selectors and other elements.
Captcha: When Trustpilot detects suspicious activity on the website, it adds verification checks to confirm that visitors are human. This slows down the entire scraping process and increases its cost.
Limited API Access: Trustpilot's API is available only to verified business partners on a paid basis. For companies without API access, web scraping remains the primary alternative, though it comes with the technical and legal considerations discussed in this article.
On the ethical and legal side, gathering publicly accessible data is generally considered legal, as long as the person gathering it does not circumvent protections or security measures implemented by the website. However, Trustpilot's terms of use do not allow automated scrapers to gather data from the site without prior written consent from Trustpilot. Failure to comply with these terms can result in being blocked from the site or in legal action by Trustpilot. In hiQ Labs v. LinkedIn, the court ruled that collecting publicly available data is not "unauthorized" access under the Computer Fraud and Abuse Act of 1986. The hiQ ruling is a useful reference point when assessing the legality of scraping data from Trustpilot or from websites in general. It is always best to consult an attorney before engaging in extensive data collection, and to follow ethical standards throughout.
Scraping allows businesses to track and respond to reviews as they appear. However, reproducing review text publicly — outside the original platform — may infringe on the reviewer's copyright or violate Trustpilot's Terms of Service. When in doubt, consult a legal professional.
Sentiment analysis is the automated process of determining whether a piece of written text expresses a positive, negative, or neutral attitude. It lets a company assess how its products are received and identify areas for improvement without reading every single review.
Through the use of sentiment analysis tools, it is possible for companies to develop visual and meaningful metrics such as the percentage of negative reviews, average overall rating and the trend of the sentiment. Therefore, by utilizing sentiment analysis:
Here is an example: an online school implemented a sentiment analysis tool to evaluate student opinions from reviews. The data showed that 28% of dissatisfied reviews were directly attributed to a poorly designed webinar platform. After the technology was improved, negative comments about the system fell to 7%, and the average rating rose from 3.8 to 4.3 stars. Companies that analyze sentiment data respond 31% faster and improve their customer service efficiency by 22%. Well-designed automation allows customer pain points to be identified and corrected quickly.
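The metrics mentioned above (share of negative reviews, average rating, sentiment trend) can be computed from labeled reviews with a few lines of code. The sketch below assumes each review is a dictionary with hypothetical `date`, `rating`, and `sentiment` fields, and it uses the monthly average rating as a simple stand-in for a trend.

```python
from collections import defaultdict


def sentiment_metrics(reviews):
    """Compute negative share (%), average rating, and average rating per month."""
    if not reviews:
        raise ValueError("at least one review is required")
    total = len(reviews)
    negative = sum(1 for r in reviews if r["sentiment"] == "negative")
    by_month = defaultdict(list)
    for r in reviews:
        # ISO dates such as "2024-01-10" start with the "YYYY-MM" month key.
        by_month[r["date"][:7]].append(r["rating"])
    return {
        "negative_share": round(100 * negative / total, 1),
        "average_rating": round(sum(r["rating"] for r in reviews) / total, 2),
        "monthly_average": {
            month: round(sum(ratings) / len(ratings), 2)
            for month, ratings in sorted(by_month.items())
        },
    }


# Sample data (hypothetical).
sample = [
    {"date": "2024-01-10", "rating": 2, "sentiment": "negative"},
    {"date": "2024-01-22", "rating": 4, "sentiment": "positive"},
    {"date": "2024-02-03", "rating": 5, "sentiment": "positive"},
    {"date": "2024-02-15", "rating": 3, "sentiment": "neutral"},
]
metrics = sentiment_metrics(sample)
print(metrics)
```

A rising or falling sequence in `monthly_average` gives a first indication of whether sentiment is improving.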
The OpenAI API models (GPT-4, GPT-3.5-turbo) provide reliable sentiment analysis because they have a superior understanding of contextual meaning. These models can detect sarcasm and irony, handle multiple languages, and process complex text structures. For example:
Prompt:

    Analyze the sentiment of the review and return a JSON with:
    - sentiment: positive, negative, neutral
    - confidence: from 0 to 1
    - key_topics: list of key topics
    Review: "Delivery took two weeks instead of three days. The product arrived damaged."

Response:

    {
      "sentiment": "negative",
      "confidence": 0.95,
      "key_topics": ["long delivery", "product damage"]
    }

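In an automated pipeline, the prompt has to be built for each review and the model's reply has to be checked before it is stored. The sketch below shows one way to do both, assuming the model returns the JSON structure from the example; `build_prompt` and `parse_sentiment` are hypothetical helper names, and the call to the OpenAI API itself is deliberately left out.

```python
import json

PROMPT_TEMPLATE = (
    "Analyze the sentiment of the review and return a JSON with:\n"
    "- sentiment: positive, negative, neutral\n"
    "- confidence: from 0 to 1\n"
    "- key_topics: list of key topics\n"
    'Review: "{review}"'
)
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def build_prompt(review_text):
    """Insert the review text into the prompt template."""
    return PROMPT_TEMPLATE.format(review=review_text)


def parse_sentiment(raw):
    """Parse the model's reply and raise ValueError if it does not match the expected shape."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if data.get("sentiment") not in ALLOWED_SENTIMENTS:
        raise ValueError("unexpected sentiment value")
    try:
        confidence = float(data.get("confidence"))
    except (TypeError, ValueError):
        raise ValueError("confidence must be a number") from None
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    topics = data.get("key_topics")
    if not isinstance(topics, list):
        raise ValueError("key_topics must be a list")
    return {
        "sentiment": data["sentiment"],
        "confidence": confidence,
        "key_topics": [str(t) for t in topics],
    }


prompt = build_prompt("Delivery took two weeks instead of three days. The product arrived damaged.")
reply = '{"sentiment": "negative", "confidence": 0.95, "key_topics": ["long delivery", "product damage"]}'
result = parse_sentiment(reply)
print(result)
```

Validating the reply this way means a malformed or unexpected response is rejected early instead of being written to the review database.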
The biggest advantages of GPT over traditional methods (VADER, TextBlob) are:
Real-World Applications:
When choosing a model, organisations should evaluate how well it handles subtle phrasing, whether it recognises irony, and whether it maintains a consistent voice across user interactions. Candidate models should also be compared on classification accuracy, depth of contextual understanding, processing time, and the cost of processing each review. The following table compares some popular options on these metrics:
| Model/Algorithm | Classification (F1 Score) | Contextual Recognition | Processing Time | Cost |
|---|---|---|---|---|
| GPT-4 | 92% to 95% | Excellent | 1 to 3 sec | Very High |
| GPT-3.5 Turbo | 88% to 91% | Good | 0.5 to 1 sec | Moderate |
| BERT (fine-tuned) | 85% to 89% | Average | 0.1 to 3 sec | Free |
| VADER | 70% to 75% | Poor | <0.01 sec | Free |
| TextBlob | 65% to 70% | Very Poor | <0.01 sec | Free |
In a test on a sample of 5,000 reviews from an online retailer, sentiment-recognition accuracy was 94% for GPT-4, 89% for GPT-3.5, and 72% for VADER. Errors caused by irony and negated phrasing became negligible.
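For readers who want to reproduce this kind of comparison on their own data, the F1 score from the table can be computed from a hand-labeled sample. The sketch below uses macro-averaged F1 over the sentiment classes, which is an assumption: the article does not say which F1 variant its figures use. The labels and predictions are made-up sample data.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, computed from true/false positives and false negatives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Sample data (hypothetical): human labels versus model predictions.
y_true = ["positive", "negative", "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "negative", "positive", "neutral", "positive", "negative"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Running the same labeled sample through each candidate model and comparing the scores gives a like-for-like accuracy figure for a specific review corpus.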
Negative Examples of Customer Feedback:
DeepSeek is an AI-powered service that analyzes customer reviews, replacing manual evaluation of open-ended feedback. Its advantage is that it not only organizes reviews into categories but also identifies and analyzes the themes within them and tracks how customer sentiment changes over time.
DeepSeek acts as a pre-processing layer: it filters out spam, extracts key entities, and detects language before forwarding only high-priority reviews to the OpenAI API. This two-stage pipeline reduces OpenAI API consumption by up to 80% and significantly speeds up the overall analysis workflow.
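The two-stage idea can be illustrated with a simple pre-filter. This is only a sketch of the pattern, not DeepSeek's actual logic: the spam heuristics (very short text, links) and the priority rule (rating of 1 or 2) are assumptions chosen for the example, and all function names are hypothetical.

```python
import re

URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def is_spam(review):
    """Crude spam check: fewer than three words, or the text contains a link."""
    text = review["text"].strip()
    return len(text.split()) < 3 or bool(URL_PATTERN.search(text))


def is_high_priority(review):
    """Treat 1- and 2-star reviews as the ones worth a detailed analysis."""
    return review["rating"] <= 2


def prefilter(reviews):
    """Return only the reviews that should be forwarded to the second stage."""
    return [r for r in reviews if not is_spam(r) and is_high_priority(r)]


# Sample data (hypothetical).
reviews = [
    {"id": 1, "rating": 5, "text": "Great service, fast delivery and friendly support"},
    {"id": 2, "rating": 1, "text": "Buy cheap followers at http://spam.example"},
    {"id": 3, "rating": 1, "text": "Delivery took two weeks instead of three days"},
    {"id": 4, "rating": 1, "text": "Bad"},
    {"id": 5, "rating": 4, "text": "Works as described, no complaints so far"},
    {"id": 6, "rating": 2, "text": "The app crashes every time I open it"},
]
forwarded = prefilter(reviews)
saved_share = 1 - len(forwarded) / len(reviews)
print([r["id"] for r in forwarded], f"{saved_share:.0%} of second-stage calls avoided")
```

The saving in any real deployment depends on how many reviews are spam or low priority, so the figure printed here applies only to the sample data.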
The way DeepSeek processes reviews from start to finish includes:
The positive benefits for DeepSeek are:
In addition, using DeepSeek to respond to reviews has improved our response time by as much as 70%. For example, after a product launch, DeepSeek checks for new reviews every two hours. As soon as five distinct negative reviews stated that the product “does not function” on iPhone, the team received a notification and released a fix in less than 2 hours. As a result, a disaster was avoided.
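The threshold rule in this example (five distinct negative reviews mentioning the same problem) can be sketched as below. The function name `should_alert` is hypothetical, "distinct" is interpreted here as distinct authors, and a plain substring match stands in for whatever topic detection the monitoring tool actually performs.

```python
def should_alert(reviews, phrase, threshold=5):
    """Return True when at least `threshold` distinct authors left a negative review containing `phrase`."""
    authors = {
        r["author"]
        for r in reviews
        if r["rating"] <= 2 and phrase.lower() in r["text"].lower()
    }
    return len(authors) >= threshold


# Sample data (hypothetical): five different authors plus one repeat from user1.
batch = [
    {"author": f"user{i}", "rating": 1, "text": "The app does not function on iPhone"}
    for i in range(1, 6)
]
batch.append({"author": "user1", "rating": 1, "text": "Still does not function on iPhone"})

print(should_alert(batch, "does not function"))
print(should_alert(batch[:4], "does not function"))
```

Counting authors rather than reviews prevents one customer who posts repeatedly from triggering the alert alone.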
| Platform | Price per Month | No Code | AI Analysis | Alerts | API Limitations |
|---|---|---|---|---|---|
| DeepSeek | $29—$199 | Yes | GPT-4 | Yes | 50+ |
| Brandwatch | $800—$3000 | Partial | Proprietary NLP | Yes | 30+ |
| Sprinklr | > $2000 | No | Proprietary NLP | Yes | 100+ |
The following steps describe how to automate the collection of Trustpilot reviews and the creation and delivery of responses to customers.
The first step is to set up a process that collects new Trustpilot reviews on a regular schedule and stores them in a database or table.
Option A: No Code with ASCN.AI
Option B: Custom Python Script

    import time

    import requests
    from bs4 import BeautifulSoup


    def scrape_trustpilot(company, pages=5):
        """Collect reviews from the first `pages` pages of a company's Trustpilot profile."""
        base_url = f"https://www.trustpilot.com/review/{company}"
        headers = {"User-Agent": "Mozilla/5.0"}
        all_reviews = []
        for page in range(1, pages + 1):
            url = f"{base_url}?page={page}"
            resp = requests.get(url, headers=headers, timeout=30)
            if resp.status_code != 200:
                print(f"Error loading page {page}: HTTP {resp.status_code}")
                continue
            soup = BeautifulSoup(resp.content, "html.parser")
            # These selectors depend on Trustpilot's markup and may need updating.
            for card in soup.select("div.review-card"):
                text = card.select_one("p.review-content__text")
                rating = card.select_one("[data-service-review-rating]")
                date = card.select_one("time")
                author = card.select_one("span.consumer-information__name")
                # Skip incomplete cards instead of failing with AttributeError.
                if any(el is None for el in (text, rating, date, author)):
                    continue
                all_reviews.append({
                    "text": text.text.strip(),
                    "rating": int(rating.get("data-service-review-rating")),
                    "date": date.get("datetime"),
                    "author": author.text.strip(),
                })
            time.sleep(3)  # pause between pages to limit the request rate
        return all_reviews


    if __name__ == "__main__":
        data = scrape_trustpilot("amazon.com", pages=10)
        print(f"Collected {len(data)} reviews")

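Because the collection runs on a schedule, the same reviews will be seen repeatedly, so the storage step should ignore duplicates. The sketch below stores review dictionaries with the same keys as the script's output in SQLite. Treating the combination of author and date as the unique key is an assumption made for the example, and `init_db` and `save_new_reviews` are hypothetical names.

```python
import sqlite3


def init_db(conn):
    """Create the reviews table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS reviews (
            author TEXT NOT NULL,
            date   TEXT NOT NULL,
            rating INTEGER NOT NULL,
            text   TEXT NOT NULL,
            PRIMARY KEY (author, date)
        )
        """
    )


def save_new_reviews(conn, reviews):
    """Insert reviews, skipping ones already stored, and return how many were new."""
    before = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO reviews (author, date, rating, text) "
        "VALUES (:author, :date, :rating, :text)",
        reviews,
    )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
    return after - before


# An in-memory database keeps the example self-contained; a file path would persist data.
conn = sqlite3.connect(":memory:")
init_db(conn)
first_run = [
    {"author": "Ann", "date": "2024-05-01T10:00:00Z", "rating": 1, "text": "Late delivery"},
    {"author": "Bob", "date": "2024-05-01T11:00:00Z", "rating": 5, "text": "Excellent"},
]
second_run = first_run[1:] + [
    {"author": "Cy", "date": "2024-05-02T09:00:00Z", "rating": 2, "text": "Item damaged"},
]
new_first = save_new_reviews(conn, first_run)
new_second = save_new_reviews(conn, second_run)
print(new_first, new_second)
```

The count returned by `save_new_reviews` tells the next stage how many reviews actually need analysis after each run.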
Reducing response times from 24–48 hours to 1–2 hours leads to a 30–40% increase in customer retention.
Is it legal and ethical to collect reviews?
Collecting publicly available reviews is legal as long as you do not bypass technical barriers or infringe copyright. However, a site's Terms of Service may prohibit automated collection without permission, and violating them can get you blocked.
Recommendations:
Accuracy of Sentiment Analysis:
The following actions will help achieve a higher degree of accuracy:
What Can OpenAI and DeepSeek Do Together?
Data that DeepSeek has collected and analyzed can easily be reformatted for convenience.
Compatibility with other review sites:
DeepSeek can collect and analyze reviews from Google Reviews, App Store, Play Market and Amazon Reviews and many more to provide a comprehensive overview of a Brand's reputation.
The information provided in this message is of a general nature and is for informational purposes only and does not provide, replace, or represent investment, legal, or security advice. Use of AI assistants requires that you understand how each platform specifically functions, and you must be conscious about using AI Assistants.
