Commit 1bb7250

Set User-Agent: header field in HTTP request for curl downloads
Some servers (for example wikimedia.org) don't allow downloads with the default user agent of libcurl and send HTTP status 403, so OCR for images on such servers fails. Setting the user agent to "Tesseract OCR" allows OCR for images on those servers.

Signed-off-by: Stefan Weil <[email protected]>
1 parent bcd6144 commit 1bb7250

1 file changed: +4 -0 lines changed

src/api/baseapi.cpp

Lines changed: 4 additions & 0 deletions
@@ -1184,6 +1184,10 @@ bool TessBaseAPI::ProcessPagesInternal(const char *filename, const char *retry_c
       if (curlcode != CURLE_OK) {
         return error("curl_easy_setopt");
       }
+      curlcode = curl_easy_setopt(curl, CURLOPT_USERAGENT, "Tesseract OCR");
+      if (curlcode != CURLE_OK) {
+        return error("curl_easy_setopt");
+      }
       curlcode = curl_easy_perform(curl);
       if (curlcode != CURLE_OK) {
         return error("curl_easy_perform");
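
To make the effect of the new option concrete, the sketch below builds the text of a minimal HTTP/1.1 GET request with and without a User-Agent: header field. It is an illustration only: libcurl assembles the real request internally, and `BuildGetRequest`, the host `example.org` and the path `/image.png` are hypothetical names that do not appear in Tesseract or libcurl. The exact set of other header fields that libcurl sends is also an assumption here.

```cpp
#include <string>

// Hypothetical helper (not part of Tesseract or libcurl): returns the text of
// a minimal HTTP/1.1 GET request. An empty user_agent stands in for the case
// where no user agent has been configured, so the header field is left out.
std::string BuildGetRequest(const std::string &host, const std::string &path,
                            const std::string &user_agent) {
  std::string request = "GET " + path + " HTTP/1.1\r\n";
  request += "Host: " + host + "\r\n";
  if (!user_agent.empty()) {
    // This is the header field that CURLOPT_USERAGENT controls.
    request += "User-Agent: " + user_agent + "\r\n";
  }
  request += "Accept: */*\r\n";
  request += "\r\n";  // An empty line terminates the header section.
  return request;
}
```

Calling `BuildGetRequest("example.org", "/image.png", "Tesseract OCR")` yields a request that contains the line `User-Agent: Tesseract OCR`, which is the field a server can inspect before deciding whether to answer with status 403.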
