Python Requests Retry: A Comprehensive Step-by-Step Guide

Are you looking for a way to automate retries in Python’s requests? Then come in and read our guide on how to do that effortlessly using a retry and backoff strategy.

By default, when a request fails, Python’s requests does not retry it. A connection error raises an exception that will crash your code if there is no exception handling in place, and an error status code is simply returned for your code to deal with. For experimental and teaching purposes, most code samples you will see use only the requests.get(url) line. There is nothing wrong with this as long as the request doesn’t fail. However, when it fails, your code will most likely break, especially if it needs that web resource to function.

Retry functionality and a retry strategy are not well covered, but they are most certainly a requirement, because sooner or later a request from your web scraper or crawler will fail. In this guide, I will show you how to set up a retry strategy when sending web requests with Python’s requests. By doing that, you should be able to develop a scraper, crawler, or API data access tool that is resilient and can retry requests after the initial failure.


Retry Strategy and Functionality in Python Requests

What do you need to code an effective retry strategy? Generally, you need four things: a backoff strategy, the number of times you want to retry a request before it is treated as a failure, the HTTP status codes that suggest a retry could fix the failed request, and the specific request methods you want this to apply to. Let’s take a look at each of these one after the other.

Total Number of Retries:

You need to decide how many times you want to retry the request. A common choice is 3, and that is what I use in my web scrapers. When a request has been retried 3 times without success, there is usually little point in retrying it again right away. Depending on your own requirements, you can set it to 1, 2, or a number higher than the 3 I recommend.

Backoff Strategy:

The backoff strategy is basically a wait period before retrying a request. This is required because you don’t want to overwhelm your target server, as that could worsen the situation. In this guide, I will use the exponential backoff strategy, which increases the wait time exponentially with each failed attempt. For instance, if the first wait period is 3 seconds, the next wait period will be 6 seconds if the request fails again, then 12 seconds, and so on.
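The doubling idea can be sketched in a few lines of plain Python. This is only an illustration of exponential backoff in general; the function name backoff_delays is hypothetical, and the exact schedule urllib3 applies (including whether it waits before the very first retry) depends on its version.

```python
def backoff_delays(first_wait, attempts):
    """Return the wait (in seconds) before each retry, doubling every time."""
    return [first_wait * (2 ** i) for i in range(attempts)]


# A first wait of 3 seconds doubles on every further failure.
print(backoff_delays(3, 4))  # [3, 6, 12, 24]
```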

HTTP Status Force List:

The status force list is the set of HTTP status codes that suggest the request can be retried. Status codes like 500, 502, 503, and 504 are server-level errors and suggest the request is worth retrying. A 200 status code, by contrast, means the request was successful.

Status codes not explicitly listed will not be retried, so such a request fails on the first try. Take, for instance, the 400 error code, which indicates a client-level error: no amount of retrying will help. In this case, you need to fix the request on your end.

Request Method Support:

Lastly, specify the request methods you want retries to apply to. In my case, I usually include all the methods I use by default so I don’t need to keep changing the list. However, if you are working on a critical application that sends data and you don’t want to risk sending the same piece of data twice, you should remove the POST request method.

All of the above form your retry strategy and will be defined in the Retry object.
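To see how the four pieces fit together before handing them to the Retry object, here is a hand-rolled sketch in plain Python. It is not how requests or urllib3 implement retries internally. The names call_with_retries, RETRY_STATUSES, and RETRY_METHODS are hypothetical, and send stands in for any zero-argument callable that performs the request and returns a status code.

```python
import time

RETRY_STATUSES = {500, 502, 503, 504}        # status force list
RETRY_METHODS = {"HEAD", "GET", "OPTIONS"}   # POST left out to avoid sending data twice


def call_with_retries(send, method="GET", retries=3, backoff_factor=0.3, sleep=time.sleep):
    """Call send() until the status is not retryable or the retries run out."""
    attempt = 0
    while True:
        status = send()
        retryable = method.upper() in RETRY_METHODS and status in RETRY_STATUSES
        if not retryable or attempt >= retries:
            return status
        # Exponential backoff: the wait doubles after every failed attempt.
        sleep(backoff_factor * (2 ** attempt))
        attempt += 1
```

Passing sleep as a parameter is a design choice for this sketch: it lets you swap in a fake during testing so the loop can be checked without actually waiting.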


Step-by-Step Guide with Code on How to Retry Requests

Now that you know what is required to retry a request in Python, let's try it out in practice. This will be done here in specific steps so you can follow and implement it yourself.

Step 1: Import The Necessary Libraries

Everything you need to build the retry logic ships with the requests package. If you don’t have it installed, you can use the “pip install requests” command to install it. As for the imports, you need three: requests itself, HTTPAdapter from requests.adapters, and Retry from urllib3, which is installed as a dependency of requests. Create a new Python file and import them as done below.

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

Step 2: Create a Make Request Function

Because the retry setup is not a one-liner, you don’t want to litter your code with the same chunk everywhere you need to send a request. Instead, you should create a function. As you can see below, I created a function named make_request_with_retries that takes a URL, request method, retries, backoff_factor, headers, and data as input. Only the URL is required; the other parameters have defaults that are used if no value is provided for them.

In this guide, the parameters you should pay attention to are retries and backoff_factor, as they matter most for retries. Inside the function, define a variable named session and assign it a requests Session object.

def make_request_with_retries(url, method="GET", retries=3, backoff_factor=0.3, headers=None, data=None):
    session = requests.Session()

Step 3: Define the Retry Strategy

Here, we will define how requests handles retries using the Retry object. All we need to do is assign values to its parameters, which correspond to the pieces described in the retry strategy section at the beginning of the article. They are: total, which takes the value of retries (3 by default in our function); backoff_factor (0.3 by default); status_forcelist, which lists the 500-level errors, as those are the ones you will want to retry; and lastly allowed_methods, the request methods the retry applies to.

After defining the Retry object, you should then add it to the HTTPAdapter for it to work and add the adapter to the session you created earlier.

    retry_strategy = Retry(
        total=retries,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["HEAD", "GET", "OPTIONS", "POST"]
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
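If you send many requests, one option is to factor the session setup into its own helper so the same session can be reused across calls instead of being rebuilt each time. The sketch below uses the same Retry values as this guide; the helper name build_retry_session is hypothetical, and the example.com URL is only there to look up which adapter is mounted.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_retry_session(retries=3, backoff_factor=0.3):
    """Return a Session whose http:// and https:// requests share one retry policy."""
    retry_strategy = Retry(
        total=retries,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["HEAD", "GET", "OPTIONS", "POST"],
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session = requests.Session()
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session


session = build_retry_session()
# get_adapter() returns the adapter mounted for the URL's prefix,
# which is a quick way to confirm the retry policy is in place.
mounted = session.get_adapter("https://example.com/")
print(mounted.max_retries.total)  # 3
```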

Step 4: Plug in the Headers

Usually, you will need to set headers on your requests for them to be accepted. Some of the basic headers you need to set are the user agent string and content type. If you don’t send a user agent string, your request will be identified as coming from the Python requests library and might get denied.

Set the user agent string of a popular browser like Chrome. In some cases, you need to set more headers. Look out for required headers using Chrome DevTools. The headers will be provided as a parameter to the function in the form of a dictionary. In this code, all you have to do is add the headers to the session object if headers are provided.

    if headers:
        session.headers.update(headers)
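As a small illustration of what the headers dictionary might look like and what update() does to the session, consider the sketch below. The header values are placeholders, not recommendations; copy the real strings from your own browser.

```python
import requests

# Illustrative values only: take the real strings from your browser's DevTools.
headers = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}

session = requests.Session()
default_agent = session.headers["User-Agent"]  # the library's own identifier
session.headers.update(headers)                # overrides and adds, keeps the rest

print(default_agent.startswith("python-requests/"))
print(session.headers["User-Agent"] == headers["User-Agent"])
```

Because update() merges into the session’s existing headers, defaults you did not override, such as Accept-Encoding, stay in place.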

Step 5: Send Web Request

With the above, you are done with the hard part. All you have to do now is send the web request. Mind you, because you intend to use this function for both GET and POST request methods, you need a conditional statement to check whether the request is a GET request, which requires no post data. If it is a POST request, the function should add the post data to the request before sending it.

All of this should be within a try/except statement so that when a request exception occurs, such as a 400 client error or a request that still fails after all retries, the script will handle it gracefully and not crash. Below is an example of that in practice.

    try:
        if method.upper() == "POST":
            response = session.post(url, data=data)
        else:
            response = session.get(url)
        response.raise_for_status()
        return response.content
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
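Note that the snippet above sends every method other than POST as a GET. If you also want HEAD, OPTIONS, or other methods to be sent as themselves, one possible variation is to use session.request(), which takes the method name as an argument. The function name send and the 10-second timeout below are assumptions made for this sketch, not part of the original code.

```python
import requests


def send(session, url, method="GET", data=None, timeout=10):
    """Send one request through `session`; return the body bytes, or None on failure."""
    try:
        # session.request() dispatches on the method name, so no if/else is needed.
        # The timeout stops the call from waiting forever on an unresponsive server.
        response = session.request(method.upper(), url, data=data, timeout=timeout)
        response.raise_for_status()
        return response.content
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
```

Pass in a session that already has the retry adapter mounted, and the retry strategy applies to whichever method you send, as long as that method is listed in allowed_methods.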

The Full Code for Retrying Requests in Python

Below is the full code. You can modify it to suit your specific use case. To use it, all you need to do is import the function and call it with the required parameters, and it will retry your requests when they fail.

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_request_with_retries(url, method="GET", retries=3, backoff_factor=0.3, headers=None, data=None):
    session = requests.Session()

    # Retry strategy: how many attempts, how long to wait, and what to retry.
    retry_strategy = Retry(
        total=retries,
        backoff_factor=backoff_factor,
        status_forcelist=[500, 502, 503, 504],
        allowed_methods=["HEAD", "GET", "OPTIONS", "POST"]
    )

    # Attach the strategy to both URL schemes through an adapter.
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)

    # Apply any custom headers to every request sent through this session.
    if headers:
        session.headers.update(headers)

    try:
        if method.upper() == "POST":
            response = session.post(url, data=data)
        else:
            response = session.get(url)
        # Raise an exception for 4xx and 5xx responses.
        response.raise_for_status()
        return response.content
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None
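One way to check that the retries really happen, without hitting a real website, is to point a session at a throwaway local server that fails twice before succeeding. The sketch below uses only the standard library for the server. FlakyHandler and run_demo are hypothetical names, and the session is configured inline with the same kind of Retry object as the full code above.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


class FlakyHandler(BaseHTTPRequestHandler):
    """Test-only handler: answers 503 for the first two requests, then 200."""
    hits = 0

    def do_GET(self):
        FlakyHandler.hits += 1
        status = 503 if FlakyHandler.hits <= 2 else 200
        body = b"ok" if status == 200 else b"busy"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def run_demo():
    """Return (final status code, number of requests the server received)."""
    FlakyHandler.hits = 0
    server = HTTPServer(("127.0.0.1", 0), FlakyHandler)  # port 0 picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        with requests.Session() as session:
            session.trust_env = False  # ignore proxy environment variables
            retry = Retry(
                total=3,
                backoff_factor=0.1,
                status_forcelist=[500, 502, 503, 504],
                allowed_methods=["GET"],
            )
            session.mount("http://", HTTPAdapter(max_retries=retry))
            url = f"http://127.0.0.1:{server.server_port}/"
            response = session.get(url, timeout=5)
            return response.status_code, FlakyHandler.hits
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() should return a 200 status together with a hit count of 3, showing that the two 503 responses were retried behind the scenes and the caller only saw the final success.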

Conclusion

As you can see above, retrying requests in Python is easy. However, the logic is better encapsulated in a function and reused. The code above is quite basic and would benefit from better exception handling and proxy usage. With the right proxy, such as a rotating proxy, you can set up logic to switch IP if you get rate limited by your target. There is more you can do to make your code robust and keep it from crashing because a request failed on the first try.
