Following Redirects & Getting the Final URL
By default, HTTParty follows redirects automatically. To get the final URL after all redirects have been followed, use response.request.last_uri:
response = HTTParty.get('https://mock.httpstatus.io/301')
final_url = response.request.last_uri.to_s
Explanation:
- HTTParty follows redirects automatically by default (no need to specify follow_redirects: true)
- response.request.last_uri returns the final URI after all redirects
- .to_s converts the URI object to a string
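To tell whether a request was redirected at all, one option is to compare the URL you asked for with the final one. The following is a minimal sketch, assuming a hypothetical helper named redirected? (it is not part of HTTParty) that takes both URLs as strings:

```ruby
require 'uri'

# Hypothetical helper (not part of HTTParty): reports whether the final URL
# differs from the URL that was originally requested.
def redirected?(requested_url, final_url)
  URI.parse(requested_url) != URI.parse(final_url)
end

# With HTTParty this might be called as:
#   response = HTTParty.get(url)
#   redirected?(url, response.request.last_uri.to_s)
puts redirected?('http://example.com/old', 'https://example.com/new')
puts redirected?('https://example.com/page', 'https://example.com/page')
```

Comparing parsed URI objects rather than raw strings is a design choice here; plain string comparison would also work if both values come from the same source.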
Getting the Redirect Location Without Following It
If you do not want to follow redirects but still want to know where the page redirects to (useful for web crawlers, for example), read response.headers['location']:
response = HTTParty.get('https://mock.httpstatus.io/301', follow_redirects: false)
if response.code >= 300 && response.code < 400
  redirect_url = response.headers['location']
end
Explanation:
- follow_redirects: false tells HTTParty not to follow redirects automatically
- response.code >= 300 && response.code < 400 checks if the response is a redirect (status codes 300-399)
- response.headers['location'] contains the URL that the page is redirecting to
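Not every 3xx response carries a Location header (304 Not Modified, for instance, does not), so it can help to wrap the check in a small function. This is a sketch only: redirect_target and FakeResponse are hypothetical names, and the struct merely stands in for an HTTParty response so the example runs offline.

```ruby
# Hypothetical helper: returns the Location header for a redirect response,
# or nil when the response is not a redirect or has no usable Location.
def redirect_target(response)
  return nil unless (300..399).cover?(response.code)

  location = response.headers['location']
  location.nil? || location.empty? ? nil : location
end

# Stand-in for an HTTParty response (assumption: only `code` and `headers`
# are needed). Real HTTParty headers are case-insensitive; this plain Hash
# is not, so the key is written in lowercase.
FakeResponse = Struct.new(:code, :headers)

moved = FakeResponse.new(301, { 'location' => 'https://example.com/new' })
not_modified = FakeResponse.new(304, {})

puts redirect_target(moved).inspect
puts redirect_target(not_modified).inspect
```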
Configuration Options
HTTParty provides several options for controlling redirect behavior:
# Disable automatic redirect following
response = HTTParty.get('http://example.com', follow_redirects: false)
# Limit the length of the redirect chain (the default limit is 5)
response = HTTParty.get('http://example.com', limit: 3)
# Set a timeout for requests
response = HTTParty.get('http://example.com', timeout: 10)
# Combine multiple options
response = HTTParty.get('http://example.com',
  limit: 5,
  timeout: 10,
  headers: { 'User-Agent' => 'MyBot/1.0' }
)
Explanation:
- follow_redirects: false disables automatic redirect following
- limit: 3 caps the redirect chain; when the cap is exceeded, HTTParty raises HTTParty::RedirectionTooDeep instead of returning a response. Note that follow_redirects only takes true or false, so the limit is a separate option
- timeout: 10 sets a 10-second timeout for the request
- You can combine multiple options in a single request
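When the same options are reused across many requests, one approach is to keep them in a single hash and override only what differs per call. The sketch below is an assumption about how you might organize this, not an HTTParty feature: DEFAULT_REQUEST_OPTIONS and request_options are hypothetical names, while the keys are the request options discussed above.

```ruby
# Hypothetical defaults for a crawler; the keys are HTTParty request options.
DEFAULT_REQUEST_OPTIONS = {
  follow_redirects: true,
  timeout: 10,
  headers: { 'User-Agent' => 'MyBot/1.0' }
}.freeze

# Hypothetical helper: merges per-request overrides into the defaults.
# Headers are merged key by key so an override does not drop the User-Agent.
def request_options(overrides = {})
  merged = DEFAULT_REQUEST_OPTIONS.merge(overrides)
  merged[:headers] = DEFAULT_REQUEST_OPTIONS[:headers].merge(overrides.fetch(:headers, {}))
  merged
end

# The result could then be passed along, for example:
#   HTTParty.get('http://example.com', request_options(follow_redirects: false))
puts request_options(follow_redirects: false).inspect
```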
Getting the Redirect Chain
If you want to see all the URLs in the redirect chain, you can enable debug output or manually track redirects:
require 'httparty'
require 'uri'
class RedirectFollower
  include HTTParty
  def self.get_with_redirects(url, max_redirects = 5)
    # Start the chain with the requested URL so each URL is recorded once
    redirect_chain = [url]
    current_url = url
    response = get(current_url, follow_redirects: false)
    max_redirects.times do
      location = response.headers['location']
      break unless response.code >= 300 && response.code < 400 && location
      # Resolve relative Location values against the current URL
      current_url = URI.join(current_url, location).to_s
      redirect_chain << current_url
      response = get(current_url, follow_redirects: false)
    end
    # The last response fetched is the final one, so no extra request is needed
    {
      redirect_chain: redirect_chain,
      final_url: current_url,
      response: response
    }
  end
end
result = RedirectFollower.get_with_redirects('https://mock.httpstatus.io/301')
puts "Redirect chain: #{result[:redirect_chain]}"
puts "Final URL: #{result[:final_url]}"
Explanation:
- This custom class tracks each URL in the redirect chain
- max_redirects prevents infinite redirect loops
- redirect_chain is an array containing all URLs visited during redirects
- The get_with_redirects method returns a hash with the redirect chain, final URL, and final response
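A hop limit stops a loop eventually, but a crawler may want to detect the loop as soon as a URL repeats. The following sketch shows one way to do that. Everything in it is hypothetical naming (trace_redirects, FakePage, PAGES), and the fetcher is passed in as a callable so the example runs without network access.

```ruby
require 'uri'

# Hypothetical helper: follows redirects by hand and records every URL visited.
# `fetch` is any callable that takes a URL string and returns an object that
# responds to `code` and `headers`; with HTTParty it could be
#   ->(url) { HTTParty.get(url, follow_redirects: false) }
def trace_redirects(start_url, fetch, max_hops: 5)
  chain = [start_url]
  current = start_url

  max_hops.times do
    response = fetch.call(current)
    location = response.headers['location']
    break unless (300..399).cover?(response.code) && location

    # Resolve relative Location values against the URL that produced them.
    current = URI.join(current, location).to_s
    # A URL seen before means the chain loops back on itself.
    return { chain: chain, final_url: nil, loop: true } if chain.include?(current)

    chain << current
  end

  # If max_hops ran out, final_url is the last URL seen and may still redirect.
  { chain: chain, final_url: current, loop: false }
end

# Canned responses standing in for real HTTP requests.
FakePage = Struct.new(:code, :headers)
PAGES = {
  'http://example.com/a' => FakePage.new(301, { 'location' => '/b' }),
  'http://example.com/b' => FakePage.new(302, { 'location' => 'http://example.com/c' }),
  'http://example.com/c' => FakePage.new(200, {})
}.freeze

result = trace_redirects('http://example.com/a', ->(url) { PAGES.fetch(url) })
puts result[:chain].inspect
puts result[:final_url]
```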
Handling Different Redirect Types
Different HTTP redirect status codes have different meanings. Here's how to handle them:
response = HTTParty.get('http://example.com', follow_redirects: false)
case response.code
when 301
  # Permanent redirect
  redirect_url = response.headers['location']
  puts "Permanently moved to: #{redirect_url}"
when 302
  # Temporary redirect (Found)
  redirect_url = response.headers['location']
  puts "Temporarily redirected to: #{redirect_url}"
when 303
  # See Other (should use GET for redirect)
  redirect_url = response.headers['location']
  puts "See other location: #{redirect_url}"
when 307
  # Temporary redirect (preserves method)
  redirect_url = response.headers['location']
  puts "Temporary redirect (method preserved): #{redirect_url}"
when 308
  # Permanent redirect (preserves method)
  redirect_url = response.headers['location']
  puts "Permanent redirect (method preserved): #{redirect_url}"
else
  puts "No redirect (status: #{response.code})"
end
Explanation:
- 301 (Moved Permanently) - the resource has permanently moved to a new location
- 302 (Found) - the resource temporarily resides at a different location
- 303 (See Other) - the response to the request can be found at another URI using GET
- 307 (Temporary Redirect) - similar to 302, but guarantees the request method won't change
- 308 (Permanent Redirect) - similar to 301, but guarantees the request method won't change
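The practical difference between these codes is which HTTP method the follow-up request uses. The sketch below encodes that rule as a lookup. method_after_redirect is a hypothetical helper, and the 301/302 branch is an assumption that mirrors what most clients have historically done (rewriting POST to GET); it is not a statement about what HTTParty itself does.

```ruby
# Hypothetical helper: which HTTP method a client would typically use for the
# follow-up request after a redirect. Returns nil for non-redirect statuses.
def method_after_redirect(status, original_method)
  case status
  when 303
    # See Other: retrieve the target with GET (HEAD stays HEAD).
    original_method == :head ? :head : :get
  when 301, 302
    # Assumption: follows the common client behavior of turning POST into GET.
    original_method == :post ? :get : original_method
  when 307, 308
    # The method must not change.
    original_method
  end
end

[301, 302, 303, 307, 308].each do |status|
  puts "#{status}: POST becomes #{method_after_redirect(status, :post).to_s.upcase}"
end
```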
Handling Relative Redirect URLs
Sometimes redirect locations are relative URLs and need to be combined with the base URL:
require 'uri'
response = HTTParty.get('http://example.com/page', follow_redirects: false)
if response.code >= 300 && response.code < 400
  redirect_location = response.headers['location']
  base_uri = URI.parse(response.request.last_uri.to_s)
  # Handle both absolute and relative URLs
  redirect_url = URI.join(base_uri, redirect_location).to_s
  puts "Redirect URL: #{redirect_url}"
end
Explanation:
- URI.parse(response.request.last_uri.to_s) gets the base URI of the original request
- URI.join(base_uri, redirect_location) combines the base URI with the redirect location, handling both absolute and relative URLs correctly
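To make the resolution rules concrete, here is how URI.join treats a few kinds of Location values. The base URL and the paths are made-up examples; only the standard library is used.

```ruby
require 'uri'

base = 'http://example.com/docs/page'

puts URI.join(base, '/login')                   # => http://example.com/login
puts URI.join(base, 'other')                    # => http://example.com/docs/other
puts URI.join(base, '../up')                    # => http://example.com/up
puts URI.join(base, 'https://other.example/x')  # => https://other.example/x
```

A leading slash replaces the whole path, a bare name replaces only the last path segment, and an absolute URL ignores the base entirely.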
Practical Example: Building a URL Checker
Here's a complete example that checks URLs and reports their redirect behavior:
require 'httparty'
require 'uri'
class URLChecker
  include HTTParty
  def self.check_url(url)
    # First, check without following redirects
    response = get(url, follow_redirects: false, timeout: 10)
    if response.code >= 300 && response.code < 400
      redirect_location = response.headers['location']
      base_uri = URI.parse(url)
      redirect_url = URI.join(base_uri, redirect_location).to_s
      {
        original_url: url,
        status: response.code,
        redirects: true,
        redirect_url: redirect_url,
        redirect_type: redirect_type(response.code)
      }
    else
      {
        original_url: url,
        status: response.code,
        redirects: false,
        redirect_url: nil,
        redirect_type: nil
      }
    end
  rescue StandardError => e
    # Network failures, timeouts, and malformed URLs all end up here
    {
      original_url: url,
      error: e.message
    }
  end
  def self.redirect_type(code)
    case code
    when 301 then "Permanent"
    when 302 then "Temporary (Found)"
    when 303 then "See Other"
    when 307 then "Temporary (Method Preserved)"
    when 308 then "Permanent (Method Preserved)"
    else "Unknown"
    end
  end
end
# Usage
result = URLChecker.check_url('https://mock.httpstatus.io/301')
puts "Status: #{result[:status]}"
puts "Redirects: #{result[:redirects]}"
puts "Redirect URL: #{result[:redirect_url]}" if result[:redirects]
puts "Redirect Type: #{result[:redirect_type]}" if result[:redirects]
Explanation:
- This class provides a complete URL checking solution
- It handles both redirects and non-redirects
- It properly handles relative redirect URLs
- It includes error handling for network issues
- It reports the redirect type for better understanding
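The result hash has three shapes: redirect, no redirect, and error. A reporting step needs to handle all of them. Here is a minimal sketch of such a step, assuming a hypothetical helper named describe_result and sample hashes written by hand to match the keys used above, so it runs without making any requests.

```ruby
# Hypothetical helper: turns a result hash shaped like the ones returned by
# URLChecker.check_url into a one-line report, including the error case.
def describe_result(result)
  if result[:error]
    "#{result[:original_url]}: error (#{result[:error]})"
  elsif result[:redirects]
    "#{result[:original_url]}: #{result[:status]} #{result[:redirect_type]} -> #{result[:redirect_url]}"
  else
    "#{result[:original_url]}: #{result[:status]} (no redirect)"
  end
end

# Hand-written sample results (illustrative values, not real responses).
samples = [
  { original_url: 'https://example.com/old', status: 301, redirects: true,
    redirect_url: 'https://example.com/new', redirect_type: 'Permanent' },
  { original_url: 'https://example.com/', status: 200, redirects: false,
    redirect_url: nil, redirect_type: nil },
  { original_url: 'https://unreachable.example/', error: 'execution expired' }
]

samples.each { |result| puts describe_result(result) }
```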
Summary
Scenario | Method |
---|---|
Get redirect URL without following | response.headers['location'] with follow_redirects: false |
Get final URL after following redirects | response.request.last_uri.to_s |
Check if response is a redirect | response.code >= 300 && response.code < 400 |
Handle relative redirect URLs | Use URI.join(base_uri, redirect_location) |
Limit number of redirects | limit: n (raises HTTParty::RedirectionTooDeep when exceeded) |