There are multiple ways to download files using Python. Some of them are:
requests module
urllib module
wget module

urllib module
The urllib module can be used to download files over HTTP.
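It is part of the Python standard library, so no installation is needed.
The urlretrieve() function of the urllib.request module downloads a file.
In the example below, urlretrieve() takes two arguments: the URL to download and the local filename to save it under.
The following example downloads the free e-book Python for Everybody by Dr. Charles R. Severance.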
import urllib.request
url = 'http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf'
urllib.request.urlretrieve(url, 'python_pdf.pdf')
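The Python documentation lists urlretrieve() under its legacy interface. A minimal sketch of the same download using urlopen() and shutil.copyfileobj() might look like the following; the function name download_with_urlopen and the 30-second timeout are illustrative assumptions, not part of the original example.

```python
import shutil
import urllib.request


def download_with_urlopen(url, filename, timeout=30):
    """Stream the resource at url into filename and return filename."""
    # urlopen() returns a file-like response; the context manager closes it.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(filename, 'wb') as output_file:
            # Copy block by block so the whole file is never held in memory.
            shutil.copyfileobj(response, output_file)
    return filename
```

For instance, download_with_urlopen(url, 'python_pdf.pdf') would save the same e-book. Network and HTTP failures surface as urllib.error.URLError (HTTPError is a subclass of it).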

requests module
The requests module can also be used to download files.
It can be installed using pip:
pip install requests
For Python 3:
pip3 install requests
The following example downloads the free e-book Python for Everybody by Dr. Charles R. Severance.
import requests
url = 'http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf'
r = requests.get(url)
filename = 'p4e.pdf'
with open(filename, 'wb') as output_file:
    # Write the content of the downloaded file to a local file
    output_file.write(r.content)
print(filename, 'is downloaded')
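The example above saves whatever the server returns, including an error page. A hedged variation that checks the HTTP status and sets a timeout could be sketched as follows; the function name download and the 30-second timeout are assumptions made for illustration.

```python
import requests


def download(url, filename, timeout=30):
    """Download url to filename, raising on HTTP errors, and return filename."""
    # Without a timeout, requests can wait indefinitely on a stalled server.
    r = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of saving them.
    r.raise_for_status()
    with open(filename, 'wb') as output_file:
        output_file.write(r.content)
    return filename
```

Calling download(url, 'p4e.pdf') would then behave like the example above, except that a 404 or 500 response raises an exception instead of producing a file.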
Depending on the file size, this can take some time.
The open(filename, 'wb') call opens a local file in binary write mode, and output_file.write(r.content) writes the downloaded content to it.

wget module
wget is another Python module that can download files.
It can be installed using pip:
pip install wget
For Python 3:
pip3 install wget
The download() function of wget is used to download a file.
It takes the URL and the local filename as parameters.
import wget
url = 'http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf'
wget.download(url, 'p4e_wget.pdf')
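All the examples so far hard-code the local filename. A small helper, sketched here with the standard library only, could derive one from the URL path instead; the name filename_from_url and the fallback 'download.bin' are hypothetical and do not come from any of the modules discussed.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default='download.bin'):
    """Return the last path segment of url, or default if there is none."""
    path = urlsplit(url).path
    # URL paths use forward slashes on every platform, hence posixpath.
    name = posixpath.basename(unquote(path))
    return name or default
```

For the e-book URL used above, filename_from_url(url) gives 'pythonlearn.pdf', which can be passed as the local filename to any of the download calls shown here.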

Files can be downloaded in smaller chunks using the requests module.
The requests.get() function takes an optional stream parameter which, when set to True, streams the file content instead of loading it into memory all at once.
import requests
import shutil
def download_file(url, local_filename):
    # stream=True defers downloading the body until r.raw is read
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            shutil.copyfileobj(r.raw, f)
    return local_filename

url = 'http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf'
local_filename = "p4e_streamed.pdf"
download_file(url, local_filename)
r.raw exposes the response body as a file-like object; it requires stream=True in the get() call.
The copyfileobj() function of shutil writes the content to the file as it is streamed.
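One caveat: r.raw hands over the bytes exactly as they arrive, so a response sent with Content-Encoding: gzip would be saved still compressed. A sketch of an alternative based on iter_content(), which decodes such encodings while still streaming, is shown here; the function name download_in_chunks, the 8192-byte chunk size, and the timeout are illustrative assumptions.

```python
import requests


def download_in_chunks(url, local_filename, chunk_size=8192):
    """Stream url to local_filename and return the number of bytes written."""
    written = 0
    with requests.get(url, stream=True, timeout=30) as r:
        r.raise_for_status()
        with open(local_filename, 'wb') as f:
            # iter_content() yields the body piece by piece as it is read.
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

The returned byte count can be compared with the Content-Length header, when the server sends one, as a rough completeness check.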

Some URLs act as redirects to the actual download link.
The requests module can be used to download files from such redirect links in Python.
get() follows redirects by default; passing allow_redirects=True, as in the code below, makes this explicit.
import requests
url = 'https://readthedocs.org/projects/python-guide/downloads/pdf/latest/'
response = requests.get(url, allow_redirects=True)
# Use a context manager so the file is closed after writing
with open('python-guide.pdf', 'wb') as output_file:
    output_file.write(response.content)
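To see where a redirect actually led, the response object keeps the intermediate responses in response.history and the final address in response.url. A minimal sketch, assuming a hypothetical helper name download_following_redirects and an arbitrary 30-second timeout:

```python
import requests


def download_following_redirects(url, filename, timeout=30):
    """Download url, following redirects, and return the list of URLs visited."""
    response = requests.get(url, allow_redirects=True, timeout=timeout)
    response.raise_for_status()
    # response.history holds the redirect responses, oldest first;
    # response.url is the final URL the content was served from.
    chain = [hop.url for hop in response.history] + [response.url]
    with open(filename, 'wb') as output_file:
        output_file.write(response.content)
    return chain
```

Printing the returned list shows each hop, which helps confirm that the saved file came from the expected download link.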