python - requests

  • requests is a Python library for HTTP communication.
  • You can simply think of it as something that "sends data to and receives data from resources on the web."

Install requests

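  • Let's start by installing it.
  • I use conda, so I install it with conda install requests.
  • It pulls in quite a few other libraries as dependencies, ha ha. Naturally, I install them all. (The environment was new, so the list below also includes Python itself.)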
conda install requests
Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.8.3
  latest version: 4.9.2

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /Users/.../opt/anaconda3/envs/python_scratch

  added / updated specs:
    - requests


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    brotlipy-0.7.0             |py39h9ed2024_1003         333 KB
    certifi-2020.12.5          |   py39hecd8cb5_0         141 KB
    cffi-1.14.5                |   py39h2125817_0         216 KB
    chardet-4.0.0              |py39hecd8cb5_1003         195 KB
    cryptography-3.4.7         |   py39h2fd3fbb_0         693 KB
    idna-2.10                  |     pyhd3eb1b0_0          52 KB
    pip-21.0.1                 |   py39hecd8cb5_0         1.8 MB
    pysocks-1.7.1              |   py39hecd8cb5_0          31 KB
    python-3.9.2               |       h88f2d9e_0         9.9 MB
    requests-2.25.1            |     pyhd3eb1b0_0          52 KB
    setuptools-52.0.0          |   py39hecd8cb5_0         724 KB
    six-1.15.0                 |   py39hecd8cb5_0          27 KB
    sqlite-3.35.3              |       hce871da_0         1.1 MB
    tzdata-2020f               |       h52ac0ba_0         113 KB
    urllib3-1.26.4             |     pyhd3eb1b0_0         105 KB
    ------------------------------------------------------------
                                           Total:        15.4 MB

The following NEW packages will be INSTALLED:

  brotlipy           pkgs/main/osx-64::brotlipy-0.7.0-py39h9ed2024_1003
  ca-certificates    pkgs/main/osx-64::ca-certificates-2021.1.19-hecd8cb5_1
  certifi            pkgs/main/osx-64::certifi-2020.12.5-py39hecd8cb5_0
  cffi               pkgs/main/osx-64::cffi-1.14.5-py39h2125817_0
  chardet            pkgs/main/osx-64::chardet-4.0.0-py39hecd8cb5_1003
  cryptography       pkgs/main/osx-64::cryptography-3.4.7-py39h2fd3fbb_0
  idna               pkgs/main/noarch::idna-2.10-pyhd3eb1b0_0
  libcxx             pkgs/main/osx-64::libcxx-10.0.0-1
  libffi             pkgs/main/osx-64::libffi-3.3-hb1e8313_2
  ncurses            pkgs/main/osx-64::ncurses-6.2-h0a44026_1
  openssl            pkgs/main/osx-64::openssl-1.1.1k-h9ed2024_0
  pip                pkgs/main/osx-64::pip-21.0.1-py39hecd8cb5_0
  pycparser          pkgs/main/noarch::pycparser-2.20-py_2
  pyopenssl          pkgs/main/noarch::pyopenssl-20.0.1-pyhd3eb1b0_1
  pysocks            pkgs/main/osx-64::pysocks-1.7.1-py39hecd8cb5_0
  python             pkgs/main/osx-64::python-3.9.2-h88f2d9e_0
  readline           pkgs/main/osx-64::readline-8.1-h9ed2024_0
  requests           pkgs/main/noarch::requests-2.25.1-pyhd3eb1b0_0
  setuptools         pkgs/main/osx-64::setuptools-52.0.0-py39hecd8cb5_0
  six                pkgs/main/osx-64::six-1.15.0-py39hecd8cb5_0
  sqlite             pkgs/main/osx-64::sqlite-3.35.3-hce871da_0
  tk                 pkgs/main/osx-64::tk-8.6.10-hb0a8c7a_0
  tzdata             pkgs/main/noarch::tzdata-2020f-h52ac0ba_0
  urllib3            pkgs/main/noarch::urllib3-1.26.4-pyhd3eb1b0_0
  wheel              pkgs/main/noarch::wheel-0.36.2-pyhd3eb1b0_0
  xz                 pkgs/main/osx-64::xz-5.2.5-h1de35cc_0
  zlib               pkgs/main/osx-64::zlib-1.2.11-h1de35cc_3


Proceed ([y]/n)?  


Downloading and Extracting Packages
sqlite-3.35.3        | 1.1 MB    | ################################################################################################################################################################# | 100% 
idna-2.10            | 52 KB     | ################################################################################################################################################################# | 100% 
tzdata-2020f         | 113 KB    | ################################################################################################################################################################# | 100% 
cffi-1.14.5          | 216 KB    | ################################################################################################################################################################# | 100% 
six-1.15.0           | 27 KB     | ################################################################################################################################################################# | 100% 
brotlipy-0.7.0       | 333 KB    | ################################################################################################################################################################# | 100% 
python-3.9.2         | 9.9 MB    | ################################################################################################################################################################# | 100% 
urllib3-1.26.4       | 105 KB    | ################################################################################################################################################################# | 100% 
certifi-2020.12.5    | 141 KB    | ################################################################################################################################################################# | 100% 
cryptography-3.4.7   | 693 KB    | ################################################################################################################################################################# | 100% 
pip-21.0.1           | 1.8 MB    | ################################################################################################################################################################# | 100% 
pysocks-1.7.1        | 31 KB     | ################################################################################################################################################################# | 100% 
setuptools-52.0.0    | 724 KB    | ################################################################################################################################################################# | 100% 
chardet-4.0.0        | 195 KB    | ################################################################################################################################################################# | 100% 
requests-2.25.1      | 52 KB     | ################################################################################################################################################################# | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
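
To check that the install actually landed in the active environment, a quick sketch like the following should do. It uses only the standard library's importlib.metadata (Python 3.8+); installed_version is a hypothetical helper name introduced here for illustration, not part of requests or conda.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing.

    `installed_version` is a hypothetical helper written for this post.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string (e.g. 2.25.1 for the install above),
# or None if requests is not installed in this environment.
print(installed_version("requests"))
```

If this prints None, the most likely cause is that a different conda environment is active than the one the package was installed into.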

Tutorial - GET google result

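  • Let's use requests to access Google, search for python, and fetch the result.
  • First, open Google in a web browser, search for "python", and look at the URL at that point. It looks like the URL shown below.
  • The q=python part in the middle stands out: it means that the browser sent {"q": "python"} to Google as a parameter of a GET request.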
https://www.google.co.kr/search?q=python&... 
  • So with requests, we just send the same parameter the same way, in a GET request.
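
To make the link between the URL and the parameter dict concrete, here is a small sketch that uses only the standard library's urllib.parse (not requests itself). The URL is the one above with everything after q=python left out, which is a simplification for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The browser URL from above, trimmed to the single parameter discussed here.
browser_url = "https://www.google.co.kr/search?q=python"

# Split off the query string and parse it into a dict (values come back as lists).
parts = urlsplit(browser_url)
params = parse_qs(parts.query)
print(parts.query)  # q=python
print(params)       # {'q': ['python']}

# Going the other way: a dict is encoded into a query string.
query = {"q": "python"}
print(urlencode(query))  # q=python
```

When a dict is passed as params= to requests.get, requests performs this dict-to-query-string step itself and appends the result to the URL.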

Do it

  • The code is as follows.
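import requests

"""
Connect to Google at the URL below
and save the result to google_result.html:
https://www.google.co.kr/search?q=python
"""

# Same host as the URL above; requests adds the "?" and the query string itself
target_url = "https://www.google.co.kr/search"
query = {"q": "python"}

# Send a GET request to target_url, passing query as the URL parameters
r = requests.get(target_url, params=query)

# Write as UTF-8 explicitly so the platform's default encoding cannot break the write
with open("google_result.html", "w", encoding="utf-8") as f:
    f.write(r.text)  # type(r.text): str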
  • If you open the saved google_result.html in a web browser, you can see that it really is a Google result page.
  • The code above uses GET, but other methods, including POST, are also possible. I'll try those later when I need them.
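
As a rough preview of what a POST would look like, the sketch below builds a POST request with requests but does not send it, so it runs without any network access. The endpoint https://example.com/submit and the form field name are hypothetical placeholders, not anything from the Google example above.

```python
import requests

# Hypothetical endpoint and form data, used only for illustration.
url = "https://example.com/submit"
form = {"name": "python"}

# Build the request and prepare it without sending it.
prepared = requests.Request("POST", url, data=form).prepare()

print(prepared.method)                   # POST
print(prepared.url)                      # https://example.com/submit
print(prepared.headers["Content-Type"])  # application/x-www-form-urlencoded
print(prepared.body)                     # name=python

# Actually sending it would be: requests.post(url, data=form, timeout=10)
```

The difference from GET is visible here: with data=, the parameters travel in the request body rather than in the URL's query string.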

Wrap-up

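  • Until now, I have always used selenium to crawl data from the web. selenium creates a browser instance, such as Firefox, and fetches data through that browser. Because the browser itself is fairly heavy, this approach is somewhat slow and resource-hungry.
  • requests, by contrast, appears to talk to the server directly without a browser, so it feels lighter to me. The next time I need to crawl data, using requests rather than selenium will likely be more efficient.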
