Crawler case 2 - one of three ways to crawl videos: the selenium chapter (2)

Popularity:471 ℃/2024-09-11 19:51:11

Contents
  • Preface
  • Introduction to selenium
  • Practical example
  • Words of encouragement
  • Blog

Preface

The previous article used the Requests library to crawl the Haokan ("good-looking") video site. This article crawls the same video site with selenium, a third-party Python library. A follow-up article will do the same with another third-party library, DrissionPage.

Introduction to selenium

selenium is a toolset for web application testing. It drives a real browser directly and operates the page just as a user would. It is mainly used for automated testing, web crawling, and automating routine browser tasks. selenium provides bindings for many programming languages, such as Java, Python, and C#, so developers can write scripts in the language they already use.
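
To make that description concrete, here is a minimal sketch of the Python bindings: start a browser, open a page, read one element, and close the browser. The function name `read_first_heading`, the use of Chrome, the `https://example.com` address, and the `h1` lookup are all assumptions chosen for illustration; none of them come from the script in this article.

```python
def read_first_heading(url="https://example.com"):
    """Open `url` in a browser and return the text of its first <h1> element."""
    # Imported inside the function so the sketch can be loaded
    # even on a machine where selenium is not installed yet.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome is available locally
    try:
        driver.get(url)  # navigate, as a user typing the address would
        heading = driver.find_element(By.TAG_NAME, "h1")  # locate an element
        return heading.text  # visible text of that element
    finally:
        driver.quit()  # always release the browser, even if a step fails


if __name__ == "__main__":
    print(read_first_heading())
```

The `try`/`finally` is a design choice rather than a requirement: it keeps stray browser windows from piling up when a page fails to load or an element is missing.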

Practical example

Without further ado, here is the source code.

from selenium import webdriver  # browser driver
from selenium.webdriver.common.by import By  # used to locate elements on a web page
import time  # time functions
import os  # file management module
import requests  # HTTP request module


if not os.path.exists('./videos1'):  # create the output folder if it is missing
    os.mkdir('./videos1')


def video(data):  # download the video behind each detail-page address
    for url in data:  # iterate over the detail-page addresses
        driver = webdriver.Chrome()  # start a browser instance (Chrome assumed)
        driver.get(url)  # open the detail page
        src = driver.find_element(by=By.CLASS_NAME, value='art-video')  # locate the video element
        src = src.get_attribute('src')  # address of the video file
        name = driver.find_element(by=By.CLASS_NAME, value='videoinfo-title')  # locate the title element
        name = name.text  # title text, used as the file name
        video_detail = requests.get(src).content  # request the video file
        with open('./videos1/' + name + '.mp4', 'wb') as f:  # store the video
            f.write(video_detail)
        print(name, src)
        driver.quit()  # close the browser


driver = webdriver.Chrome()  # start a browser instance (Chrome assumed)
driver.get("/")  # open the video site's home page (full URL elided)
for i in range(1, 6):
    driver.execute_script("document.documentElement.scrollTop=2000")  # scroll down the page
    time.sleep(1)
time.sleep(2)
data_video = driver.find_elements(by=By.CLASS_NAME, value='videoItem_videoitem__Z_x08')  # locate the video entries
data = []  # list of detail-page addresses
for a in data_video:
    href = a.get_attribute("href")  # detail-page address of this video
    data.append(href)
print(data)
time.sleep(2)
driver.quit()  # close the browser
video(data)  # call the video() function
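
Two spots in the script are fragile in practice. The video title goes straight into a file name, so a title containing `/` or `?` makes `open()` fail. The page is also scrolled a fixed number of times, which may stop before all lazily loaded entries have appeared. Below is a minimal sketch of two helpers that could address this. The names `safe_filename` and `scroll_until_stable`, the character list, and the default limits are hypothetical choices, not part of the original script.

```python
import re
import time


def safe_filename(title, max_length=100):
    """Turn a video title into a file name usable on common operating systems."""
    # Replace characters that are not allowed in file names on Windows or Unix.
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', "_", title)
    # Limit the length, then trim spaces and dots from both ends
    # (Windows rejects names that end with either).
    cleaned = cleaned[:max_length].strip(" .")
    return cleaned or "untitled"


def scroll_until_stable(driver, pause=1.0, max_rounds=10):
    """Scroll to the bottom until the page height stops growing.

    `driver` only needs an execute_script() method, as selenium drivers have.
    Returns the number of scrolls performed.
    """
    get_height = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(get_height)
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        rounds += 1
        time.sleep(pause)  # give lazily loaded entries time to arrive
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            break  # nothing new was loaded, so the list is complete
        last_height = new_height
    return rounds
```

In the script above, `scroll_until_stable(driver)` would stand in for the `for i in range(1, 6)` loop, and the output path would become `'./videos1/' + safe_filename(name) + '.mp4'`. The `max_rounds` cap guards against pages that keep loading content indefinitely.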

Words of encouragement

Ability determines the lower limit; opportunity determines the upper limit.

Blog

  • I am a penetration-testing enthusiast. From time to time I post real-world penetration-testing cases on my WeChat official account (laity's path to penetration testing). If you are interested, follow along and we can make progress together.
  • I previously published an article on the official account about cracking WiFi with Kali; interested readers can take a look. The corresponding instructional videos are on Bilibili (uploader: laity1717).