[Python & Selenium ChromeDriver] I have a text file (sitemap.txt) that contains a list of URLs, and I want a script that visits each URL and checks whether a particular element (a script tag) is present. Is there a better way to read the URLs from the text file and loop over them in the automation?
Example: get a URL from the text file > open the URL in the Chrome browser > inspect the page > check that https://j.6sc.co/6si.min.js is present > go to the next line of the file
The text file looks like this:
<url><loc>https://zendesk.com</loc></url>
<url><loc>https://zendesk.com/service</loc></url>
CODE
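# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.by import By

def findnth(haystack, needle, n):
    # Return the index of the n-th (0-based) occurrence of needle, or -1.
    parts = haystack.split(needle, n + 1)
    if len(parts) <= n + 1:
        return -1
    return len(haystack) - len(parts[-1]) - len(needle)

driver = webdriver.Chrome()  # start one browser and reuse it for every URL
with open("sitemap.txt") as file:
    for line in file:
        # Only lines containing this prefix are checked.
        substring = "https://essentials.zendesk.com/"
        if substring in line:
            # The URL runs from the first 'h' to the "<" that opens </loc>.
            start = line.find('h')
            end = findnth(line, "<", 2)
            url = line[start:end]
            driver.get(url)
            # find_elements returns an empty list when nothing matches.
            source = driver.find_elements(By.XPATH, "//script[@src='https://j.6sc.co/6si.min.js']")
            print("%s: %s" % (url, "present" if source else "missing"))
driver.quit()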
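
One possible way to pull the URLs out of a file shaped like the sample above is a regular expression over the <loc> tags, instead of counting "<" characters. The sketch below is only an illustration under stated assumptions: the names extract_urls, check_urls, LOC_PATTERN and SCRIPT_SRC are hypothetical, every URL is assumed to sit inside a <loc>...</loc> pair on a single line, and the Selenium part assumes a version that provides By and find_elements plus a ChromeDriver that Selenium can locate.

```python
import re

# Matches the URL inside each <loc>...</loc> pair, tolerating stray whitespace.
LOC_PATTERN = re.compile(r"<loc>\s*(.*?)\s*</loc>")
# Script source taken from the question.
SCRIPT_SRC = "https://j.6sc.co/6si.min.js"


def extract_urls(text):
    """Return every URL found between <loc> tags, in file order."""
    return LOC_PATTERN.findall(text)


def check_urls(urls, script_src=SCRIPT_SRC):
    """Visit each URL in one Chrome session; map URL -> True if the script tag exists.

    Selenium is imported here so extract_urls can be used without it installed.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    xpath = "//script[@src='%s']" % script_src
    results = {}
    driver = webdriver.Chrome()
    try:
        for url in urls:
            driver.get(url)
            # find_elements returns an empty list instead of raising when absent.
            results[url] = bool(driver.find_elements(By.XPATH, xpath))
    finally:
        driver.quit()  # close the browser even if a page fails to load
    return results


# The two sample lines from the question.
sample = (
    "<url><loc>https://zendesk.com</loc></url>\n"
    "<url><loc>https://zendesk.com/service</loc></url>\n"
)
print(extract_urls(sample))
# prints ['https://zendesk.com', 'https://zendesk.com/service']
```

To run it against the real file, read sitemap.txt into a string, pass it to extract_urls, and hand the resulting list to check_urls. Keeping the parsing separate from the browser step means the URL extraction can be checked on its own before Chrome is launched.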
