Commit adf1ccb

Initial Commit
0 parents  commit adf1ccb

5 files changed: +137 -0 lines changed


data.csv

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
"title","content","date","variant","images","verified","author","rating","product","url"
"Look just like the photo","I love the shoes they are true to size , I wear a 7 1/2 but I ordered a 8 to allow a little extra room and I got just that","18 Sep 2018","Size: 8 Color: White/Metallic Silver/Dark Grey","https://images-na.ssl-images-amazon.com/images/I/81wdRdaAfmL._SY88.jpg","Yes","Diane Johnson","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"After 20 returns","Just writing a rare review on these . I love Nike’s but my feet don’t usually. So I’ve ordered and returned a lot. Tried again lol and these ARE AMAZING comfortable. So much that I may order 3 more this year just to have them. The color is so cute and clean and sporty. I’m 99.9% sure I’ve dinally found a pair of Nikes I’m not going to return , fingers crossed 🤞😊","08 May 2019","Size: 10 Color: White/Metallic Silver/Dark Grey","https://images-na.ssl-images-amazon.com/images/I/717EKthL0BL._SY88.jpg","Yes","sherlain miranda","4.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Half size up :)","I have only run in them a couple of times, but so far they feel great. I normally wear a size 9, and I ordered a 9.5, and I’m glad I did. There is plenty of room, but not too much. The only thing that will take me getting used to is that they are slightly bulkier than my ASICS. Like any running shoes, I’m sure it will take a little bit of time to work out the stiffness and fully break them in, but they are amazing!","14 Dec 2018","Size: 9.5 Color: White/Metallic Silver/Dark Grey","https://images-na.ssl-images-amazon.com/images/I/71ap+mslLBL._SY88.jpg","Yes","Blondie","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Surprisingly comfortable","I have a very hard time finding shoes that are comfortable for me, I have severe fibromyalgia with no help from medications because nothing works for me. So I am constantly looking for comfortable clothing and sneakers. Usually nike isnt good for me but after deciding to give this pair a try I was happy I did. Not only are they nice looking but they really are comfortable. I wore them all day today did a lot of walking and have zero issues with my feet hurting. Worth the extra money","26 Aug 2018","Size: 8 Color: White/Metallic Silver/Dark Grey","","Yes","E Diaz","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Cute and Stylish..","Great workout shoes. Very comfortable.","04 Sep 2017","","https://images-na.ssl-images-amazon.com/images/I/713WLuPPK-L._SY88.jpg","Yes","Angel Buchanan","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Great, Comfortable Shoe","This is my second pair. They are the most comfortable tennis shoes I own, I can be on my feet all day and they don't hurt. I normally wear a 8.5 but in this shoe I wear a 9. I find the toe box is wider and more comfortable especially if I wear a little thicker sock, it's just roomy enough to my toes are scrunched together. I don't have a wide foot either.","18 Jul 2017","Size: 9 Color: White/Metallic Silver/Dark Grey","","Yes","msgrnbay","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Great shoes","This is my second pair. I think these shoes are comfy and aren't hard to keep looking fresh and white. I wear them to work every day and wash them about once a month and let them air dry. Love them.","02 Aug 2017","Size: 9 Color: White/Metallic Silver/Dark Grey","","Yes","Amy Klein","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Loving these shoes","I'm really loving these shoe's. A little pricey. I'm on my feet a lot for work plus I jog. I've worn them three times now and they're very comfortable. I wear a 6 1/2, purchase a 6 1/2, fits perfect. I like my shoes a little wide in the toe area. Read some reviews that the heels were too wide, I haven't an experience that.","15 Jul 2019","Size: 6.5 Color: White/Metallic Silver/Dark Grey","https://images-na.ssl-images-amazon.com/images/I/81V1HUrmAZL._SY88.jpg
https://images-na.ssl-images-amazon.com/images/I/81XLfLSdzvL._SY88.jpg","Yes","stephanie stone","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Gorgeous shoes","Gorgeous shoes! I love it. Nice for running","24 Oct 2019","Size: 6.5 Color: White/Metallic Silver/Dark Grey","https://images-na.ssl-images-amazon.com/images/I/71pw3tp3dWL._SY88.jpg
https://images-na.ssl-images-amazon.com/images/I/71wccS4Z-QL._SY88.jpg","Yes","Kamila","5.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"Three Stars","Look great but my feet are killing me during my walk and after my walk.","02 Aug 2018","Size: 8.5 M US Color: White/Metallic Silver/Dark Grey","","Yes","Michelle Finnegan","3.0","Nike Women's Reax Run 5 Running Shoes","https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
"
"4GB RAM NOT 8GB RAM!","This is NOT an 8GB RAM! It is only 4GB RAM! With that being said, it is advertised as an 8GB RAM, packaging was basically triple boxes, packaged very well! Came with backpack and mouse pad as described. Initial start up was slow, but moving fast now that it’s set up. Only gave 3 stars because the RAM is a significant part in purchasing the right laptop!","26 Nov 2019","","https://images-na.ssl-images-amazon.com/images/I/71T29bOsioL._SY88.jpg","Yes","Atara83","3.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Great Laptop, Good Value, well worth it","Amazon delivered my order promptly. This laptop bundle came in a solid double packaging, and it comes with a backpack and a mouse pad. I am so amazed by the backpack & mouse pad quality, and I don’t mind the little legendary logo both accessories. I had HP laptops previously and had been pleased with their performance. This one is exceptional for someone who isn't into a lot of gaming and such. I am self employed and I bought this laptop for my work, I only need basic internet browsing and need to do some basic excel reporting, so this computer is perfect for me and at an amazing price, especially it came with bonus accessories.","03 Sep 2019","","https://images-na.ssl-images-amazon.com/images/I/61+EuNBo3AL._SY88.jpg","Yes","Maggie","5.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Nice, but faulty!","I got a faulty laptop from day 1! Before I ordered I made sure to ask the seller if it’s NEW. Well, I got it and it shuts down every 2 minutes. They offered refund, but I live outside the US and returning this mess will cost me a lot. I thought it was tested/upgraded as stated. How can they missed mines when it shuts off right after start up? Now I’ve trust issues with purchasing electronics online now.","02 Nov 2019","","","Yes","Candice","1.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Pretty much everything that i was looking for in a large laptop","the box delivered on time and safe. it was double boxed and packed well. the included laptop bag is good quality and will work well when i need to use it. i would not buy this laptop for the bag but it is a nice value add. speed and response on this laptop is everything that i was looking for. set up was easy and came 1/2 charged so i didn't have to immediately plug in. My only concern is battery life. this is the largest laptop that i have ever had. i use mainly for school and work. i don't game or stream a lot of video. but in the 4 hours that i have had the laptop open today, the battery has already dropped to 25%. i don't know if that is to be expected with the multiple windows open and the large screen but i don't expect to get 8 hours out of the charge if i am on it. I will keep the power cable near by. if you are using less, you might make a full day on a charge.","31 Oct 2019","","","Yes","Gregg Stepp","4.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Excellent","Excellent computer, pretty easy setup, excellent quality backpack. I’d recommend this to anyone looking for a good computer at an affordable price.","02 Oct 2019","What's this?","","Yes","Megs77","5.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Nice laptop","Nice looking laptop. Very light. Quality is very good and very happy with this purchase. Got this laptop earlier than expected. Highly recommended seller and a product!","30 Aug 2019","","","Yes","T. Khadu","5.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"To be happy!!","I have nothing to say because since I opened the Laptop, already was defected. I was very disappointed that the next day I have to return it. However, AMAZON, customer services took care this issue. Amazon has so many customer services over sea like Filipinas that I was very happy how those tech support help you. Since, the price is so good, I decided to re buy it.","23 Aug 2019","","","Yes","Jorge","3.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Nice laptop bundle","It is an amazing back to school laptop bundle, HP laptop is fast and portable for my school study, and the backpack looks awesome. Overall, it is worth every penny. Thanks","19 Aug 2019","","","Yes","Amazon Customer","5.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Worst purchase of 2019!","Extremely dissatisfied with this laptops lack of speed and how low quality it looks and feels. It truly behaves like a 13 year old slow laptop and I have high speed internet. It lags and takes forever to turn on it is so slow that I can get most info on my cellphone so much faster. I tried twice to get Amazon to send me a new Factory sealed one and they sent two that had the security tape removed and laptop was not factory packed. If it wasnt too late to return this I would do it in a heartbeat! Worst purchase of 2019 hands down!","16 Dec 2019","","","Yes","Myreviews","1.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
"Works very good.","Nothing to dislike about it. Nice laptop for the price.","02 Sep 2019","What's this?","","Yes","Robert C","5.0","2019 HP 15.6 Inch HD Premium Business Laptop PC, Intel Dual-core i3-7100U, 8GB DDR4 RAM, 1TB HDD, USB 3.1, HDMI, WiFi, Bluetooth, Windows 10, W/ Legendary Computer Backpack & Mouse Pad Bundle","https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews"
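
Every field in data.csv is quoted, and the images column can hold several URLs separated by newlines, so one record may span more than one physical line. In the shoe reviews, the url field also ends with a newline before its closing quote. The following is a minimal sketch of reading such a file back with the standard `csv` module. It assumes the column names from the header row above; the `load_reviews` helper and the shortened sample record are illustrative and not part of the commit.

```python
import csv
import io


def load_reviews(fileobj):
    """Parse review rows, turning the newline-joined image URLs into a list."""
    rows = []
    for row in csv.DictReader(fileobj):
        # Some url values carry a trailing newline inside the quotes.
        row["url"] = row["url"].strip()
        # An empty images field means the review has no photos.
        row["images"] = row["images"].split("\n") if row["images"] else []
        row["rating"] = float(row["rating"])
        rows.append(row)
    return rows


# Shortened, made-up record in the same shape as the rows in data.csv.
SAMPLE = (
    '"title","content","date","variant","images","verified","author","rating","product","url"\n'
    '"Gorgeous shoes","Nice for running","24 Oct 2019","Size: 6.5",'
    '"https://example.com/a.jpg\nhttps://example.com/b.jpg","Yes","Kamila","5.0",'
    '"Example product","https://example.com/reviews\n"\n'
)

# newline="" leaves embedded line breaks for the csv module to interpret.
reviews = load_reviews(io.StringIO(SAMPLE, newline=""))
print(len(reviews), reviews[0]["images"], reviews[0]["url"])
```

With a real file, the same function could be called as `load_reviews(open("data.csv", newline="", encoding="utf-8"))`.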

requirements.txt

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
python-dateutil
requests
selectorlib

reviews.py

Lines changed: 61 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,61 @@
from selectorlib import Extractor
import requests
from time import sleep
import csv
from dateutil import parser as dateparser

# Create an Extractor by reading from the YAML file
e = Extractor.from_yaml_file('selectors.yml')


def scrape(url):
    headers = {
        'authority': 'www.amazon.com',
        'pragma': 'no-cache',
        'cache-control': 'no-cache',
        'dnt': '1',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (X11; CrOS x86_64 8172.45.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.64 Safari/537.36',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'sec-fetch-site': 'none',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-dest': 'document',
        'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
    }

    # Download the page using requests
    print("Downloading %s" % url)
    r = requests.get(url, headers=headers)
    # Simple check for a blocked page (usually status 503)
    if r.status_code >= 500:
        if "To discuss automated access to Amazon data please contact" in r.text:
            print("Page %s was blocked by Amazon. Please try using better proxies\n" % url)
        else:
            print("Page %s must have been blocked by Amazon as the status code was %d" % (url, r.status_code))
        return None
    # Pass the page HTML to the extractor and return the extracted data
    return e.extract(r.text)


# newline='' lets the csv module control line endings; utf-8 keeps emoji in reviews intact
with open("urls.txt", 'r') as urllist, open('data.csv', 'w', newline='', encoding='utf-8') as outfile:
    writer = csv.DictWriter(outfile, fieldnames=["title","content","date","variant","images","verified","author","rating","product","url"], quoting=csv.QUOTE_ALL)
    writer.writeheader()
    for line in urllist:
        # Strip the trailing newline so it reaches neither the request nor the CSV
        url = line.strip()
        if not url:
            continue
        data = scrape(url)
        if data:
            for r in data['reviews'] or []:
                r["product"] = data["product_title"]
                r['url'] = url
                # A missing badge (None) means the purchase was not verified
                if r.get('verified') and 'Verified Purchase' in r['verified']:
                    r['verified'] = 'Yes'
                else:
                    r['verified'] = 'No'
                r['rating'] = r['rating'].split(' out of')[0]
                date_posted = r['date'].split('on ')[-1]
                if r['images']:
                    r['images'] = "\n".join(r['images'])
                r['date'] = dateparser.parse(date_posted).strftime('%d %b %Y')
                writer.writerow(r)
        # sleep(5)
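
The per-review cleanup in reviews.py can also be written as a pure function, which makes it testable without any network access. What follows is a sketch under stated assumptions, not the commit's code. The raw date is assumed to look like "Reviewed in the United States on September 18, 2018" and the raw rating like "5.0 out of 5 stars"; neither raw string appears in the commit. `datetime.strptime` stands in for `dateutil` so the example has no third-party dependency, and `normalize_review` is a hypothetical name.

```python
from datetime import datetime


def normalize_review(raw, product, url):
    """Return a cleaned copy of one extracted review dict."""
    review = dict(raw)
    review["product"] = product
    review["url"] = url.strip()
    # The badge text is None when the selector matched nothing.
    badge = review.get("verified") or ""
    review["verified"] = "Yes" if "Verified Purchase" in badge else "No"
    # "5.0 out of 5 stars" -> "5.0"
    review["rating"] = (review.get("rating") or "").split(" out of")[0]
    # Keep only the text after the last "on ", e.g. "September 18, 2018".
    date_text = (review.get("date") or "").split("on ")[-1]
    # Assumed input format; dateutil's parser would accept more variants.
    review["date"] = datetime.strptime(date_text, "%B %d, %Y").strftime("%d %b %Y")
    # Join image URLs with newlines; no images becomes an empty string.
    review["images"] = "\n".join(review.get("images") or [])
    return review


# Made-up raw record in the shape the extractor is assumed to return.
raw = {
    "title": "Look just like the photo",
    "date": "Reviewed in the United States on September 18, 2018",
    "rating": "5.0 out of 5 stars",
    "verified": "Verified Purchase",
    "images": None,
}
cleaned = normalize_review(raw, "Example product", "https://example.com/reviews\n")
print(cleaned["date"], cleaned["rating"], cleaned["verified"])
```

Returning a copy instead of mutating the input keeps the raw extractor output available for debugging.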

selectors.yml

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
product_title:
    css: 'h1 a[data-hook="product-link"]'
    type: Text
reviews:
    css: 'div.review div.a-section.celwidget'
    multiple: true
    type: Text
    children:
        title:
            css: a.review-title
            type: Text
        content:
            css: 'div.a-row.review-data span.review-text'
            type: Text
        date:
            css: span.a-size-base.a-color-secondary
            type: Text
        variant:
            css: 'a.a-size-mini'
            type: Text
        images:
            css: img.review-image-tile
            multiple: true
            type: Attribute
            attribute: src
        verified:
            css: 'span[data-hook="avp-badge"]'
            type: Text
        author:
            css: span.a-profile-name
            type: Text
        rating:
            css: 'div.a-row:nth-of-type(2) > a.a-link-normal:nth-of-type(1)'
            type: Attribute
            attribute: title
next_page:
    css: 'li.a-last a'
    type: Link
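
selectors.yml defines a `next_page` selector, but reviews.py only fetches the first page of each URL. One way the link could be followed is sketched below. The assumptions are loud ones: `scrape_page` stands in for the `scrape(url)` function in reviews.py and is assumed to return a dict with `reviews` and `next_page` keys, or `None` when the page is blocked. `collect_reviews` and `max_pages` are hypothetical names, and the fake page table exists only so the example runs offline.

```python
from urllib.parse import urljoin


def collect_reviews(start_url, scrape_page, max_pages=5):
    """Follow next_page links from start_url, visiting at most max_pages pages."""
    reviews = []
    seen = set()
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        data = scrape_page(url)
        if not data:
            # Blocked or empty page: stop instead of guessing the next URL.
            break
        reviews.extend(data.get("reviews") or [])
        next_link = data.get("next_page")
        # The link may be relative, so resolve it against the current page.
        url = urljoin(url, next_link) if next_link else None
    return reviews


# Fake two-page site used in place of real HTTP requests.
PAGES = {
    "https://example.com/reviews?page=1": {
        "reviews": [{"title": "A"}, {"title": "B"}],
        "next_page": "/reviews?page=2",
    },
    "https://example.com/reviews?page=2": {
        "reviews": [{"title": "C"}],
        "next_page": None,
    },
}

all_reviews = collect_reviews("https://example.com/reviews?page=1", PAGES.get)
print([r["title"] for r in all_reviews])
```

The `seen` set guards against a page that links back to itself, and `max_pages` bounds the number of requests sent to the site.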

urls.txt

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
https://www.amazon.com/Nike-Womens-Reax-Running-Shoes/product-reviews/B07ZPL752N/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
https://www.amazon.com/HP-Business-Dual-core-Bluetooth-Legendary/product-reviews/B07VMDCLXV/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews
