Commit ca91998

Update README.md
1 parent 85b2aed commit ca91998

README.md

Lines changed: 47 additions & 1 deletion
@@ -1 +1,47 @@
-# scrape
+# scrape cli
+
+It's the command-line version of the [great scraping tool](https://github.com/jeroenjanssens/data-science-at-the-command-line/blob/master/tools/scrape) written by [Jeroen Janssens](http://jeroenjanssens.com).
+
+It extracts HTML elements using an XPath query or CSS3 selector.
+
+Example usage:
+
+```
+$ curl -L 'http://en.wikipedia.org/wiki/List_of_sovereign_states' -s \
+| scrape -be 'table.wikitable > tbody > tr > td > b > a'
+```
+
+It gives you back:
+
+```html
+<html>
+<head>
+</head>
+<body>
+<a href="/wiki/Afghanistan" title="Afghanistan">
+Afghanistan
+</a>
+<a href="/wiki/Albania" title="Albania">
+Albania
+</a>
+<a href="/wiki/Algeria" title="Algeria">
+Algeria
+</a>
+<a href="/wiki/Andorra" title="Andorra">
+Andorra
+</a>
+<a href="/wiki/Angola" title="Angola">
+Angola
+</a>
+<a href="/wiki/Antigua_and_Barbuda" title="Antigua and Barbuda">
+Antigua and Barbuda
+</a>
+<a href="/wiki/Argentina" title="Argentina">
+Argentina
+</a>
+<a href="/wiki/Armenia" title="Armenia">
+Armenia
+</a>
+</body>
+</html>
+```
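
For readers who want to see what the selector in the README example is matching, here is a minimal sketch in Python. It is not the implementation of `scrape`: it uses only the standard library's `xml.etree.ElementTree`, a small hypothetical stand-in for the Wikipedia markup, and a hand-written XPath that roughly corresponds to the CSS selector `table.wikitable > tbody > tr > td > b > a`. Two simplifications are assumed: the sample is well-formed XML, and `@class='wikitable'` is an exact match, whereas the CSS class selector would also match elements carrying additional classes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the real page (not the actual Wikipedia markup).
SAMPLE = """
<html><body>
<table class="wikitable"><tbody>
<tr><td><b><a href="/wiki/Afghanistan" title="Afghanistan">Afghanistan</a></b></td></tr>
<tr><td><b><a href="/wiki/Albania" title="Albania">Albania</a></b></td></tr>
<tr><td><a href="/wiki/Other" title="Other">Other</a></td></tr>
</tbody></table>
</body></html>
"""

# Approximate XPath counterpart of the CSS selector used in the README example.
# Each "/" step plays the role of the CSS child combinator ">".
XPATH = ".//table[@class='wikitable']/tbody/tr/td/b/a"


def select_links(markup):
    """Return the <a> elements matched by XPATH, in document order."""
    root = ET.fromstring(markup.strip())
    return root.findall(XPATH)


def wrap_in_body(elements):
    """Wrap the matched elements in an html/head/body skeleton,
    mirroring the shape of the output shown in the README."""
    html = ET.Element("html")
    ET.SubElement(html, "head")
    body = ET.SubElement(html, "body")
    body.extend(elements)
    return ET.tostring(html, encoding="unicode")


links = select_links(SAMPLE)
titles = [a.get("title") for a in links]
wrapped = wrap_in_body(links)
print(titles)
print(wrapped)
```

The third row of the sample is skipped because its link is not nested inside a `<b>` element, which is why the selector in the README returns only the bolded country links.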
